This is the 4th in a series covering Pythonic code written by Michael Kennedy of Talk Python To Me. Be sure to catch the whole series with 5 powerful Pythonic recommendations and over 45 minutes of video examples.
Any time you are writing a method that returns a sequence (especially if that sequence is a list), I recommend you pause and consider whether that method should actually be a generator. Generators can add tremendous flexibility and performance to your applications.
BUT WHAT IS A GENERATOR?
With regard to methods, a generator is one that uses the yield keyword rather than the return keyword. The best way to show how yield works is to start with a method which is NOT a generator. For that we’ll use this basic Fibonacci sequence method:
    def classic_fibonacci(limit):
        nums = []
        current, nxt = 0, 1
        while current < limit:
            current, nxt = nxt, nxt + current
            nums.append(current)
        return nums
Notice a few things about this method.
- Calling classic_fibonacci(1000) will not return until the entire list has been built. (Note that limit is compared against the values themselves, so it is not a count of items.)
- There is no practical way to iterate over this (infinite) set of numbers and stop on any condition other than the upper bound. (E.g., suppose we are searching for f(n) such that f(n) is prime and f(n+1) is also prime. What number do we pass for the limit?)
- We hold every generated number in memory at once, however large limit is.
We can actually simplify this method using yield and gracefully solve all three issues above in the process!
Here’s a better version of that method as a generator.
    def generator_fibonacci():
        current, nxt = 0, 1
        while True:
            current, nxt = nxt, nxt + current
            yield current
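To make the pause-and-resume behavior of yield concrete, here is a small sketch that pulls values out one at a time with the built-in next(). The variable names (fib, first, second, third) are just for illustration; the generator itself is the same one shown above.

```python
def generator_fibonacci():
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current  # pause here and hand one value to the caller


fib = generator_fibonacci()  # nothing has been computed yet

first = next(fib)   # runs the body up to the first yield
second = next(fib)  # resumes right after the yield and loops once more
third = next(fib)   # the local variables kept their values in between

print(first, second, third)  # prints 1 1 2
```

Each call to next() does only as much work as it takes to reach the next yield, which is why the infinite while True loop is not a problem.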
Now let’s compare. If we iterate over these two methods, we’ll get the same output (assuming the same number of items are processed):
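    # limit is an upper bound on the values, so pass 55 to get the first ten numbers
    for n in classic_fibonacci(55):
        print(n, end=',')
    # prints 1,1,2,3,5,8,13,21,34,55,

    print()  # end the first line of output

    for n in generator_fibonacci():
        if n > 55:
            break
        print(n, end=',')
    # prints 1,1,2,3,5,8,13,21,34,55,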
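The break in the generator loop is what answers the earlier question about searching without a known limit: the caller decides when to stop. Here is one way the prime search might look. Both is_prime and first_consecutive_primes are hypothetical helpers written for this sketch, and the trial-division test is only meant for small numbers.

```python
import math


def generator_fibonacci():
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current


def is_prime(n):
    """Hypothetical helper: simple trial division, fine for small n."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


def first_consecutive_primes():
    """Return the first two neighboring Fibonacci numbers that are both prime."""
    previous = None
    for n in generator_fibonacci():
        if previous is not None and is_prime(previous) and is_prime(n):
            return previous, n  # stop consuming the generator here
        previous = n


pair = first_consecutive_primes()
print(pair)  # prints (2, 3)
```

No limit is passed anywhere. The generator produces numbers only until the search condition is met, and then it is simply abandoned.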
If we call these methods and then ask for the type of return value, we’ll see something very different:
    print(type(classic_fibonacci(10)))
    # prints <class 'list'>

    print(type(generator_fibonacci()))
    # prints <class 'generator'>
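The difference in type also shows up in memory. As a rough illustration, the sketch below compares the two objects with sys.getsizeof. Keep in mind that getsizeof measures only the container object, not the integers it refers to, and that the exact byte counts depend on your Python version and platform, so no specific numbers are claimed here. The very large limit (10 ** 100) is an arbitrary choice to make the list long.

```python
import sys
from itertools import islice


def classic_fibonacci(limit):
    nums = []
    current, nxt = 0, 1
    while current < limit:
        current, nxt = nxt, nxt + current
        nums.append(current)
    return nums


def generator_fibonacci():
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current


as_list = classic_fibonacci(10 ** 100)  # builds every number up front
as_generator = generator_fibonacci()    # builds nothing yet

list_size = sys.getsizeof(as_list)
generator_size = sys.getsizeof(as_generator)
print(len(as_list), list_size, generator_size)

# When a real list is needed, take a bounded slice of the generator.
first_ten = list(islice(generator_fibonacci(), 10))
print(first_ten)  # prints [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

The list grows with the number of items, while the generator object stays the same small size no matter how many values are eventually pulled from it.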
In fact, the way generator_fibonacci works is ingenious and has great performance benefits. To fully appreciate them, it’s best to see them in action, so be sure to watch this video, where we step through both methods using a debugger and more.
PROCESSING LARGE DATA SETS WITH YIELD AND GENERATORS
You can find the code from this video on GitHub.
Michael Kennedy
@mkennedy