Data Dependence

Learn about software development. Covering topics such as Coding in Python and JavaScript, developer productivity and more.

[Video Series] Taking Your Python Skills to the Next Level With Pythonic Code – Processing Large Data Sets With Yield and Generators

This is the 4th in a series covering Pythonic code written by Michael Kennedy of Talk Python To Me. Be sure to catch the whole series with 5 powerful Pythonic recommendations and over 45 minutes of video examples.

Any time you are writing a method that returns a sequence (especially if that sequence is a list), I recommend you pause and consider whether that method should actually be a generator. Generators can add tremendous flexibility and performance to your applications.

BUT WHAT IS A GENERATOR?

What’s a generator? In Python, it is a function that uses the yield keyword to hand values back to the caller one at a time, rather than building them all and handing them back at once with return. The best way to see how yield works is to start with a method which is NOT a generator. For that we’ll use this basic Fibonacci sequence method:

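def classic_fibonacci(limit):
    # Build and return a list of the first `limit` Fibonacci numbers.
    nums = []
    current, nxt = 0, 1

    # Stop once `limit` numbers have been collected.
    while len(nums) < limit:
        current, nxt = nxt, nxt + current
        nums.append(current)

    return nums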

Notice a few things about this method.

  1. Calling classic_fibonacci(1000) will not return until all 1,000 numbers have been generated.
  2. There is no practical way to iterate over this (infinite) sequence and stop on any condition other than a count fixed in advance. (For example, suppose we are searching for the first n such that f(n) and f(n+1) are both prime. What number do we pass for the limit?)
  3. We will hold all 1,000 numbers (or whatever the limit is) in memory at once.

We can actually simplify this method using yield and gracefully solve all three issues above in the process!

Here’s a better version of that method as a generator.

def generator_fibonacci(): 
    current, nxt = 0, 1 
    while True: 
        current, nxt = nxt, nxt + current 
        yield current
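To make the behaviour of yield a little more concrete, here is a minimal sketch of how a caller might pull values from this generator by hand. It repeats the generator from above so that it runs on its own. The variable names (fib, first, next_five) are only illustrative, and itertools.islice is one of several ways to take a bounded number of items from an unbounded stream.

```python
from itertools import islice


def generator_fibonacci():
    """Yield Fibonacci numbers forever: 1, 1, 2, 3, 5, ..."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current


# Calling the function only creates the generator; none of its body has run yet.
fib = generator_fibonacci()

# Each next() call resumes the body until it reaches the next yield.
first = next(fib)    # 1
second = next(fib)   # 1
third = next(fib)    # 2

# islice takes a bounded number of items and continues from where fib left off.
next_five = list(islice(fib, 5))   # [3, 5, 8, 13, 21]
```

Between calls the function is paused at the yield line, with current and nxt kept intact, which is why it can pick up exactly where it stopped.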

Now let’s compare. If we iterate over these two methods, we’ll get the same output (assuming the same number of items are processed):

for n in classic_fibonacci(10):
    print(n, end=',')
print()
# prints 1,1,2,3,5,8,13,21,34,55,

for n in generator_fibonacci():
    if n > 55:
        break
    print(n, end=',')
print()
# prints 1,1,2,3,5,8,13,21,34,55,
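The break in the second loop is what the list version cannot offer: the caller decides when to stop. As a rough sketch, the earlier question (which neighbouring Fibonacci numbers are both prime?) could be answered like this. The helpers is_prime and first_adjacent_prime_pair are hypothetical names introduced for this example, and the trial-division check is only suitable for small numbers.

```python
def generator_fibonacci():
    """Yield Fibonacci numbers forever: 1, 1, 2, 3, 5, ..."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current


def is_prime(n):
    """Trial-division primality check (hypothetical helper, small n only)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def first_adjacent_prime_pair():
    """Return the first two neighbouring Fibonacci numbers that are both prime."""
    previous = None
    for n in generator_fibonacci():
        if previous is not None and is_prime(previous) and is_prime(n):
            return previous, n   # stop consuming as soon as the condition holds
        previous = n


pair = first_adjacent_prime_pair()
print(pair)   # (2, 3)
```

No upper bound has to be guessed in advance; the generator simply stops being asked for values once the pair is found.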

If we call these methods and then ask for the type of the return value, we’ll see something very different:

print(type(classic_fibonacci(10))) # prints <class 'list'>
print(type(generator_fibonacci())) # prints <class 'generator'>
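One practical consequence of that type difference is that a list can be read as many times as you like, while a generator can only be consumed once. The sketch below uses bounded_fibonacci, a hypothetical finite variant that is not part of the original code, so that the generator actually runs out.

```python
def bounded_fibonacci(count):
    """Yield the first `count` Fibonacci numbers (hypothetical finite variant)."""
    current, nxt = 0, 1
    for _ in range(count):
        current, nxt = nxt, nxt + current
        yield current


as_list = list(bounded_fibonacci(5))   # [1, 1, 2, 3, 5]
as_gen = bounded_fibonacci(5)

list_pass_1 = sum(as_list)   # 12
list_pass_2 = sum(as_list)   # 12 again: the list can be re-read

gen_pass_1 = sum(as_gen)     # 12
gen_pass_2 = sum(as_gen)     # 0: the generator is already exhausted
```

If the values are needed more than once, either call the generator function again or materialise the results with list().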

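In fact, generator_fibonacci does no work at the moment it is called. Each value is computed only when the caller asks for the next one, and the function pauses at yield in between, so only the current pair of numbers is ever held in memory. To fully appreciate the benefits, it’s best to see them in action. So be sure to watch this video, where we’ll step through both methods using a debugger and more.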
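As a rough illustration of the memory side of this, the sketch below compares the size of the list with the size of the generator object. It assumes limit means the number of items to produce, as the text describes. Note that sys.getsizeof reports only the container itself (not the integers it refers to), and the exact figures vary by Python version, so treat the numbers as indicative only.

```python
import sys


def classic_fibonacci(limit):
    """Return a list of the first `limit` Fibonacci numbers."""
    nums = []
    current, nxt = 0, 1
    while len(nums) < limit:
        current, nxt = nxt, nxt + current
        nums.append(current)
    return nums


def generator_fibonacci():
    """Yield Fibonacci numbers forever, one at a time."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current


numbers = classic_fibonacci(1000)   # all 1,000 values exist right now
stream = generator_fibonacci()      # nothing has been computed yet

list_size = sys.getsizeof(numbers)  # grows with the number of items
gen_size = sys.getsizeof(stream)    # does not depend on how many items are read

print(f"list container: {list_size} bytes")
print(f"generator object: {gen_size} bytes")
```

The list's footprint keeps growing as the limit grows, while the generator object stays the same size however far you iterate.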

PROCESSING LARGE DATA SETS WITH YIELD AND GENERATORS

You can find the code from this video on GitHub.

Michael Kennedy
@mkennedy


