Generators are lazy iterators created by generator functions (using yield) or generator expressions (using (an_expression for x in an_iterator)). A generator function can also delegate to another iterable with yield from.
Generator expressions are similar to list, dictionary and set comprehensions, but are enclosed in parentheses. The parentheses may be omitted when the expression is the sole argument of a function call.
This example generates the first 10 perfect squares, starting with 0 (for x = 0).
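A minimal sketch of such an expression; the variable names are illustrative assumptions:

```python
# Generator expression: nothing is computed until values are requested.
squares = (x * x for x in range(10))

# list() drains the generator and collects the ten values.
first_ten_squares = list(squares)
print(first_ten_squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```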
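Generator functions are similar to regular functions, except that they have one or more yield statements in their body. In Python 2 such functions cannot return a value (a bare return is allowed if you want to stop the generator early). In Python 3.3 and later a return with a value is also allowed, but the value is attached to the StopIteration exception rather than yielded.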
This generator function is equivalent to the previous generator expression; it produces the same values.
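A sketch of what such a function might look like, assuming the hypothetical name squares_function:

```python
def squares_function():
    """Yield the squares of 0 through 9, one at a time."""
    for x in range(10):
        yield x * x

# Both forms produce the same sequence of values.
from_function = list(squares_function())
from_expression = list(x * x for x in range(10))
print(from_function == from_expression)  # True
```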
Note: all generator expressions have their own equivalent functions, but not vice versa.
A generator expression can be written without its own parentheses when it is the sole argument of a function call, where the parentheses would otherwise be doubled:
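For illustration, a short sketch using the built-in sum (the variable names are assumptions):

```python
# Sole argument: the call's parentheses are enough.
total = sum(i * i for i in range(10))

# Equivalent, with the doubled parentheses written out.
same_total = sum((i * i for i in range(10)))

# With a second argument the generator expression needs its own parentheses.
total_from_100 = sum((i * i for i in range(10)), 100)

print(total, same_total, total_from_100)  # 285 285 385
```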
Calling a generator function produces a generator object, which can later be iterated over. Unlike lists and other containers, generator objects may only be traversed once.
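A minimal sketch, assuming a hypothetical generator function named function whose print calls make the order of execution visible:

```python
def function():
    print("first statement")   # runs only when iteration starts
    yield 1
    print("second statement")
    yield 2

generator = function()         # returns a generator object; nothing is printed yet

first_pass = list(generator)   # runs the body and collects [1, 2]
second_pass = list(generator)  # the generator is exhausted, so this is []
```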
Notice that a generator's body is not immediately executed: when you call
function() in the example above, it immediately returns a generator object, without executing even the first print statement. This allows generators to consume less memory than functions that return a list, and it allows creating generators that produce infinitely long sequences.
For this reason, generators are often used in data science, and other contexts involving large amounts of data. Another advantage is that other code can immediately use the values yielded by a generator, without waiting for the complete sequence to be produced.
However, if you need to use the values produced by a generator more than once, and if generating them costs more than storing, it may be better to store the yielded values as a
list than to re-generate the sequence. See 'Resetting a generator' below for more details.
Typically a generator object is used in a loop, or in any function that requires an iterable:
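A short sketch, assuming a hypothetical countdown generator:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1

# In a for loop:
seen = []
for n in countdown(3):
    seen.append(n)

# In functions that accept any iterable:
total = sum(countdown(4))
ordered = sorted(countdown(3))

print(seen, total, ordered)  # [3, 2, 1] 10 [1, 2, 3]
```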
Since generator objects are iterators, one can iterate over them manually using the
next() function. Doing so will return the yielded values one by one on each subsequent invocation.
Under the hood, each time you call next() on a generator, Python executes statements in the body of the generator function until it hits the next yield statement. At this point it returns the argument of the yield statement and remembers the point where that happened. Calling next() once again will resume execution from that point and continue until the next yield statement.
If Python reaches the end of the generator function without encountering any more yield statements, a StopIteration exception is raised (this is normal; all iterators behave in the same way).
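Stepping through a generator by hand might look like the following sketch (two_values is a hypothetical name):

```python
def two_values():
    yield "a"
    yield "b"

gen = two_values()
first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes and runs up to the second yield

try:
    next(gen)        # the body finishes without another yield
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)  # a b True
```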
Note that in Python 2 generator objects had
.next() methods that could be used to iterate through the yielded values manually. In Python 3 this method was replaced with the
.__next__() standard for all iterators.
Resetting a generator
Remember that you can only iterate through the objects generated by a generator once. If you have already iterated through the objects in a script, any further attempt to do so will produce no more items.
If you need to use the objects generated by a generator more than once, you can either call the generator function again to get a fresh generator, or, alternatively, you can store the output of the generator function in a list on first use. Calling the generator function again will be a good option if you are dealing with large volumes of data, and storing a list of all data items would take up a lot of memory. Conversely, if it is costly to generate the items initially, you may prefer to store the generated items in a list so that you can re-use them.
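Both options can be sketched as follows, assuming a hypothetical numbers generator:

```python
def numbers():
    for n in (1, 2, 3):
        yield n

gen = numbers()
first_total = sum(gen)    # consumes the generator
second_total = sum(gen)   # nothing left, so the sum of no items is 0

# Option 1: call the generator function again for a fresh generator.
fresh_total = sum(numbers())

# Option 2: store the values once and reuse the list.
cached = list(numbers())
reused = (sum(cached), max(cached))

print(first_total, second_total, fresh_total, reused)  # 6 0 6 (6, 3)
```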
Generators can be used to implement coroutines:
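A minimal sketch of a generator-based coroutine; upper_caser and the list it writes to are illustrative assumptions, not part of any library:

```python
def upper_caser(sink):
    """Receive strings via send() and append them, upper-cased, to sink."""
    while True:
        item = yield               # pause here until a value is sent in
        sink.append(item.upper())

received = []
coroutine = upper_caser(received)
next(coroutine)                    # advance to the first yield ("priming")
coroutine.send("spam")
coroutine.send("eggs")
coroutine.close()                  # stop the coroutine

print(received)  # ['SPAM', 'EGGS']
```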
Coroutines are commonly used to implement state machines, as they are primarily useful for creating single-method procedures that require a state to function properly. They operate on an existing state and return the value obtained on completion of the operation.
It's possible to create generator iterators using a comprehension-like syntax.
If a function doesn't necessarily need to be passed a list, you can save on characters (and improve readability) by placing a generator expression inside a function call. The parentheses of the function call implicitly make your expression a generator expression.
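A small sketch, assuming the values 0 through 3 as input:

```python
# Generator expression as the sole argument: no intermediate list is built.
total = sum(i ** 2 for i in range(4))          # 0 + 1 + 4 + 9

# The list-comprehension form builds the whole list first.
total_via_list = sum([i ** 2 for i in range(4)])

print(total, total_via_list)  # 14 14
```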
Additionally, you will save on memory because instead of loading the entire list you are iterating over (
[0, 1, 2, 3] in the above example), the generator allows Python to use values as needed.
Generators can be used to represent infinite sequences:
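A minimal sketch, assuming a hypothetical integers_starting_from function:

```python
def integers_starting_from(n):
    """Yield n, n + 1, n + 2, ... without end."""
    while True:
        yield n
        n += 1

natural_numbers = integers_starting_from(1)

# Only the requested values are ever computed.
first_three = [next(natural_numbers) for _ in range(3)]
print(first_three)  # [1, 2, 3]
```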
An infinite sequence of numbers like the one above can also be generated with the help of itertools.count. The above code could be written as follows:
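A sketch of the itertools.count form (variable names are assumptions):

```python
import itertools

natural_numbers = itertools.count(1)      # 1, 2, 3, ...
first_three = [next(natural_numbers) for _ in range(3)]

# count() also accepts a step argument.
even_numbers = itertools.count(0, 2)      # 0, 2, 4, ...
first_three_evens = [next(even_numbers) for _ in range(3)]

print(first_three, first_three_evens)  # [1, 2, 3] [0, 2, 4]
```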
You can use generator comprehensions on infinite generators to produce new generators:
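For illustration, a sketch that derives two new infinite generators (the names are assumptions):

```python
import itertools

# Transforming every element of an infinite generator.
multiples_of_two = (x * 2 for x in itertools.count(1))

# Filtering an infinite generator.
multiples_of_three = (x for x in itertools.count(1) if x % 3 == 0)

first_double = next(multiples_of_two)
first_triple = next(multiples_of_three)
print(first_double, first_triple)  # 2 3
```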
Be aware that an infinite generator does not have an end, so passing it to any function that will attempt to consume the generator entirely will have dire consequences:
Instead, use list/set comprehensions with range (or xrange for Python < 3.0):
or use itertools.islice() to slice the iterator to a subset:
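Both approaches can be sketched as follows, assuming a hypothetical multiples_of_two generator:

```python
import itertools

multiples_of_two = (x * 2 for x in itertools.count(1))

# A list comprehension over range() pulls a fixed number of values.
first_five = [next(multiples_of_two) for _ in range(5)]

# islice() takes the next five values from the same generator.
next_five = list(itertools.islice(multiples_of_two, 5))

print(first_five)  # [2, 4, 6, 8, 10]
print(next_five)   # [12, 14, 16, 18, 20]
```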
Note that the original generator is updated too, just like all other generators coming from the same "root":
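A sketch of that shared state, with illustrative names:

```python
import itertools

natural_numbers = itertools.count(1)
multiples_of_two = (x * 2 for x in natural_numbers)

# Taking three doubled values consumes 1, 2 and 3 from natural_numbers.
taken = list(itertools.islice(multiples_of_two, 3))

next_natural = next(natural_numbers)     # the underlying counter is now at 4
next_multiple = next(multiples_of_two)   # pulls 5 from the counter, yields 10

print(taken, next_natural, next_multiple)  # [2, 4, 6] 4 10
```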
An infinite sequence can also be iterated with a for-loop. Make sure to include a conditional break statement so that the loop eventually terminates:
Classic example - Fibonacci numbers
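A minimal sketch of a break-terminated loop over an infinite Fibonacci generator; the cutoff of 50 is an arbitrary assumption:

```python
def fibonacci():
    """Yield the Fibonacci numbers 0, 1, 1, 2, 3, ... without end."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

below_fifty = []
for number in fibonacci():
    if number >= 50:      # without this break the loop would never finish
        break
    below_fifty.append(number)

print(below_fifty)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```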
Iterating over generators in parallel
To iterate over several generators in parallel, use the zip built-in:
In Python 2 you should use itertools.izip instead. Here we can also see that all the zip functions yield tuples.
Note that zip will stop iterating as soon as one of the iterables runs out of items. If you'd like to iterate for as long as the longest iterable, use itertools.zip_longest().
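A short sketch for Python 3 that contrasts the two behaviours (the input generators are illustrative):

```python
from itertools import zip_longest

# zip stops when the shorter generator is exhausted.
pairs = list(zip((n for n in (1, 2, 3)), (c for c in "ab")))

# zip_longest keeps going and pads the shorter one with fillvalue.
padded = list(zip_longest((n for n in (1, 2, 3)),
                          (c for c in "ab"),
                          fillvalue=None))

print(pairs)   # [(1, 'a'), (2, 'b')]
print(padded)  # [(1, 'a'), (2, 'b'), (3, None)]
```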
A generator object supports the iterator protocol. That is, it provides a
next() method (
__next__() in Python 3.x), which is used to step through its execution, and its
__iter__ method returns itself. This means that a generator can be used in any language construct which supports generic iterable objects.
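The protocol can be checked directly, as in this Python 3 sketch (ticks is a hypothetical name):

```python
def ticks():
    yield 1
    yield 2

gen = ticks()

returns_itself = iter(gen) is gen     # __iter__ hands back the same object
has_next = hasattr(gen, "__next__")   # the Python 3 stepping method
first = gen.__next__()                # same effect as next(gen)

print(returns_itself, has_next, first)  # True True 1
```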
Refactoring list-building code
Suppose you have complex code that creates and returns a list by starting with a blank list and repeatedly appending to it:
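As a stand-in for such code, consider this sketch; even_squares and its logic are purely illustrative assumptions:

```python
def even_squares(limit):
    """Return a list of the even squares of 0 .. limit - 1."""
    result = []
    for n in range(limit):
        square = n * n
        if square % 2 == 0:
            result.append(square)
    return result

values = even_squares(6)
print(values)  # [0, 4, 16]
```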
When it's not practical to replace the inner logic with a list comprehension, you can turn the entire function into a generator in-place, and then collect the results:
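A sketch of the same refactoring applied to a hypothetical even_squares function: the list and the append call disappear, and each value is yielded instead.

```python
def even_squares(limit):
    """Yield the even squares of 0 .. limit - 1."""
    for n in range(limit):
        square = n * n
        if square % 2 == 0:
            yield square          # was: result.append(square)

# Collect the results only where a list is really needed.
values = list(even_squares(6))
print(values)  # [0, 4, 16]
```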
If the logic is recursive, use
yield from to include all the values from the recursive call in a "flattened" result:
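For example, a recursive flattening of nested lists might be sketched like this (flatten is a hypothetical helper; yield from requires Python 3.3 or later):

```python
def flatten(nested):
    """Yield every non-list item found in arbitrarily nested lists."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)   # splice in the recursive results
        else:
            yield item

flat = list(flatten([1, [2, [3, 4]], 5]))
print(flat)  # [1, 2, 3, 4, 5]
```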
The next function is useful even without iterating. Passing a generator expression to next is a quick way to search for the first occurrence of an element matching some predicate. A procedural loop that scans the items and breaks on the first match can be replaced with a single call to next.
For this purpose, it may be desirable to create an alias, such as
first = next, or a wrapper function to convert the exception:
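The search pattern and a wrapper can be sketched as follows; find_first_loop, first and the choice of ValueError are illustrative assumptions:

```python
def find_first_loop(items, predicate):
    """Procedural version: scan and stop at the first match."""
    for item in items:
        if predicate(item):
            return item
    return None

numbers = [1, 3, 8, 5, 10]

# Same search with next() and a generator expression.
first_even = next(n for n in numbers if n % 2 == 0)

# A default avoids StopIteration when nothing matches.
no_match = next((n for n in numbers if n > 100), None)

def first(iterable, predicate):
    """Wrapper that converts StopIteration into ValueError."""
    try:
        return next(item for item in iterable if predicate(item))
    except StopIteration:
        raise ValueError("no item satisfies the predicate") from None

print(first_even, no_match)  # 8 None
```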
Sending objects to a generator
In addition to receiving values from a generator, it is possible to send an object to a generator using the send() method:
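A minimal sketch of a running total; accumulator is a hypothetical name, while total, value and generator are the names used in the explanation that follows:

```python
def accumulator():
    total = 0
    while True:
        value = yield total   # hand out the total, then wait for input
        if value is None:     # None is this example's signal to stop
            break
        total += value

generator = accumulator()
initial = next(generator)          # advance to the first yield
after_one = generator.send(1)
after_ten = generator.send(10)

try:
    next(generator)                # behaves like sending None
    stopped = False
except StopIteration:
    stopped = True

print(initial, after_one, after_ten, stopped)  # 0 1 11 True
```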
What happens here is the following:
- When you first call next(generator), the program advances to the first yield statement, and returns the value of total at that point, which is 0. The execution of the generator suspends at this point.
- When you then call generator.send(x), the interpreter takes the argument x and makes it the value of the yield expression at which the generator is suspended, which gets assigned to value. The generator then proceeds as usual, until it yields the next value.
- When you finally call next(generator), the program treats this as if you're sending None to the generator. There is nothing special about None; however, this example uses None as a special value to ask the generator to stop.
The next() function
The next() built-in is a convenient wrapper which can be used to receive a value from any iterator (including a generator iterator) and to provide a default value in case the iterator is exhausted.
The syntax is next(iterator[, default]). If the iterator is exhausted and a default value was passed, the default is returned. If no default was provided, StopIteration is raised.
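A short sketch of both forms; the "done" default is an arbitrary choice:

```python
iterator = iter([1, 2])

a = next(iterator)            # 1
b = next(iterator, "done")    # 2; the default is unused while items remain
c = next(iterator, "done")    # exhausted, so the default is returned

try:
    next(iterator)            # exhausted and no default given
    raised = False
except StopIteration:
    raised = True

print(a, b, c, raised)  # 1 2 done True
```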
Using a generator to find Fibonacci Numbers
A practical use case of a generator is to iterate through values of an infinite series. Here's an example of finding the first ten terms of the Fibonacci Sequence.
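A minimal sketch, assuming seed values of 0 and 1 and a hypothetical fib generator:

```python
def fib(a=0, b=1):
    """Yield Fibonacci numbers forever; a and b are the seed values."""
    while True:
        yield a
        a, b = b, a + b

f = fib()
first_ten = ", ".join(str(next(f)) for _ in range(10))
print(first_ten)
```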
0, 1, 1, 2, 3, 5, 8, 13, 21, 34
Yield with recursion: recursively listing all files in a directory
First, import the libraries that work with files:
A helper function to read only files from a directory:
Another helper function to get only the subdirectories:
Now use these functions to recursively get all files within a directory and all its subdirectories (using generators):
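The steps above can be sketched together as follows. The function names are illustrative assumptions, and the sketch does not guard against symbolic links that form a cycle:

```python
import os

def files_in(directory):
    """Yield paths of the regular files directly inside directory."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            yield path

def subdirectories_in(directory):
    """Yield paths of the directories directly inside directory."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            yield path

def all_files(directory):
    """Yield every file in directory and, recursively, its subdirectories."""
    for path in files_in(directory):
        yield path
    for subdirectory in subdirectories_in(directory):
        for path in all_files(subdirectory):   # re-yield the nested results
            yield path
```

Because all_files is a generator, a caller can stop early (for example after the first match) without walking the rest of the tree.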
This function can be simplified using yield from:
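A sketch of the simplified form for Python 3.3 or later. To keep the block runnable on its own, the file and directory checks are folded into a single hypothetical all_files function:

```python
import os

def all_files(directory):
    """Yield every file in directory and, recursively, its subdirectories."""
    paths = [os.path.join(directory, name)
             for name in sorted(os.listdir(directory))]
    # Files at this level first.
    yield from (path for path in paths if os.path.isfile(path))
    # Then delegate to the recursive call for each subdirectory.
    for path in paths:
        if os.path.isdir(path):
            yield from all_files(path)
```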
Yielding all values from another iterable
Use yield from if you want to yield all values from another iterable:
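A minimal sketch, assuming a hypothetical chained function (Python 3.3 or later):

```python
def chained(n):
    yield from range(n)   # every value of the range, in order
    yield from "ab"       # then every character of the string

values = list(chained(3))
print(values)  # [0, 1, 2, 'a', 'b']
```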
This works with generators as well.