python tip

Save Memory With Generators

In tip #2 I showed you list comprehension. But a list is not always the best choice. Let's say we have a very large list with 10000 items and we want to calculate the sum over all the items. We can of course do this with a list, but we might run into memory issues. This is a perfect example where we can use generators. Similar to list comprehension we can use generator comprehension that has the same syntax but with parenthesis instead of square brackets. A generator computes our elements lazily, i.e., it produces only one item at a time and only when asked for it. If we calculate the sum over this generator, we see that we get the same correct result.

# list comprehension
my_list = [i for i in range(10000)]
print(sum(my_list)) # 49995000

# generator comprehension
my_gen = (i for i in range(10000))
print(sum(my_gen)) # 49995000

Now let's inspect the size of both the list and the generator with the built-in sys.getsizeof() method. For the list we get over 80000 bytes and for the generator we only get approximately 128 bytes because it only generates one item at a time. This can make a huge difference when working with large data, so it's always good to keep the generator in mind!

import sys 

my_list = [i for i in range(10000)]
print(sys.getsizeof(my_list), 'bytes') # 87616 bytes

my_gen = (i for i in range(10000))
print(sys.getsizeof(my_gen), 'bytes') # 128 bytes