This is the second post in a series covering Pythonic code written by Michael Kennedy of Talk Python To Me. Be sure to catch the whole series with 5 powerful Pythonic recommendations and over 45 minutes of video examples.
What if I told you there was a simple technique you can apply to your custom classes that would dramatically decrease the memory usage of your application? Well, there is; it involves the keyword __slots__ and that’s what this post is all about.
The story starts with a blog post from Oyster.com with a powerful title and image:
SAVING 9 GB OF RAM WITH PYTHON’S __SLOTS__
Oyster.com is a travel website with lots of traffic. They were able to use a very simple technique to “hack” Python’s type memory management using the __slots__ language feature. Read Oyster.com’s story here.
To understand slots, it’s best to start by discussing how fields are associated with standard Python classes. Consider this class:
class Measurement:
    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.val = value
If we allocate a few of them, setting some values via the initializer and adding one more attribute dynamically to one of the instances after creation, we’ll have something like this in memory:
m1 = Measurement(1, 2, "Happy")
m2 = Measurement(7, 10, "Crazy")
m2.other = True
Notice that each instance has a pointer to the __dict__ containing the underlying field names and values. We can look at them like this:
print(m1.__dict__)
# {'x': 1, 'y': 2, 'val': 'Happy'}
print(m2.__dict__)
# {'x': 7, 'y': 10, 'val': 'Crazy', 'other': True}
# (key order follows insertion order on Python 3.7+)
This is how custom types in Python work and how they are meant to work.
However, if we aren’t adding fields after our objects are created (e.g. the line where we set m2.other = True), and we have millions of instances of that type, this can be inefficient. We end up with millions of those dictionaries, one per instance, each repeating the same key names.
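As a rough illustration of that per-instance cost, the sketch below uses sys.getsizeof from the standard library to look at the dictionary attached to a single instance. The instance_count figure is a hypothetical number chosen only for the arithmetic, and the byte count varies between Python versions (CPython also applies its own optimizations to attribute storage), so treat the result as an indication rather than a precise measurement.

```python
import sys


class Measurement:
    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.val = value


m1 = Measurement(1, 2, "Happy")
m2 = Measurement(7, 10, "Crazy")

# Each instance carries its own dictionary object.
separate_dicts = m1.__dict__ is not m2.__dict__
print(separate_dicts)

# Size in bytes of one instance's dictionary
# (the dict itself, not the values it refers to).
dict_bytes = sys.getsizeof(m1.__dict__)
print(dict_bytes)

# Hypothetical instance count, used only to extrapolate.
instance_count = 1_000_000
total_mib = dict_bytes * instance_count / 1024 ** 2
print(f"roughly {total_mib:.0f} MiB spent on dictionaries alone")
```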
USING SLOTS
We can adjust our Measurement class slightly to change how fields are stored to remove this duplication and 1-to-1 dictionary allocation.
class Measurement:
    __slots__ = ['x', 'y', 'val']

    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.val = value
Then we have a new picture in memory:
Now the field names are associated with the type Measurement, not with the instances of Measurement. This is great because there is only one instance of the type (i.e. Measurement) but potentially millions of Measurement instances (i.e. m1, m2, …).
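The trade-off is easy to check interactively. This short sketch reuses the slotted Measurement class from above and shows that instances no longer have a __dict__, and that assigning a field not listed in __slots__ (like the earlier m2.other = True) is rejected. The has_dict and rejected variables are just names introduced for this illustration.

```python
class Measurement:
    __slots__ = ['x', 'y', 'val']

    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.val = value


m1 = Measurement(1, 2, "Happy")

# Slotted instances have no per-instance dictionary.
has_dict = hasattr(m1, '__dict__')
print(has_dict)  # False

# The field names are stored once, on the class.
print(Measurement.__slots__)  # ['x', 'y', 'val']

# Adding a field that is not listed in __slots__ now fails.
try:
    m1.other = True
    rejected = False
except AttributeError:
    rejected = True
print(rejected)  # True
```

So __slots__ only fits classes whose set of fields is known up front; code that relies on attaching extra attributes to instances would need those names added to the list.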
To see this in action, check out my video demonstrating this technique, comparing it to plain tuples, and covering the associated trade-offs.
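For a quick measurement without the video, the following is a minimal sketch using the standard library's tracemalloc module. It is not the code from the video: the class names PlainMeasurement and SlottedMeasurement, the allocated_bytes helper, and the count of 50,000 objects are all assumptions made for this example. Absolute numbers depend on the Python version, so only the relative comparison is meaningful.

```python
import tracemalloc


class PlainMeasurement:
    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.val = value


class SlottedMeasurement:
    __slots__ = ['x', 'y', 'val']

    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.val = value


def make_tuple(x, y, value):
    # Plain tuple alternative: compact, but fields are accessed by index.
    return (x, y, value)


def allocated_bytes(factory, count=50_000):
    """Return the bytes still allocated after building `count` objects."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    items = [factory(i, i + 1, "Happy") for i in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


plain_bytes = allocated_bytes(PlainMeasurement)
slotted_bytes = allocated_bytes(SlottedMeasurement)
tuple_bytes = allocated_bytes(make_tuple)

for label, size in [("plain class", plain_bytes),
                    ("slotted class", slotted_bytes),
                    ("tuple", tuple_bytes)]:
    print(f"{label:>14}: {size / 1024 ** 2:.1f} MiB")
```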
VIDEO: HACKING PYTHON’S MEMORY WITH __SLOTS__
You can find the code from this video on GitHub.
Michael Kennedy
@mkennedy