Taking Your Python Skills to the Next Level With Pythonic Code – Stop Using Lists for Everything
This is the 1st in a series covering Pythonic code written by Michael Kennedy of Talk Python To Me. Be sure to catch the whole series with 5 powerful Pythonic recommendations and over 45 minutes of video examples.
What is the most used container type in Python? The list, of course. If you have a bunch of data you need to store and then access again later, chances are you’ll start like this:
data = []
The list is a fabulous data structure and is highly optimised in Python: in the default implementation of Python (known as CPython) it’s implemented in C internally. But sometimes it’s just the wrong data structure for the problem you’re trying to solve.
SO WHEN SHOULDN’T I USE A LIST?
Let’s consider a data science-y / mathematical scenario. Suppose we have 500,000 complex items with multiple attributes. At runtime, we need to compute a small set of these, identified by some special value or attribute of the item. Then, we need to go into our data structure and pull those out.
This is entirely doable with a list. Here’s a bit of code to think about:
import collections

DataPoint = collections.namedtuple("DataPoint", "id x y temp quality")

data_list = load_data_set()                # returns list of 500,000 DataPoints
interesting_ids = compute_desired_items()  # returns 100 IDs

interesting_points = []
for i in interesting_ids:
    pt = find_point_by_id_in_list(data_list, i)
    interesting_points.append(pt)
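The helper find_point_by_id_in_list is not shown in the article. A minimal sketch of what it might look like, assuming it is a plain linear scan that returns the first match (or None when nothing matches), is below. The sample data is made up purely for illustration.

```python
import collections

DataPoint = collections.namedtuple("DataPoint", "id x y temp quality")


def find_point_by_id_in_list(data_list, point_id):
    """Hypothetical helper: walk the list until an item with this id turns up."""
    for pt in data_list:
        if pt.id == point_id:
            return pt
    return None  # assumption: a missing id yields None rather than an error


# Tiny synthetic data set standing in for load_data_set().
sample = [DataPoint(i, i * 0.5, i * 2.0, 20.0, "ok") for i in range(10)]

found = find_point_by_id_in_list(sample, 7)
missing = find_point_by_id_in_list(sample, 99)
print(found)
print(missing)
```

With a helper like this, every lookup has to step through the list from the start, which is what makes the list-based approach expensive at 500,000 items.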
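This works, but it is unbelievably inefficient: if find_point_by_id_in_list scans the list item by item, 100 lookups against 500,000 items can mean up to 50 million comparisons. The Pythonic way is to replace that manual search with a dictionary, with the key being the ID to search by: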
# create a dictionary to find elements by id:
data_dict = {d.id: d for d in data_list}

interesting_points = []
for d_id in interesting_ids:
    # dictionary look up rather than manual search
    d = data_dict[d_id]
    interesting_points.append(d)
So the big question: is this any faster? Is it any better? Check out the video tutorial to see the full explanation. The short answer is 100% yes.
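If you want to check this on your own machine, here is a rough timing sketch. It is not the code from the video: the synthetic data, the smaller data set size (50,000 rather than 500,000, to keep the run short), and the names make_points, find_by_scan, lookup_with_list, lookup_with_dict and time_call are all assumptions made for illustration. The dictionary timing includes the cost of building the dictionary, so the comparison is not tilted in its favour.

```python
import collections
import random
import time

DataPoint = collections.namedtuple("DataPoint", "id x y temp quality")


def make_points(n):
    """Build n synthetic DataPoints with ids 0..n-1 (stand-in for real data)."""
    return [DataPoint(i, i * 0.1, i * 0.2, 20.0, "ok") for i in range(n)]


def find_by_scan(data_list, point_id):
    """Linear search: check items one by one until the id matches."""
    for pt in data_list:
        if pt.id == point_id:
            return pt
    return None


def lookup_with_list(data_list, ids):
    """List approach: one linear scan per requested id."""
    return [find_by_scan(data_list, i) for i in ids]


def lookup_with_dict(data_list, ids):
    """Dictionary approach: build the index once, then look up each id by key."""
    data_dict = {d.id: d for d in data_list}
    return [data_dict[i] for i in ids]


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


data_list = make_points(50_000)
rng = random.Random(42)  # fixed seed so repeated runs pick the same ids
interesting_ids = rng.sample(range(len(data_list)), 100)

from_list, list_seconds = time_call(lookup_with_list, data_list, interesting_ids)
from_dict, dict_seconds = time_call(lookup_with_dict, data_list, interesting_ids)

print(f"list scan:   {list_seconds:.4f} s")
print(f"dict lookup: {dict_seconds:.4f} s")
print("same results:", from_list == from_dict)
```

Exact numbers will vary by machine and Python version, so treat the output as an indication rather than a benchmark. The gap should widen as the data set or the number of IDs grows, since each list lookup scales with the size of the list while each dictionary lookup does not.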
VIDEO: STOP USING LISTS FOR EVERYTHING
WANT THE SOURCE CODE?
Get the full source code for this example on GitHub.
Michael Kennedy