Learning Python and buying groceries

I write a lot of Python. I write a lot about Python. It’s what my sixty-year-old Systems professor described, with a hint of disdain, as “executable pseudocode”. Python code is incredibly readable: try browsing GitHub’s Python repositories and then their C++ ones.

One of the things I love about the language is the Zen of Python, a set of guiding principles for writing it. One of these is:

There should be one — and preferably only one — obvious way to do it.

This tenet, unfortunately, is usually violated. Python has a remarkable advantage in that if you write some code and it looks like it should work, it’ll probably work. Still, that doesn’t mean that your first attempt at solving a problem is always going to be ideal. In this blog post, I want to take a pretty common Python pattern and break it down, showing you some cool, concise things you can do with the language to improve your code.

To start off, we’re in the middle of a trip to the grocery store:

grocery_list = ['ale', 'vodka', 'bananas', 'more ale', 'bread', 'hummus', 'diet soda']
purchased_items = ['ale', 'vodka', 'bananas', 'bread', 'diet soda']

And we want to check if we’ve bought all of our groceries! The most obvious way to do this is to go through our list item by item, checking that each one is in our cart:

# Assume we're done until we find something missing from the cart
done_with_groceries = True
for item in grocery_list:
    if item not in purchased_items:
        done_with_groceries = False
if not done_with_groceries:
    print("Haven't finished yet :(")

This is relatively readable, if verbose, code, and it will work perfectly fine. However, I really don’t like that done_with_groceries variable: it isn’t used anywhere outside that snippet, and it feels clumsy. It adds a bit of meaning to what we’re doing, but it clogs up the code a little.

One of the little-known features in Python is that for and while loops can have an else clause. Specifically, these else clauses are executed if the loop terminates naturally, which is fancy talk for leaving the loop without a break statement. Let’s test this out:

# Else is executed if the for loop terminates naturally (i.e. not via break)
for item in grocery_list:
    if item not in purchased_items:
        print("Haven't finished yet :(")
        break
else:
    print("Finished!")
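The same else clause works on while loops. Here is a minimal sketch, assuming a hypothetical helper called first_missing (the helper and its local names are mine, not part of the original example):

```python
grocery_list = ['ale', 'vodka', 'bananas', 'more ale', 'bread', 'hummus', 'diet soda']
purchased_items = ['ale', 'vodka', 'bananas', 'bread', 'diet soda']

def first_missing(wanted, bought):
    """Return the first item in wanted that is not in bought, or None."""
    remaining = list(wanted)  # copy, so the caller's list is left untouched
    while remaining:
        item = remaining.pop(0)
        if item not in bought:
            missing = item
            break
    else:
        # Reached only when the loop condition became false without a break.
        missing = None
    return missing

print(first_missing(grocery_list, purchased_items))      # more ale
print(first_missing(['ale', 'bread'], purchased_items))  # None
```

Note that the else branch also runs when the loop body never executes at all, for example when the list of wanted items is empty.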

Woo! It’s definitely less cluttered code, but is it better code?

Empty List Syndrome

I’d argue that while it’s important — and helpful — to know how to use for...else, this might not be the best way to do it, because it wouldn’t make sense to anyone who doesn’t understand the mechanics behind it. While it gets rid of the unnecessary boolean, we lose some of the readability that makes Python so wonderful in the first place.

Instead, let’s turn to a different fun fact: in Python, empty containers (lists, sets, dictionaries) are treated as false in a boolean context. Put another way, the following code does not raise an error:

assert not []
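To see the rule in action, here is a small sketch that checks a few built-in containers (the variable names are just for illustration):

```python
# Empty built-in containers are falsy.
empty_containers = [[], (), set(), {}, '']
truthiness = [bool(container) for container in empty_containers]
print(truthiness)  # [False, False, False, False, False]

# A non-empty container is truthy even when everything inside it is falsy.
print(bool([0]))   # True
print(bool([[]]))  # True
```

That second part is worth remembering: the check is about whether the container has any elements, not about what those elements are.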

So how do we take advantage of this? Let’s create a list — using fancy list comprehensions — of all elements in the grocery list that aren’t in our cart. If that list is empty, then we’ve bought everything.

# Concise option: an empty list is false, so a non-empty one means items are missing
if [item for item in grocery_list if item not in purchased_items]:
    print("Haven't finished yet :(")

Concise, no? That list comprehension is dense, but surprisingly readable.
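One side benefit of building the list is that it tells us what is still missing, not just that something is. A sketch, assuming we give the list a name (missing_items is my name, not the original's); the any() line is an alternative I'm adding for comparison, not one of the approaches above:

```python
grocery_list = ['ale', 'vodka', 'bananas', 'more ale', 'bread', 'hummus', 'diet soda']
purchased_items = ['ale', 'vodka', 'bananas', 'bread', 'diet soda']

# Keep the comprehension's result so we can report what is left to buy.
missing_items = [item for item in grocery_list if item not in purchased_items]
if missing_items:
    print("Still need: " + ", ".join(missing_items))  # Still need: more ale, hummus
else:
    print("Finished!")

# Alternative: any() with a generator expression stops at the first missing
# item instead of building the whole list.
still_shopping = any(item not in purchased_items for item in grocery_list)
print(still_shopping)  # True
```

If you only need a yes or no answer, the any() form avoids building a throwaway list; if you want the leftovers, keep the comprehension.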

Game, Sets, Match

Another way to treat this entire issue is by re-examining how we’re treating the two data sets. Instead of lists, we should be treating them as mathematical sets. You may remember these from Discrete Math or Data Structures. If not, a set is an unordered collection of unique elements; unlike a list, it has no ordering and no duplicates. Sets support a handful of simple operations that are fairly consistent across languages: just as a stack in Java is pretty much identical to a stack in Python, sets have a set (heh) of methods that are perfect for problems like these. Below are two wonderful ways to tackle this one:

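# Treating things as a set!
# Option 1: is everything on the list also in the cart?
if not set(grocery_list).issubset(purchased_items):
    print("Haven't finished yet :(")

# Option 2: the set difference is non-empty if anything is missing
if set(grocery_list) - set(purchased_items):
    print("Haven't finished yet :(")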
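For a closer look at what those two checks compute, here is a short sketch (wanted and bought are names I've introduced for illustration):

```python
grocery_list = ['ale', 'vodka', 'bananas', 'more ale', 'bread', 'hummus', 'diet soda']
purchased_items = ['ale', 'vodka', 'bananas', 'bread', 'diet soda']

wanted = set(grocery_list)
bought = set(purchased_items)

print(wanted.issubset(bought))  # False
print(wanted <= bought)         # False, operator form of issubset
print(sorted(wanted - bought))  # ['hummus', 'more ale']

# Caveat: sets drop duplicates, so quantities are lost.
print(set(['ale', 'ale']) == set(['ale']))  # True
```

The caveat matters if the list ever holds the same item twice to mean "buy two": a set sees one entry. Writing 'more ale' as its own item, as the list here does, sidesteps that.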

Concise. Readable. Simple. Sets are great and are pretty much never used enough, despite how easy they are to use (hell, set is even a built-in type, so you don’t need to import anything).

So, in conclusion, we’ve learned:

  • Thinking is important.
  • For loops have else clauses.
  • Empty containers evaluate to false.
  • Readability is important.
  • Sets are cool.

As you can probably imagine, the grocery list metaphor is pretty extensible. No matter what you’re doing as a programmer, you’re going to be dealing with lists and list comparisons at some point. They might be database-driven instead of hard-coded, and you’ll be testing for more sophisticated results than item membership, but it’s still important to take a step back and make sure the way you approach the problem leads to a beautiful result. (And never forget to buy ‘more ale’.)

If you felt like there’s another approach to this generic problem that I’ve missed, shoot me an email and I’ll be sure to add it! And if you like weird/cool Python articles like these, be sure to follow me on Twitter.