Expert Pythonistas Hate This: Five Quirky Things About Python That Will Shock You

At EverQuote, we emphasize Heart as a core value. We use biweekly tech talks as a way to come together across teams and learn the intricacies of our tools. Here are some of the quirky things we learned about Python. We hope you enjoy them as much as we did!

Other folks have gone over some common gotchas, and we hope to build on that in this post.

1. Building a list of lists

Imagine we're building a blackjack engine, and we need to initialize a data structure that tracks multiple decks of cards. We decide on a two-dimensional list. Let's initialize this list by taking advantage of list repetition with the * operator:

>>> decks = [[]] * 5

Now suppose we try to use this initialized structure and add a single card to the first deck.

>>> decks[0].append("queen of hearts")
>>> print(decks)

You might expect this to be:

[['queen of hearts'], [], [], [], []]

But what actually happens is:

[['queen of hearts'], ['queen of hearts'], ['queen of hearts'], ['queen of hearts'], ['queen of hearts']]

That's not what we want! It looks like the card got added to all the decks. Why is that? Well, turns out this is equivalent to:

decks = []
deck = []
for _ in range(5):
    decks.append(deck)

But what we actually wanted was:

decks = []
for _ in range(5):
    deck = []
    decks.append(deck)

The "one obvious way" of doing it:

>>> decks = [[] for _ in range(5)]

By using a list comprehension, we ensure that [] is evaluated separately for each iteration of range(5), creating five separate list objects. [[]] * 5 looks like it should accomplish the same thing, but it actually reuses the same inner list for all five elements of the outer list.

>>> decks[0].append("queen of hearts")
>>> print(decks)
[['queen of hearts'], [], [], [], []]
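To see the difference directly, here is a small sketch that compares object identities for the two initialization styles. The names shared and separate are ours, chosen for illustration.

```python
# Compare object identity for the two initialization styles.
shared = [[]] * 5                   # five references to one inner list
separate = [[] for _ in range(5)]   # five distinct inner lists

# Collect the ids of the inner lists; duplicates collapse in a set.
shared_ids = {id(deck) for deck in shared}
separate_ids = {id(deck) for deck in separate}

print(len(shared_ids))              # 1
print(len(separate_ids))            # 5
print(shared[0] is shared[4])       # True
print(separate[0] is separate[4])   # False
```

A single id in the first case confirms that every slot of the outer list points at the same inner list.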

2. Mutable default arguments

Another place where list object literals might bite you is default arguments for functions and methods. Suppose we want the ability to add cards to a deck, creating a new deck if needed.

def add_card(card, deck=[]):
    deck.append(card)
    return deck

deck1 = add_card("ace of spades")
print(deck1)
deck2 = add_card("two of spades")
print(deck2)

You might expect the output to be:

['ace of spades']
['two of spades']

But what we actually get is:

['ace of spades']
['ace of spades', 'two of spades']

The issue here is that the default value deck=[] is evaluated only once, when the function is defined, not each time it is called. Every call that omits deck therefore appends to the same list object. The idiomatic way to handle this would be to do the following:

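def add_card(card, deck=None):
    # Compare against None explicitly so an empty deck passed by the caller is still used
    if deck is None:
        deck = []
    deck.append(card)
    return deck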
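One way to watch the shared default in action is to inspect the function's __defaults__ attribute. The sketch below uses two hypothetical names, add_card_shared and add_card_fresh, so both versions can sit side by side. The second version tests deck is None, so that an empty list passed in by the caller is still the one that gets the card.

```python
def add_card_shared(card, deck=[]):  # hypothetical name for the buggy version
    deck.append(card)
    return deck

print(add_card_shared.__defaults__)  # ([],)
add_card_shared("ace of spades")
print(add_card_shared.__defaults__)  # (['ace of spades'],)


def add_card_fresh(card, deck=None):  # hypothetical name for the fixed version
    if deck is None:
        deck = []  # a new list is created on every call that omits deck
    deck.append(card)
    return deck

first = add_card_fresh("ace of spades")
second = add_card_fresh("two of spades")
print(first, second)  # ['ace of spades'] ['two of spades']

# An empty list supplied by the caller is used rather than replaced.
my_deck = []
add_card_fresh("queen of hearts", my_deck)
print(my_deck)  # ['queen of hearts']
```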

3. Primitive integer identity

# Entered line by line in the interactive interpreter
a = 256
b = 256
print(a is b)  # True
a = 257
b = 257
print(a is b)  # False?!

This counterintuitive behavior is actually due to an implementation detail of CPython. As a performance optimization, CPython maintains a static global array of Python objects for the integers -5 to 256, inclusive. Other implementations of Python (like PyPy) will not necessarily exhibit this behavior.

In Python, the is operator checks whether two names refer to the same object. That is, a is b is equivalent to id(a) == id(b). CPython uses an object's location in memory as its id. Since the special numbers -5 to 256 always resolve to the same underlying objects, a is b is True for 256. For 257, each assignment can create a separate int object, so a is b can be False even though a == b is True.

One of the primary idiomatic use cases for the is operator is comparison to None. PEP 8 actually recommends this, specifically because the behavior of == can be overridden by the compared operand. We'll come to understand this a little more deeply as we dive into magic methods.
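As a small illustration of why is None is the safer check, here is a sketch with a hypothetical class, AlwaysEqual, whose __eq__ claims equality with everything.

```python
class AlwaysEqual:
    """Hypothetical class that overrides == to always report equality."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

print(value == None)  # True, because __eq__ decides the answer
print(value is None)  # False, because identity cannot be overridden
```

The identity check ignores __eq__ entirely, which is why it gives a trustworthy answer for None.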

4. Late binding closures

A closure is a technique for implementing lexically scoped name binding, allowing nested functions to inherit variables from an enclosing environment. In this example, we have a nested function print_card that uses the free variable card, which is defined in its enclosing function print_deck.

def print_deck():
    printers = []
    deck = ["ace of spades", "two of spades", "three of spades"]
    for card in deck:
        def print_card():
            print(card)
        printers.append(print_card)
    for printer in printers:
        printer()

So this function print_deck creates a list of functions called printers, and then simply loops through each printer function to execute them all. Let's call print_deck.

print_deck()

You might expect output like this:

ace of spades
two of spades
three of spades

What actually happens (annotated with some comments) is this:

three of spades  # ?
three of spades  # ?!
three of spades

Since card is actually defined in print_deck, each printer function has a "lazy" reference to the bound variable card. As print_deck is executed, this variable is reassigned to each element of deck. By the end of the loop, card is assigned the value "three of spades", so when we finally execute all the printers, that's what we get!
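To make the shared reference visible, the sketch below inspects each nested function's __closure__ attribute. The helper build_getters and its inner get_card are hypothetical variants of the example above that return the card instead of printing it.

```python
def build_getters():
    """Hypothetical variant of print_deck that returns the nested functions."""
    getters = []
    deck = ["ace of spades", "two of spades", "three of spades"]
    for card in deck:
        def get_card():
            return card  # looked up when get_card is called, not when defined
        getters.append(get_card)
    return getters


getters = build_getters()

# Every nested function holds the same closure cell for `card`.
print(getters[0].__closure__[0] is getters[2].__closure__[0])  # True

# The cell holds whatever `card` was last assigned in the loop.
print([getter() for getter in getters])
# ['three of spades', 'three of spades', 'three of spades']
```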

One approach to get our desired behavior would be to use default arguments to force an early binding of the variable. card=card looks like an ugly hack since we're defining a new card that is a snapshot of the outer card, and that's a totally valid perspective. People unfamiliar with late binding closures might read this code and see that expression as a no-op, even though it serves the critical role of binding card early.

def print_deck():
    printers = []
    deck = ["ace of spades", "two of spades", "three of spades"]
    for card in deck:
        def print_card(card=card):
            print(card)
        printers.append(print_card)
    for printer in printers:
        printer()
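If the default-argument trick feels too subtle, two other options are a factory function or functools.partial. This is only a sketch of those alternatives; make_printer is a hypothetical helper name.

```python
from functools import partial


def make_printer(card):
    """Hypothetical factory: each call creates a new scope with its own card."""
    def print_card():
        print(card)
    return print_card


deck = ["ace of spades", "two of spades", "three of spades"]

# Option 1: the factory binds card when make_printer is called.
factory_printers = [make_printer(card) for card in deck]

# Option 2: partial stores the argument at creation time.
partial_printers = [partial(print, card) for card in deck]

for printer in factory_printers + partial_printers:
    printer()  # prints the three cards in order, twice
```

Both options make the early binding explicit at the call site, at the cost of a little extra indirection.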

5. Augmented assignment with tuples


Now that we've created our list of decks for our blackjack game, we want to make sure that the decks can't be altered, i.e. ensure that our list is immutable. One way to go about this is to convert it to a tuple. To test that our collection of decks is indeed immutable, we try to add another card using an augmented assignment operator, +=.

decks = tuple([["ace of spades"], ["three of clubs"], ["queen of hearts"]])
decks[2] += ["eight of diamonds"]

You might expect this to throw an exception as tuples are immutable, and you'd be right.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

However, the interpreter will still (maybe unexpectedly) append the card to the nested list. If you print the tuple you'll get:

print(decks)
(['ace of spades'], ['three of clubs'], ['queen of hearts', 'eight of diamonds'])

This quirk shows up when you use an augmented assignment such as += on a mutable object, like a list, that is stored within a tuple.

If you run the disassembler on that line of code by passing it to dis.dis as a string, you will see that INPLACE_ADD runs before STORE_SUBSCR, which evaluates TOS1[TOS] = TOS2. That store fails when TOS1 is a tuple. Since INPLACE_ADD has already mutated the list by then, the list stays mutated even though the exception is raised. Note that opcode names vary between CPython versions.

Example of using the disassembler:

from dis import dis
t = (1,[2,3])
dis("t[1] += [1]")
  1           0 LOAD_NAME                0 (t)
              2 LOAD_CONST               0 (1)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_CONST               0 (1)
             10 BUILD_LIST               1
             12 INPLACE_ADD
             14 ROT_THREE
             16 STORE_SUBSCR
             18 LOAD_CONST               1 (None)
             20 RETURN_VALUE
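The same sequence can be observed without reading bytecode. The sketch below catches the exception, shows that the list was mutated anyway, and then spells out roughly the same steps by hand. The variable names raised and inner are ours.

```python
decks = (["ace of spades"], ["three of clubs"], ["queen of hearts"])

raised = False
try:
    decks[2] += ["eight of diamonds"]
except TypeError as error:
    raised = True
    print(error)  # 'tuple' object does not support item assignment

print(raised)    # True
print(decks[2])  # ['queen of hearts', 'eight of diamonds']

# Roughly the same steps, spelled out by hand:
inner = decks[2]
inner = inner.__iadd__(["nine of clubs"])  # extends the list in place
# decks[2] = inner  # this store is the step that raises TypeError

# Mutating the inner list directly skips the store, so nothing is raised.
decks[1].extend(["four of clubs"])
print(decks)
```

If the goal is decks that truly cannot change, the inner collections would need to be immutable as well, for example tuples of cards rather than lists.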

When we first discovered this, the discussion on Slack was quite entertaining:

Screenshot of a Slack message from Binam Kayastha: "still trying to read the bytecode disassembly to make sure I understand" you all are on a different level, joy emoji

Conclusion

Every language has its quirks. By building a deeper understanding of our tools, we improve our chances of avoiding nasty, difficult-to-debug errors. Python is a powerful language that prioritizes elegance and readability. It expects a lot of the developer, and it does not hold your hand as you write. Armed with the knowledge of how our tools work, we can spend more time creating new value and less time pulling out our hair. At least, that's the philosophy at EverQuote.

If you're interested in learning more fun and interesting things about our tech stack, check out our blog. If you're looking to solve interesting problems with a fun and motivated team, EverQuote is hiring!

And remember to always use == and be mindful of closures! :)