One of my goals for this book has been to teach you as little Python as possible. When there were two ways to do something, I picked one and avoided mentioning the other. Or sometimes I put the second one into an exercise.
Now I want to go back for some of the good bits that got left behind. Python provides a number of features that are not really necessary—you can write good code without them—but with them you can sometimes write code that’s more concise, readable or efficient, and sometimes all three.
We saw conditional statements in Section 8. Conditional statements are often used to choose one of two values; for example:
This statement checks whether x is positive. If so, it computes math.log. If not, math.log would raise a ValueError. To avoid stopping the program, we generate a “NaN”, which is a special floating-point value that represents “Not a Number”.
We can write this statement more concisely using a conditional expression:
You can almost read this line like English: “y gets log-x if x is greater than 0; otherwise it gets NaN”.
Recursive functions can sometimes be rewritten using conditional expressions. For example, here is a recursive version of factorial:
We can rewrite it like this:
Another use of conditional expressions is handling optional arguments. For example, here is the init method from GoodKangaroo (see Exercise 29.1):
We can rewrite this one like this:
In general, you can replace a conditional statement with a conditional expression if both branches contain simple expressions that are either returned or assigned to the same variable.
example, this function takes a list of strings, maps the string method capitalize to the elements, and returns a new list of strings:
We can write this more concisely using a list comprehension:
The bracket operators indicate that we are constructing a new list. The expression inside the brackets specifies the elements of the list, and the for clause indicates what sequence we are traversing.
The syntax of a list comprehension is a little awkward because the loop variable, s in this example, appears in the expression before we get to the definition.
List comprehensions can also be used for filtering. For example, this function selects only the elements of t that are upper case, and returns a new list:
We can rewrite it using a list comprehension
List comprehensions are concise and easy to read, at least for simple expressions. And they are usually faster than the equivalent for loops, sometimes much faster. So if you are mad at me for not mentioning them earlier, I understand.
But, in my defense, list comprehensions are harder to debug because you can’t put a print statement inside the loop. I suggest that you use them only if the computation is simple enough that you are likely to get it right the first time. And for beginners that means never.
Generator expressions are similar to list comprehensions, but with parentheses instead of square brackets:
The result is a generator object that knows how to iterate through a sequence of values. But unlike a list comprehension, it does not compute the values all at once; it waits to be asked. The built-in function next gets the next value from the generator:
When you get to the end of the sequence, next raises a StopIteration exception. You can also use a for loop to iterate through the values:
The generator object keeps track of where it is in the sequence, so the for loop picks up where next left off. Once the generator is exhausted, it continues to raise StopIteration:
Generator expressions are often used with functions like sum, max, and min:
Python provides a built-in function, any, that takes a sequence of boolean values and returns True if any of the values are True. It works on lists:
But it is often used with generator expressions:
That example isn’t very useful because it does the same thing as the in operator. But we could use any to rewrite some of the search functions we wrote in Section 17.3. For example, we could write avoids like this:
The function almost reads like English, “word avoids forbidden if there are not any forbidden letters in word.”
Using any with a generator expression is efficient because it stops immediately if it finds a True value, so it doesn’t have to evaluate the whole sequence.
Python provides another built-in function, all, that returns
True if every element of the sequence is True. As
an exercise, use all to re-write uses_all
from
Section 17.3.
In Section LABEL:dictsub I use dictionaries to find the words that appear in a document but not in a word list. The function I wrote takes d1, which contains the words from the document as keys, and d2, which contains the list of words. It returns a dictionary that contains the keys from d1 that are not in d2.
In all of these dictionaries, the values are None because we never use them. As a result, we waste some storage space.
Python provides another built-in type, called a set, that behaves like a collection of dictionary keys with no values. Adding elements to a set is fast; so is checking membership. And sets provide methods and operators to compute common set operations.
For example, set subtraction is available as a method called difference or as an operator, -. So we can rewrite subtract like this:
The result is a set instead of a dictionary, but for operations like iteration, the behavior is the same.
Some of the exercises in this book can be done concisely and
efficiently with sets. For example, here is a solution to
has_duplicates
, from
Exercise 21.7, that uses a dictionary:
When an element appears for the first time, it is added to the dictionary. If the same element appears again, the function returns True.
Using sets, we can write the same function like this:
An element can only appear in a set once, so if an element in t appears more than once, the set will be smaller than t. If there are no duplicates, the set will be the same size as t.
We can also use sets to do some of the exercises in
Chapter 17. For example, here’s a version of
uses_only
with a loop:
uses_only
checks whether all letters in word are
in available. We can rewrite it like this:
The <=
operator checks whether one set is a subset of another,
including the possibility that they are equal, which is true if all
the letters in word appear in available.
As an exercise, rewrite avoids
using sets.
A Counter is like a set, except that if an element appears more than once, the Counter keeps track of how many times it appears. If you are familiar with the mathematical idea of a multiset, a Counter is a natural way to represent a multiset.
Counter is defined in a standard module called collections, so you have to import it. You can initialize a Counter with a string, list, or anything else that supports iteration:
Counters behave like dictionaries in many ways; they map from each key to the number of times it appears. As in dictionaries, the keys have to be hashable.
Unlike dictionaries, Counters don’t raise an exception if you access an element that doesn’t appear. Instead, they return 0:
We can use Counters to rewrite is_anagram
from
Exercise 21.6:
If two words are anagrams, they contain the same letters with the same counts, so their Counters are equivalent.
Counters provide methods and operators to perform set-like operations,
including addition, subtraction, union and intersection. And
they provide an often-useful method, most_common
, which
returns a list of value-frequency pairs, sorted from most common to
least:
The collections module also provides defaultdict, which is like a dictionary except that if you access a key that doesn’t exist, it can generate a new value on the fly.
When you create a defaultdict, you provide a function that’s used to create new values. A function used to create objects is sometimes called a factory. The built-in functions that create lists, sets, and other types can be used as factories:
Notice that the argument is list, which is a class object, not list(), which is a new list. The function you provide doesn’t get called unless you access a key that doesn’t exist.
The new list, which we’re calling t, is also added to the dictionary. So if we modify t, the change appears in d:
If you are making a dictionary of lists, you can often write simpler code using defaultdict. In my solution to Exercise LABEL:anagrams, which you can get from http://thinkpython2.com/code/anagram_sets.py, I make a dictionary that maps from a sorted string of letters to the list of words that can be spelled with those letters. For example, ’opst’ maps to the list [’opts’, ’post’, ’pots’, ’spot’, ’stop’, ’tops’].
Here’s the original code:
This can be simplified using setdefault, which you might have used in Exercise 25.2:
This solution has the drawback that it makes a new list every time, regardless of whether it is needed. For lists, that’s no big deal, but if the factory function is complicated, it might be.
We can avoid this problem and simplify the code using a defaultdict:
Many simple objects are basically collections of related values. For example, the Point object defined in Chapter 27 contains two numbers, x and y. When you define a class like this, you usually start with an init method and a str method:
This is a lot of code to convey a small amount of information. Python provides a more concise way to say the same thing:
The first argument is the name of the class you want to create. The second is a list of the attributes Point objects should have, as strings. The return value from namedtuple is a class object:
Point automatically provides methods like __init__
and
__str__
so you don’t have to write them.
To create a Point object, you use the Point class as a function:
The init method assigns the arguments to attributes using the names you provided. The str method prints a representation of the Point object and its attributes.
You can access the elements of the named tuple by name:
But you can also treat a named tuple as a tuple:
Named tuples provide a quick way to define simple classes. The drawback is that simple classes don’t always stay simple. You might decide later that you want to add methods to a named tuple. In that case, you could define a new class that inherits from the named tuple:
Or you could switch to a conventional class definition.
In Section LABEL:gather, we saw how to write a function that gathers its arguments into a tuple:
You can call this function with any number of positional arguments (that is, arguments that don’t have keywords):
But the * operator doesn’t gather keyword arguments:
To gather keyword arguments, you can use the ** operator:
You can call the keyword gathering parameter anything you want, but kwargs is a common choice. The result is a dictionary that maps from keywords to values:
If you have a dictionary of keywords and values, you can use the scatter operator, ** to call a function:
Without the scatter operator, the function would treat d as a single positional argument, so it would assign d to x and complain because there’s nothing to assign to y:
When you are working with functions that have a large number of parameters, it is often useful to create and pass around dictionaries that specify frequently used options.
An expression that has one of two values, depending on a condition.
An expression with a for loop in square brackets that yields a new list.
An expression with a for loop in parentheses that yields a generator object.
A mathematical entity that represents a mapping between the elements of a set and the number of times they appear.
A function, usually passed as a parameter, used to create objects.
The following is a function that computes the binomial coefficient recursively.
Rewrite the body of the function using nested conditional expressions.
One note: this function is not very efficient because it ends up computing the same values over and over. You could make it more efficient by memoizing (see Section 25). But you will find that it’s harder to memoize if you write it using conditional expressions.