Think Python for CS114 22 The in operator 24 Looping through dictionaries

Chapter 23 Dictionaries

This chapter presents another built-in type called a dictionary. Dictionaries are one of Python’s best features; they are the building blocks of many efficient and elegant algorithms.

23.1 A dictionary is a mapping

A dictionary is like a list, but more general. In a list, the indices have to be integers; in a dictionary they can be (almost) any type.

A dictionary contains a collection of indices, which are called keys, and a collection of values. Each key is associated with a single value. The association of a key and a value is called a key-value pair or sometimes an item.

In mathematical language, a dictionary represents a mapping from keys to values, so you can also say that each key “maps to” a value. As an example, we’ll build a dictionary that maps from English to Spanish words, so the keys and the values are all strings.

⬇

>>> eng2sp = {}

>>> eng2sp

{}

The squiggly-brackets, {}, represent an empty dictionary. To add items to the dictionary, you can use square brackets:

⬇

>>> eng2sp['one'] = 'uno'

This line creates an item that maps from the key ’one’ to the value ’uno’. If we print the dictionary again, we see a key-value pair with a colon between the key and value:

⬇

>>> eng2sp

{'one': 'uno'}

This output format is also an input format. For example, you can create a new dictionary with three items:

⬇

>>> eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

Unlike lists, when working with dictionaries there is no direct way to ask for values by index. That is, you cannot ask for “the $0$ -th item” or “the $n$ -th item”. Instead, you use the keys to look up the corresponding values:

⬇

>>> eng2sp['two']

'dos'

The key ’two’ always maps to the value ’dos’ so the order of the items doesn’t matter.

If the key isn’t in the dictionary, you get an exception:

⬇

>>> eng2sp['four']

KeyError: 'four'

The len function works on dictionaries; it returns the number of key-value pairs:

⬇

>>> len(eng2sp)

The in operator works on dictionaries, too; it tells you whether something appears as a key in the dictionary (appearing as a value is not good enough).

⬇

>>> 'one' in eng2sp

True

>>> 'uno' in eng2sp

False

23.2 Looping to create dictionaries

Suppose you are given a string and you want to count how many times each letter appears. There are several ways you could do it:

1.

You could create 26 variables, one for each letter of the alphabet. Then you could traverse the string and, for each character, increment the corresponding counter, probably using a chained conditional.
2.

You could create a list with 26 elements. Then you could convert each character to a number (using the built-in function ord), use the number as an index into the list, and increment the appropriate counter.
3.

You could create a dictionary with characters as keys and counters as the corresponding values. The first time you see a character, you would add an item to the dictionary. After that you would increment the value of an existing item.

Each of these options performs the same computation, but each of them implements that computation in a different way.

An implementation is a way of performing a computation; some implementations are better than others. For example, an advantage of the dictionary implementation is that we don’t have to know ahead of time which letters appear in the string and we only have to make room for the letters that do appear.

Here is what the code might look like:

⬇

def histogram(s: str) -> dict[str, int]:

d = {}

for c in s:

if c not in d:

d[c] = 1

else:

d[c] += 1

return d

The name of the function is histogram, which is a statistical term for a collection of counters (or frequencies).

The first line of the function creates an empty dictionary. The for loop traverses the string. Each time through the loop, if the character c is not in the dictionary, we create a new item with key c and the initial value 1 (since we have seen this letter once). If c is already in the dictionary we increment d[c].

Here’s how it works:

⬇

>>> h = histogram('brontosaurus')

>>> h

{'b': 1, 'r': 2, 'o': 2, 'n': 1, 't': 1, 's': 2, 'a': 1, 'u': 2}

The histogram indicates that the letters ’a’ and ’b’ appear once; ’o’ appears twice, and so on.