Think Python for CS114

Chapter 14 Strings

Strings are not like integers, floats, and booleans. A string is a sequence, which means it is an ordered collection of other values. In this chapter you’ll see how to access the characters that make up a string, and you’ll learn about some of the methods strings provide.

14.1 A string is a sequence

A string is a sequence of characters. You can access the characters one at a time with the bracket operator:

>>> fruit = 'banana'
>>> letter = fruit[1]

The second statement selects character number 1 from fruit and assigns it to letter.

The expression in brackets is called an index. The index indicates which character in the sequence you want (hence the name).

But you might not get what you expect:

>>> letter
'a'

For most people, the first letter of ’banana’ is b, not a. But for computer scientists, the index is an offset from the beginning of the string, and the offset of the first letter is zero.

>>> letter = fruit[0]
>>> letter
'b'

So b is the 0th letter (“zero-eth”) of ’banana’, a is the 1th letter (“one-eth”), and n is the 2th letter (“two-eth”).

As an index you can use an expression that contains variables and operators:

>>> i = 1
>>> fruit[i]
'a'
>>> fruit[i+1]
'n'

But the value of the index has to be an integer. Otherwise you get:

>>> letter = fruit[1.5]
TypeError: string indices must be integers

14.2 len

len is a built-in function that returns the number of characters in a string:

>>> fruit = 'banana'
>>> len(fruit)
6

To get the last letter of a string, you might be tempted to try something like this:

>>> length = len(fruit)
>>> last = fruit[length]
IndexError: string index out of range

The reason for the IndexError is that there is no letter in ’banana’ with the index 6. Since we started counting at zero, the six letters are numbered 0 to 5. To get the last character, you have to subtract 1 from length:

>>> last = fruit[length-1]
>>> last
'a'

Or you can use negative indices, which count backward from the end of the string. The expression fruit[-1] yields the last letter, fruit[-2] yields the second to last, and so on.

14.3 String slices

A segment of a string is called a slice. Selecting a slice is similar to selecting a character:

>>> s = 'Monty Python'
>>> s[0:5]
'Monty'
>>> s[6:12]
'Python'

The operator [n:m] returns the part of the string from the “n-eth” character to the “m-eth” character, including the first but excluding the last. This behavior is counterintuitive, but it might help to imagine the indices pointing between the characters, as in Figure 14.1.

Slice indices.

Figure 14.1: Slice indices.

If you omit the first index (before the colon), the slice starts at the beginning of the string. If you omit the second index, the slice goes to the end of the string:

>>> fruit = 'banana'
>>> fruit[:3]
'ban'
>>> fruit[3:]
'ana'

If the first index is greater than or equal to the second the result is an empty string, represented by two quotation marks:

>>> fruit = 'banana'
>>> fruit[3:3]
''

An empty string contains no characters and has length 0, but other than that, it is the same as any other string.

Continuing this example, what do you think fruit[:] means? Try it and see.