String formatting

We've seen that strings support several methods.

    count   lower   rstrip  strip
    join    lstrip  split   upper

Today we'll examine another method for strings, named format. This method is particularly complex, but it is very useful.

First, a bit of review. Recall the code below, which displays the multiplication table shown after it.

    for m in range(1, 5):
        line = ''
        for n in range(1, 5):
            line = line + ' ' + str(m * n)
        print(line)

     1 2 3 4
     2 4 6 8
     3 6 9 12
     4 8 12 16

The output is rather ugly, however: You'd expect the final column — the final 4 in the first row, the 8 in the second, the 12 in the third, and the 16 in the fourth — to line up neatly, but instead the column veers rightward. This is a case where format is particularly useful: formatting output for display in a table.

The format method creates a new string. It is called on a string that serves as a “template” for the string to be produced. Let's start with an example.

    template = 'square of {0:3d} is {1:6d}'
    result = template.format(5, 5 ** 2)

As you can see, the format method takes some arguments (5 and 5² in this example); the method's job is to create a string based on the template that incorporates these arguments. In the resulting string, the template is reproduced character-for-character, except that any set of braces leads to including an argument in place of that braced group. In this example, the template contains the string “{0:3d}”. In between the braces are two portions separated by a colon.

The first portion, before the colon, identifies which argument to format should be substituted. We count the arguments from 0, so 5 is argument 0 while 5² = 25 is argument 1. In the case of “{0:3d}”, we're identifying that we want the 0th argument, so the result will incorporate 5 in place of the “{0:3d}”.
The second portion, following the colon, explains how to format that argument. In this case, the 3 specifies how many characters should be produced in the result, and the d specifies that the argument should be included as a decimal integer (i.e., base 10, as opposed to binary). Of course, the decimal number 5 just takes one character, though the template says to take three, so format will add an extra two spaces before it. In place of “{0:3d}”, then, format would write “  5” (two spaces followed by 5) into the result.

When format reaches “{1:6d}”, it will place argument 1, which is 25, into six characters of the result, i.e., “    25” (four spaces followed by 25). Overall, then, the value of result is the string “square of   5 is     25”.
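To check the walkthrough above, here is a small sketch that runs the same template and also tries each braced group on its own. The names first_field and second_field are introduced here only for illustration; repr is used so that the padding spaces are visible.

```python
# Same template and arguments as in the walkthrough.
template = 'square of {0:3d} is {1:6d}'
result = template.format(5, 5 ** 2)

# Each braced group tried by itself (illustrative names, not from the text).
first_field = '{0:3d}'.format(5)        # 5 padded on the left to width 3
second_field = '{0:6d}'.format(5 ** 2)  # 25 padded on the left to width 6

print(repr(first_field))   # '  5'
print(repr(second_field))  # '    25'
print(repr(result))        # 'square of   5 is     25'
```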

Now let's go back to our multiplication table. In this case, we want each product to be written across two characters so that the columns line up neatly. We can do this using “{0:2d}” as our template, as in “'{0:2d}'.format(m * n)”. The following illustrates the complete program, along with the resulting display.

    for m in range(1, 5):
        line = ''
        for n in range(1, 5):
            line = line + ' ' + '{0:2d}'.format(m * n)
        print(line)

      1  2  3  4
      2  4  6  8
      3  6  9 12
      4  8 12 16

That second assignment statement is a bit long; you might instead prefer a shorter version:

line = '{0} {1:2d}'.format(line, m * n)

Notice that in this case we have a formatting specifier “{0}”, which doesn't have a colon as before. The 0 identifies which argument to paste in its place — the value of line in this case, which is incorporated into the result without any attempt at formatting.
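For reference, here is a sketch of the whole program with the shorter assignment substituted in. The list rows is a hypothetical addition, kept only so that the result can be inspected afterward; it is not needed for the display itself.

```python
rows = []  # hypothetical helper list, not part of the original program

for m in range(1, 5):
    line = ''
    for n in range(1, 5):
        # {0} pastes in the line built so far; {1:2d} adds the product
        # right-justified across two characters.
        line = '{0} {1:2d}'.format(line, m * n)
    rows.append(line)
    print(line)
```

This should display the same aligned table as the longer version, since both build each line by appending a space and a two-character product.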

The format method accepts a very broad variety of formatting specifiers, covering nearly anything you might imagine: inserting thousands separators (as in 1,024), always including a positive/negative sign, and changing the justification (so numbers are left-justified rather than the default of right-justified). Few people bother to remember all the options. But here are four of the most valuable, which are worth remembering, using capital letters to denote variables.

“{ARG}” (as we've seen in “{0}”) incorporates the value found in the ARGth argument to format into the resulting string.
“{ARG:WIDd}” (as we've seen in “{1:6d}”) incorporates the value found in the ARGth argument to format into the resulting string, writing that argument as a base-10 decimal integer across WID characters. If the integer doesn't require WID characters, spaces are added to the front.
“{ARG:WIDs}” (as in “{0:6s}”) incorporates the value found in the ARGth argument to format into the resulting string, writing that argument as a string across WID characters. If the string doesn't require WID characters, spaces are added to the end.
“{ARG:WID.PREf}” (as in “{1:6.3f}”) incorporates the value found in the ARGth argument to format into the resulting string, writing that argument as a floating-point number across WID characters using PRE digits after the decimal point. If the number doesn't require WID characters, spaces are added to the front.
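The four forms above can be tried one at a time. The following is a minimal sketch; the variable names and sample values ('pi', 25, 3.14159) are chosen here purely for illustration.

```python
plain = '{0}'.format('pi')            # no formatting at all
as_int = '{0:6d}'.format(25)          # integer, width 6, padded in front
as_str = '{0:6s}'.format('pi')        # string, width 6, padded at the end
as_float = '{0:6.3f}'.format(3.14159) # float, width 6, 3 digits after the point

# repr shows the quotes, so the added spaces are easy to see.
print(repr(plain))     # 'pi'
print(repr(as_int))    # '    25'
print(repr(as_str))    # 'pi    '
print(repr(as_float))  # ' 3.142'
```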

The following little program illustrates the two new types.

    words = ['banana', 'apple', 'watermelon']
    for w in words:
        print('{0:10s} {1:2d} {2:6.3f}'.format(w, len(w), len(w) ** 0.5))

The output of this program:

    banana      6  2.449
    apple       5  2.236
    watermelon 10  3.162

As you can see, each word w is displayed left-justified across 10 characters, followed by a space and two characters devoted to w's length, followed by a space and six characters devoted to the square root of that length. The first column is left-justified while the others are right-justified, since format automatically left-justifies strings within the requested width but right-justifies numbers. The last column has three digits to the right of each decimal point, as the formatting template indicates.
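The default justification can be overridden, and the other options mentioned earlier (thousands separators and an explicit sign) work in a similar way. The sketch below uses the standard format-specifier characters <, >, the comma, and +; the variable names are illustrative only.

```python
left_number = '{0:<6d}'.format(25)       # < forces left justification
right_word = '{0:>10s}'.format('apple')  # > forces right justification
with_commas = '{0:,d}'.format(1024)      # comma inserts thousands separators
with_sign = '{0:+d}'.format(25)          # + always shows the sign

print(repr(left_number))  # '25    '
print(repr(right_word))   # '     apple'
print(repr(with_commas))  # '1,024'
print(repr(with_sign))    # '+25'
```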