:mod:`string` --- Common string operations
==========================================

.. module:: string
   :synopsis: Common string operations.


.. index:: module: re

**Source code:** :source:`Lib/string.py`

--------------

The :mod:`string` module contains a number of useful constants and
classes, as well as some deprecated legacy functions that are also
available as methods on strings. In addition, Python's built-in string
classes support the sequence type methods described in the
:ref:`typesseq` section, and also the string-specific methods described
in the :ref:`string-methods` section. To output formatted strings use
template strings or the ``%`` operator described in the
:ref:`string-formatting` section. Also, see the :mod:`re` module for
string functions based on regular expressions.

String constants
----------------

The constants defined in this module are:


.. data:: ascii_letters

   The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
   constants described below. This value is not locale-dependent.


.. data:: ascii_lowercase

   The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not
   locale-dependent and will not change.


.. data:: ascii_uppercase

   The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not
   locale-dependent and will not change.


.. data:: digits

   The string ``'0123456789'``.


.. data:: hexdigits

   The string ``'0123456789abcdefABCDEF'``.


.. data:: letters

   The concatenation of the strings :const:`lowercase` and :const:`uppercase`
   described below. The specific value is locale-dependent, and will be updated
   when :func:`locale.setlocale` is called.


.. data:: lowercase

   A string containing all the characters that are considered lowercase letters.
   On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``. The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: octdigits

   The string ``'01234567'``.


.. data:: punctuation

   String of ASCII characters which are considered punctuation characters in the
   ``C`` locale.


.. data:: printable

   String of characters which are considered printable. This is a combination of
   :const:`digits`, :const:`letters`, :const:`punctuation`, and
   :const:`whitespace`.


.. data:: uppercase

   A string containing all the characters that are considered uppercase letters.
   On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: whitespace

   A string containing all characters that are considered whitespace. On most
   systems this includes the characters space, tab, linefeed, return, formfeed, and
   vertical tab.


.. _new-string-formatting:

Custom String Formatting
------------------------

.. versionadded:: 2.6

The built-in :class:`str` and :class:`unicode` classes provide the ability
to do complex variable substitutions and value formatting via the
:meth:`str.format` method described in :pep:`3101`. The :class:`Formatter`
class in the :mod:`string` module allows you to create and customize your own
string formatting behaviors using the same implementation as the built-in
:meth:`~str.format` method.

.. class:: Formatter

   The :class:`Formatter` class has the following public methods:

   .. method:: format(format_string, *args, **kwargs)

      The primary API method. It takes a format string and
      an arbitrary set of positional and keyword arguments.
      It is just a wrapper that calls :meth:`vformat`.

   .. method:: vformat(format_string, args, kwargs)

      This method does the actual work of formatting. It is exposed as a
      separate method for cases where you want to pass in a predefined
      dictionary of arguments, rather than unpacking and repacking the
      dictionary as individual arguments using the ``*args`` and ``**kwargs``
      syntax. :meth:`vformat` does the work of breaking up the format string
      into character data and replacement fields. It calls the various
      methods described below.

   In addition, the :class:`Formatter` defines a number of methods that are
   intended to be replaced by subclasses:

   .. method:: parse(format_string)

      Loop over *format_string* and return an iterable of tuples
      (*literal_text*, *field_name*, *format_spec*, *conversion*). This is used
      by :meth:`vformat` to break the string into either literal text, or
      replacement fields.

      The values in the tuple conceptually represent a span of literal text
      followed by a single replacement field. If there is no literal text
      (which can happen if two replacement fields occur consecutively), then
      *literal_text* will be a zero-length string. If there is no replacement
      field, then the values of *field_name*, *format_spec* and *conversion*
      will be ``None``.

   .. method:: get_field(field_name, args, kwargs)

      Given *field_name* as returned by :meth:`parse` (see above), convert it to
      an object to be formatted. Returns a tuple ``(obj, used_key)``. The default
      version takes strings of the form defined in :pep:`3101`, such as
      ``"0[name]"`` or ``"label.title"``. *args* and *kwargs* are as passed in to
      :meth:`vformat`. The return value *used_key* has the same meaning as the
      *key* parameter to :meth:`get_value`.

   .. method:: get_value(key, args, kwargs)

      Retrieve a given field value. The *key* argument will be either an
      integer or a string. If it is an integer, it represents the index of the
      positional argument in *args*; if it is a string, then it represents a
      named argument in *kwargs*.

      The *args* parameter is set to the list of positional arguments to
      :meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
      keyword arguments.

      For compound field names, these functions are only called for the first
      component of the field name; subsequent components are handled through
      normal attribute and indexing operations.

      So for example, the field expression ``'0.name'`` would cause
      :meth:`get_value` to be called with a *key* argument of 0. The ``name``
      attribute will be looked up after :meth:`get_value` returns by calling the
      built-in :func:`getattr` function.

      If the index or keyword refers to an item that does not exist, then an
      :exc:`IndexError` or :exc:`KeyError` should be raised.

   .. method:: check_unused_args(used_args, args, kwargs)

      Implement checking for unused arguments if desired. The arguments to this
      method are the set of all argument keys that were actually referred to in
      the format string (integers for positional arguments, and strings for
      named arguments), and references to the *args* and *kwargs* that were
      passed to :meth:`vformat`. The set of unused args can be calculated from
      these parameters. :meth:`check_unused_args` is assumed to raise an
      exception if the check fails.

   .. method:: format_field(value, format_spec)

      :meth:`format_field` simply calls the global :func:`format` built-in. The
      method is provided so that subclasses can override it.

   .. method:: convert_field(value, conversion)

      Converts the value (returned by :meth:`get_field`) given a conversion type
      (as in the tuple returned by the :meth:`parse` method). The default
      version understands the ``'s'`` (:func:`str`) and ``'r'`` (:func:`repr`)
      conversion types.


.. _formatstrings:

Format String Syntax
--------------------

The :meth:`str.format` method and the :class:`Formatter` class share the same
syntax for format strings (although in the case of :class:`Formatter`,
subclasses can define their own format string syntax).

Format strings contain "replacement fields" surrounded by curly braces ``{}``.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output. If you need to include a brace character in the
literal text, it can be escaped by doubling: ``{{`` and ``}}``.

The grammar for a replacement field is as follows:

   .. productionlist:: sf
      replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
      field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
      arg_name: [`identifier` | `integer`]
      attribute_name: `identifier`
      element_index: `integer` | `index_string`
      index_string: <any source character except "]"> +
      conversion: "r" | "s"
      format_spec: <described in the next section>

In less formal terms, the replacement field can start with a *field_name* that specifies
the object whose value is to be formatted and inserted
into the output instead of the replacement field.
The *field_name* is optionally followed by a *conversion* field, which is
preceded by an exclamation point ``'!'``, and a *format_spec*, which is preceded
by a colon ``':'``. These specify a non-default format for the replacement value.

See also the :ref:`formatspec` section.

The *field_name* itself begins with an *arg_name* that is either a number or a
keyword. If it's a number, it refers to a positional argument, and if it's a keyword,
it refers to a named keyword argument. If the numerical arg_names in a format string
are 0, 1, 2, ... in sequence, they can all be omitted (not just some)
and the numbers 0, 1, 2, ... will be automatically inserted in that order.
Because *arg_name* is not quote-delimited, it is not possible to specify arbitrary
dictionary keys (e.g., the strings ``'10'`` or ``':-]'``) within a format string.
The *arg_name* can be followed by any number of index or
attribute expressions. An expression of the form ``'.name'`` selects the named
attribute using :func:`getattr`, while an expression of the form ``'[index]'``
does an index lookup using :meth:`__getitem__`.

.. versionchanged:: 2.7
   The positional argument specifiers can be omitted, so ``'{} {}'`` is
   equivalent to ``'{0} {1}'``.

Some simple format string examples::

   "First, thou shalt count to {0}"  # References first positional argument
   "Bring me a {}"                   # Implicitly references the first positional argument
   "From {} to {}"                   # Same as "From {0} to {1}"
   "My quest is {name}"              # References keyword argument 'name'
   "Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
   "Units destroyed: {players[0]}"   # First element of keyword argument 'players'.

The *conversion* field causes a type coercion before formatting. Normally, the
job of formatting a value is done by the :meth:`__format__` method of the value
itself. However, in some cases it is desirable to force a type to be formatted
as a string, overriding its own definition of formatting. By converting the
value to a string before calling :meth:`__format__`, the normal formatting logic
is bypassed.

Two conversion flags are currently supported: ``'!s'`` which calls :func:`str`
on the value, and ``'!r'`` which calls :func:`repr`.
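
The field-name and conversion rules described above can be checked directly.
The following is a minimal sketch, not part of the module itself: the
``Swallow`` class and every variable name are hypothetical and exist only to
give the format strings something to look up.

```python
class Swallow(object):
    """Hypothetical class, used only to demonstrate attribute lookup."""

    def __init__(self, weight):
        self.weight = weight

    def __str__(self):
        return 'a swallow'

    def __repr__(self):
        return 'Swallow(weight={0!r})'.format(self.weight)


bird = Swallow(0.02)

# arg_name is a number: positional argument.
by_position = '{0} and {1}'.format('spam', 'eggs')
# '.name' after the arg_name: attribute lookup via getattr().
by_attribute = 'Weight in tons {0.weight}'.format(bird)
# '[index]' after the arg_name: item lookup via __getitem__().
by_index = 'Units destroyed: {players[0]}'.format(players=[7, 3])
# A non-numeric index is used as a string key; it is not quoted.
by_key = 'Name: {person[name]}'.format(person={'name': 'Arthur'})
# Conversion flags: '!s' calls str(), '!r' calls repr().
as_str = '{0!s}'.format(bird)
as_repr = '{0!r}'.format(bird)
```

Note that ``{person[name]}`` looks up the string key ``'name'`` without any
quoting, which is why keys such as ``'10'`` or ``':-]'`` cannot be expressed.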

Some examples::

   "Harold's a clever {0!s}"        # Calls str() on the argument first
   "Bring out the holy {name!r}"    # Calls repr() on the argument first

The *format_spec* field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on. Each value type can define its own "formatting
mini-language" or interpretation of the *format_spec*.

Most built-in types support a common formatting mini-language, which is
described in the next section.

A *format_spec* field can also include nested replacement fields within it.
These nested replacement fields may contain a field name, conversion flag
and format specification, but deeper nesting is
not allowed. The replacement fields within the
format_spec are substituted before the *format_spec* string is interpreted.
This allows the formatting of a value to be dynamically specified.

See the :ref:`formatexamples` section for some examples.


.. _formatspec:

Format Specification Mini-Language
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see
:ref:`formatstrings`). They can also be passed directly to the built-in
:func:`format` function. Each formattable type may define how the format
specification is to be interpreted.

Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.

A general convention is that an empty format string (``""``) produces
the same result as if you had called :func:`str` on the value. A
non-empty format string typically modifies the result.

The general form of a *standard format specifier* is:

.. productionlist:: sf
   format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`]
   fill: <any character>
   align: "<" | ">" | "=" | "^"
   sign: "+" | "-" | " "
   width: `integer`
   precision: `integer`
   type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

If a valid *align* value is specified, it can be preceded by a *fill*
character that can be any character and defaults to a space if omitted.
It is not possible to use a literal curly brace ("``{``" or "``}``") as
the *fill* character when using the :meth:`str.format`
method. However, it is possible to insert a curly brace
with a nested replacement field. This limitation doesn't
affect the :func:`format` function.

The meaning of the various alignment options is as follows:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'<'`` | Forces the field to be left-aligned within the available |
   |         | space (this is the default for most objects).            |
   +---------+----------------------------------------------------------+
   | ``'>'`` | Forces the field to be right-aligned within the          |
   |         | available space (this is the default for numbers).       |
   +---------+----------------------------------------------------------+
   | ``'='`` | Forces the padding to be placed after the sign (if any)  |
   |         | but before the digits. This is used for printing fields  |
   |         | in the form '+000000120'. This alignment option is only  |
   |         | valid for numeric types. It becomes the default when '0' |
   |         | immediately precedes the field width.                    |
   +---------+----------------------------------------------------------+
   | ``'^'`` | Forces the field to be centered within the available     |
   |         | space.                                                   |
   +---------+----------------------------------------------------------+

Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.

The *sign* option is only valid for number types, and can be one of the
following:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'+'`` | indicates that a sign should be used for both            |
   |         | positive as well as negative numbers.                    |
   +---------+----------------------------------------------------------+
   | ``'-'`` | indicates that a sign should be used only for negative   |
   |         | numbers (this is the default behavior).                  |
   +---------+----------------------------------------------------------+
   | space   | indicates that a leading space should be used on         |
   |         | positive numbers, and a minus sign on negative numbers.  |
   +---------+----------------------------------------------------------+

The ``'#'`` option is only valid for integers, and only for binary, octal, or
hexadecimal output. If present, it specifies that the output will be prefixed
by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.

The ``','`` option signals the use of a comma for a thousands separator.
For a locale aware separator, use the ``'n'`` integer presentation type
instead.

.. versionchanged:: 2.7
   Added the ``','`` option (see also :pep:`378`).

*width* is a decimal integer defining the minimum field width. If not
specified, then the field width will be determined by the content.

When no explicit alignment is given, preceding the *width* field by a zero
(``'0'``) character enables
sign-aware zero-padding for numeric types.
This is equivalent to a *fill*
character of ``'0'`` with an *alignment* type of ``'='``.

The *precision* is a decimal number indicating how many digits should be
displayed after the decimal point for a floating point value formatted with
``'f'`` and ``'F'``, or before and after the decimal point for a floating point
value formatted with ``'g'`` or ``'G'``. For non-number types the field
indicates the maximum field size - in other words, how many characters will be
used from the field content. The *precision* is not allowed for integer values.

Finally, the *type* determines how the data should be presented.

The available string presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'s'`` | String format. This is the default type for strings and  |
   |         | may be omitted.                                          |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'s'``.                                     |
   +---------+----------------------------------------------------------+

The available integer presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'b'`` | Binary format. Outputs the number in base 2.             |
   +---------+----------------------------------------------------------+
   | ``'c'`` | Character. Converts the integer to the corresponding     |
   |         | Unicode character before printing.                       |
   +---------+----------------------------------------------------------+
   | ``'d'`` | Decimal Integer. Outputs the number in base 10.          |
   +---------+----------------------------------------------------------+
   | ``'o'`` | Octal format. Outputs the number in base 8.              |
   +---------+----------------------------------------------------------+
   | ``'x'`` | Hex format. Outputs the number in base 16, using lower-  |
   |         | case letters for the digits above 9.                     |
   +---------+----------------------------------------------------------+
   | ``'X'`` | Hex format. Outputs the number in base 16, using upper-  |
   |         | case letters for the digits above 9.                     |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'d'``.                                     |
   +---------+----------------------------------------------------------+

In addition to the above presentation types, integers can be formatted
with the floating point presentation types listed below (except
``'n'`` and ``None``). When doing so, :func:`float` is used to convert the
integer to a floating point number before formatting.

The available presentation types for floating point and decimal values are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'e'`` | Exponent notation. Prints the number in scientific       |
   |         | notation using the letter 'e' to indicate the exponent.  |
   |         | The default precision is ``6``.                          |
   +---------+----------------------------------------------------------+
   | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an     |
   |         | upper case 'E' as the separator character.               |
   +---------+----------------------------------------------------------+
   | ``'f'`` | Fixed point. Displays the number as a fixed-point        |
   |         | number. The default precision is ``6``.                  |
   +---------+----------------------------------------------------------+
   | ``'F'`` | Fixed point. Same as ``'f'``.                            |
   +---------+----------------------------------------------------------+
   | ``'g'`` | General format. For a given precision ``p >= 1``,        |
   |         | this rounds the number to ``p`` significant digits and   |
   |         | then formats the result in either fixed-point format     |
   |         | or in scientific notation, depending on its magnitude.   |
   |         |                                                          |
   |         | The precise rules are as follows: suppose that the       |
   |         | result formatted with presentation type ``'e'`` and      |
   |         | precision ``p-1`` would have exponent ``exp``. Then      |
   |         | if ``-4 <= exp < p``, the number is formatted            |
   |         | with presentation type ``'f'`` and precision             |
   |         | ``p-1-exp``. Otherwise, the number is formatted          |
   |         | with presentation type ``'e'`` and precision ``p-1``.    |
   |         | In both cases insignificant trailing zeros are removed   |
   |         | from the significand, and the decimal point is also      |
   |         | removed if there are no remaining digits following it.   |
   |         |                                                          |
   |         | Positive and negative infinity, positive and negative    |
   |         | zero, and nans, are formatted as ``inf``, ``-inf``,      |
   |         | ``0``, ``-0`` and ``nan`` respectively, regardless of    |
   |         | the precision.                                           |
   |         |                                                          |
   |         | A precision of ``0`` is treated as equivalent to a       |
   |         | precision of ``1``. The default precision is ``6``.      |
   +---------+----------------------------------------------------------+
   | ``'G'`` | General format. Same as ``'g'`` except switches to       |
   |         | ``'E'`` if the number gets too large. The                |
   |         | representations of infinity and NaN are uppercased, too. |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | ``'%'`` | Percentage. Multiplies the number by 100 and displays    |
   |         | in fixed (``'f'``) format, followed by a percent sign.   |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'g'``.                                     |
   +---------+----------------------------------------------------------+



.. _formatexamples:

Format examples
^^^^^^^^^^^^^^^

This section contains examples of the :meth:`str.format` syntax and
comparison with the old ``%``-formatting.

In most of the cases the syntax is similar to the old ``%``-formatting, with the
addition of the ``{}`` and with ``:`` used instead of ``%``.
For example, ``'%03.2f'`` can be translated to ``'{:03.2f}'``.

The new format syntax also supports new and different options, shown in the
following examples.

Accessing arguments by position::

   >>> '{0}, {1}, {2}'.format('a', 'b', 'c')
   'a, b, c'
   >>> '{}, {}, {}'.format('a', 'b', 'c')  # 2.7+ only
   'a, b, c'
   >>> '{2}, {1}, {0}'.format('a', 'b', 'c')
   'c, b, a'
   >>> '{2}, {1}, {0}'.format(*'abc')      # unpacking argument sequence
   'c, b, a'
   >>> '{0}{1}{0}'.format('abra', 'cad')   # arguments' indices can be repeated
   'abracadabra'

Accessing arguments by name::

   >>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
   'Coordinates: 37.24N, -115.81W'
   >>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
   >>> 'Coordinates: {latitude}, {longitude}'.format(**coord)
   'Coordinates: 37.24N, -115.81W'

Accessing arguments' attributes::

   >>> c = 3-5j
   >>> ('The complex number {0} is formed from the real part {0.real} '
   ...  'and the imaginary part {0.imag}.').format(c)
   'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'
   >>> class Point(object):
   ...     def __init__(self, x, y):
   ...         self.x, self.y = x, y
   ...     def __str__(self):
   ...         return 'Point({self.x}, {self.y})'.format(self=self)
   ...
   >>> str(Point(4, 2))
   'Point(4, 2)'


Accessing arguments' items::

   >>> coord = (3, 5)
   >>> 'X: {0[0]};  Y: {0[1]}'.format(coord)
   'X: 3;  Y: 5'

Replacing ``%s`` and ``%r``::

   >>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
   "repr() shows quotes: 'test1'; str() doesn't: test2"

Aligning the text and specifying a width::

   >>> '{:<30}'.format('left aligned')
   'left aligned                  '
   >>> '{:>30}'.format('right aligned')
   '                 right aligned'
   >>> '{:^30}'.format('centered')
   '           centered           '
   >>> '{:*^30}'.format('centered')  # use '*' as a fill char
   '***********centered***********'

Replacing ``%+f``, ``%-f``, and ``% f`` and specifying a sign::

   >>> '{:+f}; {:+f}'.format(3.14, -3.14)  # show it always
   '+3.140000; -3.140000'
   >>> '{: f}; {: f}'.format(3.14, -3.14)  # show a space for positive numbers
   ' 3.140000; -3.140000'
   >>> '{:-f}; {:-f}'.format(3.14, -3.14)  # show only the minus -- same as '{:f}; {:f}'
   '3.140000; -3.140000'

Replacing ``%x`` and ``%o`` and converting the value to different bases::

   >>> # format also supports binary numbers
   >>> "int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42)
   'int: 42;  hex: 2a;  oct: 52;  bin: 101010'
   >>> # with 0x, 0o, or 0b as prefix:
   >>> "int: {0:d};  hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)
   'int: 42;  hex: 0x2a;  oct: 0o52;  bin: 0b101010'

Using the comma as a thousands separator::

   >>> '{:,}'.format(1234567890)
   '1,234,567,890'

Expressing a percentage::

   >>> points = 19.5
   >>> total = 22
   >>> 'Correct answers: {:.2%}'.format(points/total)
   'Correct answers: 88.64%'

Using type-specific formatting::

   >>> import datetime
   >>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
   >>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
   '2010-07-04 12:15:58'

Nesting arguments and more complex examples::

   >>> for align, text in zip('<^>', ['left', 'center', 'right']):
   ...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
   ...
   'left<<<<<<<<<<<<'
   '^^^^^center^^^^^'
   '>>>>>>>>>>>right'
   >>>
   >>> octets = [192, 168, 0, 1]
   >>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)
   'C0A80001'
   >>> int(_, 16)
   3232235521
   >>>
   >>> width = 5
   >>> for num in range(5,12):
   ...     for base in 'dXob':
   ...         print '{0:{width}{base}}'.format(num, base=base, width=width),
   ...     print
   ...
       5     5     5   101
       6     6     6   110
       7     7     7   111
       8     8    10  1000
       9     9    11  1001
      10     A    12  1010
      11     B    13  1011



Template strings
----------------

.. versionadded:: 2.4

Templates provide simpler string substitutions as described in :pep:`292`.
Instead of the normal ``%``\ -based substitutions, Templates support ``$``\ -based
substitutions, using the following rules:

* ``$$`` is an escape; it is replaced with a single ``$``.

* ``$identifier`` names a substitution placeholder matching a mapping key of
  ``"identifier"``. By default, ``"identifier"`` must spell a Python
  identifier. The first non-identifier character after the ``$`` character
  terminates this placeholder specification.

* ``${identifier}`` is equivalent to ``$identifier``. It is required when valid
  identifier characters follow the placeholder but are not part of the
  placeholder, such as ``"${noun}ification"``.

Any other appearance of ``$`` in the string will result in a :exc:`ValueError`
being raised.

The :mod:`string` module provides a :class:`Template` class that implements
these rules. The methods of :class:`Template` are:


.. class:: Template(template)

   The constructor takes a single argument which is the template string.


   .. method:: substitute(mapping[, **kws])

      Performs the template substitution, returning a new string. *mapping* is
      any dictionary-like object with keys that match the placeholders in the
      template. Alternatively, you can provide keyword arguments, where the
      keywords are the placeholders. When both *mapping* and *kws* are given
      and there are duplicates, the placeholders from *kws* take precedence.


   .. method:: safe_substitute(mapping[, **kws])

      Like :meth:`substitute`, except that if placeholders are missing from
      *mapping* and *kws*, instead of raising a :exc:`KeyError` exception, the
      original placeholder will appear in the resulting string intact. Also,
      unlike with :meth:`substitute`, any other appearances of the ``$`` will
      simply return ``$`` instead of raising :exc:`ValueError`.

      While other exceptions may still occur, this method is called "safe"
      because it always tries to return a usable string instead of
      raising an exception. In another sense, :meth:`safe_substitute` may be
      anything other than safe, since it will silently ignore malformed
      templates containing dangling delimiters, unmatched braces, or
      placeholders that are not valid Python identifiers.

   :class:`Template` instances also provide one public data attribute:

   .. attribute:: template

      This is the object passed to the constructor's *template* argument. In
      general, you shouldn't change it, but read-only access is not enforced.

Here is an example of how to use a Template::

   >>> from string import Template
   >>> s = Template('$who likes $what')
   >>> s.substitute(who='tim', what='kung pao')
   'tim likes kung pao'
   >>> d = dict(who='tim')
   >>> Template('Give $who $100').substitute(d)
   Traceback (most recent call last):
   ...
   ValueError: Invalid placeholder in string: line 1, col 11
   >>> Template('$who likes $what').substitute(d)
   Traceback (most recent call last):
   ...
   KeyError: 'what'
   >>> Template('$who likes $what').safe_substitute(d)
   'tim likes $what'

Advanced usage: you can derive subclasses of :class:`Template` to customize the
placeholder syntax, delimiter character, or the entire regular expression used
to parse template strings. To do this, you can override these class attributes:

* *delimiter* -- This is the literal string describing a placeholder introducing
  delimiter. The default value is ``$``. Note that this should *not* be a
  regular expression, as the implementation will call :func:`re.escape` on this
  string as needed.

* *idpattern* -- This is the regular expression describing the pattern for
  non-braced placeholders (the braces will be added automatically as
  appropriate). The default value is the regular expression
  ``[_a-z][_a-z0-9]*``.

Alternatively, you can provide the entire regular expression pattern by
overriding the class attribute *pattern*. If you do this, the value must be a
regular expression object with four named capturing groups. The capturing
groups correspond to the rules given above, along with the invalid placeholder
rule:

* *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the
  default pattern.

* *named* -- This group matches the unbraced placeholder name; it should not
  include the delimiter in the capturing group.

* *braced* -- This group matches the brace enclosed placeholder name; it should
  not include either the delimiter or braces in the capturing group.

* *invalid* -- This group matches any other delimiter pattern (usually a single
  delimiter), and it should appear last in the regular expression.
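
As a minimal sketch of the subclassing described above, the two classes below
override *delimiter* and *idpattern* respectively. The class names
``PercentTemplate`` and ``DottedTemplate``, the dotted-name pattern, and the
sample data are hypothetical and chosen only for illustration.

```python
from string import Template


class PercentTemplate(Template):
    """Hypothetical subclass: placeholders start with '%' instead of '$'."""
    # A literal string, not a regular expression; it is escaped internally.
    delimiter = '%'


class DottedTemplate(Template):
    """Hypothetical subclass: placeholder names may contain dots."""
    # The group is non-capturing so that the named groups of the
    # generated pattern are left undisturbed.
    idpattern = r'[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*'


# '%%' is now the escape sequence, and '%{what}' is the braced form.
greeting = PercentTemplate('%who likes %{what}; 100%% sure').substitute(
    who='tim', what='kung pao')

# The whole dotted name is used as a single mapping key.
setting = DottedTemplate('host=$server.host').substitute(
    {'server.host': 'example.org'})
```

Overriding *delimiter* or *idpattern* is usually enough. Replacing *pattern*
wholesale is only needed when the placeholder grammar itself changes.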


String functions
----------------

The following functions are available to operate on string and Unicode objects.
They are not available as string methods.


.. function:: capwords(s[, sep])

   Split the argument into words using :meth:`str.split`, capitalize each word
   using :meth:`str.capitalize`, and join the capitalized words using
   :meth:`str.join`. If the optional second argument *sep* is absent
   or ``None``, runs of whitespace characters are replaced by a single space
   and leading and trailing whitespace is removed; otherwise *sep* is used to
   split and join the words.


.. function:: maketrans(from, to)

   Return a translation table suitable for passing to :func:`translate`, that will
   map each character in *from* into the character at the same position in *to*;
   *from* and *to* must have the same length.

   .. note::

      Don't use strings derived from :const:`lowercase` and :const:`uppercase` as
      arguments; in some locales, these don't have the same length. For case
      conversions, always use :meth:`str.lower` and :meth:`str.upper`.


Deprecated string functions
---------------------------

The following functions are also defined as methods of string and
Unicode objects; see section :ref:`string-methods` for more information on
those. You should consider these functions as deprecated, although they will
not be removed until Python 3. The functions defined in this module are:


.. function:: atof(s)

   .. deprecated:: 2.0
      Use the :func:`float` built-in function.

   .. index:: builtin: float

   Convert a string to a floating point number. The string must have the standard
   syntax for a floating point literal in Python, optionally preceded by a sign
   (``+`` or ``-``). Note that this behaves identically to the built-in function
   :func:`float` when passed a string.

   .. note::

      .. index::
         single: NaN
         single: Infinity

      When passing in a string, values for NaN and Infinity may be returned, depending
      on the underlying C library. The specific set of strings accepted which cause
      these values to be returned depends entirely on the C library and is known to
      vary.


.. function:: atoi(s[, base])

   .. deprecated:: 2.0
      Use the :func:`int` built-in function.

   .. index:: builtin: eval

   Convert string *s* to an integer in the given *base*. The string must consist
   of one or more digits, optionally preceded by a sign (``+`` or ``-``). The
   *base* defaults to 10. If it is 0, a default base is chosen depending on the
   leading characters of the string (after stripping the sign): ``0x`` or ``0X``
   means 16, ``0`` means 8, anything else means 10. If *base* is 16, a leading
   ``0x`` or ``0X`` is always accepted, though not required. This behaves
   identically to the built-in function :func:`int` when passed a string. (Also
   note: for a more flexible interpretation of numeric literals, use the built-in
   function :func:`eval`.)


.. function:: atol(s[, base])

   .. deprecated:: 2.0
      Use the :func:`long` built-in function.

   .. index:: builtin: long

   Convert string *s* to a long integer in the given *base*. The string must
   consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
   The *base* argument has the same meaning as for :func:`atoi`. A trailing ``l``
   or ``L`` is not allowed, except if the base is 0. Note that when invoked
   without *base* or with *base* set to 10, this behaves identically to the built-in
   function :func:`long` when passed a string.


.. function:: capitalize(word)

   Return a copy of *word* with only its first character capitalized.


.. function:: expandtabs(s[, tabsize])

   Expand tabs in a string replacing them by one or more spaces, depending on the
   current column and the given tab size. The column number is reset to zero after
   each newline occurring in the string. This doesn't understand other non-printing
   characters or escape sequences. The tab size defaults to 8.


.. function:: find(s, sub[, start[, end]])

   Return the lowest index in *s* where the substring *sub* is found such that
   *sub* is wholly contained in ``s[start:end]``. Return ``-1`` on failure.
   Defaults for *start* and *end* and interpretation of negative values are the same
   as for slices.


.. function:: rfind(s, sub[, start[, end]])

   Like :func:`find` but find the highest index.


.. function:: index(s, sub[, start[, end]])

   Like :func:`find` but raise :exc:`ValueError` when the substring is not found.


.. function:: rindex(s, sub[, start[, end]])

   Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found.


.. function:: count(s, sub[, start[, end]])

   Return the number of (non-overlapping) occurrences of substring *sub* in string
   ``s[start:end]``. Defaults for *start* and *end* and interpretation of negative
   values are the same as for slices.


.. function:: lower(s)

   Return a copy of *s*, but with upper case letters converted to lower case.


.. function:: split(s[, sep[, maxsplit]])

   Return a list of the words of the string *s*. If the optional second argument
   *sep* is absent or ``None``, the words are separated by arbitrary strings of
   whitespace characters (space, tab, newline, return, formfeed). If the second
   argument *sep* is present and not ``None``, it specifies a string to be used as
   the word separator. The returned list will then have one more item than the
   number of non-overlapping occurrences of the separator in the string.
   If *maxsplit* is given, at most *maxsplit* splits occur, and the
   remainder of the string is returned as the final element of the list (thus,
   the list will have at most ``maxsplit+1`` elements). If *maxsplit* is not
   specified or ``-1``, then there is no limit on the number of splits (all
   possible splits are made).

   The behavior of split on an empty string depends on the value of *sep*. If *sep*
   is not specified, or specified as ``None``, the result will be an empty list.
   If *sep* is specified as any string, the result will be a list containing one
   element which is an empty string.


.. function:: rsplit(s[, sep[, maxsplit]])

   Return a list of the words of the string *s*, scanning *s* from the end. To all
   intents and purposes, the resulting list of words is the same as returned by
   :func:`split`, except when the optional third argument *maxsplit* is explicitly
   specified and nonzero. If *maxsplit* is given, at most *maxsplit*
   splits -- the *rightmost* ones -- occur, and the remainder of the string is
   returned as the first element of the list (thus, the list will have at most
   ``maxsplit+1`` elements).

   .. versionadded:: 2.4


.. function:: splitfields(s[, sep[, maxsplit]])

   This function behaves identically to :func:`split`. (In the past, :func:`split`
   was only used with one argument, while :func:`splitfields` was only used with
   two arguments.)


.. function:: join(words[, sep])

   Concatenate a list or tuple of words with intervening occurrences of *sep*.
   The default value for *sep* is a single space character. It is always true that
   ``string.join(string.split(s, sep), sep)`` equals *s*.


.. function:: joinfields(words[, sep])

   This function behaves identically to :func:`join`. (In the past, :func:`join`
   was only used with one argument, while :func:`joinfields` was only used with two
   arguments.) Note that there is no :meth:`joinfields` method on string objects;
   use the :meth:`join` method instead.


.. function:: lstrip(s[, chars])

   Return a copy of the string with leading characters removed. If *chars* is
   omitted or ``None``, whitespace characters are removed. If given and not
   ``None``, *chars* must be a string; the characters in the string will be
   stripped from the beginning of *s*.

   .. versionchanged:: 2.2.3
      The *chars* parameter was added. The *chars* parameter cannot be passed in
      earlier 2.2 versions.


.. function:: rstrip(s[, chars])

   Return a copy of the string with trailing characters removed. If *chars* is
   omitted or ``None``, whitespace characters are removed. If given and not
   ``None``, *chars* must be a string; the characters in the string will be
   stripped from the end of *s*.

   .. versionchanged:: 2.2.3
      The *chars* parameter was added. The *chars* parameter cannot be passed in
      earlier 2.2 versions.


.. function:: strip(s[, chars])

   Return a copy of the string with leading and trailing characters removed. If
   *chars* is omitted or ``None``, whitespace characters are removed. If given and
   not ``None``, *chars* must be a string; the characters in the string will be
   stripped from both ends of *s*.

   .. versionchanged:: 2.2.3
      The *chars* parameter was added. The *chars* parameter cannot be passed in
      earlier 2.2 versions.


.. function:: swapcase(s)

   Return a copy of *s*, but with lower case letters converted to upper case and
   vice versa.


.. function:: translate(s, table[, deletechars])

   Delete all characters from *s* that are in *deletechars* (if present), and then
   translate the characters using *table*, which must be a 256-character string
   giving the translation for each character value, indexed by its ordinal. If
   *table* is ``None``, then only the character deletion step is performed.


.. function:: upper(s)

   Return a copy of *s*, but with lower case letters converted to upper case.


.. function:: ljust(s, width[, fillchar])
              rjust(s, width[, fillchar])
              center(s, width[, fillchar])

   These functions respectively left-justify, right-justify and center a string in
   a field of given width. They return a string that is at least *width*
   characters wide, created by padding the string *s* with the character *fillchar*
   (default is a space) on the right, left or both sides until the given width
   is reached. The string is never truncated.


.. function:: zfill(s, width)

   Pad a numeric string *s* on the left with zero digits until the
   given *width* is reached. Strings starting with a sign are handled
   correctly.


.. function:: replace(s, old, new[, maxreplace])

   Return a copy of string *s* with all occurrences of substring *old* replaced
   by *new*. If the optional argument *maxreplace* is given, the first
   *maxreplace* occurrences are replaced.
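
Because the functions in this section are deprecated in favor of string
methods, the sketch below shows the method spellings for a handful of them.
It is only an illustration: the variable names and sample values are made up,
and each comment names the module-level function that the method call
corresponds to.

```python
# Method equivalents of some deprecated string-module functions.
# All sample values and variable names are illustrative only.

words = 'a,b,c'.split(',')               # string.split(s, sep)
rejoined = ','.join(words)               # string.join(words, sep)
last_split = 'a b c'.rsplit(None, 1)     # string.rsplit(s, None, 1)

first_l = 'hello'.find('l')              # string.find(s, sub)
last_l = 'hello'.rfind('l')              # string.rfind(s, sub)
missing = 'hello'.find('z')              # -1 signals "not found"

stripped = 'xxhixx'.strip('x')           # string.strip(s, chars)
padded = '-42'.zfill(6)                  # string.zfill(s, width)
centered = 'ab'.center(6, '*')           # string.center(s, width, fillchar)
replaced = 'aaaa'.replace('a', 'b', 2)   # string.replace(s, old, new, 2)

print(words, rejoined, last_split)
print(first_l, last_l, missing)
print(stripped, padded, centered, replaced)
```

Splitting on an explicit separator and joining with the same separator
reproduces the original string, which is the round-trip property stated for
:func:`join` above.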