Home | History | Annotate | Download | only in pydoc_data

Lines Matching refs:productions

63  'strings': '\nString literals\n***************\n\nString literals are described by the following lexical definitions:\n\n   stringliteral   ::= [stringprefix](shortstring | longstring)\n   stringprefix    ::= "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"\n                    | "b" | "B" | "br" | "Br" | "bR" | "BR"\n   shortstring     ::= "\'" shortstringitem* "\'" | \'"\' shortstringitem* \'"\'\n   longstring      ::= "\'\'\'" longstringitem* "\'\'\'"\n                  | \'"""\' longstringitem* \'"""\'\n   shortstringitem ::= shortstringchar | escapeseq\n   longstringitem  ::= longstringchar | escapeseq\n   shortstringchar ::= <any source character except "\\" or newline or the quote>\n   longstringchar  ::= <any source character except "\\">\n   escapeseq       ::= "\\" <any ASCII character>\n\nOne syntactic restriction not indicated by these productions is that\nwhitespace is not allowed between the ``stringprefix`` and the rest of\nthe string literal. The source character set is defined by the\nencoding declaration; it is ASCII if no encoding declaration is given\nin the source file; see section *Encoding declarations*.\n\nIn plain English: String literals can be enclosed in matching single\nquotes (``\'``) or double quotes (``"``).  They can also be enclosed in\nmatching groups of three single or double quotes (these are generally\nreferred to as *triple-quoted strings*).  The backslash (``\\``)\ncharacter is used to escape characters that otherwise have a special\nmeaning, such as newline, backslash itself, or the quote character.\nString literals may optionally be prefixed with a letter ``\'r\'`` or\n``\'R\'``; such strings are called *raw strings* and use different rules\nfor interpreting backslash escape sequences.  A prefix of ``\'u\'`` or\n``\'U\'`` makes the string a Unicode string.  Unicode strings use the\nUnicode character set as defined by the Unicode Consortium and ISO\n10646.  Some additional escape sequences, described below, are\navailable in Unicode strings. A prefix of ``\'b\'`` or ``\'B\'`` is\nignored in Python 2; it indicates that the literal should become a\nbytes literal in Python 3 (e.g. when code is automatically converted\nwith 2to3).  A ``\'u\'`` or ``\'b\'`` prefix may be followed by an ``\'r\'``\nprefix.\n\nIn triple-quoted strings, unescaped newlines and quotes are allowed\n(and are retained), except that three unescaped quotes in a row\nterminate the string.  (A "quote" is the character used to open the\nstring, i.e. either ``\'`` or ``"``.)\n\nUnless an ``\'r\'`` or ``\'R\'`` prefix is present, escape sequences in\nstrings are interpreted according to rules similar to those used by\nStandard C.  The recognized escape sequences are:\n\n+-------------------+-----------------------------------+---------+\n| Escape Sequence   | Meaning                           | Notes   |\n+===================+===================================+=========+\n| ``\\newline``      | Ignored                           |         |\n+-------------------+-----------------------------------+---------+\n| ``\\\\``            | Backslash (``\\``)                 |         |\n+-------------------+-----------------------------------+---------+\n| ``\\\'``            | Single quote (``\'``)              |         |\n+-------------------+-----------------------------------+---------+\n| ``\\"``            | Double quote (``"``)              |         |\n+-------------------+-----------------------------------+---------+\n| ``\\a``            | ASCII Bell (BEL)                  |         |\n+-------------------+-----------------------------------+---------+\n| ``\\b``            | ASCII Backspace (BS)              |         |\n+-------------------+-----------------------------------+---------+\n| ``\\f``            | ASCII Formfeed (FF)               |         |\n+-------------------+-----------------------------------+---------+\n| ``\\n``            | ASCII Linefeed (LF)               |         |\n+-------------------+-----------------------------------+---------+\n| ``\\N{name}``      | Character named *name* in the     |         |\n|                   | Unicode database (Unicode only)   |         |\n+-------------------+-----------------------------------+---------+\n| ``\\r``            | ASCII Carriage Return (CR)        |         |\n+-------------------+-----------------------------------+---------+\n| ``\\t``            | ASCII Horizontal Tab (TAB)        |         |\n+-------------------+-----------------------------------+---------+\n| ``\\uxxxx``        | Character with 16-bit hex value   | (1)     |\n|                   | *xxxx* (Unicode only)             |         |\n+-------------------+-----------------------------------+---------+\n| ``\\Uxxxxxxxx``    | Character with 32-bit hex value   | (2)     |\n|                   | *xxxxxxxx* (Unicode only)         |         |\n+-------------------+-----------------------------------+---------+\n| ``\\v``            | ASCII Vertical Tab (VT)           |         |\n+-------------------+-----------------------------------+---------+\n| ``\\ooo``          | Character with octal value *ooo*  | (3,5)   |\n+-------------------+-----------------------------------+---------+\n| ``\\xhh``          | Character with hex value *hh*     | (4,5)   |\n+-------------------+-----------------------------------+---------+\n\nNotes:\n\n1. Individual code units which form parts of a surrogate pair can be\n   encoded using this escape sequence.\n\n2. Any Unicode character can be encoded this way, but characters\n   outside the Basic Multilingual Plane (BMP) will be encoded using a\n   surrogate pair if Python is compiled to use 16-bit code units (the\n   default).  Individual code units which form parts of a surrogate\n   pair can be encoded using this escape sequence.\n\n3. As in Standard C, up to three octal digits are accepted.\n\n4. Unlike in Standard C, exactly two hex digits are required.\n\n5. In a string literal, hexadecimal and octal escapes denote the byte\n   with the given value; it is not necessary that the byte encodes a\n   character in the source character set. In a Unicode literal, these\n   escapes denote a Unicode character with the given value.\n\nUnlike Standard C, all unrecognized escape sequences are left in the\nstring unchanged, i.e., *the backslash is left in the string*.  (This\nbehavior is useful when debugging: if an escape sequence is mistyped,\nthe resulting output is more easily recognized as broken.)  It is also\nimportant to note that the escape sequences marked as "(Unicode only)"\nin the table above fall into the category of unrecognized escapes for\nnon-Unicode string literals.\n\nWhen an ``\'r\'`` or ``\'R\'`` prefix is present, a character following a\nbackslash is included in the string without change, and *all\nbackslashes are left in the string*.  For example, the string literal\n``r"\\n"`` consists of two characters: a backslash and a lowercase\n``\'n\'``.  String quotes can be escaped with a backslash, but the\nbackslash remains in the string; for example, ``r"\\""`` is a valid\nstring literal consisting of two characters: a backslash and a double\nquote; ``r"\\"`` is not a valid string literal (even a raw string\ncannot end in an odd number of backslashes).  Specifically, *a raw\nstring cannot end in a single backslash* (since the backslash would\nescape the following quote character).  Note also that a single\nbackslash followed by a newline is interpreted as those two characters\nas part of the string, *not* as a line continuation.\n\nWhen an ``\'r\'`` or ``\'R\'`` prefix is used in conjunction with a\n``\'u\'`` or ``\'U\'`` prefix, then the ``\\uXXXX`` and ``\\UXXXXXXXX``\nescape sequences are processed while  *all other backslashes are left\nin the string*. For example, the string literal ``ur"\\u0062\\n"``\nconsists of three Unicode characters: \'LATIN SMALL LETTER B\', \'REVERSE\nSOLIDUS\', and \'LATIN SMALL LETTER N\'. Backslashes can be escaped with\na preceding backslash; however, both remain in the string.  As a\nresult, ``\\uXXXX`` escape sequences are only recognized when there are\nan odd number of backslashes.\n',