Home | History | Annotate | Download | only in pydoc_data

Lines Matching refs:Surrogate

66  'strings': u'\nString literals\n***************\n\nString literals are described by the following lexical definitions:\n\n   stringliteral   ::= [stringprefix](shortstring | longstring)\n   stringprefix    ::= "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"\n                    | "b" | "B" | "br" | "Br" | "bR" | "BR"\n   shortstring     ::= "\'" shortstringitem* "\'" | \'"\' shortstringitem* \'"\'\n   longstring      ::= "\'\'\'" longstringitem* "\'\'\'"\n                  | \'"""\' longstringitem* \'"""\'\n   shortstringitem ::= shortstringchar | escapeseq\n   longstringitem  ::= longstringchar | escapeseq\n   shortstringchar ::= <any source character except "\\" or newline or the quote>\n   longstringchar  ::= <any source character except "\\">\n   escapeseq       ::= "\\" <any ASCII character>\n\nOne syntactic restriction not indicated by these productions is that\nwhitespace is not allowed between the "stringprefix" and the rest of\nthe string literal. The source character set is defined by the\nencoding declaration; it is ASCII if no encoding declaration is given\nin the source file; see section Encoding declarations.\n\nIn plain English: String literals can be enclosed in matching single\nquotes ("\'") or double quotes (""").  They can also be enclosed in\nmatching groups of three single or double quotes (these are generally\nreferred to as *triple-quoted strings*).  The backslash ("\\")\ncharacter is used to escape characters that otherwise have a special\nmeaning, such as newline, backslash itself, or the quote character.\nString literals may optionally be prefixed with a letter "\'r\'" or\n"\'R\'"; such strings are called *raw strings* and use different rules\nfor interpreting backslash escape sequences.  A prefix of "\'u\'" or\n"\'U\'" makes the string a Unicode string.  Unicode strings use the\nUnicode character set as defined by the Unicode Consortium and ISO\n10646.  Some additional escape sequences, described below, are\navailable in Unicode strings. A prefix of "\'b\'" or "\'B\'" is ignored in\nPython 2; it indicates that the literal should become a bytes literal\nin Python 3 (e.g. when code is automatically converted with 2to3).  A\n"\'u\'" or "\'b\'" prefix may be followed by an "\'r\'" prefix.\n\nIn triple-quoted strings, unescaped newlines and quotes are allowed\n(and are retained), except that three unescaped quotes in a row\nterminate the string.  (A "quote" is the character used to open the\nstring, i.e. either "\'" or """.)\n\nUnless an "\'r\'" or "\'R\'" prefix is present, escape sequences in\nstrings are interpreted according to rules similar to those used by\nStandard C.  The recognized escape sequences are:\n\n+-------------------+-----------------------------------+---------+\n| Escape Sequence   | Meaning                           | Notes   |\n+===================+===================================+=========+\n| "\\newline"        | Ignored                           |         |\n+-------------------+-----------------------------------+---------+\n| "\\\\"              | Backslash ("\\")                   |         |\n+-------------------+-----------------------------------+---------+\n| "\\\'"              | Single quote ("\'")                |         |\n+-------------------+-----------------------------------+---------+\n| "\\""              | Double quote (""")                |         |\n+-------------------+-----------------------------------+---------+\n| "\\a"              | ASCII Bell (BEL)                  |         |\n+-------------------+-----------------------------------+---------+\n| "\\b"              | ASCII Backspace (BS)              |         |\n+-------------------+-----------------------------------+---------+\n| "\\f"              | ASCII Formfeed (FF)               |         |\n+-------------------+-----------------------------------+---------+\n| "\\n"              | ASCII Linefeed (LF)               |         |\n+-------------------+-----------------------------------+---------+\n| "\\N{name}"        | Character named *name* in the     |         |\n|                   | Unicode database (Unicode only)   |         |\n+-------------------+-----------------------------------+---------+\n| "\\r"              | ASCII Carriage Return (CR)        |         |\n+-------------------+-----------------------------------+---------+\n| "\\t"              | ASCII Horizontal Tab (TAB)        |         |\n+-------------------+-----------------------------------+---------+\n| "\\uxxxx"          | Character with 16-bit hex value   | (1)     |\n|                   | *xxxx* (Unicode only)             |         |\n+-------------------+-----------------------------------+---------+\n| "\\Uxxxxxxxx"      | Character with 32-bit hex value   | (2)     |\n|                   | *xxxxxxxx* (Unicode only)         |         |\n+-------------------+-----------------------------------+---------+\n| "\\v"              | ASCII Vertical Tab (VT)           |         |\n+-------------------+-----------------------------------+---------+\n| "\\ooo"            | Character with octal value *ooo*  | (3,5)   |\n+-------------------+-----------------------------------+---------+\n| "\\xhh"            | Character with hex value *hh*     | (4,5)   |\n+-------------------+-----------------------------------+---------+\n\nNotes:\n\n1. Individual code units which form parts of a surrogate pair can\n   be encoded using this escape sequence.\n\n2. Any Unicode character can be encoded this way, but characters\n   outside the Basic Multilingual Plane (BMP) will be encoded using a\n   surrogate pair if Python is compiled to use 16-bit code units (the\n   default).\n\n3. As in Standard C, up to three octal digits are accepted.\n\n4. Unlike in Standard C, exactly two hex digits are required.\n\n5. In a string literal, hexadecimal and octal escapes denote the\n   byte with the given value; it is not necessary that the byte\n   encodes a character in the source character set. In a Unicode\n   literal, these escapes denote a Unicode character with the given\n   value.\n\nUnlike Standard C, all unrecognized escape sequences are left in the\nstring unchanged, i.e., *the backslash is left in the string*.  (This\nbehavior is useful when debugging: if an escape sequence is mistyped,\nthe resulting output is more easily recognized as broken.)  It is also\nimportant to note that the escape sequences marked as "(Unicode only)"\nin the table above fall into the category of unrecognized escapes for\nnon-Unicode string literals.\n\nWhen an "\'r\'" or "\'R\'" prefix is present, a character following a\nbackslash is included in the string without change, and *all\nbackslashes are left in the string*.  For example, the string literal\n"r"\\n"" consists of two characters: a backslash and a lowercase "\'n\'".\nString quotes can be escaped with a backslash, but the backslash\nremains in the string; for example, "r"\\""" is a valid string literal\nconsisting of two characters: a backslash and a double quote; "r"\\""\nis not a valid string literal (even a raw string cannot end in an odd\nnumber of backslashes).  Specifically, *a raw string cannot end in a\nsingle backslash* (since the backslash would escape the following\nquote character).  Note also that a single backslash followed by a\nnewline is interpreted as those two characters as part of the string,\n*not* as a line continuation.\n\nWhen an "\'r\'" or "\'R\'" prefix is used in conjunction with a "\'u\'" or\n"\'U\'" prefix, then the "\\uXXXX" and "\\UXXXXXXXX" escape sequences are\nprocessed while  *all other backslashes are left in the string*. For\nexample, the string literal "ur"\\u0062\\n"" consists of three Unicode\ncharacters: \'LATIN SMALL LETTER B\', \'REVERSE SOLIDUS\', and \'LATIN\nSMALL LETTER N\'. Backslashes can be escaped with a preceding\nbackslash; however, both remain in the string.  As a result, "\\uXXXX"\nescape sequences are only recognized when there are an odd number of\nbackslashes.\n',

70 'types': u'\nThe standard type hierarchy\n***************************\n\nBelow is a list of the types that are built into Python. Extension\nmodules (written in C, Java, or other languages, depending on the\nimplementation) can define additional types. Future versions of\nPython may add types to the type hierarchy (e.g., rational numbers,\nefficiently stored arrays of integers, etc.).\n\nSome of the type descriptions below contain a paragraph listing\n\'special attributes.\' These are attributes that provide access to the\nimplementation and are not intended for general use. Their definition\nmay change in the future.\n\nNone\n This type has a single value. There is a single object with this\n value. This object is accessed through the built-in name "None". It\n is used to signify the absence of a value in many situations, e.g.,\n it is returned from functions that don\'t explicitly return\n anything. Its truth value is false.\n\nNotImplemented\n This type has a single value. There is a single object with this\n value. This object is accessed through the built-in name\n "NotImplemented". Numeric methods and rich comparison methods may\n return this value if they do not implement the operation for the\n operands provided. (The interpreter will then try the reflected\n operation, or some other fallback, depending on the operator.) Its\n truth value is true.\n\nEllipsis\n This type has a single value. There is a single object with this\n value. This object is accessed through the built-in name\n "Ellipsis". It is used to indicate the presence of the "..." syntax\n in a slice. Its truth value is true.\n\n"numbers.Number"\n These are created by numeric literals and returned as results by\n arithmetic operators and arithmetic built-in functions. Numeric\n objects are immutable; once created their value never changes.\n Python numbers are of course strongly related to mathematical\n numbers, but subject to the limitations of numerical representation\n in computers.\n\n Python distinguishes between integers, floating point numbers, and\n complex numbers:\n\n "numbers.Integral"\n These represent elements from the mathematical set of integers\n (positive and negative).\n\n There are three types of integers:\n\n Plain integers\n These represent numbers in the range -2147483648 through\n 2147483647. (The range may be larger on machines with a\n larger natural word size, but not smaller.) When the result\n of an operation would fall outside this range, the result is\n normally returned as a long integer (in some cases, the\n exception "OverflowError" is raised instead). For the\n purpose of shift and mask operations, integers are assumed to\n have a binary, 2\'s complement notation using 32 or more bits,\n and hiding no bits from the user (i.e., all 4294967296\n different bit patterns correspond to different values).\n\n Long integers\n These represent numbers in an unlimited range, subject to\n available (virtual) memory only. For the purpose of shift\n and mask operations, a binary representation is assumed, and\n negative numbers are represented in a variant of 2\'s\n complement which gives the illusion of an infinite string of\n sign bits extending to the left.\n\n Booleans\n These represent the truth values False and True. The two\n objects representing the values "False" and "True" are the\n only Boolean objects. The Boolean type is a subtype of plain\n integers, and Boolean values behave like the values 0 and 1,\n respectively, in almost all contexts, the exception being\n that when converted to a string, the strings ""False"" or\n ""True"" are returned, respectively.\n\n The rules for integer representation are intended to give the\n most meaningful interpretation of shift and mask operations\n involving negative integers and the least surprises when\n switching between the plain and long integer domains. Any\n operation, if it yields a result in the plain integer domain,\n will yield the same result in the long integer domain or when\n using mixed operands. The switch between domains is transparent\n to the programmer.\n\n "numbers.Real" ("float")\n These represent machine-level double precision floating point\n numbers. You are at the mercy of the underlying machine\n architecture (and C or Java implementation) for the accepted\n range and handling of overflow. Python does not support single-\n precision floating point numbers; the savings in processor and\n memory usage that are usually the reason for using these are\n dwarfed by the overhead of using objects in Python, so there is\n no reason to complicate the language with two kinds of floating\n point numbers.\n\n "numbers.Complex"\n These represent complex numbers as a pair of machine-level\n double precision floating point numbers. The same caveats apply\n as for floating point numbers. The real and imaginary parts of a\n complex number "z" can be retrieved through the read-only\n attributes "z.real" and "z.imag".\n\nSequences\n These represent finite ordered sets indexed by non-negative\n numbers. The built-in function "len()" returns the number of items\n of a sequence. When the length of a sequence is *n*, the index set\n contains the numbers 0, 1, ..., *n*-1. Item *i* of sequence *a* is\n selected by "a[i]".\n\n Sequences also support slicing: "a[i:j]" selects all items with\n index *k* such that *i* "<=" *k* "<" *j*. When used as an\n expression, a slice is a sequence of the same type. This implies\n that the index set is renumbered so that it starts at 0.\n\n Some sequences also support "extended slicing" with a third "step"\n parameter: "a[i:j:k]" selects all items of *a* with index *x* where\n "x = i + n*k", *n* ">=" "0" and *i* "<=" *x* "<" *j*.\n\n Sequences are distinguished according to their mutability:\n\n Immutable sequences\n An object of an immutable sequence type cannot change once it is\n created. (If the object contains references to other objects,\n these other objects may be mutable and may be changed; however,\n the collection of objects directly referenced by an immutable\n object cannot change.)\n\n The following types are immutable sequences:\n\n Strings\n The items of a string are characters. There is no separate\n character type; a character is represented by a string of one\n item. Characters represent (at least) 8-bit bytes. The\n built-in functions "chr()" and "ord()" convert between\n characters and nonnegative integers representing the byte\n values. Bytes with the values 0-127 usually represent the\n corresponding ASCII values, but the interpretation of values\n is up to the program. The string data type is also used to\n represent arrays of bytes, e.g., to hold data read from a\n file.\n\n (On systems whose native character set is not ASCII, strings\n may use EBCDIC in their internal representation, provided the\n functions "chr()" and "ord()" implement a mapping between\n ASCII and EBCDIC, and string comparison preserves the ASCII\n order. Or perhaps someone can propose a better rule?)\n\n Unicode\n The items of a Unicode object are Unicode code units. A\n Unicode code unit is represented by a Unicode object of one\n item and can hold either a 16-bit or 32-bit value\n representing a Unicode ordinal (the maximum value for the\n ordinal is given in "sys.maxunicode", and depends on how\n Python is configured at compile time). Surrogate