.. highlightlang:: c

.. _stringobjects:

String/Bytes Objects
--------------------

These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.

.. note::

   These functions have been renamed to PyBytes_* in Python 3.x. Unless
   otherwise noted, the PyBytes functions available in 3.x are aliased to their
   PyString_* equivalents to help porting.

.. index:: object: string


.. c:type:: PyStringObject

   This subtype of :c:type:`PyObject` represents a Python string object.


.. c:var:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :c:type:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. c:function:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.

   .. versionchanged:: 2.2
      Allowed subtypes to be accepted.


.. c:function:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.

   .. versionadded:: 2.2


.. c:function:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure.  The parameter *v* must not be *NULL*; it will not be
   checked.


.. c:function:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure.  If *v* is *NULL*, the contents of the
   string are uninitialized.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *len*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :c:func:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it.  The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string.  The
   following format characters are allowed:

   .. % This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
   .. % Similar comments apply to the %ll width modifier and
   .. % PY_FORMAT_LONG_LONG.
   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lld`      | long long     | Exactly equivalent to          |
   |                   |               | ``printf("%lld")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%llu`      | unsigned      | Exactly equivalent to          |
   |                   | long long     | ``printf("%llu")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments discarded.

   .. note::

      The ``"%lld"`` and ``"%llu"`` format specifiers are only available
      when :const:`HAVE_LONG_LONG` is defined.

   .. versionchanged:: 2.7
      Support for ``"%lld"`` and ``"%llu"`` added.


.. c:function:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :c:func:`PyString_FromFormat` except that it takes exactly two
   arguments.


.. c:function:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.

   .. versionchanged:: 2.5
      This function returned an :c:type:`int` type. This might require changes
      in your code for properly supporting 64-bit systems.


.. c:function:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :c:func:`PyString_Size` but without error checking.

   .. versionchanged:: 2.5
      This macro returned an :c:type:`int` type. This might require changes in
      your code for properly supporting 64-bit systems.


.. c:function:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*.  The pointer
   refers to the internal buffer of *string*, not a copy.  The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated.  If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that.  If *string* is not a string object at all,
   :c:func:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. c:function:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :c:func:`PyString_AsString` but without error checking.  Only
   string objects are supported; no Unicode objects should be passed.


.. c:function:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input. For Unicode
   objects it returns the default encoded version of the object.  If *length* is
   *NULL*, the resulting buffer may not contain NUL characters; if it does, the
   function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy. The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``.  It must not be deallocated.  If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that.  If *obj* is not a string object at all,
   :c:func:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.

   .. versionchanged:: 2.5
      This function used an :c:type:`int *` type for *length*. This might
      require changes in your code for properly supporting 64-bit systems.


.. c:function:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference.  The reference to
   the old value of *string* will be stolen.  If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. c:function:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*.  This version decrements the reference count of *newpart*.


.. c:function:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable". Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code.  It is an error to call this function if the
   refcount on the input string object is not one. Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value.  If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, a memory exception is set, and ``-1`` is returned.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *newsize*. This might
      require changes in your code for properly supporting 64-bit systems.

.. c:function:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*. Analogous to ``format %
   args``.  The *args* argument must be a tuple or dict.


.. c:function:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place.  The argument must be the address of a
   pointer variable pointing to a Python string object.  If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count).  (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_InternFromString(const char *v)

   A combination of :c:func:`PyString_FromString` and
   :c:func:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*.  *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry.  Return
   *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object. *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry. Return *NULL*
   if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Encode the :c:type:`char` buffer of the given size by passing it to the codec
   registered for *encoding* and return a Python object. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the string
   :meth:`encode` method. The codec to be used is looked up using the Python codec
   registry.  Return *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object. *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method. The codec to be
   used is looked up using the Python codec registry. Return *NULL* if an exception
   was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

    334