.. highlightlang:: c


.. _defining-new-types:

******************
Defining New Types
******************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


As mentioned in the last chapter, Python allows the writer of an extension
module to define new types that can be manipulated from Python code, much like
strings and lists in core Python.

This is not hard; the code for all extension types follows a pattern, but there
are some details that you need to understand before you can get started.


.. _dnt-basics:

The Basics
==========

The Python runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
:c:type:`PyObject` itself only contains the refcount and a pointer to the
object's "type object".  This is where the action is; the type object determines
which (C) functions get called when, for instance, an attribute gets looked
up on an object or it is multiplied by another object.  These C functions
are called "type methods".

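Although type methods are C functions, their effect can be observed from
Python: an operator consults the object's type, not the instance.  The sketch
below illustrates this with a pure-Python class (``Scaled`` is a hypothetical
name used only here; it is not part of the :mod:`noddy` example):

.. code-block:: python

   class Scaled:
       """Toy numeric wrapper; ``*`` is routed through the type's __mul__."""

       def __init__(self, value):
           self.value = value

       def __mul__(self, other):
           return Scaled(self.value * other)


   x = Scaled(3)
   # An instance attribute with a special-method name is not consulted by
   # the ``*`` operator; only the entry on the type matters.
   x.__mul__ = lambda other: "ignored"
   product = x * 2
   print(product.value)
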
So, if you want to define a new object type, you need to create a new type
object.

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type:

.. literalinclude:: ../includes/noddy.c


Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the last chapter.

The first bit that will be new is::

   typedef struct {
       PyObject_HEAD
   } noddy_NoddyObject;

This is what a Noddy object will contain.  In this case, it is nothing more than
what every Python object contains: a field called ``ob_base`` of type
:c:type:`PyObject`.  :c:type:`PyObject`, in turn, contains an ``ob_refcnt``
field and a pointer to a type object.  These can be accessed using the macros
:c:macro:`Py_REFCNT` and :c:macro:`Py_TYPE` respectively.  These are the fields
the :c:macro:`PyObject_HEAD` macro brings in.  The reason for the macro is to
standardize the layout and to enable special debugging fields in debug builds.

Note that there is no semicolon after the :c:macro:`PyObject_HEAD` macro;
one is included in the macro definition.  Be wary of adding one by
accident; it's easy to do from habit, and your compiler might not complain,
but someone else's probably will!  (On Windows, MSVC is known to call this an
error and refuse to compile the code.)

For contrast, let's take a look at the corresponding definition for standard
Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

Moving on, we come to the crunch: the type object. ::

   static PyTypeObject noddy_NoddyType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       "noddy.Noddy",             /* tp_name */
       sizeof(noddy_NoddyObject), /* tp_basicsize */
       0,                         /* tp_itemsize */
       0,                         /* tp_dealloc */
       0,                         /* tp_print */
       0,                         /* tp_getattr */
       0,                         /* tp_setattr */
       0,                         /* tp_as_async */
       0,                         /* tp_repr */
       0,                         /* tp_as_number */
       0,                         /* tp_as_sequence */
       0,                         /* tp_as_mapping */
       0,                         /* tp_hash */
       0,                         /* tp_call */
       0,                         /* tp_str */
       0,                         /* tp_getattro */
       0,                         /* tp_setattro */
       0,                         /* tp_as_buffer */
       Py_TPFLAGS_DEFAULT,        /* tp_flags */
       "Noddy objects",           /* tp_doc */
   };

Now if you go and look up the definition of :c:type:`PyTypeObject` in
:file:`object.h` you'll see that it has many more fields than the definition
above.  The remaining fields will be filled with zeros by the C compiler, and
it's common practice to not specify them explicitly unless you need them.

This is so important that we're going to pick the top of it apart still
further::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is a bit of a wart; what we'd like to write is::

   PyVarObject_HEAD_INIT(&PyType_Type, 0)

as the type of a type object is "type", but this isn't strictly conforming C and
some compilers complain.  Fortunately, this member will be filled in for us by
:c:func:`PyType_Ready`. ::

   "noddy.Noddy",              /* tp_name */

The name of our type.  This will appear in the default textual representation of
our objects and in some error messages, for example (the exact wording of the
message depends on the Python version)::

   >>> "" + noddy.Noddy()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: cannot add type "noddy.Noddy" to string

Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`noddy` and
the type is :class:`Noddy`, so we set the type name to :class:`noddy.Noddy`.
One side effect of using an undotted name is that the pydoc documentation tool
will not list the new type in the module documentation. ::

   sizeof(noddy_NoddyObject),  /* tp_basicsize */

This is so that Python knows how much memory to allocate when you call
:c:func:`PyObject_New`.

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance.  A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error.  You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does.  Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

::

   0,                          /* tp_itemsize */

This has to do with variable length objects like lists and strings. Ignore this
for now.

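The three slots discussed so far surface in Python as read-only attributes of
type objects.  The sketch below inspects a few built-in types rather than
:class:`Noddy`; the numeric sizes are CPython implementation details that vary
by platform, so treat it as an illustration rather than a specification:

.. code-block:: python

   import collections

   # tp_name: for a statically defined type such as collections.deque, the
   # text before the last dot is reported as __module__, the rest as __name__.
   print(collections.deque.__module__, collections.deque.__name__)

   # tp_basicsize: the float struct adds a C double to the common object
   # header, so on CPython it is larger than a bare object.
   print(object.__basicsize__, float.__basicsize__)

   # tp_itemsize: nonzero only for types whose instances carry a
   # variable-length tail, such as tuple and bytes.
   print(float.__itemsize__, tuple.__itemsize__, bytes.__itemsize__)
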
Skipping a number of type methods that we don't provide, we set the class flags
to :const:`Py_TPFLAGS_DEFAULT`. ::

   Py_TPFLAGS_DEFAULT,        /* tp_flags */

All types should include this constant in their flags.  It enables all of the
members defined until at least Python 3.3.  If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   "Noddy objects",           /* tp_doc */

Now we get into the type methods, the things that make your objects different
from the others.  We aren't going to implement any of these in this version of
the module.  We'll expand this example later to have more interesting behavior.

For now, all we want to be able to do is to create new :class:`Noddy` objects.
To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new` implementation.
In this case, we can just use the default implementation provided by the API
function :c:func:`PyType_GenericNew`.  We'd like to just assign this to the
:c:member:`~PyTypeObject.tp_new` slot, but we can't, for portability's sake: on some platforms or
compilers, we can't statically initialize a structure member with a function
defined in another C module.  Instead, we'll assign the :c:member:`~PyTypeObject.tp_new` slot
in the module initialization function just before calling
:c:func:`PyType_Ready`::

   noddy_NoddyType.tp_new = PyType_GenericNew;
   if (PyType_Ready(&noddy_NoddyType) < 0)
       return NULL;

All the other type methods are *NULL*; we'll go over them in a later section.

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_noddy`::

   if (PyType_Ready(&noddy_NoddyType) < 0)
       return NULL;

This initializes the :class:`Noddy` type, filling in a number of members,
including :attr:`ob_type` that we initially set to *NULL*. ::

   PyModule_AddObject(m, "Noddy", (PyObject *)&noddy_NoddyType);

This adds the type to the module dictionary.  This allows us to create
:class:`Noddy` instances by calling the :class:`Noddy` class::

   >>> import noddy
   >>> mynoddy = noddy.Noddy()

That's it!  All that remains is to build it; put the above code in a file called
:file:`noddy.c` and ::

   from distutils.core import setup, Extension
   setup(name="noddy", version="1.0",
         ext_modules=[Extension("noddy", ["noddy.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`noddy.so` in a subdirectory; move to
that directory and fire up Python.  You should be able to ``import noddy`` and
play around with Noddy objects.

That wasn't so hard, was it?

Of course, the current Noddy type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.


Adding data and methods to the Basic example
--------------------------------------------

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class. We'll create a new module, :mod:`noddy2` that
adds these capabilities:

.. literalinclude:: ../includes/noddy2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The name of the :class:`Noddy` object structure has been shortened to
:class:`Noddy`.  The type object name has been shortened to :class:`NoddyType`.

The :class:`Noddy` type now has three data attributes, *first*, *last*, and
*number*.  The *first* and *last* variables are Python strings containing first
and last names. The *number* attribute is an integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first;
       PyObject *last;
       int number;
   } Noddy;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation.  At a minimum, we need a deallocation method::

   static void
   Noddy_dealloc(Noddy* self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject*)self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   (destructor)Noddy_dealloc, /*tp_dealloc*/

This method decrements the reference counts of the two Python attributes. We use
:c:func:`Py_XDECREF` here because the :attr:`first` and :attr:`last` members
could be *NULL*.  It then calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
to free the object's memory.  Note that the object's type might not be
:class:`NoddyType`, because the object may be an instance of a subclass.

We want to make sure that the first and last names are initialized to empty
strings, so we provide a :c:member:`~PyTypeObject.tp_new` implementation::

   static PyObject *
   Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       Noddy *self;

       self = (Noddy *)type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }

           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }

           self->number = 0;
       }

       return (PyObject *)self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   Noddy_new,                 /* tp_new */

The :c:member:`~PyTypeObject.tp_new` member is responsible for creating (as opposed to initializing) objects
of the type.  It is exposed in Python as the :meth:`__new__` method.  See the
paper titled "Unifying types and classes in Python" for a detailed discussion of
the :meth:`__new__` method.  One reason to implement a new method is to ensure
the initial values of instance variables.  In this case, we use the new method
to make sure that the initial values of the members :attr:`first` and
:attr:`last` are not *NULL*. If we didn't care whether the initial values were
*NULL*, we could have used :c:func:`PyType_GenericNew` as our new method, as we
did before.  :c:func:`PyType_GenericNew` initializes all of the instance variable
members to *NULL*.

The new method is a static method that is passed the type being instantiated and
any arguments passed when the type was called, and that returns the new object
created. New methods always accept positional and keyword arguments, but they
often ignore the arguments, leaving the argument handling to initializer
methods. Note that if the type supports subclassing, the type passed may not be
the type being defined.  The new method calls the :c:member:`~PyTypeObject.tp_alloc` slot to
allocate memory. We don't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
:c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
which is :class:`object` by default.  Most types use the default allocation.

.. note::

   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one that calls a base type's
   :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`), you must *not* try to determine what method
   to call using method resolution order at runtime.  Always statically determine
   what type you are going to call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We provide an initialization function::

   static int
   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
   {
       PyObject *first=NULL, *last=NULL, *tmp;

       static char *kwlist[] = {"first", "last", "number", NULL};

       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                         &first, &last,
                                         &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }

       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }

       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   (initproc)Noddy_init,         /* tp_init */

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the :meth:`__init__` method. It
is used to initialize an object after it's created. Unlike the new method, we
can't guarantee that the initializer is called.  The initializer isn't called
when unpickling objects and it can be overridden.  Our initializer accepts
arguments to provide initial values for our instance. Initializers always accept
positional and keyword arguments. Initializers should return either 0 on
success or -1 on error.

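The split between the two slots can be observed from Python through
:meth:`__new__` and :meth:`__init__`.  The following pure-Python sketch uses a
hypothetical ``Record`` class and a class-level ``log`` list, both introduced
only for this illustration.  It relies on :func:`copy.copy`, which uses the same
reduction protocol as pickling, to rebuild an object without running the
initializer:

.. code-block:: python

   import copy


   class Record:
       """Hypothetical class that logs which construction hooks run."""

       log = []

       def __new__(cls, *args, **kwargs):
           Record.log.append("new")
           self = super().__new__(cls)
           # Guarantee defaults here, as Noddy_new does for first and last.
           self.first = ""
           self.last = ""
           return self

       def __init__(self, first="", last=""):
           Record.log.append("init")
           self.first = first
           self.last = last


   r = Record("Ada", "Lovelace")   # runs __new__, then __init__
   r.__init__("Grace", "Hopper")   # __init__ again, on the same object
   clone = copy.copy(r)            # rebuilt via __new__; __init__ is skipped
   print(Record.log)
   print(clone.first, clone.last)
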
Initializers can be called multiple times.  Anyone can call the :meth:`__init__`
method on our objects.  For this reason, we have to be extra careful when
assigning the new values.  We might be tempted, for example, to assign the
:attr:`first` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky.  Our type doesn't restrict the type of the
:attr:`first` member, so it could be any kind of object.  It could have a
destructor that causes code to be executed that tries to access the
:attr:`first` member.  To be paranoid and protect ourselves against this
possibility, we almost always reassign members before decrementing their
reference counts.  When don't we have to do this?

* when we absolutely know that the reference count is greater than 1

* when we know that deallocation of the object [#]_ will not cause any calls
  back into our type's code

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc` handler when
  garbage collection is not supported [#]_

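The re-entrancy that motivates this ordering can be observed from Python.  In
the sketch below, ``Holder`` and ``Nosy`` are hypothetical classes introduced
for illustration: the old value's finalizer runs while the attribute is being
replaced and reads the attribute back.  On CPython, attribute storage follows
the same "reassign first, then decrement" order, so the finalizer sees the new
value rather than a dangling one:

.. code-block:: python

   class Holder:
       """Stands in for a Noddy instance with a single ``first`` attribute."""


   class Nosy:
       """Value whose finalizer reads the owner's attribute while being destroyed."""

       def __init__(self, owner, seen):
           self.owner = owner
           self.seen = seen

       def __del__(self):
           # Runs while the old value's reference count is being dropped.
           self.seen.append(self.owner.first)


   seen = []
   h = Holder()
   h.first = Nosy(h, seen)
   h.first = "replacement"   # drops the last reference to the Nosy object
   print(seen)
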
We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::

   static PyMemberDef Noddy_members[] = {
       {"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
        "last name"},
       {"number", T_INT, offsetof(Noddy, number), 0,
        "noddy number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   Noddy_members,             /* tp_members */

Each member definition has a member name, type, offset, access flags and
documentation string. See the :ref:`Generic-Attribute-Management` section below for
details.

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes.  We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to *NULL*.  Even
though we can make sure the members are initialized to non-*NULL* values, the
members can be set to *NULL* if the attributes are deleted.

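Both drawbacks can be tried out without compiling anything, because CPython
implements ``__slots__`` entries with the same kind of member descriptor.  The
sketch below uses a hypothetical ``Pair`` class as a stand-in for the
:class:`Noddy` members:

.. code-block:: python

   class Pair:
       """Hypothetical stand-in for Noddy's first and last members."""

       __slots__ = ("first", "last")


   p = Pair()
   p.first = 42            # no type restriction: any object is accepted
   del p.first             # deletion clears the underlying pointer
   try:
       p.first
   except AttributeError:
       outcome = "AttributeError"
   else:
       outcome = "value still present"
   print(type(Pair.first).__name__, outcome)
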
We define a single method, :meth:`name`, that outputs the object's name as the
concatenation of the first and last names. ::

   static PyObject *
   Noddy_name(Noddy* self)
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }

       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }

       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Noddy` (or
:class:`Noddy` subclass) instance as the first argument.  Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method::

   def name(self):
      return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are *NULL*.  This is because they can be deleted, in which
case they are set to *NULL*.  It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings.  We'll see how to
do that in the next section.

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Noddy_methods[] = {
       {"name", (PyCFunction)Noddy_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

and assign them to the :c:member:`~PyTypeObject.tp_methods` slot::

   Noddy_methods,             /* tp_methods */

Note that we used the :const:`METH_NOARGS` flag to indicate that the method is
passed no arguments.

Finally, we'll make our type usable as a base class.  We've written our methods
carefully so far so that they don't make any assumptions about the type of the
object being created or used, so all we need to do is to add the
:const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/

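The effect of this flag is visible on built-in types as well.  The sketch
below reads ``__flags__`` from two of them; the bit position used for
:const:`Py_TPFLAGS_BASETYPE` is copied from CPython's :file:`object.h` and
should be treated as an assumption of this example, and ``MyBool`` is a
hypothetical name:

.. code-block:: python

   # Assumed bit position of Py_TPFLAGS_BASETYPE in CPython's object.h.
   PY_TPFLAGS_BASETYPE = 1 << 10

   int_is_base = bool(int.__flags__ & PY_TPFLAGS_BASETYPE)
   bool_is_base = bool(bool.__flags__ & PY_TPFLAGS_BASETYPE)
   print(int_is_base, bool_is_base)

   message = None
   try:
       class MyBool(bool):     # bool does not set the flag
           pass
   except TypeError as exc:
       message = str(exc)
   print(message)
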
We rename :c:func:`PyInit_noddy` to :c:func:`PyInit_noddy2` and update the module
name in the :c:type:`PyModuleDef` struct.

Finally, we update our :file:`setup.py` file to build the new module::

   from distutils.core import setup, Extension
   setup(name="noddy", version="1.0",
         ext_modules=[
            Extension("noddy", ["noddy.c"]),
            Extension("noddy2", ["noddy2.c"]),
            ])


Providing finer control over data attributes
--------------------------------------------

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Noddy` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/noddy3.c


To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Noddy_getfirst(Noddy *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Noddy_setfirst(Noddy *self, PyObject *value, void *closure)
   {
       PyObject *tmp;

       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }

       if (! PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a str");
           return -1;
       }

       /* Reassign before decrementing, as recommended earlier. */
       tmp = self->first;
       Py_INCREF(value);
       self->first = value;
       Py_DECREF(tmp);

       return 0;
   }

The getter function is passed a :class:`Noddy` object and a "closure", which is
a void pointer. In this case, the closure is ignored. (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Noddy` object, the new value, and the
closure. The new value may be *NULL*, in which case the attribute is being
deleted.  In our setter, we raise an error if the attribute is deleted or if the
attribute value is not a string.

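For comparison, a pure-Python analogue of this getter and setter pair can be
written with :class:`property`.  This is only a sketch of the intended
behaviour, not how the extension is implemented: the ``_first`` attribute is
hypothetical, and the single C setter (which receives *NULL* on deletion) is
split into a separate setter and deleter:

.. code-block:: python

   class Noddy:
       """Pure-Python analogue; only ``first`` is modelled."""

       def __init__(self, first=""):
           self.first = first          # goes through the setter below

       @property
       def first(self):
           return self._first

       @first.setter
       def first(self, value):
           if not isinstance(value, str):
               raise TypeError("The first attribute value must be a str")
           self._first = value

       @first.deleter
       def first(self):
           raise TypeError("Cannot delete the first attribute")


   n = Noddy("Ada")
   errors = []
   for action in (lambda: setattr(n, "first", 42), lambda: delattr(n, "first")):
       try:
           action()
       except TypeError as exc:
           errors.append(str(exc))
   print(n.first, errors)
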
We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Noddy_getseters[] = {
       {"first",
        (getter)Noddy_getfirst, (setter)Noddy_setfirst,
        "first name",
        NULL},
       {"last",
        (getter)Noddy_getlast, (setter)Noddy_setlast,
        "last name",
        NULL},
       {NULL}  /* Sentinel */
   };

and register it in the :c:member:`~PyTypeObject.tp_getset` slot::

   Noddy_getseters,           /* tp_getset */

The last item in a :c:type:`PyGetSetDef` structure is the closure mentioned
above. In this case, we aren't using the closure, so we just pass *NULL*.

We also remove the member definitions for these attributes::

   static PyMemberDef Noddy_members[] = {
       {"number", T_INT, offsetof(Noddy, number), 0,
        "noddy number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only allow strings [#]_ to
be passed::

   static int
   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
   {
       PyObject *first=NULL, *last=NULL, *tmp;

       static char *kwlist[] = {"first", "last", "number", NULL};

       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                         &first, &last,
                                         &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }

       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }

       return 0;
   }

With these changes, we can ensure that the :attr:`first` and :attr:`last`
members are never *NULL* so we can remove checks for *NULL* values in almost all
cases. This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls. The only place we can't change these calls is in the
deallocator, where there is the possibility that the initialization of these
members failed in the constructor.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
------------------------------------

Python has a cyclic garbage collector that can identify unneeded objects even
when their reference counts are not zero. This can happen when objects are
involved in cycles.  For example, consider::

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.

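The same situation can be watched from Python with a weak reference.  The
sketch below is an illustration on CPython; ``Node`` is a hypothetical class,
used because lists cannot be weakly referenced, and the automatic collector is
paused so that the intermediate state is observable:

.. code-block:: python

   import gc
   import weakref


   class Node:
       """Hypothetical container that can refer to itself."""


   gc.disable()                 # keep the automatic collector out of the way
   node = Node()
   node.me = node               # the object now refers to itself
   probe = weakref.ref(node)

   del node
   alive_after_del = probe() is not None      # reference count is still nonzero

   gc.collect()                 # the cycle detector reclaims the object
   alive_after_collect = probe() is not None
   gc.enable()
   print(alive_after_del, alive_after_collect)
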
    668 In the second version of the :class:`Noddy` example, we allowed any kind of
    669 object to be stored in the :attr:`first` or :attr:`last` attributes. [#]_ This
    670 means that :class:`Noddy` objects can participate in cycles::
    671 
    672    >>> import noddy2
    673    >>> n = noddy2.Noddy()
    674    >>> l = [n]
    675    >>> n.first = l
    676 
    677 This is pretty silly, but it gives us an excuse to add support for the
    678 cyclic-garbage collector to the :class:`Noddy` example.  To support cyclic
    679 garbage collection, types need to fill two slots and set a class flag that
    680 enables these slots:
    681 
    682 .. literalinclude:: ../includes/noddy4.c
    683 
    684 
    685 The traversal method provides access to subobjects that could participate in
    686 cycles::
    687 
    688    static int
    689    Noddy_traverse(Noddy *self, visitproc visit, void *arg)
    690    {
    691        int vret;
    692 
    693        if (self->first) {
    694            vret = visit(self->first, arg);
    695            if (vret != 0)
    696                return vret;
    697        }
    698        if (self->last) {
    699            vret = visit(self->last, arg);
    700            if (vret != 0)
    701                return vret;
    702        }
    703 
    704        return 0;
    705    }
    706 
    707 For each subobject that can participate in cycles, we need to call the
    708 :c:func:`visit` function, which is passed to the traversal method. The
    709 :c:func:`visit` function takes as arguments the subobject and the extra argument
    710 *arg* passed to the traversal method.  It returns an integer value that must be
    711 returned if it is non-zero.
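
The propagation rule can be tried out in isolation.  The following is a
standalone sketch that does not use the Python headers; ``toy_pair``,
``toy_visitproc``, ``toy_search`` and ``stop_at`` are hypothetical names that
only mimic the shape of a traversal method and a visit function.

```c
#include <stddef.h>

/* Hypothetical stand-in for visitproc: non-zero means "stop the walk". */
typedef int (*toy_visitproc)(void *subobject, void *arg);

/* Hypothetical container with two optional subobjects, like Noddy. */
typedef struct {
    void *first;
    void *last;
} toy_pair;

/* Mirrors Noddy_traverse: visit each non-NULL member and hand any
   non-zero result straight back to the caller. */
static int
toy_pair_traverse(toy_pair *self, toy_visitproc visit, void *arg)
{
    int vret;

    if (self->first) {
        vret = visit(self->first, arg);
        if (vret != 0)
            return vret;
    }
    if (self->last) {
        vret = visit(self->last, arg);
        if (vret != 0)
            return vret;
    }
    return 0;
}

/* State shared with the visitor through the extra argument. */
typedef struct {
    void *target;   /* stop when this subobject is seen */
    int   visited;  /* number of subobjects visited so far */
} toy_search;

/* Example visitor: counts visits and returns 7 once the target is found. */
static int
stop_at(void *subobject, void *arg)
{
    toy_search *s = (toy_search *)arg;

    s->visited++;
    return subobject == s->target ? 7 : 0;
}
```

When the visitor reports a non-zero value for the first member, the second
member is never visited and the caller receives that same value.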
    712 
    713 Python provides a :c:func:`Py_VISIT` macro that automates calling visit
    714 functions.  With :c:func:`Py_VISIT`, :c:func:`Noddy_traverse` can be simplified::
    715 
    716    static int
    717    Noddy_traverse(Noddy *self, visitproc visit, void *arg)
    718    {
    719        Py_VISIT(self->first);
    720        Py_VISIT(self->last);
    721        return 0;
    722    }
    723 
    724 .. note::
    725 
    726    Note that the :c:member:`~PyTypeObject.tp_traverse` implementation must name its arguments exactly
    727    *visit* and *arg* in order to use :c:func:`Py_VISIT`.  This is to encourage
    728    uniformity across these boring implementations.
    729 
    730 We also need to provide a method for clearing any subobjects that can
    731 participate in cycles.  We implement the method and reimplement the deallocator
    732 to use it::
    733 
    734    static int
    735    Noddy_clear(Noddy *self)
    736    {
    737        PyObject *tmp;
    738 
    739        tmp = self->first;
    740        self->first = NULL;
    741        Py_XDECREF(tmp);
    742 
    743        tmp = self->last;
    744        self->last = NULL;
    745        Py_XDECREF(tmp);
    746 
    747        return 0;
    748    }
    749 
    750    static void
    751    Noddy_dealloc(Noddy* self)
    752    {
    753        Noddy_clear(self);
    754        Py_TYPE(self)->tp_free((PyObject*)self);
    755    }
    756 
    757 Notice the use of a temporary variable in :c:func:`Noddy_clear`. We use the
    758 temporary variable so that we can set each member to *NULL* before decrementing
    759 its reference count.  We do this because, as was discussed earlier, if the
    760 reference count drops to zero, we might cause code to run that calls back into
    761 the object.  In addition, because we now support garbage collection, we also
    762 have to worry about code being run that triggers garbage collection.  If garbage
    763 collection is run, our :c:member:`~PyTypeObject.tp_traverse` handler could get called. We can't
    764 take a chance of having :c:func:`Noddy_traverse` called when a member's reference
    765 count has dropped to zero and its value hasn't been set to *NULL*.
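
The difference between the two orderings can be made visible with a standalone
model.  This sketch does not use the Python headers; ``toy_obj``,
``toy_owner``, ``nosy_finalizer`` and the two ``clear_*`` functions are
hypothetical, and the finalizer merely stands in for arbitrary code that runs
when a reference count reaches zero.

```c
#include <stddef.h>

/* Hypothetical reference-counted object with a finalizer hook. */
typedef struct toy_obj {
    int refcnt;
    void (*on_free)(struct toy_obj *self);
} toy_obj;

/* Hypothetical owner with a single member, like Noddy's "first". */
typedef struct {
    toy_obj *first;
} toy_owner;

static toy_owner *observed_owner;  /* owner that the finalizer inspects */
static int saw_dying_member;       /* set if the owner still pointed at a dying object */

static void
toy_decref(toy_obj *o)
{
    if (o != NULL && --o->refcnt == 0 && o->on_free != NULL)
        o->on_free(o);
}

/* Finalizer that "calls back" into the owner, as arbitrary code might. */
static void
nosy_finalizer(toy_obj *self)
{
    if (observed_owner->first == self)
        saw_dying_member = 1;
}

/* Unsafe order: the member is still reachable while it is being freed. */
static void
clear_unsafe(toy_owner *owner)
{
    toy_decref(owner->first);
    owner->first = NULL;
}

/* Safe order, as in Noddy_clear: detach first, then drop the reference. */
static void
clear_safe(toy_owner *owner)
{
    toy_obj *tmp = owner->first;

    owner->first = NULL;
    toy_decref(tmp);
}
```

With ``clear_unsafe`` the finalizer finds the owner still pointing at the object
being destroyed; with ``clear_safe`` it only ever sees *NULL*.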
    766 
Python provides a :c:func:`Py_CLEAR` macro that automates the careful decrementing
of reference counts.  With :c:func:`Py_CLEAR`, the :c:func:`Noddy_clear` function
can be simplified::
    770 
    771    static int
    772    Noddy_clear(Noddy *self)
    773    {
    774        Py_CLEAR(self->first);
    775        Py_CLEAR(self->last);
    776        return 0;
    777    }
    778 
    779 Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::
    780 
    781    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */
    782 
    783 That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
    784 :c:member:`~PyTypeObject.tp_free` slots, we'd need to modify them for cyclic-garbage collection.
    785 Most extensions will use the versions automatically provided.
    786 
    787 
    788 Subclassing other types
    789 -----------------------
    790 
It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built-in types, since an extension can
easily use the :class:`PyTypeObject` it needs. It can be difficult to share
these :class:`PyTypeObject` structures between extension modules.
    795 
    796 In this example we will create a :class:`Shoddy` type that inherits from the
    797 built-in :class:`list` type. The new type will be completely compatible with
    798 regular lists, but will have an additional :meth:`increment` method that
    799 increases an internal counter. ::
    800 
    801    >>> import shoddy
    802    >>> s = shoddy.Shoddy(range(3))
    803    >>> s.extend(s)
    804    >>> print(len(s))
    805    6
    806    >>> print(s.increment())
    807    1
    808    >>> print(s.increment())
    809    2
    810 
    811 .. literalinclude:: ../includes/shoddy.c
    812 
    813 
    814 As you can see, the source code closely resembles the :class:`Noddy` examples in
    815 previous sections. We will break down the main differences between them. ::
    816 
    817    typedef struct {
    818        PyListObject list;
    819        int state;
    820    } Shoddy;
    821 
The primary difference for derived type objects is that the base type's object
structure must be the first member of the derived type's structure. The base type
will already include the :c:func:`PyObject_HEAD` at the beginning of its structure.
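
The layout rule is ordinary C: a pointer to a structure also points at its
first member.  The sketch below is a standalone model that does not use the
Python headers; ``toy_base`` and ``toy_derived`` are hypothetical stand-ins
for :c:type:`PyListObject` and ``Shoddy``.

```c
#include <stddef.h>

/* Hypothetical base object: a header plus base-specific data. */
typedef struct {
    int refcnt;
    int length;
} toy_base;

/* Hypothetical derived object: the base structure is the first member,
   just as PyListObject is the first member of Shoddy. */
typedef struct {
    toy_base base;
    int state;
} toy_derived;

/* Code written for the base type only knows about toy_base. */
static int
toy_base_length(toy_base *b)
{
    return b->length;
}

/* Code for the derived type casts back to reach the extra field. */
static int
toy_derived_increment(toy_base *b)
{
    toy_derived *d = (toy_derived *)b;

    return ++d->state;
}
```

If ``base`` were not the first member, the cast in ``toy_derived_increment``
would no longer land on the start of the derived structure.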
    825 
    826 When a Python object is a :class:`Shoddy` instance, its *PyObject\** pointer can
    827 be safely cast to both *PyListObject\** and *Shoddy\**. ::
    828 
    829    static int
    830    Shoddy_init(Shoddy *self, PyObject *args, PyObject *kwds)
    831    {
    832        if (PyList_Type.tp_init((PyObject *)self, args, kwds) < 0)
    833           return -1;
    834        self->state = 0;
    835        return 0;
    836    }
    837 
    838 In the :attr:`__init__` method for our type, we can see how to call through to
    839 the :attr:`__init__` method of the base type.
    840 
This pattern is important when writing a type with custom :attr:`new` and
:attr:`dealloc` methods. The :attr:`new` method should not actually create the
memory for the object with :c:member:`~PyTypeObject.tp_alloc`; that will be handled by the base
class when its :c:member:`~PyTypeObject.tp_new` is called.
    845 
When filling out the :c:type:`PyTypeObject` for the :class:`Shoddy` type, you see
a slot for :c:member:`~PyTypeObject.tp_base`. Due to cross-platform compiler issues, you can't
fill that field directly with the address of :c:data:`PyList_Type`; it can be done
later in the module's initialization function. ::
    850 
    851    PyMODINIT_FUNC
    852    PyInit_shoddy(void)
    853    {
    854        PyObject *m;
    855 
    856        ShoddyType.tp_base = &PyList_Type;
    857        if (PyType_Ready(&ShoddyType) < 0)
    858            return NULL;
    859 
    860        m = PyModule_Create(&shoddymodule);
    861        if (m == NULL)
    862            return NULL;
    863 
    864        Py_INCREF(&ShoddyType);
    865        PyModule_AddObject(m, "Shoddy", (PyObject *) &ShoddyType);
    866        return m;
    867    }
    868 
Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in. When we are deriving from an existing type, it is not
necessary to fill out the :c:member:`~PyTypeObject.tp_alloc` slot with :c:func:`PyType_GenericAlloc`
-- the allocation function from the base type will be inherited.
    873 
    874 After that, calling :c:func:`PyType_Ready` and adding the type object to the
    875 module is the same as with the basic :class:`Noddy` examples.
    876 
    877 
    878 .. _dnt-type-methods:
    879 
    880 Type Methods
    881 ============
    882 
    883 This section aims to give a quick fly-by on the various type methods you can
    884 implement and what they do.
    885 
    886 Here is the definition of :c:type:`PyTypeObject`, with some fields only used in
    887 debug builds omitted:
    888 
    889 .. literalinclude:: ../includes/typestruct.h
    890 
    891 
    892 Now that's a *lot* of methods.  Don't worry too much though - if you have a type
    893 you want to define, the chances are very good that you will only implement a
    894 handful of these.
    895 
    896 As you probably expect by now, we're going to go over this and give more
    897 information about the various handlers.  We won't go in the order they are
    898 defined in the structure, because there is a lot of historical baggage that
    899 impacts the ordering of the fields; be sure your type initialization keeps the
    900 fields in the right order!  It's often easiest to find an example that includes
    901 all the fields you need (even if they're initialized to ``0``) and then change
    902 the values to suit your new type. ::
    903 
    904    const char *tp_name; /* For printing */
    905 
    906 The name of the type - as mentioned in the last section, this will appear in
    907 various places, almost entirely for diagnostic purposes. Try to choose something
    908 that will be helpful in such a situation! ::
    909 
    910    Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */
    911 
    912 These fields tell the runtime how much memory to allocate when new objects of
    913 this type are created.  Python has some built-in support for variable length
    914 structures (think: strings, lists) which is where the :c:member:`~PyTypeObject.tp_itemsize` field
    915 comes in.  This will be dealt with later. ::
    916 
    917    const char *tp_doc;
    918 
    919 Here you can put a string (or its address) that you want returned when the
    920 Python script references ``obj.__doc__`` to retrieve the doc string.
    921 
    922 Now we come to the basic type methods---the ones most extension types will
    923 implement.
    924 
    925 
    926 Finalization and De-allocation
    927 ------------------------------
    928 
    929 .. index::
    930    single: object; deallocation
    931    single: deallocation, object
    932    single: object; finalization
    933    single: finalization, of objects
    934 
    935 ::
    936 
    937    destructor tp_dealloc;
    938 
    939 This function is called when the reference count of the instance of your type is
    940 reduced to zero and the Python interpreter wants to reclaim it.  If your type
    941 has memory to free or other clean-up to perform, you can put it here.  The
    942 object itself needs to be freed here as well.  Here is an example of this
    943 function::
    944 
   static void
   newdatatype_dealloc(newdatatypeobject *obj)
   {
       free(obj->obj_UnderlyingDatatypePtr);
       /* tp_free expects a PyObject *, so cast explicitly */
       Py_TYPE(obj)->tp_free((PyObject *)obj);
   }
    951 
    952 .. index::
    953    single: PyErr_Fetch()
    954    single: PyErr_Restore()
    955 
    956 One important requirement of the deallocator function is that it leaves any
    957 pending exceptions alone.  This is important since deallocators are frequently
    958 called as the interpreter unwinds the Python stack; when the stack is unwound
    959 due to an exception (rather than normal returns), nothing is done to protect the
    960 deallocators from seeing that an exception has already been set.  Any actions
    961 which a deallocator performs which may cause additional Python code to be
    962 executed may detect that an exception has been set.  This can lead to misleading
    963 errors from the interpreter.  The proper way to protect against this is to save
    964 a pending exception before performing the unsafe action, and restoring it when
    965 done.  This can be done using the :c:func:`PyErr_Fetch` and
    966 :c:func:`PyErr_Restore` functions::
    967 
    968    static void
    969    my_dealloc(PyObject *obj)
    970    {
    971        MyObject *self = (MyObject *) obj;
    972        PyObject *cbresult;
    973 
    974        if (self->my_callback != NULL) {
    975            PyObject *err_type, *err_value, *err_traceback;
    976 
    977            /* This saves the current exception state */
    978            PyErr_Fetch(&err_type, &err_value, &err_traceback);
    979 
    980            cbresult = PyObject_CallObject(self->my_callback, NULL);
    981            if (cbresult == NULL)
    982                PyErr_WriteUnraisable(self->my_callback);
    983            else
    984                Py_DECREF(cbresult);
    985 
    986            /* This restores the saved exception state */
    987            PyErr_Restore(err_type, err_value, err_traceback);
    988 
    989            Py_DECREF(self->my_callback);
    990        }
    991        Py_TYPE(obj)->tp_free((PyObject*)self);
    992    }
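
The save-and-restore pattern itself can be studied without the interpreter.
The following standalone sketch does not use the Python headers; the one-slot
``pending_error`` variable and the ``toy_*`` functions are hypothetical
stand-ins for the error indicator, :c:func:`PyErr_Fetch` and
:c:func:`PyErr_Restore`.

```c
#include <stddef.h>

/* Hypothetical one-slot "pending error" indicator; NULL means no error. */
static const char *pending_error;

static void
toy_set_error(const char *msg)
{
    pending_error = msg;
}

/* Model of fetch: hand the pending error to the caller and clear the slot. */
static const char *
toy_fetch_error(void)
{
    const char *saved = pending_error;

    pending_error = NULL;
    return saved;
}

/* Model of restore: put a previously fetched error back. */
static void
toy_restore_error(const char *saved)
{
    pending_error = saved;
}

/* A callback that fails and leaves its own error behind. */
static int
failing_callback(void)
{
    toy_set_error("callback failed");
    return -1;
}

/* Deallocator-style cleanup that runs a callback without disturbing
   whatever error was pending when it was entered. */
static void
toy_dealloc(int (*callback)(void))
{
    const char *saved = toy_fetch_error();

    if (callback() < 0)
        (void)toy_fetch_error();  /* discard the callback's own error; the
                                     real code reports it first with
                                     PyErr_WriteUnraisable */

    toy_restore_error(saved);
}
```

Whatever was pending before ``toy_dealloc`` ran is pending again afterwards,
and an error raised by the callback never replaces it.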
    993 
    994 .. note::
    995    There are limitations to what you can safely do in a deallocator function.
    996    First, if your type supports garbage collection (using :c:member:`~PyTypeObject.tp_traverse`
    997    and/or :c:member:`~PyTypeObject.tp_clear`), some of the object's members can have been
    998    cleared or finalized by the time :c:member:`~PyTypeObject.tp_dealloc` is called.  Second, in
    999    :c:member:`~PyTypeObject.tp_dealloc`, your object is in an unstable state: its reference
   1000    count is equal to zero.  Any call to a non-trivial object or API (as in the
   1001    example above) might end up calling :c:member:`~PyTypeObject.tp_dealloc` again, causing a
   1002    double free and a crash.
   1003 
   1004    Starting with Python 3.4, it is recommended not to put any complex
   1005    finalization code in :c:member:`~PyTypeObject.tp_dealloc`, and instead use the new
   1006    :c:member:`~PyTypeObject.tp_finalize` type method.
   1007 
   1008    .. seealso::
   1009       :pep:`442` explains the new finalization scheme.
   1010 
   1011 .. index::
   1012    single: string; object representation
   1013    builtin: repr
   1014 
   1015 Object Presentation
   1016 -------------------
   1017 
   1018 In Python, there are two ways to generate a textual representation of an object:
   1019 the :func:`repr` function, and the :func:`str` function.  (The :func:`print`
   1020 function just calls :func:`str`.)  These handlers are both optional.
   1021 
   1022 ::
   1023 
   1024    reprfunc tp_repr;
   1025    reprfunc tp_str;
   1026 
   1027 The :c:member:`~PyTypeObject.tp_repr` handler should return a string object containing a
   1028 representation of the instance for which it is called.  Here is a simple
   1029 example::
   1030 
   static PyObject *
   newdatatype_repr(newdatatypeobject * obj)
   {
       return PyUnicode_FromFormat("Repr-ified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }
   1037 
   1038 If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter will supply a
   1039 representation that uses the type's :c:member:`~PyTypeObject.tp_name` and a uniquely-identifying
   1040 value for the object.
   1041 
   1042 The :c:member:`~PyTypeObject.tp_str` handler is to :func:`str` what the :c:member:`~PyTypeObject.tp_repr` handler
   1043 described above is to :func:`repr`; that is, it is called when Python code calls
   1044 :func:`str` on an instance of your object.  Its implementation is very similar
   1045 to the :c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended for human
   1046 consumption.  If :c:member:`~PyTypeObject.tp_str` is not specified, the :c:member:`~PyTypeObject.tp_repr` handler is
   1047 used instead.
   1048 
   1049 Here is a simple example::
   1050 
   static PyObject *
   newdatatype_str(newdatatypeobject * obj)
   {
       return PyUnicode_FromFormat("Stringified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }
   1057 
   1058 
   1059 
   1060 Attribute Management
   1061 --------------------
   1062 
   1063 For every object which can support attributes, the corresponding type must
   1064 provide the functions that control how the attributes are resolved.  There needs
   1065 to be a function which can retrieve attributes (if any are defined), and another
   1066 to set attributes (if setting attributes is allowed).  Removing an attribute is
   1067 a special case, for which the new value passed to the handler is *NULL*.
   1068 
   1069 Python supports two pairs of attribute handlers; a type that supports attributes
   1070 only needs to implement the functions for one pair.  The difference is that one
   1071 pair takes the name of the attribute as a :c:type:`char\*`, while the other
   1072 accepts a :c:type:`PyObject\*`.  Each type can use whichever pair makes more
   1073 sense for the implementation's convenience. ::
   1074 
   1075    getattrfunc  tp_getattr;        /* char * version */
   1076    setattrfunc  tp_setattr;
   1077    /* ... */
   1078    getattrofunc tp_getattro;       /* PyObject * version */
   1079    setattrofunc tp_setattro;
   1080 
If accessing attributes of an object is always a simple operation (this will be
explained shortly), there are generic implementations which can be used to
provide the :c:type:`PyObject\*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though there are many examples which have
not been updated to use the new generic mechanisms that are available.
   1087 
   1088 
   1089 .. _generic-attribute-management:
   1090 
   1091 Generic Attribute Management
   1092 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   1093 
   1094 Most extension types only use *simple* attributes.  So, what makes the
   1095 attributes simple?  There are only a couple of conditions that must be met:
   1096 
#. The names of the attributes must be known when :c:func:`PyType_Ready` is
   called.
   1099 
   1100 #. No special processing is needed to record that an attribute was looked up or
   1101    set, nor do actions need to be taken based on the value.
   1102 
   1103 Note that this list does not place any restrictions on the values of the
   1104 attributes, when the values are computed, or how relevant data is stored.
   1105 
When :c:func:`PyType_Ready` is called, it uses three tables referenced by the
type object to create :term:`descriptor`\s which are placed in the dictionary of the
type object.  Each descriptor controls access to one attribute of the instance
object.  Each of the tables is optional; if all three are *NULL*, instances of
the type will only have attributes that are inherited from their base type, and
the type should leave the :c:member:`~PyTypeObject.tp_getattro` and
:c:member:`~PyTypeObject.tp_setattro` fields *NULL* as well, allowing the base
type to handle attributes.
   1113 
   1114 The tables are declared as three fields of the type object::
   1115 
   1116    struct PyMethodDef *tp_methods;
   1117    struct PyMemberDef *tp_members;
   1118    struct PyGetSetDef *tp_getset;
   1119 
   1120 If :c:member:`~PyTypeObject.tp_methods` is not *NULL*, it must refer to an array of
   1121 :c:type:`PyMethodDef` structures.  Each entry in the table is an instance of this
   1122 structure::
   1123 
   1124    typedef struct PyMethodDef {
   1125        const char  *ml_name;       /* method name */
   1126        PyCFunction  ml_meth;       /* implementation function */
   1127        int          ml_flags;      /* flags */
   1128        const char  *ml_doc;        /* docstring */
   1129    } PyMethodDef;
   1130 
   1131 One entry should be defined for each method provided by the type; no entries are
   1132 needed for methods inherited from a base type.  One additional entry is needed
   1133 at the end; it is a sentinel that marks the end of the array.  The
   1134 :attr:`ml_name` field of the sentinel must be *NULL*.
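
The role of the sentinel is easiest to see from the side of the code that reads
such a table.  The sketch below is a standalone model that does not use the
Python headers; ``toy_methoddef`` is a hypothetical, cut-down analogue of
:c:type:`PyMethodDef`, and ``toy_find_method`` is an illustrative lookup, not
the one the interpreter uses.

```c
#include <stddef.h>
#include <string.h>

typedef int (*toy_func)(int);

/* Hypothetical, cut-down analogue of PyMethodDef. */
typedef struct {
    const char *name;   /* NULL marks the sentinel entry */
    toy_func    meth;   /* implementation function */
} toy_methoddef;

static int toy_double(int x) { return 2 * x; }
static int toy_negate(int x) { return -x; }

static toy_methoddef toy_methods[] = {
    {"double", toy_double},
    {"negate", toy_negate},
    {NULL, NULL}           /* sentinel */
};

/* Walk the table until the sentinel; return NULL if the name is absent. */
static toy_func
toy_find_method(const toy_methoddef *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->meth;
    }
    return NULL;
}
```

Because the table carries no length, forgetting the sentinel would let the loop
run past the end of the array.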
   1135 
   1136 The second table is used to define attributes which map directly to data stored
   1137 in the instance.  A variety of primitive C types are supported, and access may
   1138 be read-only or read-write.  The structures in the table are defined as::
   1139 
   typedef struct PyMemberDef {
       char       *name;
       int         type;
       Py_ssize_t  offset;
       int         flags;
       char       *doc;
   } PyMemberDef;
   1147 
   1148 For each entry in the table, a :term:`descriptor` will be constructed and added to the
   1149 type which will be able to extract a value from the instance structure.  The
   1150 :attr:`type` field should contain one of the type codes defined in the
   1151 :file:`structmember.h` header; the value will be used to determine how to
   1152 convert Python values to and from C values.  The :attr:`flags` field is used to
   1153 store flags which control how the attribute can be accessed.
   1154 
   1155 The following flag constants are defined in :file:`structmember.h`; they may be
   1156 combined using bitwise-OR.
   1157 
   1158 +---------------------------+----------------------------------------------+
   1159 | Constant                  | Meaning                                      |
   1160 +===========================+==============================================+
   1161 | :const:`READONLY`         | Never writable.                              |
   1162 +---------------------------+----------------------------------------------+
   1163 | :const:`READ_RESTRICTED`  | Not readable in restricted mode.             |
   1164 +---------------------------+----------------------------------------------+
   1165 | :const:`WRITE_RESTRICTED` | Not writable in restricted mode.             |
   1166 +---------------------------+----------------------------------------------+
   1167 | :const:`RESTRICTED`       | Not readable or writable in restricted mode. |
   1168 +---------------------------+----------------------------------------------+
   1169 
   1170 .. index::
   1171    single: READONLY
   1172    single: READ_RESTRICTED
   1173    single: WRITE_RESTRICTED
   1174    single: RESTRICTED
   1175 
   1176 An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build
   1177 descriptors that are used at runtime is that any attribute defined this way can
   1178 have an associated doc string simply by providing the text in the table.  An
   1179 application can use the introspection API to retrieve the descriptor from the
   1180 class object, and get the doc string using its :attr:`__doc__` attribute.
   1181 
   1182 As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry with a :attr:`name` value
   1183 of *NULL* is required.
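
How a table of names, type codes and offsets is enough to reach data inside an
instance can be shown with a standalone model.  The sketch below does not use
the Python headers; ``toy_memberdef``, the ``TOY_T_*`` codes and
``TOY_READONLY`` are hypothetical stand-ins for :c:type:`PyMemberDef` and the
constants in :file:`structmember.h`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type codes and flag; the real ones live in structmember.h. */
enum { TOY_T_INT, TOY_T_DOUBLE };
#define TOY_READONLY 1

typedef struct {
    const char *name;    /* NULL marks the sentinel entry */
    int         type;
    size_t      offset;  /* byte offset of the field in the instance */
    int         flags;
} toy_memberdef;

typedef struct {
    int    number;
    double ratio;
} toy_instance;

static const toy_memberdef toy_members[] = {
    {"number", TOY_T_INT,    offsetof(toy_instance, number), 0},
    {"ratio",  TOY_T_DOUBLE, offsetof(toy_instance, ratio),  TOY_READONLY},
    {NULL, 0, 0, 0}  /* sentinel */
};

/* Read any described member as a double.
   Returns 0 on success, -1 if the name is unknown. */
static int
toy_get_member(const void *inst, const char *name, double *out)
{
    const toy_memberdef *m;

    for (m = toy_members; m->name != NULL; m++) {
        if (strcmp(m->name, name) != 0)
            continue;
        const char *addr = (const char *)inst + m->offset;
        if (m->type == TOY_T_INT)
            *out = (double)*(const int *)addr;
        else
            *out = *(const double *)addr;
        return 0;
    }
    return -1;
}

/* Store an int member; refuses unknown names and read-only or
   non-int members by returning -1. */
static int
toy_set_int_member(void *inst, const char *name, int value)
{
    const toy_memberdef *m;

    for (m = toy_members; m->name != NULL; m++) {
        if (strcmp(m->name, name) != 0)
            continue;
        if ((m->flags & TOY_READONLY) || m->type != TOY_T_INT)
            return -1;
        *(int *)((char *)inst + m->offset) = value;
        return 0;
    }
    return -1;
}
```

The accessors never mention ``number`` or ``ratio`` directly; everything they
need comes from the table entry.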
   1184 
   1185 .. XXX Descriptors need to be explained in more detail somewhere, but not here.
   1186 
   1187    Descriptor objects have two handler functions which correspond to the
   1188    \member{tp_getattro} and \member{tp_setattro} handlers.  The
   1189    \method{__get__()} handler is a function which is passed the descriptor,
   1190    instance, and type objects, and returns the value of the attribute, or it
   1191    returns \NULL{} and sets an exception.  The \method{__set__()} handler is
   1192    passed the descriptor, instance, type, and new value;
   1193 
   1194 
   1195 Type-specific Attribute Management
   1196 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   1197 
   1198 For simplicity, only the :c:type:`char\*` version will be demonstrated here; the
   1199 type of the name parameter is the only difference between the :c:type:`char\*`
   1200 and :c:type:`PyObject\*` flavors of the interface. This example effectively does
   1201 the same thing as the generic example above, but does not use the generic
   1202 support added in Python 2.2.  It explains how the handler functions are
   1203 called, so that if you do need to extend their functionality, you'll understand
   1204 what needs to be done.
   1205 
   1206 The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object requires an attribute
   1207 look-up.  It is called in the same situations where the :meth:`__getattr__`
   1208 method of a class would be called.
   1209 
   1210 Here is an example::
   1211 
   static PyObject *
   newdatatype_getattr(newdatatypeobject *obj, char *name)
   {
       if (strcmp(name, "data") == 0)
       {
           return PyLong_FromLong(obj->data);
       }

       /* report the type's own name in the error message */
       PyErr_Format(PyExc_AttributeError,
                    "'%.50s' object has no attribute '%.400s'",
                    Py_TYPE(obj)->tp_name, name);
       return NULL;
   }
   1225 
   1226 The :c:member:`~PyTypeObject.tp_setattr` handler is called when the :meth:`__setattr__` or
   1227 :meth:`__delattr__` method of a class instance would be called.  When an
   1228 attribute should be deleted, the third parameter will be *NULL*.  Here is an
   1229 example that simply raises an exception; if this were really all you wanted, the
   1230 :c:member:`~PyTypeObject.tp_setattr` handler should be set to *NULL*. ::
   1231 
   static int
   newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
   {
       (void)PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
       return -1;
   }
   1238 
   1239 Object Comparison
   1240 -----------------
   1241 
   1242 ::
   1243 
   1244    richcmpfunc tp_richcompare;
   1245 
   1246 The :c:member:`~PyTypeObject.tp_richcompare` handler is called when comparisons are needed.  It is
   1247 analogous to the :ref:`rich comparison methods <richcmpfuncs>`, like
   1248 :meth:`__lt__`, and also called by :c:func:`PyObject_RichCompare` and
   1249 :c:func:`PyObject_RichCompareBool`.
   1250 
This function is called with two Python objects and the operator as arguments,
where the operator is one of ``Py_EQ``, ``Py_NE``, ``Py_LE``, ``Py_GE``,
``Py_LT`` or ``Py_GT``.  It should compare the two objects with respect to the
specified operator and return ``Py_True`` or ``Py_False`` if the comparison is
successful, ``Py_NotImplemented`` to indicate that comparison is not
implemented and the other object's comparison method should be tried, or *NULL*
if an exception was set.
   1258 
   1259 Here is a sample implementation, for a datatype that is considered equal if the
   1260 size of an internal pointer is equal::
   1261 
   static PyObject *
   newdatatype_richcmp(PyObject *obj1, PyObject *obj2, int op)
   {
       PyObject *result;
       int c, size1, size2;

       /* code to make sure that both arguments are of type
          newdatatype omitted */

       size1 = ((newdatatypeobject *)obj1)->obj_UnderlyingDatatypePtr->size;
       size2 = ((newdatatypeobject *)obj2)->obj_UnderlyingDatatypePtr->size;

       switch (op) {
       case Py_LT: c = size1 <  size2; break;
       case Py_LE: c = size1 <= size2; break;
       case Py_EQ: c = size1 == size2; break;
       case Py_NE: c = size1 != size2; break;
       case Py_GT: c = size1 >  size2; break;
       case Py_GE: c = size1 >= size2; break;
       default:
           /* not one of the six comparison operators */
           PyErr_BadArgument();
           return NULL;
       }
       result = c ? Py_True : Py_False;
       Py_INCREF(result);
       return result;
   }
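
The three possible outcomes (a truth value, "not implemented", or an error) can
be modelled without the interpreter.  The sketch below is standalone and does
not use the Python headers; ``toy_sized``, the ``TOY_*`` operator codes and the
``TOY_RESULT_*`` values are hypothetical.  It assumes that the type check
omitted from the example above is where a handler would report "not
implemented", which is a common arrangement rather than a rule.

```c
/* Hypothetical operator codes; the real Py_LT ... Py_GE come from Python's headers. */
enum { TOY_LT, TOY_LE, TOY_EQ, TOY_NE, TOY_GT, TOY_GE };

/* Hypothetical result codes standing in for Py_False, Py_True,
   Py_NotImplemented and a NULL return with an exception set. */
enum {
    TOY_RESULT_FALSE = 0,
    TOY_RESULT_TRUE = 1,
    TOY_RESULT_NOT_IMPLEMENTED = 2,
    TOY_RESULT_ERROR = -1
};

#define TOY_KIND_SIZED 1

/* Hypothetical object: a type tag plus the size used for comparison. */
typedef struct {
    int kind;
    int size;
} toy_sized;

static int
toy_sized_richcmp(const toy_sized *a, const toy_sized *b, int op)
{
    /* Decline to compare against objects of another kind. */
    if (a->kind != TOY_KIND_SIZED || b->kind != TOY_KIND_SIZED)
        return TOY_RESULT_NOT_IMPLEMENTED;

    switch (op) {
    case TOY_LT: return a->size <  b->size;
    case TOY_LE: return a->size <= b->size;
    case TOY_EQ: return a->size == b->size;
    case TOY_NE: return a->size != b->size;
    case TOY_GT: return a->size >  b->size;
    case TOY_GE: return a->size >= b->size;
    default:     return TOY_RESULT_ERROR;  /* unknown operator */
    }
}
```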
   1286 
   1287 
   1288 Abstract Protocol Support
   1289 -------------------------
   1290 
Python supports a variety of *abstract* 'protocols'; the specific interfaces
provided to use these protocols are documented in :ref:`abstract`.
   1293 
   1294 
   1295 A number of these abstract interfaces were defined early in the development of
   1296 the Python implementation.  In particular, the number, mapping, and sequence
   1297 protocols have been part of Python since the beginning.  Other protocols have
   1298 been added over time.  For protocols which depend on several handler routines
   1299 from the type implementation, the older protocols have been defined as optional
   1300 blocks of handlers referenced by the type object.  For newer protocols there are
   1301 additional slots in the main type object, with a flag bit being set to indicate
   1302 that the slots are present and should be checked by the interpreter.  (The flag
   1303 bit does not indicate that the slot values are non-*NULL*. The flag may be set
   1304 to indicate the presence of a slot, but a slot may still be unfilled.) ::
   1305 
   1306    PyNumberMethods   *tp_as_number;
   1307    PySequenceMethods *tp_as_sequence;
   1308    PyMappingMethods  *tp_as_mapping;
   1309 
If you wish your object to be able to act like a number, a sequence, or a
mapping object, then you place the address of a structure that implements the C
type :c:type:`PyNumberMethods`, :c:type:`PySequenceMethods`, or
:c:type:`PyMappingMethods`, respectively, in the corresponding slot. It is up to
you to fill in this structure with appropriate values. You can find examples of
the use of each of these in the :file:`Objects` directory of the Python source
distribution. ::
   1316 
   1317    hashfunc tp_hash;
   1318 
   1319 This function, if you choose to provide it, should return a hash number for an
   1320 instance of your data type. Here is a moderately pointless example::
   1321 
   static Py_hash_t
   newdatatype_hash(newdatatypeobject *obj)
   {
       Py_hash_t result;
       result = obj->obj_UnderlyingDatatypePtr->size;
       result = result * 3;
       /* -1 is reserved to signal an error, so never return it as a hash */
       if (result == -1)
           result = -2;
       return result;
   }
   1330 
   1331 ::
   1332 
   1333    ternaryfunc tp_call;
   1334 
This function is called when an instance of your data type is "called".  For
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is invoked.
   1338 
   1339 This function takes three arguments:
   1340 
   1341 #. *arg1* is the instance of the data type which is the subject of the call. If
   1342    the call is ``obj1('hello')``, then *arg1* is ``obj1``.
   1343 
   1344 #. *arg2* is a tuple containing the arguments to the call.  You can use
   1345    :c:func:`PyArg_ParseTuple` to extract the arguments.
   1346 
   1347 #. *arg3* is a dictionary of keyword arguments that were passed. If this is
   1348    non-*NULL* and you support keyword arguments, use
   1349    :c:func:`PyArg_ParseTupleAndKeywords` to extract the arguments.  If you do not
   1350    want to support keyword arguments and this is non-*NULL*, raise a
   1351    :exc:`TypeError` with a message saying that keyword arguments are not supported.
   1352 
   1353 Here is a desultory example of the implementation of the call function. ::
   1354 
   /* Implement the call function.
    *    obj is the instance receiving the call.
    *    args is a tuple containing the arguments to the call, in this
    *         case 3 strings.
    *    other is the dictionary of keyword arguments; it is ignored here.
    */
   static PyObject *
   newdatatype_call(newdatatypeobject *obj, PyObject *args, PyObject *other)
   {
       PyObject *result;
       const char *arg1;
       const char *arg2;
       const char *arg3;

       if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
           return NULL;
       }
       result = PyUnicode_FromFormat(
           "Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
           obj->obj_UnderlyingDatatypePtr->size,
           arg1, arg2, arg3);
       return result;
   }

::

   /* Iterators */
   getiterfunc tp_iter;
   iternextfunc tp_iternext;

These functions provide support for the iterator protocol.  Any object that
supports iteration over its contents (which may be generated during
iteration) must implement the ``tp_iter`` handler.  Objects returned
by a ``tp_iter`` handler must implement both the ``tp_iter`` and ``tp_iternext``
handlers.  Both handlers take exactly one parameter, the instance for which they
are being called, and return a new reference.  In the case of an error, they
should set an exception and return *NULL*.

For an object which represents an iterable collection, the ``tp_iter`` handler
must return an iterator object.  The iterator object is responsible for
maintaining the state of the iteration.  For collections which can support
multiple iterators which do not interfere with each other (as lists and tuples
do), a new iterator should be created and returned.  Objects which can only be
iterated over once (usually due to side effects of iteration) should implement
this handler by returning a new reference to themselves, and should also
implement the ``tp_iternext`` handler.  File objects are an example of such an
iterator.

Iterator objects should implement both handlers.  The ``tp_iter`` handler should
return a new reference to the iterator (this is the same as the ``tp_iter``
handler for objects which can only be iterated over destructively).  The
``tp_iternext`` handler should return a new reference to the next object in the
iteration if there is one.  If the iteration has reached the end, it may return
*NULL* without setting an exception, or it may set :exc:`StopIteration` and
return *NULL*; avoiding the exception can yield slightly better performance.
If an actual error occurs, it should set an exception and return *NULL*.
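
As a rough sketch of an object that can only be iterated over once and therefore
acts as its own iterator, the following hypothetical ``Countdown`` type yields
``remaining``, ``remaining - 1``, and so on down to ``1``, then signals
exhaustion by returning *NULL* without setting an exception.  The type, its
``remaining`` field, and the module name ``demo`` are assumptions made for this
illustration. ::

   #include <Python.h>

   /* Hypothetical one-shot iterator; invented for this sketch. */
   typedef struct {
       PyObject_HEAD
       long remaining;   /* number of values still to be produced */
   } CountdownObject;

   /* tp_iter: an iterator returns a new reference to itself. */
   static PyObject *
   countdown_iter(PyObject *self)
   {
       Py_INCREF(self);
       return self;
   }

   /* tp_iternext: return a new reference to the next value, or NULL
    * without an exception once the iteration is exhausted. */
   static PyObject *
   countdown_iternext(PyObject *self)
   {
       CountdownObject *cd = (CountdownObject *)self;

       if (cd->remaining <= 0) {
           return NULL;   /* exhausted; no StopIteration needed */
       }
       return PyLong_FromLong(cd->remaining--);
   }

   /* A real type would set "remaining" in tp_init; that is omitted here,
    * so new instances start out exhausted until the field is assigned. */
   static PyTypeObject CountdownType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       .tp_name = "demo.Countdown",
       .tp_basicsize = sizeof(CountdownObject),
       .tp_flags = Py_TPFLAGS_DEFAULT,
       .tp_new = PyType_GenericNew,
       .tp_iter = countdown_iter,
       .tp_iternext = countdown_iternext,
   };

Because the object is its own iterator, iterating over it a second time
produces nothing: the state consumed by the first pass is not restored.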


.. _weakref-support:

Weak Reference Support
----------------------

One of the goals of Python's weak-reference implementation is to allow any type
to participate in the weak reference mechanism without incurring overhead on
objects that do not benefit from weak referencing (such as numbers).

For an object to be weakly referenceable, the extension must include a
:c:type:`PyObject\*` field in the instance structure for the use of the weak
reference mechanism; the object's constructor must initialize it to *NULL*.
The extension must also set the :c:member:`~PyTypeObject.tp_weaklistoffset` field of the
corresponding type object to the offset of that field.  For example, the instance
type is defined with the following structure::

   typedef struct {
       PyObject_HEAD
       PyClassObject *in_class;       /* The class object */
       PyObject      *in_dict;        /* A dictionary */
       PyObject      *in_weakreflist; /* List of weak references */
   } PyInstanceObject;

The statically-declared type object for instances is defined this way::

   PyTypeObject PyInstance_Type = {
       PyVarObject_HEAD_INIT(&PyType_Type, 0)
       "module.instance",                          /* tp_name */

       /* Lots of stuff omitted for brevity... */

       Py_TPFLAGS_DEFAULT,                         /* tp_flags */
       0,                                          /* tp_doc */
       0,                                          /* tp_traverse */
       0,                                          /* tp_clear */
       0,                                          /* tp_richcompare */
       offsetof(PyInstanceObject, in_weakreflist), /* tp_weaklistoffset */
   };

The type constructor is responsible for initializing the weak reference list to
*NULL*::

   static PyObject *
   instance_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       PyInstanceObject *self;

       self = (PyInstanceObject *) type->tp_alloc(type, 0);
       if (self == NULL)
           return NULL;

       /* Other initialization stuff omitted for brevity */

       self->in_weakreflist = NULL;

       return (PyObject *) self;
   }

The only further addition is that the destructor needs to call the weak
reference manager to clear any weak references.  This is only required if the
weak reference list is non-*NULL*::

   static void
   instance_dealloc(PyInstanceObject *inst)
   {
       /* Allocate temporaries if needed, but do not begin
          destruction just yet.
        */

       if (inst->in_weakreflist != NULL)
           PyObject_ClearWeakRefs((PyObject *) inst);

       /* Proceed with object destruction normally. */
   }
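
Recording an offset in the type object, rather than a pointer, works because the
field sits at the same position in every instance of the type.  The following
plain C sketch models that idea with stand-in structures; ``FakeHead``,
``FakeInstance`` and ``weaklist_slot`` are hypothetical names, and the
interpreter's actual lookup may differ in detail. ::

   #include <stddef.h>

   /* Stand-in for the object header; not the real PyObject layout. */
   typedef struct {
       long refcount;
       void *type;
   } FakeHead;

   /* Stand-in for an instance structure with a weak reference list field. */
   typedef struct {
       FakeHead head;
       void *in_dict;
       void *in_weakreflist;
   } FakeInstance;

   /* Given only an object pointer and the byte offset stored in its type,
    * return the address of the weak reference list slot. */
   static void **
   weaklist_slot(void *obj, size_t offset)
   {
       return (void **)((char *)obj + offset);
   }

Calling ``weaklist_slot(&inst, offsetof(FakeInstance, in_weakreflist))`` yields
the address of ``inst.in_weakreflist`` for any ``FakeInstance``, which is why a
single value in the type object is enough for all of its instances.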


More Suggestions
----------------

Remember that you can omit most of these functions, in which case you provide
``0`` as a value.  There are type definitions for each of the functions you must
provide.  They are in :file:`object.h` in the Python include directory that
comes with the source distribution of Python.

To learn how to implement any specific method for your new data type,
download and unpack the Python source distribution.  Go to
the :file:`Objects` directory, then search the C source files for ``tp_`` plus
the function you want (for example, ``tp_richcompare``).  You will find examples
of the function you want to implement.

When you need to verify that an object is an instance of the type you are
implementing, use the :c:func:`PyObject_TypeCheck` function.  A sample of its use
might be something like the following::

   if (!PyObject_TypeCheck(some_object, &MyType)) {
       PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
       return NULL;
   }

.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler in this example, because our
   type doesn't support garbage collection. Even if a type supports garbage
   collection, there are calls that can be made to "untrack" the object from
   garbage collection; however, these calls are advanced and not covered here.

.. [#] We now know that the first and last members are strings, so perhaps we could be
   less careful about decrementing their reference counts; however, we accept
   instances of string subclasses. Even though deallocating normal strings won't
   call back into our objects, we can't guarantee that deallocating an instance of
   a string subclass won't call back into our objects.

.. [#] Even in the third version, we aren't guaranteed to avoid cycles.  Instances of
   string subclasses are allowed and string subclasses could allow cycles even if
   normal strings don't.