.. highlightlang:: c


.. _defining-new-types:

******************
Defining New Types
******************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


As mentioned in the last chapter, Python allows the writer of an extension
module to define new types that can be manipulated from Python code, much like
strings and lists in core Python.

This is not hard; the code for all extension types follows a pattern, but there
are some details that you need to understand before you can get started.


.. _dnt-basics:

The Basics
==========

The Python runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
:c:type:`PyObject` itself only contains the refcount and a pointer to the
object's "type object".  This is where the action is; the type object determines
which (C) functions get called when, for instance, an attribute gets looked
up on an object or it is multiplied by another object.  These C functions
are called "type methods".

So, if you want to define a new object type, you need to create a new type
object.

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type:

.. literalinclude:: ../includes/noddy.c


Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the last chapter.

The first bit that will be new is::

   typedef struct {
       PyObject_HEAD
   } noddy_NoddyObject;

This is what a Noddy object will contain---in this case, nothing more than what
every Python object contains---a field called ``ob_base`` of type
:c:type:`PyObject`.  :c:type:`PyObject` in turn, contains an ``ob_refcnt``
field and a pointer to a type object.
These can be accessed using the macros
:c:macro:`Py_REFCNT` and :c:macro:`Py_TYPE` respectively.  These are the fields
the :c:macro:`PyObject_HEAD` macro brings in.  The reason for the macro is to
standardize the layout and to enable special debugging fields in debug builds.

Note that there is no semicolon after the :c:macro:`PyObject_HEAD` macro;
one is included in the macro definition.  Be wary of adding one by
accident; it's easy to do from habit, and your compiler might not complain,
but someone else's probably will!  (On Windows, MSVC is known to call this an
error and refuse to compile the code.)

For contrast, let's take a look at the corresponding definition for standard
Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

Moving on, we come to the crunch --- the type object. ::

   static PyTypeObject noddy_NoddyType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       "noddy.Noddy",             /* tp_name */
       sizeof(noddy_NoddyObject), /* tp_basicsize */
       0,                         /* tp_itemsize */
       0,                         /* tp_dealloc */
       0,                         /* tp_print */
       0,                         /* tp_getattr */
       0,                         /* tp_setattr */
       0,                         /* tp_as_async */
       0,                         /* tp_repr */
       0,                         /* tp_as_number */
       0,                         /* tp_as_sequence */
       0,                         /* tp_as_mapping */
       0,                         /* tp_hash */
       0,                         /* tp_call */
       0,                         /* tp_str */
       0,                         /* tp_getattro */
       0,                         /* tp_setattro */
       0,                         /* tp_as_buffer */
       Py_TPFLAGS_DEFAULT,        /* tp_flags */
       "Noddy objects",           /* tp_doc */
   };

Now if you go and look up the definition of :c:type:`PyTypeObject` in
:file:`object.h` you'll see that it has many more fields than the definition
above.  The remaining fields will be filled with zeros by the C compiler, and
it's common practice to not specify them explicitly unless you need them.
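
The zero-filling of unlisted fields is ordinary C behavior and can be seen
without any Python headers.  The following is only a sketch: ``ToyType`` and
its members are hypothetical stand-ins invented for illustration, not the real
:c:type:`PyTypeObject` layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type object; NOT the real PyTypeObject. */
typedef struct {
    const char *name;
    size_t basicsize;
    size_t itemsize;
    void (*dealloc)(void *);
    unsigned long flags;
    const char *doc;
    void *(*new_slot)(void);   /* trailing slots, omitted from the initializer */
    int (*init_slot)(void *);
} ToyType;

/* Only the leading members are listed; C zero-initializes the rest. */
static ToyType toy_type = {
    "toy.Toy",          /* name */
    sizeof(int),        /* basicsize */
    0,                  /* itemsize */
    0,                  /* dealloc */
    1UL,                /* flags */
    "Toy objects",      /* doc */
};

/* Returns 1 when every slot left out of the initializer is a null pointer. */
static int toy_trailing_slots_are_null(void)
{
    return toy_type.new_slot == NULL && toy_type.init_slot == NULL;
}
```

Because the omitted slots are guaranteed to be null, a slot such as
``new_slot`` can safely be assigned later, before the structure is first used.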

This is so important that we're going to pick the top of it apart still
further::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is a bit of a wart; what we'd like to write is::

   PyVarObject_HEAD_INIT(&PyType_Type, 0)

as the type of a type object is "type", but this isn't strictly conforming C and
some compilers complain.  Fortunately, this member will be filled in for us by
:c:func:`PyType_Ready`. ::

   "noddy.Noddy",             /* tp_name */

The name of our type.  This will appear in the default textual representation of
our objects and in some error messages, for example::

   >>> "" + noddy.new_noddy()
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: cannot add type "noddy.Noddy" to string

Note that the name is a dotted name that includes both the module name and the
name of the type within the module.  The module in this case is :mod:`noddy` and
the type is :class:`Noddy`, so we set the type name to :class:`noddy.Noddy`.
One side effect of using an undotted name is that the pydoc documentation tool
will not list the new type in the module documentation. ::

   sizeof(noddy_NoddyObject), /* tp_basicsize */

This is so that Python knows how much memory to allocate when you call
:c:func:`PyObject_New`.

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance.  A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error.  You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does.
   Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

::

   0,                         /* tp_itemsize */

This has to do with variable length objects like lists and strings.  Ignore this
for now.

Skipping a number of type methods that we don't provide, we set the class flags
to :const:`Py_TPFLAGS_DEFAULT`. ::

   Py_TPFLAGS_DEFAULT,        /* tp_flags */

All types should include this constant in their flags.  It enables all of the
members defined until at least Python 3.3.  If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   "Noddy objects",           /* tp_doc */

Now we get into the type methods, the things that make your objects different
from the others.  We aren't going to implement any of these in this version of
the module.  We'll expand this example later to have more interesting behavior.

For now, all we want to be able to do is to create new :class:`Noddy` objects.
To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new` implementation.
In this case, we can just use the default implementation provided by the API
function :c:func:`PyType_GenericNew`.
We'd like to just assign this to the
:c:member:`~PyTypeObject.tp_new` slot, but we can't, for portability's sake.  On some platforms or
compilers, we can't statically initialize a structure member with a function
defined in another C module, so, instead, we'll assign the :c:member:`~PyTypeObject.tp_new` slot
in the module initialization function just before calling
:c:func:`PyType_Ready`::

   noddy_NoddyType.tp_new = PyType_GenericNew;
   if (PyType_Ready(&noddy_NoddyType) < 0)
       return NULL;

All the other type methods are *NULL*, so we'll go over them later --- that's
for a later section!

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_noddy`::

   if (PyType_Ready(&noddy_NoddyType) < 0)
       return NULL;

This initializes the :class:`Noddy` type, filling in a number of members,
including :attr:`ob_type` that we initially set to *NULL*. ::

   PyModule_AddObject(m, "Noddy", (PyObject *)&noddy_NoddyType);

This adds the type to the module dictionary.  This allows us to create
:class:`Noddy` instances by calling the :class:`Noddy` class::

   >>> import noddy
   >>> mynoddy = noddy.Noddy()

That's it!  All that remains is to build it; put the above code in a file called
:file:`noddy.c` and ::

   from distutils.core import setup, Extension
   setup(name="noddy", version="1.0",
         ext_modules=[Extension("noddy", ["noddy.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`noddy.so` in a subdirectory; move to
that directory and fire up Python --- you should be able to ``import noddy`` and
play around with Noddy objects.

That wasn't so hard, was it?

Of course, the current Noddy type is pretty uninteresting.  It has no data and
doesn't do anything.  It can't even be subclassed.


Adding data and methods to the Basic example
--------------------------------------------

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class.  We'll create a new module, :mod:`noddy2`, that
adds these capabilities:

.. literalinclude:: ../includes/noddy2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The name of the :class:`Noddy` object structure has been shortened to
:class:`Noddy`.  The type object name has been shortened to :class:`NoddyType`.

The :class:`Noddy` type now has three data attributes, *first*, *last*, and
*number*.  The *first* and *last* variables are Python strings containing first
and last names.  The *number* attribute is an integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first;
       PyObject *last;
       int number;
   } Noddy;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation.  At a minimum, we need a deallocation method::

   static void
   Noddy_dealloc(Noddy* self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject*)self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   (destructor)Noddy_dealloc, /*tp_dealloc*/

This method decrements the reference counts of the two Python attributes.  We use
:c:func:`Py_XDECREF` here because the :attr:`first` and :attr:`last` members
could be *NULL*.  It then calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
to free the object's memory.
Note that the object's type might not be
:class:`NoddyType`, because the object may be an instance of a subclass.

We want to make sure that the first and last names are initialized to empty
strings, so we provide a new method::

   static PyObject *
   Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       Noddy *self;

       self = (Noddy *)type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }

           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }

           self->number = 0;
       }

       return (PyObject *)self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   Noddy_new,                 /* tp_new */

The new member is responsible for creating (as opposed to initializing) objects
of the type.  It is exposed in Python as the :meth:`__new__` method.  See the
paper titled "Unifying types and classes in Python" for a detailed discussion of
the :meth:`__new__` method.  One reason to implement a new method is to ensure
the initial values of instance variables.  In this case, we use the new method
to make sure that the initial values of the members :attr:`first` and
:attr:`last` are not *NULL*.  If we didn't care whether the initial values were
*NULL*, we could have used :c:func:`PyType_GenericNew` as our new method, as we
did before.  :c:func:`PyType_GenericNew` initializes all of the instance variable
members to *NULL*.

The new method is a static method that is passed the type being instantiated and
any arguments passed when the type was called, and that returns the new object
created.  New methods always accept positional and keyword arguments, but they
often ignore the arguments, leaving the argument handling to initializer
methods.
Note that if the type supports subclassing, the type passed may not be
the type being defined.  The new method calls the :c:member:`~PyTypeObject.tp_alloc` slot to
allocate memory.  We don't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves.  Rather
:c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
which is :class:`object` by default.  Most types use the default allocation.

.. note::

   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one that calls a base type's
   :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`), you must *not* try to determine what method
   to call using method resolution order at runtime.  Always statically determine
   what type you are going to call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We provide an initialization function::

   static int
   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
   {
       PyObject *first=NULL, *last=NULL, *tmp;

       static char *kwlist[] = {"first", "last", "number", NULL};

       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                         &first, &last,
                                         &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }

       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }

       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   (initproc)Noddy_init,         /* tp_init */

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the :meth:`__init__` method.
It is used to initialize an object after it's created.  Unlike the new method, we
can't guarantee that the initializer is called.  The initializer isn't called
when unpickling objects and it can be overridden.  Our initializer accepts
arguments to provide initial values for our instance.  Initializers always accept
positional and keyword arguments.  Initializers should return either 0 on
success or -1 on error.

Initializers can be called multiple times.  Anyone can call the :meth:`__init__`
method on our objects.  For this reason, we have to be extra careful when
assigning the new values.  We might be tempted, for example, to assign the
:attr:`first` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky.  Our type doesn't restrict the type of the
:attr:`first` member, so it could be any kind of object.  It could have a
destructor that causes code to be executed that tries to access the
:attr:`first` member.  To be paranoid and protect ourselves against this
possibility, we almost always reassign members before decrementing their
reference counts.  When don't we have to do this?

* when we absolutely know that the reference count is greater than 1

* when we know that deallocation of the object [#]_ will not cause any calls
  back into our type's code

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc` handler when
  garbage collection is not supported [#]_

We want to expose our instance variables as attributes.  There are a
number of ways to do that.
The simplest way is to define member definitions::

   static PyMemberDef Noddy_members[] = {
       {"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
        "last name"},
       {"number", T_INT, offsetof(Noddy, number), 0,
        "noddy number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   Noddy_members,             /* tp_members */

Each member definition has a member name, type, offset, access flags and
documentation string.  See the :ref:`Generic-Attribute-Management` section below for
details.

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes.  We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to *NULL*.  Even
though we can make sure the members are initialized to non-*NULL* values, the
members can be set to *NULL* if the attributes are deleted.

We define a single method, :meth:`name`, that outputs the object's name as the
concatenation of the first and last names. ::

   static PyObject *
   Noddy_name(Noddy* self)
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }

       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }

       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Noddy` (or
:class:`Noddy` subclass) instance as the first argument.  Methods always take an
instance as the first argument.
Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary.  This method is
equivalent to the Python method::

   def name(self):
      return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are *NULL*.  This is because they can be deleted, in which
case they are set to *NULL*.  It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings.  We'll see how to
do that in the next section.

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Noddy_methods[] = {
       {"name", (PyCFunction)Noddy_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

and assign them to the :c:member:`~PyTypeObject.tp_methods` slot::

   Noddy_methods,             /* tp_methods */

Note that we used the :const:`METH_NOARGS` flag to indicate that the method is
passed no arguments.

Finally, we'll make our type usable as a base class.  We've written our methods
carefully so far so that they don't make any assumptions about the type of the
object being created or used, so all we need to do is to add the
:const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/

We rename :c:func:`PyInit_noddy` to :c:func:`PyInit_noddy2` and update the module
name in the :c:type:`PyModuleDef` struct.
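
The member and method tables in this section both end with a ``{NULL}``
sentinel entry, which is how a consumer knows where the array stops without
being told its length.  The following sketch shows the same convention in
plain C.  ``ToyMethodDef``, ``toy_methods`` and ``toy_find`` are hypothetical
names invented for illustration; this is not how the interpreter itself is
implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_func)(int);

/* Hypothetical analogue of a method table entry. */
typedef struct {
    const char *name;   /* NULL marks the end of the table */
    toy_func func;
    const char *doc;
} ToyMethodDef;

static int toy_double(int x) { return 2 * x; }
static int toy_negate(int x) { return -x; }

static ToyMethodDef toy_methods[] = {
    {"double", toy_double, "Return twice the argument"},
    {"negate", toy_negate, "Return the negated argument"},
    {NULL}  /* Sentinel */
};

/* Walk the table until the sentinel; return the matching entry or NULL. */
static const ToyMethodDef *
toy_find(const ToyMethodDef *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table;
    }
    return NULL;
}
```

Forgetting the sentinel would let a loop like the one in ``toy_find`` run past
the end of the array, which is why every table in this chapter ends with one.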

Finally, we update our :file:`setup.py` file to build the new module::

   from distutils.core import setup, Extension
   setup(name="noddy", version="1.0",
         ext_modules=[
            Extension("noddy", ["noddy.c"]),
            Extension("noddy2", ["noddy2.c"]),
            ])


Providing finer control over data attributes
--------------------------------------------

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Noddy` example.  In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted.  We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/noddy3.c


To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Noddy_getfirst(Noddy *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Noddy_setfirst(Noddy *self, PyObject *value, void *closure)
   {
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }

       if (! PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a str");
           return -1;
       }

       Py_DECREF(self->first);
       Py_INCREF(value);
       self->first = value;

       return 0;
   }

The getter function is passed a :class:`Noddy` object and a "closure", which is
a void pointer.  In this case, the closure is ignored.  (The closure supports an
advanced usage in which definition data is passed to the getter and setter.
This could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Noddy` object, the new value, and the
closure.  The new value may be *NULL*, in which case the attribute is being
deleted.  In our setter, we raise an error if the attribute is deleted or if the
attribute value is not a string.

We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Noddy_getseters[] = {
       {"first",
        (getter)Noddy_getfirst, (setter)Noddy_setfirst,
        "first name",
        NULL},
       {"last",
        (getter)Noddy_getlast, (setter)Noddy_setlast,
        "last name",
        NULL},
       {NULL}  /* Sentinel */
   };

and put it in the :c:member:`~PyTypeObject.tp_getset` slot::

   Noddy_getseters,           /* tp_getset */

to register our attribute getters and setters.

The last item in a :c:type:`PyGetSetDef` structure is the closure mentioned
above.  In this case, we aren't using the closure, so we just pass *NULL*.

We also remove the member definitions for these attributes::

   static PyMemberDef Noddy_members[] = {
       {"number", T_INT, offsetof(Noddy, number), 0,
        "noddy number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only allow strings [#]_ to
be passed::

   static int
   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
   {
       PyObject *first=NULL, *last=NULL, *tmp;

       static char *kwlist[] = {"first", "last", "number", NULL};

       if (!
           PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                       &first, &last,
                                       &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }

       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }

       return 0;
   }

With these changes, we can ensure that the :attr:`first` and :attr:`last`
members are never *NULL* so we can remove checks for *NULL* values in almost all
cases.  This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls.  The only place we can't change these calls is in the
deallocator, where there is the possibility that the initialization of these
members failed in the constructor.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
------------------------------------

Python has a cyclic-garbage collector that can identify unneeded objects even
when their reference counts are not zero.  This can happen when objects are
involved in cycles.  For example, consider::

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself.  When we delete it, it
still has a reference from itself.  Its reference count doesn't drop to zero.
Fortunately, Python's cyclic-garbage collector will eventually figure out that
the list is garbage and free it.

In the second version of the :class:`Noddy` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes.
[#]_  This means that :class:`Noddy` objects can participate in cycles::

   >>> import noddy2
   >>> n = noddy2.Noddy()
   >>> l = [n]
   >>> n.first = l

This is pretty silly, but it gives us an excuse to add support for the
cyclic-garbage collector to the :class:`Noddy` example.  To support cyclic
garbage collection, types need to fill two slots and set a class flag that
enables these slots:

.. literalinclude:: ../includes/noddy4.c


The traversal method provides access to subobjects that could participate in
cycles::

   static int
   Noddy_traverse(Noddy *self, visitproc visit, void *arg)
   {
       int vret;

       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }

       return 0;
   }

For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method.  The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method.  It returns an integer value that must be
returned if it is non-zero.

Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions.  With :c:func:`Py_VISIT`, :c:func:`Noddy_traverse` can be simplified::

   static int
   Noddy_traverse(Noddy *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }

.. note::

   Note that the :c:member:`~PyTypeObject.tp_traverse` implementation must name its arguments exactly
   *visit* and *arg* in order to use :c:func:`Py_VISIT`.  This is to encourage
   uniformity across these boring implementations.

We also need to provide a method for clearing any subobjects that can
participate in cycles.
We implement the method and reimplement the deallocator
to use it::

   static int
   Noddy_clear(Noddy *self)
   {
       PyObject *tmp;

       tmp = self->first;
       self->first = NULL;
       Py_XDECREF(tmp);

       tmp = self->last;
       self->last = NULL;
       Py_XDECREF(tmp);

       return 0;
   }

   static void
   Noddy_dealloc(Noddy* self)
   {
       Noddy_clear(self);
       Py_TYPE(self)->tp_free((PyObject*)self);
   }

Notice the use of a temporary variable in :c:func:`Noddy_clear`.  We use the
temporary variable so that we can set each member to *NULL* before decrementing
its reference count.  We do this because, as was discussed earlier, if the
reference count drops to zero, we might cause code to run that calls back into
the object.  In addition, because we now support garbage collection, we also
have to worry about code being run that triggers garbage collection.  If garbage
collection is run, our :c:member:`~PyTypeObject.tp_traverse` handler could get called.  We can't
take a chance of having :c:func:`Noddy_traverse` called when a member's reference
count has dropped to zero and its value hasn't been set to *NULL*.

Python provides a :c:func:`Py_CLEAR` macro that automates the careful decrementing of
reference counts.  With :c:func:`Py_CLEAR`, the :c:func:`Noddy_clear` function can
be simplified::

   static int
   Noddy_clear(Noddy *self)
   {
       Py_CLEAR(self->first);
       Py_CLEAR(self->last);
       return 0;
   }

Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::

   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */

That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` slots, we'd need to modify them for cyclic-garbage collection.
Most extensions will use the versions automatically provided.
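
The reason :c:func:`Noddy_clear` sets each member to *NULL* before dropping the
reference can be reproduced without the interpreter.  The sketch below uses a
toy reference count; ``Toy``, ``Owner``, ``toy_decref`` and ``TOY_CLEAR`` are
hypothetical names invented for illustration and are not part of the Python
C API.  The destruction hook stands in for arbitrary code that runs when a
count reaches zero and looks back at the owning object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy object: a refcount plus a hook run on destruction. */
typedef struct Toy {
    int refcnt;
    void (*on_destroy)(struct Toy *self, void *ctx);
    void *ctx;
} Toy;

typedef struct Owner {
    Toy *member;
    int saw_stale_member;   /* set if the hook saw a pointer to a dying object */
} Owner;

static void toy_decref(Toy *t)
{
    if (t != NULL && --t->refcnt == 0 && t->on_destroy != NULL)
        t->on_destroy(t, t->ctx);   /* may call back into the owner */
}

/* Same shape as the temporary-variable code: clear the slot, then release. */
#define TOY_CLEAR(slot)            \
    do {                           \
        Toy *tmp_ = (slot);        \
        if (tmp_ != NULL) {        \
            (slot) = NULL;         \
            toy_decref(tmp_);      \
        }                          \
    } while (0)

/* Hook: records whether the owner still points at the object being destroyed. */
static void record_owner_state(Toy *self, void *ctx)
{
    Owner *owner = ctx;
    if (owner->member == self)
        owner->saw_stale_member = 1;
}

/* Release first, clear afterwards: the hook observes a stale pointer. */
static int clear_naively(void)
{
    Toy t = {1, record_owner_state, NULL};
    Owner o = {&t, 0};
    t.ctx = &o;
    toy_decref(o.member);
    o.member = NULL;
    return o.saw_stale_member;
}

/* Clear first, release afterwards: the hook only ever sees NULL. */
static int clear_safely(void)
{
    Toy t = {1, record_owner_state, NULL};
    Owner o = {&t, 0};
    t.ctx = &o;
    TOY_CLEAR(o.member);
    return o.saw_stale_member;
}
```

In the naive ordering the hook runs while the owner still holds a pointer to
an object whose count is already zero, which is exactly the state a traversal
handler must never observe.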


Subclassing other types
-----------------------

It is possible to create new extension types that are derived from existing
types.  It is easiest to inherit from the built-in types, since an extension can
easily use the :class:`PyTypeObject` it needs.  It can be difficult to share
these :class:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`Shoddy` type that inherits from the
built-in :class:`list` type.  The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter. ::

   >>> import shoddy
   >>> s = shoddy.Shoddy(range(3))
   >>> s.extend(s)
   >>> print(len(s))
   6
   >>> print(s.increment())
   1
   >>> print(s.increment())
   2

.. literalinclude:: ../includes/shoddy.c


As you can see, the source code closely resembles the :class:`Noddy` examples in
previous sections.  We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } Shoddy;

The primary difference for derived type objects is that the base type's object
structure must be the first value.  The base type will already include the
:c:func:`PyObject_HEAD` at the beginning of its structure.

When a Python object is a :class:`Shoddy` instance, its *PyObject\** pointer can
be safely cast to both *PyListObject\** and *Shoddy\**. ::

   static int
   Shoddy_init(Shoddy *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *)self, args, kwds) < 0)
          return -1;
       self->state = 0;
       return 0;
   }

In the :attr:`__init__` method for our type, we can see how to call through to
the :attr:`__init__` method of the base type.

This pattern is important when writing a type with custom :attr:`new` and
:attr:`dealloc` methods.
The :attr:`new` method should not actually create the
memory for the object with :c:member:`~PyTypeObject.tp_alloc`; that will be handled by the base
class when calling its :c:member:`~PyTypeObject.tp_new`.

When filling out the :c:func:`PyTypeObject` for the :class:`Shoddy` type, you see
a slot for :c:func:`tp_base`.  Due to cross platform compiler issues, you can't
fill that field directly with the :c:func:`PyList_Type`; it can be done later in
the module's :c:func:`init` function. ::

   PyMODINIT_FUNC
   PyInit_shoddy(void)
   {
       PyObject *m;

       ShoddyType.tp_base = &PyList_Type;
       if (PyType_Ready(&ShoddyType) < 0)
           return NULL;

       m = PyModule_Create(&shoddymodule);
       if (m == NULL)
           return NULL;

       Py_INCREF(&ShoddyType);
       PyModule_AddObject(m, "Shoddy", (PyObject *) &ShoddyType);
       return m;
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in.  When we are deriving a new type, it is not
necessary to fill out the :c:member:`~PyTypeObject.tp_alloc` slot with :c:func:`PyType_GenericAlloc`
-- the allocation function from the base type will be inherited.

After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Noddy` examples.


.. _dnt-type-methods:

Type Methods
============

This section aims to give a quick fly-by on the various type methods you can
implement and what they do.

Here is the definition of :c:type:`PyTypeObject`, with some fields only used in
debug builds omitted:

.. literalinclude:: ../includes/typestruct.h


Now that's a *lot* of methods.  Don't worry too much though - if you have a type
you want to define, the chances are very good that you will only implement a
handful of these.

As you probably expect by now, we're going to go over this and give more
information about the various handlers. We won't go in the order they are
defined in the structure, because there is a lot of historical baggage that
impacts the ordering of the fields; be sure your type initialization keeps the
fields in the right order! It's often easiest to find an example that includes
all the fields you need (even if they're initialized to ``0``) and then change
the values to suit your new type. ::

   const char *tp_name; /* For printing */

The name of the type. As mentioned in the last section, this will appear in
various places, almost entirely for diagnostic purposes. Try to choose something
that will be helpful in such a situation! ::

   Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */

These fields tell the runtime how much memory to allocate when new objects of
this type are created. Python has some built-in support for variable-length
structures (think: strings, tuples), which is where the
:c:member:`~PyTypeObject.tp_itemsize` field comes in. This will be dealt with
later. ::

   const char *tp_doc;

Here you can put a string (or its address) that you want returned when the
Python script references ``obj.__doc__`` to retrieve the doc string.

Now we come to the basic type methods: the ones most extension types will
implement.


Finalization and De-allocation
------------------------------

.. index::
   single: object; deallocation
   single: deallocation, object
   single: object; finalization
   single: finalization, of objects

::

   destructor tp_dealloc;

This function is called when the reference count of the instance of your type is
reduced to zero and the Python interpreter wants to reclaim it. If your type
has memory to free or other clean-up to perform, you can put it here.
The object itself needs to be freed here as well. Here is an example of this
function::

   static void
   newdatatype_dealloc(newdatatypeobject *obj)
   {
       /* Release the memory owned by the instance first... */
       free(obj->obj_UnderlyingDatatypePtr);
       /* ...then release the instance itself. */
       Py_TYPE(obj)->tp_free(obj);
   }

.. index::
   single: PyErr_Fetch()
   single: PyErr_Restore()

One important requirement of the deallocator function is that it leaves any
pending exceptions alone. This is important since deallocators are frequently
called as the interpreter unwinds the Python stack; when the stack is unwound
due to an exception (rather than normal returns), nothing is done to protect the
deallocators from seeing that an exception has already been set. Any action
a deallocator performs that may cause additional Python code to be
executed may detect that an exception has been set. This can lead to misleading
errors from the interpreter. The proper way to protect against this is to save
a pending exception before performing the unsafe action, and restore it when
done. This can be done using the :c:func:`PyErr_Fetch` and
:c:func:`PyErr_Restore` functions::

   static void
   my_dealloc(PyObject *obj)
   {
       MyObject *self = (MyObject *) obj;
       PyObject *cbresult;

       if (self->my_callback != NULL) {
           PyObject *err_type, *err_value, *err_traceback;

           /* This saves the current exception state */
           PyErr_Fetch(&err_type, &err_value, &err_traceback);

           cbresult = PyObject_CallObject(self->my_callback, NULL);
           if (cbresult == NULL)
               PyErr_WriteUnraisable(self->my_callback);
           else
               Py_DECREF(cbresult);

           /* This restores the saved exception state */
           PyErr_Restore(err_type, err_value, err_traceback);

           Py_DECREF(self->my_callback);
       }
       Py_TYPE(obj)->tp_free((PyObject*)self);
   }

.. note::
   There are limitations to what you can safely do in a deallocator function.
   First, if your type supports garbage collection (using
   :c:member:`~PyTypeObject.tp_traverse` and/or
   :c:member:`~PyTypeObject.tp_clear`), some of the object's members may have
   been cleared or finalized by the time :c:member:`~PyTypeObject.tp_dealloc` is
   called. Second, in :c:member:`~PyTypeObject.tp_dealloc`, your object is in an
   unstable state: its reference count is equal to zero. Any call to a
   non-trivial object or API (as in the example above) might end up calling
   :c:member:`~PyTypeObject.tp_dealloc` again, causing a double free and a crash.

   Starting with Python 3.4, it is recommended not to put any complex
   finalization code in :c:member:`~PyTypeObject.tp_dealloc`, and instead use the
   new :c:member:`~PyTypeObject.tp_finalize` type method.

   .. seealso::
      :pep:`442` explains the new finalization scheme.

.. index::
   single: string; object representation
   builtin: repr

Object Presentation
-------------------

In Python, there are two ways to generate a textual representation of an object:
the :func:`repr` function, and the :func:`str` function. (The :func:`print`
function just calls :func:`str`.) These handlers are both optional.

::

   reprfunc tp_repr;
   reprfunc tp_str;

The :c:member:`~PyTypeObject.tp_repr` handler should return a string object
containing a representation of the instance for which it is called. Here is a
simple example::

   static PyObject *
   newdatatype_repr(newdatatypeobject *obj)
   {
       return PyUnicode_FromFormat("Repr-ified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }

If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter
will supply a representation that uses the type's
:c:member:`~PyTypeObject.tp_name` and a uniquely-identifying value for the
object.
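
As a rough illustration of what such a fallback representation looks like,
the following sketch builds a string from a type name and an object address
using only the standard C library. The helper ``fallback_repr`` is
hypothetical and is not part of the Python API; the exact text the
interpreter produces may differ in detail::

   #include <stdio.h>

   /* Hypothetical helper: format a fallback representation from a type
      name and the object's address, in the spirit of
      "<noddy.Noddy object at 0x...>". */
   static int
   fallback_repr(char *buf, size_t len, const char *type_name, const void *obj)
   {
       return snprintf(buf, len, "<%s object at %p>", type_name, obj);
   }

Two distinct objects of the same type get different strings, because the
address serves as the uniquely-identifying value.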

The :c:member:`~PyTypeObject.tp_str` handler is to :func:`str` what the
:c:member:`~PyTypeObject.tp_repr` handler described above is to :func:`repr`;
that is, it is called when Python code calls :func:`str` on an instance of your
object. Its implementation is very similar to the
:c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended
for human consumption. If :c:member:`~PyTypeObject.tp_str` is not specified, the
:c:member:`~PyTypeObject.tp_repr` handler is used instead.

Here is a simple example::

   static PyObject *
   newdatatype_str(newdatatypeobject *obj)
   {
       return PyUnicode_FromFormat("Stringified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }



Attribute Management
--------------------

For every object which can support attributes, the corresponding type must
provide the functions that control how the attributes are resolved. There needs
to be a function which can retrieve attributes (if any are defined), and another
to set attributes (if setting attributes is allowed). Removing an attribute is
a special case, for which the new value passed to the handler is *NULL*.

Python supports two pairs of attribute handlers; a type that supports attributes
only needs to implement the functions for one pair. The difference is that one
pair takes the name of the attribute as a :c:type:`char\*`, while the other
accepts a :c:type:`PyObject\*`. Each type can use whichever pair makes more
sense for the implementation's convenience. ::

   getattrfunc  tp_getattr;        /* char * version */
   setattrfunc  tp_setattr;
   /* ...
      */
   getattrofunc tp_getattro;       /* PyObject * version */
   setattrofunc tp_setattro;

If accessing attributes of an object is always a simple operation (this will be
explained shortly), there are generic implementations which can be used to
provide the :c:type:`PyObject\*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though there are many examples which have
not been updated to use the generic mechanisms that are now available.


.. _generic-attribute-management:

Generic Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Most extension types only use *simple* attributes. So, what makes the
attributes simple? There are only a couple of conditions that must be met:

#. The names of the attributes must be known when :c:func:`PyType_Ready` is
   called.

#. No special processing is needed to record that an attribute was looked up or
   set, nor do actions need to be taken based on the value.

Note that this list does not place any restrictions on the values of the
attributes, when the values are computed, or how relevant data is stored.

When :c:func:`PyType_Ready` is called, it uses three tables referenced by the
type object to create :term:`descriptor`\s which are placed in the dictionary of the
type object. Each descriptor controls access to one attribute of the instance
object. Each of the tables is optional; if all three are *NULL*, instances of
the type will only have attributes that are inherited from their base type, and
the type should leave the :c:member:`~PyTypeObject.tp_getattro` and
:c:member:`~PyTypeObject.tp_setattro` fields *NULL* as well, allowing the base
type to handle attributes.
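
The idea of table-driven attribute access can be sketched without the Python
headers. In the following toy model, the names ``toy_object``, ``toy_member``
and ``toy_get_int`` are hypothetical, and the real descriptors do
considerably more; the sketch only assumes that each table row maps an
attribute name to the location of its value inside the instance, and that a
row with a *NULL* name ends the table::

   #include <stddef.h>
   #include <string.h>

   /* Hypothetical instance layout. */
   typedef struct {
       int number;
       int count;
   } toy_object;

   /* One table row: attribute name plus where its value lives. */
   typedef struct {
       const char *name;   /* NULL marks the end of the table */
       size_t offset;
   } toy_member;

   static const toy_member toy_members[] = {
       {"number", offsetof(toy_object, number)},
       {"count",  offsetof(toy_object, count)},
       {NULL, 0}           /* sentinel */
   };

   /* Look up an int attribute by name; returns 0 on success and
      -1 if the name is not in the table. */
   static int
   toy_get_int(const toy_object *obj, const char *name, int *value)
   {
       const toy_member *m;

       for (m = toy_members; m->name != NULL; m++) {
           if (strcmp(m->name, name) == 0) {
               *value = *(const int *)((const char *)obj + m->offset);
               return 0;
           }
       }
       return -1;
   }

Because the names are fixed in a static table, this kind of lookup only works
for attributes that are known ahead of time, which matches the first
condition listed above.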

The tables are declared as three fields of the type object::

   struct PyMethodDef *tp_methods;
   struct PyMemberDef *tp_members;
   struct PyGetSetDef *tp_getset;

If :c:member:`~PyTypeObject.tp_methods` is not *NULL*, it must refer to an array of
:c:type:`PyMethodDef` structures. Each entry in the table is an instance of this
structure::

   typedef struct PyMethodDef {
       const char  *ml_name;       /* method name */
       PyCFunction  ml_meth;       /* implementation function */
       int          ml_flags;      /* flags */
       const char  *ml_doc;        /* docstring */
   } PyMethodDef;

One entry should be defined for each method provided by the type; no entries are
needed for methods inherited from a base type. One additional entry is needed
at the end; it is a sentinel that marks the end of the array. The
:attr:`ml_name` field of the sentinel must be *NULL*.

The second table is used to define attributes which map directly to data stored
in the instance. A variety of primitive C types are supported, and access may
be read-only or read-write. The structures in the table are defined as::

   typedef struct PyMemberDef {
       char       *name;
       int         type;
       Py_ssize_t  offset;
       int         flags;
       char       *doc;
   } PyMemberDef;

For each entry in the table, a :term:`descriptor` will be constructed and added to the
type which will be able to extract a value from the instance structure. The
:attr:`type` field should contain one of the type codes defined in the
:file:`structmember.h` header; the value will be used to determine how to
convert Python values to and from C values. The :attr:`flags` field is used to
store flags which control how the attribute can be accessed.

The following flag constants are defined in :file:`structmember.h`; they may be
combined using bitwise-OR.

+---------------------------+----------------------------------------------+
| Constant                  | Meaning                                      |
+===========================+==============================================+
| :const:`READONLY`         | Never writable.                              |
+---------------------------+----------------------------------------------+
| :const:`READ_RESTRICTED`  | Not readable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`WRITE_RESTRICTED` | Not writable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`RESTRICTED`       | Not readable or writable in restricted mode. |
+---------------------------+----------------------------------------------+

.. index::
   single: READONLY
   single: READ_RESTRICTED
   single: WRITE_RESTRICTED
   single: RESTRICTED

An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build
descriptors that are used at runtime is that any attribute defined this way can
have an associated doc string simply by providing the text in the table. An
application can use the introspection API to retrieve the descriptor from the
class object, and get the doc string using its :attr:`__doc__` attribute.

As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry with a
:attr:`name` value of *NULL* is required.

.. XXX Descriptors need to be explained in more detail somewhere, but not here.

   Descriptor objects have two handler functions which correspond to the
   \member{tp_getattro} and \member{tp_setattro} handlers.  The
   \method{__get__()} handler is a function which is passed the descriptor,
   instance, and type objects, and returns the value of the attribute, or it
   returns \NULL{} and sets an exception.
   The \method{__set__()} handler is
   passed the descriptor, instance, type, and new value;


Type-specific Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For simplicity, only the :c:type:`char\*` version will be demonstrated here; the
type of the name parameter is the only difference between the :c:type:`char\*`
and :c:type:`PyObject\*` flavors of the interface. This example effectively does
the same thing as the generic example above, but does not use the generic
support added in Python 2.2. It explains how the handler functions are
called, so that if you do need to extend their functionality, you'll understand
what needs to be done.

The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object
requires an attribute look-up. It is called in the same situations where the
:meth:`__getattr__` method of a class would be called.

Here is an example::

   static PyObject *
   newdatatype_getattr(newdatatypeobject *obj, char *name)
   {
       if (strcmp(name, "data") == 0)
       {
           return PyLong_FromLong(obj->data);
       }

       /* Unknown attribute: report it against the object's own type. */
       PyErr_Format(PyExc_AttributeError,
                    "'%.50s' object has no attribute '%.400s'",
                    Py_TYPE(obj)->tp_name, name);
       return NULL;
   }

The :c:member:`~PyTypeObject.tp_setattr` handler is called when the
:meth:`__setattr__` or :meth:`__delattr__` method of a class instance would be
called. When an attribute should be deleted, the third parameter will be
*NULL*. Here is an example that simply raises an exception; if this were really
all you wanted, the :c:member:`~PyTypeObject.tp_setattr` handler should be set
to *NULL*.
::

   static int
   newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
   {
       (void)PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
       return -1;
   }

Object Comparison
-----------------

::

   richcmpfunc tp_richcompare;

The :c:member:`~PyTypeObject.tp_richcompare` handler is called when comparisons
are needed. It is analogous to the :ref:`rich comparison methods <richcmpfuncs>`,
like :meth:`__lt__`, and is also called by :c:func:`PyObject_RichCompare` and
:c:func:`PyObject_RichCompareBool`.

This function is called with two Python objects and the operator as arguments,
where the operator is one of ``Py_EQ``, ``Py_NE``, ``Py_LE``, ``Py_GE``,
``Py_LT`` or ``Py_GT``. It should compare the two objects with respect to the
specified operator and return ``Py_True`` or ``Py_False`` if the comparison is
successful, ``Py_NotImplemented`` to indicate that comparison is not
implemented and the other object's comparison method should be tried, or *NULL*
if an exception was set.

Here is a sample implementation, for a datatype whose instances are considered
equal if the sizes of their underlying data are equal::

   static PyObject *
   newdatatype_richcmp(PyObject *obj1, PyObject *obj2, int op)
   {
       PyObject *result;
       int c, size1, size2;

       /* code to make sure that both arguments are of type
          newdatatype omitted */

       size1 = ((newdatatypeobject *)obj1)->obj_UnderlyingDatatypePtr->size;
       size2 = ((newdatatypeobject *)obj2)->obj_UnderlyingDatatypePtr->size;

       switch (op) {
       case Py_LT: c = size1 <  size2; break;
       case Py_LE: c = size1 <= size2; break;
       case Py_EQ: c = size1 == size2; break;
       case Py_NE: c = size1 != size2; break;
       case Py_GT: c = size1 >  size2; break;
       case Py_GE: c = size1 >= size2; break;
       default: Py_RETURN_NOTIMPLEMENTED;  /* unknown operator */
       }
       result = c ?
           Py_True : Py_False;
       Py_INCREF(result);
       return result;
   }


Abstract Protocol Support
-------------------------

Python supports a variety of *abstract* "protocols"; the specific interfaces
provided to use these protocols are documented in :ref:`abstract`.


A number of these abstract interfaces were defined early in the development of
the Python implementation. In particular, the number, mapping, and sequence
protocols have been part of Python since the beginning. Other protocols have
been added over time. For protocols which depend on several handler routines
from the type implementation, the older protocols have been defined as optional
blocks of handlers referenced by the type object. For newer protocols there are
additional slots in the main type object, with a flag bit being set to indicate
that the slots are present and should be checked by the interpreter. (The flag
bit does not indicate that the slot values are non-*NULL*. The flag may be set
to indicate the presence of a slot, but a slot may still be unfilled.) ::

   PyNumberMethods   *tp_as_number;
   PySequenceMethods *tp_as_sequence;
   PyMappingMethods  *tp_as_mapping;

If you wish your object to be able to act like a number, a sequence, or a
mapping object, then you place in the corresponding field the address of a
structure of the C type :c:type:`PyNumberMethods`, :c:type:`PySequenceMethods`,
or :c:type:`PyMappingMethods`, respectively. It is up to you to fill in this
structure with appropriate values. You can find examples of the use of each of
these in the :file:`Objects` directory of the Python source distribution. ::

   hashfunc tp_hash;

This function, if you choose to provide it, should return a hash number for an
instance of your data type.
Here is a moderately pointless example::

   static Py_hash_t
   newdatatype_hash(newdatatypeobject *obj)
   {
       Py_hash_t result;
       result = obj->obj_UnderlyingDatatypePtr->size;
       result = result * 3;
       return result;
   }

::

   ternaryfunc tp_call;

This function is called when an instance of your data type is "called"; for
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is
invoked.

This function takes three arguments:

#. *arg1* is the instance of the data type which is the subject of the call. If
   the call is ``obj1('hello')``, then *arg1* is ``obj1``.

#. *arg2* is a tuple containing the arguments to the call. You can use
   :c:func:`PyArg_ParseTuple` to extract the arguments.

#. *arg3* is a dictionary of keyword arguments that were passed. If this is
   non-*NULL* and you support keyword arguments, use
   :c:func:`PyArg_ParseTupleAndKeywords` to extract the arguments. If you do not
   want to support keyword arguments and this is non-*NULL*, raise a
   :exc:`TypeError` with a message saying that keyword arguments are not supported.

Here is a toy implementation of the call function. ::

   /* Implement the call function.
    *    obj is the instance receiving the call.
    *    args is a tuple containing the arguments to the call, in this
    *         case 3 strings.
    */
   static PyObject *
   newdatatype_call(newdatatypeobject *obj, PyObject *args, PyObject *other)
   {
       PyObject *result;
       const char *arg1;
       const char *arg2;
       const char *arg3;

       if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
           return NULL;
       }
       result = PyUnicode_FromFormat(
           "Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
           obj->obj_UnderlyingDatatypePtr->size,
           arg1, arg2, arg3);
       return result;
   }

::

   /* Iterators */
   getiterfunc tp_iter;
   iternextfunc tp_iternext;

These functions provide support for the iterator protocol. Any object which
wishes to support iteration over its contents (which may be generated during
iteration) must implement the ``tp_iter`` handler. Objects which are returned
by a ``tp_iter`` handler must implement both the ``tp_iter`` and ``tp_iternext``
handlers. Both handlers take exactly one parameter, the instance for which they
are being called, and return a new reference. In the case of an error, they
should set an exception and return *NULL*.

For an object which represents an iterable collection, the ``tp_iter`` handler
must return an iterator object. The iterator object is responsible for
maintaining the state of the iteration. For collections which can support
multiple iterators which do not interfere with each other (as lists and tuples
do), a new iterator should be created and returned. Objects which can only be
iterated over once (usually due to side effects of iteration) should implement
this handler by returning a new reference to themselves, and should also
implement the ``tp_iternext`` handler. File objects are an example of such an
iterator.

Iterator objects should implement both handlers.
The ``tp_iter`` handler should
return a new reference to the iterator (this is the same as the ``tp_iter``
handler for objects which can only be iterated over destructively). The
``tp_iternext`` handler should return a new reference to the next object in the
iteration if there is one. If the iteration has reached the end, it may return
*NULL* without setting an exception, or it may set :exc:`StopIteration` in
addition to returning *NULL*; avoiding the exception can yield slightly better
performance. If an actual error occurs, it should set an exception and return
*NULL*.


.. _weakref-support:

Weak Reference Support
----------------------

One of the goals of Python's weak-reference implementation is to allow any type
to participate in the weak reference mechanism without incurring the overhead on
those objects which do not benefit from weak referencing (such as numbers).

For an object to be weakly referencable, the extension must include a
:c:type:`PyObject\*` field in the instance structure for the use of the weak
reference mechanism; it must be initialized to *NULL* by the object's
constructor. It must also set the :c:member:`~PyTypeObject.tp_weaklistoffset`
field of the corresponding type object to the offset of the field. For example,
the instance type is defined with the following structure::

   typedef struct {
       PyObject_HEAD
       PyClassObject *in_class;       /* The class object */
       PyObject      *in_dict;        /* A dictionary */
       PyObject      *in_weakreflist; /* List of weak references */
   } PyInstanceObject;

The statically-declared type object for instances is defined this way::

   PyTypeObject PyInstance_Type = {
       PyVarObject_HEAD_INIT(&PyType_Type, 0)
       "module.instance",

       /* Lots of stuff omitted for brevity...
          */

       Py_TPFLAGS_DEFAULT,                         /* tp_flags */
       0,                                          /* tp_doc */
       0,                                          /* tp_traverse */
       0,                                          /* tp_clear */
       0,                                          /* tp_richcompare */
       offsetof(PyInstanceObject, in_weakreflist), /* tp_weaklistoffset */
   };

The type constructor is responsible for initializing the weak reference list to
*NULL*::

   static PyObject *
   instance_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       PyInstanceObject *self;

       self = (PyInstanceObject *) type->tp_alloc(type, 0);
       if (self == NULL)
           return NULL;

       /* Other initialization stuff omitted for brevity */

       self->in_weakreflist = NULL;

       return (PyObject *) self;
   }

The only further addition is that the destructor needs to call the weak
reference manager to clear any weak references. This is only required if the
weak reference list is non-*NULL*::

   static void
   instance_dealloc(PyInstanceObject *inst)
   {
       /* Allocate temporaries if needed, but do not begin
          destruction just yet.
        */

       if (inst->in_weakreflist != NULL)
           PyObject_ClearWeakRefs((PyObject *) inst);

       /* Proceed with object destruction normally. */
   }


More Suggestions
----------------

Remember that you can omit most of these functions, in which case you provide
``0`` as a value. There are type definitions for each of the functions you must
provide. They are in :file:`object.h` in the Python include directory that
comes with the source distribution of Python.

In order to learn how to implement any specific method for your new data type,
do the following: Download and unpack the Python source distribution. Go to
the :file:`Objects` directory, then search the C source files for ``tp_`` plus
the function you want (for example, ``tp_richcompare``). You will find examples
of the function you want to implement.

When you need to verify that an object is an instance of the type you are
implementing, use the :c:func:`PyObject_TypeCheck` function.
A sample of its use
might be something like the following::

   if (!PyObject_TypeCheck(some_object, &MyType)) {
       PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
       return NULL;
   }

.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler in this example, because our
   type doesn't support garbage collection. Even if a type supports garbage
   collection, there are calls that can be made to "untrack" the object from
   garbage collection; however, these calls are advanced and not covered here.

.. [#] We now know that the first and last members are strings, so perhaps we could be
   less careful about decrementing their reference counts; however, we accept
   instances of string subclasses. Even though deallocating normal strings won't
   call back into our objects, we can't guarantee that deallocating an instance of
   a string subclass won't call back into our objects.

.. [#] Even in the third version, we aren't guaranteed to avoid cycles. Instances of
   string subclasses are allowed and string subclasses could allow cycles even if
   normal strings don't.