======================
Descriptor HowTo Guide
======================

:Author: Raymond Hettinger
:Contact: <python at rcn dot com>

.. Contents::

Abstract
--------

Defines descriptors, summarizes the protocol, and shows how descriptors are
called.  Examines a custom descriptor and several built-in Python descriptors
including functions, properties, static methods, and class methods.  Shows how
each works by giving a pure Python equivalent and a sample application.

Learning about descriptors not only provides access to a larger toolset, it
creates a deeper understanding of how Python works and an appreciation for the
elegance of its design.


Definition and Introduction
---------------------------

In general, a descriptor is an object attribute with "binding behavior", one
whose attribute access has been overridden by methods in the descriptor
protocol.  Those methods are :meth:`__get__`, :meth:`__set__`, and
:meth:`__delete__`.  If any of those methods are defined for an object, it is
said to be a descriptor.

The default behavior for attribute access is to get, set, or delete the
attribute from an object's dictionary.  For instance, ``a.x`` has a lookup chain
starting with ``a.__dict__['x']``, then ``type(a).__dict__['x']``, and
continuing through the base classes of ``type(a)`` excluding metaclasses. If the
looked-up value is an object defining one of the descriptor methods, then Python
may override the default behavior and invoke the descriptor method instead.
Where this occurs in the precedence chain depends on which descriptor methods
were defined.  Note that descriptors are only invoked for new style objects or
classes (a class is new style if it inherits from :class:`object` or
:class:`type`).

Descriptors are a powerful, general purpose protocol.  They are the mechanism
behind properties, methods, static methods, class methods, and :func:`super()`.
They are used throughout Python itself to implement the new style classes
introduced in version 2.2.  Descriptors simplify the underlying C-code and offer
a flexible set of new tools for everyday Python programs.


Descriptor Protocol
-------------------

``descr.__get__(self, obj, type=None) --> value``

``descr.__set__(self, obj, value) --> None``

``descr.__delete__(self, obj) --> None``

That is all there is to it.  Define any of these methods and an object is
considered a descriptor and can override default behavior upon being looked up
as an attribute.

If an object defines both :meth:`__get__` and :meth:`__set__`, it is considered
a data descriptor.  Descriptors that only define :meth:`__get__` are called
non-data descriptors (they are typically used for methods but other uses are
possible).

Data and non-data descriptors differ in how overrides are calculated with
respect to entries in an instance's dictionary.  If an instance's dictionary
has an entry with the same name as a data descriptor, the data descriptor
takes precedence.  If an instance's dictionary has an entry with the same
name as a non-data descriptor, the dictionary entry takes precedence.

To make a read-only data descriptor, define both :meth:`__get__` and
:meth:`__set__` with the :meth:`__set__` raising an :exc:`AttributeError` when
called.  Defining the :meth:`__set__` method with an exception raising
placeholder is enough to make it a data descriptor.
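
The precedence rules above can be checked with a small experiment.  The
following is only a sketch: the names ``DataDesc``, ``NonDataDesc``, and
``Demo`` are hypothetical and not part of any library, and the instance
dictionary is written to directly so that :meth:`__set__` is bypassed.

```python
class DataDesc(object):
    "Hypothetical read-only data descriptor: defines __get__ and __set__."

    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        # The exception-raising placeholder is what makes this a data descriptor.
        raise AttributeError("read-only attribute")


class NonDataDesc(object):
    "Hypothetical non-data descriptor: defines only __get__."

    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Demo(object):
    d = DataDesc()
    n = NonDataDesc()


demo = Demo()
# Store same-named entries straight into the instance dictionary.
demo.__dict__['d'] = 'from instance dict'
demo.__dict__['n'] = 'from instance dict'

data_result = demo.d       # the data descriptor takes precedence
nondata_result = demo.n    # the instance dictionary entry takes precedence
```

Normal assignment such as ``demo.d = 1`` goes through :meth:`__set__` and
therefore raises :exc:`AttributeError` in this sketch.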


Invoking Descriptors
--------------------

A descriptor can be called directly by its method name.  For example,
``d.__get__(obj)``.

Alternatively, it is more common for a descriptor to be invoked automatically
upon attribute access.  For example, ``obj.d`` looks up ``d`` in the dictionary
of ``obj``.  If ``d`` defines the method :meth:`__get__`, then ``d.__get__(obj)``
is invoked according to the precedence rules listed below.

The details of invocation depend on whether ``obj`` is an object or a class.
Either way, descriptors only work for new style objects and classes.  A class is
new style if it is a subclass of :class:`object`.

For objects, the machinery is in :meth:`object.__getattribute__` which
transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``.  The
implementation works through a precedence chain that gives data descriptors
priority over instance variables, instance variables priority over non-data
descriptors, and assigns lowest priority to :meth:`__getattr__` if provided.
The full C implementation can be found in :c:func:`PyObject_GenericGetAttr()` in
:source:`Objects/object.c`.
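
The precedence chain for objects can be approximated in pure Python.  This is
a simplified sketch rather than a transcription of the C code: the helper
names ``find_in_mro`` and ``generic_getattr`` and the ``Sample`` class are
hypothetical, and a data descriptor is detected by the presence of
:meth:`__set__`, following the definition used in this guide.

```python
_MISSING = object()   # sentinel meaning "not found in any class dictionary"


def find_in_mro(cls, name):
    "Return the first class-dictionary entry for name along the MRO."
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def generic_getattr(obj, name):
    "Rough sketch of the lookup order for instances (an approximation)."
    objtype = type(obj)
    cls_var = find_in_mro(objtype, name)
    descr_get = getattr(type(cls_var), '__get__', None)
    if descr_get is not None and hasattr(type(cls_var), '__set__'):
        return descr_get(cls_var, obj, objtype)    # 1. data descriptor
    instance_dict = getattr(obj, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]                 # 2. instance variable
    if descr_get is not None:
        return descr_get(cls_var, obj, objtype)    # 3. non-data descriptor
    if cls_var is not _MISSING:
        return cls_var                             # 4. plain class attribute
    if hasattr(objtype, '__getattr__'):
        return objtype.__getattr__(obj, name)      # 5. __getattr__ fallback
    raise AttributeError(name)


class Sample(object):
    kind = 'class attribute'

    def __init__(self):
        self.size = 3

    def doubled(self):
        return self.size * 2
    doubled = property(doubled)       # a data descriptor

    def method(self):                 # functions are non-data descriptors
        return 'called'

    def __getattr__(self, name):      # consulted only when all else fails
        return 'fallback for ' + name


s = Sample()
```

With these definitions, ``generic_getattr(s, 'doubled')`` goes through the
property, ``generic_getattr(s, 'size')`` comes from the instance dictionary,
and ``generic_getattr(s, 'missing')`` falls through to :meth:`__getattr__`.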

For classes, the machinery is in :meth:`type.__getattribute__` which transforms
``B.x`` into ``B.__dict__['x'].__get__(None, B)``.  In pure Python, it looks
like::

    def __getattribute__(self, key):
        "Emulate type_getattro() in Objects/typeobject.c"
        v = object.__getattribute__(self, key)
        if hasattr(v, '__get__'):
            return v.__get__(None, self)
        return v

The important points to remember are:

* descriptors are invoked by the :meth:`__getattribute__` method
* overriding :meth:`__getattribute__` prevents automatic descriptor calls
* :meth:`__getattribute__` is only available with new style classes and objects
* :meth:`object.__getattribute__` and :meth:`type.__getattribute__` make
  different calls to :meth:`__get__`
* data descriptors always override instance dictionaries
* non-data descriptors may be overridden by instance dictionaries
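
Two of these points can be observed directly.  In the sketch below, the
hypothetical descriptor ``ShowArgs`` simply returns the arguments that
:meth:`__get__` receives, and the hypothetical class ``Raw`` overrides
:meth:`__getattribute__` so that no descriptor call is made.

```python
class ShowArgs(object):
    "Hypothetical descriptor that returns the arguments passed to __get__."

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class B(object):
    x = ShowArgs()


b = B()
from_instance = b.x    # object.__getattribute__ calls __get__(b, B)
from_class = B.x       # type.__getattribute__ calls __get__(None, B)


class Raw(object):
    x = ShowArgs()

    def __getattribute__(self, name):
        # Return the raw class dictionary entry without invoking __get__.
        return type(self).__dict__[name]


raw_value = Raw().x    # the descriptor object itself, not the result of __get__
```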

The object returned by ``super()`` also has a custom :meth:`__getattribute__`
method for invoking descriptors.  The call ``super(B, obj).m()`` searches
``obj.__class__.__mro__`` for the base class ``A`` immediately following ``B``
and then returns ``A.__dict__['m'].__get__(obj, B)``.  If not a descriptor,
``m`` is returned unchanged.  If not in the dictionary, ``m`` reverts to a
search using :meth:`object.__getattribute__`.
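
As an illustration, the lookup performed by ``super()`` can be compared with
the manual descriptor call described above.  The classes ``A`` and ``B`` here
are minimal stand-ins chosen for the example.

```python
class A(object):
    def m(self):
        return 'A.m'


class B(A):
    def m(self):
        return 'B.m'


obj = B()
via_super = super(B, obj).m                   # skips B and finds m in A
by_hand = A.__dict__['m'].__get__(obj, B)     # the equivalent manual call
```

Both results are methods bound to ``obj`` that wrap the function stored in
``A.__dict__``, so calling either one returns ``'A.m'``.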

Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if
``m`` was a data descriptor.  In Python 2.3, non-data descriptors also get
invoked unless an old-style class is involved.  The implementation details are
in :c:func:`super_getattro()` in :source:`Objects/typeobject.c`.

.. _`Guido's Tutorial`: https://www.python.org/download/releases/2.2.3/descrintro/#cooperation

The details above show that the mechanism for descriptors is embedded in the
:meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and
:func:`super`.  Classes inherit this machinery when they derive from
:class:`object` or if they have a metaclass providing similar functionality.
Likewise, classes can turn off descriptor invocation by overriding
:meth:`__getattribute__()`.


Descriptor Example
------------------

The following code creates a class whose objects are data descriptors which
print a message for each get or set.  Overriding :meth:`__getattribute__` is
an alternate approach that could do this for every attribute.  However, this
descriptor is useful for monitoring just a few chosen attributes::

    class RevealAccess(object):
        """A data descriptor that sets and returns values
           normally and prints a message logging their access.
        """

        def __init__(self, initval=None, name='var'):
            self.val = initval
            self.name = name

        def __get__(self, obj, objtype):
            print 'Retrieving', self.name
            return self.val

        def __set__(self, obj, val):
            print 'Updating', self.name
            self.val = val

    >>> class MyClass(object):
    ...     x = RevealAccess(10, 'var "x"')
    ...     y = 5
    ...
    >>> m = MyClass()
    >>> m.x
    Retrieving var "x"
    10
    >>> m.x = 20
    Updating var "x"
    >>> m.x
    Retrieving var "x"
    20
    >>> m.y
    5

The protocol is simple and offers exciting possibilities.  Several use cases are
so common that they have been packaged into individual function calls.
Properties, bound and unbound methods, static methods, and class methods are all
based on the descriptor protocol.


Properties
----------

Calling :func:`property` is a succinct way of building a data descriptor that
triggers function calls upon access to an attribute.  Its signature is::

    property(fget=None, fset=None, fdel=None, doc=None) -> property attribute

The documentation shows a typical use to define a managed attribute ``x``::

    class C(object):
        def getx(self): return self.__x
        def setx(self, value): self.__x = value
        def delx(self): del self.__x
        x = property(getx, setx, delx, "I'm the 'x' property.")

To see how :func:`property` is implemented in terms of the descriptor protocol,
here is a pure Python equivalent::

    class Property(object):
        "Emulate PyProperty_Type() in Objects/descrobject.c"

        def __init__(self, fget=None, fset=None, fdel=None, doc=None):
            self.fget = fget
            self.fset = fset
            self.fdel = fdel
            if doc is None and fget is not None:
                doc = fget.__doc__
            self.__doc__ = doc

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            if self.fget is None:
                raise AttributeError("unreadable attribute")
            return self.fget(obj)

        def __set__(self, obj, value):
            if self.fset is None:
                raise AttributeError("can't set attribute")
            self.fset(obj, value)

        def __delete__(self, obj):
            if self.fdel is None:
                raise AttributeError("can't delete attribute")
            self.fdel(obj)

        def getter(self, fget):
            return type(self)(fget, self.fset, self.fdel, self.__doc__)

        def setter(self, fset):
            return type(self)(self.fget, fset, self.fdel, self.__doc__)

        def deleter(self, fdel):
            return type(self)(self.fget, self.fset, fdel, self.__doc__)

The :func:`property` builtin helps whenever a user interface has granted
attribute access and then subsequent changes require the intervention of a
method.

For instance, a spreadsheet class may grant access to a cell value through
``Cell('b10').value``. Subsequent improvements to the program require the cell
to be recalculated on every access; however, the programmer does not want to
affect existing client code accessing the attribute directly.  The solution is
to wrap access to the value attribute in a property data descriptor::

    class Cell(object):
        . . .
        def getvalue(self):
            "Recalculate the cell before returning value"
            self.recalc()
            return self._value
        value = property(getvalue)
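
A runnable variant of this idea might look like the sketch below.  The
``formula`` argument, the ``recalc_count`` counter, and the ``inputs``
dictionary are assumptions made for the illustration; the elided parts of the
original ``Cell`` class are not known.

```python
class Cell(object):
    "Hypothetical spreadsheet cell whose value is recomputed on every access."

    def __init__(self, formula):
        self.formula = formula       # assumed: a callable taking no arguments
        self._value = None
        self.recalc_count = 0        # bookkeeping for the demonstration only

    def recalc(self):
        self._value = self.formula()
        self.recalc_count += 1

    def getvalue(self):
        "Recalculate the cell before returning value"
        self.recalc()
        return self._value
    value = property(getvalue)


inputs = {'a1': 2, 'a2': 3}
cell = Cell(lambda: inputs['a1'] + inputs['a2'])

first = cell.value       # plain attribute syntax triggers a recalculation
inputs['a1'] = 10
second = cell.value      # the changed input is picked up on the next access
```

Client code keeps using ``cell.value`` as an ordinary attribute.  Because no
setter was supplied, assigning to ``cell.value`` raises
:exc:`AttributeError`.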


Functions and Methods
---------------------

Python's object oriented features are built upon a function based environment.
Using non-data descriptors, the two are merged seamlessly.

Class dictionaries store methods as functions.  In a class definition, methods
are written using :keyword:`def` and :keyword:`lambda`, the usual tools for
creating functions.  The only difference from regular functions is that the
first argument is reserved for the object instance.  By Python convention, the
instance reference is called *self* but may be called *this* or any other
variable name.

To support method calls, functions include the :meth:`__get__` method for
binding methods during attribute access.  This means that all functions are
non-data descriptors which return bound or unbound methods depending on whether
they are invoked from an object or a class.  In pure Python, it works like
this::

    import types

    class Function(object):
        . . .
        def __get__(self, obj, objtype=None):
            "Simulate func_descr_get() in Objects/funcobject.c"
            return types.MethodType(self, obj, objtype)
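
The binding step can also be carried out by hand.  The short sketch below
fetches the plain function from the class dictionary and calls its
:meth:`__get__` method explicitly; only the instance case is shown.

```python
class D(object):
    def f(self, x):
        return x


d = D()
func = D.__dict__['f']          # the plain function stored in the class dict
bound = func.__get__(d, D)      # the binding step that d.f performs implicitly

result = bound(5)               # the instance is supplied as the first argument
```

The resulting method object keeps a reference to the instance in
``bound.__self__`` and to the original function in ``bound.__func__``.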

Running the interpreter shows how the function descriptor works in practice::

    >>> class D(object):
    ...     def f(self, x):
    ...         return x
    ...
    >>> d = D()
    >>> D.__dict__['f']  # Stored internally as a function
    <function f at 0x00C45070>
    >>> D.f              # Get from a class becomes an unbound method
    <unbound method D.f>
    >>> d.f              # Get from an instance becomes a bound method
    <bound method D.f of <__main__.D object at 0x00B18C90>>

The output suggests that bound and unbound methods are two different types.
While they could have been implemented that way, the actual C implementation of
:c:type:`PyMethod_Type` in :source:`Objects/classobject.c` is a single object
with two different representations depending on whether the :attr:`im_self`
field is set or is *NULL* (the C equivalent of ``None``).

Likewise, the effects of calling a method object depend on the :attr:`im_self`
field. If set (meaning bound), the original function (stored in the
:attr:`im_func` field) is called as expected with the first argument set to the
instance.  If unbound, all of the arguments are passed unchanged to the original
function. The actual C implementation of :c:func:`instancemethod_call()` is only
slightly more complex in that it includes some type checking.


Static Methods and Class Methods
--------------------------------

Non-data descriptors provide a simple mechanism for variations on the usual
patterns of binding functions into methods.

To recap, functions have a :meth:`__get__` method so that they can be converted
to a method when accessed as attributes.  The non-data descriptor transforms an
``obj.f(*args)`` call into ``f(obj, *args)``.  Calling ``klass.f(*args)``
becomes ``f(*args)``.

This chart summarizes the binding and its two most useful variants:

      +-----------------+----------------------+------------------+
      | Transformation  | Called from an       | Called from a    |
      |                 | Object               | Class            |
      +=================+======================+==================+
      | function        | f(obj, \*args)       | f(\*args)        |
      +-----------------+----------------------+------------------+
      | staticmethod    | f(\*args)            | f(\*args)        |
      +-----------------+----------------------+------------------+
      | classmethod     | f(type(obj), \*args) | f(klass, \*args) |
      +-----------------+----------------------+------------------+
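
The rows of the chart can be exercised with a small class.  The class ``T``
and its method names are hypothetical.  For the plain function called from
the class, the instance is passed explicitly as the first argument.

```python
class T(object):
    def plain(self, x):
        return (self, x)

    def static(x):
        return x
    static = staticmethod(static)

    def cls(klass, x):
        return (klass, x)
    cls = classmethod(cls)


t = T()
results = [
    t.plain(1),       # function from an object: f(obj, *args)
    T.plain(t, 1),    # function from the class, instance given explicitly
    t.static(1),      # staticmethod from an object: f(*args)
    T.static(1),      # staticmethod from the class: f(*args)
    t.cls(1),         # classmethod from an object: f(type(obj), *args)
    T.cls(1),         # classmethod from the class: f(klass, *args)
]
```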

Static methods return the underlying function without changes.  Calling either
``c.f`` or ``C.f`` is the equivalent of a direct lookup into
``object.__getattribute__(c, "f")`` or ``object.__getattribute__(C, "f")``. As a
result, the function becomes identically accessible from either an object or a
class.

Good candidates for static methods are methods that do not reference the
``self`` variable.

For instance, a statistics package may include a container class for
experimental data.  The class provides normal methods for computing the average,
mean, median, and other descriptive statistics that depend on the data. However,
there may be useful functions which are conceptually related but do not depend
on the data.  For instance, ``erf(x)`` is a handy conversion routine that comes up
in statistical work but does not directly depend on a particular dataset.
It can be called either from an object or the class:  ``s.erf(1.5) --> .9661`` or
``Sample.erf(1.5) --> .9661``.

Since staticmethods return the underlying function with no changes, the example
calls are unexciting::

    >>> class E(object):
    ...     def f(x):
    ...         print x
    ...     f = staticmethod(f)
    ...
    >>> E.f(3)
    3
    >>> E().f(3)
    3

Using the non-data descriptor protocol, a pure Python version of
:func:`staticmethod` would look like this::

    class StaticMethod(object):
        "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, objtype=None):
            return self.f

Unlike static methods, class methods prepend the class reference to the
argument list before calling the function.  This format is the same
whether the caller is an object or a class::

    >>> class E(object):
    ...     def f(klass, x):
    ...          return klass.__name__, x
    ...     f = classmethod(f)
    ...
    >>> print E.f(3)
    ('E', 3)
    >>> print E().f(3)
    ('E', 3)

This behavior is useful whenever the function only needs to have a class
reference and does not care about any underlying data.  One use for classmethods
is to create alternate class constructors.  In Python 2.3, the classmethod
:func:`dict.fromkeys` creates a new dictionary from a list of keys.  The pure
Python equivalent is::

    class Dict(object):
        . . .
        def fromkeys(klass, iterable, value=None):
            "Emulate dict_fromkeys() in Objects/dictobject.c"
            d = klass()
            for key in iterable:
                d[key] = value
            return d
        fromkeys = classmethod(fromkeys)

Now a new dictionary of unique keys can be constructed like this::

    >>> Dict.fromkeys('abracadabra')
    {'a': None, 'r': None, 'b': None, 'c': None, 'd': None}
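
To try the alternate constructor, the elided class body has to be filled in
somehow.  The sketch below assumes, purely for illustration, that ``Dict``
subclasses the built-in :class:`dict`; the subclass ``Counts`` is likewise
hypothetical and shows that the class reference follows the caller.

```python
class Dict(dict):
    "Assumed stand-in for the elided Dict class: a plain dict subclass."

    def fromkeys(klass, iterable, value=None):
        d = klass()                  # builds an instance of the calling class
        for key in iterable:
            d[key] = value
        return d
    fromkeys = classmethod(fromkeys)


class Counts(Dict):
    "Hypothetical subclass that inherits the alternate constructor."


made = Counts.fromkeys('abracadabra', 0)
unique_keys = sorted(made)           # duplicate letters collapse into one key
```

Because ``klass`` is bound to ``Counts``, the result is a ``Counts`` instance
rather than a plain ``Dict``.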

Using the non-data descriptor protocol, a pure Python version of
:func:`classmethod` would look like this::

    class ClassMethod(object):
        "Emulate PyClassMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, klass=None):
            if klass is None:
                klass = type(obj)
            def newfunc(*args):
                return self.f(klass, *args)
            return newfunc
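
Both emulations can be dropped into a class in place of the builtins.  The
sketch below repeats the two classes so that it is self-contained; the test
classes ``E`` and ``F`` and their method names are hypothetical.

```python
class StaticMethod(object):
    "Pure Python emulation of staticmethod, as given earlier."

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        return self.f


class ClassMethod(object):
    "Pure Python emulation of classmethod, as given above."

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)
        def newfunc(*args):
            return self.f(klass, *args)
        return newfunc


class E(object):
    def s(x):
        return x
    s = StaticMethod(s)

    def c(klass, x):
        return klass.__name__, x
    c = ClassMethod(c)


class F(E):
    "Subclass used to show that the class reference follows the caller."


static_results = (E.s(3), E().s(3))
class_results = (E.c(3), E().c(3), F.c(3))
```

Called through the subclass, ``F.c(3)`` receives ``F`` as its class argument,
which matches the behavior of the built-in :func:`classmethod`.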
    439 
    440