======================
Descriptor HowTo Guide
======================

:Author: Raymond Hettinger
:Contact: <python at rcn dot com>

.. Contents::

Abstract
--------

Defines descriptors, summarizes the protocol, and shows how descriptors are
called. Examines a custom descriptor and several built-in Python descriptors
including functions, properties, static methods, and class methods. Shows how
each works by giving a pure Python equivalent and a sample application.

Learning about descriptors not only provides access to a larger toolset, it
creates a deeper understanding of how Python works and an appreciation for the
elegance of its design.


Definition and Introduction
---------------------------

In general, a descriptor is an object attribute with "binding behavior", one
whose attribute access has been overridden by methods in the descriptor
protocol. Those methods are :meth:`__get__`, :meth:`__set__`, and
:meth:`__delete__`. If any of those methods are defined for an object, it is
said to be a descriptor.

The default behavior for attribute access is to get, set, or delete the
attribute from an object's dictionary. For instance, ``a.x`` has a lookup chain
starting with ``a.__dict__['x']``, then ``type(a).__dict__['x']``, and
continuing through the base classes of ``type(a)`` excluding metaclasses. If the
looked-up value is an object defining one of the descriptor methods, then Python
may override the default behavior and invoke the descriptor method instead.
Where this occurs in the precedence chain depends on which descriptor methods
were defined. Note that descriptors are only invoked for new style objects or
classes (a class is new style if it inherits from :class:`object` or
:class:`type`).

Descriptors are a powerful, general purpose protocol. They are the mechanism
behind properties, methods, static methods, class methods, and :func:`super()`.
They are used throughout Python itself to implement the new style classes
introduced in version 2.2. Descriptors simplify the underlying C code and offer
a flexible set of new tools for everyday Python programs.


Descriptor Protocol
-------------------

``descr.__get__(self, obj, type=None) --> value``

``descr.__set__(self, obj, value) --> None``

``descr.__delete__(self, obj) --> None``

That is all there is to it. Define any of these methods and an object is
considered a descriptor and can override default behavior upon being looked up
as an attribute.

If an object defines both :meth:`__get__` and :meth:`__set__`, it is considered
a data descriptor. Descriptors that only define :meth:`__get__` are called
non-data descriptors (they are typically used for methods but other uses are
possible).

Data and non-data descriptors differ in how overrides are calculated with
respect to entries in an instance's dictionary. If an instance's dictionary
has an entry with the same name as a data descriptor, the data descriptor
takes precedence. If an instance's dictionary has an entry with the same
name as a non-data descriptor, the dictionary entry takes precedence.

To make a read-only data descriptor, define both :meth:`__get__` and
:meth:`__set__` with the :meth:`__set__` raising an :exc:`AttributeError` when
called. Defining the :meth:`__set__` method with an exception-raising
placeholder is enough to make it a data descriptor.


Invoking Descriptors
--------------------

A descriptor can be called directly by its method name. For example,
``d.__get__(obj)``.

Alternatively, it is more common for a descriptor to be invoked automatically
upon attribute access. For example, ``obj.d`` looks up ``d`` in the dictionary
of ``obj``. If ``d`` defines the method :meth:`__get__`, then ``d.__get__(obj)``
is invoked according to the precedence rules listed below.
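
The difference between data and non-data descriptors described in the protocol
section can be sketched with a small experiment. This is only an illustration:
the names ``NonData``, ``Data``, and ``Demo`` are hypothetical and do not come
from the guide. Both descriptors are shadowed by entries written directly into
the instance dictionary, and only the non-data descriptor gives way. The
``Data`` class also shows the read-only pattern, with a :meth:`__set__` that
always raises :exc:`AttributeError`.

```python
class NonData(object):
    """Hypothetical non-data descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Data(object):
    """Hypothetical read-only data descriptor: defines __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        # An exception-raising placeholder is enough to make this
        # a data descriptor, and it also makes the attribute read-only.
        raise AttributeError('read-only attribute')


class Demo(object):
    nd = NonData()
    dd = Data()


demo = Demo()

# Write straight into the instance dictionary, bypassing __set__.
demo.__dict__['nd'] = 'from instance dict'
demo.__dict__['dd'] = 'from instance dict'

nd_value = demo.nd   # the instance dictionary entry takes precedence
dd_value = demo.dd   # the data descriptor takes precedence

# Normal assignment goes through Data.__set__, which refuses it.
try:
    demo.dd = 1
    set_blocked = False
except AttributeError:
    set_blocked = True
```

Under these assumptions, ``nd_value`` comes from the instance dictionary while
``dd_value`` still comes from the descriptor, and the assignment is rejected.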

The details of invocation depend on whether ``obj`` is an object or a class.
Either way, descriptors only work for new style objects and classes. A class is
new style if it is a subclass of :class:`object`.

For objects, the machinery is in :meth:`object.__getattribute__` which
transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``. The
implementation works through a precedence chain that gives data descriptors
priority over instance variables, instance variables priority over non-data
descriptors, and assigns lowest priority to :meth:`__getattr__` if provided.
The full C implementation can be found in :c:func:`PyObject_GenericGetAttr()` in
:source:`Objects/object.c`.

For classes, the machinery is in :meth:`type.__getattribute__` which transforms
``B.x`` into ``B.__dict__['x'].__get__(None, B)``. In pure Python, it looks
like::

    def __getattribute__(self, key):
        "Emulate type_getattro() in Objects/typeobject.c"
        v = object.__getattribute__(self, key)
        if hasattr(v, '__get__'):
            return v.__get__(None, self)
        return v

The important points to remember are:

* descriptors are invoked by the :meth:`__getattribute__` method
* overriding :meth:`__getattribute__` prevents automatic descriptor calls
* :meth:`__getattribute__` is only available with new style classes and objects
* :meth:`object.__getattribute__` and :meth:`type.__getattribute__` make
  different calls to :meth:`__get__`.
* data descriptors always override instance dictionaries.
* non-data descriptors may be overridden by instance dictionaries.

The object returned by ``super()`` also has a custom :meth:`__getattribute__`
method for invoking descriptors. The call ``super(B, obj).m()`` searches
``obj.__class__.__mro__`` for the base class ``A`` immediately following ``B``
and then returns ``A.__dict__['m'].__get__(obj, B)``.
If not a descriptor,
``m`` is returned unchanged. If not in the dictionary, ``m`` reverts to a
search using :meth:`object.__getattribute__`.

Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if
``m`` was a data descriptor. In Python 2.3, non-data descriptors also get
invoked unless an old-style class is involved. The implementation details are
in :c:func:`super_getattro()` in :source:`Objects/typeobject.c`.

.. _`Guido's Tutorial`: https://www.python.org/download/releases/2.2.3/descrintro/#cooperation

The details above show that the mechanism for descriptors is embedded in the
:meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and
:func:`super`. Classes inherit this machinery when they derive from
:class:`object` or if they have a meta-class providing similar functionality.
Likewise, classes can turn off descriptor invocation by overriding
:meth:`__getattribute__()`.


Descriptor Example
------------------

The following code creates a class whose objects are data descriptors which
print a message for each get or set. Overriding :meth:`__getattribute__` is
an alternate approach that could do this for every attribute. However, this
descriptor is useful for monitoring just a few chosen attributes::

    class RevealAccess(object):
        """A data descriptor that sets and returns values
           normally and prints a message logging their access.
        """

        def __init__(self, initval=None, name='var'):
            self.val = initval
            self.name = name

        def __get__(self, obj, objtype):
            print 'Retrieving', self.name
            return self.val

        def __set__(self, obj, val):
            print 'Updating', self.name
            self.val = val

    >>> class MyClass(object):
    ...     x = RevealAccess(10, 'var "x"')
    ...     y = 5
    ...
    >>> m = MyClass()
    >>> m.x
    Retrieving var "x"
    10
    >>> m.x = 20
    Updating var "x"
    >>> m.x
    Retrieving var "x"
    20
    >>> m.y
    5

The protocol is simple and offers exciting possibilities. Several use cases are
so common that they have been packaged into individual function calls.
Properties, bound and unbound methods, static methods, and class methods are all
based on the descriptor protocol.


Properties
----------

Calling :func:`property` is a succinct way of building a data descriptor that
triggers function calls upon access to an attribute. Its signature is::

    property(fget=None, fset=None, fdel=None, doc=None) -> property attribute

The documentation shows a typical use to define a managed attribute ``x``::

    class C(object):
        def getx(self): return self.__x
        def setx(self, value): self.__x = value
        def delx(self): del self.__x
        x = property(getx, setx, delx, "I'm the 'x' property.")

To see how :func:`property` is implemented in terms of the descriptor protocol,
here is a pure Python equivalent::

    class Property(object):
        "Emulate PyProperty_Type() in Objects/descrobject.c"

        def __init__(self, fget=None, fset=None, fdel=None, doc=None):
            self.fget = fget
            self.fset = fset
            self.fdel = fdel
            if doc is None and fget is not None:
                doc = fget.__doc__
            self.__doc__ = doc

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            if self.fget is None:
                raise AttributeError("unreadable attribute")
            return self.fget(obj)

        def __set__(self, obj, value):
            if self.fset is None:
                raise AttributeError("can't set attribute")
            self.fset(obj, value)

        def __delete__(self, obj):
            if self.fdel is None:
                raise AttributeError("can't delete attribute")
            self.fdel(obj)

        def getter(self, fget):
            return type(self)(fget, self.fset, self.fdel, self.__doc__)

        def setter(self, fset):
            return type(self)(self.fget, fset, self.fdel, self.__doc__)

        def deleter(self, fdel):
            return type(self)(self.fget, self.fset, fdel, self.__doc__)

The :func:`property` builtin helps whenever a user interface has granted
attribute access and then subsequent changes require the intervention of a
method.

For instance, a spreadsheet class may grant access to a cell value through
``Cell('b10').value``. Subsequent improvements to the program require the cell
to be recalculated on every access; however, the programmer does not want to
affect existing client code accessing the attribute directly. The solution is
to wrap access to the value attribute in a property data descriptor::

    class Cell(object):
        . . .
        def getvalue(self):
            "Recalculate the cell before returning value"
            self.recalc()
            return self._value
        value = property(getvalue)


Functions and Methods
---------------------

Python's object oriented features are built upon a function based environment.
Using non-data descriptors, the two are merged seamlessly.

Class dictionaries store methods as functions. In a class definition, methods
are written using :keyword:`def` and :keyword:`lambda`, the usual tools for
creating functions. The only difference from regular functions is that the
first argument is reserved for the object instance. By Python convention, the
instance reference is called *self* but may be called *this* or any other
variable name.

To support method calls, functions include the :meth:`__get__` method for
binding methods during attribute access. This means that all functions are
non-data descriptors which return bound or unbound methods depending on whether
they are invoked from an object or a class. In pure Python, it works like
this::

    class Function(object):
        . . .
        def __get__(self, obj, objtype=None):
            "Simulate func_descr_get() in Objects/funcobject.c"
            return types.MethodType(self, obj, objtype)

Running the interpreter shows how the function descriptor works in practice::

    >>> class D(object):
    ...     def f(self, x):
    ...         return x
    ...
    >>> d = D()
    >>> D.__dict__['f']  # Stored internally as a function
    <function f at 0x00C45070>
    >>> D.f              # Get from a class becomes an unbound method
    <unbound method D.f>
    >>> d.f              # Get from an instance becomes a bound method
    <bound method D.f of <__main__.D object at 0x00B18C90>>

The output suggests that bound and unbound methods are two different types.
While they could have been implemented that way, the actual C implementation of
:c:type:`PyMethod_Type` in :source:`Objects/classobject.c` is a single object
with two different representations depending on whether the :attr:`im_self`
field is set or is *NULL* (the C equivalent of ``None``).

Likewise, the effects of calling a method object depend on the :attr:`im_self`
field. If set (meaning bound), the original function (stored in the
:attr:`im_func` field) is called as expected with the first argument set to the
instance. If unbound, all of the arguments are passed unchanged to the original
function. The actual C implementation of :c:func:`instancemethod_call()` is only
slightly more complex in that it includes some type checking.


Static Methods and Class Methods
--------------------------------

Non-data descriptors provide a simple mechanism for variations on the usual
patterns of binding functions into methods.

To recap, functions have a :meth:`__get__` method so that they can be converted
to a method when accessed as attributes. The non-data descriptor transforms an
``obj.f(*args)`` call into ``f(obj, *args)``. Calling ``klass.f(*args)``
becomes ``f(*args)``.
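
The recap above can be checked by invoking the function's :meth:`__get__` by
hand. The following is a minimal sketch rather than a description of the C
implementation. The class ``D`` and the doubling method are illustrative
assumptions, and the code sticks to calls that behave the same on an instance
in Python 2 and Python 3.

```python
class D(object):
    def f(self, x):
        return x * 2


d = D()

# The class dictionary holds a plain function, not a method.
raw = D.__dict__['f']

# Attribute lookup on an instance calls the function's __get__,
# which returns a bound method that remembers the instance.
bound = raw.__get__(d, D)

via_attribute = d.f(21)       # the usual spelling: obj.f(*args)
via_descriptor = bound(21)    # explicit __get__, then the call
via_function = raw(d, 21)     # the equivalent plain call: f(obj, *args)
```

All three calls reach the same underlying function with ``d`` as the first
argument, so they return the same value.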

This chart summarizes the binding and its two most useful variants:

+-----------------+----------------------+------------------+
| Transformation  | Called from an       | Called from a    |
|                 | Object               | Class            |
+=================+======================+==================+
| function        | f(obj, \*args)       | f(\*args)        |
+-----------------+----------------------+------------------+
| staticmethod    | f(\*args)            | f(\*args)        |
+-----------------+----------------------+------------------+
| classmethod     | f(type(obj), \*args) | f(klass, \*args) |
+-----------------+----------------------+------------------+

Static methods return the underlying function without changes. Calling either
``c.f`` or ``C.f`` is the equivalent of a direct lookup into
``object.__getattribute__(c, "f")`` or ``object.__getattribute__(C, "f")``. As a
result, the function becomes identically accessible from either an object or a
class.

Good candidates for static methods are methods that do not reference the
``self`` variable.

For instance, a statistics package may include a container class for
experimental data. The class provides normal methods for computing the average,
mean, median, and other descriptive statistics that depend on the data. However,
there may be useful functions which are conceptually related but do not depend
on the data. For instance, ``erf(x)`` is a handy conversion routine that comes up
in statistical work but does not directly depend on a particular dataset.
It can be called either from an object or the class: ``s.erf(1.5) --> .9661`` or
``Sample.erf(1.5) --> .9661``.

Since staticmethods return the underlying function with no changes, the example
calls are unexciting::

    >>> class E(object):
    ...     def f(x):
    ...         print x
    ...     f = staticmethod(f)
    ...
    >>> E.f(3)
    3
    >>> E().f(3)
    3

Using the non-data descriptor protocol, a pure Python version of
:func:`staticmethod` would look like this::

    class StaticMethod(object):
        "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, objtype=None):
            return self.f

Unlike static methods, class methods prepend the class reference to the
argument list before calling the function. This format is the same
whether the caller is an object or a class::

    >>> class E(object):
    ...     def f(klass, x):
    ...         return klass.__name__, x
    ...     f = classmethod(f)
    ...
    >>> print E.f(3)
    ('E', 3)
    >>> print E().f(3)
    ('E', 3)


This behavior is useful whenever the function only needs to have a class
reference and does not care about any underlying data. One use for classmethods
is to create alternate class constructors. In Python 2.3, the classmethod
:func:`dict.fromkeys` creates a new dictionary from a list of keys. The pure
Python equivalent is::

    class Dict(object):
        . . .
        def fromkeys(klass, iterable, value=None):
            "Emulate dict_fromkeys() in Objects/dictobject.c"
            d = klass()
            for key in iterable:
                d[key] = value
            return d
        fromkeys = classmethod(fromkeys)

Now a new dictionary of unique keys can be constructed like this::

    >>> Dict.fromkeys('abracadabra')
    {'a': None, 'r': None, 'b': None, 'c': None, 'd': None}

Using the non-data descriptor protocol, a pure Python version of
:func:`classmethod` would look like this::

    class ClassMethod(object):
        "Emulate PyClassMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, klass=None):
            if klass is None:
                klass = type(obj)
            def newfunc(*args):
                return self.f(klass, *args)
            return newfunc

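
To see the pure Python ``ClassMethod`` at work, it can be dropped in where
:func:`classmethod` would normally go. The sketch below repeats the class so
that it runs on its own. ``Point``, ``Point3``, and ``from_pair`` are
hypothetical names invented for this illustration, and the example assumes the
emulation is used only with positional arguments, as in the listing.

```python
class ClassMethod(object):
    "Pure Python stand-in for classmethod, repeated from the listing above"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)
        def newfunc(*args):
            # Prepend the class, whether accessed via a class or an instance.
            return self.f(klass, *args)
        return newfunc


class Point(object):
    "Hypothetical class with an alternate constructor"

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def from_pair(klass, pair):
        "Build an instance of the calling class from a 2-tuple"
        return klass(*pair)
    from_pair = ClassMethod(from_pair)


class Point3(Point):
    "Hypothetical subclass; inherits the alternate constructor"


p = Point.from_pair((1, 2))     # called from a class
q = p.from_pair((3, 4))         # called from an instance
r = Point3.from_pair((5, 6))    # klass is the subclass, not Point
```

Because the descriptor passes along whichever class the lookup started from,
the inherited constructor builds a ``Point3`` without being overridden.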