<HTML>

<HEAD>
<TITLE>Metaclasses in Python 1.5</TITLE>
</HEAD>

<BODY BGCOLOR="FFFFFF">

<H1>Metaclasses in Python 1.5</H1>
<H2>(A.k.a. The Killer Joke :-)</H2>

<HR>

(<i>Postscript:</i> reading this essay is probably not the best way to
understand the metaclass hook described here. See a <A
HREF="meta-vladimir.txt">message posted by Vladimir Marangozov</A>
which may give a gentler introduction to the matter. You may also
want to search Deja News for messages with "metaclass" in the subject
posted to comp.lang.python in July and August 1998.)

<HR>

<P>In previous Python releases (and still in 1.5), there is something
called the ``Don Beaudry hook'', after its inventor and champion.
This allows C extensions to provide alternate class behavior, thereby
allowing the Python class syntax to be used to define other class-like
entities. Don Beaudry has used this in his infamous <A
HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> package; Jim
Fulton has used it in his <A
HREF="http://www.digicool.com/releases/ExtensionClass/">Extension
Classes</A> package. (It has also been referred to as the ``Don
Beaudry <i>hack</i>,'' but that's a misnomer. There's nothing hackish
about it -- in fact, it is rather elegant and deep, even though
there's something dark to it.)

<P>(On first reading, you may want to skip directly to the examples in
the section "Writing Metaclasses in Python" below, unless you want
your head to explode.)

<P>

<HR>

<P>Documentation of the Don Beaudry hook has purposefully been kept
minimal, since it is a feature of incredible power, and is easily
abused. Basically, it checks whether the <b>type of the base
class</b> is callable, and if so, it is called to create the new
class.

<P>Note the two indirection levels.
Take a simple example:

<PRE>
class B:
    pass

class C(B):
    pass
</PRE>

Take a look at the second class definition, and try to fathom ``the
type of the base class is callable.''

<P>(Types are not classes, by the way. See questions 4.2, 4.19 and in
particular 6.22 in the <A
HREF="http://www.python.org/cgi-bin/faqw.py" >Python FAQ</A>
for more on this topic.)

<P>

<UL>

<LI>The <b>base class</b> is B; this one's easy.<P>

<LI>Since B is a class, its type is ``class''; so the <b>type of the
base class</b> is the type ``class''. This is also known as
types.ClassType, assuming the standard module <code>types</code> has
been imported.<P>

<LI>Now is the type ``class'' <b>callable</b>? No, because types (in
core Python) are never callable. Classes are callable (calling a
class creates a new instance) but types aren't.<P>

</UL>

<P>So our conclusion is that in our example, the type of the base
class (of C) is not callable. So the Don Beaudry hook does not apply,
and the default class creation mechanism is used (which is also used
when there is no base class). In fact, the Don Beaudry hook never
applies when using only core Python, since the type of a core object
is never callable.

<P>So what do Don and Jim do in order to use Don's hook? Write an
extension that defines at least two new Python object types. The
first would be the type for ``class-like'' objects usable as a base
class, to trigger Don's hook. This type must be made callable.
That's why we need a second type. Whether an object is callable
depends on its type. So whether a type object is callable depends on
<i>its</i> type, which is a <i>meta-type</i>. (In core Python there
is only one meta-type, the type ``type'' (types.TypeType), which is
the type of all type objects, even itself.) A new meta-type must
be defined that makes the type of the class-like objects callable.
(Normally, a third type would also be needed, the new ``instance''
type, but this is not an absolute requirement -- the new class type
could return an object of some existing type when invoked to create an
instance.)

<P>Still confused? Here's a simple device due to Don himself to
explain metaclasses. Take a simple class definition; assume B is a
special class that triggers Don's hook:

<PRE>
class C(B):
    a = 1
    b = 2
</PRE>

This can be thought of as equivalent to:

<PRE>
C = type(B)('C', (B,), {'a': 1, 'b': 2})
</PRE>

If that's too dense for you, here's the same thing written out using
temporary variables:

<PRE>
creator = type(B)               # The type of the base class
name = 'C'                      # The name of the new class
bases = (B,)                    # A tuple containing the base class(es)
namespace = {'a': 1, 'b': 2}    # The namespace of the class statement
C = creator(name, bases, namespace)
</PRE>

This is analogous to what happens without the Don Beaudry hook, except
that in that case the creator function is set to the default class
creator.

<P>In either case, the creator is called with three arguments. The
first one, <i>name</i>, is the name of the new class (as given at the
top of the class statement). The <i>bases</i> argument is a tuple of
base classes (a singleton tuple if there's only one base class, as in
the example). Finally, <i>namespace</i> is a dictionary containing
the local variables collected during execution of the class statement.

<P>Note that the contents of the namespace dictionary are simply
whatever names were defined in the class statement. A little-known
fact is that when Python executes a class statement, it enters a new
local namespace, and all assignments and function definitions take
place in this namespace.
Thus, after executing the following class
statement:

<PRE>
class C:
    a = 1
    def f(s): pass
</PRE>

the class namespace's contents would be {'a': 1, 'f': &lt;function f
...&gt;}.

<P>But enough already about writing Python metaclasses in C; read the
documentation of <A
HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> or <A
HREF="http://www.digicool.com/papers/ExtensionClass.html" >Extension
Classes</A> for more information.

<P>

<HR>

<H2>Writing Metaclasses in Python</H2>

<P>In Python 1.5, the requirement to write a C extension in order to
write metaclasses has been dropped (though you can still do
it, of course). In addition to the check ``is the type of the base
class callable,'' there's a check ``does the base class have a
__class__ attribute.'' If so, it is assumed that the __class__
attribute refers to a class.

<P>Let's repeat our simple example from above:

<PRE>
class C(B):
    a = 1
    b = 2
</PRE>

Assuming B has a __class__ attribute, this translates into:

<PRE>
C = B.__class__('C', (B,), {'a': 1, 'b': 2})
</PRE>

This is exactly the same as before except that instead of type(B),
B.__class__ is invoked. If you have read <A HREF=
"http://www.python.org/cgi-bin/faqw.py?req=show&file=faq06.022.htp"
>FAQ question 6.22</A> you will understand that while there is a big
technical difference between type(B) and B.__class__, they play the
same role at different abstraction levels. And perhaps at some point
in the future they will really be the same thing (at which point you
would be able to derive subclasses from built-in types).

<P>At this point it may be worth mentioning that C.__class__ is the
same object as B.__class__, i.e., C's metaclass is the same as B's
metaclass.
In other words, subclassing an existing class creates a
new (meta)instance of the base class's metaclass.

<P>Going back to the example, the class B.__class__ is instantiated,
passing its constructor the same three arguments that are passed to
the default class constructor or to an extension's metaclass:
<i>name</i>, <i>bases</i>, and <i>namespace</i>.

<P>It is easy to be confused by what exactly happens when using a
metaclass, because we lose the absolute distinction between classes
and instances: a class is an instance of a metaclass (a
``metainstance''), but technically (i.e. in the eyes of the Python
runtime system), the metaclass is just a class, and the metainstance
is just an instance. At the end of the class statement, the metaclass
whose metainstance is used as a base class is instantiated, yielding a
second metainstance (of the same metaclass). This metainstance is
then used as a (normal, non-meta) class; instantiation of the class
means calling the metainstance, and this will return a real instance.
And what class is that an instance of? Conceptually, it is of course
an instance of our metainstance; but in most cases the Python runtime
system will see it as an instance of a helper class used by the
metaclass to implement its (non-meta) instances...

<P>Hopefully an example will make things clearer. Let's presume we
have a metaclass MetaClass1. Its helper class (for non-meta
instances) is called HelperClass1.
We now (manually) instantiate
MetaClass1 once to get an empty special base class:

<PRE>
BaseClass1 = MetaClass1("BaseClass1", (), {})
</PRE>

We can now use BaseClass1 as a base class in a class statement:

<PRE>
class MySpecialClass(BaseClass1):
    i = 1
    def f(s): pass
</PRE>

At this point, MySpecialClass is defined; it is a metainstance of
MetaClass1 just like BaseClass1, and in fact the expression
``BaseClass1.__class__ == MySpecialClass.__class__ == MetaClass1''
yields true.

<P>We are now ready to create instances of MySpecialClass. Let's
assume that no constructor arguments are required:

<PRE>
x = MySpecialClass()
y = MySpecialClass()
print x.__class__, y.__class__
</PRE>

The print statement shows that x and y are instances of HelperClass1.
How did this happen? MySpecialClass is an instance of MetaClass1
(``meta'' is irrelevant here); when an instance is called, its
__call__ method is invoked, and presumably the __call__ method defined
by MetaClass1 returns an instance of HelperClass1.

<P>Now let's see how we could use metaclasses -- what can we do
with metaclasses that we can't easily do without them? Here's one
idea: a metaclass could automatically insert trace calls for all
method calls. Let's first develop a simplified example, without
support for inheritance or other ``advanced'' Python features (we'll
add those later).

<PRE>
import types

class Tracing:
    def __init__(self, name, bases, namespace):
        """Create a new class."""
        self.__name__ = name
        self.__bases__ = bases
        self.__namespace__ = namespace
    def __call__(self):
        """Create a new instance."""
        return Instance(self)

class Instance:
    def __init__(self, klass):
        self.__klass__ = klass
    def __getattr__(self, name):
        try:
            value = self.__klass__.__namespace__[name]
        except KeyError:
            raise AttributeError, name
        if type(value) is not types.FunctionType:
            return value
        return BoundMethod(value, self)

class BoundMethod:
    def __init__(self, function, instance):
        self.function = function
        self.instance = instance
    def __call__(self, *args):
        print "calling", self.function, "for", self.instance, "with", args
        return apply(self.function, (self.instance,) + args)

Trace = Tracing('Trace', (), {})

class MyTracedClass(Trace):
    def method1(self, a):
        self.a = a
    def method2(self):
        return self.a

aninstance = MyTracedClass()

aninstance.method1(10)

print "the answer is %d" % aninstance.method2()
</PRE>

Confused already? The example is meant to be read from the top down. The
Tracing class is the metaclass we're defining. Its structure is
really simple.

<P>

<UL>

<LI>The __init__ method is invoked when a new Tracing instance is
created, e.g. the definition of class MyTracedClass later in the
example. It simply saves the class name, base classes and namespace
as instance variables.<P>

<LI>The __call__ method is invoked when a Tracing instance is called,
e.g. the creation of aninstance later in the example. It returns an
instance of the class Instance, which is defined next.<P>

</UL>

<P>The class Instance is the class used for all instances of classes
built using the Tracing metaclass, e.g. aninstance.
It has two
methods:

<P>

<UL>

<LI>The __init__ method is invoked from the Tracing.__call__ method
above to initialize a new instance. It saves the class reference as
an instance variable. It uses a funny name because the user's
instance variables (e.g. self.a later in the example) live in the same
namespace.<P>

<LI>The __getattr__ method is invoked whenever the user code
references an attribute of the instance that is not an instance
variable (nor a class variable; but except for __init__ and
__getattr__ there are no class variables). It will be called, for
example, when aninstance.method1 is referenced in the example, with
self set to aninstance and name set to the string "method1".<P>

</UL>

<P>The __getattr__ method looks the name up in the __namespace__
dictionary. If it isn't found, it raises an AttributeError exception.
(In a more realistic example, it would first have to look through the
base classes as well.) If it is found, there are two possibilities:
it's either a function or it isn't. If it's not a function, it is
assumed to be a class variable, and its value is returned. If it's a
function, we have to ``wrap'' it in an instance of yet another helper
class, BoundMethod.

<P>The BoundMethod class is needed to implement a familiar feature:
when a method is defined, it has an initial argument, self, which is
automatically bound to the relevant instance when it is called. For
example, aninstance.method1(10) is equivalent to method1(aninstance,
10). To carry out this call in the example, first a temporary
BoundMethod instance is created with the following constructor call:
temp = BoundMethod(method1, aninstance); then this instance is called as
temp(10). After the call, the temporary instance is discarded.
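<P>The wrapping step can be tried in isolation. What follows is only a
minimal sketch, not part of Simple.py, and it rests on three
assumptions made for illustration: it uses present-day Python syntax
(argument unpacking in place of apply()), it leaves out the trace
output, and it introduces a hypothetical stand-in class, Holder, to
play the role of the instance.

```python
class BoundMethod:
    """Pair a plain function with the instance it should receive as self."""
    def __init__(self, function, instance):
        self.function = function
        self.instance = instance
    def __call__(self, *args):
        # Put the instance in front of the caller's arguments, then call.
        arguments = (self.instance,) + args
        return self.function(*arguments)

class Holder:
    """Hypothetical stand-in for aninstance; any object with attributes works."""
    pass

def method1(self, a):
    # A plain function whose first parameter happens to be named self.
    self.a = a

aninstance = Holder()
temp = BoundMethod(method1, aninstance)  # created when the method is looked up
temp(10)                                 # same effect as method1(aninstance, 10)
```

After temp(10) returns, aninstance.a is 10, just as if
method1(aninstance, 10) had been called directly.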

<P>

<UL>

<LI>The __init__ method is invoked for the constructor call
BoundMethod(method1, aninstance). It simply saves away its
arguments.<P>

<LI>The __call__ method is invoked when the bound method instance is
called, as in temp(10). It needs to call method1(aninstance, 10).
However, even though self.function is now method1 and self.instance is
aninstance, it can't call self.function(self.instance, args) directly,
because it should work regardless of the number of arguments passed.
(For simplicity, support for keyword arguments has been omitted.)<P>

</UL>

<P>In order to be able to support arbitrary argument lists, the
__call__ method first constructs a new argument tuple. Conveniently,
because of the notation *args in __call__'s own argument list, the
arguments to __call__ (except for self) are placed in the tuple args.
To construct the desired argument list, we concatenate a singleton
tuple containing the instance with the args tuple: (self.instance,) +
args. (Note the trailing comma used to construct the singleton
tuple.) In our example, the resulting argument tuple is (aninstance,
10).

<P>The intrinsic function apply() takes a function and an argument
tuple and calls the function with those arguments. In our example, we
are calling apply(method1, (aninstance, 10)) which is equivalent to
calling method1(aninstance, 10).

<P>From here on, things should come together quite easily. The output
of the example code is something like this:

<PRE>
calling &lt;function method1 at ae8d8&gt; for &lt;Instance instance at 95ab0&gt; with (10,)
calling &lt;function method2 at ae900&gt; for &lt;Instance instance at 95ab0&gt; with ()
the answer is 10
</PRE>

<P>That was about the shortest meaningful example that I could come up
with.
A real tracing metaclass (for example, <A
HREF="#Trace">Trace.py</A> discussed below) needs to be more
complicated in two dimensions.

<P>First, it needs to support more advanced Python features such as
class variables, inheritance, __init__ methods, and keyword arguments.

<P>Second, it needs to provide a more flexible way to handle the
actual tracing information; perhaps it should be possible to write
your own tracing function that gets called, perhaps it should be
possible to enable and disable tracing on a per-class or per-instance
basis, and perhaps there should be a filter so that only interesting
calls are traced; it should also be able to trace the return value of
the call (or the exception it raises if an error occurs). Even the
Trace.py example doesn't support all these features yet.

<P>

<HR>

<H1>Real-life Examples</H1>

<P>Have a look at some very preliminary examples that I coded up to
teach myself how to write metaclasses:

<DL>

<DT><A HREF="Enum.py">Enum.py</A>

<DD>This (ab)uses the class syntax as an elegant way to define
enumerated types. The resulting classes are never instantiated --
rather, their class attributes are the enumerated values. For
example:

<PRE>
class Color(Enum):
    red = 1
    green = 2
    blue = 3
print Color.red
</PRE>

will print the string ``Color.red'', while ``Color.red==1'' is true,
and ``Color.red + 1'' raises a TypeError exception.

<P>

<DT><A NAME=Trace></A><A HREF="Trace.py">Trace.py</A>

<DD>The resulting classes work much like standard
classes, but by setting a special class or instance attribute
__trace_output__ to point to a file, all calls to the class's methods
are traced. It was a bit of a struggle to get this right. This
should probably be redone using the generic metaclass below.

<P>

<DT><A HREF="Meta.py">Meta.py</A>

<DD>A generic metaclass. This is an attempt at finding out how much
standard class behavior can be mimicked by a metaclass. The
preliminary answer appears to be that everything's fine as long as the
class (or its clients) doesn't look at the instance's __class__
attribute, nor at the class's __dict__ attribute. The use of
__getattr__ internally makes the classic implementation of __getattr__
hooks tough; we provide a similar hook _getattr_ instead.
(__setattr__ and __delattr__ are not affected.)
(XXX Hm. Could detect presence of __getattr__ and rename it.)

<P>

<DT><A HREF="Eiffel.py">Eiffel.py</A>

<DD>Uses the above generic metaclass to implement Eiffel-style
pre-conditions and post-conditions.

<P>

<DT><A HREF="Synch.py">Synch.py</A>

<DD>Uses the above generic metaclass to implement synchronized
methods.

<P>

<DT><A HREF="Simple.py">Simple.py</A>

<DD>The example module used above.

<P>

</DL>

<P>A pattern seems to be emerging: almost all of these uses of
metaclasses (except for Enum, which is probably more cute than useful)
work by placing wrappers around method calls. An obvious
problem with that is that it's not easy to combine the features of
different metaclasses, while this would actually be quite useful: for
example, I wouldn't mind getting a trace from the test run of the
Synch module, and it would be interesting to add preconditions to it
as well. This needs more research. Perhaps a metaclass could be
provided that allows stackable wrappers...

<P>

<HR>

<H2>Things You Could Do With Metaclasses</H2>

<P>There are lots of things you could do with metaclasses.
Most of
these can also be done with creative use of __getattr__, but
metaclasses make it easier to modify the attribute lookup behavior of
classes. Here's a partial list.

<P>

<UL>

<LI>Enforce different inheritance semantics, e.g. automatically call
base class methods when a derived class overrides them<P>

<LI>Implement class methods (e.g. if the first argument is not named
'self')<P>

<LI>Implement that each instance is initialized with <b>copies</b> of
all class variables<P>

<LI>Implement a different way to store instance variables (e.g. in a
list kept outside the instance but indexed by the instance's id())<P>

<LI>Automatically wrap or trap all or certain methods

<UL>

<LI>for tracing

<LI>for precondition and postcondition checking

<LI>for synchronized methods

<LI>for automatic value caching

</UL>
<P>

<LI>When an attribute is a parameterless function, call it on
reference (to mimic it being an instance variable); same on assignment<P>

<LI>Instrumentation: see how many times various attributes are used<P>

<LI>Different semantics for __setattr__ and __getattr__ (e.g. disable
them when they are being used recursively)<P>

<LI>Abuse class syntax for other things<P>

<LI>Experiment with automatic type checking<P>

<LI>Delegation (or acquisition)<P>

<LI>Dynamic inheritance patterns<P>

<LI>Automatic caching of methods<P>

</UL>

<P>

<HR>

<H4>Credits</H4>

<P>Many thanks to David Ascher and Donald Beaudry for their comments
on an earlier draft of this paper.
Also thanks to Matt Conway and Tommy
Burnette for putting a seed for the idea of metaclasses in my
mind, nearly three years ago, even though at the time my response was
``you can do that with __getattr__ hooks...'' :-)

<P>

<HR>

</BODY>

</HTML>