Wiki Example
============

:author: Ian Bicking <ianb (a] colorstudy.com>

.. contents::

Introduction
------------

This is an example of how to write a WSGI application using WebOb.
WebOb isn't itself intended to write applications -- it is not a web
framework on its own -- but it is *possible* to write applications
using just WebOb.

The `file serving example <file-example.html>`_ is a better example of
advanced HTTP usage.  The `comment middleware example
<comment-example.html>`_ is a better example of using middleware.
This example provides some completeness by showing an
application-focused end point.

This example implements a very simple wiki.

Code
----

The finished code for this is available in
`docs/wiki-example-code/example.py
<https://github.com/Pylons/webob/tree/master/docs/wiki-example-code/example.py>`_
-- you can run that file as a script to try it out.

Creating an Application
-----------------------

A common pattern for creating small WSGI applications is to have a
class which is instantiated with the configuration.  For our
application we'll be storing the pages under a directory.

.. code-block:: python

    import os

    class WikiApp(object):

        def __init__(self, storage_dir):
            self.storage_dir = os.path.abspath(os.path.normpath(storage_dir))

WSGI applications are callables like ``wsgi_app(environ,
start_response)``.  *Instances* of ``WikiApp`` are WSGI applications, so
we'll implement a ``__call__`` method:

.. code-block:: python

    class WikiApp(object):
        ...
        def __call__(self, environ, start_response):
            # what we'll fill in
            pass

To make the script runnable we'll create a simple command-line
interface:

.. code-block:: python

    if __name__ == '__main__':
        import optparse
        parser = optparse.OptionParser(
            usage='%prog --port=PORT'
            )
        parser.add_option(
            '-p', '--port',
            default='8080',
            dest='port',
            type='int',
            help='Port to serve on (default 8080)')
        parser.add_option(
            '--wiki-data',
            default='./wiki',
            dest='wiki_data',
            help='Place to put wiki data into (default ./wiki/)')
        options, args = parser.parse_args()
        print('Writing wiki pages to %s' % options.wiki_data)
        app = WikiApp(options.wiki_data)
        from wsgiref.simple_server import make_server
        httpd = make_server('localhost', options.port, app)
        print('Serving on http://localhost:%s' % options.port)
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print('^C')

There's not much to talk about in this code block.  The application is
instantiated and served with the standard-library module
`wsgiref.simple_server
<http://www.python.org/doc/current/lib/module-wsgiref.simple_server.html>`_.

The WSGI Application
--------------------

Of course all the interesting stuff is in that ``__call__`` method.
WebOb lets you ignore some of the details of WSGI, like what
``start_response`` really is.  ``environ`` is a CGI-like dictionary,
but ``webob.Request`` gives an object interface to it.
``webob.Response`` represents a response, and is itself a WSGI
application.  Here's kind of the hello world of WSGI applications
using these objects:

.. code-block:: python

    from webob import Request, Response

    class WikiApp(object):
        ...

        def __call__(self, environ, start_response):
            req = Request(environ)
            resp = Response(
                'Hello %s!' % req.params.get('name', 'World'))
            return resp(environ, start_response)

``req.params.get('name', 'World')`` gets any query string parameter
(like ``?name=Bob``), or if it's a POST form request it will look for
a form parameter ``name``.
We instantiate the response with the body
of the response.  You could also give keyword arguments like
``content_type='text/plain'`` (``text/html`` is the default content
type and ``200 OK`` is the default status).

For the wiki application we'll support a couple different kinds of
screens, and we'll make our ``__call__`` method dispatch to different
methods depending on the request.  We'll support an ``action``
parameter like ``?action=edit``, and also dispatch on the method (GET,
POST, etc., in ``req.method``).  We'll pass in the request and expect a
response object back.

Also, WebOb has a series of exceptions in ``webob.exc``, like
``webob.exc.HTTPNotFound``, ``webob.exc.HTTPTemporaryRedirect``, etc.
We'll also let the method raise one of these exceptions and turn it
into a response.

One last thing we'll do in our ``__call__`` method is create our
``Page`` object, which represents a wiki page.

All this together makes:

.. code-block:: python

    from webob import Request, Response
    from webob import exc

    class WikiApp(object):
        ...

        def __call__(self, environ, start_response):
            req = Request(environ)
            action = req.params.get('action', 'view')
            # Here's where we get the Page domain object:
            page = self.get_page(req.path_info)
            try:
                try:
                    # The method name is action_{action_param}_{request_method}:
                    meth = getattr(self, 'action_%s_%s' % (action, req.method))
                except AttributeError:
                    # If the method wasn't found there must be
                    # something wrong with the request:
                    raise exc.HTTPBadRequest('No such action %r' % action)
                resp = meth(req, page)
            except exc.HTTPException as e:
                # The exception object itself is a WSGI application/response:
                resp = e
            return resp(environ, start_response)

The Domain Object
-----------------

The ``Page`` domain object isn't really related to the web, but it is
important to implementing this.  Each ``Page`` is just a file on the
filesystem.  Our ``get_page`` method figures out the filename given
the path (the path is in ``req.path_info``, which is all the path
after the base path).  The ``Page`` class handles getting and setting
the title and content.

Here's the method to figure out the filename:

.. code-block:: python

    import os

    class WikiApp(object):
        ...

        def get_page(self, path):
            path = path.lstrip('/')
            if not path:
                # The path was '/', the home page
                path = 'index'
            # Join the requested path onto the storage directory
            path = os.path.join(self.storage_dir, path)
            path = os.path.normpath(path)
            if path.endswith('/'):
                path += 'index'
            if not path.startswith(self.storage_dir):
                raise exc.HTTPBadRequest("Bad path")
            path += '.html'
            return Page(path)

Mostly this is just the kind of careful path construction you have to
do when mapping a URL to a filename.  While the server *may* normalize
the path (so that a path like ``/../../`` can't be requested), you can
never really be sure.
By using ``os.path.normpath`` we eliminate
any ``..`` segments, and then we make absolutely sure that the
resulting path is under our ``self.storage_dir`` with ``if not
path.startswith(self.storage_dir): raise exc.HTTPBadRequest("Bad
path")``.

Here's the actual domain object:

.. code-block:: python

    import os
    import re

    class Page(object):
        def __init__(self, filename):
            self.filename = filename

        @property
        def exists(self):
            return os.path.exists(self.filename)

        @property
        def title(self):
            if not self.exists:
                # we need to guess the title
                basename = os.path.splitext(os.path.basename(self.filename))[0]
                basename = re.sub(r'[_-]', ' ', basename)
                return basename.capitalize()
            content = self.full_content
            match = re.search(r'<title>(.*?)</title>', content, re.I|re.S)
            return match.group(1)

        @property
        def full_content(self):
            # Text mode, so the regular expressions operate on str
            with open(self.filename, 'r') as f:
                return f.read()

        @property
        def content(self):
            if not self.exists:
                return ''
            content = self.full_content
            match = re.search(r'<body[^>]*>(.*?)</body>', content, re.I|re.S)
            return match.group(1)

        @property
        def mtime(self):
            if not self.exists:
                return None
            else:
                return int(os.stat(self.filename).st_mtime)

        def set(self, title, content):
            dir = os.path.dirname(self.filename)
            if not os.path.exists(dir):
                os.makedirs(dir)
            new_content = """<html><head><title>%s</title></head><body>%s</body></html>""" % (
                title, content)
            with open(self.filename, 'w') as f:
                f.write(new_content)

Basically it provides a ``.title`` attribute, a ``.content``
attribute, the ``.mtime`` (last modified time), and the page can exist
or not (giving appropriate guesses for title and content when the page
does not exist).  It encodes these on the filesystem as a simple HTML
page that is parsed by some regular expressions.
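
The storage format is easy to try outside the application.  The sketch
below repeats the same write-then-parse round trip with standalone
functions.  ``write_page``, ``read_page``, ``guess_title`` and the
temporary directory are illustrative names chosen for this sketch;
they are not part of the example code.

.. code-block:: python

    import os
    import re
    import tempfile

    def write_page(filename, title, content):
        # Same single-line HTML layout that Page.set() writes
        html = "<html><head><title>%s</title></head><body>%s</body></html>" % (
            title, content)
        with open(filename, "w") as f:
            f.write(html)

    def read_page(filename):
        # Pull the title and body back out with the same regular expressions
        with open(filename) as f:
            html = f.read()
        title = re.search(r"<title>(.*?)</title>", html, re.I | re.S).group(1)
        body = re.search(r"<body[^>]*>(.*?)</body>", html, re.I | re.S).group(1)
        return title, body

    def guess_title(filename):
        # Fallback used when the page does not exist yet
        basename = os.path.splitext(os.path.basename(filename))[0]
        return re.sub(r"[_-]", " ", basename).capitalize()

    with tempfile.TemporaryDirectory() as demo_dir:
        demo_file = os.path.join(demo_dir, "front_page.html")
        guessed = guess_title(demo_file)
        write_page(demo_file, "Front Page", "<p>Hello</p>")
        round_trip = read_page(demo_file)

Because the title and body are recovered with non-greedy regular
expressions, a body that itself contains ``</body>`` would be cut
short; that is acceptable for a toy wiki but worth knowing about.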

None of this really applies much to the web or WebOb, so I'll leave it
to you to figure out the details of this.

URLs, PATH_INFO, and SCRIPT_NAME
--------------------------------

This is an aside for the tutorial, but an important concept.  In WSGI,
and accordingly with WebOb, the URL is split up into several pieces.
Some of these are obvious and some not.

An example::

    http://example.com:8080/wiki/article/12?version=10

There are several components here:

* req.scheme: ``http``
* req.host: ``example.com:8080``
* req.server_name: ``example.com``
* req.server_port: 8080
* req.script_name: ``/wiki``
* req.path_info: ``/article/12``
* req.query_string: ``version=10``

One non-obvious part is ``req.script_name`` and ``req.path_info``.
These correspond to the CGI environment variables ``SCRIPT_NAME``
and ``PATH_INFO``.  ``req.script_name`` points to the *application*.
You might have several applications in your site at different paths:
one at ``/wiki``, one at ``/blog``, one at ``/``.  Each application
doesn't necessarily know about the others, but it has to construct its
URLs properly -- so any internal links to the wiki application should
start with ``/wiki``.

Just as there are pieces to the URL, there are several properties in
WebOb to construct URLs based on these:

* req.host_url: ``http://example.com:8080``
* req.application_url: ``http://example.com:8080/wiki``
* req.path_url: ``http://example.com:8080/wiki/article/12``
* req.path: ``/wiki/article/12``
* req.path_qs: ``/wiki/article/12?version=10``
* req.url: ``http://example.com:8080/wiki/article/12?version=10``

You can also create URLs with
``req.relative_url('some/other/page')``.  In this example that would
resolve to ``http://example.com:8080/wiki/article/some/other/page``.
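
How these pieces relate to each other can be checked with a small
standard-library sketch.  It is only an illustration: WebOb reads these
values from the WSGI ``environ`` rather than parsing a URL string, and
``script_name`` is supplied by hand here as an assumption about where
the application is mounted.

.. code-block:: python

    from urllib.parse import urljoin, urlsplit

    url = 'http://example.com:8080/wiki/article/12?version=10'
    script_name = '/wiki'  # assumption: the application's mount point

    parts = urlsplit(url)
    host_url = '%s://%s' % (parts.scheme, parts.netloc)
    application_url = host_url + script_name
    # PATH_INFO is whatever follows SCRIPT_NAME in the path
    path_info = parts.path[len(script_name):]
    path_url = application_url + path_info
    path_qs = parts.path + '?' + parts.query
    # A relative reference is resolved against the current page's URL
    relative = urljoin(path_url, 'some/other/page')

The last line mirrors the resolution described above: the final path
segment (``12``) is dropped and the relative path is appended in its
place.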

You can also create a URL relative to the application URL
(SCRIPT_NAME) with ``req.relative_url('some/other/page', True)``, which
would be ``http://example.com:8080/wiki/some/other/page``.

Back to the Application
-----------------------

We have a dispatching function with ``__call__`` and we have a domain
object with ``Page``, but we aren't actually doing anything.

The dispatching goes to ``action_ACTION_METHOD``, where ACTION
defaults to ``view``.  So a simple page view will be
``action_view_GET``.  Let's implement that:

.. code-block:: python

    class WikiApp(object):
        ...

        def action_view_GET(self, req, page):
            if not page.exists:
                return exc.HTTPTemporaryRedirect(
                    location=req.url + '?action=edit')
            text = self.view_template.substitute(
                page=page, req=req)
            resp = Response(text)
            resp.last_modified = page.mtime
            resp.conditional_response = True
            return resp

The first thing we do is redirect the user to the edit screen if the
page doesn't exist.  ``exc.HTTPTemporaryRedirect`` is a response that
gives a ``307 Temporary Redirect`` response with the given location.

Otherwise we fill in a template.  The template language we're going to
use in this example is `Tempita <http://pythonpaste.org/tempita/>`_, a
very simple template language with a similar interface to
`string.Template <http://python.org/doc/current/lib/node40.html>`_.

The template actually looks like this:

.. code-block:: python

    from tempita import HTMLTemplate

    VIEW_TEMPLATE = HTMLTemplate("""\
    <html>
     <head>
      <title>{{page.title}}</title>
     </head>
     <body>
    <h1>{{page.title}}</h1>

    <div>{{page.content|html}}</div>

    <hr>
    <a href="{{req.url}}?action=edit">Edit</a>
     </body>
    </html>
    """)

    class WikiApp(object):
        view_template = VIEW_TEMPLATE
        ...

As you can see it's a simple template using the title and the body,
and a link to the edit screen.  We copy the template object into a
class attribute (``view_template = VIEW_TEMPLATE``) so that potentially a
subclass could override these templates.

``tempita.HTMLTemplate`` is a template that does automatic HTML
escaping.  Our wiki will just be written in plain HTML, so we disable
escaping of the content with ``{{page.content|html}}``.

So let's look at the ``action_view_GET`` method again:

.. code-block:: python

        def action_view_GET(self, req, page):
            if not page.exists:
                return exc.HTTPTemporaryRedirect(
                    location=req.url + '?action=edit')
            text = self.view_template.substitute(
                page=page, req=req)
            resp = Response(text)
            resp.last_modified = page.mtime
            resp.conditional_response = True
            return resp

The template should be pretty obvious now.  We create a response with
``Response(text)``, which already has a default Content-Type of
``text/html``.

To allow conditional responses we set ``resp.last_modified``.  You can
set this attribute to a date, None (effectively removing the header),
a time tuple (like the one produced by ``time.localtime()``), or as in
this case to an integer timestamp.  If you get the value back it will
always be a `datetime
<http://python.org/doc/current/lib/datetime-datetime.html>`_ object
(or None).  With this header we can process requests with
If-Modified-Since headers, and return ``304 Not Modified`` if
appropriate.  It won't actually do that unless you set
``resp.conditional_response`` to True.

.. note::

    If you subclass ``webob.Response`` you can set the class attribute
    ``default_conditional_response = True`` and this setting will be
    on by default.  You can also set other defaults, like the
    ``default_charset`` (``"utf8"``), or ``default_content_type``
    (``"text/html"``).
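
The comparison that ``conditional_response`` turns on can be
approximated with the standard library alone.  What follows is a rough
sketch of the ``If-Modified-Since`` check only, not WebOb's
implementation (which handles more than this one header);
``not_modified`` is a hypothetical helper name.

.. code-block:: python

    from datetime import datetime, timezone
    from email.utils import format_datetime, parsedate_to_datetime

    def not_modified(page_mtime, if_modified_since):
        """Return True when a 304 Not Modified response would be appropriate."""
        if page_mtime is None or not if_modified_since:
            return False
        last_modified = datetime.fromtimestamp(page_mtime, tz=timezone.utc)
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            # Unparseable header: fall back to sending the full page
            return False
        if since.tzinfo is None:
            # Treat a date without a zone as UTC so the comparison is valid
            since = since.replace(tzinfo=timezone.utc)
        return last_modified <= since

    # The header a browser would send back after seeing mtime 100000
    header = format_datetime(
        datetime.fromtimestamp(100000, tz=timezone.utc), usegmt=True)

    unchanged = not_modified(100000, header)
    edited_since = not_modified(100010, header)

With the page untouched the check succeeds and the body can be
skipped; once the file's mtime moves past the date the browser holds,
the full page is sent again.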

The Edit Screen
---------------

The edit screen will be implemented in the method
``action_edit_GET``.  There's a template and a very simple method:

.. code-block:: python

    EDIT_TEMPLATE = HTMLTemplate("""\
    <html>
     <head>
      <title>Edit: {{page.title}}</title>
     </head>
     <body>
    {{if page.exists}}
    <h1>Edit: {{page.title}}</h1>
    {{else}}
    <h1>Create: {{page.title}}</h1>
    {{endif}}

    <form action="{{req.path_url}}" method="POST">
     <input type="hidden" name="mtime" value="{{page.mtime}}">
     Title: <input type="text" name="title" style="width: 70%" value="{{page.title}}"><br>
     Content: <input type="submit" value="Save">
     <a href="{{req.path_url}}">Cancel</a>
       <br>
     <textarea name="content" style="width: 100%; height: 75%" rows="40">{{page.content}}</textarea>
       <br>
     <input type="submit" value="Save">
     <a href="{{req.path_url}}">Cancel</a>
    </form>
    </body></html>
    """)

    class WikiApp(object):
        ...

        edit_template = EDIT_TEMPLATE

        def action_edit_GET(self, req, page):
            text = self.edit_template.substitute(
                page=page, req=req)
            return Response(text)

As you can see, all the action here is in the template.

In ``<form action="{{req.path_url}}" method="POST">`` we submit to
``req.path_url``; that's everything *but* ``?action=edit``.  So we are
POSTing right over the view page.  This has the nice side effect of
automatically invalidating any caches of the original page.  It also
is vaguely `RESTful
<http://en.wikipedia.org/wiki/Representational_State_Transfer>`_.

We save the last modified time in a hidden ``mtime`` field.  This way
we can detect concurrent updates.  If you start editing a page whose
mtime is 100000, and someone else edits and saves a revision changing
the mtime to 100010, we can use this hidden field to detect that
conflict.
Actually resolving the conflict is a little tricky and
outside the scope of this particular tutorial; we'll just report the
conflict to the user as an error.

From there we just have a very straightforward HTML form.  Note that
we don't quote the values because that is done automatically by
``HTMLTemplate``; if you are using something like ``string.Template``
or a templating language that doesn't do automatic quoting, you have
to be careful to quote all the field values.

We don't have any error conditions in our application, but if there
were error conditions we might have to re-display this form with the
input values the user already gave.  In that case we'd do something
like::

    <input type="text" name="title"
     value="{{req.params.get('title', page.title)}}">

This way we use the value in the request (``req.params`` is both the
query string parameters and any variables in a POST request body), but
if there is no value (e.g., first request) then we use the page values.

Processing the Form
-------------------

The form submits to ``action_view_POST`` (``view`` is the default
action).  So we have to implement that method:

.. code-block:: python

    class WikiApp(object):
        ...

        def action_view_POST(self, req, page):
            submit_mtime = int(req.params.get('mtime') or '0') or None
            if page.mtime != submit_mtime:
                return exc.HTTPPreconditionFailed(
                    "The page has been updated since you started editing it")
            page.set(
                title=req.params['title'],
                content=req.params['content'])
            resp = exc.HTTPSeeOther(
                location=req.path_url)
            return resp

The first thing we do is check the mtime value.  It can be an empty
string (when there's no mtime, like when you are creating a page) or
an integer.
``int(req.params.get('mtime') or '0') or None`` basically
makes sure we don't pass ``""`` to ``int()`` (which is an error) then
turns 0 into None (``0 or None`` will evaluate to None in Python --
``false_value or other_value`` in Python resolves to ``other_value``).
If the check fails we just give a not-very-helpful error message, using
``412 Precondition Failed`` (typically preconditions are HTTP headers
like ``If-Unmodified-Since``, but we can't really get the browser to
send requests like that, so we use the hidden field instead).

.. note::

    Error statuses in HTTP are often under-used because people think
    they need to either return an error (useful for machines) or an
    error message or interface (useful for humans).  In fact you can
    do both: you can give any human readable error message with your
    error response.

    One problem is that Internet Explorer will replace error messages
    with its own incredibly unhelpful error messages.  However, it
    will only do this if the error message is short.  If it's fairly
    large (4 KB is large enough) it will show the error message it was
    given.  You can pad your error with a big HTML comment to
    accomplish this, like ``"<!-- %s -->" % ('x'*4000)``.

    You can change the status of any response with ``resp.status_int =
    412``, or you can change the body of an ``exc.HTTPSomething`` with
    ``resp.body = new_body``.  The primary advantage of using the
    classes in ``webob.exc`` is giving the response a clear name and a
    boilerplate error message.

After we check the mtime we get the form parameters from
``req.params`` and issue a redirect back to the original view page.
``303 See Other`` is a good response to give after accepting a POST
form submission, as it gets rid of the POST (no warning messages for the
user if they try to go back).

In this example we've used ``req.params`` for all the form values.
If
we wanted to be specific about where we get the values from, they
could come from ``req.GET`` (the query string, a misnomer since the
query string is present even in POST requests) or ``req.POST`` (a POST
form body).  While sometimes it's nice to distinguish between these
two locations, for the most part it doesn't matter.  If you want to
check the request method (e.g., make sure you can't change a page with
a GET request) there's no reason to do it by accessing these
method-specific getters.  It's better to just handle the method
specifically.  We do it here by including the request method in our
dispatcher (dispatching to ``action_view_GET`` or
``action_view_POST``).


Cookies
-------

One last little improvement we can make is to show the user a message
when they update the page, so it's not quite so mysteriously just
another page view.

A simple way to do this is to set a cookie after the save, then
display it in the page view.  To set it on save, we add a little to
``action_view_POST``:

.. code-block:: python

    def action_view_POST(self, req, page):
        ...
        resp = exc.HTTPSeeOther(
            location=req.path_url)
        resp.set_cookie('message', 'Page updated')
        return resp

And then in ``action_view_GET``:

.. code-block:: python


    VIEW_TEMPLATE = HTMLTemplate("""\
    ...
    {{if message}}
    <div style="background-color: #99f">{{message}}</div>
    {{endif}}
    ...""")

    class WikiApp(object):
        ...

        def action_view_GET(self, req, page):
            ...
            if req.cookies.get('message'):
                message = req.cookies['message']
            else:
                message = None
            text = self.view_template.substitute(
                page=page, req=req, message=message)
            resp = Response(text)
            if message:
                resp.delete_cookie('message')
            else:
                resp.last_modified = page.mtime
                resp.conditional_response = True
            return resp

``req.cookies`` is just a dictionary, and we also delete the cookie if
it is present (so the message doesn't keep getting shown).  The
conditional response stuff only applies when there isn't any
message, as messages are private.  Another alternative would be to
display the message with Javascript, like::

    <script type="text/javascript">
    function readCookie(name) {
        var nameEQ = name + "=";
        var ca = document.cookie.split(';');
        for (var i=0; i < ca.length; i++) {
            var c = ca[i];
            while (c.charAt(0) == ' ') c = c.substring(1,c.length);
            if (c.indexOf(nameEQ) == 0) return c.substring(nameEQ.length,c.length);
        }
        return null;
    }

    function createCookie(name, value, days) {
        if (days) {
            var date = new Date();
            date.setTime(date.getTime()+(days*24*60*60*1000));
            var expires = "; expires="+date.toGMTString();
        } else {
            var expires = "";
        }
        document.cookie = name+"="+value+expires+"; path=/";
    }

    function eraseCookie(name) {
        createCookie(name, "", -1);
    }

    function showMessage() {
        var message = readCookie('message');
        if (message) {
            var el = document.getElementById('message');
            el.innerHTML = message;
            el.style.display = '';
            eraseCookie('message');
        }
    }
    </script>

Then put ``<div id="message" style="display: none"></div>`` in the
page somewhere.  This has the advantage of being very cacheable and
simple on the server side.

Conclusion
----------

We're done, hurrah!