= Introduction =

  >>> from bs4 import BeautifulSoup
  >>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML")
  >>> print soup.prettify()
  <html>
   <body>
    <p>
     Some
     <b>
      bad
      <i>
       HTML
      </i>
     </b>
    </p>
   </body>
  </html>
  >>> soup.find(text="bad")
  u'bad'

  >>> soup.i
  <i>HTML</i>

  >>> soup = BeautifulSoup("<tag1>Some<tag2/>bad<tag3>XML", "xml")
  >>> print soup.prettify()
  <?xml version="1.0" encoding="utf-8"?>
  <tag1>
   Some
   <tag2 />
   bad
   <tag3>
    XML
   </tag3>
  </tag1>

= Full documentation =

The bs4/doc/ directory contains full documentation in Sphinx
format. Run "make html" in that directory to create HTML
documentation.

= Running the unit tests =

Beautiful Soup supports unit test discovery from the project root directory:

 $ nosetests

 $ python -m unittest discover -s bs4 # Python 2.7 and up

If you checked out the source tree, you should see a script in the
project root directory called test-all-versions. This script will run
the unit tests under Python 2.7, then create a temporary Python 3
conversion of the source and run the unit tests again under Python 3.

= Links =

Homepage: http://www.crummy.com/software/BeautifulSoup/bs4/
Documentation: http://www.crummy.com/software/BeautifulSoup/bs4/doc/
               http://readthedocs.org/docs/beautiful-soup-4/
Discussion group: http://groups.google.com/group/beautifulsoup/
Development: https://code.launchpad.net/beautifulsoup/
Bug tracker: https://bugs.launchpad.net/beautifulsoup/