Change Log
----------

0.9999999/1.0b8
~~~~~~~~~~~~~~~

Released on XXX

* XXX


0.999999/1.0b7
~~~~~~~~~~~~~~

Released on July 7, 2015

* Fix #189: fix the sanitizer to allow relative URLs again (as it did
  prior to 0.9999/1.0b5).


0.99999/1.0b6
~~~~~~~~~~~~~

Released on April 30, 2015

* Fix #188: fix the sanitizer to not throw an exception when sanitizing
  bogus data URLs.


0.9999/1.0b5
~~~~~~~~~~~~

Released on April 29, 2015

* Fix #153: Sanitizer fails to treat some attributes as URLs. Despite how
  this sounds, this has no known security implications. No known version
  of IE (5.5 to current), Firefox (3 to current), Safari (6 to current),
  Chrome (1 to current), or Opera (12 to current) will run any script
  provided in these attributes.

* Pass error message to the ParseError exception in strict parsing mode.

* Allow data URIs in the sanitizer, with a whitelist of content-types.

* Add support for Python implementations that don't support lone
  surrogates (read: Jython). Fixes #2.

* Remove localization of error messages. This functionality was totally
  unused (and it was never tested that everything was localizable), so we
  may as well follow numerous browsers in not supporting translating
  technical strings.

* Expose treewalkers.pprint as a public API.

* Add a documentEncoding property to HTML5Parser, fix #121.


0.999
~~~~~

Released on December 23, 2013

* Fix #127: add work-around for CPython issue #20007: .read(0) on
  http.client.HTTPResponse drops the rest of the content.

* Fix #115: lxml treewalker can now deal with fragments containing, at
  their root level, text nodes with non-ASCII characters on Python 2.


0.99
~~~~

Released on September 10, 2013

* No library changes from 1.0b3; released as 0.99 as pip has changed
  behaviour from 1.4 to avoid installing pre-release versions per
  PEP 440.


1.0b3
~~~~~

Released on July 24, 2013

* Removed ``RecursiveTreeWalker`` from ``treewalkers._base``. Any
  implementation using it should be moved to
  ``NonRecursiveTreeWalker``, as everything bundled with html5lib has
  for years.

* Fix #67 so that ``BufferedStream`` correctly returns a bytes
  object, thereby fixing any case where html5lib is passed a
  non-seekable RawIOBase-like object.


1.0b2
~~~~~

Released on June 27, 2013

* Removed reordering of attributes within the serializer. There is now
  an ``alphabetical_attributes`` option which preserves the previous
  behaviour through a new filter. This allows attribute order to be
  preserved through html5lib if the tree builder preserves order.

* Removed ``dom2sax`` from DOM treebuilders. It has been replaced by
  ``treeadapters.sax.to_sax`` which is generic and supports any
  treewalker; it also resolves all known bugs with ``dom2sax``.

* Fix treewalker assertions on hitting bytes strings on
  Python 2. Prior to 1.0b1, treewalkers coped with mixed
  bytes/unicode data on Python 2; this reintroduces this prior
  behaviour on Python 2. Behaviour is unchanged on Python 3.


1.0b1
~~~~~

Released on May 17, 2013

* Implementation updated to implement the `HTML specification
  <http://www.whatwg.org/specs/web-apps/current-work/>`_ as of 5th May
  2013 (`SVN <http://svn.whatwg.org/webapps/>`_ revision r7867).

* Python 3.2+ supported in a single codebase using the ``six`` library.

* Removed support for Python 2.5 and older.

* Removed the deprecated Beautiful Soup 3 treebuilder.
  ``beautifulsoup4`` can use ``html5lib`` as a parser instead. Note that
  since it doesn't support namespaces, foreign content like SVG and
  MathML is parsed incorrectly.

* Removed ``simpletree`` from the package. The default tree builder is
  now ``etree`` (using the ``xml.etree.cElementTree`` implementation if
  available, and ``xml.etree.ElementTree`` otherwise).

* Removed the ``XHTMLSerializer`` as it never actually guaranteed its
  output was well-formed XML, and hence provided little of use.

* Removed default DOM treebuilder, so ``html5lib.treebuilders.dom`` is no
  longer supported. ``html5lib.treebuilders.getTreeBuilder("dom")`` will
  return the default DOM treebuilder, which uses ``xml.dom.minidom``.

* Optional heuristic character encoding detection now based on
  ``charade`` for Python 2.6 - 3.3 compatibility.

* Optional ``Genshi`` treewalker support fixed.

* Many bugfixes, including:

  * #33: null in attribute value breaks XML AttValue;

  * #4: nested, indirect descendant, <button> causes infinite loop;

  * `Google Code 215
    <http://code.google.com/p/html5lib/issues/detail?id=215>`_: Properly
    detect seekable streams;

  * `Google Code 206
    <http://code.google.com/p/html5lib/issues/detail?id=206>`_: add
    support for <video preload=...>, <audio preload=...>;

  * `Google Code 205
    <http://code.google.com/p/html5lib/issues/detail?id=205>`_: add
    support for <video poster=...>;

  * `Google Code 202
    <http://code.google.com/p/html5lib/issues/detail?id=202>`_: Unicode
    file breaks InputStream.

* Source code is now mostly PEP 8 compliant.

* Test harness has been improved and now depends on ``nose``.

* Documentation updated and moved to http://html5lib.readthedocs.org/.
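
The tree builder changes in 1.0b1 can be illustrated with a minimal sketch.
It assumes ``html5lib`` is installed. The sample markup and the variable
names (``tree``, ``dom_builder``, ``doc``, ``text``) are made up for
illustration and do not come from the release itself.

```python
import html5lib
from html5lib import treebuilders

# With no tree builder specified, parse() uses the default ``etree``
# builder and returns an ElementTree element for the <html> root.
tree = html5lib.parse("<p>Hello</p>")

# Request the DOM tree builder by name instead of importing
# ``html5lib.treebuilders.dom``; it builds an ``xml.dom.minidom`` document.
dom_builder = treebuilders.getTreeBuilder("dom")
doc = html5lib.HTMLParser(tree=dom_builder).parse("<p>Hello</p>")

# Read the text of the first <p> element through the standard DOM API.
text = doc.getElementsByTagName("p")[0].firstChild.data
```

Code that previously relied on ``simpletree`` or on importing
``html5lib.treebuilders.dom`` would most likely move to one of these two
forms, depending on whether it needs an ElementTree or a DOM document.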


0.95
~~~~

Released on February 11, 2012


0.90
~~~~

Released on January 17, 2010


0.11.1
~~~~~~

Released on June 12, 2008


0.11
~~~~

Released on June 10, 2008


0.10
~~~~

Released on October 7, 2007


0.9
~~~

Released on March 11, 2007


0.2
~~~

Released on January 8, 2007