by Leonard Richardson
Additions
---------

More of the jQuery API: nextUntil?

Optimizations
-------------

The html5lib tree builder doesn't use the standard tree-building API,
which worries me and has resulted in a number of bugs.

markup_attr_map can be optimized since it's always a map now.

Upon encountering UTF-16LE data or some other uncommon serialization
of Unicode, UnicodeDammit will convert the data to Unicode, then
encode it as UTF-8. This is wasteful because it will just get decoded
back to Unicode.
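
A minimal sketch of the round trip described above, using only the
standard library codecs. This is not UnicodeDammit's actual code; the
function names are hypothetical and exist only to contrast the two
paths.

```python
import codecs


def decode_via_utf8(data: bytes) -> str:
    """Illustrative only: bytes -> str -> UTF-8 bytes -> str."""
    text = data.decode("utf-16")      # the BOM selects the byte order and is dropped
    as_utf8 = text.encode("utf-8")    # the intermediate re-encoding
    return as_utf8.decode("utf-8")    # ...which is immediately undone


def decode_directly(data: bytes) -> str:
    """The cheaper path: stop as soon as the data is Unicode."""
    return data.decode("utf-16")


# UTF-16LE sample with a byte-order mark in front.
sample = codecs.BOM_UTF16_LE + "caf\u00e9".encode("utf-16-le")
round_trip = decode_via_utf8(sample)
direct = decode_directly(sample)
```

Both paths produce the same string, so the extra encode/decode pair
only costs time and memory.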

CDATA
-----

The elementtree XMLParser has a strip_cdata argument that, when set to
False, should allow Beautiful Soup to preserve CDATA sections instead
of treating them as text. Except it doesn't. (This argument is also
present for HTMLParser, and also does nothing there.)
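
A rough analogue of the problem, using the standard library's
xml.etree.ElementTree rather than lxml (the stdlib parser has no
strip_cdata argument, so this only shows the general shape of the
issue). RecordingTarget is a hypothetical class: a parser target that
records the callbacks it receives. The CDATA section reaches it through
data(), like any other text.

```python
from xml.etree.ElementTree import XMLParser


class RecordingTarget:
    """Hypothetical parser target that logs every callback it gets."""

    def __init__(self):
        self.events = []

    def start(self, tag, attrib):
        self.events.append(("start", tag))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, text):
        # CDATA content arrives here with nothing marking it as CDATA.
        self.events.append(("data", text))

    def close(self):
        return self.events


parser = XMLParser(target=RecordingTarget())
parser.feed("<root><![CDATA[a < b]]></root>")
events = parser.close()  # returns whatever the target's close() returns

# The text may arrive in more than one data() call, so join the pieces.
text = "".join(value for kind, value in events if kind == "data")
```

Nothing in the recorded events distinguishes the CDATA section from
ordinary character data, so a tree builder sitting behind an interface
like this has no way to create a CData object.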

Currently, html5lib converts CDATA sections into comments. An
as-yet-unreleased version of html5lib changes the parser's handling of
CDATA sections to allow CDATA sections in tags like <svg> and
<math>. The HTML5TreeBuilder will need to be updated to create CData
objects instead of Comment objects in this situation.