- Improve the performance of get_bolds_and_italics(recursive=True, filter_cls=None).
- Fix a bug in get_bolds_and_italics(recursive=False, filter_cls=None) which was causing it to return recursive Bold items.
- Remove the deprecated parameters of Template.normal_name().
- Fix a bug in get_bolds_and_italics() which was causing it to return only Bold items.
- Fix a bug in handling of comments in template names. (#54)
- Improve the handling of weird colspan and rowspan values in tables. (#53)
- Fix a syntax error in Python 3.5.
- BREAKING CHANGE: Remove replace_bolds/replace_italics params from remove_markup/plain_text methods. Users can use the new replace_bolds_and_italics parameter. Removing only bolds or only italics is no longer possible.
- Add get_bolds_and_italics as a new method.
- Fixed bugs and rewrote the algorithm for finding Bold and Italic objects. (#51)
- Trying to mutate an overwritten/detached object will now raise DeadIndexError (a subclass of TypeError). Hopefully this will prevent some subtle late-appearing bugs.
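
Because DeadIndexError is a subclass of TypeError, code that already catches TypeError should keep working after this change. The sketch below only illustrates that relationship: the exception class is a local stand-in defined for the example, and mutate_detached is a hypothetical helper, not part of the library.

```python
class DeadIndexError(TypeError):
    """Local stand-in for the library's exception, defined for illustration."""


def mutate_detached():
    # Hypothetical helper that simulates mutating a detached object.
    raise DeadIndexError('this object has been overwritten or detached')


def try_mutation():
    try:
        mutate_detached()
    except TypeError as error:  # a TypeError handler also catches DeadIndexError
        return type(error).__name__
    return None


result = try_mutation()
```

Catching the more specific DeadIndexError instead lets callers tell a stale object apart from other type errors.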
- Fix a bug in plaintext method.
- Fix a bug in detection of external links in parsable tag extensions. (#50)
- Fix a bug in handling of half-marked bold/italic, e.g. '''bold\n.
- Fix a bug in the handling of half-marked bold/italic items, e.g. '''bold text\n.
- Improve handling of extension tags inside external links. (#49)
- Ignore invalid attributes that do not start with space characters. (#48)
- Improved how invalid attributes (in html tags, tables, etc.) are handled. (#47)
- Fixed a bug in handling <pre> tags. (#46)
- Fixed a bug in parsing tag attributes. (#44)
- Fixed handling of tags having different casings in start and end name, e.g. <s></S>.
- Fix handling of extension tags.
- Fixed a bug in get_bolds/get_italics resulting in duplicate items in returned values. It was also causing a subtle issue in plain_text/remove_markup. (#42)
- Fixed detection of parameters containing single braces.
- Fix handling of external links containing wikilinks.
- Fixed a bug in plain_text/remove_markup causing unexpectedly empty objects. (#40)
- Fixed some other bugs in plain_text/remove_markup functions for:
  - images containing wikitext
  - tags containing bold/italic items
  - nested tags
- Fixed a bug in extracting sub-tags.
- Fixed a bug in Tag objects causing strange behaviour upon mutating a tag.
- Fixed a bug in plain_text/remove_markup functions that caused some objects that were expected to be removed to remain in the result. (#39)
- Fix syntax errors for Python 3.5, 3.6, and 3.7.
- Fix a bug in getting the parser functions of a Template object.
- Fix a catastrophic backtracking issue for wikitexts containing html tags. (#37)
- Add wikitextparser.remove_markup function and WikiText.plain_text method.
- Improve detection of parameters and wikilinks.
- Add get_bolds and get_italics methods.
- WikiLink.wikilinks, WikiList.get_lists(), Template.templates, Tag.get_tags(), ParserFunction.parser_functions, and Parameter.parameters won't return objects equal to self anymore; only sub-elements will be returned.
- Improve handling of comments within wikilinks.
- WikiLink.text.setter no longer accepts None values. This was marked as deprecated since v0.25.0.
- Drop support for Python 3.4.
- Remove the deprecated pprint method. Users should use pformat instead.
- Allow a tuple of patterns in get_list and sublists methods. The default None is now deprecated and a tuple is used instead.
- Add a new parameter, level, for the get_sections method.
- Fixed a rare bug in handling lists and template arguments when there is a newline or a pipe inside a starting or closing tag.
- Section.title will return None instead of '' when the section does not have any title.
- Invoking the deleter of Section.title won't raise a RuntimeError anymore if the section does not have a title already.
- Add a deleter for Section.title property. (#32)
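
Taken together, the Section.title entries above describe a getter that returns None when there is no title and a deleter that is safe to call more than once. A minimal sketch of that property pattern follows; SectionSketch is a hypothetical class written for this example and does not reflect how the library stores or parses section titles.

```python
class SectionSketch:
    """Hypothetical stand-in that models only the title behaviour described above."""

    def __init__(self, title=None):
        self._title = title

    @property
    def title(self):
        # None (rather than '') signals that the section has no title.
        return self._title

    @title.setter
    def title(self, value):
        self._title = value

    @title.deleter
    def title(self):
        # Deleting an already-missing title is a no-op rather than an error.
        self._title = None


section = SectionSketch('History')
del section.title
del section.title  # the second deletion does not raise
```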
- Fixed a bug in WikiText.get_lists() which was causing it to sometimes return items in an unordered fashion. (#31)
- Rename WikiText.lists() method to WikiText.get_lists() and deprecate the old name.
- Add get_sections() method with include_subsections parameter which allows getting a section without including its subsections. (#23)
- Fixed a bug in parsing wikilinks containing [.*] (#29)
- Fixed: wikilinks are not allowed to be preceded by [ anymore.
- Rename WikiText.tags() method to WikiText.get_tags() and deprecate the old name.
- Fix a bug in detecting the end-tag of two consecutive same-name tags. (#27)
- Properly exclude the test package from the source distribution.
- Fix a regression in parsing some corner cases of nested templates. (#26)
- The previously deprecated WikiText.__getitem__ now raises NotImplementedError.
- WikiText.__call__: Remove the deprecated support for start is None.
- Optimize a little and use more robust algorithms.
- Implemented a workaround for a catastrophic backtracking condition when parsing tables. (#22)
- Add get_tables as a new method to WikiText objects. It allows extracting tables in a non-recursive manner.
- The nesting_level property was only meaningful for tables, templates, and parser functions; it has been removed from other types.
- Fix a bug in detecting nested tables. (#21)
- Fix a few bugs in detecting tables and template arguments.
- Changed the comments property of Comment objects to return an empty list.
- Changed the external_links property of ExternalLink objects to return an empty list.
- Fix a bug in setting Section.contents which only occurred when the title had trailing whitespace.
- Setting Section.level will not overwrite Section.title anymore.
- Define WikiLink.title property. It is similar to WikiLink.target but will not include the #fragment.
- Deprecate using None as the start value of __call__.
- Added fragment property to WikiLink class (#18)
- Added deleter method for WikiLink.text property.
- Deprecated: Setting WikiLink.text to None. Use del WikiLink.text instead.
- Added deleter method for WikiLink.target property.
- Added deleter method for ExternalLink.text property.
- Added deleter method for Parameter.default property.
- Deprecated: Setting Parameter.default to None. Use del Parameter.default instead.
- Defined WikiText.__call__ to get a slice of wikitext as string.
- Deprecated WikiText.__getitem__. Use WikiText.__call__ or WikiText.string instead.
- Fixed a bug in Tag.parsed_contents. (#19)
- Fixed a rarely occurring bug in detecting parameters with names consisting only of whitespace or underscores.
- Fixed a bug in detecting parser functions containing parameters.
- Fixed a bug in detecting table header cells that start with +, -, or }. (#17)
- Define deleter method for WikiText.string property and add Template.del_arg method. (#14)
- Improve the lists method of Template and ParserFunction classes. (#15)
- Fixed a bug in detection of multiline arguments. (#13)
- Deprecated capital_links parameter of Template.normal_name. Use capitalize instead (keyword-only argument).
- Deprecated passing the code parameter of Template.normal_name as a positional argument. It's now a keyword-only argument.
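
A common way to carry out this kind of rename is to keep accepting the old keyword for a while and emit a DeprecationWarning when it is used. The sketch below shows that pattern only; normalize is a hypothetical function invented for the example, and its signature and behaviour are assumptions rather than those of Template.normal_name.

```python
import warnings


def normalize(name, *, capitalize=False, capital_links=None):
    """Hypothetical function; only the deprecation pattern matters here."""
    if capital_links is not None:
        warnings.warn(
            'capital_links is deprecated; use capitalize instead',
            DeprecationWarning,
            stacklevel=2,
        )
        capitalize = capital_links
    name = name.strip()
    return name[:1].upper() + name[1:] if capitalize else name


# The old keyword still works but triggers a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    old_style = normalize(' infobox ', capital_links=True)

# The new keyword-only argument gives the same result silently.
new_style = normalize(' infobox ', capitalize=True)
```

The bare `*` in the signature is what makes the arguments after it keyword-only.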
- Fixed a bug in Section objects that was causing them to return the properties of the whole page (#15).
- Removed the deprecated attribute access methods. The following deprecated methods, accessible on Table and Tag objects, have been removed: .has, .get, .set. Use .has_attr, .get_attr, .set_attr instead.
- Fixed a bug in set_attr method.
- Removed the deprecated Table.getdata method. Use Table.data instead.
- Removed the deprecated Table.getrdata(row_num) method. Use Table.data(row=row_num) instead.
- Removed the deprecated Table.getcdata(col_num) method. Use Table.data(col=col_num) instead.
- Removed the deprecated Table.table_attrs property. Use Table.attrs or other attribute-related methods instead.
- Fixed MemoryError caused by very long or unclosed comment tags (issue #12)
- Change the behaviour of external_links property to never return Templates or parser functions as part of the external link.
- Add support for literal IPv6 external links, e.g. https://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:443/.
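
In a literal IPv6 URL the square brackets delimit the host so that its colons are not mistaken for the port separator. As a rough illustration with Python's standard urllib.parse (this is not how the parser itself detects external links), the example URL from the entry above splits like this:

```python
from urllib.parse import urlsplit

url = 'https://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:443/'
parts = urlsplit(url)

host = parts.hostname  # the address without the surrounding brackets
port = parts.port      # the number after the closing bracket and colon
```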
- Fixed: Do not mistake the equal signs of section titles for template keyword arguments.
- Fixed invalid escape sequences for Python 3.6.
- Added msg, msgnw, raw, safesubst, and subst to known parser function identifiers.
- Fixed a bug in Table.data (issue #9)
- Fixed: A bug in processing Section objects.
- Fixed: A bug in external_links (the starting position must now be a word boundary; previously this condition was not checked)
- Fixed: A bug in external_links (external links within sub-templates are now detected correctly; previously they were ignored)
- Changed: The order of results, now everything is sorted by its starting position.
- Fixed: Bug in ancestors and parent methods
- Added: parent and ancestors methods
- Added: __version__ to __init__.py
- Removed: Support for Python 3.3
- Fixed: Handling of comments and tags in section titles
- Changed: Add an underscore prefix to the names of private internal modules
- Changed: Moved test modules to a different directory
- Changed: Templates adjacent to external links are now treated as part of the link
- Fixed: A bug in handling tag extensions within parser functions
- Fixed: A minor bug in Template.set_arg
- Changed: ExternalLink.text: Return None if the link is not within brackets
- Fixed: Handling of comments and templates in external links