I have a utility method that parses XML using a parser created as etree.XMLParser(recover=True)
. I would like to test failure scenarios in a unit test. Except for empty input throwing an lxml.etree.XMLSyntaxError
, I can't seem to break the parser.
My question is: is it possible to construct a StringIO
or BytesIO
input for this parser such that the parser throws a parse error?
Here's some examples (tested with Python 3.5 and lxml 4.3.3):
from io import BytesIO
from lxml import etree
def parse(xml):
parser = etree.XMLParser(recover=True)
elem = etree.parse(BytesIO(xml), parser)
print(etree.tostring(elem))
parse(b'<broken<') # prints b'<broken/>'
parse(b'</lf|\jf>') # prints None
parse('<?xml encoding="ascii"?><foo>æøå</foo>'.encode('utf-8')) # prints b'<foo/>'
parse(b'') # Throws lxml.etree.XMLSyntaxError
from Can etree.XMLParser in recover mode still throw a parse error?
No comments:
Post a Comment