I'm trying to consume the Exchange GetAttachment webservice using requests, lxml and base64io. This service returns a base64-encoded file in a SOAP XML HTTP response. The file content is contained in a single line in a single XML element. GetAttachment is just an example, but the problem is more general.
I would like to stream the decoded file contents directly to disk without storing the entire contents of the attachment in-memory at any point, since an attachment could be several 100 MB.
I have tried something like this:
r = requests.post('https://example.com/EWS/Exchange.asmx', data=..., stream=True)
with open('foo.txt', 'wb') as f:
for action, elem in lxml.etree.iterparse(GzipFile(fileobj=r.raw)):
if elem.tag == 't:Content':
b64_encoder = Base64IO(BytesIO(elem.text))
f.write(b64_encoder.read())
but lxml still stores a copy of the attachment as elem.text. Is there any way I can create a fully streaming XML parser that also streams the content of an element directly from the input stream?
from Fully streaming XML parser
No comments:
Post a Comment