Wednesday, 10 November 2021

Convert DOCX Bytestream to PDF Bytestream Python

I currently have a program that generates a .docx document using the python-docx library.

Once the .docx file is built, I save it into an in-memory byte stream like so:

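import io

# `document` is the python-docx Document built earlier
file_stream = io.BytesIO()
document.save(file_stream)
file_stream.seek(0)  # rewind so the stream can be read from the start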

Now I need to convert this Word document into a PDF. I have looked at a few conversion libraries, such as docx2pdf, and at doing it manually with comtypes, like so:

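import os
import comtypes.client

wdFormatPDF = 17  # Word's WdSaveFormat value for PDF

# Absolute paths avoid Word resolving them against its own working directory
in_file = os.path.abspath("Input_file_path.docx")
out_file = os.path.abspath("output_file_path.pdf")

word = comtypes.client.CreateObject('Word.Application')
try:
    doc = word.Documents.Open(in_file)
    doc.SaveAs(out_file, FileFormat=wdFormatPDF)
    doc.Close()
finally:
    word.Quit()  # always shut down the Word process, even on error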

The problem is, I need to do this conversion in memory and cannot physically save the DOCX or the PDF to the machine. Every converter I've seen requires a filepath to the physical document on the machine and I do not have that.
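For reference, the usual fallback with path-based converters is a stream-in, stream-out wrapper around a short-lived temporary directory. The sketch below is only an illustration of that fallback, not a true in-memory conversion: it still writes to disk briefly. The names `convert_stream` and `converter` are hypothetical, and `converter(in_path, out_path)` is assumed to be any routine that reads a DOCX from one path and writes a PDF to another.

```python
import io
import os
import tempfile
from typing import Callable


def convert_stream(docx_stream: io.BytesIO,
                   converter: Callable[[str, str], None]) -> io.BytesIO:
    """Wrap a path-based DOCX-to-PDF converter behind a stream interface.

    `converter` is a hypothetical callable taking (in_path, out_path);
    it is an assumption of this sketch, not part of any library.
    """
    # The directory and everything in it are deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        in_path = os.path.join(tmp_dir, "input.docx")
        out_path = os.path.join(tmp_dir, "output.pdf")

        # Spill the in-memory DOCX to the temporary file.
        with open(in_path, "wb") as f:
            f.write(docx_stream.getvalue())

        # Run the path-based conversion.
        converter(in_path, out_path)

        # Read the PDF back into memory before the directory is removed.
        with open(out_path, "rb") as f:
            return io.BytesIO(f.read())
```

With this shape, the rest of the program only ever sees `io.BytesIO` objects, but the files do exist on disk for the duration of the call, so it does not satisfy a strict no-disk requirement.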

Is there a way I can convert the DOCX filestream into a PDF stream just in memory?

Thanks


