Wednesday, 1 May 2019

Can python-docx preserve font color and styles when importing documents?

Essentially, I need to write a program that takes many .docx files and combines them into one, ordered in a certain way. I have importing working with the following:

import glob

import docx

finaldocname = 'Midterm-All-Questions.docx'
finaldoc = docx.Document()
docstoworkon = glob.glob('*.docx')
if finaldocname in docstoworkon:
    docstoworkon.remove(finaldocname)  # don't process the final doc if it exists

for f in docstoworkon:
    doc = docx.Document(f)

    # Collect the raw text of every paragraph (all formatting is lost here)
    fullText = [para.text for para in doc.paragraphs]

    # finaldoc.styles = doc.styles
    for l in fullText:
        if '#' in l:  # a '#' marks the start of a question
            print('We got here!')
            if '#1 ' not in l:  # no break before the first question ('#1 ')
                finaldoc.add_section()  # new section (new page) between questions
        finaldoc.add_paragraph(l)
finaldoc.save(finaldocname)

But I need to preserve text styles such as font colors, sizes, and italics, and this method loses them because it reads only the raw text of each paragraph and writes that out. I can't find anything in the python-docx documentation about preserving text styles or importing anything other than raw text. Does anyone know how to go about this?
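
One possible direction, offered as a rough sketch rather than a confirmed answer: python-docx exposes each paragraph's formatted pieces as `para.runs`, so the text could be copied run by run instead of through `para.text`. The helper names below (`copy_run_formatting`, `copy_paragraph`, `append_document`) are made up for illustration and are not part of python-docx. The sketch assumes only direct run-level formatting matters (bold, italic, underline, font name, size, and explicit RGB color). It does not carry over paragraph styles, theme colors, tables, or images.

```python
def copy_run_formatting(src_run, dst_run):
    """Copy common character formatting from one run to another."""
    # These are tri-state in python-docx: True, False, or None (inherit).
    dst_run.bold = src_run.bold
    dst_run.italic = src_run.italic
    dst_run.underline = src_run.underline

    dst_run.font.name = src_run.font.name
    dst_run.font.size = src_run.font.size

    # Only an explicit RGB color is copied; theme colors are not handled.
    rgb = src_run.font.color.rgb
    if rgb is not None:
        dst_run.font.color.rgb = rgb


def copy_paragraph(src_para, dest_doc):
    """Append src_para to dest_doc run by run, keeping run-level formatting."""
    new_para = dest_doc.add_paragraph()
    new_para.alignment = src_para.alignment
    for run in src_para.runs:
        new_run = new_para.add_run(run.text)
        copy_run_formatting(run, new_run)
    return new_para


def append_document(src_doc, dest_doc):
    """Append every body paragraph of src_doc to dest_doc."""
    for para in src_doc.paragraphs:
        copy_paragraph(para, dest_doc)
```

With something like this, the inner loop would iterate over `doc.paragraphs` directly, test `para.text` for the `#` marker, and call `copy_paragraph(para, finaldoc)` in place of `finaldoc.add_paragraph(l)`. Formatting that comes from a named style in the source file would still be lost unless the same style exists in the destination document.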



from Can python-docx preserve font color and styles when importing documents?
