Sunday, 16 January 2022

Getting the List Numbers of List Items in docx file using Python-Docx

When I access a paragraph's text with python-docx, the list numbering is not included.

Current code:

from docx import Document

document = Document("C:/Foo.docx")
for p in document.paragraphs:
    print(p.text)  # prints the paragraph text only, without the list number

List in docx file:

[Image: Numbered List]

I am expecting:
(1) The naturalization of both ...
(2) The naturalization of the ...
(3) The naturalization of the ...

What I get:
The naturalization of both ...
The naturalization of the ...
The naturalization of the ...

Upon checking the XML of the document, the list numbers appear to be stored in w:abstractNum, but I have no idea how to access them or connect them to the proper list item. How can I access the number for each list item in python-docx so that it can be included in my output? Is there also a way to determine the proper nesting of these lists using python-docx?
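For reference, here is a minimal sketch of one way the numbers might be reconstructed. It is not a built-in python-docx feature. It assumes each list item carries a direct w:numPr element (with w:numId and w:ilvl) in its paragraph properties, which is read through the private p._p attribute. It also assumes plain decimal numbering that starts at 1 and is rendered as "(n)". The real format and start value live in the w:abstractNum definition, and numbering inherited from a paragraph style is not handled here. The names list_position, number_items and render are made up for this illustration.

```python
from collections import defaultdict


def list_position(paragraph):
    """Return (numId, ilvl) for a python-docx Paragraph, or None.

    Only a w:numPr set directly on the paragraph is considered;
    numbering inherited from a style is NOT detected (assumption).
    """
    pPr = paragraph._p.pPr  # private API: underlying w:pPr element, may be None
    if pPr is None or pPr.numPr is None:
        return None
    numPr = pPr.numPr
    # A numId of 0 means numbering has been switched off for this paragraph.
    if numPr.numId is None or numPr.numId.val == 0:
        return None
    level = numPr.ilvl.val if numPr.ilvl is not None else 0
    return numPr.numId.val, level


def number_items(items):
    """Yield (level, number, text) for each (position, text) pair.

    position is None for ordinary paragraphs, else (numId, ilvl).
    Counters are kept per numId and per nesting level.
    """
    counters = defaultdict(dict)  # numId -> {level: current count}
    for position, text in items:
        if position is None:
            yield None, None, text
            continue
        num_id, level = position
        levels = counters[num_id]
        levels[level] = levels.get(level, 0) + 1
        # A new item at this level restarts every deeper level.
        for deeper in [lvl for lvl in levels if lvl > level]:
            del levels[deeper]
        yield level, levels[level], text


def render(items):
    """Return output lines, indenting by nesting level (hypothetical "(n)" format)."""
    lines = []
    for level, number, text in number_items(items):
        if number is None:
            lines.append(text)
        else:
            lines.append("    " * level + f"({number}) {text}")
    return lines
```

With python-docx installed, the pieces could be combined as render((list_position(p), p.text) for p in Document("C:/Foo.docx").paragraphs), printing each returned line. The level value returned by list_position is the nesting depth (0 for the outermost list), which is what the indentation in render relies on.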

Thanks. This is the last hurdle for the project I am currently handling.


