Sunday, 19 December 2021

How can I convert a Markdown string to a DocX in Python?

I am getting markdown text from my API like this:

{
    name:'Onur',
    surname:'Gule',
    biography:'## Computers
    I like **computers** so much.
    I wanna *be* a computer.',
    membership:1
}

The biography field contains a Markdown string, as shown above:

## Computers
I like **computers** so much.
I wanna *be* a computer.

I want to take this Markdown text and convert it to docx content for my reports.

In my docx template:




I am using the Python 3 docxtpl package to create the docx, and it works for simple text.

  • I tried BeautifulSoup to convert the Markdown to docx text, but it doesn't preserve styles (bold, italic, etc.).
  • I tried pandoc and it worked, but it only creates a standalone docx file; I want to add the rendered Markdown text to an existing docx (while it is being created).
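One possible reason the styles get lost in the first attempt is that plain text pulled out of HTML no longer carries any formatting. The following is a minimal sketch, not a definitive fix, that walks the HTML and remembers whether each piece of text sits inside a bold or italic tag. It uses the standard library's html.parser instead of BeautifulSoup, and the names RunCollector and html_to_runs are hypothetical. The input is assumed to be the kind of HTML that markdown.markdown() produces for inline emphasis.

```python
from html.parser import HTMLParser


class RunCollector(HTMLParser):
    """Collect (text, bold, italic) runs from a small HTML fragment."""

    BOLD_TAGS = {"strong", "b"}
    ITALIC_TAGS = {"em", "i"}

    def __init__(self):
        super().__init__()
        self.runs = []
        self._bold = 0    # nesting depth of bold tags
        self._italic = 0  # nesting depth of italic tags

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self._bold += 1
        elif tag in self.ITALIC_TAGS:
            self._italic += 1

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS:
            self._bold = max(0, self._bold - 1)
        elif tag in self.ITALIC_TAGS:
            self._italic = max(0, self._italic - 1)

    def handle_data(self, data):
        # Record the text together with the styling active at this point.
        if data:
            self.runs.append((data, self._bold > 0, self._italic > 0))


def html_to_runs(html):
    """Return a list of (text, bold, italic) tuples for the given HTML."""
    parser = RunCollector()
    parser.feed(html)
    parser.close()
    return parser.runs


runs = html_to_runs("<p>I like <strong>computers</strong> so much.</p>")
```

Each tuple could then be passed to something like docxtpl's RichText.add(text, bold=..., italic=...), so the styling survives instead of being flattened to plain text.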

My current code:

import docx
from docxtpl import DocxTemplate, RichText
import markdown
import jinja2
import markupsafe
from bs4 import BeautifulSoup
import pypandoc

def safe_markdown(text):
    return markupsafe.Markup(markdown.markdown(text))

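def mark2html(value):
    # Attempts so far: HTML via markdown + BeautifulSoup, and RTF via pandoc.
    html = markdown.markdown(value)
    soup = BeautifulSoup(html, features='html.parser')
    output = pypandoc.convert_text(value, 'rtf', format='md')
    # Neither soup nor output gave styled text, so the raw Markdown is returned.
    return RichText(value)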

def from_template(template):
    template = DocxTemplate(template)
    context = {
        'simpleText': 'Simple text test.',
        'markdownText': 'Markdown **text** test.'
    }
    jenv = jinja2.Environment()
    jenv.filters['markdown'] = safe_markdown
    jenv.filters["mark2html"] = mark2html
    template.render(context,jenv)
    template.save('new_report.docx')

So, how can I add rendered Markdown to an existing docx, or add it while creating the docx, perhaps with a jinja2 filter?
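One direction a jinja2 filter could take is sketched below, under several assumptions. It handles only `**bold**`, `*italic*` and `#` headings through a small regular expression rather than a full Markdown parser. The function names inline_runs and markdown_to_richtext are hypothetical. It also assumes that docxtpl's RichText.add() accepts bold and italic keyword arguments and that "\n" inside a RichText is rendered as a line break. Headings are approximated as bold text, not as real Word heading styles.

```python
import re

# Matches **bold** first, then *italic*; nested or escaped markers are not handled.
_INLINE = re.compile(r"\*\*(.+?)\*\*|\*(.+?)\*")


def inline_runs(text):
    """Split one line of Markdown into (text, bold, italic) tuples."""
    runs, pos = [], 0
    for m in _INLINE.finditer(text):
        if m.start() > pos:
            runs.append((text[pos:m.start()], False, False))
        if m.group(1) is not None:
            runs.append((m.group(1), True, False))
        else:
            runs.append((m.group(2), False, True))
        pos = m.end()
    if pos < len(text):
        runs.append((text[pos:], False, False))
    return runs


def markdown_to_richtext(text):
    """Hypothetical jinja2 filter: build a docxtpl RichText from simple Markdown."""
    from docxtpl import RichText  # third-party; imported lazily

    rt = RichText()
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if i:
            rt.add("\n")  # assumption: rendered as a line break by docxtpl
        heading = re.match(r"#{1,6}\s+(.*)", line)
        if heading:
            rt.add(heading.group(1), bold=True)  # heading approximated as bold
            continue
        for chunk, bold, italic in inline_runs(line):
            rt.add(chunk, bold=bold, italic=italic)
    return rt
```

The filter could be registered with jenv.filters['md_richtext'] = markdown_to_richtext and, assuming docxtpl's {{r ... }} rich-text tag accepts a filtered expression, referenced in the template as {{r markdownText|md_richtext }}. This has not been verified against a real template.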



