Monday 9 November 2020

Non-ASCII characters are not correctly displayed in PDF when served via HttpResponse and AJAX

I have generated a PDF file with ReportLab that contains Cyrillic (non-ASCII) characters. For this purpose I used the "Montserrat" font, which supports such characters. When I open the generated PDF file from Django's media folder, the characters are displayed correctly:

[Screenshot: the generated PDF with the Cyrillic text displayed correctly]

I have embedded the font by using the following code in the function generating the PDF:

from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont

pdfmetrics.registerFont(TTFont('Montserrat', 'apps/Generic/static/Generic/tff/Montserrat-Regular.ttf'))
canvas_test = canvas.Canvas("media/"+filename, pagesize=A4)
canvas_test.setFont('Montserrat', 18)
canvas_test.drawString(10, 150, "Some text encoded in UTF-8")
canvas_test.drawString(10, 100, "как поживаешь")
canvas_test.save()

However, when I serve this PDF via HttpResponse, the Cyrillic characters are not displayed properly, even though the text is still rendered in the Montserrat font:

[Screenshot: the served PDF with the Cyrillic text displayed as wrong characters]

The code that serves the PDF is the following:

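from django.core.files.storage import FileSystemStorage
from django.http import HttpResponse

# Inside the view: return the PDF as a response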
fs = FileSystemStorage()
if fs.exists(filename):
    with fs.open(filename) as pdf:
        response = HttpResponse(
            pdf, content_type='application/pdf; encoding=utf-8; charset=utf-8')
        response['Content-Disposition'] = 'inline; filename="'+filename+'"'
        return response

I have tried nearly everything (using FileResponse, opening the PDF with open(fs.location + "/" + filename, 'rb') as pdf, ...) without success. I do not understand why, if ReportLab embeds the font correctly (as the local file inside the media folder shows), the file provided to the browser does not appear to embed the font.

It is also worth noting that I used Foxit Reader via Chrome or Edge to read the PDF. With the default PDF viewer of Firefox, different erroneous characters are displayed, and the font also appears to be wrong in that case:

[Screenshot: the same PDF in the Firefox viewer, with different wrong characters and a different font]

Edit

Thanks to @Melvyn, I have realized that the error did not lie in the response sent from the Python view, but in the success callback of the AJAX call, shown below:

success: function (data) {
    if (data.error === undefined) {
        var blob = new Blob([data]);
        var link = document.createElement('a');
        link.href = window.URL.createObjectURL(blob);
        link.download = filename + '.pdf';
        link.click();
    }
}

This is the part of the code that is somehow changing the encoding.
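
A likely explanation, offered as a sketch rather than a confirmed diagnosis: if the AJAX library hands the response body to the success callback as a text string (I am assuming that is the case here), the binary PDF bytes are decoded as UTF-8 and then re-encoded when the string is wrapped in a Blob, so the embedded font data and text streams no longer match what ReportLab wrote. The snippet below reproduces that round trip on a few illustrative bytes (not taken from my actual file), and then outlines a hypothetical downloadPdf helper that requests the body as a Blob with fetch instead. The helper name and the switch to fetch are my assumptions and are not part of the original code.

```javascript
// "%PDF-" followed by four non-ASCII bytes, as commonly found near the start of a PDF.
// These byte values are illustrative only.
const originalBytes = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0xe2, 0xe3, 0xcf, 0xd3]);

// Treating the body as text: bytes that are not valid UTF-8 are replaced by U+FFFD.
const asText = new TextDecoder('utf-8').decode(originalBytes);

// new Blob([string]) stores the string as UTF-8; TextEncoder reproduces that step here.
const roundTripped = new TextEncoder().encode(asText);

// The two lengths differ, so the saved file is no longer the file the server sent.
console.log(originalBytes.length, roundTripped.length);

// Hypothetical replacement for the AJAX call: the body is read as a Blob,
// so it is never decoded as text. Only meaningful in a browser.
function downloadPdf(url, filename) {
  return fetch(url)
    .then(function (response) {
      if (!response.ok) {
        throw new Error('Request failed with status ' + response.status);
      }
      return response.blob();
    })
    .then(function (blob) {
      var link = document.createElement('a');
      link.href = window.URL.createObjectURL(blob);
      link.download = filename + '.pdf';
      link.click();
    });
}
```

The ASCII prefix survives the round trip, which may be why the file still opens as a PDF while its binary content (including the font) is damaged.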
