Saturday, 19 September 2020

Trying to decompress xref stream from PDF - getting "ERROR incorrect header check"

I am trying to parse the xref stream from a PDF in JavaScript. I managed to successfully isolate the stream itself (I checked that it's OK by comparing it, in debugging mode, with the value between the stream and endstream keywords in the PDF).

However, when I try to inflate it using the pako library, I get an error saying: ERROR incorrect header check.

The compression method is FlateDecode, which can be seen from the dictionary.

Here is the code in question:

    const dict = pdfStr.slice(pdf.startXRef);
    const xrefStreamStart = this.getSubstringIndex(dict, 'stream', 1) + 'stream'.length + 2;
    const xrefStreamEnd = this.getSubstringIndex(dict, 'endstream', 1) + 1;
    const xrefStream = dict.slice(xrefStreamStart, xrefStreamEnd);
    const inflatedXrefStream = pako.inflate(this.str2ab(xrefStream), { to: 'string' });


pdfStr is the whole PDF read as a string, while pdf.startXRef holds the offset of the xref stream object within it.

Here's the whole PDF if someone wants to have a look: https://easyupload.io/lzf9he

EDIT: As mcernak suggested, the problem was that I included the \r and \n characters in the stream. However, now that I have corrected the code, I get a different error: invalid distance too far back
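
For reference, here is a minimal sketch of extracting the stream as raw bytes rather than as a string. It rests on an assumption that the question does not confirm: an "invalid distance too far back" error often means the bytes reaching the inflater differ from the bytes in the file, and one common way that happens is reading a binary PDF as text. The helper name extractStreamBytes is hypothetical, and Node's built-in zlib stands in for pako so that the example runs without dependencies (pako.inflate also accepts a Uint8Array, so the extracted bytes could be passed to it instead).

```javascript
const zlib = require('zlib');

// Hypothetical helper: returns the raw bytes between the "stream" and
// "endstream" keywords, searching from `offset` in a Buffer that holds
// the PDF as bytes (not as a decoded string).
function extractStreamBytes(pdfBytes, offset) {
  const keyword = pdfBytes.indexOf('stream', offset, 'latin1');
  if (keyword === -1) throw new Error('stream keyword not found');
  let start = keyword + 'stream'.length;
  // The keyword is followed by CRLF or a lone LF; skip whichever is present.
  if (pdfBytes[start] === 0x0d) start++;
  if (pdfBytes[start] === 0x0a) start++;

  let end = pdfBytes.indexOf('endstream', start, 'latin1');
  if (end === -1) throw new Error('endstream keyword not found');
  // An end-of-line marker usually precedes "endstream" and is not data.
  // Assumption: reading the /Length entry from the dictionary would be
  // more robust than trimming like this.
  if (pdfBytes[end - 1] === 0x0a) end--;
  if (pdfBytes[end - 1] === 0x0d) end--;
  return pdfBytes.subarray(start, end);
}

// Build a tiny stand-in for an xref stream object and round-trip it.
const payload = Buffer.from('example xref data');
const pdfBytes = Buffer.concat([
  Buffer.from('<< /Type /XRef /Filter /FlateDecode >>\r\nstream\r\n', 'latin1'),
  zlib.deflateSync(payload),
  Buffer.from('\r\nendstream\r\nendobj', 'latin1'),
]);

const inflated = zlib.inflateSync(extractStreamBytes(pdfBytes, 0));
console.log(inflated.toString('latin1'));
```

In a browser, the same idea would apply to a Uint8Array obtained from an ArrayBuffer (for example via FileReader.readAsArrayBuffer), so that no text decoding touches the compressed bytes.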


