I am trying to parse the xref stream from a PDF in JavaScript. I managed to successfully isolate the stream itself (I checked that it's OK by comparing it, in debugging mode, with the value between the stream and endstream keywords in the PDF).
However, when I try to inflate it using the pako library, I get an error saying: ERROR incorrect header check.
The compression method is FlateDecode, which can be seen from the dictionary.
Here is the code in question:
const dict = pdfStr.slice(pdf.startXRef);
const xrefStreamStart = this.getSubstringIndex(dict, 'stream', 1) + 'stream'.length + 2;
const xrefStreamEnd = this.getSubstringIndex(dict, 'endstream', 1) + 1;
const xrefStream = dict.slice(xrefStreamStart, xrefStreamEnd);
const inflatedXrefStream = pako.inflate(this.str2ab(xrefStream), { to: 'string' });
pdfStr is the whole PDF read as a string, while *pdf.startXRef* holds the offset of the xref stream object within it.
Here's the whole PDF if someone wants to have a look: https://easyupload.io/lzf9he
EDIT: As mcernak suggested, the problem was that I included the \r and \n characters in the stream. However, now that I have corrected the code, I get a different error: invalid distance too far back
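For reference, here is a minimal sketch of isolating the stream on raw bytes instead of on a string. It rests on two assumptions that I have not confirmed against this particular file: that reading the PDF as text can alter the compressed bytes (one possible cause of errors like the one above), and that the stream dictionary carries a direct /Length entry. The helper names extractStreamBytes, indexOfBytes and asciiBytes are made up for illustration, and pdfBytes is assumed to be a Uint8Array holding the whole file.

```javascript
// Sketch only: the helper names below are hypothetical, not part of pako or any PDF library.

// Find the first occurrence of `needle` (Uint8Array) in `haystack`, starting at `from`.
function indexOfBytes(haystack, needle, from = 0) {
  outer: for (let i = from; i <= haystack.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Convert an ASCII keyword such as 'stream' to bytes.
function asciiBytes(text) {
  return Uint8Array.from(text, (ch) => ch.charCodeAt(0));
}

// Return the raw (still compressed) bytes of the stream that belongs to the
// object starting at `objectOffset`. Assumes the first 'stream' keyword after
// that offset opens the stream data.
function extractStreamBytes(pdfBytes, objectOffset) {
  const keywordIdx = indexOfBytes(pdfBytes, asciiBytes('stream'), objectOffset);
  if (keywordIdx === -1) throw new Error('stream keyword not found');

  // The keyword is followed by CRLF or by a single LF; skip whichever is there.
  let start = keywordIdx + 'stream'.length;
  if (pdfBytes[start] === 0x0d && pdfBytes[start + 1] === 0x0a) start += 2;
  else if (pdfBytes[start] === 0x0a) start += 1;

  // Prefer the /Length entry when it is a direct number (not "n g R").
  const dictText = String.fromCharCode(...pdfBytes.subarray(objectOffset, keywordIdx));
  const match = /\/Length\s+(\d+)(\s+\d+\s+R)?/.exec(dictText);
  if (match && match[2] === undefined) {
    const end = start + Number(match[1]);
    if (end > pdfBytes.length) throw new Error('/Length exceeds file size');
    return pdfBytes.subarray(start, end);
  }

  // Fallback: cut at 'endstream' and drop one trailing end-of-line marker.
  // This is ambiguous if the data itself happens to end in CR or LF.
  let end = indexOfBytes(pdfBytes, asciiBytes('endstream'), start);
  if (end === -1) throw new Error('endstream keyword not found');
  if (pdfBytes[end - 1] === 0x0a) end--;
  if (pdfBytes[end - 1] === 0x0d) end--;
  return pdfBytes.subarray(start, end);
}
```

With this in place the call would look something like `pako.inflate(extractStreamBytes(pdfBytes, pdf.startXRef))`, which avoids both the manual +2 / +1 offsets and the string-to-ArrayBuffer conversion.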
from trying to decompress xref stream from pdf - getting "ERROR incorrect header check"