Tuesday, 18 January 2022

Best way to unpack multi-part archive (zip/rar) in python

I have a 2 GB archive (preferably zip or rar) split into parts (let's assume 100 parts x 20 MB), and I haven't found a way to unpack it properly. First I tried zip. I had files like test.zip, test.z01, test.z02 ... test.z99, which I merged like this:

    for zipName in zips:
        with open(os.path.join(path_to_zip_file, "test.zip"), "ab") as f:
            with open(os.path.join(path_to_zip_file, zipName), "rb") as z:
                f.write(z.read())

and then, after merging, unpacked it like this:

    with zipfile.ZipFile(os.path.join(path_to_zip_file, "test.zip"), "r") as zipObj:
        zipObj.extractall(path_to_zip_file)

I get errors like: test.zip is not a zip file.
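To narrow down whether the merged file is unreadable as a whole or only damaged in places, it can be checked before extraction. This is a minimal sketch using only the standard `zipfile` module; the helper name `describe_zip` is hypothetical, not part of any library.

```python
import zipfile


def describe_zip(path):
    """Return a short diagnosis of whether `path` is a readable zip file."""
    # is_zipfile() looks for the end-of-central-directory record.
    if not zipfile.is_zipfile(path):
        return "not a zip file"
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the first bad name, or None.
        bad_member = zf.testzip()
    return "ok" if bad_member is None else f"corrupt member: {bad_member}"
```

Calling `describe_zip(os.path.join(path_to_zip_file, "test.zip"))` on the merged file would show which of the two cases applies.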

So I tried rar. At first I thought it might be enough to pick the first part and unpack it, and that the rest would be handled automatically, but no. So again I merged the files (just like in the zip case) and then tried to unpack the result using patoolib:

    patoolib.extract_archive("test.rar", outdir="path here")

When I do that I get errors like: patoolib.util.PatoolError: could not find an executable program to extract format rar; candidates are (rar,unrar,7z)
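The error message suggests that patoolib relies on an external program and could not find one. A quick way to check is to see whether any of the listed candidates is on `PATH`. This is a small sketch using the standard `shutil.which`; the helper name `find_extractor` is hypothetical, and the candidate names are taken from the error text above.

```python
import shutil


def find_extractor(candidates=("rar", "unrar", "7z")):
    """Return (name, full_path) for the first candidate found on PATH, else None."""
    for name in candidates:
        full_path = shutil.which(name)
        if full_path:
            return name, full_path
    return None


if __name__ == "__main__":
    found = find_extractor()
    print(found if found else "no rar-capable extractor found on PATH")
```

If this reports nothing, installing one of those programs (or adding it to `PATH`) is likely needed before patoolib can handle rar files at all.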

After some work I figured out that the merged files are corrupted (I copied one over and tried to unpack it normally on Windows using WinRAR, and there were some problems). So I tried other ways to merge, for example using cat (cat test.part.* > test.rar), but that didn't help either.
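One possible cause of corruption when merging is ordering: a shell glob or a plain sort places test.part.10 before test.part.2 unless the numbers are zero-padded. The sketch below sorts the parts in natural order and streams them into a new output file rather than appending to one of the parts. The helper names `natural_key` and `merge_parts` are hypothetical. It also assumes the parts are raw byte slices of one archive; true multi-volume RAR sets and split ZIPs carry per-volume headers, so plain concatenation may still not produce a valid archive for them.

```python
import re
import shutil
from pathlib import Path


def natural_key(name):
    """Sort key that compares digit runs as integers ('part.10' after 'part.2')."""
    chunks = re.split(r"(\d+)", name)
    # With a capturing group, odd indexes hold the digit runs.
    return [int(c) if i % 2 else c.lower() for i, c in enumerate(chunks)]


def merge_parts(parts, output_path):
    """Concatenate `parts` in natural order into a new file at `output_path`.

    `output_path` must not be one of the parts. Returns the order used.
    """
    ordered = sorted(parts, key=lambda p: natural_key(Path(p).name))
    with open(output_path, "wb") as out:
        for part in ordered:
            with open(part, "rb") as src:
                # Copies in chunks, so a 2 GB archive is never held in memory.
                shutil.copyfileobj(src, out)
    return ordered
```

Comparing the size of the merged file with the sum of the part sizes is a cheap sanity check after running it.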

So the question is: is there a way to handle this? I have to say I'm really disappointed that the libraries don't unpack multi-part archives on their own; that should work naturally.


