Reading a large multi-part zipped text file line by line in Python

I have a very large zip file that has been split into several parts; the archive contains a single file. I don't have enough resources to join the parts back together or to extract them (the source text file is almost 1 TB).

I would like to parse the text file line by line, ideally with something like this:

import zipfile

# filenames: paths of the archive parts, in order
for zipfilename in filenames:
    with zipfile.ZipFile(zipfilename) as z:
        with z.open(...) as f:  # the single text file inside the archive
            for line in f:
                print(line)

Is it possible? If so, how can I read the text file:

  • Without using too much memory (loading the entire file into memory is obviously out of the question)
  • Without extracting any of the zip files
  • (Ideally) Without combining zip files

Thank you in advance for your help.

1 answer

I'll take a stab at this.

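A zip archive keeps its "central directory" at the end of the Zip file, and the zipfile module in Python needs to read it before it can unzip anything.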
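To illustrate why the end of the archive matters, here is a small sketch that builds a tiny zip in memory and looks for the end-of-central-directory record. The member name `example.txt` is made up for the example, and the two halves stand in for parts produced by a byte-level splitter.

```python
import io
import zipfile

# Signature of the "end of central directory" record in the zip format.
EOCD_SIGNATURE = b"PK\x05\x06"

# Build a tiny archive in memory (example.txt is a hypothetical member name).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("example.txt", "hello\n")
data = buf.getvalue()

# The record sits at the very end of the archive.
eocd_offset = data.rfind(EOCD_SIGNATURE)
trailing_bytes = len(data) - eocd_offset

# The first half alone has no such record, so zipfile refuses to open it.
first_part = io.BytesIO(data[: len(data) // 2])
try:
    zipfile.ZipFile(first_part)
    first_part_opens = True
except zipfile.BadZipFile:
    first_part_opens = False

print(trailing_bytes, first_part_opens)
```

This is why opening each part separately with zipfile.ZipFile, as in the question, is unlikely to work: only the last part contains the directory, and the offsets in it refer to the archive as a whole.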

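So, unless I'm mistaken, if the zip was cut into pieces with something like split, you will have to present the pieces to Python as one file yourself.

You would write a "file-like" object that implements seek() and read() (and possibly a few others, such as tell()), and that spans all the parts.

seek() would need to know the size of each zip part, work out which part (file) a given (absolute) offset falls in, and position itself there.

read() would read from the current part and, when it reaches the end of that part, carry on with the next one.

Then, you pass this object to ZipFile, which sees it as a single "virtual zip".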
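A minimal sketch of that idea follows, assuming the parts are plain byte-for-byte slices of one archive (as the split utility would produce), not a spanned archive written by a zip tool. The names `MultiPartFile`, `iter_lines` and `big.txt` are made up for the example. Wrapping the object in io.BufferedReader is a design choice of this sketch: it lets a single read() cross a part boundary without extra code.

```python
import bisect
import io
import zipfile


class MultiPartFile(io.RawIOBase):
    """Read-only, seekable view of several files laid end to end (hypothetical helper)."""

    def __init__(self, part_files):
        super().__init__()
        self._parts = list(part_files)   # open binary files, in order
        self._starts = []                # absolute offset where each part begins
        total = 0
        for f in self._parts:
            self._starts.append(total)
            total += f.seek(0, io.SEEK_END)  # seeking to the end returns the size
        self._size = total
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            offset += self._pos
        elif whence == io.SEEK_END:
            offset += self._size
        elif whence != io.SEEK_SET:
            raise ValueError("invalid whence")
        if offset < 0:
            raise ValueError("negative seek position")
        self._pos = offset
        return self._pos

    def readinto(self, buf):
        if self._pos >= self._size:
            return 0  # end of the virtual file
        # Find the part that contains the current absolute offset.
        i = bisect.bisect_right(self._starts, self._pos) - 1
        part = self._parts[i]
        part.seek(self._pos - self._starts[i])
        n = part.readinto(buf)  # may stop short at the end of this part
        self._pos += n
        return n


def iter_lines(part_files, member=None):
    """Yield the lines (as bytes) of one archive member without extracting it."""
    virtual_zip = io.BufferedReader(MultiPartFile(part_files))
    with zipfile.ZipFile(virtual_zip) as z:
        name = member if member is not None else z.namelist()[0]
        with z.open(name) as f:
            for line in f:
                yield line


# Demo: in-memory parts stand in for real files opened with open(path, "rb").
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("big.txt", "first\nsecond\nthird\n")
data = archive.getvalue()
parts = [io.BytesIO(data[i:i + 40]) for i in range(0, len(data), 40)]
for line in iter_lines(parts):
    print(line)
```

With real files, the parts would be opened in binary mode and passed in order. Only one decompressed chunk is held in memory at a time, so nothing is extracted or joined on disk.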
