Easiest way to load a filtered .tda file using pandas?

Question

pandas has a great function, read_table(), but huge files cause a MemoryError.
Since I only need the lines that satisfy a certain condition, I am looking for a way to load only those.

This can be done using a temporary file:

import pandas

tmp_path = hugeTdaFile + ".partial.tmp"
with open(hugeTdaFile) as huge, open(tmp_path, "w") as tmp:
    tmp.write(huge.readline())  # copy the header line
    for line in huge:
        if SomeCondition(line):  # keep only the lines of interest
            tmp.write(line)

t = pandas.read_table(tmp_path)

Is there a way to avoid such use of a temporary file?

python pandas large-files

Paul yster Feb 26 '13 at 11:39

1 answer

Jeff · Answer 1 · 2013-02-26T16:18:51+0000

You can pass the chunksize parameter to read_table() so that it returns an iterator over chunks instead of a single DataFrame.

See this: http://pandas.pydata.org/pandas-docs/stable/io.html#iterating-through-files-chunk-by-chunk

Filter each chunk frame however you want,
append the filtered chunk to a list,
and concat the list at the end.

(Alternatively, you could write the filtered chunks out to new CSVs or HDFStores.)
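
The steps above could look roughly like the sketch below. The helper name read_filtered, the keep callback, and the column names in the sample data are made up for illustration, and the filter here works on a whole chunk (a DataFrame) rather than on a raw text line as SomeCondition does in the question. A small in-memory string stands in for the huge file.

```python
import io

import pandas as pd


def read_filtered(source, keep, chunksize=100_000, **read_kwargs):
    """Read a tab-separated file chunk by chunk, keeping only matching rows.

    `keep` is a hypothetical callback: it receives one chunk (a DataFrame)
    and returns a boolean mask selecting the rows to keep.
    """
    filtered = []
    # With chunksize set, read_table returns an iterator of DataFrames,
    # so only one chunk is held in memory at a time.
    for chunk in pd.read_table(source, chunksize=chunksize, **read_kwargs):
        filtered.append(chunk[keep(chunk)])
    if not filtered:
        # Nothing was read at all; return an empty frame instead of
        # letting concat fail on an empty list.
        return pd.DataFrame()
    return pd.concat(filtered, ignore_index=True)


# Tiny stand-in for the huge file (illustrative data only).
sample = io.StringIO("name\tvalue\na\t1\nb\t20\nc\t3\nd\t40\n")
result = read_filtered(sample, lambda df: df["value"] > 10, chunksize=2)
```

Only the filtered rows accumulate in memory, so this helps when the condition discards most of the file. If the filtered result is itself too large, writing each filtered chunk out as it is produced is the safer route.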
