I work with huge CSV files (20-25 million lines) and do not want to break them into smaller parts, for many reasons.
My script reads the file line by line using the csv module. I now need to find the position (byte offset) of the line that will be read on the next iteration (or of the line that has just been read).
I tried:
>>> import csv
>>> f = open("uscompany.csv","rU")
>>> reader = csv.reader(f)
>>> reader.next()
....
>>> f.tell()
8230
But it looks like the csv module reads the file in blocks, because when I continue iterating I get the same position:
>>> reader.next()
....
>>> f.tell()
8230
Any suggestions? Please advise.
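For reference, here is a minimal sketch of one possible workaround, not a confirmed solution. It is written for Python 3 (the session above is Python 2) and assumes the file is opened in binary mode, is UTF-8 (or another ASCII-compatible encoding), and uses "\n" or "\r\n" line endings. The helper name rows_with_offsets is hypothetical. The idea is to feed csv.reader one line at a time via readline() and count the bytes handed over, instead of asking the file object for its position:

```python
import csv
import io


def rows_with_offsets(binary_file, encoding="utf-8"):
    """Yield (start_offset, row) pairs.

    start_offset is the byte position at which the row begins.
    Hypothetical helper; assumes a binary file object and an
    ASCII-compatible encoding with "\n" or "\r\n" line endings.
    """
    state = {"next": binary_file.tell()}

    def lines():
        # readline() returns exactly one physical line, so the running
        # byte count stays in step with what csv.reader has consumed.
        for raw in iter(binary_file.readline, b""):
            state["next"] += len(raw)
            yield raw.decode(encoding)

    reader = csv.reader(lines())
    while True:
        start = state["next"]  # offset before the reader pulls any line
        try:
            row = next(reader)
        except StopIteration:
            return
        yield start, row


# Small in-memory demo; the last record has a quoted field spanning two lines.
demo = io.BytesIO(b'a,b\r\n1,2\r\n"x\ny",3\r\n')
for start, row in rows_with_offsets(demo):
    print(start, row)
```

With a real file this would be called as rows_with_offsets(open("uscompany.csv", "rb")). Calling seek() on a recorded offset before creating the generator should resume reading at that row, since the counter starts from the file's current position.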