Suppose I have a BIG file with some lines that I want to ignore, and a function (file_function) that takes a file object. Can I return a new file object whose lines satisfy some condition without first reading the entire file? This laziness is the important part.
Note: I could just write a temporary file without these lines, but this is not ideal.
For example, suppose I have a CSV file with one bad line:
1,2
ooops
3,4
My first attempt was to create a new file object (with the same methods as file) and override readline:
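class FileWithoutCondition(file):  # `file` is the Python 2 built-in file type
    def __init__(self, f, condition):
        self.f = f                  # underlying file object
        self.condition = condition  # predicate: keep a line only if it returns True

    def readline(self):
        while True:
            x = self.f.readline()
            # Return '' at end of file so the loop cannot spin forever
            if x == '' or self.condition(x):
                return x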
This works if file_function only uses readline, but not if it requires any other file functionality.
with open('file_name', 'r') as f:
    f1 = FileWithoutCondition(f, lambda x: x != 'ooops\n')
    result = file_function(f1)
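To make the limitation concrete, here is a small Python 3 sketch (so it does not subclass file). ReadlineOnlyFilter, consume_with_readline and consume_with_iteration are hypothetical names introduced only for illustration, and io.StringIO stands in for the real file. A consumer that calls readline sees the filtered lines, while one that iterates over the object fails, because only readline was overridden.

```python
import io


class ReadlineOnlyFilter:
    """Hypothetical stand-in for the wrapper above: only readline is filtered."""

    def __init__(self, f, condition):
        self.f = f
        self.condition = condition

    def readline(self):
        while True:
            x = self.f.readline()
            # '' means end of file; return it so the loop terminates
            if x == '' or self.condition(x):
                return x


def consume_with_readline(f):
    # Hypothetical consumer that only ever calls readline
    lines = []
    while True:
        line = f.readline()
        if line == '':
            return lines
        lines.append(line)


def consume_with_iteration(f):
    # Hypothetical consumer that iterates over the file instead
    return [line for line in f]


data = "1,2\nooops\n3,4\n"
keep = lambda x: x != 'ooops\n'

# The bad line is skipped when the consumer uses readline
print(consume_with_readline(ReadlineOnlyFilter(io.StringIO(data), keep)))

# Iteration is not supported by the wrapper, so this consumer fails
try:
    consume_with_iteration(ReadlineOnlyFilter(io.StringIO(data), keep))
except TypeError as e:
    print("iteration failed:", e)
```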
A solution using StringIO might work, but I have not been able to get one working.
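Ideally, file_function should be treated as a black box: I cannot simply tweak it to accept a generator (but perhaps a generator can be made file-like?).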
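As a rough sketch of the "make a generator file-like" idea, assuming Python 3: the io module can build a full text file object on top of a raw stream that only implements readinto. IterStream and filtered_file below are hypothetical names, not part of any library, and the UTF-8 default is an assumption. Lines are pulled from the source only as the consumer reads, so the whole file is never loaded at once. Whether a given file_function accepts the result still depends on which file methods it calls.

```python
import io


class IterStream(io.RawIOBase):
    """Raw byte stream that pulls chunks lazily from an iterator of bytes."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buffer):
        # Fetch the next chunk only when the previous one is used up
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # 0 bytes read signals end of file
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def filtered_file(f, condition, encoding="utf-8"):
    """Return a text file object with only the lines of f that satisfy condition."""
    # Generator expression: nothing is read from f until the result is consumed
    kept = (line.encode(encoding) for line in f if condition(line))
    return io.TextIOWrapper(io.BufferedReader(IterStream(kept)), encoding=encoding)


# Usage with an in-memory stand-in for the real file
src = io.StringIO("1,2\nooops\n3,4\n")
g = filtered_file(src, lambda x: x != 'ooops\n')
first = g.readline()  # readline works
rest = g.read()       # and so does read
print(repr(first), repr(rest))
```

Because the wrapper is a real io.TextIOWrapper, read, readline, readlines and iteration are all available without overriding each one by hand.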
Is there a standard way to do this kind of lazy (skim-)reading of a file?
Note: the motivating example is pandas, where having only readline is not enough to get pd.read_csv working.