I use read_csv to read CSV files into Pandas data frames. My CSV files contain a lot of decimal (floating-point) numbers, encoded using European decimal notation:
1.234.456,78
That is, '.' is used as the thousands separator and ',' as the decimal mark.
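To make the format concrete, here is a minimal pure-Python sketch of the conversion I need. The function name parse_european_number is only an illustrative choice, and it assumes the input is always well-formed European notation:

```python
def parse_european_number(text):
    """Convert a string such as '1.234.456,78' to a float."""
    # Drop the '.' thousands separators first, then turn the ',' decimal mark into '.'.
    normalized = text.strip().replace(".", "").replace(",", ".")
    return float(normalized)


print(parse_european_number("1.234.456,78"))
```

The order of the two replacements matters: swapping them would delete the decimal mark along with the thousands separators.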
Pandas 0.8 provides a read_csv argument called "thousands" to set the thousands separator. Is there an additional argument for specifying the decimal mark? If not, what is the most efficient way to parse European-style decimals?
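For illustration, this is the kind of call I am hoping for. It is only a sketch: it assumes a newer Pandas release in which read_csv accepts a "decimal" argument alongside "thousands" (I have not confirmed this for 0.8), and the sample data and the ';' field delimiter are made up:

```python
import io

import pandas as pd

# Made-up sample data; ';' is assumed as the field delimiter
# because ',' is already taken by the decimal mark.
raw = "name;value\na;1.234.456,78\nb;12,5\n"

# thousands='.' strips the grouping dots, decimal=',' marks the fractional part.
df = pd.read_csv(io.StringIO(raw), sep=";", thousands=".", decimal=",")

print(df["value"].dtype)
print(df["value"].tolist())
```

If this works, the column should arrive as floats directly, with no per-value string replacement afterwards.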
I am currently using string replacement, which I consider to be a significant performance hit. The code I use:
f = lambda x: x.replace(u',', u'.')
df['MyColumn'] = df['MyColumn'].map(f)
Any help is appreciated.