Uncertainties in Pandas

How can I easily handle uncertainties on a Series or DataFrame in Pandas (the Python data analysis library)? I recently discovered a Python uncertainty package, but I'm wondering if there is a simpler way to manage uncertainties directly in Pandas. I did not find anything about this in the documentation.

To be more precise, I do not want to store the uncertainties as a new column in my DataFrame, because I think they are part of a data series and should not be logically separated from it. For example, it makes no sense to delete a column from a DataFrame but keep its uncertainties, so with a separate column I would have to handle this case manually.

I am looking for something like data_frame.uncertainties, which could work like the data_frame.values attribute. A data_frame.units attribute (for the units of the data) would also be useful, but I think there are no such things in Pandas (yet?)...

1 answer

If you really want this to feel built in, you can simply create a class that holds your DataFrame. Then you can define any attributes or methods that you want. I wrote a quick example below, but you can easily add a definition of units or a more complex uncertainty formula:

import pandas as pd

data = {'target_column': [100, 105, 110]}

class data_analysis():
    def __init__(self, data, percentage_uncertainty):
        # Keep the data and its uncertainties together on one object
        self.df = pd.DataFrame(data)
        self.uncertainty = percentage_uncertainty * self.df['target_column'].values

When I run

example = data_analysis(data, .01)
example.uncertainty

I get the array [1., 1.05, 1.1].
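The same idea could probably be extended to cover the concern in the question, so that deleting a column also deletes its uncertainties and units. The following is only a rough sketch under some assumptions: the class name UncertainFrame, its drop method, and the units dictionary are hypothetical names made up for illustration (none of them are part of Pandas), and a single relative uncertainty is applied to every column.

```python
import pandas as pd


class UncertainFrame:
    """Hypothetical wrapper: a DataFrame plus matching uncertainties and units."""

    def __init__(self, data, relative_uncertainty, units=None):
        self.df = pd.DataFrame(data)
        # Assumption: one relative uncertainty applies to every column.
        self.uncertainties = self.df.abs() * relative_uncertainty
        # Plain dict mapping column name -> unit string (illustrative only).
        self.units = dict(units or {})

    def drop(self, column):
        """Remove a column together with its uncertainties and its unit."""
        self.df = self.df.drop(columns=column)
        self.uncertainties = self.uncertainties.drop(columns=column)
        self.units.pop(column, None)


frame = UncertainFrame(
    {"length": [100, 105, 110], "mass": [2.0, 4.0, 8.0]},
    relative_uncertainty=0.01,
    units={"length": "mm", "mass": "kg"},
)
frame.drop("mass")
print(frame.uncertainties)
print(frame.units)
```

Because the data, the uncertainties, and the units are only ever modified through the wrapper's own methods, they should stay in sync. Any other DataFrame operation you need (selecting rows, renaming columns, and so on) would have to be wrapped in the same way.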

Hope this helps
