Pandas: indexing data by histogram

Question

I am trying to index data by its probability (estimated with a simple histogram). The goal is to select the items in a series whose probability is below a threshold value.

I have a number of integer values, for example:

import pandas as pnd
import numpy  as np

series = pnd.Series(np.random.poisson(5, size = 100))

Then I computed their histogram as follows:

tmp  = {"series" : series, "count" : np.ones(len(series))}
hist = pnd.DataFrame(tmp).groupby("series").sum()
freq = hist / hist.sum()

So now I have the frequencies of each result indexed by the result, and a series of results. I have two questions:

Is there a way to index series by mapping each result to its frequency as given in freq?
If I manage to do this, how can I select only results with a frequency greater than some value?

Thanks.

python pandas statistics

Rafael S. Calsaverini Apr 13 '12 at 16:52

1 answer

Wes McKinney · Accepted Answer · 2012-04-13T22:27:10+0000

Yes, use the Series.map method:

In [16]: series.map(freq['count'])
Out[16]: 
0     0.12
1     0.06
2     0.20
3     0.11
4     0.02
5     0.13
6     0.14
7     0.11
8     0.12
9     0.16
10    0.20
<snip>

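To make the lookup easier to check by hand, here is a minimal sketch on a small fixed series. The data values and the name prob are illustrative assumptions, not part of the original post; the histogram is built the same way as in the question.

```python
import numpy as np
import pandas as pnd

# Small fixed series so the lookup is easy to verify by hand
# (illustrative data, not from the original post).
series = pnd.Series([4, 4, 5, 7, 4, 5])

# Histogram built the same way as in the question.
tmp = {"series": series, "count": np.ones(len(series))}
hist = pnd.DataFrame(tmp).groupby("series").sum()
freq = hist / hist.sum()

# map() looks up each element of `series` in the index of freq["count"],
# returning a series aligned with `series` that holds each item's frequency.
prob = series.map(freq["count"])
print(prob)
```

Here the value 4 occurs three times out of six, so every position holding a 4 should map to 0.5, and the result keeps the index of series.
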
Then you can filter with a boolean mask:

In [22]: series[series.map(freq['count']) > 0.16]
Out[22]: 
2     4
10    4
11    4
22    4
27    4
31    4
34    4
56    4
64    4
71    4
73    4
76    4
77    4
79    4
80    4
86    4
88    4
89    4
91    4
99    4
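
The two steps can be wrapped into one function. This is only a sketch under some assumptions: select_by_frequency and its keep_rare flag are hypothetical names introduced here, and Series.value_counts(normalize=True) is swapped in for the groupby-based histogram from the question. Since the question mentions both "less than a threshold" and "greater than some value", the flag covers both directions.

```python
import pandas as pnd


def select_by_frequency(series, threshold, keep_rare=False):
    """Hypothetical helper: keep items whose relative frequency is above
    `threshold`, or below it when keep_rare=True."""
    # Relative frequency of each distinct value, indexed by the value.
    freq = series.value_counts(normalize=True)
    # Frequency of each item, aligned with the original series.
    prob = series.map(freq)
    mask = (prob < threshold) if keep_rare else (prob > threshold)
    return series[mask]


# Illustrative data, not from the original post.
series = pnd.Series([4, 4, 5, 7, 4, 5])
common = select_by_frequency(series, 0.4)               # keeps the 4s
rare = select_by_frequency(series, 0.2, keep_rare=True)  # keeps the 7
print(common)
print(rare)
```

The selected items keep their original index labels, so they can still be matched back to positions in series.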
