Hierarchical / multi-index operations in Pandas

Let's say I have a multi-index DataFrame, for example:

                     A         B         C
X      Y                              
bar   one    -0.007381 -0.365315 -0.024817
      two    -1.219794  0.370955 -0.795125
baz   one     0.145578  1.428502 -0.408384
      two    -0.249321 -0.292967 -1.849202
      three  -0.249321 -0.292967 -1.849202
      four    0.21     -0.967123  1.202234
foo   one    -1.046479 -1.250595  0.781722
      two     1.314373  0.333150  0.133331
qux   one     0.716789  0.616471 -0.298493
      two     0.385795 -0.915417 -1.367644

I would like to get the maximum value of A for each value of the first level (X), and also collect the second-level index label (Y) at which that maximum occurs.

How can I do this in Pandas?

2 answers
In [87]: df.loc[df['A'].groupby(level='X').idxmax(), 'A']
Out[87]: 
X    Y   
bar  one    -0.007381
baz  four    0.210000
foo  two     1.314373
qux  one     0.716789
Name: A, dtype: float64
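A self-contained sketch of the same idea follows. It rebuilds only column A of the example frame (B and C are left out for brevity) and then moves Y out of the index so that the second-level label sits next to each maximum. The names idx, max_rows and result are illustrative only, and the .tolist() call is just a cautious way of handing plain tuples to .loc.

```python
import pandas as pd

# Rebuild column A of the example frame with its (X, Y) MultiIndex
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"),
     ("baz", "one"), ("baz", "two"), ("baz", "three"), ("baz", "four"),
     ("foo", "one"), ("foo", "two"),
     ("qux", "one"), ("qux", "two")],
    names=["X", "Y"],
)
df = pd.DataFrame(
    {"A": [-0.007381, -1.219794, 0.145578, -0.249321, -0.249321,
           0.21, -1.046479, 1.314373, 0.716789, 0.385795]},
    index=index,
)

# For each X, idxmax gives the full (X, Y) tuple of the row with the largest A
idx = df["A"].groupby(level="X").idxmax()

# Selecting those tuples returns the maxima, still indexed by (X, Y)
max_rows = df.loc[idx.tolist(), "A"]

# Move Y out of the index: one row per X, with columns Y and A
result = max_rows.reset_index(level="Y")
print(result)
```

Here result["Y"] holds the second-level labels and result["A"] the corresponding maxima.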

To find the median values you can use

df['A'].groupby(level='X').median()

but it's less clear which row should be associated with the median: if a group has an even number of rows, the median is computed as the average of the two middle values. Thus the median is associated not with one row, but with two.
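As a small illustration of this point, the sketch below takes the two A values of group bar from the example and checks whether any row holds the median. The variable names are made up for the illustration.

```python
import pandas as pd

# The two A values of group "bar" from the example (even number of rows)
bar = pd.Series([-0.007381, -1.219794], index=["one", "two"], name="A")

# With an even count, the median is the mean of the two middle values
med = bar.median()

# Rows whose value equals the median: none, since the median lies between them
matches = bar[bar == med]

print(med)
print(len(matches))
```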

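If you are content to pick, say, the n//2-th row of a group with n rows (or the (n-1)//2-th row, if you prefer), you could use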

grouped = df['A'].groupby(level='X', sort=True)
df.loc[grouped.apply(lambda grp: grp.index[grp.count()//2]), 'A']

For example:

In [93]: df.loc[grouped.apply(lambda grp: grp.index[grp.count()//2]), 'A']
Out[93]: 
X    Y    
bar  two     -1.219794
baz  three   -0.249321
foo  two      1.314373
qux  two      0.385795
Name: A, dtype: float64
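Note that grp.index[grp.count()//2] picks a row by its position in the group's existing order, not by the rank of its value (in the output, baz gets three simply because it is the third of four rows). If the row holding the middle value is wanted instead, one possible variation, which is not part of the answer itself, is to sort each group by value first. The helper name middle_label is hypothetical, and only column A of the example is rebuilt.

```python
import pandas as pd

# Rebuild column A of the example frame with its (X, Y) MultiIndex
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"),
     ("baz", "one"), ("baz", "two"), ("baz", "three"), ("baz", "four"),
     ("foo", "one"), ("foo", "two"),
     ("qux", "one"), ("qux", "two")],
    names=["X", "Y"],
)
df = pd.DataFrame(
    {"A": [-0.007381, -1.219794, 0.145578, -0.249321, -0.249321,
           0.21, -1.046479, 1.314373, 0.716789, 0.385795]},
    index=index,
)

def middle_label(grp):
    """Hypothetical helper: (X, Y) label of the value at position n//2 once sorted."""
    ordered = grp.sort_values()
    return ordered.index[len(ordered) // 2]

# One (X, Y) label per group, collected in group-key order
labels = [middle_label(grp) for _, grp in df["A"].groupby(level="X")]

# Look the chosen rows up in the original frame
middle = df.loc[labels, "A"]
print(middle)
```

For groups with an even number of rows this takes the upper of the two middle values; using (len(ordered) - 1) // 2 would take the lower one.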

You can use groupby:

groups = df['A'].groupby(level='X')
groups.min()
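groups.min() returns the smallest A per X; the same grouped object also offers max(), which is what the question asks about. As a rough sketch, both can be computed at once with agg. Only column A of the example is rebuilt here, and the name summary is illustrative.

```python
import pandas as pd

# Rebuild column A of the example frame with its (X, Y) MultiIndex
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"),
     ("baz", "one"), ("baz", "two"), ("baz", "three"), ("baz", "four"),
     ("foo", "one"), ("foo", "two"),
     ("qux", "one"), ("qux", "two")],
    names=["X", "Y"],
)
df = pd.DataFrame(
    {"A": [-0.007381, -1.219794, 0.145578, -0.249321, -0.249321,
           0.21, -1.046479, 1.314373, 0.716789, 0.385795]},
    index=index,
)

groups = df["A"].groupby(level="X")

# One row per X, with a "min" and a "max" column
summary = groups.agg(["min", "max"])
print(summary)
```

This gives the values only; the Y label is lost in the aggregation, so idxmin or idxmax is still needed to recover which row each value came from.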