Python Pandas groupby and return the result to the original Pandas Data Frame

Question


I created a Pandas Series (named DF below) holding six years of monthly data, then took a slice containing only the first five years. From that slice I computed the maximum and minimum value for each calendar month (January-December) across the five years.

This lets me compare the current year against the range of the previous five years.

My code is below, but it is rather verbose. I would welcome any suggestions to make it cleaner.

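import numpy as np
import pandas as pd

# Six years of monthly data on month-end dates (newer pandas spells this alias 'ME')
DF = pd.Series(np.random.randn(72), index=pd.date_range('1/1/2000', periods=72, freq='M'))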


DF5y = DF['2000':'2004']

by = lambda x: lambda y: getattr(y, x)

Max5y = DF5y.groupby([by('month')]).max()
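# min(), not max(); the output below was produced with max() here, so its two range columns are identical
Min5y = DF5y.groupby([by('month')]).min()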

MaxbyMonthDates = pd.Series(Max5y.values, index=pd.date_range('1/1/2005', periods=12, freq='M'))

MinbyMonthDates = pd.Series(Min5y.values, index=pd.date_range('1/1/2005', periods=12, freq='M'))


keys1 = ['DF', 'MaxbyMonthDates' , 'MinbyMonthDates']

DF_5yr_Range = pd.concat([DF, MaxbyMonthDates, MinbyMonthDates], axis=1, keys=keys1)


DF_5yr_Range.tail(15)
Out[13]: 
                  DF  MaxbyMonthDates  MinbyMonthDates
2004-10-31 -0.154463              NaN              NaN
2004-11-30 -1.085822              NaN              NaN
2004-12-31 -0.462416              NaN              NaN
2005-01-31  2.422458         0.439354         0.439354
2005-02-28 -1.033706         2.308936         2.308936
2005-03-31 -0.020724         0.333981         0.333981
2005-04-30 -0.901237         1.810083         1.810083
2005-05-31 -0.890278         1.538757         1.538757
2005-06-30 -1.412531         1.416770         1.416770
2005-07-31  1.640020         1.903341         1.903341
2005-08-31  0.897491         2.001736         2.001736
2005-09-30 -0.690588         0.798006         0.798006
2005-10-31 -0.768929         1.276541         1.276541
2005-11-30 -1.618866         0.347229         0.347229
2005-12-31  0.160188        -0.046892        -0.046892
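
The steps described above (slice the first five years, aggregate per calendar month, then line the results up against 2005) can also be sketched with a single groupby and an index lookup. This is only an illustration under assumptions: the names `s`, `first5`, `month_range` and `range_2005` are made up here, a seeded generator replaces `np.random.randn`, and month-start dates (`'MS'`) stand in for the question's month-end dates.

```python
import numpy as np
import pandas as pd

# 72 months of random data (assumption: month-start dates, seeded for repeatability)
rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(72),
              index=pd.date_range('2000-01-01', periods=72, freq='MS'))

# First five years only
first5 = s.loc['2000':'2004']

# One row per calendar month (1..12) with columns 'max' and 'min'
month_range = first5.groupby(first5.index.month).agg(['max', 'min'])

# Look up each 2005 date's month in that table, then restore the date index
year6 = s.loc['2005']
range_2005 = month_range.reindex(year6.index.month)
range_2005.index = year6.index

# Rows before 2005 get NaN in the 'max' and 'min' columns
result = pd.concat([s.rename('data'), range_2005], axis=1)
print(result.tail(15))
```

Using `agg(['max', 'min'])` computes both bounds in one pass, so the max and min columns cannot drift apart through a copy-paste slip.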


python pandas

user3055920 Feb 11 '14 at 12:30


1 answer

horatio · Accepted Answer · 2014-02-11T15:04:12+0000

What about:

import numpy as np
import pandas as pd
import seaborn as sns  # optional: not referenced below

df = pd.Series(np.random.randn(72), index=pd.date_range('1/1/2000', periods=72, freq='M'))

# Group by calendar month; transform returns values aligned to the original index
grouped = df.groupby(df.index.month)
mnthmax, mnthmin = grouped.transform('max'), grouped.transform('min')
df2 = pd.concat([df, mnthmax, mnthmin], axis=1)
df2.columns = ['data', 'max', 'min']

# Select the rows for one year with .loc, then plot all three columns
df2.loc['2001'].plot()

This plots the data together with its per-month max and min, for example for 2001.
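
The reason `transform` fits here is that, unlike an aggregation such as `.max()`, it returns a result aligned to the original index, so it can be concatenated straight back onto the data. A minimal sketch with a small hand-made series follows; the values and the names `s`, `by_month`, `agg_max` and `aligned_max` are illustrative only. Note that the answer groups the full six-year series, so its per-month max and min also include the final year; grouping a five-year slice instead would match the question more closely.

```python
import pandas as pd

# Two years with two months each, so the groups are easy to read
idx = pd.to_datetime(['2000-01-01', '2000-02-01', '2001-01-01', '2001-02-01'])
s = pd.Series([1.0, 5.0, 3.0, 2.0], index=idx)

by_month = s.groupby(s.index.month)

agg_max = by_month.max()                  # one value per month (index 1, 2)
aligned_max = by_month.transform('max')   # one value per original row

print(agg_max.tolist())      # [3.0, 5.0]
print(aligned_max.tolist())  # [3.0, 5.0, 3.0, 5.0]
```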
