I created a DataFrame using Pandas with 6 year monthly data. Then I created a DataFrame with only the first 5 years of data. I returned the maximum values and minimum monthly values for 5 years for each month (January-December).
This means that I can build the previous five-year range this year.
Here is how I did it below, but it is a bit detailed. I would like any suggestion to make it cleaner.
DF = pd.Series(np.random.randn(72), index=pd.date_range('1/1/2000', periods=72, freq='M'))
DF5y = DF['2000':'2004']
by = lambda x: lambda y: getattr(y, x)
Max5y = DF5y.groupby([by('month')]).max()
Min5y = DF5y.groupby([by('month')]).max()
MaxbyMonthDates = pd.Series(Max5y.values, index=pd.date_range('1/1/2005', periods=12, freq='M'))
MinbyMonthDates = pd.Series(Min5y.values, index=pd.date_range('1/1/2005', periods=12, freq='M'))
keys1 = ['DF', 'MaxbyMonthDates' , 'MinbyMonthDates']
DF_5yr_Range = pd.concat([DF, MaxbyMonthDates, MinbyMonthDates], axis=1, keys=keys1)
DF_5yr_Range.tail(15)
Out[13]:
DF MaxbyMonthDates MinbyMonthDates
2004-10-31 -0.154463 NaN NaN
2004-11-30 -1.085822 NaN NaN
2004-12-31 -0.462416 NaN NaN
2005-01-31 2.422458 0.439354 0.439354
2005-02-28 -1.033706 2.308936 2.308936
2005-03-31 -0.020724 0.333981 0.333981
2005-04-30 -0.901237 1.810083 1.810083
2005-05-31 -0.890278 1.538757 1.538757
2005-06-30 -1.412531 1.416770 1.416770
2005-07-31 1.640020 1.903341 1.903341
2005-08-31 0.897491 2.001736 2.001736
2005-09-30 -0.690588 0.798006 0.798006
2005-10-31 -0.768929 1.276541 1.276541
2005-11-30 -1.618866 0.347229 0.347229
2005-12-31 0.160188 -0.046892 -0.046892
source
share