Rolling window ranking data in pandas DataFrame

Question

I am new to Python and the pandas library, so I apologize if this is a trivial question. I am trying to rank a time series over a rolling window of N days. I know there is a rank function, but it ranks the data over the entire time series. I can't seem to find a rolling rank function. Here is an example of what I'm trying to do:

           A

01-01-2013 100
02-01-2013 85
03-01-2013 110
04-01-2013 60
05-01-2013 20
06-01-2013 40

If I wanted to rank the data over a rolling window of 3 days, the answer should be:

           Ranked_A

01-01-2013 NaN
02-01-2013 NaN
03-01-2013 1
04-01-2013 3
05-01-2013 3
06-01-2013 2

Is there a built-in function in Python that can do this? Any suggestion? Many thanks.

+5

pandas time-series rank

Frankdr Jan 21 '13 at 13:55

2 answers

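You can write a custom function for a rolling window in pandas. Using numpy's argsort() in that function gives the rank of the latest value within the window: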

import pandas as pd
import StringIO

testdata = StringIO.StringIO("""
Date,A
01-01-2013,100
02-01-2013,85
03-01-2013,110
04-01-2013,60
05-01-2013,20
06-01-2013,40""")

df = pd.read_csv(testdata, header=True, index_col=['Date'])

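# argsort().argsort() gives each value's 0-based ascending rank in the window;
# subtracting the last value's rank from the window size makes 1 the largest.
rollrank = lambda data: data.size - data.argsort().argsort()[-1]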

df['rank'] = pd.rolling_apply(df, 3, rollrank)

print df

Output:

              A  rank
Date                 
01-01-2013  100   NaN
02-01-2013   85   NaN
03-01-2013  110     1
04-01-2013   60     3
05-01-2013   20     3
06-01-2013   40     2
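
For readers on Python 3 and a current pandas, where `pd.rolling_apply` and the `StringIO` module are no longer available, a minimal sketch of the same idea might look like the following. It assumes the `Series.rolling(...).apply(..., raw=True)` interface, so each window reaches the function as a NumPy array. The name `rollrank` is kept from the code above; `csv_text` is only an illustrative name.

```python
import io

import pandas as pd

# Same sample data as in the question.
csv_text = """Date,A
01-01-2013,100
02-01-2013,85
03-01-2013,110
04-01-2013,60
05-01-2013,20
06-01-2013,40
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="Date")


def rollrank(window):
    # With raw=True the window arrives as a NumPy array.
    # argsort().argsort() yields 0-based ascending ranks; the last element
    # is the newest value, and size - rank makes 1 the largest in the window.
    return window.size - window.argsort().argsort()[-1]


# Windows with fewer than 3 observations produce NaN.
df["rank"] = df["A"].rolling(3).apply(rollrank, raw=True)

print(df)
```

The resulting column is floating point because of the leading NaN values, so the ranks show up as 1.0, 3.0, 3.0, 2.0 rather than as integers.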

+2

Rutger Kassies Jan 21 '13 at 14:40

metakermit · Accepted Answer · 2013-01-21T15:52:58+0000

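If you want to use pandas' built-in rank method (with its extra options, such as ascending), you can wrap it in a small function: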

def rank(array):
    s = pd.Series(array)
    return s.rank(ascending=False)[len(s)-1]

This can then be passed to rolling_apply as the window function:

pd.rolling_apply(df['A'], 3, rank)

Date
01-01-2013   NaN
02-01-2013   NaN
03-01-2013     1
04-01-2013     3
05-01-2013     3
06-01-2013     2

(This assumes the df from Rutger's answer.)
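
On a current pandas the same wrapper idea can be sketched with `Series.rolling(...).apply(..., raw=False)`, which hands each window to the function as a Series so that `Series.rank` can be used directly. The helper name `rank_last` is hypothetical, and the data is rebuilt inline so the snippet runs on its own. Newer releases (1.4 and later, as far as I know) also expose a built-in `Rolling.rank`, so the sketch calls it only when it is present.

```python
import pandas as pd

# Same sample data as in the question, built inline.
df = pd.DataFrame(
    {"A": [100, 85, 110, 60, 20, 40]},
    index=pd.Index(
        ["01-01-2013", "02-01-2013", "03-01-2013",
         "04-01-2013", "05-01-2013", "06-01-2013"],
        name="Date",
    ),
)


def rank_last(window):
    # With raw=False the window is a pandas Series, so Series.rank and its
    # options (ascending, method, ...) are available. Return the rank of
    # the newest value in the window.
    return window.rank(ascending=False).iloc[-1]


roller = df["A"].rolling(3)
ranked = roller.apply(rank_last, raw=False)
print(ranked)

# Assumption: Rolling.rank only exists in newer pandas versions,
# so check for it before calling.
if hasattr(roller, "rank"):
    builtin = roller.rank(ascending=False)
    print(builtin)
```

Going through `Series.rank` costs more per window than the argsort approach, but ties are handled by the `method` option (average rank by default) instead of by sort order.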
