Pandas time series interpolation using cubic spline

I would like to fill in the missing values in a column of my DataFrame using a cubic spline. If I were to export the column to a list, I could use the scipy function interp1d and apply it to the missing values.

Is there any way to use this function inside pandas?

1 answer

Most numpy / scipy functions only require their arguments to be "array_like", and interp1d is no exception. Fortunately, both Series and DataFrame are "array_like", so we do not need to leave pandas:

import pandas as pd
import numpy as np
from scipy.interpolate import interp1d

df = pd.DataFrame([np.arange(1, 6), [1, 8, 27, np.nan, 125]]).T

In [5]: df
Out[5]: 
   0    1
0  1    1
1  2    8
2  3   27
3  4  NaN
4  5  125

df2 = df.dropna() # interpolate on the non nan
f = interp1d(df2[0], df2[1], kind='cubic')
#f(4) == array(63.9999999999992)

df[1] = f(df[0]) # interp1d is vectorized, so evaluate it on the whole column at once

In [10]: df
Out[10]: 
   0    1
0  1    1
1  2    8
2  3   27
3  4   64
4  5  125
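
The approach above overwrites the whole column with spline values. As a minimal sketch of two variations (the column names "x" and "y" are hypothetical, chosen only for this example): pandas can forward the request to scipy itself through Series.interpolate(method="cubic"), which uses the index as the x values and needs scipy installed, or the interp1d result can be written back only into the rows that are actually missing, so the known values stay untouched.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# Same data as above, with hypothetical column names "x" and "y".
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "y": [1.0, 8.0, 27.0, np.nan, 125.0]})

# Option 1: let pandas call scipy; the index supplies the x values.
via_pandas = df.set_index("x")["y"].interpolate(method="cubic")

# Option 2: build the spline from the known rows and fill only the gaps.
known = df.dropna(subset=["y"])
f = interp1d(known["x"], known["y"], kind="cubic")
missing = df["y"].isna()
df.loc[missing, "y"] = f(df.loc[missing, "x"])

print(via_pandas)
print(df)
```

Note that interp1d raises a ValueError by default when asked for a point outside the range of the known x values, so a gap at the very start or end of the column would need separate handling (for example fill_value="extrapolate").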

