Interpolating a time series in pandas with a cubic spline

Question

I would like to fill gaps in a column of my DataFrame using a cubic spline. If I exported the column to a list, I could use scipy's interp1d function and apply it to the missing values.

Is there a way to use this function inside pandas?

Recommended answer

Most numpy/scipy functions only require their arguments to be "array_like", and interp1d is no exception. Fortunately, both Series and DataFrame are "array_like", so we don't need to leave pandas:

import pandas as pd
import numpy as np
from scipy.interpolate import interp1d

df = pd.DataFrame([np.arange(1, 6), [1, 8, 27, np.nan, 125]]).T

In [5]: df
Out[5]:
   0    1
0  1    1
1  2    8
2  3   27
3  4  NaN
4  5  125

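# Fit the spline only on the rows that contain no NaN
df2 = df.dropna()
f = interp1d(df2[0], df2[1], kind='cubic')
# f(4) is approximately 64, e.g. array(63.9999999999992)

# interp1d accepts array_like input, so the whole column can be passed at once
df[1] = f(df[0])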

In [10]: df
Out[10]:
   0    1
0  1    1
1  2    8
2  3   27
3  4   64
4  5  125
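
The snippet above recomputes every value in column 1, not just the gap. As a minimal sketch (the names `known`, `filled` and `builtin` are made up for illustration), the following fills only the missing row and, as a cross-check, also uses pandas' own `Series.interpolate(method="cubic")`, which delegates to scipy and takes its x values from the index:

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# Same toy data as above: x in column 0, x**3 with one gap in column 1
df = pd.DataFrame({0: np.arange(1.0, 6.0), 1: [1.0, 8.0, 27.0, np.nan, 125.0]})

# Option 1: fit on the known rows, then overwrite only the missing rows
known = df[1].notna()
f = interp1d(df.loc[known, 0], df.loc[known, 1], kind="cubic")
filled = df[1].copy()
filled.loc[~known] = f(df.loc[~known, 0])

# Option 2: let pandas call scipy; the index supplies the x values
builtin = df.set_index(0)[1].interpolate(method="cubic")

print(filled.tolist())
print(builtin.tolist())
```

Both approaches should give a value very close to 64 for the gap and leave the known values untouched. Option 2 needs scipy installed and assumes the index holds the x coordinates you want to interpolate over.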

Note: I couldn't think of an example, off the top of my head, that passes a DataFrame as the second argument (y), but that ought to work too.
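
One possible illustration of passing a DataFrame as y, offered as a sketch and not as the answerer's own example: the column names `cube` and `square` are invented here, and `axis=0` is set because interp1d interpolates along the last axis by default, while a DataFrame's rows are what line up with x.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

x = pd.Series([1.0, 2.0, 3.0, 5.0])
# Hypothetical two-column y: cubes and squares of x
y = pd.DataFrame({"cube": x ** 3, "square": x ** 2})

# axis=0 tells interp1d that the rows of y correspond to the entries of x
f = interp1d(x, y, axis=0, kind="cubic")

# Evaluating at a single point returns one value per column
row = pd.Series(f(4.0), index=y.columns)
print(row)
```

The result of `f` is a plain NumPy array, so the column labels have to be reattached by hand, as done for `row` here.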


09-05 11:08