Python pandas: looping over values conditioned on two columns

In my DataFrame "data", I have two columns, "trend" and "rtrend".
trend takes the values -1, 0, and 1.

def newfunc(a):
    j = -1
    for i in a:
        j = j+1
        x = (j-1)
        if data.iloc[j]['trend'] != 0:
            return data.iloc[j]['trend']
        if data.iloc[j]['trend'] == 0:
            return data.iloc[x]['rtrend']

If trend equals -1 or 1, I want to set the rtrend column value equal to trend.

If trend equals 0, I want to set rtrend to the last value of that series that appears above it in the DataFrame.

data['rtrend'] = newfunc(data['trend'])

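It currently returns 0 for every value in the series.

Could someone please point me in the right direction? I'm sure there must be a better way to do this. (I tried np.where(), but it didn't seem to do what I wanted.)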

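Best answer

Don't use a slow, procedural for loop. Use a vectorized approach instead. Just copy the nonzero data into the new rtrend column, then forward-fill the data: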

df['rtrend'] = df[df.trend!=0]['trend']

df
Out[21]:
   trend    b    c  rtrend
a   -1.0  1.0 -1.0    -1.0
c    0.0 -1.0  1.0     NaN
e    1.0 -1.0 -1.0     1.0
f   -1.0  1.0 -1.0    -1.0
h   -1.0  1.0  1.0    -1.0

df['rtrend'].ffill()
Out[22]:
a   -1.0
c   -1.0
e    1.0
f   -1.0
h   -1.0
Name: rtrend, dtype: float64
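
For reference, the two steps above can probably be combined into a single assignment. This is only a minimal sketch: the sample values are made up for illustration (only the trend column name comes from the question), and it assumes rtrend should be derived from trend alone. Note that the result comes back as floats because the intermediate NaN values force a float dtype.

```python
import pandas as pd

# Hypothetical sample data; only the 'trend' column name is from the question.
df = pd.DataFrame({"trend": [-1, 0, 1, 0, 0, -1]})

# Series.where keeps values where the condition is True and puts NaN elsewhere,
# so zeros become NaN; ffill then carries the last nonzero value forward.
df["rtrend"] = df["trend"].where(df["trend"] != 0).ffill()

print(df)
```

If the very first row has trend equal to 0, there is nothing above it to fill from, so that rtrend value stays NaN. Also note that the original newfunc returns on the first pass through the loop, so it yields a single scalar, which pandas then assigns to every row of the column.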

Regarding "Python pandas: looping over values conditioned on two columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41786349/