Problem description
I'm trying to select all cells in a pandas DataFrame that meet a certain criterion, but only in rows where a specific column also meets a separate criterion.
Given the following DataFrame:
     A  B  C  D
1/1  0  1  0  1
1/2  2  1  1  1
1/3  3  0  1  0
1/4  1  0  1  2
1/5  1  0  1  1
1/6  2  0  2  1
1/7  3  5  2  3
I would like to select the data where a column is greater than its previous value, in rows where D is also > 1. The syntax I'm currently trying to use is:
matches = df[(df > df.shift(1)) & (df.D > 1)]
However, when I do this, I receive the following error:
Note: the error is a direct copy and paste from my actual code, so the description and the shape in the error do not correspond directly to my example DataFrame.
I know that df.D > 1 is causing the problem, and that comparing columns directly to D is valid (df > df.D, for example). What is wrong with my syntax when trying to compare D to the value 1, and how could I accomplish this?
Recommended answer
This should work directly, but pandas doesn't have a broadcasting "and" operator (this will happen in 0.14). Here's a workaround.
In [74]: df
Out[74]:
     A  B  C  D
1/1  0  1  0  1
1/2  2  1  1  1
1/3  3  0  1  0
1/4  1  0  1  2
1/5  1  0  1  1
1/6  2  0  2  1
1/7  3  5  2  3
This is a where operation; it essentially puts np.nan wherever the condition is False.
In [78]: x = df[df>df.shift(1)]
In [79]: x
Out[79]:
       A    B    C    D
1/1  NaN  NaN  NaN  NaN
1/2    2  NaN    1  NaN
1/3    3  NaN  NaN  NaN
1/4  NaN  NaN  NaN    2
1/5  NaN  NaN  NaN  NaN
1/6    2  NaN    2  NaN
1/7    3    5  NaN    3
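To make the "where operation" remark concrete, here is a minimal sketch that rebuilds the example frame and checks that indexing with a boolean DataFrame gives the same result as calling DataFrame.where explicitly. The variable names (cond, x_index, x_where) are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

# True where a cell is greater than the cell directly above it.
# The first row compares against NaN, so it is False everywhere.
cond = df > df.shift(1)

# Indexing with a boolean DataFrame keeps the shape and fills
# non-matching cells with NaN ...
x_index = df[cond]

# ... which is the same thing DataFrame.where does explicitly.
x_where = df.where(cond)

print(x_where)
```

Because NaN is a float, the integer columns are upcast to float in the masked result.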
Then select rows based on the second condition:
In [80]: x[df.D>1]
Out[80]:
       A    B    C    D
1/4  NaN  NaN  NaN    2
1/7    3    5  NaN    3
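The two steps above can be put together as a self-contained script. The second half is an alternative that is not part of the original answer: it broadcasts the row condition across the columns with NumPy so that both conditions are combined into a single mask. That variant keeps all rows, so it is only a sketch of one possible approach; the names row_mask, mask, and full are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

# Step 1 (from the answer): mask cells that did not increase.
x = df[df > df.shift(1)]

# Step 2 (from the answer): keep only the rows where D > 1.
result = x[df["D"] > 1]
print(result)

# Alternative (an assumption, not from the answer): combine both
# conditions into one 2-D mask by broadcasting the row condition
# across the columns with NumPy.
row_mask = (df["D"] > 1).to_numpy()[:, None]   # shape (7, 1)
mask = (df > df.shift(1)).to_numpy() & row_mask  # shape (7, 4)

# Every row is kept; rows that fail the D > 1 test become all NaN.
full = df.where(mask)
print(full.dropna(how="all"))
```

The two-step form drops the non-matching rows, while the broadcast form preserves the original shape, which can be convenient when the result has to stay aligned with df.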