Problem description
I have a pandas DataFrame with two columns: 'col1' and 'col2'.
I can filter on specific values of those two columns using:
df[ (df["col1"]=='foo') & (df["col2"]=='bar')]
Is there any way to filter both columns at once?
I naively tried restricting the DataFrame to the two columns, but my best guesses for the right-hand side of the equality don't work:
df[df[["col1","col2"]]==['foo','bar']]
This produces the following error:
ValueError: Invalid broadcasting comparison [['foo', 'bar']] with block values
I need this because both the names of the columns and the number of columns on which the condition is set will vary.
Recommended answer
To the best of my knowledge, there is no way in Pandas to do exactly what you want. However, although the following solution may not be the prettiest, you can zip a set of parallel lists as follows:
cols = ['col1', 'col2']
conditions = ['foo', 'bar']
df[eval(" & ".join(["(df['{0}'] == '{1}')".format(col, cond)
                    for col, cond in zip(cols, conditions)]))]
The string join results in the following:
>>> " & ".join(["(df['{0}'] == '{1}')".format(col, cond)
...             for col, cond in zip(cols, conditions)])
"(df['col1'] == 'foo') & (df['col2'] == 'bar')"
You then pass this string to eval, which effectively evaluates:
df[eval("(df['col1'] == 'foo') & (df['col2'] == 'bar')")]
For example:
>>> import pandas as pd
>>> df = pd.DataFrame({'col1': ['foo', 'bar', 'baz'], 'col2': ['bar', 'spam', 'ham']})
>>> df
  col1  col2
0  foo   bar
1  bar  spam
2  baz   ham
>>> df[eval(" & ".join(["(df['{0}'] == {1})".format(col, repr(cond))
...                     for col, cond in zip(cols, conditions)]))]
  col1 col2
0  foo  bar
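The same mask can also be built without going through a string and eval. The following is a minimal sketch of that variation rather than the approach given in the answer: it creates one boolean Series per (column, value) pair and combines them with functools.reduce and operator.and_. The helper name filter_equal is hypothetical, and the sketch assumes that cols and conditions have the same length and contain at least one entry.

```python
from functools import reduce
import operator

import pandas as pd


def filter_equal(df, cols, conditions):
    """Return the rows of df where each column in cols equals its paired value.

    Hypothetical helper; assumes cols and conditions are non-empty and
    have the same length.
    """
    # One boolean Series per (column, value) pair.
    masks = [df[col] == cond for col, cond in zip(cols, conditions)]
    # Combine the masks with element-wise AND, which is what the
    # " & ".join(...) string expresses before it is evaluated.
    return df[reduce(operator.and_, masks)]


df = pd.DataFrame({'col1': ['foo', 'bar', 'baz'],
                   'col2': ['bar', 'spam', 'ham']})
result = filter_equal(df, ['col1', 'col2'], ['foo', 'bar'])
print(result)
```

Because the values are compared directly instead of being formatted into source code, this version does not depend on how each condition is quoted, and it works unchanged when the column names or the number of columns vary.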