Filtering a pandas DataFrame: using a list of conditions

Problem description

I have a pandas DataFrame with two columns, 'col1' and 'col2'.

I can filter on particular values of those two columns using:

df[ (df["col1"]=='foo') & (df["col2"]=='bar')]

Is there any way to filter both columns at once?

I naively tried restricting the DataFrame to those two columns, but my best guess for the right-hand side of the comparison doesn't work:

df[df[["col1","col2"]]==['foo','bar']]

This raises the following error:

ValueError: Invalid broadcasting comparison [['foo', 'bar']] with block values

I need this because both the names of the columns and the number of columns the conditions apply to will vary.

Recommended answer

To the best of my knowledge, pandas has no built-in way to do exactly what you want. However, although the following solution may not be the prettiest, you can zip a pair of parallel lists as follows:

cols = ['col1', 'col2']
conditions = ['foo', 'bar']

df[eval(" & ".join(["(df['{0}'] == '{1}')".format(col, cond)
   for col, cond in zip(cols, conditions)]))]

The string join produces the following:

>>> " & ".join(["(df['{0}'] == '{1}')".format(col, cond)
    for col, cond in zip(cols, conditions)])

"(df['col1'] == 'foo') & (df['col2'] == 'bar')"

You then evaluate that string with eval, which is effectively:

df[eval("(df['col1'] == 'foo') & (df['col2'] == 'bar')")]

For example:

import pandas as pd

df = pd.DataFrame({'col1': ['foo', 'bar', 'baz'], 'col2': ['bar', 'spam', 'ham']})

>>> df
  col1  col2
0  foo   bar
1  bar  spam
2  baz   ham

>>> df[eval(" & ".join(["(df['{0}'] == {1})".format(col, repr(cond))
            for col, cond in zip(cols, conditions)]))]
  col1 col2
0  foo  bar
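
The same row selection can also be built without constructing and evaluating a string. The following is a minimal sketch, not part of the original answer: it assumes every condition is a plain equality test, and the helper name filter_equal is hypothetical. It builds one boolean mask per column and combines the masks with &, so it works for any number of columns.

```python
from functools import reduce
import operator

import pandas as pd


def filter_equal(df, cols, values):
    """Return the rows of df where df[col] == value for every (col, value) pair.

    Hypothetical helper for illustration; it only handles equality conditions.
    """
    # One boolean Series per column/value pair.
    masks = [df[col] == value for col, value in zip(cols, values)]
    # Start from an all-True mask so that an empty list of conditions keeps every row.
    all_true = pd.Series(True, index=df.index)
    # Combine the masks with element-wise AND.
    mask = reduce(operator.and_, masks, all_true)
    return df[mask]


df = pd.DataFrame({'col1': ['foo', 'bar', 'baz'], 'col2': ['bar', 'spam', 'ham']})
cols = ['col1', 'col2']
conditions = ['foo', 'bar']

result = filter_equal(df, cols, conditions)
print(result)
```

Because the column names and values are passed as data rather than formatted into source code, values containing quotes need no special handling, and non-string values such as numbers work the same way.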


08-24 06:52