Problem description
Suppose I have a DataFrame like this (a very minimal example):
  col1  col2
0    a     1
1    a     2
2    b     1
3    b     2
4    b     4
5    c     1
6    c     2
7    c     3
I want the intersection of the col2 values across the unique col1 values, that is, the values in col2 that exist for every unique value in col1 (in this case, the intersection would be [1, 2]). How can I do this with Pandas?
My (bad) solution was to get the unique col1 elements with unique, build a dictionary mapping each unique col1 element to its col2 values, and then take the set intersection of those dictionary values. I feel like there is a mechanism for relating the columns to each other that I should be using, and that it could make this much easier.
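For reference, the manual approach described above could look roughly like the following. This is only a sketch of one possible reading of it: the names values_by_key and common are made up for illustration, and the DataFrame is rebuilt from the example table.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b", "c", "c", "c"],
    "col2": [1, 2, 1, 2, 4, 1, 2, 3],
})

# Map each unique col1 value to the set of col2 values that appear with it.
# (values_by_key is a hypothetical name, not from the original post.)
values_by_key = {
    key: set(df.loc[df["col1"] == key, "col2"].tolist())
    for key in df["col1"].unique()
}

# Intersect the sets; assumes the DataFrame has at least one row.
common = sorted(set.intersection(*values_by_key.values()))
print(common)  # [1, 2]
```

It works, but it scans the frame once per unique col1 value, which is part of why it feels clumsy.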
Recommended answer
One way is to use pivot_table:
In [11]: cross = df.pivot_table(index="col1", columns="col2", aggfunc='size') == 1
In [12]: cross
Out[12]:
col2     1     2      3      4
col1
a     True  True  False  False
b     True  True  False   True
c     True  True   True  False
In [13]: cross.all()
Out[13]:
col2
1 True
2 True
3 False
4 False
dtype: bool
In [14]: cross.columns[cross.all()]
Out[14]: Int64Index([1, 2], dtype='int64', name='col2')
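Note that the == 1 comparison relies on each (col1, col2) pair appearing exactly once, since aggfunc='size' counts rows per cell and a duplicated pair would give 2. As a hedged variant that does not depend on that, the sketch below swaps in pd.crosstab with a > 0 test, and also shows a groupby/nunique alternative. Neither is from the original answer, and the names present, common, counts and common_via_groupby are illustrative only.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b", "c", "c", "c"],
    "col2": [1, 2, 1, 2, 4, 1, 2, 3],
})

# Presence table: rows are col1 groups, columns are col2 values.
# crosstab fills missing pairs with 0, so "> 0" means "pair occurs at least once".
present = pd.crosstab(df["col1"], df["col2"]) > 0

# Keep the col2 values whose column is True for every col1 group.
common = present.columns[present.all().to_numpy()].tolist()
print(common)  # [1, 2]

# Alternative: a col2 value is common if it is seen with every distinct col1 value.
n_groups = df["col1"].nunique()
counts = df.groupby("col2")["col1"].nunique()
common_via_groupby = counts[counts == n_groups].index.tolist()
print(common_via_groupby)  # [1, 2]
```

The groupby version avoids building the full col1 by col2 table, which may matter when col2 has many distinct values.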