How to filter a DataFrame by splitting a column's categories into sets?
Question
I have a DataFrame:
Prop_ID Unit_ID Prop_Usage Unit_Usage
1 1 RESIDENTIAL RESIDENTIAL
1 2 RESIDENTIAL COMMERCIAL
1 3 RESIDENTIAL INDUSTRIAL
1 4 RESIDENTIAL RESIDENTIAL
2 1 COMMERCIAL RESIDENTIAL
2 2 COMMERCIAL COMMERCIAL
2 3 COMMERCIAL COMMERCIAL
3 1 INDUSTRIAL INDUSTRIAL
3 2 INDUSTRIAL COMMERCIAL
4 1 RESIDENTIAL - COMMERCIAL RESIDENTIAL
4 2 RESIDENTIAL - COMMERCIAL COMMERCIAL
4 3 RESIDENTIAL - COMMERCIAL INDUSTRIAL
5 1 COMMERCIAL / RESIDENTIAL RESIDENTIAL
5 2 COMMERCIAL / RESIDENTIAL COMMERCIAL
5 3 COMMERCIAL / RESIDENTIAL INDUSTRIAL
5 4 COMMERCIAL / RESIDENTIAL COMMERCIAL
One property may have more than one unit, so units are a subcategory of properties. I want to keep only the rows where Prop_Usage does not match Unit_Usage. When Prop_Usage holds a combined category such as RESIDENTIAL - COMMERCIAL, a Unit_Usage of either RESIDENTIAL or COMMERCIAL counts as a match. The same applies to COMMERCIAL / RESIDENTIAL.
Expected output:
Prop_ID Unit_ID Prop_Usage Unit_Usage
1 2 RESIDENTIAL COMMERCIAL
1 3 RESIDENTIAL INDUSTRIAL
2 1 COMMERCIAL RESIDENTIAL
3 2 INDUSTRIAL COMMERCIAL
4 3 RESIDENTIAL - COMMERCIAL INDUSTRIAL
5 3 COMMERCIAL / RESIDENTIAL INDUSTRIAL
Recommended answer
Use the in operator inside DataFrame.apply, then invert the result with ~ to keep the rows that do not match:
df = df[~df.apply(lambda x: x['Unit_Usage'] in x['Prop_Usage'], axis=1)]
Or use zip in a list comprehension:
df = df[[a not in b for a, b in zip(df['Unit_Usage'], df['Prop_Usage'])]]
print(df)
Prop_ID Unit_ID Prop_Usage Unit_Usage
1 1 2 RESIDENTIAL COMMERCIAL
2 1 3 RESIDENTIAL INDUSTRIAL
4 2 1 COMMERCIAL RESIDENTIAL
8 3 2 INDUSTRIAL COMMERCIAL
11 4 3 RESIDENTIAL - COMMERCIAL INDUSTRIAL
14 5 3 COMMERCIAL / RESIDENTIAL INDUSTRIAL
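Note that both snippets above rely on a substring test (Unit_Usage in Prop_Usage), not on actual sets. That is enough for the sample data, but it would also treat a unit usage as matching whenever it happens to be a substring of a longer category name. A minimal sketch of the set-based variant the title asks about follows. It assumes that "-" and "/" are the only separators and that no category name contains either character; the helper name usage_set is hypothetical and not part of the original answer.

```python
import re
import pandas as pd

# Rebuild the sample data from the question.
rows = [
    (1, 1, "RESIDENTIAL", "RESIDENTIAL"),
    (1, 2, "RESIDENTIAL", "COMMERCIAL"),
    (1, 3, "RESIDENTIAL", "INDUSTRIAL"),
    (1, 4, "RESIDENTIAL", "RESIDENTIAL"),
    (2, 1, "COMMERCIAL", "RESIDENTIAL"),
    (2, 2, "COMMERCIAL", "COMMERCIAL"),
    (2, 3, "COMMERCIAL", "COMMERCIAL"),
    (3, 1, "INDUSTRIAL", "INDUSTRIAL"),
    (3, 2, "INDUSTRIAL", "COMMERCIAL"),
    (4, 1, "RESIDENTIAL - COMMERCIAL", "RESIDENTIAL"),
    (4, 2, "RESIDENTIAL - COMMERCIAL", "COMMERCIAL"),
    (4, 3, "RESIDENTIAL - COMMERCIAL", "INDUSTRIAL"),
    (5, 1, "COMMERCIAL / RESIDENTIAL", "RESIDENTIAL"),
    (5, 2, "COMMERCIAL / RESIDENTIAL", "COMMERCIAL"),
    (5, 3, "COMMERCIAL / RESIDENTIAL", "INDUSTRIAL"),
    (5, 4, "COMMERCIAL / RESIDENTIAL", "COMMERCIAL"),
]
df = pd.DataFrame(rows, columns=["Prop_ID", "Unit_ID", "Prop_Usage", "Unit_Usage"])


def usage_set(prop_usage):
    """Split a Prop_Usage value into a set of individual categories.

    Assumption: categories are separated by "-" or "/", with optional
    whitespace around the separator.
    """
    return set(re.split(r"\s*[-/]\s*", prop_usage.strip()))


# Keep a row only when its Unit_Usage is not one of the property's categories.
mask = [
    unit not in usage_set(prop)
    for unit, prop in zip(df["Unit_Usage"], df["Prop_Usage"])
]
result = df[mask]
print(result)
```

On the sample data this selects the same six rows as the substring approach; the two only differ when one category name is contained in another.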