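As you can see, an inner merge drops rows from both frames because some keys don't match. What I want is to record the number of unmatched entries in the left frame and in the right frame, and I don't know how to do that.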
Left frame
   key left_value
0    0          a
1    1          b
2    2          c
3    3          d
4    4          e
Right frame
   key right_value
0    2           f
1    3           g
2    4           h
3    5           i
4    6           j
pd.merge(left_frame, right_frame, on='key', how='inner')
**Expected output 1:**
   key left_value right_value
0    2          c           f
1    3          d           g
2    4          e           h
**Expected output 2:**
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
5    5        NaN           i  right_only
6    6        NaN           j  right_only
So basically I want two DataFrames: one with the 'inner' matches and another with the unmatched rows.
Best answer
If you change the merge type to 'outer' and pass indicator=True, you can see which frame each unmatched row came from:
In [193]:
pd.merge(left, right, how='outer', indicator=True)
Out[193]:
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
2    2          c           f        both
3    3          d           g        both
4    4          e           h        both
5    5        NaN           i  right_only
6    6        NaN           j  right_only
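To reproduce the output above, here is a minimal sketch that rebuilds both frames and runs the outer merge. The constructor calls are reconstructed from the printed tables in the question, not taken from the original post, and `on="key"` is passed explicitly (without it, pandas merges on the column names the two frames share, which is the same column here).

```python
import pandas as pd

# Frames rebuilt from the tables in the question (an assumption about how
# they were originally created).
left = pd.DataFrame({"key": range(5), "left_value": list("abcde")})
right = pd.DataFrame({"key": range(2, 7), "right_value": list("fghij")})

# An outer merge keeps every key from both sides; indicator=True adds a
# `_merge` column saying whether a row is left_only, right_only, or both.
merged = pd.merge(left, right, on="key", how="outer", indicator=True)
print(merged)
```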
You can groupby on this column and call count:
In [194]:
pd.merge(left, right, how='outer', indicator=True).groupby('_merge').count()
Out[194]:
            key  left_value  right_value
_merge
left_only     2           2            0
right_only    2           0            2
both          3           3            3
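If you only need the number of unmatched entries per side, counting the `_merge` column directly may be simpler than the groupby, because it gives one number per category instead of one per column. A sketch, assuming the same frames as in the question (the variable names `n_left_only` and `n_right_only` are mine):

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
left = pd.DataFrame({"key": range(5), "left_value": list("abcde")})
right = pd.DataFrame({"key": range(2, 7), "right_value": list("fghij")})

merged = pd.merge(left, right, on="key", how="outer", indicator=True)

# value_counts() tallies how many rows fall into each `_merge` category.
counts = merged["_merge"].value_counts()
n_left_only = int(counts["left_only"])    # keys present only in `left`
n_right_only = int(counts["right_only"])  # keys present only in `right`
print(n_left_only, n_right_only)
```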
If you want to filter and save the results:
In [198]:
merged = pd.merge(left, right, how='outer', indicator=True)
merged
Out[198]:
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
2    2          c           f        both
3    3          d           g        both
4    4          e           h        both
5    5        NaN           i  right_only
6    6        NaN           j  right_only
In [199]:
both = merged[merged['_merge'] == 'both']
both
Out[199]:
   key left_value right_value _merge
2    2          c           f   both
3    3          d           g   both
4    4          e           h   both
In [200]:
other = merged[merged['_merge'] != 'both']
other
Out[200]:
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
5    5        NaN           i  right_only
6    6        NaN           j  right_only
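If "save" means writing the two results to disk, something along these lines should work. This is only a sketch: the helper name `split_matches`, the file names, and the temporary output directory are all hypothetical, and CSV is just one possible format.

```python
import tempfile
from pathlib import Path

import pandas as pd


def split_matches(left, right, key):
    """Return (matched, unmatched) frames from an outer merge on `key`."""
    merged = pd.merge(left, right, on=key, how="outer", indicator=True)
    is_both = merged["_merge"] == "both"
    # Matched rows no longer need the indicator column.
    matched = merged[is_both].drop(columns="_merge").reset_index(drop=True)
    # Unmatched rows keep `_merge` so you can tell which side they came from.
    unmatched = merged[~is_both].reset_index(drop=True)
    return matched, unmatched


# Frames rebuilt from the tables in the question.
left = pd.DataFrame({"key": range(5), "left_value": list("abcde")})
right = pd.DataFrame({"key": range(2, 7), "right_value": list("fghij")})

matched, unmatched = split_matches(left, right, "key")

# Hypothetical output location; replace with a real directory as needed.
out_dir = Path(tempfile.mkdtemp())
matched.to_csv(out_dir / "matched.csv", index=False)
unmatched.to_csv(out_dir / "unmatched.csv", index=False)
```

Resetting the index is optional; I do it here so each saved frame starts at 0 rather than carrying over the row labels of the merged frame.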
This question, "python - Record (save) the mismatched entries of dataset_a and dataset_b using Python Pandas", comes from Stack Overflow: https://stackoverflow.com/questions/35800790/