This article shows how to find, in a vectorized way, the rows of one DataFrame whose values match a subset of the rows in another DataFrame.

Problem description

How can I match the values in this DataFrame, source:

     car_id     lat     lon
0    100        10.0    15.0
1    100        12.0    10.0
2    100        09.0    08.0
3    110        23.0    12.0
4    110        18.0    32.0
5    110        21.0    16.0
6    110        12.0    02.0

and keep only the rows whose coordinates appear in this second DataFrame, coords:

     lat     lon
0    12.0    10.0
1    23.0    12.0
3    18.0    32.0

so that the resulting DataFrame, result, is:

     car_id     lat     lon
1    100        12.0    10.0
3    110        23.0    12.0
4    110        18.0    32.0

I can do this iteratively with apply, but I'm looking for a vectorized way. I tried the following with isin(), without success:

result = source[source[['lat', 'lon']].isin({
    'lat': coords['lat'],
    'lon': coords['lon']
})]

The code above raises:

ValueError: ('operands could not be broadcast together with shapes (53103,) (53103,2)

Recommended answer

By default, DataFrame.merge() joins on all columns that have the same name in both DataFrames (the intersection of their columns) and performs an inner join, so only rows whose lat and lon appear in both DataFrames are kept:

In [197]: source.merge(coords)
Out[197]:
   car_id   lat   lon
0     100  12.0  10.0
1     110  23.0  12.0
2     110  18.0  32.0
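
Note that the merged output above has a fresh 0..2 index, whereas the result requested in the question keeps the original labels 1, 3, 4. The following is a minimal, self-contained sketch of two possible ways to keep them, assuming the sample data from the tables above typed in by hand. The variable names merged, result, mask and result_mask are illustrative and not part of the original answer, and the MultiIndex-based mask is an alternative technique added here, not the answerer's method.

```python
import pandas as pd

# Sample data copied from the question's tables (coords keeps its 0, 1, 3 labels).
source = pd.DataFrame({
    "car_id": [100, 100, 100, 110, 110, 110, 110],
    "lat": [10.0, 12.0, 9.0, 23.0, 18.0, 21.0, 12.0],
    "lon": [15.0, 10.0, 8.0, 12.0, 32.0, 16.0, 2.0],
})
coords = pd.DataFrame(
    {"lat": [12.0, 23.0, 18.0], "lon": [10.0, 12.0, 32.0]},
    index=[0, 1, 3],
)

# Same as source.merge(coords), with the defaults spelled out.
# The result is relabelled 0..n-1.
merged = source.merge(coords, on=["lat", "lon"], how="inner")

# Option 1: carry source's index through the merge as a regular column,
# then restore it as the index afterwards.
result = (
    source.reset_index()
    .merge(coords, on=["lat", "lon"], how="inner")
    .set_index("index")
    .rename_axis(None)
)

# Option 2: build a row-wise boolean mask by comparing (lat, lon) pairs
# as MultiIndex entries; filtering with a mask leaves the index untouched.
mask = pd.MultiIndex.from_frame(source[["lat", "lon"]]).isin(
    pd.MultiIndex.from_frame(coords[["lat", "lon"]])
)
result_mask = source[mask]

print(result)
print(result_mask)
```

One design point to keep in mind: merge repeats a source row once for every matching row in coords, so if coords might contain duplicate (lat, lon) pairs, calling coords.drop_duplicates() before merging avoids duplicated rows in the result. The mask-based variant is not affected by duplicates in coords.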

