Speeding up distance calculations between latitude/longitude pairs

Problem description

I have two DataFrames, each containing LAT and LON columns. For every (LAT, LON) pair in the first DataFrame, I want to find the distance to ALL (LAT, LON) pairs in the other DataFrame and get the minimum. The package I am using is geopy. The code is as follows:

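from geopy import distance
import numpy as np

# df1 and df2 are pandas DataFrames with "LAT" and "LON" columns.
closestPoints = []  # for each row of df1, the position in df2 of the nearest point
count = 0
for id1, row1 in df1.iterrows():
    target = (row1["LAT"], row1["LON"])
    count = count + 1
    print(count)  # progress indicator
    distanceMiles = []
    for id2, row2 in df2.iterrows():
        point = (row2["LAT"], row2["LON"])
        distanceMiles.append(distance.distance(target, point).miles)

    # position of the smallest distance for this target
    closestPoints.append(np.argmin(distanceMiles))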

The problem is that df1 has 168K rows and df2 has 1,200 rows, so the nested loop computes roughly 200 million distances. How do I make it faster?

Recommended answer

Leaving this here in case anyone needs it in the future:

If you only need the minimum distance, you don't have to brute-force all the pairs. There are data structures that can help you solve this in O(n*log(n)) time, which is much faster than the brute-force method.

For example, you can use a generalized KNearestNeighbors algorithm (with k=1) to do exactly that, provided you account for your points lying on a sphere, not a plane. See this SO answer for an example implementation using sklearn.
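
As a rough illustration of the nearest-neighbour idea, here is a minimal sketch using sklearn's BallTree with the haversine metric. It is not the code from the linked answer. The function name nearest_points, the toy coordinates, and the Earth radius constant are assumptions made for this example, and haversine treats the Earth as a sphere, so distances will differ slightly from geopy's ellipsoidal results.

```python
import numpy as np
from sklearn.neighbors import BallTree

# Assumed mean Earth radius in miles (spherical approximation).
EARTH_RADIUS_MILES = 3958.8


def nearest_points(targets_deg, candidates_deg):
    """For each (lat, lon) in targets_deg, find the nearest row of candidates_deg.

    Both inputs are array-likes of shape (n, 2), in degrees, ordered (lat, lon).
    Returns (positions into candidates_deg, great-circle distances in miles).
    """
    # The haversine metric expects (lat, lon) in radians.
    targets = np.radians(np.asarray(targets_deg, dtype=float))
    candidates = np.radians(np.asarray(candidates_deg, dtype=float))

    # Build the tree once on the smaller set, then query all targets in one call.
    tree = BallTree(candidates, metric="haversine")
    dist_rad, idx = tree.query(targets, k=1)

    # Distances come back in radians on the unit sphere; scale to miles.
    return idx[:, 0], dist_rad[:, 0] * EARTH_RADIUS_MILES


# Toy data: New York, Los Angeles, Chicago as candidates.
candidates = [(40.7128, -74.0060), (34.0522, -118.2437), (41.8781, -87.6298)]
# Boston and San Francisco as targets.
targets = [(42.3601, -71.0589), (37.7749, -122.4194)]

idx, miles = nearest_points(targets, candidates)
print(idx, miles)
```

With the DataFrames from the question, the inputs would be something like df1[["LAT", "LON"]].to_numpy() and df2[["LAT", "LON"]].to_numpy(). If ellipsoidal accuracy matters, one option is to use the tree only to pick the nearest candidate and then recompute that single distance with geopy, keeping in mind that near-ties could in principle be ranked differently by the two distance models.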

There also seem to be a few libraries that solve this, such as sknni and GriSPy.

Here's also another question that talks a bit about the theory.
