Building a distance matrix from latitude/longitude coordinates in a vectorized way (no loops)
Question
I would like to come up with a faster way to create a distance matrix between all lat/lon pairs. This Q&A covers a vectorized approach using standard linear algebra, but not for lat/lon coordinates.
In my case these lat/lon pairs are farms. Here is my Python code, which for the full data set (4,000 (lat, lon) pairs) takes at least five minutes. Any ideas?
import numpy as np
import pandas as pd
from geopy.distance import geodesic

def slowdistancematrix(df, distance_calc=True, sparse=False, dlim=100):
    """
    inputs: df
    returns:
    1.) distance between all farms in miles
    2.) distance^2
    """
    unique_farms = pd.unique(df.pixel)
    df_unique = df.set_index('pixel')
    df_unique = df_unique[~df_unique.index.duplicated(keep='first')]  # only keep unique index values
    distance = np.zeros((unique_farms.size, unique_farms.size))
    for i in range(unique_farms.size):
        lat_lon_i = df_unique.Latitude.iloc[i], df_unique.Longitude.iloc[i]
        for j in range(i):
            lat_lon_j = df_unique.Latitude.iloc[j], df_unique.Longitude.iloc[j]
            if distance_calc:
                distance[i, j] = geodesic(lat_lon_i, lat_lon_j).miles
                distance[j, i] = distance[i, j]  # make use of symmetry
    return distance, np.power(distance, 2)
Recommended answer
My solution is a vectorized version of the haversine formula:
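import numpy as np

def dist(v):
    # v: (N, 2) array of [latitude, longitude] in degrees
    v = np.radians(v)
    # Broadcasting a column against a row gives all N x N pairwise differences
    dlat = v[:, 0, np.newaxis] - v[:, 0]
    dlon = v[:, 1, np.newaxis] - v[:, 1]
    # Haversine term: cos(lat_i) * cos(lat_j), so the first factor needs the new axis
    a = np.sin(dlat / 2.0) ** 2 + np.cos(v[:, 0, np.newaxis]) * np.cos(v[:, 0]) * np.sin(dlon / 2.0) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    result = 3956 * c  # 3956 = Earth radius in miles
    return result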
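As a quick sanity check of the broadcasting approach above, here is a minimal self-contained sketch. The name haversine_matrix, the EARTH_RADIUS_MILES constant, and the three test points are illustrative assumptions, not part of the original answer. It pairs cos(lat_i) with cos(lat_j) so the result is symmetric, and clips rounding noise before taking the square root. The three points are each a quarter of a great circle apart, so every off-diagonal entry should equal the radius times pi/2.

```python
import numpy as np

EARTH_RADIUS_MILES = 3956  # same spherical radius the answer uses


def haversine_matrix(latlon_deg):
    """Pairwise great-circle distances in miles.

    latlon_deg: array-like of shape (N, 2) holding [latitude, longitude] in degrees.
    Returns an (N, N) symmetric matrix with zeros on the diagonal.
    """
    v = np.radians(np.asarray(latlon_deg, dtype=float))
    lat = v[:, 0]
    lon = v[:, 1]
    # Column vector minus row vector broadcasts to an (N, N) difference matrix
    dlat = lat[:, np.newaxis] - lat
    dlon = lon[:, np.newaxis] - lon
    a = (np.sin(dlat / 2.0) ** 2
         + np.cos(lat[:, np.newaxis]) * np.cos(lat) * np.sin(dlon / 2.0) ** 2)
    # Guard against floating-point values slightly outside [0, 1]
    a = np.clip(a, 0.0, 1.0)
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(a))


# Three points that are pairwise a quarter of a great circle apart
points = np.array([
    [0.0, 0.0],    # equator, prime meridian
    [0.0, 90.0],   # equator, 90 degrees east
    [90.0, 0.0],   # north pole
])
d = haversine_matrix(points)
quarter_circle = EARTH_RADIUS_MILES * np.pi / 2
```

Because the whole computation is a handful of array operations on N x N matrices, memory grows quadratically with the number of points; for about 4,000 points that is a few float64 arrays of 16 million entries each.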
However, you will need to convert your dataframe to a numpy array, using the values attribute. For example:
df = pd.read_csv('some_csv_file.csv')
distances = dist(df[['lat', 'lng']].values)
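To connect this back to the question, the sketch below shows one way the same idea might be applied to a dataframe shaped like the asker's, with pixel, Latitude, and Longitude columns. The function name fast_distance_matrix and the toy data are hypothetical. It keeps the first row per pixel, as the original loop does, and returns both the distance and its square. Note that it uses the spherical haversine formula rather than geopy's ellipsoidal geodesic, so the numbers can differ slightly from the slow version.

```python
import numpy as np
import pandas as pd


def fast_distance_matrix(df, radius_miles=3956):
    """Vectorized replacement sketch for the looped version in the question.

    Assumes df has 'pixel', 'Latitude' and 'Longitude' columns (degrees).
    Returns (distance in miles, distance squared), both (N, N) arrays,
    where N is the number of unique pixel values.
    """
    # Keep the first occurrence of each farm, preserving order of appearance
    farms = df.drop_duplicates(subset='pixel', keep='first')
    v = np.radians(farms[['Latitude', 'Longitude']].to_numpy(dtype=float))
    lat = v[:, 0]
    lon = v[:, 1]
    dlat = lat[:, np.newaxis] - lat
    dlon = lon[:, np.newaxis] - lon
    a = (np.sin(dlat / 2.0) ** 2
         + np.cos(lat[:, np.newaxis]) * np.cos(lat) * np.sin(dlon / 2.0) ** 2)
    a = np.clip(a, 0.0, 1.0)  # absorb floating-point rounding
    distance = 2 * radius_miles * np.arcsin(np.sqrt(a))
    return distance, distance ** 2


# Toy data (made up for illustration): pixel 2 appears twice
toy = pd.DataFrame({
    'pixel': [1, 2, 2, 3],
    'Latitude': [40.0, 41.0, 41.0, 42.0],
    'Longitude': [-90.0, -91.0, -91.0, -92.0],
})
distance, distance_sq = fast_distance_matrix(toy)
```

If ellipsoidal accuracy matters, the haversine matrix could still serve as a fast first pass, with geodesic recomputed only for the pairs of interest.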