Problem description
I have GPS locations for more than 400 thousand cars, for example:
[ 25.41452217, 37.94879532],
[ 25.33231735, 37.93455887],
[ 25.44327736, 37.96868896],
...
I need to cluster these points spatially, grouping points whose distance from each other is <= 3 meters.
I tried to use DBSCAN, but it does not seem to work for geographic (longitude, latitude) coordinates.
Also, I do not know the number of clusters in advance.
Recommended answer
You can use pairwise_distances to compute geographic distances from latitude/longitude and then pass the resulting distance matrix to DBSCAN by specifying metric='precomputed'.
To calculate the distance matrix:
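from sklearn.metrics.pairwise import pairwise_distances
from sklearn.cluster import DBSCAN
from geopy.distance import geodesic  # vincenty was removed in geopy 2.0

def distance_in_meters(x, y):
    # geopy expects (latitude, longitude); swap the indices if rows are (longitude, latitude)
    return geodesic((x[0], x[1]), (y[0], y[1])).m

# sample is the array of points, one coordinate pair per row
distance_matrix = pairwise_distances(sample, metric=distance_in_meters)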
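To see why raw degrees cannot be fed to DBSCAN as if they were meters, here is a minimal sketch of a great-circle (haversine) distance using only the standard library. It is not the geopy function used above: it assumes a spherical Earth with a mean radius of 6,371 km, and the names `haversine_m` and `EARTH_RADIUS_M` are made up for this illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; assumes a spherical Earth

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# One degree of latitude versus one degree of longitude at latitude 38
one_degree_lat = haversine_m(0.0, 0.0, 1.0, 0.0)
one_degree_lon_at_38 = haversine_m(38.0, 0.0, 38.0, 1.0)
```

Under this spherical model a degree of latitude is roughly 111 km, while a degree of longitude at latitude 38 is noticeably shorter, so a single eps expressed in degrees would not correspond to a fixed number of meters in both directions.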
To run DBSCAN using the matrix:
dbscan = DBSCAN(metric='precomputed', eps=3, min_samples=10)
dbscan.fit(distance_matrix)
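A full distance matrix for 400 thousand points has 1.6e11 entries, which is on the order of a terabyte as float64, so the precomputed approach is likely practical only on a subsample. As a hedged alternative (not part of the answer above), the sketch below uses scikit-learn's built-in haversine metric with a ball tree, which avoids building the matrix. It assumes rows are ordered (latitude, longitude) in degrees and a spherical Earth of radius 6,371 km; `cluster_points` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical-Earth assumption

def cluster_points(lat_lon_deg, eps_m=3.0, min_samples=10):
    """Cluster (latitude, longitude) rows given in degrees; returns DBSCAN labels."""
    coords_rad = np.radians(np.asarray(lat_lon_deg, dtype=float))
    model = DBSCAN(
        eps=eps_m / EARTH_RADIUS_M,  # haversine distances are in radians
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(coords_rad)

# Hypothetical toy data: two tight groups about 1 km apart plus one isolated point
points = np.array([
    [37.948795, 25.414522],
    [37.948800, 25.414525],
    [37.948790, 25.414520],
    [37.958795, 25.414522],
    [37.958800, 25.414525],
    [37.958790, 25.414520],
    [37.000000, 25.000000],
])
labels = cluster_points(points, eps_m=3.0, min_samples=2)

# DBSCAN marks noise with -1, so exclude it when counting clusters
n_clusters = len(set(labels.tolist())) - (1 if -1 in labels else 0)
```

The number of clusters does not have to be known in advance: DBSCAN derives it from eps and min_samples, and it can be read from the labels as shown. With real GPS data, note that receiver error is often larger than 3 meters, so eps and min_samples may need tuning.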
Hope this helps.