Clustering latitude/longitude GPS data


Question

I have GPS locations for more than 400,000 cars, for example:

[ 25.41452217,  37.94879532],
[ 25.33231735,  37.93455887],
[ 25.44327736,  37.96868896],
...

I need to cluster these points spatially, grouping points that are at most 3 meters apart.
I tried DBSCAN, but it does not seem to work with geographic (longitude, latitude) coordinates.

Also, I do not know the number of clusters in advance.

Recommended answer

You can use pairwise_distances to compute geographic distances from the latitude/longitude pairs, and then pass the resulting distance matrix to DBSCAN by specifying metric='precomputed'.

To compute the distance matrix:

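from sklearn.metrics.pairwise import pairwise_distances
from sklearn.cluster import DBSCAN
from geopy.distance import geodesic  # vincenty was removed in geopy 2.0

def distance_in_meters(x, y):
    # geopy expects each point as (latitude, longitude)
    return geodesic((x[0], x[1]), (y[0], y[1])).m

# sample: array of shape (n_points, 2) holding the coordinates
distance_matrix = pairwise_distances(sample, metric=distance_in_meters)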
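If geopy is not available, the per-pair distance can be approximated with the haversine formula. The following is a minimal sketch, not part of the original answer: the name haversine_m and the constant EARTH_RADIUS_M are illustrative, and it assumes a spherical Earth and points given as (latitude, longitude) in degrees. It is less accurate than an ellipsoidal geodesic, though at a scale of a few meters the difference should be negligible.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine of the central angle between the two points
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

A function like this can be passed as the metric argument of pairwise_distances in the same way as distance_in_meters.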

To run DBSCAN using the matrix:

# eps is in meters because the precomputed matrix holds distances in meters
dbscan = DBSCAN(metric='precomputed', eps=3, min_samples=10)
dbscan.fit(distance_matrix)
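A full distance matrix for 400,000 points has 400,000 × 400,000 = 1.6 × 10^11 entries, which is roughly 1.28 TB at 8 bytes per entry, so the precomputed approach is only practical on a subsample. One possible alternative, sketched here under assumptions and not taken from the original answer, is to let DBSCAN use scikit-learn's built-in haversine metric with a ball tree, which avoids building the matrix. The sample coordinates are made up for illustration, min_samples=2 is chosen only so that the toy data forms a cluster, and the Earth is treated as a sphere.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

# Hypothetical sample in (latitude, longitude) degrees: two points about 1 m apart
# and a third point several kilometers away.
coords_deg = np.array([
    [37.948795, 25.414522],
    [37.948804, 25.414522],
    [37.934559, 25.332317],
])

# The haversine metric expects [lat, lon] in radians and measures distance in
# radians, so the 3 m threshold is divided by the Earth radius.
db = DBSCAN(
    eps=3 / EARTH_RADIUS_M,
    min_samples=2,          # illustrative value for this tiny sample
    metric="haversine",
    algorithm="ball_tree",
)
labels = db.fit_predict(np.radians(coords_deg))

# Points sharing a label belong to the same cluster; -1 marks noise.
print(labels)
```

DBSCAN does not need the number of clusters up front. It can be read off afterwards from the distinct non-negative values in labels.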

Hope this helps.

耕yu


08-11 16:22