Problem description
Given a list of N points [(x_1,y_1), (x_2,y_2), ... ], I am trying to find the nearest neighbour of each point based on distance. My dataset is too large for a brute-force approach, so a KD-tree seems best.
Rather than implementing one from scratch, I see that sklearn.neighbors.KDTree can find the nearest neighbours. Can this be used to find the nearest neighbour of each particle, i.e. return a dim(N) list?
Recommended answer
This question is very broad and missing details. It's unclear what you have tried, what your data looks like, and what counts as a nearest neighbor (does a point count as its own neighbor?).
Assuming you are not interested in each point matching itself (the identity, at distance 0), you can query the two nearest neighbors and drop the first column. This is probably the easiest approach here.
import numpy as np
from sklearn.neighbors import KDTree
np.random.seed(0)
X = np.random.random((5, 2)) # 5 points in 2 dimensions
tree = KDTree(X)
nearest_dist, nearest_ind = tree.query(X, k=2) # k=2: column 0 is the point itself (distance 0), column 1 its nearest other point
print(X)
print(nearest_dist[:, 1]) # drop the self-match; relies on sort_results=True (the default)
print(nearest_ind[:, 1]) # drop the self-match
Output
[[ 0.5488135 0.71518937]
[ 0.60276338 0.54488318]
[ 0.4236548 0.64589411]
[ 0.43758721 0.891773 ]
[ 0.96366276 0.38344152]]
[ 0.14306129 0.1786471 0.14306129 0.20869372 0.39536284]
[2 0 0 0 1]
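Dropping the first column assumes that each point's own index always lands in column 0. If the data contains exact duplicates, several points tie at distance 0 and the self-match may not come first. The following is a minimal sketch of one way to guard against that; the helper name nearest_other_point is hypothetical and not part of scikit-learn, and it only relies on KDTree and its query method as used above.

```python
import numpy as np
from sklearn.neighbors import KDTree


def nearest_other_point(points):
    """Hypothetical helper: for each row, return the distance to and the
    index of the nearest point that is not the row itself."""
    X = np.asarray(points, dtype=float)
    tree = KDTree(X)
    # k=2 so that every row gets at least one candidate other than itself.
    dist, ind = tree.query(X, k=2)
    rows = np.arange(len(X))
    # Normally column 0 is the point itself. With exact duplicates the
    # self-match may be in column 1 or absent, so pick per row the first
    # column whose index is not the row's own index.
    col = np.where(ind[:, 0] == rows, 1, 0)
    return dist[rows, col], ind[rows, col]
```

Calling nearest_other_point(X) on the five points above should give the same distances and indices as the printed output, as two arrays of length N.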