问题描述
给定:给定2D平面中的一组N点(x和y坐标)以及一组对应于每个点的N个半径。我们将把一个点的光盘称为以其半径点为中心的光盘。问题:对点进行聚类。点群是这样的,即每个点都落在该群中至少一个其他点的盘内,或者该群中至少一个其他点落入其盘中。我不需要像kd-trees那样使用复杂的空间散列技术,我需要一个算法来有效地计算它(最好不要使用复杂的空间散列技术)。我可以解决O(N ^ 2)运行时间,但肯定不会超过O(N ^ 3)。我觉得这应该是简单的,但我陷入了我的聚类条件的非互惠性质。
实质上,我期望填写C函数原型:
int cluster_points(
size_t n,
point_t * pt,//长度n点列表
double * radius,//长度n每个点的半径列表
int * cluster,//长度n集群索引列表;如果不是集群,则为-1
size_t * ncluster //簇的数量(簇中不同的非-1项的数量)
);
这不是家庭作业。我需要这个作为矩阵算法的一部分,用于对复数进行聚类。
蛮力解决方案只有O(N ),所以它应该适合你。
1)从未分配组中的所有点开始。
2)选择一个点并查看未分配组中的所有其他人,并查看是否符合您描述的半径标准。
3)最后,您将按照您的标准对点进行分组,超过N *(N / 2)次检查。
顺便说一句,你所描述的并不是通常所说的集群,所以我认为这是把人们抛在这里。使聚类成为一个难题的原因是,两个相邻点是否将被分配到同一个聚类的问题由数据集中的所有其他点确定。在你的情况下,它(基本上)只是由两点的属性决定的,所以在你的情况下,你可以检查它们全部。
Given: Given a set of N points in the 2D plane (x and y coordinates), and a set of N radii corresponding to each point. We will refer to a point's disc as the disc centered at the point with its radius.
Problem: Cluster the points. A cluster of points is such that each point EITHER falls within the disc of at least one other point in the cluster OR at least one other point in the cluster falls within its disc. Clusters of single points do not count as clusters.
I need an algorithm to compute this efficiently (preferably without resorting to complicated spatial hashing techniques like kd-trees). I can settle for O(N^2) run time but definitely no more than O(N^3). I feel like this should be simple but I'm getting caught up on the non-reciprocal nature of my clustering condition.
In essence, I am looking to fill in the C function prototype:
int cluster_points(
size_t n,
point_t *pt, // length n list of points
double *radius, // length n list of radii for each point
int *cluster, // length n list of cluster indexes; -1 if not clustered
size_t *ncluster // number of clusters (number of distinct non -1 entries in cluster)
);
This is not homework. I need this as part of a matrix algorithm for clustering complex numbers.
The brute force solution is only O(N), so it should work for you.
1) Start with all of the points in the unassigned group.
2) Pick one point and look at all the others in the unassigned group and see whether the meet the radii criterion you describe.
- 2a) If yes, start a group, and continue on with all the other points to see if they fit in this group (based on your current point), and if they fit in, move them from the unassigned group into this new one.
- 2ab) When done with the current point, go to each point added to the group and check all the points in the unassigned group.
- 2b) But, if no point matches the radii criteria for the current point, discard it.
3) At the end you will have grouped the points by your criteria, and will have done no more than N*(N/2) inspections.
Btw, what you describe is not what's normally meant by "clustering", so I think that's throwing people off here. What makes clustering a difficult problem is that the question of whether two neighboring points will be assigned to the same cluster is determined by all the other points in the data set. In your case, it's (basically) only determined by properties of the two points, so that in your case, you can just check them all.
这篇关于2d点聚类的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!