Problem description
I'm trying to see if anyone knows how to cluster some Lat/Long results, using a database, to reduce the number of results sent over the wire to the application.
There are a number of resources about how to cluster, either on the client side or on the server (application) side, but not on the database side :(
This is a similar question, asked by a fellow S.O. member. The solutions are server-side (i.e. C# code-behind).
Has anyone had any luck or experience with solving this, but in a database? Are there any database gurus out there who are after a hawt and sexy DB challenge?
Please help :)
EDIT 1: Clarification - by clustering, I'm hoping to group x number of points into a single point for an area. So, if I say cluster everything in a 1 mile / 1 km square, then all the results in that square are grouped into a single result (say, the middle of the square).
EDIT 2: I'm using MS SQL Server 2008, but I'm open to hearing about solutions in other databases.
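The grid grouping described in EDIT 1 can be pushed into the database with a plain GROUP BY. Below is a minimal sketch, not a definitive implementation. It uses Python's built-in sqlite3 as a stand-in for the real database, and the `points` table, its columns and the `grid_clusters` helper are all hypothetical names. The cell size is given in degrees rather than miles or km (one degree of latitude is roughly 111 km, and longitude cells get narrower towards the poles), so treat the squares as approximate.

```python
import sqlite3

# Hypothetical table and sample data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO points (lat, lon) VALUES (?, ?)",
    [(51.51, -0.12), (51.52, -0.13), (51.49, -0.11), (48.85, 2.35)],
)

def grid_clusters(conn, cell_deg):
    """Return one (cell_lat, cell_lon, count) row per occupied grid cell.

    Coordinates are shifted by +90 / +180 so they are non-negative, which
    makes the integer cast behave like a floor. The reported position is
    the middle of the cell, not the mean of the points in it.
    """
    sql = """
        SELECT (CAST((lat + 90.0) / :c AS INTEGER) + 0.5) * :c - 90.0 AS cell_lat,
               (CAST((lon + 180.0) / :c AS INTEGER) + 0.5) * :c - 180.0 AS cell_lon,
               COUNT(*) AS n
        FROM points
        GROUP BY CAST((lat + 90.0) / :c AS INTEGER),
                 CAST((lon + 180.0) / :c AS INTEGER)
        ORDER BY cell_lat, cell_lon
    """
    return conn.execute(sql, {"c": cell_deg}).fetchall()

clusters = grid_clusters(conn, 1.0)
for cell_lat, cell_lon, n in clusters:
    print(cell_lat, cell_lon, n)
```

Only one row per occupied cell crosses the wire, however many points fall inside it. On SQL Server the same idea would presumably be written with FLOOR instead of the cast, but that variant is untested here.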
I'd probably use a modified* version of k-means clustering using the Cartesian (e.g. WGS-84 ECF) coordinates for your points. It's easy to implement, converges quickly, and adapts to your data no matter what it looks like. Plus, you can pick k to suit your bandwidth requirements, since at most k centroids are sent, although the clusters will not necessarily hold equal numbers of points.
I'd make a table of cluster centroids, and add a field to the original data table to indicate which cluster each point belongs to. You'd obviously want to update the clustering periodically if your data is at all dynamic. I don't know if you could do that with a stored procedure and a trigger, but perhaps.
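A rough sketch of that table layout, again using Python's sqlite3 as a stand-in database. The table names, column names and the `cluster_summary` helper are assumptions made for illustration, and the centroid values are typed in by hand rather than computed. The point is only that, once each row carries a cluster id, the application can fetch one row per cluster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per cluster (hypothetical schema).
    CREATE TABLE centroids (
        cluster_id INTEGER PRIMARY KEY,
        lat REAL,
        lon REAL
    );
    -- Original data plus the extra field naming the cluster it belongs to.
    CREATE TABLE points (
        id INTEGER PRIMARY KEY,
        lat REAL,
        lon REAL,
        cluster_id INTEGER REFERENCES centroids(cluster_id)
    );
""")
conn.executemany(
    "INSERT INTO centroids VALUES (?, ?, ?)",
    [(1, 51.50, -0.12), (2, 48.85, 2.35)],
)
conn.executemany(
    "INSERT INTO points (lat, lon, cluster_id) VALUES (?, ?, ?)",
    [(51.51, -0.12, 1), (51.49, -0.11, 1), (48.86, 2.34, 2)],
)

def cluster_summary(conn):
    """Return (cluster_id, lat, lon, point_count) for every centroid."""
    sql = """
        SELECT c.cluster_id, c.lat, c.lon, COUNT(p.id) AS n
        FROM centroids AS c
        LEFT JOIN points AS p ON p.cluster_id = c.cluster_id
        GROUP BY c.cluster_id, c.lat, c.lon
        ORDER BY c.cluster_id
    """
    return conn.execute(sql).fetchall()

summary = cluster_summary(conn)
print(summary)
```

The periodic re-clustering would rewrite `centroids` and the `cluster_id` column; whether that job lives in a stored procedure or in application code is left open here.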
*The "modification" would be to adjust the length of the computed centroid vectors so that they lie on the surface of the Earth. Otherwise you'd end up with a bunch of points with negative altitude (when converted back to LLH, i.e. latitude/longitude/height).
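To make the idea concrete, here is a small sketch of k-means with the centroid rescaling step. It simplifies things in ways the answer does not: it assumes a spherical Earth of unit radius instead of the WGS-84 ellipsoid, runs a fixed number of iterations instead of testing for convergence, and seeds the centroids from randomly chosen data points. All function names are made up for this example.

```python
import math
import random

def to_xyz(lat_deg, lon_deg):
    """Lat/long in degrees -> unit vector (spherical Earth, an assumption)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_latlon(x, y, z):
    """Unit vector -> lat/long in degrees."""
    return (math.degrees(math.atan2(z, math.hypot(x, y))),
            math.degrees(math.atan2(y, x)))

def nearest(v, centroids):
    """Index of the centroid closest to v (squared straight-line distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(v, centroids[j])))

def kmeans_sphere(points, k, iterations=20, seed=0):
    """Cluster (lat, lon) pairs; return (centroids as lat/lon, labels)."""
    rng = random.Random(seed)
    vectors = [to_xyz(lat, lon) for lat, lon in points]
    centroids = rng.sample(vectors, k)  # start from k of the data points
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if not members:
                continue  # leave an empty cluster's centroid where it was
            mean = [sum(axis) / len(members) for axis in zip(*members)]
            norm = math.sqrt(sum(c * c for c in mean))
            if norm > 0:
                # The "modification": rescale the mean back onto the sphere.
                centroids[j] = tuple(c / norm for c in mean)
    labels = [nearest(v, centroids) for v in vectors]
    return [to_latlon(*c) for c in centroids], labels

points = [(51.50, -0.12), (51.52, -0.10), (51.48, -0.14),
          (-33.86, 151.20), (-33.88, 151.22), (-33.84, 151.18)]
centroids, labels = kmeans_sphere(points, 2)
print(centroids)
print(labels)
```

Without the rescaling, the mean of several unit vectors has length below 1, which is the "negative altitude" problem described above. The resulting centroids and labels could then be written into the centroid table and cluster field.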