问题描述
我需要编写一个查询,使我能够从提供的位置查找一个范围(英里)内的所有位置.
I need to write a query which allows me to find all locations within a range (Miles) from a provided location.
表是这样的:
id | name | lat | lng
因此,我一直在进行研究,发现:这是我的sql演示文稿
So I have been doing research and found: this my sql presentation
我已经在大约100行的表上对其进行了测试,并且还会更多! -必须是可扩展的.
I have tested it on a table with around 100 rows and will have plenty more! - Must be scalable.
我首先尝试了以下更简单的方法:
I tried something more simple like this first:
//just some test data this would be required by user input
set @orig_lat=55.857807; set @orig_lng=-4.242511; set @dist=10;
SELECT *, 3956 * 2 * ASIN(
SQRT( POWER(SIN((orig.lat - abs(dest.lat)) * pi()/180 / 2), 2)
+ COS(orig.lat * pi()/180 ) * COS(abs(dest.lat) * pi()/180)
* POWER(SIN((orig.lng - dest.lng) * pi()/180 / 2), 2) ))
AS distance
FROM locations dest, locations orig
WHERE orig.id = '1'
HAVING distance < 1
ORDER BY distance;
这将在大约 50ms 的时间内返回行,这非常好!但是,随着行数的增加,这将大大减慢速度.
This returned rows in around 50ms which is pretty good!However this would slow down dramatically as the rows increase.
EXPLAIN
表示仅使用明显的PRIMARY键.
EXPLAIN
shows it's only using the PRIMARY key which is obvious.
然后阅读上面链接的文章..我尝试过这样的事情:
Then after reading the article linked above. I tried something like this:
// defining variables - this when made into a stored procedure will call
// the values with a SELECT query.
set @mylon = -4.242511;
set @mylat = 55.857807;
set @dist = 0.5;
-- calculate lon and lat for the rectangle:
set @lon1 = @mylon-@dist/abs(cos(radians(@mylat))*69);
set @lon2 = @mylon+@dist/abs(cos(radians(@mylat))*69);
set @lat1 = @mylat-(@dist/69);
set @lat2 = @mylat+(@dist/69);
-- run the query:
SELECT *, 3956 * 2 * ASIN(
SQRT( POWER(SIN((@mylat - abs(dest.lat)) * pi()/180 / 2) ,2)
+ COS(@mylat * pi()/180 ) * COS(abs(dest.lat) * pi()/180)
* POWER(SIN((@mylon - dest.lng) * pi()/180 / 2), 2) ))
AS distance
FROM locations dest
WHERE dest.lng BETWEEN @lon1 AND @lon2
AND dest.lat BETWEEN @lat1 AND @lat2
HAVING distance < @dist
ORDER BY distance;
此查询的时间大约为 240ms ,虽然还算不错,但比上一次要慢.但是我可以想象,如果行数多得多,则执行速度会更快.但是,EXPLAIN
将可能的键显示为lat
,lng
或PRIMARY
,并使用了PRIMARY
.
The time of this query is around 240ms, this is not too bad, but is slower than the last. But I can imagine at much higher number of rows this would work out faster. However anEXPLAIN
shows the possible keys as lat
,lng
or PRIMARY
and used PRIMARY
.
我该如何做得更好?
我知道我可以将纬度存储为POINT();但是我也没有找到太多文档来说明它是更快还是准确?
I know I could store the lat lng as a POINT(); but I also haven't found too much documentation on this which shows if it's faster or accurate?
其他任何想法都会很高兴被接受!
Any other ideas would be happily accepted!
非常感谢!
-Stefan
更新:
正如乔纳森·莱夫勒(Jonathan Leffler)指出的那样,我犯了一些我没有注意到的错误:
As Jonathan Leffler pointed out I had made a few mistakes which I hadn't noticed:
我只将abs()放在一个lat值上.不需要时,我也在第二个查询的WHERE子句中使用了id搜索.在第一个查询中纯粹是实验性的,第二个查询更可能影响生产.
I had only put abs() on one of the lat values. I was using an id search in the WHERE clause in the second one as well, when there was no need. In the first query was purely experimental the second one is more likely to hit production.
这些更改后,EXPLAIN
显示密钥现在正在使用lng
列,并且现在平均响应时间在 180ms 左右,这是一个改进.
After these changes EXPLAIN
shows the key is now using lng
column and average time to respond around 180ms now which is an improvement.
推荐答案
第一个查询将忽略您设置的参数-使用1代替@dist作为距离,并使用表别名orig
代替参数和@orig_lon
.
The first query ignores the parameters you set - using 1 instead of @dist for the distance, and using the table alias orig
instead of the parameters @orig_lat
and @orig_lon
.
然后让查询在表和表之间进行笛卡尔乘积运算,如果可以避免的话,这并不是一个好主意.您由于过滤条件orig.id = 1
而无法使用它,这意味着orig
中只有一行与dest
中的每一行连接在一起(包括带有dest.id = 1
的点;您可能应该有一个条件AND orig.id != dest.id
).您还具有HAVING子句,但没有GROUP BY子句,这表明存在问题. HAVING子句不涉及任何聚合,但是HAVING子句(主要)用于比较聚合值.
You then have the query doing a Cartesian product between the table and itself, which is seldom a good idea if you can avoid it. You get away with it because of the filter condition orig.id = 1
, which means that there's only one row from orig
joined with each of the rows in dest
(including the point with dest.id = 1
; you should probably have a condition AND orig.id != dest.id
). You also have a HAVING clause but no GROUP BY clause, which is indicative of problems. The HAVING clause is not relating any aggregates, but a HAVING clause is (primarily) for comparing aggregate values.
除非我的记忆使我失望,否则COS(ABS(x))=== COS(x),因此您可以通过删除ABS()来简化操作.未能做到这一点的原因尚不清楚,为什么一个纬度需要ABS而另一个纬度则不需要-球形三角学中的对称性至关重要.
Unless my memory is failing me, COS(ABS(x)) === COS(x), so you might be able to simplify things by dropping the ABS(). Failing that, it is not clear why one latitude needs the ABS and the other does not - symmetry is crucial in matters of spherical trigonometry.
您拥有一些魔幻数字-值69大概是度数(以赤道为经度)的英里数,而3956是地球的半径.
You have a dose of the magic numbers - the value 69 is presumably number of miles in a degree (of longitude, at the equator), and 3956 is the radius of the earth.
我怀疑如果给定位置接近极点而计算出的盒子.在极端情况下,您可能需要完全允许任何经度.
I'm suspicious of the box calculated if the given position is close to a pole. In the extreme case, you might need to allow any longitude at all.
第二个查询中的条件dest.id = 1
为奇数;我相信应该省略它,但是它的存在应该加快速度,因为只有一行符合该条件.因此,花费的额外时间令人费解.但是使用主键索引是合适的.
The condition dest.id = 1
in the second query is odd; I believe it should be omitted, but its presence should speed things up, because only one row matches that condition. So the extra time taken is puzzling. But using the primary key index is appropriate as written.
您应该将HAVING子句中的条件移到WHERE子句中.
You should move the condition in the HAVING clause into the WHERE clause.
但是我不确定这是否真的有帮助...
But I'm not sure this is really helping...
这篇关于两个坐标之间的距离,如何简化和/或使用其他技术?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!