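I have a data frame containing data about drivers and the routes they followed. I am trying to work out the total mileage traveled. I am using the geosphere
package, but I can't figure out the right way to apply it and get an answer in miles.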
> head(df1)
id routeDateTime driverId lat lon
1 1 2012-11-12 02:08:41 123 76.57169 -110.8070
2 2 2012-11-12 02:09:41 123 76.44325 -110.7525
3 3 2012-11-12 02:10:41 123 76.90897 -110.8613
4 4 2012-11-12 03:18:41 123 76.11152 -110.2037
5 5 2012-11-12 03:19:41 123 76.29013 -110.3838
6 6 2012-11-12 03:20:41 123 76.15544 -110.4506
So far, I have tried
spDists(cbind(df1$lon,df1$lat))
and a few other functions, but I can't seem to get a reasonable answer.
Any suggestions?
> dput(df1)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40), routeDateTime = c("2012-11-12 02:08:41",
"2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41",
"2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41",
"2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41",
"2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41",
"2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41",
"2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41",
"2012-11-12 12:10:41", "2012-11-12 02:08:41", "2012-11-12 02:09:41",
"2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41",
"2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41",
"2012-11-12 12:09:41", "2012-11-12 12:10:41", "2012-11-12 02:08:41",
"2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41",
"2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41",
"2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41"
), driverId = c(123, 123, 123, 123, 123, 123, 123, 123, 123,
123, 456, 456, 456, 456, 456, 456, 456, 456, 456, 456, 789, 789,
789, 789, 789, 789, 789, 789, 789, 789, 246, 246, 246, 246, 246,
246, 246, 246, 246, 246), lat = c(76.5716897079255, 76.4432530414779,
76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499,
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785,
76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383,
76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343,
76.3357809444424, 76.032417796785, 76.5716897079255, 76.4432530414779,
76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499,
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785,
76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383,
76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343,
76.3357809444424, 76.032417796785), lon = c(-110.80701574916,
-110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505,
-110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522,
-110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726,
-110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111,
-110.556956546381, -110.24483308522, -110.217355202651, -110.80701574916,
-110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505,
-110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522,
-110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726,
-110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111,
-110.556956546381, -110.24483308522, -110.217355202651)), .Names = c("id",
"routeDateTime", "driverId", "lat", "lon"), row.names = c(NA,
-40L), class = "data.frame")
Best answer
How about this?
## Setup
library(geosphere)
metersPerMile <- 1609.34
pts <- df1[c("lon", "lat")]
## Pass in two derived data.frames that are lagged by one point
segDists <- distVincentyEllipsoid(p1 = pts[-nrow(pts), ],
                                  p2 = pts[-1, ])
sum(segDists)/metersPerMile
# [1] 1013.919
(To use a faster distance calculation, just replace distVincentyEllipsoid in the call above with distCosine, distVincentySphere, or distHaversine.)
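One thing to watch: the call above sums every consecutive pair of rows in df1, so it also counts the jump from one driver's last point to the next driver's first point. If you want a total per driver, something like the following might work. It is only a sketch: haversineMeters and routeMiles are helper names I made up, the haversine formula is hand-written in base R so the example runs without any package (it assumes a spherical Earth with radius 6378137 m), and the toy data frame is hypothetical, not df1. It also assumes the rows for each driver are already in travel order.

```r
## Hand-written haversine distance in meters (spherical Earth).
## Hypothetical helper, used here as a package-free stand-in for a
## geosphere distance function; r is an assumed Earth radius.
haversineMeters <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

## Total miles along one driver's points, taken in row order
routeMiles <- function(d, metersPerMile = 1609.344) {
  n <- nrow(d)
  if (n < 2) return(0)  # a single point has no segments
  seg <- haversineMeters(d$lon[-n], d$lat[-n], d$lon[-1], d$lat[-1])
  sum(seg) / metersPerMile
}

## Toy data (hypothetical): driver 1 moves 2 degrees along the equator,
## driver 2 does not move at all
toy <- data.frame(driverId = c(1, 1, 1, 2, 2),
                  lat = c(0, 0, 0, 0, 0),
                  lon = c(0, 1, 2, 10, 10))

## Split by driver, then total each driver's segments separately
milesByDriver <- sapply(split(toy, toy$driverId), routeMiles)
milesByDriver
```

On the real data you would call sapply(split(df1, df1$driverId), routeMiles), sorting each driver's rows by routeDateTime first if they are not already in order. To get ellipsoidal distances instead, the body of routeMiles could call one of the geosphere functions on the lagged lon/lat pairs, as in the answer above.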