如何基于最短距离为经纬度观测分配名称

如何基于最短距离为经纬度观测分配名称

本文介绍了如何基于最短距离为经纬度观测分配名称的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个数据框:df1包含具有经纬度坐标的观测值; df2具有带有经纬度坐标的名称.我想创建一个新的变量df1$name,该变量的每个观测值均具有与该观测值距离最短的df2名称.

I have two dataframes: df1 contains observations with lat-lon coordinates; df2 has names with lat-lon coordinates. I want to create a new variable df1$name which has for each observation the name of df2 that has the shortest distance to that observation.

df1的一些示例数据:

df1 <- structure(list(lat = c(52.768, 53.155, 53.238, 53.253, 53.312, 53.21, 53.21, 53.109, 53.376, 53.317, 52.972, 53.337, 53.208, 53.278, 53.316, 53.288, 53.341, 52.945, 53.317, 53.249), lon = c(6.873, 6.82, 6.81, 6.82, 6.84, 6.748, 6.743, 6.855, 6.742, 6.808, 6.588, 6.743, 6.752, 6.845, 6.638, 6.872, 6.713, 6.57, 6.735, 6.917), cat = c(2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 2L), diff = c(6.97305555555555, 3.39815972222222, 14.2874305555556, -0.759791666666667, 34.448275462963, 4.38783564814815, 0.142430555555556, 0.698599537037037, 1.22914351851852, 7.0008912037037, 1.3349537037037, 8.67978009259259, 1.6090162037037,    25.9466782407407, 9.45068287037037, 4.76284722222222, 1.79163194444444, 16.8280787037037, 1.01336805555556, 3.51240740740741)), .Names = c("lat", "lon", "cat", "diff"), row.names = c(125L, 705L, 435L, 682L, 186L, 783L, 250L, 517L, 547L, 369L, 618L, 280L, 839L, 614L, 371L, 786L, 542L, 100L, 667L, 785L), class = "data.frame")

df2的一些示例数据:

df2 <- structure(list(latlonloc = structure(c(6L, 3L, 4L, 2L, 5L, 1L), .Label = c("Boelenslaan", "Borgercompagnie", "Froombosch", "Garrelsweer", "Stitswerd", "Tinallinge"), class = "factor"), lat = c(53.356789, 53.193886, 53.311237, 53.111339, 53.360848, 53.162031), lon = c(6.53493, 6.780792, 6.768608, 6.82354, 6.599604, 6.143804)), .Names = c("latlonloc", "lat", "lon"), class = "data.frame", row.names = c(NA, -6L))

基于答案,我用geosphere包创建了一个距离矩阵:

Based on this answer, I created a distance matrix with the geosphere package:

library(geosphere)
mat <- distm(df1[,c('lon','lat')], df2[,c('lon','lat')], fun=distHaversine)

但是如何为每个观察值选择最短的距离,并为df1$name分配相应的名称?

But how do I pick the shortest distance for each observation and assign the corresponding name to df1$name?

推荐答案

像这样:

df1$name <- df2$latlonloc[apply(mat, 1, which.min)]

这篇关于如何基于最短距离为经纬度观测分配名称的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-29 21:41