df1 <- data.frame(MLID=c('992','992','BJR'),
Position=c('N0','N1','N1'),
Weight=c(0.125,0.58,0.69))
df2 <- data.frame(MLID=c('992','992','992','992',
'BJR','BJR','BJR','BJR'),
Weight=c(0,0.251,0.501,1.001,
0,0.251,0.501,1.001),
N0=c(2.80,4.05,4.05,4.05,
4.05,4.05,4.05,4.05),
N1=c(3.47,4.73,4.95,5.15,
4.73,7.73,4.95,5.15) )
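What I want is to merge these two tables according to the following rules:
So the final output should be:
MLID  Position  Weight  Charge
992   N0        0.125   2.8
992   N1        0.580   4.95
BJR   N1        0.690   4.95
Can this be done in R? In particular, with the dplyr package?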
Best answer
This can be done with the rolling join option in data.table. First, df2 needs to be converted to long format with melt; then df1 and df2 can be joined.
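library(data.table)
# Convert both tables to data.table by reference; df1 is keyed on the join columns
setDT(df1, key = c("MLID", "Position", "Weight"))
setDT(df2)
# Reshape df2 to long format: one row per MLID / Weight / Position
df2 <- melt(df2, id.vars = c("MLID", "Weight"), variable.name = "Position",
            value.name = "Charge")
setkey(df2, MLID, Position, Weight)
# Rolling join on the last key column (Weight), matching the nearest df2 weight
df2[df1, roll = "nearest"]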
# MLID Weight Position Charge
# 1: 992 0.580 N1 4.95
# 2: 992 0.125 NO 2.80
# 3: BJR 0.690 N1 4.95
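One caveat on roll = "nearest": it picks the closest df2 weight in either direction. The expected output in the question, like the filter in Option #2, looks more like a lookup of the largest df2 weight that does not exceed the df1 weight, although the rule text itself is not shown in the question. The two readings agree on the sample data. The sketch below assumes the "lower bracket" reading and uses roll = TRUE; the extra row with Weight = 0.9 is hypothetical and only there to show where the two options diverge.

```r
library(data.table)

# Data from the "Data" section, plus one hypothetical row (Weight = 0.9)
# that lies closer to the 1.001 bracket than to the 0.501 bracket.
df1 <- data.table(MLID     = c("992", "992", "BJR", "992"),
                  Position = c("NO", "N1", "N1", "N1"),
                  Weight   = c(0.125, 0.58, 0.69, 0.9))
df2 <- data.table(MLID   = rep(c("992", "BJR"), each = 4),
                  Weight = rep(c(0, 0.251, 0.501, 1.001), times = 2),
                  NO     = c(2.80, 4.05, 4.05, 4.05, 4.05, 4.05, 4.05, 4.05),
                  N1     = c(3.47, 4.73, 4.95, 5.15, 4.73, 7.73, 4.95, 5.15))

# Long format: one row per MLID / Weight / Position
long <- melt(df2, id.vars = c("MLID", "Weight"),
             variable.name = "Position", value.name = "Charge",
             variable.factor = FALSE)

# Both joins roll on the last column named in `on` (Weight).
lower   <- long[df1, on = .(MLID, Position, Weight), roll = TRUE]       # last df2 weight <= df1 weight
nearest <- long[df1, on = .(MLID, Position, Weight), roll = "nearest"]  # closest df2 weight, either side

print(lower)
print(nearest)
```

For the hypothetical 0.9 row, roll = TRUE stays in the 0.501 bracket while roll = "nearest" jumps to the 1.001 bracket, so the choice matters once weights sit near the upper edge of a bracket.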
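Option #2: A tidyverse-based approach could be:
library(tidyverse)
# Reshape df2 to long format, then attach every candidate df2 row to its df1 row
df2 %>% gather(Position, Charge, -MLID, -Weight) %>%
  right_join(df1, by = c("MLID", "Position")) %>%
  # Weight.x comes from df2, Weight.y from df1: keep brackets at or below the df1 weight
  filter(Weight.x <= Weight.y) %>%
  group_by(MLID, Position) %>%
  # Sort by distance to the df1 weight, then keep the closest row in each group
  arrange(Weight.y - Weight.x) %>%
  slice(1) %>%
  select(MLID, Weight = Weight.y, Position, Charge)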
# A tibble: 3 x 4
# Groups: MLID, Position [3]
# MLID Weight Position Charge
# <chr> <dbl> <chr> <dbl>
# 1 992 0.580 N1 4.95
# 2 992 0.125 NO 2.80
# 3 BJR 0.690 N1 4.95
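As a side note, tidyr documents gather() as superseded by pivot_longer(). The sketch below is one possible rewrite of Option #2 along those lines, not the original answer's code. It assumes dplyr 1.0 or later for slice_max(), and the suffixes .df2 and .df1 are names chosen here for readability. It also groups by the df1 weight, so it would still return one row per df1 row if an MLID/Position pair appeared with several weights.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(MLID = c("992", "992", "BJR"),
                  Position = c("NO", "N1", "N1"),
                  Weight = c(0.125, 0.58, 0.69), stringsAsFactors = FALSE)
df2 <- data.frame(MLID = rep(c("992", "BJR"), each = 4),
                  Weight = rep(c(0, 0.251, 0.501, 1.001), times = 2),
                  NO = c(2.80, 4.05, 4.05, 4.05, 4.05, 4.05, 4.05, 4.05),
                  N1 = c(3.47, 4.73, 4.95, 5.15, 4.73, 7.73, 4.95, 5.15),
                  stringsAsFactors = FALSE)

result <- df2 %>%
  # Long format: one row per MLID / Weight / Position
  pivot_longer(c(NO, N1), names_to = "Position", values_to = "Charge") %>%
  inner_join(df1, by = c("MLID", "Position"), suffix = c(".df2", ".df1")) %>%
  # Keep only df2 brackets that start at or below the df1 weight
  filter(Weight.df2 <= Weight.df1) %>%
  group_by(MLID, Position, Weight.df1) %>%
  # Within each df1 row, keep the highest remaining bracket
  slice_max(Weight.df2, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(MLID, Position, Weight = Weight.df1, Charge)

print(result)
```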
Data:
The OP's data, slightly modified to include the stringsAsFactors = FALSE argument in data.frame in order to avoid unnecessary warnings.
df1 <- data.frame(MLID=c('992','992','BJR'),
                  Position=c('NO','N1','N1'),
                  Weight=c(0.125,0.58,0.69), stringsAsFactors = FALSE)
df2 <- data.frame(MLID=c('992','992','992','992',
                         'BJR','BJR','BJR','BJR'),
                  Weight=c(0,0.251,0.501,1.001,
                           0,0.251,0.501,1.001),
                  NO=c(2.80,4.05,4.05,4.05,
                       4.05,4.05,4.05,4.05),
                  N1=c(3.47,4.73,4.95,5.15,
                       4.73,7.73,4.95,5.15), stringsAsFactors = FALSE )
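For readers who want to check the lookup without any packages, here is a minimal base R sketch. It assumes the rule is "take the df2 row with the largest Weight that does not exceed the df1 Weight, within the same MLID, and read the column named by Position", which is inferred from the expected output rather than stated in the question. lookup_charge is a hypothetical helper name, not part of the original answer.

```r
df1 <- data.frame(MLID = c("992", "992", "BJR"),
                  Position = c("NO", "N1", "N1"),
                  Weight = c(0.125, 0.58, 0.69), stringsAsFactors = FALSE)
df2 <- data.frame(MLID = rep(c("992", "BJR"), each = 4),
                  Weight = rep(c(0, 0.251, 0.501, 1.001), times = 2),
                  NO = c(2.80, 4.05, 4.05, 4.05, 4.05, 4.05, 4.05, 4.05),
                  N1 = c(3.47, 4.73, 4.95, 5.15, 4.73, 7.73, 4.95, 5.15),
                  stringsAsFactors = FALSE)

# Hypothetical helper: look up one Charge value for one df1 row.
lookup_charge <- function(mlid, position, weight) {
  rows <- df2[df2$MLID == mlid, ]
  rows <- rows[order(rows$Weight), ]        # findInterval() needs sorted breaks
  i <- findInterval(weight, rows$Weight)    # index of the last Weight <= weight, 0 if none
  if (i == 0) return(NA_real_)
  rows[[position]][i]
}

df1$Charge <- mapply(lookup_charge, df1$MLID, df1$Position, df1$Weight,
                     USE.NAMES = FALSE)
print(df1)
```

This loops over df1 row by row, so it is meant as a readable cross-check on small inputs rather than a replacement for the joins above.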
Regarding "r - How to merge two tables based on row and column in R?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51074982/