Problem description
There might be a *_join version for this I'm missing here, but I have two data frames, where
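- The merge should happen into the first data frame, hence left_join.
- I not only want to add columns, but also update existing columns in the first data frame. More specifically: replace NA in the first data frame with values from the second data frame.
- The second data frame contains more rows than the first data frame.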
Conditions #1 and #2 make left_join fail. Condition #3 makes rows_update fail. So I need to do some steps in between and am wondering if there's an easier solution to get the desired output.
x <- data.frame(id = c(1, 2, 3),
                a = c("A", "B", NA))
  id    a
1  1    A
2  2    B
3  3 <NA>
y <- data.frame(id = c(1, 2, 3, 4),
                a = c("A", "B", "C", "D"),
                q = c("u", "v", "w", "x"))
  id a q
1  1 A u
2  2 B v
3  3 C w
4  4 D x
The desired output would be:
  id a q
1  1 A u
2  2 B v
3  3 C w
I know I can achieve this with the following code, but it looks unnecessarily complicated to me. So is there maybe a more direct approach without having to do the intermediate pipes in the two commands below?
library(tidyverse)
x %>%
  left_join(., y %>% select(id, q), by = c("id")) %>%
  rows_update(., y %>% filter(id %in% x$id), by = "id")
Recommended answer
You can left_join and use coalesce to replace missing values.
library(dplyr)
x %>%
  left_join(y, by = 'id') %>%
  transmute(id, a = coalesce(a.x, a.y), q)
#  id a q
#1  1 A u
#2  2 B v
#3  3 C w
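To see why this works, it may help to look at the two steps separately. The sketch below reuses x and y from the question; the names filled, joined and result are only illustrative and not part of the original answer. It assumes a dplyr version that provides left_join, coalesce and transmute.

```r
library(dplyr)

# Same data as in the question
x <- data.frame(id = c(1, 2, 3),
                a = c("A", "B", NA))
y <- data.frame(id = c(1, 2, 3, 4),
                a = c("A", "B", "C", "D"),
                q = c("u", "v", "w", "x"))

# coalesce() takes, position by position, the first non-missing value
# (illustrative vectors mirroring x$a and the first three values of y$a)
filled <- coalesce(c("A", "B", NA), c("A", "B", "C"))

# left_join() keeps only the ids present in x, so id 4 is dropped.
# The shared non-key column `a` is split into a.x (from x) and a.y (from y).
joined <- left_join(x, y, by = "id")

# Collapse the two `a` columns: keep a.x where present, otherwise fall back to a.y
result <- joined %>%
  transmute(id, a = coalesce(a.x, a.y), q)
```

Because the values from x come first in coalesce, existing non-NA entries in x are never overwritten; only the gaps are filled from y.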