Filling missing values row-wise (to the right/left)

Question

I'm looking for a way to "fill" NAs to the right (as opposed to down or up) with dplyr. In other words, I would like to convert d into d2 without having to explicitly reference any columns in a mutate call.

My real data frame has several dozen fields, with staggered blocks of NAs spanning variable numbers of columns. I'm curious whether there is a short way to have every NA inherit the first non-NA value to its left, regardless of which field that value occurs in.

d <- data.frame(c1 = c("a", 1:4), c2 = c(NA, 2, NA, 4, 5), c3 = c(NA, 3, 4, NA, 6))
d2 <- data.frame(c1 = c("a", 1:4), c2 = c("a", 2, 2, 4, 5), c3 = c("a", 3, 4, 4, 6))
d
d2

Recommended answer

We can gather the data into 'long' format, fill grouped by row number, and then spread back to 'wide' format:

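library(tidyverse)
d %>%
    mutate(rn = row_number()) %>%   # integer row id, so rows keep their order after spread
    gather(key, val, -rn) %>%       # long format: one row per (rn, column) pair
    group_by(rn) %>%
    fill(val) %>%                   # filling down within a row's group fills to the right
    spread(key, val) %>%
    ungroup %>%
    select(-rn)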
# A tibble: 5 x 3
#  c1    c2    c3
#  <chr> <chr> <chr>
#1 a     a     a
#2 1     2     3
#3 2     2     4
#4 3     4     4
#5 4     5     6
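
The same reshape, fill, reshape idea can also be sketched with pivot_longer and pivot_wider, which tidyr now recommends over gather and spread. This is only an illustration, not part of the original answer: it assumes dplyr 1.0 or later (for across) and tidyr 1.0 or later, and the names rn, key, val and d_filled are arbitrary.

```r
library(dplyr)
library(tidyr)

d <- data.frame(c1 = c("a", 1:4), c2 = c(NA, 2, NA, 4, 5), c3 = c(NA, 3, 4, NA, 6),
                stringsAsFactors = FALSE)

d_filled <- d %>%
  mutate(rn = row_number()) %>%                  # integer row id (assumed name)
  mutate(across(-rn, as.character)) %>%          # one common type before stacking columns
  pivot_longer(-rn, names_to = "key", values_to = "val") %>%
  group_by(rn) %>%
  fill(val, .direction = "down") %>%             # "down" in long form is "right" in wide form
  ungroup() %>%
  pivot_wider(names_from = key, values_from = val) %>%
  select(-rn)

d_filled
```

Passing .direction = "up" to fill should give the mirror image, filling each row from right to left.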

Another option, which needs no reshaping, is to apply na.locf from zoo to each row:

library(zoo)
d %>%
    mutate(c1 = as.character(c1)) %>%
    pmap_dfr(., ~ na.locf(c(...)) %>%
                      as.list %>%
                      as_tibble)

Also, na.locf runs column-wise, so the data can be transposed and na.locf applied directly:

d[] <- t(na.locf(t(d)))
d
#  c1 c2 c3
#1  a  a  a
#2  1  2  3
#3  2  2  4
#4  3  4  4
#5  4  5  6
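
For comparison, here is a minimal base-R sketch that fills to the right without zoo and without transposing. It is a different technique from the answer's: it walks the columns left to right and fills each gap from the already-filled neighbour. The helper name fill_right is hypothetical, and the sketch assumes it is acceptable for every column to end up as character.

```r
# Hypothetical helper, not part of zoo or dplyr.
fill_right <- function(df) {
  out <- lapply(df, as.character)        # common type so values can move between columns
  for (j in seq_along(out)[-1]) {        # start at the second column
    gap <- is.na(out[[j]])
    out[[j]][gap] <- out[[j - 1]][gap]   # take the value from the column to the left
  }
  df[] <- out                            # keep the data frame's names and row names
  df
}

d <- data.frame(c1 = c("a", 1:4), c2 = c(NA, 2, NA, 4, 5), c3 = c(NA, 3, 4, NA, 6),
                stringsAsFactors = FALSE)
res <- fill_right(d)
res
```

Because each column is filled from one that has already been processed, a value propagates across a run of consecutive NAs, and an NA in the first column is left as NA.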

As @G.Grothendieck mentioned in the comments, in order to handle elements that are NA at the beginning of a row, use na.locf0 instead of na.locf.
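
A small sketch of the difference on a single vector with a leading NA (the vector x and the result names are made up for illustration):

```r
library(zoo)

x <- c(NA, "b", NA)

dropped <- na.locf(x)                 # default na.rm = TRUE removes the leading NA
kept    <- na.locf(x, na.rm = FALSE)  # leading NA is kept, length is unchanged
kept0   <- na.locf0(x)                # na.locf0 also keeps the leading NA

length(dropped)
kept
kept0
```

With the default settings the result is shorter than the input, which is why a row that starts with NA needs na.locf0 (or na.rm = FALSE) to stay aligned with its columns.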

