This article explains how to replace NA values with group means across an unspecified number of columns.
Problem description
I want to replace the NA values in multiple columns with the mean of each group (collembola and mite). Here is an example with 3 columns, but I want to apply this to a data frame with 5000 columns.
dat <- read.table(text =
"id ID length width extra
101 collembola 2.1 0.9 1
102 mite NA 0.7 NA
103 mite 1.1 0.8 2
104 collembola 1 NA 3
105 collembola 1.5 0.5 4
106 mite NA NA NA
106 mite 1.9 NA 4",
header=TRUE)
It works if I name each column explicitly:
library(plyr)
impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))
data2 <- ddply(dat, ~ ID, transform, length = impute.mean(length))
I want to apply the function that imputes the mean of each ID group (collembola and mite) across multiple columns at once. Below is what I tried (it does not work):
dat2 <- ddply(dat, ~ ID, transform, impute.mean(dat[,3:ncol(dat)]))
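In the attempt above, dat[,3:ncol(dat)] refers to the whole data frame rather than the per-group piece, and impute.mean expects a single vector, not a data frame. As a minimal sketch of the intended per-column, per-group imputation, here is a base R version that swaps ddply for ave(). It assumes every column from the third onward is a numeric measurement; the names cols and dat_imputed are introduced here for illustration only.

```r
# Example data from the question.
dat <- read.table(text =
"id ID length width extra
101 collembola 2.1 0.9 1
102 mite NA 0.7 NA
103 mite 1.1 0.8 2
104 collembola 1 NA 3
105 collembola 1.5 0.5 4
106 mite NA NA NA
106 mite 1.9 NA 4",
header = TRUE)

# Replace the NAs in one vector with the mean of its non-missing values.
impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# Assumption: all columns from the third onward are numeric measurements.
cols <- names(dat)[3:ncol(dat)]

dat_imputed <- dat
dat_imputed[cols] <- lapply(dat[cols], function(x) {
  # as.numeric() converts integer columns to double before imputing;
  # ave() applies impute.mean within each ID group and keeps the row order.
  ave(as.numeric(x), dat$ID, FUN = impute.mean)
})

dat_imputed
```

Because the columns are selected by position, the same code should work unchanged for a wider data frame. If a group has no observed value in some column, the group mean is NaN and that value is what gets filled in.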
Recommended answer
If you don't mind using dplyr:
library(dplyr)
dat %>%
group_by(ID) %>%
mutate_if(is.numeric, function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x))
#> # A tibble: 7 x 5
#> # Groups:   ID [2]
#>      id         ID length width extra
#>   <int>     <fctr>  <dbl> <dbl> <dbl>
#> 1   101 collembola    2.1  0.90     1
#> 2   102       mite    1.5  0.70     3
#> 3   103       mite    1.1  0.80     2
#> 4   104 collembola    1.0  0.70     3
#> 5   105 collembola    1.5  0.50     4
#> 6   106       mite    1.5  0.75     3
#> 7   106       mite    1.9  0.75     4
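In dplyr 1.0.0 and later, the scoped verbs such as mutate_if are superseded by across(). The following is a sketch of the same grouped imputation in that style, not the original answer's code. It assumes the id column is an identifier that should be left alone; dat_filled is a name introduced here for illustration.

```r
library(dplyr)

# Example data from the question.
dat <- read.table(text =
"id ID length width extra
101 collembola 2.1 0.9 1
102 mite NA 0.7 NA
103 mite 1.1 0.8 2
104 collembola 1 NA 3
105 collembola 1.5 0.5 4
106 mite NA NA NA
106 mite 1.9 NA 4",
header = TRUE)

dat_filled <- dat %>%
  group_by(ID) %>%
  mutate(across(
    # Every numeric column except the identifier (assumption: id is not data).
    where(is.numeric) & !id,
    ~ {
      v <- as.numeric(.x)                        # make integer columns double
      replace(v, is.na(v), mean(v, na.rm = TRUE)) # fill NAs with the group mean
    }
  )) %>%
  ungroup()

dat_filled
```

The grouping column ID is excluded from across() automatically, so only the measurement columns are touched, however many there are.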