How to compute a leave-one-out average in R using dplyr?
Problem description
I am trying to compute a leave-one-out average of a variable for every row using dplyr. Since dplyr provides a convenient function called row_number(), I thought I could use it like this:
library(dplyr)
iris %>%
tbl_df %>%
select(Sepal.Length) %>%
mutate(loo_avg=mean(Sepal.Length[-row_number()])) # leave one out average
But this returns the following result:
Source: local data frame [150 x 2]
Sepal.Length loo_avg
(dbl) (dbl)
1 5.1 NaN
2 4.9 NaN
3 4.7 NaN
4 4.6 NaN
5 5.0 NaN
6 5.4 NaN
7 4.6 NaN
8 5.0 NaN
9 4.4 NaN
10 4.9 NaN
.. ... ...
How can I fix this?
Recommended answer
I particularly like the data.table approach:
library(data.table)
DT <- as.data.table(iris)
DT[ , loo_avg := DT[-.BY$left_out, mean(Sepal.Length)],
by = .(left_out = 1:nrow(DT))
][,.(Sepal.Length, loo_avg)]
# Sepal.Length loo_avg
# 1: 5.1 5.848322
# 2: 4.9 5.849664
# 3: 4.7 5.851007
# 4: 4.6 5.851678
# 5: 5.0 5.848993
# ---
# 146: 6.7 5.837584
# 147: 6.3 5.840268
# 148: 6.5 5.838926
# 149: 6.2 5.840940
# 150: 5.9 5.842953
Note that this approach also makes it very easy to compute any statistic other than mean in j.
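For readers who would rather stay in dplyr, the following is a minimal sketch and not part of the original answer. It relies on the identity that the mean with row i removed equals (sum of all values minus value i) divided by (n minus 1). It assumes the column has no missing values and at least two rows. The names loo and brute are illustrative only.

```r
library(dplyr)

# Inside mutate(), row_number() evaluates to the whole vector 1:n(),
# so Sepal.Length[-row_number()] drops every element, and the mean of
# an empty numeric vector is NaN. That is why the attempt in the
# question fails.

# Closed-form leave-one-out mean: (total - own value) / (n - 1).
loo <- iris %>%
  select(Sepal.Length) %>%
  mutate(loo_avg = (sum(Sepal.Length) - Sepal.Length) / (n() - 1))

# Brute-force check: recompute each mean with row i removed.
brute <- sapply(seq_len(nrow(iris)),
                function(i) mean(iris$Sepal.Length[-i]))
stopifnot(isTRUE(all.equal(loo$loo_avg, brute)))

head(loo, 3)
```

The closed form is vectorised and avoids re-subsetting the column once per row, but it only works for statistics that can be rewritten this way, such as the mean or the sum. For something like a leave-one-out median, a per-row computation such as the sapply() check or the data.table approach is still needed.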