Problem description
I have a dataset like this:
index <- seq(2000,2020)
weight <-seq(50,70)
length <-seq(10,50,2)
data <- cbind(index,weight,length)
row.names(data) <-as.character(seq(1:21))
data
index weight length
1 2000 50 10
2 2001 51 12
3 2002 52 14
4 2003 53 16
5 2004 54 18
6 2005 55 20
7 2006 56 22
8 2007 57 24
9 2008 58 26
10 2009 59 28
11 2010 60 30
12 2011 61 32
13 2012 62 34
14 2013 63 36
15 2014 64 38
16 2015 65 40
17 2016 66 42
18 2017 67 44
19 2018 68 46
20 2019 69 48
21 2020 70 50
I need to create several new variables representing the previous measurements for all intervals.
I need to have these values for each row (for each index):
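- weight 1 day before the measurement
- average weight 1-2 days before the measurement
- average weight 1-3 days before the measurement
- etc., up to 10 days [frame from 1 to 10, frameshift equal to 1]
Then:
- weight 2 days before the measurement
- average weight 2-3 days before the measurement
- average weight 2-4 days before the measurement
- etc., up to 11 days [frame from 1 to 10, frameshift equal to 2]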
and so on, up to a frameshift equal to 30. So the frame varies from a 1-day to a 10-day average, and this frame shifts from 1 day before the measurement to 30 days before the measurement.
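To make the indexing concrete, here is a minimal base R sketch of a single value under this definition. The helper name `lagged_window_mean` and its arguments are hypothetical, chosen only for illustration: for row `i`, frameshift `shift` and frame `k`, the window ends `shift` rows before row `i` and covers `k` rows.

```r
# Hypothetical helper: mean of the k values of x whose window ends
# `shift` rows before row i. Returns NA when the window would start
# before the first row.
lagged_window_mean <- function(x, i, shift, k) {
  end <- i - shift        # last row of the window
  start <- end - k + 1    # first row of the window
  if (start < 1) return(NA_real_)
  mean(x[start:end])
}

weight <- seq(50, 70)

lagged_window_mean(weight, i = 8, shift = 1, k = 1)  # row 7 only
lagged_window_mean(weight, i = 8, shift = 1, k = 3)  # mean of rows 5 to 7
lagged_window_mean(weight, i = 8, shift = 3, k = 5)  # mean of rows 1 to 5
```

Calling a helper like this for every row, frameshift and frame would work but is slow; it is only meant to pin down which rows each new variable should average.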
Also, I need to do that for multiple columns (around 10).
Thanks!
Recommended answer
Using the tidyverse and zoo packages, here is one proposal:
Prepare the environment:
library(tidyverse)
data <- tibble(
  index = seq(2000, 2020),
  weight = seq(50, 70),
  length = seq(10, 50, 2)
)
Perform the task:
Loop over all frameshifts and, for each one, compute the rolling means for window sizes 1 to 10:
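# One column per (frameshift, k) pair: shift weight back by `frameshift` rows,
# then take a right-aligned rolling mean over k rows.
lapply(1:30, function(frameshift) {
  w <- dplyr::lag(data$weight, frameshift)
  lapply(1:10, function(k) {
    name <- sprintf("frameshift%i_k%i", frameshift, k)
    tibble("{name}" := zoo::rollmean(x = w, k = k, fill = NA, align = "r"))
  }) %>% bind_cols()
}) %>% bind_cols()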
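As a cross-check that needs no packages, the same columns can be sketched in base R. Shifting and then averaging gives the same values as averaging and then shifting, so this sketch averages first and the rolling step never sees the leading NAs introduced by the lag. The helpers `roll_mean_right` and `shift_down` are hypothetical names, not functions from zoo or dplyr.

```r
# Right-aligned rolling mean over k rows; the first k - 1 rows are NA.
roll_mean_right <- function(x, k) {
  n <- length(x)
  if (k > n) return(rep(NA_real_, n))
  cs <- cumsum(c(0, x))  # cs[j + 1] holds the sum of rows 1..j
  c(rep(NA_real_, k - 1), (cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]) / k)
}

# Shift a vector down by `shift` rows, padding the top with NA.
shift_down <- function(x, shift) {
  n <- length(x)
  if (shift >= n) return(rep(NA_real_, n))
  c(rep(NA_real_, shift), x[seq_len(n - shift)])
}

weight <- seq(50, 70)
cols <- list()
for (frameshift in 1:30) {
  for (k in 1:10) {
    name <- sprintf("frameshift%i_k%i", frameshift, k)
    cols[[name]] <- shift_down(roll_mean_right(weight, k), frameshift)
  }
}
res_base <- as.data.frame(cols)
dim(res_base)  # 21 rows, one column per (frameshift, k) pair
```

With only 21 rows, every column whose frameshift is 21 or more is entirely NA, since there is no measurement that far back.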
Finally, you just have to bind the resulting tibble to your data.
Example with a frameshift of 3 and rolling means up to a window of 5:
res <- lapply(3, function(frameshift) {
  w <- lag(data$weight, frameshift)
  lapply(1:5, function(k) {
    name <- sprintf("frameshift%i_k%i", frameshift, k)
    tibble("{name}" := zoo::rollmean(x = w, k = k, fill = NA, align = "r"))
  }) %>% bind_cols()
}) %>% bind_cols()
bind_cols(data, res)
# A tibble: 21 x 8
index weight length frameshift3_k1 frameshift3_k2 frameshift3_k3 frameshift3_k4 frameshift3_k5
<int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2000 50 10 NA NA NA NA NA
2 2001 51 12 NA NA NA NA NA
3 2002 52 14 NA NA NA NA NA
4 2003 53 16 50 NA NA NA NA
5 2004 54 18 51 50.5 NA NA NA
6 2005 55 20 52 51.5 51 NA NA
7 2006 56 22 53 52.5 52 51.5 NA
8 2007 57 24 54 53.5 53 52.5 52
9 2008 58 26 55 54.5 54 53.5 53
10 2009 59 28 56 55.5 55 54.5 54
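The question also asks for the same variables on several columns, which the code above does not cover. One way to extend the idea is sketched below in base R. The function `add_lagged_means`, its arguments and the `<column>_frameshift<s>_k<k>` naming scheme are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: for each column in `columns`, append one new column per
# (frameshift, k) pair holding the mean of the k rows that end `frameshift`
# rows before the current row.
add_lagged_means <- function(df, columns, shifts = 1:30, windows = 1:10) {
  n <- nrow(df)
  rows <- seq_len(n)
  for (col in columns) {
    cs <- cumsum(c(0, df[[col]]))       # cs[j + 1] holds the sum of rows 1..j
    for (s in shifts) {
      for (k in windows) {
        out <- rep(NA_real_, n)
        ok <- rows - s - k + 1 >= 1     # window must start at row 1 or later
        end <- rows[ok] - s             # last row of each valid window
        out[ok] <- (cs[end + 1] - cs[end - k + 1]) / k
        df[[sprintf("%s_frameshift%i_k%i", col, s, k)]] <- out
      }
    }
  }
  df
}

data <- data.frame(
  index = seq(2000, 2020),
  weight = seq(50, 70),
  length = seq(10, 50, 2)
)
res_all <- add_lagged_means(data, c("weight", "length"))
dim(res_all)
res_all[8, c("weight_frameshift3_k5", "length_frameshift3_k5")]
```

For around 10 columns this produces about 3000 new columns, so it may be worth restricting `shifts` and `windows` to the combinations that are actually needed.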