Rolling means with a variable frameshift for multiple variables

Problem description

I have a dataset like this:

index <- seq(2000, 2020)
weight <- seq(50, 70)
length <- seq(10, 50, 2)
data <- cbind(index, weight, length)
row.names(data) <- as.character(1:21)
data
   index weight length
1   2000     50     10
2   2001     51     12
3   2002     52     14
4   2003     53     16
5   2004     54     18
6   2005     55     20
7   2006     56     22
8   2007     57     24
9   2008     58     26
10  2009     59     28
11  2010     60     30
12  2011     61     32
13  2012     62     34
14  2013     63     36
15  2014     64     38
16  2015     65     40
17  2016     66     42
18  2017     67     44
19  2018     68     46
20  2019     69     48
21  2020     70     50

I need to create several new variables that hold the previous measurements over all intervals.

For each row (each index), I need these values:

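  • weight 1 day before the measurement
  • average weight over days 1-2 before the measurement
  • average weight over days 1-3 before the measurement
  • and so on, up to 10 days [frame from 1 to 10, frameshift equal to 1]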

Then:

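  • weight 2 days before the measurement
  • average weight over days 2-3 before the measurement
  • average weight over days 2-4 before the measurement
  • and so on, up to 11 days [frame from 1 to 10, frameshift equal to 2]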

This continues up to a frameshift of 30. So the frame (the averaging window) ranges from 1 to 10 days, and it is shifted from 1 day before the measurement to 30 days before the measurement.
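To pin down what one such value means: for a frameshift s and a frame of k days, row i gets the mean of the k values ending s rows earlier, that is rows i-s-k+1 through i-s. Here is a minimal base R sketch of that definition. The name `shifted_mean` is a hypothetical helper made up for illustration, not part of any package, and it assumes one row per day with no gaps.

```r
# Hypothetical helper (not from any package): for each position i, return the
# mean of the k values that end `shift` positions before i, or NA when the
# window would start before the first element.
shifted_mean <- function(x, shift, k) {
  n <- length(x)
  vapply(seq_len(n), function(i) {
    last  <- i - shift      # most recent value included in the window
    first <- last - k + 1   # oldest value included in the window
    if (first < 1) NA_real_ else mean(x[first:last])
  }, numeric(1))
}

weight <- seq(50, 70)

# Frameshift 1, frame 1: simply the previous value.
shifted_mean(weight, shift = 1, k = 1)[1:4]   # NA 50 51 52

# Frameshift 3, frame 2: mean of the values 3 and 4 rows back.
shifted_mean(weight, shift = 3, k = 2)[4:6]   # NA 50.5 51.5
```

The first usable row for a given pair is row s + k, so long shifts combined with wide frames leave many leading NA values.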

I also need to do this for multiple columns (around 10).

Thanks!

Recommended answer

Here is a suggestion using the tidyverse and zoo packages:

Prepare the environment:

library(tidyverse)
data <- tibble(
  index = seq(2000,2020),
  weight = seq(50,70),
  length = seq(10,50,2)
)

Do the work:

Loop over all frameshifts and, for each one, compute the rolling means for window sizes 1 to 10:

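lapply(1:30, function(frameshift) {
  # Shift the series down by `frameshift` rows (lag() is dplyr's, via tidyverse)
  w <- lag(data$weight, frameshift)
  lapply(1:10, function(k) {
    # One column per (frameshift, window size) pair
    name <- sprintf("frameshift%i_k%i", frameshift, k)
    # Right-aligned window: each value averages the current and previous k - 1 rows of w
    tibble("{name}" := zoo::rollmean(x = w, k = k, fill = NA, align = "r"))
  }) %>% bind_cols()
}) %>% bind_cols()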

Finally, you just have to bind the resulting tibble to your data.

Sample with a frameshift of 3 and rolling means up to a window of 5:

res <- lapply(3, function(frameshift) {
  w <- lag(data$weight, frameshift)
  lapply(1:5, function(k) {
    name <- sprintf("frameshift%i_k%i", frameshift, k)
    tibble("{name}" := zoo::rollmean(x = w, k = k, fill = NA, align = "r"))
  }) %>% bind_cols()
}) %>% bind_cols()

bind_cols(data, res)
# A tibble: 21 x 8
   index weight length frameshift3_k1 frameshift3_k2 frameshift3_k3 frameshift3_k4 frameshift3_k5
   <int>  <int>  <dbl>          <dbl>          <dbl>          <dbl>          <dbl>          <dbl>
 1  2000     50     10             NA           NA               NA           NA               NA
 2  2001     51     12             NA           NA               NA           NA               NA
 3  2002     52     14             NA           NA               NA           NA               NA
 4  2003     53     16             50           NA               NA           NA               NA
 5  2004     54     18             51           50.5             NA           NA               NA
 6  2005     55     20             52           51.5             51           NA               NA
 7  2006     56     22             53           52.5             52           51.5             NA
 8  2007     57     24             54           53.5             53           52.5             52
 9  2008     58     26             55           54.5             54           53.5             53
10  2009     59     28             56           55.5             55           54.5             54
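
The question also asks for the same features on several columns, while the code above only processes weight. One way to extend it is sketched below. It uses base R instead of the tidyverse and zoo calls from the answer, so it is an alternative formulation rather than the answer's own method. The helper `shifted_mean`, the `cols` vector and the column naming scheme are assumptions made for illustration.

```r
# Hypothetical helper: mean of the k values ending `shift` rows before each
# row, NA where the window would start before the first row.
shifted_mean <- function(x, shift, k) {
  vapply(seq_along(x), function(i) {
    last  <- i - shift
    first <- last - k + 1
    if (first < 1) NA_real_ else mean(x[first:last])
  }, numeric(1))
}

data <- data.frame(
  index  = seq(2000, 2020),
  weight = seq(50, 70),
  length = seq(10, 50, 2)
)

cols   <- c("weight", "length")  # assumed list of columns; extend as needed
shifts <- 1:30                   # frameshifts requested in the question
frames <- 1:10                   # window sizes requested in the question

# Every (column, frameshift, window size) combination, one row per new feature.
grid <- expand.grid(k = frames, shift = shifts, col = cols,
                    stringsAsFactors = FALSE)

# Compute one vector per combination and name it after the combination.
features <- Map(function(col, shift, k) shifted_mean(data[[col]], shift, k),
                grid$col, grid$shift, grid$k)
names(features) <- sprintf("%s_frameshift%i_k%i", grid$col, grid$shift, grid$k)

result <- cbind(data, as.data.frame(features))
dim(result)   # 21 rows, 3 original columns + 2 * 30 * 10 new ones
```

With only 21 rows in the example data, every column where shift + k exceeds 21 is entirely NA, so a real dataset needs enough history to make the long shifts useful.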


08-11 13:55