


I am struggling to compute the standard deviation of the previous n values, in my case the last 5 days.


df<- data.frame(date = seq(as.Date("2019-12-01"), as.Date("2020-03-31"), by="days"),
                TRM= runif(122, min=3500, max=4100))
> df
          date      TRM
1   2019-12-01 3540.028
2   2019-12-02 3673.536
3   2019-12-03 3827.182
4   2019-12-04 3824.791
5   2019-12-05 3906.753
6   2019-12-06 3528.100
7   2019-12-07 3650.191
# ... with more rows

Then I use mutate to add some columns that I need. Here are the last rows:

df<-mutate(df, diferencia = TRM - lag(TRM, 1),
           VAR=diferencia/lag(TRM, 1))
          date      TRM  diferencia          VAR
118 2020-03-27 3779.479 -262.366328 -0.064912515
119 2020-03-28 3773.771   -5.708207 -0.001510316
120 2020-03-29 4097.078  323.307069  0.085672159
121 2020-03-30 3752.619 -344.459061 -0.084074332
122 2020-03-31 3707.442  -45.176979 -0.012038788


So what I need is the following:

  1. Create a column that holds the sd of the column "VAR".
  2. The sd for each row must cover only the last 5 days of the column "VAR".
  3. If all this could be done with dplyr, that would be great (not necessary).


For example, for row 122 the result would be this:

 > sd(df[118:122,4])
[1] 0.06630885

So what I want is this value for every row of my df. I used 5 days as an example, but I would like to be able to change the range:

          date      TRM  diferencia          VAR  diff5days
118 2020-03-27 3779.479 -262.366328 -0.064912515 0.05801765
119 2020-03-28 3773.771   -5.708207 -0.001510316 0.04799908
120 2020-03-29 4097.078  323.307069  0.085672159 0.06207932
121 2020-03-30 3752.619 -344.459061 -0.084074332 0.07522609
122 2020-03-31 3707.442  -45.176979 -0.012038788 0.06630885



Here is a solution using base R, shown here on the TRM column with a 5-row window:

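df<- data.frame(date = seq(as.Date("2019-12-01"), as.Date("2020-03-31"), by="days"),
                TRM= runif(122, min=3500, max=4100))
df$rownum <- seq_len(nrow(df))  # row index, shown in the output below
df$stDev <- NA_real_            # rows 1-4 have no full 5-row window

# sd over the current row and the 4 rows before it
for(i in 5:nrow(df)) df$stDev[i] <- sd(df$TRM[(i - 4):i])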


> head(df,n = 10)
         date      TRM rownum     stDev
1  2019-12-01 3553.666      1        NA
2  2019-12-02 4054.015      2        NA
3  2019-12-03 3976.555      3        NA
4  2019-12-04 3825.628      4        NA
5  2019-12-05 4036.383      5 208.01581
6  2019-12-06 3787.414      6 122.38142
7  2019-12-07 3886.663      7 103.45743
8  2019-12-08 3930.801      8  97.10099
9  2019-12-09 3626.911      9 155.10571
10 2019-12-10 3781.731     10 117.29726
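
The loop above runs on TRM with the window hard-coded to 5, while the question asks for VAR and an adjustable range. A minimal sketch of one way to generalize it follows. The helper name `rolling_sd`, its `n` argument, and `set.seed(1)` are assumptions for illustration and are not part of the original answer; the lagged column is built in base R as a stand-in for `lag(TRM, 1)` so the example runs without dplyr.

```r
# Hypothetical helper (not from the original answer):
# sd of the current value and the n - 1 values before it.
rolling_sd <- function(x, n = 5) {
  out <- rep(NA_real_, length(x))
  if (length(x) < n) return(out)       # no full window available
  for (i in n:length(x)) {
    out[i] <- sd(x[(i - n + 1):i])     # NA if the window contains an NA
  }
  out
}

set.seed(1)  # assumption: fixed seed so the run is reproducible
df <- data.frame(date = seq(as.Date("2019-12-01"), as.Date("2020-03-31"), by = "days"),
                 TRM = runif(122, min = 3500, max = 4100))

prev <- c(NA, head(df$TRM, -1))        # base R stand-in for lag(TRM, 1)
df$diferencia <- df$TRM - prev
df$VAR <- df$diferencia / prev

df$diff5days <- rolling_sd(df$VAR, n = 5)
tail(df, 5)
```

Because VAR is NA in row 1, the first five rows of diff5days come out as NA under this sketch. Since the helper returns a vector of the same length as its input, it should also work inside dplyr, for example `mutate(df, diff5days = rolling_sd(VAR, 5))`, and changing `n` changes the range.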


We can verify the first three results as follows:

> # verify first three results
> sd(df$TRM[1:5])
[1] 208.0158
> sd(df$TRM[2:6])
[1] 122.3814
> sd(df$TRM[3:7])
[1] 103.4574
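
The spot checks above can be extended to every row. The sketch below compares the loop against a loop-free variant built on base R's `embed()`, which lays out each run of n consecutive values as one matrix row. The variable names and `set.seed(42)` are assumptions for illustration, and the data is regenerated here, so the numbers will not match the output shown above.

```r
set.seed(42)  # assumption: fixed seed so the run is reproducible
x <- runif(122, min = 3500, max = 4100)
n <- 5

# Loop version, same logic as the answer above
loop_sd <- rep(NA_real_, length(x))
for (i in n:length(x)) loop_sd[i] <- sd(x[(i - n + 1):i])

# Loop-free version: each row of embed(x, n) holds one window of n
# consecutive values (in reverse order, which does not affect sd)
embed_sd <- c(rep(NA_real_, n - 1), apply(embed(x, n), 1, sd))

# Compare the two results over all rows, NAs included
all.equal(loop_sd, embed_sd)
```

Either form gives one value per row, with NA for the first n - 1 rows where no full window exists.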

