R: operating on consecutive rows in a data frame

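I have a data frame made up mostly of sequential rows. By "mostly" I mean that some rows are out of order or missing. When the row that follows the current row in sequence is present, I want to apply a function using data from both rows. If it is not present, skip it and move on. I know I can do this with a loop, but it is slow. I think the solution has something to do with indexing. Here is an example of my problem, with sample data and the desired result produced by a loop.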

df <- data.frame(id=1:10, x=rnorm(10))
df <- df[c(1:3, 5:10), ]
df$z <- NA


dfLoop <- function(d)
{
  for(i in 1:(nrow(d)-1))
  {
    if(d[i+1, ]$id - d[i, ]$id == 1)
    {
      d[i, ]$z = d[i+1, ]$x - d[i, ]$x
    }
  }

  return(d)
}

dfLoop(df)

So how can I get the same result without using a loop? Thanks for your help.

Best answer

Try this:

index <- which(diff(df$id)==1) #gives the index of rows that have a row below in sequence

df$z[index] <- diff(df$x)[index]
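
To see why the indexing lines up: `diff()` returns a vector one element shorter than its input, and element i is the difference between row i+1 and row i. So `diff(df$id)` and `diff(df$x)` can share the same index. A minimal sketch with fixed values (the data frame `toy` and its numbers are made up for illustration, not taken from the question):

```r
# Hypothetical toy data: id 4 is missing, so rows 3 and 4 are not consecutive
toy <- data.frame(id = c(1, 2, 3, 5, 6), x = c(10, 12, 15, 11, 20))
toy$z <- NA

id_step <- diff(toy$id)        # 1 1 2 1: element i compares row i+1 with row i
index <- which(id_step == 1)   # 1 2 4: rows whose next row has the next id
toy$z[index] <- diff(toy$x)[index]

toy$z                          # 2 3 NA 9 NA
```

Row 3 stays NA because its successor has id 5, and the last row stays NA because it has no successor at all.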

As a function:

fun <- function(x) {
  index <- which(diff(x$id)==1)
  xdiff <- diff(x$x)
  x$z[index] <- xdiff[index]
  return(x)
}

Compare with your loop:

a <- fun(df)
b <- dfLoop(df)
identical(a, b)
[1] TRUE
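
Since the question was about speed, the same comparison can be repeated with timings on a larger data frame. This is only a rough sketch: the names `big`, `loop_version`, and `vec_version`, the seed, and the size `n` are assumptions chosen for illustration, and the timings will vary from machine to machine.

```r
set.seed(1)
n <- 2000
# Hypothetical larger data: n ids drawn from 1..2n, so many gaps in the sequence
big <- data.frame(id = sort(sample(1:(2 * n), n)), x = rnorm(n))
big$z <- NA

# Row-by-row version, equivalent in intent to the loop in the question
loop_version <- function(d) {
  for (i in seq_len(nrow(d) - 1)) {
    if (d$id[i + 1] - d$id[i] == 1) {
      d$z[i] <- d$x[i + 1] - d$x[i]
    }
  }
  d
}

# Vectorized version using diff(), as in the answer
vec_version <- function(d) {
  index <- which(diff(d$id) == 1)
  d$z[index] <- diff(d$x)[index]
  d
}

system.time(res_loop <- loop_version(big))
system.time(res_vec <- vec_version(big))
identical(res_loop, res_vec)
```

The vectorized version does a fixed number of whole-vector operations, whereas the loop modifies the data frame once per matching row, which is typically where the time goes.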

Source: a similar question on Stack Overflow about operating on consecutive rows in an R data frame: https://stackoverflow.com/questions/15212476/