Problem description
I am trying to subset rows of a data set using a condition that's based on the previous row, whilst keeping the previous row in the subsetted data. This is essentially the same as the question here, but I am looking for a dplyr approach:
I have taken the dplyr approach applied in the comments to that answer, but I am unable to figure out the last step of retaining the previous row.
I can get the rows that satisfy the condition I'm interested in ("incorrect" when the previous row is not "enter").
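library(dplyr)

set.seed(123)
x <- c("enter", "incorrect", "enter", "correct", "incorrect",
       "enter", "correct", "enter", "incorrect")
y <- runif(9, 5.0, 7.5)
z <- data.frame(x, y)

# Keep rows where x is "incorrect" and the previous row is not "enter"
filter(z, x == "incorrect" & lag(x) != "enter")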
This gives, as expected:
          x        y
1 incorrect 7.351168
What I would like to produce is this, so that all rows I've filtered based on the condition are stored with the row that precedes them in the original data set:
          x        y
1   correct 7.207544
2 incorrect 7.351168
Any help would be greatly appreciated!
Recommended answer
With filter(), you can keep a row when it meets the condition itself or when the row after it does:
z %>%
filter( (x == "incorrect" & lag(x) != "enter") | lead(x == "incorrect" & lag(x) != "enter") )
which gives:
          x        y
1   correct 7.207544
2 incorrect 7.351168
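If repeating the condition inside lead() feels error-prone, one possible variation is to compute it once in a temporary column and reuse it. This is only a sketch under a few assumptions: the column name hit and the object name result are hypothetical, and the data is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

# Same example data as in the question
set.seed(123)
x <- c("enter", "incorrect", "enter", "correct", "incorrect",
       "enter", "correct", "enter", "incorrect")
y <- runif(9, 5.0, 7.5)
z <- data.frame(x, y)

result <- z %>%
  # hit (hypothetical helper column) flags rows that meet the condition;
  # it is NA only if the very first row is "incorrect", since lag(x)[1] is NA
  mutate(hit = x == "incorrect" & lag(x) != "enter") %>%
  # keep a flagged row, or a row whose next row is flagged
  filter(hit | lead(hit, default = FALSE)) %>%
  # drop the helper column again
  select(-hit)

result
```

The logic is the same as in the answer above; the helper column just means the condition is written once, which may be easier to maintain if it grows more complex.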