I'm trying to select 100 rows before and after a marker in a relatively large dataframe. The markers are sparse and for some reason I haven't been able to figure it out or find a solution - this doesn't seem like it should be that hard, so I'm probably missing something obvious.
Here's a very small simple example of what the data looks like:
timestamp talking_yn transition_yn
0.01 n n
0.02 n n
0.03 n n
0.04 n n
0.05 n n
0.06 n n
0.07 n n
0.08 n n
0.09 n n
0.10 n n
0.11 y y
0.12 y n
0.13 y n
0.14 y n
0.15 y n
0.16 y n
0.17 y n
0.18 y n
I've tried different methods from a variety of answers (lag from zoo or dplyr), but they all focus on selecting a single row or on subsetting only the rows that carry the marker. For the dummy example data, how would I select the 5 rows before and after the row where transition_yn == 'y'?
I have a quick function for that:
#' Lead/Lag a logical
#' @param lgl logical vector
#' @param bef integer, number of elements to lead by
#' @param aft integer, number of elements to lag by
#' @return logical, same length as 'lgl'
#' @export
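leadlag <- function(lgl, bef = 1, aft = 1) {
  n <- length(lgl)
  # clamp the window sizes to the range [0, n]
  bef <- min(n, max(0, bef))
  aft <- min(n, max(0, aft))
  # one column per offset: 'lgl' shifted b positions earlier, padded with FALSE
  befx <- if (bef > 0) sapply(seq_len(bef), function(b) c(tail(lgl, n = -b), rep(FALSE, b)))
  # one column per offset: 'lgl' shifted a positions later, padded with FALSE
  aftx <- if (aft > 0) sapply(seq_len(aft), function(a) c(rep(FALSE, a), head(lgl, n = -a)))
  # a row is kept if the original or any shifted copy is TRUE there
  rowSums(cbind(befx, lgl, aftx), na.rm = TRUE) > 0
}

dat[leadlag(dat$transition_yn == 'y', 2, 4),]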
# timestamp talking_yn transition_yn
# 9 0.09 n n
# 10 0.10 n n
# 11 0.11 y y
# 12 0.12 y n
# 13 0.13 y n
# 14 0.14 y n
# 15 0.15 y n
The sample data used above:
dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
timestamp talking_yn transition_yn
0.01 n n
0.02 n n
0.03 n n
0.04 n n
0.05 n n
0.06 n n
0.07 n n
0.08 n n
0.09 n n
0.10 n n
0.11 y y
0.12 y n
0.13 y n
0.14 y n
0.15 y n
0.16 y n
0.17 y n
0.18 y n")
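Since the markers are sparse and the real data is large, an index-based variant may also be worth trying: it only does work at the marker positions instead of building one shifted copy of the whole vector per offset. The following is a minimal sketch, not part of the function above; the name window_rows is hypothetical, and the small data frame is only a stand-in for the sample data.

```r
# Hypothetical helper (an assumption, not from the answer above): return the
# row indices that fall within 'bef' rows before and 'aft' rows after each
# TRUE in 'marker'.
window_rows <- function(marker, bef = 1, aft = 1) {
  hits <- which(marker)  # positions of the markers
  # all positions from (hit - bef) to (hit + aft), for every marker
  idx <- as.integer(unlist(lapply(hits, function(h) seq(h - bef, h + aft))))
  # drop positions outside the data and duplicates from overlapping windows
  sort(unique(idx[idx >= 1 & idx <= length(marker)]))
}

# stand-in for the sample data: 18 rows, marker at row 11
dat <- data.frame(
  timestamp     = (1:18) / 100,
  talking_yn    = rep(c("n", "y"), times = c(10, 8)),
  transition_yn = replace(rep("n", 18), 11, "y"),
  stringsAsFactors = FALSE
)

# 5 rows before and after the transition, as asked in the question
res <- dat[window_rows(dat$transition_yn == "y", bef = 5, aft = 5), ]
```

With the marker at row 11, this keeps rows 6 through 16. For the real data, bef = 100 and aft = 100 would be the corresponding call.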