Problem description
I'm trying to generate 'episodes' of observations, grouping together observations that occur <= 14 days apart. With dplyr I've managed to calculate the number of days since the last observation. However, I cannot figure out how to get a new id based on the <= 14 condition without a for loop.
Sample data:
# obsvn is the number of days since the first observation in the group
dat <- data.frame(id = c(rep("A", 5), rep("B", 2)),
                  obsvn = c(1, 2, 29, 30, 45, 1, 15))
id obsvn
1 A 1
2 A 2
3 A 29
4 A 30
5 A 45
6 B 1
7 B 15
Expected output:
id obsvn ith
1 A 1 1
2 A 2 1
3 A 29 2
4 A 30 2
5 A 45 3
6 B 1 1
7 B 15 2
I've tried using lag to carry the previous episode number forward:
library(dplyr)
dat <- dat %>%
  group_by(id) %>%
  mutate(ith = 1,
         ith = ifelse(obsvn - lag(obsvn) <= 14, lag(ith), lag(ith) + 1))
dat
Source: local data frame [7 x 3]
Groups: id
id obsvn ith
1 A 1 NA
2 A 2 1
3 A 29 2
4 A 30 1
5 A 45 2
6 B 1 NA
7 B 15 1
Which isn't what I want. I don't understand why ith in row 4 is 1 rather than 2.
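Because it is returning lag(ith), which is always 1 (or NA at the start). mutate evaluates the expression on the whole column at once, so lag(ith) sees the constant ith = 1 set in the previous argument, not the values that are being computed row by row.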
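To see this concretely, here is a minimal sketch that reproduces the attempt outside of mutate for group A only. The standalone vectors `ith`, `obsvn`, `same_episode` and `attempt` are illustrative names, not part of the original code.

```r
library(dplyr)

obsvn <- c(1, 2, 29, 30, 45)   # group A from the sample data
ith   <- rep(1, 5)             # the constant column created by mutate(ith = 1)

# The "previous" episode number is always 1, because ith never changes
dplyr::lag(ith)

# TRUE where the gap to the previous observation is at most 14 days
same_episode <- obsvn - dplyr::lag(obsvn) <= 14

# Each element can therefore only be NA, 1 (= lag(ith)) or 2 (= lag(ith) + 1)
attempt <- ifelse(same_episode, dplyr::lag(ith), dplyr::lag(ith) + 1)
attempt
```

The result has the same NA, 1, 2, 1, 2 pattern as group A in the output above: nothing accumulates from one row to the next, which is why a running total is needed instead.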
I would do it using diff and cumsum:
dat %>%
  group_by(id) %>%
  mutate(ith = cumsum(c(1, diff(obsvn) >= 14)))
Source: local data frame [7 x 3]
Groups: id
id obsvn ith
1 A 1 1
2 A 2 1
3 A 29 2
4 A 30 2
5 A 45 3
6 B 1 1
7 B 15 2
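The diff and cumsum idea can be unpacked step by step. The sketch below uses base R only, first on group A alone and then per group with ave(), which is swapped in here for group_by and mutate as an illustration. It assumes the rows are already sorted by obsvn within each id, and the intermediate names `gaps` and `new_episode` are illustrative.

```r
obsvn <- c(1, 2, 29, 30, 45)        # group A from the sample data

gaps        <- diff(obsvn)          # days since the previous observation
new_episode <- gaps >= 14           # TRUE where a new episode starts
ith         <- cumsum(c(1, new_episode))  # first row starts episode 1

# Same idea applied per id with base R's ave()
dat <- data.frame(id = c(rep("A", 5), rep("B", 2)),
                  obsvn = c(1, 2, 29, 30, 45, 1, 15))
dat$ith <- ave(dat$obsvn, dat$id,
               FUN = function(x) cumsum(c(1, diff(x) >= 14)))
dat
```

With >= 14, a gap of exactly 14 days (group B, day 1 to day 15) starts a new episode, which is what the expected output shows. If observations exactly 14 days apart should stay in the same episode, > 14 would be the condition to use.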