Problem description
I have some data that looks at a group of people and the fruits they eat over time. I want to use dplyr to look at each individual person up until they eat a banana and summarise all the fruits they ate up until they eat their first banana.
data:
data <- structure(list(user = c(1234L, 1234L, 1234L, 1234L, 1234L, 1234L,
1234L, 1234L, 1234L, 1234L, 1234L, 1234L, 9584L, 9584L, 9584L,
9584L, 9584L, 9584L, 9584L, 9584L, 9584L, 4758L, 4758L, 4758L,
4758L, 4758L, 4758L), site = structure(c(1L, 6L, 1L, 1L, 6L,
5L, 5L, 3L, 4L, 1L, 2L, 6L, 1L, 6L, 5L, 5L, 3L, 2L, 6L, 6L, 6L,
4L, 2L, 5L, 5L, 4L, 2L), .Label = c("apple", "banana", "lemon",
"lime", "orange", "pear"), class = "factor"), time = c(1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 5L, 6L, 7L, 8L, 9L, 10L), int = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L), .Label = c("banana",
"other"), class = "factor")), .Names = c("user", "site", "time",
"int"), row.names = c(NA, -27L), class = "data.frame")
My initial thought would be to group the data to find the first instance of each user eating a banana:
data <- data %>% transform(var = ifelse(site=="banana", 'banana','other'))
data_ban <- data %>%
  filter(var == 'banana') %>%
  group_by(user) %>%
  summarise(first_banana = min(time))
But now I'm stuck on how to actually apply this back to the original "data" dataframe, and set a filter that says: for each user, only include data up until the time given in "data_ban". Any ideas?
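You could try slice, which keeps each user's rows from the first one through their first banana. This assumes the rows are already in time order within each user and that every user eats a banana at least once (otherwise which() returns nothing and 1:NA raises an error):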
data %>%
group_by(user) %>%
slice(1:(which(int=='banana')[1L]))
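If you would rather reuse the data_ban table from the question, a minimal sketch along those lines could look like the following. It rebuilds the same user/site/time values compactly so the block runs on its own. The names up_to_banana, up_to_banana2 and fruit_counts are only illustrative and not part of the original post. The first option joins data_ban back on and filters by time; the second avoids the join and keeps users who never eat a banana instead of dropping them.

```r
library(dplyr)

# Same user/site/time values as the question's data, rebuilt compactly
data <- data.frame(
  user = rep(c(1234L, 9584L, 4758L), times = c(12L, 9L, 6L)),
  site = c("apple", "pear", "apple", "apple", "pear", "orange", "orange",
           "lemon", "lime", "apple", "banana", "pear",
           "apple", "pear", "orange", "orange", "lemon", "banana",
           "pear", "pear", "pear",
           "lime", "banana", "orange", "orange", "lime", "banana"),
  time = c(1:12, 1:9, 5:10),
  stringsAsFactors = FALSE
)

# Time of each user's first banana (as in the question)
data_ban <- data %>%
  filter(site == "banana") %>%
  group_by(user) %>%
  summarise(first_banana = min(time))

# Option 1: join the first-banana time back on and filter by it.
# inner_join drops any user who never ate a banana.
up_to_banana <- data %>%
  inner_join(data_ban, by = "user") %>%
  filter(time <= first_banana)

# Option 2: no join. Within each user, keep rows until a banana has
# appeared on an earlier row; users without a banana keep all rows.
up_to_banana2 <- data %>%
  arrange(user, time) %>%
  group_by(user) %>%
  filter(cumsum(lag(site == "banana", default = FALSE)) == 0) %>%
  ungroup()

# Summarise the fruits eaten up to and including the first banana
fruit_counts <- up_to_banana %>%
  count(user, site)
```

Both options include the banana row itself; change the filter to time < first_banana (or drop the lag() in option 2) if the banana should be excluded.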