r + dplyr: filtering out a time series

Problem description

I have some data that looks at a group of people and the fruits they eat over time. I want to use dplyr to look at each individual person up until they eat a banana and summarise all the fruits they ate up until they eat their first banana.

data:

data <-  structure(list(user = c(1234L, 1234L, 1234L, 1234L, 1234L, 1234L,
    1234L, 1234L, 1234L, 1234L, 1234L, 1234L, 9584L, 9584L, 9584L,
    9584L, 9584L, 9584L, 9584L, 9584L, 9584L, 4758L, 4758L, 4758L,
    4758L, 4758L, 4758L), site = structure(c(1L, 6L, 1L, 1L, 6L,
    5L, 5L, 3L, 4L, 1L, 2L, 6L, 1L, 6L, 5L, 5L, 3L, 2L, 6L, 6L, 6L,
    4L, 2L, 5L, 5L, 4L, 2L), .Label = c("apple", "banana", "lemon",
    "lime", "orange", "pear"), class = "factor"), time = c(1L, 2L,
    3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L,
    6L, 7L, 8L, 9L, 5L, 6L, 7L, 8L, 9L, 10L), int = structure(c(2L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
    1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L), .Label = c("banana",
    "other"), class = "factor")), .Names = c("user", "site", "time",
    "int"), row.names = c(NA, -27L), class = "data.frame")

My initial thought would be to group the data to find the first instance of each user eating a banana:

data <- data %>% transform(var = ifelse(site=="banana", 'banana','other'))

data_ban <- data %>%
    filter(var=='banana') %>%
    group_by(user, var, time) %>%
    group_by(user) %>%
    summarise(first_banana = min(time))

But now I'm stuck on how to actually apply this back to the original "data" dataframe, and set a filter that says: for each user, only include data up until the time given in "data_ban". Any ideas?

Solution

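You could try slice(), which selects rows by position within each group. Here it keeps each user's rows from the first one through their first banana; this relies on the rows already being ordered by time within each user, as they are in the example data: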

data %>%
     group_by(user) %>%
     slice(1:(which(int=='banana')[1L]))
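The question also asks how to apply the per-user first-banana time back to the original data frame. A minimal sketch of that join-based route follows, assuming the same users, fruits, and times as the data above (rebuilt compactly here so the block runs on its own). The names up_to_banana and fruit_counts are illustrative and do not come from the original post. Unlike slice(), this version compares time values rather than row positions, and users who never ate a banana are dropped by the inner join.

```r
library(dplyr)

# Compact rebuild of the question's data (same users, fruits, and times).
data <- data.frame(
  user = rep(c(1234L, 9584L, 4758L), times = c(12L, 9L, 6L)),
  site = c("apple", "pear", "apple", "apple", "pear", "orange", "orange",
           "lemon", "lime", "apple", "banana", "pear",
           "apple", "pear", "orange", "orange", "lemon", "banana",
           "pear", "pear", "pear",
           "lime", "banana", "orange", "orange", "lime", "banana"),
  time = c(1:12, 1:9, 5:10),
  stringsAsFactors = FALSE
)

# Time of the first banana per user (the question's data_ban).
data_ban <- data %>%
  filter(site == "banana") %>%
  group_by(user) %>%
  summarise(first_banana = min(time))

# Join the cutoff back onto every row, then keep rows up to and
# including the first banana. Users with no banana have no match
# in data_ban and are dropped by inner_join().
up_to_banana <- data %>%
  inner_join(data_ban, by = "user") %>%
  filter(time <= first_banana) %>%
  select(-first_banana)

# Summarise: how many times each user ate each fruit before (and
# including) their first banana. count() stores the tally in column n.
fruit_counts <- up_to_banana %>%
  count(user, site)

print(fruit_counts)
```

If users without a banana should be kept in full instead of dropped, a left_join() followed by filter(is.na(first_banana) | time <= first_banana) would be one way to do it.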
