This article covers how to use tail with dplyr's filter and group_by.

Problem description

Here's an example df:

df <- structure(list(x = 1:30, y = 101:130, g = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor")), .Names = c("x", "y", "g"), row.names = c(NA, -30L), class = "data.frame")

I would like to get the 10 lowest values of y for each group within the filtered data.

But

df2 <- df %>% filter(x>3) %>% group_by(g) %>%  tail(y, n=10)

only returns the rows for the last group (C in this case):

Source: local data frame [10 x 3]
Groups: g

    x   y g
18 21 121 C
19 22 122 C
20 23 123 C
21 24 124 C
22 25 125 C
23 26 126 C
24 27 127 C
25 28 128 C
26 29 129 C
27 30 130 C

What am I doing wrong?

Solution

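You can use tail inside do. On its own, tail is a base R function that ignores the grouping set by group_by, so it returns the last rows of the whole data frame, which here all belong to group C.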

df2 <- df %>% filter(x>3) %>% group_by(g) %>%  do(tail(., n=10))

The use of . is key for this to work. From the do help page: "You can use . to refer to the current group."
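As a quick check of this point, the sketch below counts the rows that do(tail(., n = 10)) keeps per group. It rebuilds an equivalent data frame with data.frame() instead of the structure() call (an assumption, made only for brevity), and it assumes a dplyr version in which do() is still available; do() has been superseded in recent releases.

```r
library(dplyr)

# Equivalent to the example df, built with data.frame() for brevity
df <- data.frame(
  x = 1:30,
  y = 101:130,
  g = factor(rep(c("A", "B", "C"), each = 10))
)

# Inside do(), `.` is the data frame for the current group,
# so tail() is applied once per group rather than to the whole table
df2 <- df %>%
  filter(x > 3) %>%
  group_by(g) %>%
  do(tail(., n = 10))

# Number of rows kept in each group
rows_per_group <- table(df2$g)
print(rows_per_group)
```

Group A has only 7 rows left after the filter, so it contributes 7 rows, while B and C contribute 10 each.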

Edit:

As @beginneR pointed out, I was focusing on how to use tail in groups with dplyr and missed the part of the question where the OP asked for the 10 lowest values of y. Doing this correctly requires adding arrange. With tail, this means arranging in descending order of y, so that the lowest values come last.

df2 <- df %>% filter(x>3) %>% group_by(g) %>%  arrange(desc(y)) %>% do(tail(., n=10))
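For comparison, dplyr 1.0.0 and later provide slice_min(), which selects the rows with the smallest values within each group directly. This is not part of the original answer; it is an alternative sketch, and the data frame is again rebuilt with data.frame() as an assumption so the example is self-contained.

```r
library(dplyr)

# Equivalent to the example df, built with data.frame() for brevity
df <- data.frame(
  x = 1:30,
  y = 101:130,
  g = factor(rep(c("A", "B", "C"), each = 10))
)

# slice_min() respects the grouping: it keeps up to 10 rows with the
# smallest y in each group. with_ties = FALSE caps each group at n rows
# even when y contains duplicate values.
lowest <- df %>%
  filter(x > 3) %>%
  group_by(g) %>%
  slice_min(y, n = 10, with_ties = FALSE) %>%
  ungroup()

print(lowest)
```

Unlike the arrange(desc(y)) plus tail approach, the rows come back in ascending order of y within each group.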

08-24 19:10