Problem description
How can I pass a column name to dplyr when I do not know the name in advance and want to specify it through a variable?
This works:
require(dplyr)
df <- as.data.frame(matrix(seq(1:9),ncol=3,nrow=3))
df$group <- c("A","B","A")
gdf <- df %.% group_by(group) %.% summarise(m1 =mean(V1),m2 =mean(V2),m3 =mean(V3))
But this does not:
require(dplyr)
someColumn = "group"
df <- as.data.frame(matrix(seq(1:9),ncol=3,nrow=3))
df$group <- c("A","B","A")
gdf <- df %.% group_by(someColumn) %.% summarise(m1 =mean(V1),m2 =mean(V2),m3 =mean(V3))
Recommended answer
I just gave a similar answer over at "Group by multiple columns in dplyr, using string vector input", but for good measure: functions that allow you to operate on columns using strings have been added to dplyr. They have the same names as the regular dplyr functions, but end in an underscore. The functions are described in detail in this vignette.
Given df and someColumn from the OP, this now works a treat:
gdf <- df %>% group_by_(someColumn) %>% summarise(m1=mean(V1),m2=mean(V2),m3=mean(V3))
Note that it is group_by_, rather than group_by, and that the %>% operator is used because %.% is deprecated.
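For readers on newer dplyr releases: the underscore-suffixed verbs such as group_by_ were themselves deprecated in dplyr 0.7.0 in favor of tidy evaluation. Below is a minimal sketch of one current equivalent, assuming dplyr 1.0.0 or later (where across() and all_of() are available). It reuses df and someColumn from the question. It is only one of several possible spellings, not necessarily the one the original answer would recommend.

```r
library(dplyr)

# Same data as in the question
df <- as.data.frame(matrix(1:9, ncol = 3, nrow = 3))
df$group <- c("A", "B", "A")

# The grouping column is only known as a string
someColumn <- "group"

# across(all_of(...)) selects the column(s) named in the character vector,
# so group_by() groups by the column called "group"
gdf <- df %>%
  group_by(across(all_of(someColumn))) %>%
  summarise(m1 = mean(V1), m2 = mean(V2), m3 = mean(V3))
```

Because all_of() accepts a character vector, the same call should also cover grouping by several columns at once, e.g. someColumn <- c("group", "V1").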