Problem description
I have an imported data frame whose column names contain various punctuation, including parentheses, e.g. BILLNG.STATUS.(COMPLETED./.INCOMPLTE).
I was trying to use group_by from dplyr to do some summarizing, something like
df <- df %>% group_by(ORDER.NO, BILLNG.STATUS.(COMPLETED./.INCOMPLTE))
This results in an error:
Error in mutate_impl(.data, dots) : could not find function "BILLNG.STATUS."
Short of changing the column names, is there a way to handle such column names directly in group_by?
Recommended answer
I think you can make this work if you enclose the "illegal" (non-syntactic) column names in backticks. For example, let's say I start with this data frame (called df):
  BILLING.STATUS.(COMPLETED./.INCOMPLETE) ORDER.VALUE.(USD)
1                                       A        0.01544196
2                                       A        0.95522706
3                                       B        1.13479303
4                                       B        1.22848285
Then I can summarise it like this:
df %>% group_by(`BILLING.STATUS.(COMPLETED./.INCOMPLETE)`) %>%
  summarise(count = n(),
            mean = mean(`ORDER.VALUE.(USD)`))
which gives:
  BILLING.STATUS.(COMPLETED./.INCOMPLETE) count      mean
1                                       A     2 0.4853345
2                                       B     2 1.1816379
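For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the dplyr package is installed. The question doesn't show how the data was imported, so the data frame is rebuilt by hand here, with check.names = FALSE so that data.frame() leaves the non-syntactic names untouched; the name res is just an illustrative choice.

```r
library(dplyr)

# Rebuild the example data. check.names = FALSE (an assumption about how
# the data might be created) stops data.frame() from rewriting the names.
df <- data.frame(
  `BILLING.STATUS.(COMPLETED./.INCOMPLETE)` = c("A", "A", "B", "B"),
  `ORDER.VALUE.(USD)` = c(0.01544196, 0.95522706, 1.13479303, 1.22848285),
  check.names = FALSE
)

# Backticks make R read each column name as a single symbol
# instead of trying to parse it as a function call.
res <- df %>%
  group_by(`BILLING.STATUS.(COMPLETED./.INCOMPLETE)`) %>%
  summarise(count = n(), mean = mean(`ORDER.VALUE.(USD)`))

print(res)
```

The same backtick quoting should work anywhere dplyr expects a bare column name, not only in group_by and summarise.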
Backticks also come in handy for referring to or creating variable names with whitespace. You can find a number of questions related to dplyr and backticks on SO, and there's also some discussion of backticks in the help for Quotes (see ?Quotes).
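A small base R sketch of what the backticks actually change (no packages needed; the object names are made up for illustration). Without backticks, R parses the name from the question as a call to a function named BILLNG.STATUS., which matches the error message; with backticks it is a single symbol. The same quoting works for names that contain spaces.

```r
# Unquoted: parsed as a call to `BILLNG.STATUS.` with one argument.
unquoted <- quote(BILLNG.STATUS.(COMPLETED./.INCOMPLTE))
is.call(unquoted)            # TRUE
as.character(unquoted[[1]])  # "BILLNG.STATUS."

# Backquoted: a single symbol (name), which is what dplyr needs.
quoted <- quote(`BILLNG.STATUS.(COMPLETED./.INCOMPLTE)`)
is.name(quoted)              # TRUE

# Backticks also let you create and refer to names with whitespace.
`my value` <- 42
d <- data.frame(`order value` = c(1, 2, 3), check.names = FALSE)
total <- sum(d$`order value`) + `my value`
```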