This article explains how to handle column names that contain parentheses or other punctuation in dplyr's group_by.

Problem description

I have an imported data frame whose column names contain various punctuation marks, including parentheses, e.g. BILLNG.STATUS.(COMPLETED./.INCOMPLTE).

I was trying to use group_by from dplyr to do some summarizing, something like

df <- df %>% group_by(ORDER.NO, BILLNG.STATUS.(COMPLETED./.INCOMPLTE))

This results in the error
Error in mutate_impl(.data, dots) : could not find function "BILLNG.STATUS."

Short of changing the column names, is there a way to handle such column names directly in group_by ?

Recommended answer

I think you can make this work if you enclose the "illegal" column names in backticks. For example, let's say I start with this data frame (called df):

  BILLING.STATUS.(COMPLETED./.INCOMPLETE) ORDER.VALUE.(USD)
1                                       A        0.01544196
2                                       A        0.95522706
3                                       B        1.13479303
4                                       B        1.22848285

Then I can summarise like this:

df %>% group_by(`BILLING.STATUS.(COMPLETED./.INCOMPLETE)`) %>%
  summarise(count = n(),
            mean = mean(`ORDER.VALUE.(USD)`))

Giving:

  BILLING.STATUS.(COMPLETED./.INCOMPLETE) count      mean
1                                       A     2 0.4853345
2                                       B     2 1.1816379
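
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes dplyr is installed, and the data frame name `orders` and the result name `res` are only illustrative. The one detail that is easy to miss is `check.names = FALSE`: without it, `data.frame()` would rewrite the punctuated names into syntactically valid ones.

```r
library(dplyr)

# Rebuild the example data. check.names = FALSE keeps the
# non-syntactic column names exactly as written.
orders <- data.frame(
  `BILLING.STATUS.(COMPLETED./.INCOMPLETE)` = c("A", "A", "B", "B"),
  `ORDER.VALUE.(USD)` = c(0.01544196, 0.95522706, 1.13479303, 1.22848285),
  check.names = FALSE
)

# Backticks make R treat each whole name as a single symbol.
res <- orders %>%
  group_by(`BILLING.STATUS.(COMPLETED./.INCOMPLETE)`) %>%
  summarise(count = n(),
            mean = mean(`ORDER.VALUE.(USD)`))

print(res)
```

The result should have one row per billing status, with the same counts and means as the output above.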

Backticks also come in handy for referring to or creating variable names with whitespace. You can find a number of questions related to dplyr and backticks on Stack Overflow, and there's also some discussion of backticks in the help page for Quotes (?Quotes).
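
To illustrate why the backticks matter, here is a small base R sketch (no packages needed; the names `unquoted`, `quoted` and `sales` are made up for this example). It shows how R parses the column name from the question with and without backticks, which appears to be what leads to the "could not find function" error, and then uses backticks to create and read a column whose name contains a space.

```r
# Without backticks, R parses the text as a call to a function
# named BILLNG.STATUS. with one argument (a division expression).
unquoted <- quote(BILLNG.STATUS.(COMPLETED./.INCOMPLTE))
is.call(unquoted)            # TRUE
as.character(unquoted[[1]])  # "BILLNG.STATUS."

# With backticks, the whole text is a single symbol (a name).
quoted <- quote(`BILLNG.STATUS.(COMPLETED./.INCOMPLTE)`)
is.name(quoted)              # TRUE

# Backticks also let you create and refer to names with whitespace.
sales <- data.frame(`order no` = c(1, 1, 2), check.names = FALSE)
sales$`order value` <- c(10, 20, 30)
names(sales)                 # "order no" "order value"
sum(sales$`order value`)     # 60
```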
