dplyr, R: Counting a specific value in multiple columns at once

Question

I have a data frame:

md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
      device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md

I want to count the number of 5s in each column, by device. I can do it like this:

library(dplyr)
group_by(md, device) %>%
  summarise(counts.a = sum(a == 5, na.rm = TRUE),
            counts.b = sum(b == 5, na.rm = TRUE),
            counts.c = sum(c == 5, na.rm = TRUE))

However, in real life I'll have a lot of variables (myvars can be very long), so I can't write out counts.a, counts.b, etc. by hand dozens of times.

Does dplyr allow running the count of 5s on all myvars columns at once?

Thanks!

Recommended answer

If you care about the names starting with "counts.", you could do it like this in a dplyr pipe:

md %>%
  group_by(device) %>%
  summarise_each_(funs(sum(.==5,na.rm=TRUE)), myvars) %>%
  setNames(c(names(.)[1], paste0("counts.", myvars)))
#Source: local data frame [3 x 4]
#
#  device counts.a counts.b counts.c
#1      1        1        2        0
#2      2        0        1        0
#3      3        1        0        2
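
For readers on a newer dplyr: summarise_each_() and funs() have since been deprecated, and from dplyr 1.0.0 onward the same per-group count can be written with across(). The following is a minimal sketch under that assumption, not part of the original answer. It rebuilds the md data frame from the question so it runs on its own; the result name `counts` is only an illustrative choice. The `.names` argument builds the "counts." prefix directly, so the setNames() step is not needed.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, where across() is available

# Same example data as in the question
md <- data.frame(a = c(3, 5, 4, 5, 3, 5), b = c(5, 5, 5, 4, 4, 1),
                 c = c(1, 3, 4, 3, 5, 5), device = c(1, 1, 2, 2, 3, 3))
myvars <- c("a", "b", "c")
md[2, 3] <- NA
md[4, 1] <- NA

# Count the 5s in every column named in myvars, per device.
# all_of() selects columns from a character vector;
# .names prefixes each output column with "counts.".
counts <- md %>%
  group_by(device) %>%
  summarise(across(all_of(myvars),
                   ~ sum(.x == 5, na.rm = TRUE),
                   .names = "counts.{.col}"))

print(counts)
```

Because myvars is an ordinary character vector, it can hold any number of column names without changing the pipeline.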

There's another Q&A about how to name new columns produced by dplyr's mutate_each (which behaves the same way as summarise_each) here: "mutate_each in dplyr: how do I select certain columns and give new names to mutated columns?".
