This article explains how to add a column name to vars() inside a dplyr function.

Problem description

I have a function that summarizes a variable by user-defined groups, using dplyr:

library(tidyverse)

get_var_summary <- function(.data, .target_var, .group_vars = vars()) {
  .target_var = enquo(.target_var)
  return(
    .data %>%
      filter(!is.na(!! .target_var)) %>%
      group_by_at(.vars = .group_vars) %>%
      summarize(
        mean = mean(!! .target_var),
        sd = sd(!! .target_var),
        ci = qnorm(0.975) * sd(!! .target_var) / sqrt(n()),
        median = median(!! .target_var),
        n = n()
      ) %>%
      mutate(
        sd = ifelse(is.na(sd), Inf, sd),
        ci = ifelse(is.na(ci), Inf, ci)
      ) %>%
      ungroup()
  )
}

mtcars %>%
  get_var_summary(wt, .group_vars = vars(cyl))

Returns:

# A tibble: 3 x 6
    cyl  mean    sd    ci median     n
  <dbl> <dbl> <dbl> <dbl>  <dbl> <int>
1    4.  2.29 0.570 0.337   2.20    11
2    6.  3.12 0.356 0.264   3.22     7
3    8.  4.00 0.759 0.398   3.76    14

Now, to make it easy to reuse the same .group_vars while occasionally supplying one additional grouping variable, I would like to define another function that calls get_var_summary with one extra column added to .group_vars:

get_var_summary_by_another <- function(.data, .extra_var, .target_var, .group_vars = vars()) {

  # how do I add .extra_var to .group_vars?

}

How can I do this?

Recommended answer

The idea is to first splice .group_vars with !!!, then add .extra_var to a new vars() call:

get_var_summary_by_another <- function(.data, .extra_var, .target_var, .group_vars = vars()) {
  .extra_var = enquo(.extra_var)
  .target_var = enquo(.target_var)
  .group_vars = vars(!!! .group_vars, !! .extra_var)
  return(
    .data %>% get_var_summary(
      !! .target_var,
      .group_vars
    )
  )
}

mtcars %>%
  get_var_summary_by_another(gear, .target_var = wt, .group_vars = vars(cyl))

Returns:

# A tibble: 8 x 7
    cyl  gear  mean      sd      ci median     n
  <dbl> <dbl> <dbl>   <dbl>   <dbl>  <dbl> <int>
1    4.    3.  2.46 Inf     Inf       2.46     1
2    4.    4.  2.38   0.601   0.416   2.26     8
3    4.    5.  1.83   0.443   0.614   1.83     2
4    6.    3.  3.34   0.173   0.240   3.34     2
5    6.    4.  3.09   0.413   0.405   3.16     4
6    6.    5.  2.77 Inf     Inf       2.77     1
7    8.    3.  4.10   0.768   0.435   3.81    12
8    8.    5.  3.37   0.283   0.392   3.37     2
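
To see what the splice produces on its own, the following minimal sketch builds the combined list outside any function. The names base_vars, extra, combined, labels, only_extra and counts are illustrative and not part of the original answer, and rlang::quo() stands in here for the enquo() call that the function uses on its argument.

```r
library(dplyr)

# Existing grouping columns, as a caller would pass them in .group_vars
base_vars <- vars(cyl)

# Outside a function, quo() captures a bare column name the way
# enquo() captures a function argument (assumption: illustrative stand-in)
extra <- rlang::quo(gear)

# !!! splices every element of base_vars; !! unquotes the single extra quosure
combined <- vars(!!!base_vars, !!extra)

# Inspect the result: one label per grouping column
labels <- unname(vapply(combined, rlang::as_label, character(1)))
print(labels)

# With the default empty vars(), the splice contributes nothing,
# so only the extra column remains
only_extra <- vars(!!!vars(), !!extra)
print(length(only_extra))

# The combined list can be handed to group_by_at() directly
counts <- mtcars %>%
  group_by_at(.vars = combined) %>%
  summarize(n = n()) %>%
  ungroup()
print(counts)
```

Because vars() accepts the same quasiquotation operators as other tidy-eval functions, the wrapper does not need to know how many grouping columns the caller supplied; an empty .group_vars simply yields grouping by .extra_var alone.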
