Excluding columns in dplyr mutate_at while still using the data in those columns

Problem description

I want to rescale all variables (except year and gender) in a data frame relative to one specific year, grouped by gender:

set.seed(1)
df <- data.frame(gender = c(rep("m", 5), rep("f", 5)), year = rep(1:5, 2), var_a = 1:10, var_b = 0:9)
df

   gender year var_a var_b
1       m    1     1     0
2       m    2     2     1
3       m    3     3     2
4       m    4     4     3
5       m    5     5     4
6       f    1     6     5
7       f    2     7     6
8       f    3     8     7
9       f    4     9     8
10      f    5    10     9

I can generate the result I expect using:

df %>% group_by(gender) %>% mutate(var_a = ifelse(year == 3, 0, var_a - var_a[year == 3])) %>%
  mutate(var_b = ifelse(year == 3, 0, var_b - var_b[year == 3]))

   gender  year var_a var_b
   <fct>  <int> <dbl> <dbl>
 1 m          1    -2    -2
 2 m          2    -1    -1
 3 m          3     0     0
 4 m          4     1     1
 5 m          5     2     2
 6 f          1    -2    -2
 7 f          2    -1    -1
 8 f          3     0     0
 9 f          4     1     1
10 f          5     2     2

However, this is not an option since I have too many columns.

So I tried (with no success):

df %>% group_by(gender) %>% mutate_at(vars(-gender, -year), ifelse(year == 3, 0, var_a - var_a[year == 3]))



How can I exclude columns in mutate_at using vars(-col_name) (or an alternative) while still reading the data in those columns?

Recommended answer

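If you add a ~ before the expression, you should get the desired output. The ~ turns the expression into a lambda-style function in which . stands for the column currently being transformed. Columns such as year can still be read inside it, even though vars() excludes them from the transformation. Without the ~, the ifelse() call is evaluated as an ordinary argument outside the data frame, where year is not defined.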

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
set.seed(1)
df <- data.frame(gender = c(rep("m", 5),
                            rep("f", 5)), 
                 year = rep(1:5, 2), var_a = 1:10, var_b = 0:9)
df
#>    gender year var_a var_b
#> 1       m    1     1     0
#> 2       m    2     2     1
#> 3       m    3     3     2
#> 4       m    4     4     3
#> 5       m    5     5     4
#> 6       f    1     6     5
#> 7       f    2     7     6
#> 8       f    3     8     7
#> 9       f    4     9     8
#> 10      f    5    10     9

df %>%
  group_by(gender) %>% 
  mutate_at(vars(-gender, -year),
            ~ifelse(year == 3, 0, . - .[year == 3]))
#> # A tibble: 10 x 4
#> # Groups:   gender [2]
#>    gender  year var_a var_b
#>    <fct>  <int> <dbl> <dbl>
#>  1 m          1    -2    -2
#>  2 m          2    -1    -1
#>  3 m          3     0     0
#>  4 m          4     1     1
#>  5 m          5     2     2
#>  6 f          1    -2    -2
#>  7 f          2    -1    -1
#>  8 f          3     0     0
#>  9 f          4     1     1
#> 10 f          5     2     2

Edit:
In older versions of dplyr you would use funs(), but it is soft-deprecated as of dplyr 0.8.0:

df %>%
  group_by(gender) %>% 
  mutate_at(vars(-gender, -year),
            funs(ifelse(year == 3, 0, . - .[year == 3])))
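
As a possible alternative, dplyr 1.0.0 and later supersede mutate_at() with across(). The following is a minimal sketch under that version assumption and is not part of the original answer. It excludes only year by hand, because across() leaves grouping columns such as gender out of its selection automatically. It also drops the ifelse(), since subtracting the year-3 value from itself already gives 0 for that row. The name rescaled is purely illustrative.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, where across() is available

df <- data.frame(gender = c(rep("m", 5), rep("f", 5)),
                 year = rep(1:5, 2), var_a = 1:10, var_b = 0:9)

# `rescaled` is a hypothetical name used only for this illustration.
rescaled <- df %>%
  group_by(gender) %>%
  # gender is a grouping column, so across() skips it automatically;
  # only year has to be excluded explicitly.
  # .x is the current column; year is still readable inside the lambda.
  mutate(across(-year, ~ .x - .x[year == 3])) %>%
  ungroup()

rescaled
```

This should produce the same values as the mutate_at() version above, and it keeps working unchanged if more var_* columns are added to the data frame.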

10-27 21:57