Problem description
Recently I stumbled upon a strange behaviour of dplyr and I would be happy if somebody could provide some insights.
Assume I have a data frame in which some columns contain numerical values. In a simple scenario I would like to compute rowSums. Although there are many ways to do it, here are two examples:
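library(dplyr)  # attaches %>% and the dplyr verbs used below
# 10 ids, so the id column matches the 10 rows of the matrix
df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:10, sep = ""),
                 stringsAsFactors = FALSE)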
# works
dplyr::select(df, - ids) %>% {rowSums(.)}
# does not work
# Error: invalid argument to unary operator
df %>%
dplyr::mutate(blubb = dplyr::select(df, - ids) %>% {rowSums(.)})
# does not work
# Error: invalid argument to unary operator
df %>%
dplyr::mutate(blubb = dplyr::select(., - ids) %>% {rowSums(.)})
# workaround:
tmp <- dplyr::select(df, - ids) %>% {rowSums(.)}
df %>%
dplyr::mutate(blubb = tmp)
# works
rowSums(dplyr::select(df, - ids))
# does not work
# Error: invalid argument to unary operator
df %>%
dplyr::mutate(blubb = rowSums(dplyr::select(df, - ids)))
# workaround
tmp <- rowSums(dplyr::select(df, - ids))
df %>%
dplyr::mutate(blubb = tmp)
First, I don't really understand what is causing the error, and second, I would like to know how to compute over some (viable) columns in a tidy way.
edit
The question "mutate and rowSums exclude columns", although related, focuses on using rowSums for the computation. Here I want to understand why the examples above do not work. It is not so much about how to solve the problem (see the workarounds) as about understanding what happens when the naive approach is applied.
The examples do not work because you are nesting select in mutate and using bare variable names. In this case, select is trying to do something like
> -df$ids
Error in -df$ids : invalid argument to unary operator
which fails because you can't negate a character string (i.e. -"i1" or -"i2" makes no sense). Any of the formulations below works:
df %>% mutate(blubb = rowSums(select_(., "X1", "X2")))
df %>% mutate(blubb = rowSums(select(., -3)))
or
df %>% mutate(blubb = rowSums(select_(., "-ids")))
as suggested by @Haboryme.
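The underlying failure can also be reproduced without dplyr. The sketch below uses base R only and assumes a data frame shaped like the one in the question, with 10 rows. The seed and the names `neg_ids` and `blubb` are illustrative and not part of the original post. It shows that unary minus raises an error on a character column, while excluding the same column by position works.

```r
# Data frame mirroring the question: two numeric columns (X1, X2)
# plus a character id column. The seed is only for reproducibility.
set.seed(1)
df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste0("i", 1:10),
                 stringsAsFactors = FALSE)

# Unary minus is not defined for character vectors, so negating the
# id column raises an error; tryCatch captures it instead of stopping.
neg_ids <- tryCatch(-df$ids, error = function(e) e)
print(inherits(neg_ids, "error"))

# Excluding the id column by position avoids negating any strings.
blubb <- rowSums(df[, -3])
print(length(blubb))
```

This corresponds to the `select(., -3)` formulation above: a negative column position is a plain number, so nothing ever tries to negate the strings in `ids`.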