Question
There are some similar questions on SO, but none match my use case exactly. I have a dataset whose columns are laid out as shown below:
Id Description Value
10 Cat 19
10 Cat 20
10 Cat 5
10 Cat 13
11 Cat 17
11 Cat 23
11 Cat 7
11 Cat 14
10 Dog 19
10 Dog 20
10 Dog 5
10 Dog 13
11 Dog 17
11 Dog 23
11 Dog 7
11 Dog 14
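For reproducibility, the table above can be rebuilt as a data frame. This is a minimal sketch: the name `dat` is an assumption borrowed from the answer's code (the question's own snippet calls it `df`), and the column types (numeric `Id` and `Value`, character `Description`) are guesses from the printed layout.

```r
# Rebuild the sample data shown above (assumed name: dat).
dat <- data.frame(
  Id          = rep(rep(c(10, 11), each = 4), times = 2),
  Description = rep(c("Cat", "Dog"), each = 8),
  Value       = rep(c(19, 20, 5, 13, 17, 23, 7, 14), times = 2)
)

# Inspect the structure: 16 rows, 3 columns.
str(dat)
```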
What I am trying to do is compute the mean of the Value column for each combination of Id and Description, with one column per Description. The final dataset would look like this:
Id Cat Dog
10 14.25 14.25
11 15.25 15.25
I can do this in a rough, inefficient way, like this:
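library(dplyr)
library(stringr)
# Mean of Value per Id for the "Cat" rows only; this has to be repeated for each Description
tempdf1 <- df %>%
  filter(str_detect(Description, "Cat")) %>%
  group_by(Id, Description) %>%
  summarize(Mean_Value = mean(Value, na.rm = TRUE))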
This is not very convenient. Any advice on how to accomplish the expected result more efficiently is much appreciated.
Recommended answer
Use dcast, or even acast, from the reshape2 package:
library(reshape2)
dcast(dat, Id ~ Description, fun.aggregate = mean, value.var = "Value")
Id Cat Dog
1 10 14.25 14.25
2 11 15.25 15.25
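The acast variant mentioned above is not shown in the answer, so here is a hedged sketch of how it might look. It assumes the reshape2 package is installed and rebuilds the sample data under the assumed name `dat`. Unlike dcast, acast returns a matrix, with the Id values as row names rather than as a column.

```r
library(reshape2)

# Sample data from the question (assumed name: dat).
dat <- data.frame(
  Id          = rep(rep(c(10, 11), each = 4), times = 2),
  Description = rep(c("Cat", "Dog"), each = 8),
  Value       = rep(c(19, 20, 5, 13, 17, 23, 7, 14), times = 2)
)

# acast aggregates with mean and returns a numeric matrix:
# rows are Id values, columns are Description values.
m <- acast(dat, Id ~ Description, fun.aggregate = mean, value.var = "Value")
print(m)
```

A matrix is convenient for lookups such as `m["10", "Dog"]`, while the dcast data frame is usually easier to pass on to further data-frame operations.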
In base R, which may be a bit longer:
reshape(aggregate(.~Id+Description,dat,mean),direction = "wide",v.names = "Value",idvar = "Id",timevar = "Description")
Id Value.Cat Value.Dog
1 10 14.25 14.25
2 11 15.25 15.25
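Since the question already uses dplyr, the same result can probably be reached without leaving the tidyverse. The sketch below is not part of the original answer: it assumes dplyr (1.0.0 or later, for the `.groups` argument) and tidyr (1.0.0 or later, for `pivot_wider`) are installed, and it rebuilds the sample data under the assumed name `dat`.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (assumed name: dat).
dat <- data.frame(
  Id          = rep(rep(c(10, 11), each = 4), times = 2),
  Description = rep(c("Cat", "Dog"), each = 8),
  Value       = rep(c(19, 20, 5, 13, 17, 23, 7, 14), times = 2)
)

wide <- dat %>%
  # Step 1: aggregate, one mean per (Id, Description) pair.
  group_by(Id, Description) %>%
  summarise(Mean_Value = mean(Value, na.rm = TRUE), .groups = "drop") %>%
  # Step 2: reshape long to wide, one column per Description.
  pivot_wider(names_from = Description, values_from = Mean_Value)

print(wide)
```

Aggregating first and reshaping second keeps the two steps explicit, and it handles every Description value in one pass instead of filtering for each one separately.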