Follow-up: matching factor levels with shared unique row names in R
Question
This is a follow-up to an earlier post. How do we use the dplyr or data.table package to match factor levels with shared row names?
library(data.table)
(DT = data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
c = as.factor(c("bob", "mary", "bob", "george", "alice")), key="a"))
# a b c
# 1: A 2 bob
# 2: A 4 mary
# 3: B 5 bob
# 4: C 6 george
# 5: H 7 alice
...and using @frank's great answer:
uc <- sort(unique(as.character(DT$c)))
( DT[,(uc):=lapply(uc,function(x)ifelse(c==x,b,NA))][,c('b','c'):=NULL] )
This returns:
# a alice bob george mary
# 1 A NA 2 NA NA
# 2 A NA NA NA 4
# 3 B NA 5 NA NA
# 4 C NA NA 6 NA
# 5 H 7 NA NA NA
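One thing to watch in the call above: `:=` updates a data.table by reference, so after it runs DT itself no longer has columns b and c. The following is a minimal sketch, assuming you want to keep the original DT for later steps; the name `wide_na` is hypothetical and only used for illustration.

```r
library(data.table)

DT <- data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")),
                 key = "a")

uc <- sort(unique(as.character(DT$c)))

# copy() makes a full copy, so the := calls below modify the copy
# and DT keeps its b and c columns.
wide_na <- copy(DT)[, (uc) := lapply(uc, function(x) ifelse(c == x, b, NA))][
  , c("b", "c") := NULL]

names(DT)       # still includes "b" and "c"
names(wide_na)  # "a" followed by one column per level of c
```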
The final question: how do we get the output below, where each unique row name appears only once, the level values that share a row name are combined into that single row, and the remaining empty cells are NA?
# a alice bob george mary
# 1 A NA 2 NA 4
# 2 B NA 5 NA NA
# 3 C NA NA 6 NA
# 4 H 7 NA NA NA
Recommended answer
Using tidyr:
library(tidyr)
spread(DT, c, b)
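Note that `spread(DT, c, b)` needs DT in its original long form, with columns b and c still present. Since the question also asks about data.table, here is a sketch of the same long-to-wide reshape with `dcast()` from data.table. It rebuilds DT from the question's definition first; the name `wide` is hypothetical.

```r
library(data.table)

# Rebuild DT as defined in the question (long form, with columns b and c).
DT <- data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")),
                 key = "a")

# One row per value of a, one column per level of c, filled with b.
# Combinations that do not occur in DT are left as NA.
wide <- dcast(DT, a ~ c, value.var = "b")
print(wide)
```

Each (a, c) pair occurs at most once in this data, so no aggregation function is needed. In current tidyr releases `spread()` is superseded by `pivot_wider()`, although `spread()` still works.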