Problem description
mexico <- c(1, 2, 5, 1, NA, 1)
argentina <- c(2, 2, 2, 2, NA, 2)
italy <- c(NA, 10, 10, 10, NA, 10)
spain <- c(NA, NA, 11, 11, 11, 11)
england <- c(5, NA, 10, NA, NA, 12)
germany <- c(1, NA, NA, NA, NA, 10)
# Use the lowercase vector names defined above (R is case-sensitive)
Data_Risk <- data.frame(mexico, argentina, italy, spain, england, germany)
Data_Risk
which gives
  mexico argentina italy spain england germany
1      1         2    NA    NA       5       1
2      2         2    10    NA      NA      NA
3      5         2    10    11      10      NA
4      1         2    10    11      NA      NA
5     NA        NA    NA    11      NA      NA
6      1         2    10    11      12      10
In this case I do not want to consider the NA cells, so I tried this:
library(data.table)
Data_Risk <- as.data.table(Data_Risk)
# TRUE for each column that is not NA in the first row
my_c <- !apply(Data_Risk, 1, is.na)[,1]
# First row only
my_L <- Data_Risk[1]
as.data.frame(my_L)[my_c]
Result:
  mexico argentina england germany
1      1         2       5       1
However, I need this for every row, not just one. For each row, the names of the non-NA columns should be placed in new columns, ignoring the values themselves, so the final table has to look like this:
var1   var2      var3    var4    var5    var6
mexico argentina england germany null    null
mexico argentina italy   null    null    null
mexico argentina italy   spain   england null
mexico argentina italy   spain   null    null
spain  null      null    null    null    null
mexico argentina italy   spain   england germany
Recommended answer
One option is to look at which(!is.na(Data_Risk), arr.ind = TRUE), which returns one (row, col) pair per non-NA cell, and reshape that from long to wide form. Within each row, replace the col variable with order(col) so that it becomes a position (var1, var2, ...), and add a colnm column holding the column name to use as the value.var in the long-to-wide (dcast) step.
library(data.table)
library(magrittr)  # for %>%
# One (row, col) pair per non-NA cell
nms <- as.data.table(which(!is.na(Data_Risk), arr.ind = TRUE))
# Per row: look up the column name and number the non-NA columns var1, var2, ...
nms[, .(colnm = names(Data_Risk)[col], col = paste0('var', order(col)))
    , by = row] %>%
  dcast(row ~ col, value.var = 'colnm')
#    row   var1      var2    var3    var4    var5    var6
# 1:   1 mexico argentina england germany    <NA>    <NA>
# 2:   2 mexico argentina   italy    <NA>    <NA>    <NA>
# 3:   3 mexico argentina   italy   spain england    <NA>
# 4:   4 mexico argentina   italy   spain    <NA>    <NA>
# 5:   5  spain      <NA>    <NA>    <NA>    <NA>    <NA>
# 6:   6 mexico argentina   italy   spain england germany
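To make the first step easier to follow, here is a small sketch of what the which(..., arr.ind = TRUE) call produces on the example data. It rebuilds Data_Risk with the lowercase names from the question; the helper names idx and first_row_cols are only illustrative. Because which() walks the matrix column by column, the col values within any one row come out in ascending order, which is why order(col) simply yields 1, 2, 3, ... in the answer above.

```r
# Rebuild the example data from the question (lowercase column names)
Data_Risk <- data.frame(
  mexico    = c(1, 2, 5, 1, NA, 1),
  argentina = c(2, 2, 2, 2, NA, 2),
  italy     = c(NA, 10, 10, 10, NA, 10),
  spain     = c(NA, NA, 11, 11, 11, 11),
  england   = c(5, NA, 10, NA, NA, 12),
  germany   = c(1, NA, NA, NA, NA, 10)
)

# Matrix with columns "row" and "col": one line per non-NA cell
idx <- which(!is.na(Data_Risk), arr.ind = TRUE)
nrow(idx)  # 23 non-NA cells in the example data

# Column positions that are non-NA in the first row, then their names
first_row_cols <- idx[idx[, "row"] == 1, "col"]
names(Data_Risk)[first_row_cols]
```

The last line returns the column names for row 1 only; the dcast step in the answer does the same lookup for every row and lays the names out side by side.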
Equivalent dplyr code (spread comes from tidyr):
library(dplyr)
library(tidyr)  # spread() lives in tidyr, not dplyr
nms <- as.data.frame(which(!is.na(Data_Risk), arr.ind = TRUE))
nms %>%
  group_by(row) %>%
  mutate(colnm = names(Data_Risk)[col],
         col = paste0('var', order(col))) %>%
  spread(col, value = colnm) %>%
  ungroup()
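For comparison, the same table can be sketched in base R without any packages. This is not part of the original answer, only an alternative under the assumption that the data is all numeric (so apply() can safely convert it to a matrix); the names n, res_mat and res are made up for the example.

```r
# Rebuild the example data from the question (lowercase column names)
Data_Risk <- data.frame(
  mexico    = c(1, 2, 5, 1, NA, 1),
  argentina = c(2, 2, 2, 2, NA, 2),
  italy     = c(NA, 10, 10, 10, NA, 10),
  spain     = c(NA, NA, 11, 11, 11, 11),
  england   = c(5, NA, 10, NA, NA, 12),
  germany   = c(1, NA, NA, NA, NA, 10)
)

n <- ncol(Data_Risk)

# For each row: keep the names of the non-NA columns, then pad with NA
# so that every row ends up with exactly n entries
res_mat <- t(apply(Data_Risk, 1, function(r) {
  present <- names(Data_Risk)[!is.na(r)]
  c(present, rep(NA_character_, n - length(present)))
}))
colnames(res_mat) <- paste0("var", seq_len(n))

res <- as.data.frame(res_mat, stringsAsFactors = FALSE)
res
```

Padding every row to the same length is what lets apply() return a rectangular matrix; without it, rows with different numbers of non-NA columns would come back as a list.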