Problem description
Hi, I am adding labels to my data frame manually, as shown below. I have 800 columns to label. After that I create subsets of the data frame (there are many subsets), then pass each one to a function for calculation.
The labels can differ between chunks, and creating labels one by one for every chunk is very time-consuming.
data<-data.frame( col1=c(1,1,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,1,1,1,NA,1,1,NA,NA,NA,NA,1,NA,NA,NA,NA,1,NA,1),
col2=c(1,1,1,1,1,NA,NA,NA,NA,1,1,1,1,1,NA,NA,NA,1,1,1,NA,1,1,1,1,1,NA,NA,NA,1,1,1,1,1,1,1,NA,NA,NA),
col3=c(1,1,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,1,1,1,NA,NA,NA,1,NA,NA,1,1,1,1,1,NA,NA,1),
col4=c(1,NA,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
col5=c(1,2,1,1,1,2,1,2,2,1,2,NA,1,1,2,2,2,1,1,1,2,NA,2,1,1,1,2,2,2,NA,1,2,2,1,1,1,2,2,2)
)
data$col5<-factor(data$col5, levels=c(1,2), labels=c("Local","Overseas"))
df<- data
df$cc1<-1
df2<- subset(df, col5 == 'Local')
df$cc2<-ifelse(df$col5 == 'Local',1,NA)
lst<-list(df$cc1, df$cc2)
ldat<-list("ALL" = df, "Local" =df2)
col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")
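library(dplyr)   # mutate
library(purrr)   # map, map_chr
library(rlang)   # parse_exprs
# faclist (not shown here) is a named list: column name -> label(s)
# Build the text of a factor() call for the x-th entry of faclist
make_mutator <- function(x) {
  paste0(
    "factor(", names(faclist)[[x]],
    ",labels=c('",
    paste0(faclist[[x]], collapse = "','"),
    "'))"
  )
}
list_of_fac <- purrr::map_chr(seq_along(faclist), make_mutator)
names(list_of_fac) <- names(faclist)
# Parse the strings into expressions and splice them into mutate()
ldat <- purrr::map(ldat, ~ mutate(., !!!parse_exprs(list_of_fac)))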
This works fine for me, but I would like a solution for the case where the columns and the labels are given as two separate vectors, like
col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")
How should I change my function for this?
Recommended answer
Instead of parsing expressions, an easier option is to loop over the list of data frames with map and, inside that loop, use map2. map2 receives the columns of interest and the labels to apply to them, both taken from the named list 'faclist'.
library(dplyr)
library(purrr)
ldat1 <- map(ldat, ~ {
  # in the inner lambda, .x is one column and .y is its label from faclist
  .x[names(faclist)] <- map2(
    .x %>% dplyr::select(all_of(names(faclist))),
    faclist, ~ factor(.x, labels = .y))
  .x
})
Output:
str(ldat1[[1]])
#'data.frame': 39 obs. of 7 variables:
# $ col1: Factor w/ 1 level "Sales": 1 1 NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 1 NA NA NA NA 1 ...
# $ col3: Factor w/ 1 level "Management": 1 1 NA NA NA NA NA 1 NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA NA ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 2 1 1 1 2 1 2 2 1 ...
# $ cc1 : num 1 1 1 1 1 1 1 1 1 1 ...
# $ cc2 : num 1 NA 1 1 1 NA 1 NA NA 1 ...
str(ldat1[[2]])
#'data.frame': 18 obs. of 6 variables:
# $ col1: Factor w/ 1 level "Sales": 1 NA NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 NA 1 1 1 1 1 ...
# $ col3: Factor w/ 1 level "Management": 1 NA NA NA NA NA NA NA NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA 1 ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 1 1 1 1 1 1 1 1 1 ...
# $ cc1 : num 1 1 1 1 1 1 1 1 1 1 ...
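For reference, here is a minimal self-contained sketch of the same map/map2 pattern on a toy input. The contents of 'faclist' are an assumption, inferred from the labels visible in the output above (its original definition is not shown), and 'toy' is a hypothetical stand-in for ldat.

```r
library(purrr)

# Assumed shape of faclist: names are the columns to convert,
# values are the labels to apply (inferred, not from the original post).
faclist <- list(col1 = "Sales", col2 = "OPS")

# Hypothetical stand-in for ldat: a named list of data frames.
toy <- list(
  ALL   = data.frame(col1 = c(1, NA, 1), col2 = c(NA, 1, 1), cc1 = 1),
  Local = data.frame(col1 = c(1, NA),    col2 = c(1, 1),     cc1 = 1)
)

toy1 <- map(toy, function(d) {
  # Pair each selected column with its label and rebuild it as a factor.
  d[names(faclist)] <- map2(d[names(faclist)], faclist,
                            function(col, lab) factor(col, labels = lab))
  d
})

levels(toy1$ALL$col1)
str(toy1$Local)
```

Columns not named in faclist (cc1 here) are left untouched, which matches the cc1 and cc2 columns staying numeric in the output above.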
If the input is not a named list but two vectors, just replace names(faclist) with the 'col_names' vector and the list 'faclist' with the 'labels' vector.
ldat1 <- map(ldat, ~ {
  # col_names and labels must have the same length and order
  .x[col_names] <- map2(
    .x %>% dplyr::select(all_of(col_names)),
    labels, ~ factor(.x, labels = .y))
  .x
})
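A hedged sketch of the two-vector variant on toy data follows. The helper name 'label_columns' and the toy values are hypothetical, and the length check is an added safeguard rather than part of the answer above. The last line illustrates a base R behaviour worth keeping in mind: when a single label is supplied for a column with more than one distinct value, factor() uses the label as a prefix and appends a sequence number.

```r
library(purrr)

# Hypothetical helper: convert the named columns of one data frame to factors.
label_columns <- function(d, col_names, labels) {
  # Guard against mismatched vectors (added here as an assumption).
  stopifnot(length(col_names) == length(labels))
  d[col_names] <- map2(d[col_names], labels,
                       function(col, lab) factor(col, labels = lab))
  d
}

# Toy vectors standing in for the full col_names / labels.
col_names <- c("col1", "col2")
labels    <- c("Sales", "Ops")

toy <- list(
  ALL   = data.frame(col1 = c(1, NA, 1), col2 = c(NA, 1, 1)),
  Local = data.frame(col1 = c(1, NA),    col2 = c(1, 1))
)

toy1 <- map(toy, function(d) label_columns(d, col_names, labels))
str(toy1$ALL)

# One label, two distinct values: the label becomes a prefix.
levels(factor(c(1, 2), labels = "Sales"))
```

Wrapping the per-data-frame step in a named function keeps the outer map call short and makes the step easy to reuse on a single data frame.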