Problem description
This question ties in with an earlier one, and at @Akrun's request I'm asking for something similar.
Essentially, if I pass a data frame through the following conditional:
if(length(weight) > 0) {weight %>%
select(where(negate(is.numeric))) %>%
map_dfc(~ model.matrix(~ .x -1) %>%
as_tibble) %>%
rename_all(~ str_remove(., "\\.x")) %>%
bind_rows(weight, .)
}
Assignment:
# Following @Akrun's suggestion for turning numeric columns into factors:
# i1 <- sapply(weight, is.numeric); df[i1] <- lapply(weight[i1], factor)
# and then use Filter(function(x) is.factor(x) | is.character(x), weight)
test = function(data) {
x = as.data.frame(Reduce(cbind, lapply(data, function(col) model.matrix(~ . -1, data = data.frame(col)))))
setNames(x, sub(pattern = "^col", replacement = "", names(x)))
}
test(weight)
#Missing column names
1 64 57 8 1 0 0 1 0
2 71 59 10 1 0 0 1 0
3 53 49 6 1 0 0 1 0
4 67 62 11 1 0 0 1 0
5 55 51 8 0 0 1 1 0
6 58 50 7 0 0 1 1 0
7 77 55 10 0 0 1 0 1
8 57 48 9 0 0 1 0 1
9 56 42 10 0 1 0 0 1
10 51 42 6 0 1 0 0 1
11 76 61 12 0 1 0 0 1
12 68 57 9 0 1 0 0 1
Then, if weight has factor columns, each factor column is split into one indicator column per level, with 1 where that level appeared and 0 elsewhere.
However, if I input a numeric-only data frame, it returns character(0). The question is how to give the following function a conditional so that, for a data frame x, it returns x as it is when x is all numeric, and returns the requested output when x contains factors.
The reason I ask is that I want to use this inside another function that will handle many data frames, some with only numeric columns and others with factors. In that case, I can refer to the data frame as x within the function.
My attempt at the function:
fact_col <- function(x){
if(length(x) > 0) {
weight_sub <- x %>%
select(where(is.factor))
weight_sub %>%
map_dfc(~ model.matrix(~ .x -1) %>%
as_tibble) %>%
rename_all(~ str_remove(., "\\.x")) %>%
bind_cols(weight_sub, .) -> x
x<- x%>% select(!where(is.factor))
x<- data.frame(sapply(x, as.numeric))
}}
Expected output:
#when x is numeric
function(x) { ... }
Richness pat
1 20 1
2 17 2
3 18 3
4 19 4
5 11 5
6 15 6
7 17 7
8 15 8
9 15 9
10 9 10
11 13 11
12 14 12
#when x is a factor
function(x) { ... }
wgt hgt age id sex black brown white female male
1 64 57 8 black female 1 0 0 1 0
2 71 59 10 black female 1 0 0 1 0
3 53 49 6 black female 1 0 0 1 0
4 67 62 11 black female 1 0 0 1 0
5 55 51 8 white female 0 0 1 1 0
6 58 50 7 white female 0 0 1 1 0
7 77 55 10 white male 0 0 1 0 1
8 57 48 9 white male 0 0 1 0 1
9 56 42 10 brown male 0 1 0 0 1
10 51 42 6 brown male 0 1 0 0 1
11 76 61 12 brown male 0 1 0 0 1
12 68 57 9 brown male 0 1 0 0 1
Reproducible code:
structure(list(wgt = c(64L, 71L, 53L, 67L, 55L, 58L, 77L, 57L,
56L, 51L, 76L, 68L), hgt = c(57L, 59L, 49L, 62L, 51L, 50L, 55L,
48L, 42L, 42L, 61L, 57L), age = c(8L, 10L, 6L, 11L, 8L, 7L, 10L,
9L, 10L, 6L, 12L, 9L), id = structure(c(1L, 1L, 1L, 1L, 3L, 3L,
3L, 3L, 2L, 2L, 2L, 2L), .Label = c("black", "brown", "white"
), class = "factor"), sex = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("female", "male"), class = "factor")), class = "data.frame", row.names = c(NA,
-12L))
Recommended answer
One option is to split the code before the if: select the factor columns into a new object ('weight_sub'), then check the length of 'weight_sub'. If it is greater than 0, run the rest of the model.matrix steps and assign the result back to 'weight'.
weight_sub <- weight %>%
select(where(is.factor))
if(length(weight_sub) > 0) {
weight_sub %>%
map_dfc(~ model.matrix(~ .x -1) %>%
as_tibble) %>%
rename_all(~ str_remove(., "\\.x")) %>%
bind_cols(weight, .) -> weight
}
-output
# wgt hgt age id sex black brown white female male
#1 64 57 8 black female 1 0 0 1 0
#2 71 59 10 black female 1 0 0 1 0
#3 53 49 6 black female 1 0 0 1 0
#4 67 62 11 black female 1 0 0 1 0
#5 55 51 8 white female 0 0 1 1 0
#6 58 50 7 white female 0 0 1 1 0
#7 77 55 10 white male 0 0 1 0 1
#8 57 48 9 white male 0 0 1 0 1
#9 56 42 10 brown male 0 1 0 0 1
#10 51 42 6 brown male 0 1 0 0 1
#11 76 61 12 brown male 0 1 0 0 1
#12 68 57 9 brown male 0 1 0 0 1
As a negative test, run the same code but select the character class columns instead:
weight_sub <- weight %>%
select(where(is.character))
if(length(weight_sub) > 0) {
weight_sub %>%
map_dfc(~ model.matrix(~ .x -1) %>%
as_tibble) %>%
rename_all(~ str_remove(., "\\.x")) %>%
bind_cols(weight, .) -> weight
}
There is no output because the if condition returns FALSE ('weight' has no character columns), so the 'weight' dataset remains the same and no new columns are added.
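The same check can be wrapped in a function so that a numeric-only data frame is returned as is. The following is a minimal base R sketch, not the answer's own code: the name add_factor_dummies is hypothetical, and it assumes the factor columns contain no NA values (model.matrix would drop those rows).

```r
# Hypothetical helper (illustrative name, not from the original answer).
# Returns x unchanged when it has no factor columns; otherwise appends
# one 0/1 indicator column per factor level.
add_factor_dummies <- function(x) {
  is_fac <- vapply(x, is.factor, logical(1))
  # No factor columns: hand the data frame back untouched
  if (!any(is_fac)) {
    return(x)
  }
  dummies <- lapply(x[is_fac], function(col) {
    # "- 1" drops the intercept so every level gets its own column
    m <- model.matrix(~ col - 1)
    # Strip the "col" prefix that model.matrix adds to level names
    colnames(m) <- sub("^col", "", colnames(m))
    as.data.frame(m)
  })
  cbind(x, do.call(cbind, unname(dummies)))
}

# Usage with two small made-up data frames
num_df <- data.frame(Richness = c(20L, 17L, 18L), pat = 1:3)
fac_df <- data.frame(wgt = c(64L, 55L, 56L),
                     id = factor(c("black", "white", "brown")))
identical(add_factor_dummies(num_df), num_df)  # TRUE
names(add_factor_dummies(fac_df))  # "wgt" "id" "black" "brown" "white"
```

Testing for factor columns up front, rather than checking length(x) on the whole data frame, is what lets the numeric-only case fall through unchanged.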
In the update, the OP also passes numeric columns into model.matrix. For a numeric column it just returns the same column, i.e. a single column (as we are looping over columns with map), named .x (from the model.matrix formula). That .x name is removed by str_remove inside rename_all, leaving a blank column name, which by default is filled with a column name assigned as 'col' from _dfc. To prevent that, we can add an if condition before renaming: when the output has exactly one column and that column is numeric, append the original column name as a suffix.
weight %>%
imap_dfc(~ {
nm1 <- .y
tmp <- model.matrix(~ .x - 1) %>%
as_tibble
if(ncol(tmp) == 1 && class(tmp[[1]]) == 'numeric') {
names(tmp) <- paste0(names(tmp), nm1)
}
tmp
}) %>%
rename_all(~ str_remove(., "\\.x"))
-output
# A tibble: 12 x 8
# wgt hgt age black brown white female male
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 64 57 8 1 0 0 1 0
# 2 71 59 10 1 0 0 1 0
# 3 53 49 6 1 0 0 1 0
# 4 67 62 11 1 0 0 1 0
# 5 55 51 8 0 0 1 1 0
# 6 58 50 7 0 0 1 1 0
# 7 77 55 10 0 0 1 0 1
# 8 57 48 9 0 0 1 0 1
# 9 56 42 10 0 1 0 0 1
#10 51 42 6 0 1 0 0 1
#11 76 61 12 0 1 0 0 1
#12 68 57 9 0 1 0 0 1
Or we can use base R:
out <- do.call(cbind, unname(Map(function(x, y) {
tmp <- as.data.frame(model.matrix(~x -1))
if(ncol(tmp) == 1 && class(tmp[[1]]) == 'numeric') {
names(tmp) <- paste0(names(tmp), y)}
tmp
}, weight, names(weight))))
names(out) <- sub('^x', '', names(out))
out
# wgt hgt age black brown white female male
#1 64 57 8 1 0 0 1 0
#2 71 59 10 1 0 0 1 0
#3 53 49 6 1 0 0 1 0
#4 67 62 11 1 0 0 1 0
#5 55 51 8 0 0 1 1 0
#6 58 50 7 0 0 1 1 0
#7 77 55 10 0 0 1 0 1
#8 57 48 9 0 0 1 0 1
#9 56 42 10 0 1 0 0 1
#10 51 42 6 0 1 0 0 1
#11 76 61 12 0 1 0 0 1
#12 68 57 9 0 1 0 0 1
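The naming behaviour that both versions work around can be checked directly. This is a small sketch with a hypothetical helper, dummy_names, and made-up input vectors; it only inspects the column names that model.matrix produces.

```r
# Hypothetical helper: column names model.matrix gives for a single vector
dummy_names <- function(x) colnames(model.matrix(~ x - 1))

num_names <- dummy_names(c(64, 71, 53))                      # numeric input
fac_names <- dummy_names(factor(c("black", "white", "brown")))  # factor input

num_names                 # "x": the variable name only
fac_names                 # "xblack" "xbrown" "xwhite": name + level
sub("^x", "", num_names)  # "": stripping the prefix leaves a blank name
sub("^x", "", fac_names)  # "black" "brown" "white"
```

A numeric column yields only the variable name, so removing that prefix leaves nothing behind, which is why the original column name is pasted on before the prefix is stripped.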