I want to dummy-code a variable, i.e. create flag variables for the Species column.

I wrote the following code:

create_dummies <- function(data, categorical_preds){
    if (categorical_preds == "setosa"){data$setosa_flg <- 1}
    else {data$setosa_flg <- 0}
    if (categorical_preds == "versicolor"){data$versicolor_flg <- 1}
    else {data$versicolor_flg <- 0}
    if (categorical_preds == "virginica"){data$virginica_flg <- 1}
    else {data$virginica_flg <- 0}
    return(data)
}
create_dummies(iris,iris$Species)

I get these warnings:
Warning messages:
1: In if (categorical_preds == "setosa") { :
  the condition has length > 1 and only the first element will be used
2: In if (categorical_preds == "versicolor") { :
  the condition has length > 1 and only the first element will be used
3: In if (categorical_preds == "virginica") { :
  the condition has length > 1 and only the first element will be used

Then I changed the code to:
create_dummies <- function(data, categorical_preds){
    ifelse(categorical_preds == "setosa",data$setosa_flg <- 1,data$setosa_flg <- 0)
    ifelse(categorical_preds == "versicolor",data$versicolor_flg <- 1,data$versicolor_flg <- 0)
    ifelse(categorical_preds == "virginica",data$virginica_flg <- 1,data$virginica_flg <- 0)

    return(data)
}
create_dummies(iris,iris$Species)

This time there are no warnings, but the new dummy variables are always 0.

As a next step, I wanted to avoid hard-coding, so I wrote:
create_dummies <- function(data, categorical_preds){
catvar <- (unique(categorical_preds))

for (i in 1:length(catvar)){
  iris[catvar[i]] <- ifelse(iris$Species == catvar[i],1,0)
}
return(data)
}
create_dummies(iris,iris$Species)

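What is wrong with this?

Questions:
  • Why do the first two versions of the code not work?
  • What is the difference between if(){} and ifelse() in R?
  • In ifelse(), how can I perform multiple actions when the condition is TRUE?
    Example: ifelse(categorical_preds == "setosa",data$setosa_flg <- 1 print(iris$Species),data$setosa_flg <- 0)

Best answer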

    The warning message:

      the condition has length > 1 and only the first element will be used
    

    tells you that using a vector in an if condition is equivalent to using only its first element:
    [if (v == 1)] ~ [if (v[1] == 1)] ## v here is a vector
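
To make the difference concrete, here is a minimal sketch using a made-up vector `v` (not from the question). It tests the first element explicitly with `if`, then tests every element with `ifelse`. Note that in R 4.2.0 and later, passing a condition of length greater than one to `if` is an error rather than a warning, which is why the sketch indexes `v[1]` itself.

```r
# Hypothetical example vector, used only for illustration
v <- c(1, 2, 1)

# if() expects a single TRUE/FALSE, so test one element explicitly
first_only <- if (v[1] == 1) "one" else "other"

# ifelse() tests every element and returns a vector of the same length
each_element <- ifelse(v == 1, "one", "other")

print(first_only)
print(each_element)
```

So `if` is control flow that chooses which code to run once, while `ifelse` is a function that builds a result vector element by element.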
    

    You should use the vectorized ifelse instead. For example, you can write your condition like this:
    create_dummies<-function(data, categorical_preds){
      ## here I show only the first condition
      data$setosa_flg <-
           ifelse (categorical_preds=="setosa",1,0)
      data
    }
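
The same pattern can be extended to every category without hard-coding, and it also suggests why the second version in the question always produced 0. The following is a sketch, not the answerer's code. It assumes the `_flg` suffix from the question as the naming scheme; `lvl`, `flag_name`, `flagged` and `x` are names introduced here for illustration. The `yes` and `no` arguments of `ifelse` are values to choose from, not actions to run. As far as base R's `ifelse` goes, `yes` is evaluated when any test is TRUE and `no` afterwards when any test is FALSE, so assignments placed there both run and the 0 lands last.

```r
# Part 1: assignments placed inside ifelse() both run, so the last one wins.
x <- NA
invisible(ifelse(c(TRUE, FALSE), x <- 1, x <- 0))
print(x)  # x was set to 1 and then overwritten with 0

# Part 2: assign the result of ifelse() instead, once per category.
create_dummies <- function(data, categorical_preds) {
  # as.character() so each level is used as text when building column names
  for (lvl in unique(as.character(categorical_preds))) {
    flag_name <- paste0(lvl, "_flg")  # e.g. "setosa_flg"
    data[[flag_name]] <- ifelse(categorical_preds == lvl, 1, 0)
  }
  data  # return the modified copy; the caller's data frame is untouched
}

flagged <- create_dummies(iris, iris$Species)
head(flagged)
```

The loop in the question assigns into `iris` rather than `data`, so any change it makes is not reflected in the returned `data`; the sketch writes to `data` throughout. If more than one thing should happen for the matching rows, it is usually simpler to compute the flag with `ifelse` and keep the other steps as separate statements than to put them inside the `ifelse` call.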
    

    On the difference between if() and ifelse() in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22433704/
