Problem description
I am trying to use dplyr::case_when within dplyr::mutate to create a new variable in which I set some values to missing and recode other values at the same time.
However, if I try to set values to NA, I get an error saying that the variable new cannot be created because NA is of type logical.
Is there a way to set values to NA in a non-logical vector in a data frame using this approach?
library(dplyr)
# Create data
df <- data.frame(old = 1:3)
# Create new variable
df <- df %>%
  dplyr::mutate(new = dplyr::case_when(old == 1 ~ 5,
                                       old == 2 ~ NA,
                                       TRUE ~ old))
# Desired output
c(5, NA, 3)
Recommended answer
As ?case_when explains, all right-hand sides must evaluate to the same type of vector, and a bare NA is logical.
You actually have two possibilities:
1) Create new as a numeric vector
df <- df %>%
  mutate(new = case_when(old == 1 ~ 5,
                         old == 2 ~ NA_real_,
                         TRUE ~ as.numeric(old)))
Note that NA_real_ is the numeric version of NA, and that you must convert old to numeric because you created it as an integer in your original data frame.
You get:
str(df)
# 'data.frame': 3 obs. of 2 variables:
# $ old: int 1 2 3
# $ new: num 5 NA 3
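To see why both the typed NA and the conversion are needed, it can help to inspect the storage type of each value on the right-hand sides. The following is a minimal sketch using only base R; it assumes a dplyr version that enforces strict type matching in case_when, as in the question.

```r
# Storage types of the values that appear on the right-hand sides
typeof(NA)               # "logical": the bare NA that triggers the error
typeof(NA_real_)         # "double":  matches the numeric literal 5
typeof(5)                # "double"
typeof(1:3)              # "integer": how `old` was created
typeof(as.numeric(1:3))  # "double":  after the conversion used above
```

Once every right-hand side reports "double", the types agree and new is created as a numeric column.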
2) Create new as an integer vector
df <- df %>%
  mutate(new = case_when(old == 1 ~ 5L,
                         old == 2 ~ NA_integer_,
                         TRUE ~ old))
Here, 5L forces 5 into the integer type, and NA_integer_ is the integer version of NA.
So this time new is an integer vector:
str(df)
# 'data.frame': 3 obs. of 2 variables:
# $ old: int 1 2 3
# $ new: int 5 NA 3
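The same pattern should carry over to other vector types. As a sketch that goes beyond the original answer, the example below recodes into a character column using NA_character_, the character version of NA. The column name label and the value "low" are made up for illustration.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(old = 1:3)

# Every right-hand side is character, so the types agree
df <- df %>%
  mutate(label = case_when(old == 1 ~ "low",            # hypothetical label
                           old == 2 ~ NA_character_,    # character NA
                           TRUE ~ as.character(old)))   # convert integer to character

df$label
```

As in the numeric case, old has to be converted explicitly (here with as.character) so that all branches return the same type.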