Question
I want to replace the <NA> values in a factor column with a valid value, but I cannot find a way to do it. This example is for demonstration only. The original data comes from a foreign CSV file that I have to deal with.
df <- data.frame(a=sample(0:10, size=10, replace=TRUE),
b=sample(20:30, size=10, replace=TRUE))
df[df$a==0,'a'] <- NA
df$a <- as.factor(df$a)
It might look like this:
      a  b
1     1 29
2     2 23
3     3 23
4     3 22
5     4 28
6  <NA> 24
7     2 21
8     4 25
9  <NA> 29
10    3 24
Now I want to replace the <NA> values with a number.
df[is.na(df$a), 'a'] <- 88
In `[<-.factor`(`*tmp*`, iseq, value = c(88, 88)) :
invalid factor level, NA generated
I think I am missing a fundamental R concept about factors. Am I? I cannot understand why this doesn't work. I think "invalid factor level" means that 88 is not a valid level in that factor, right? So do I have to tell the factor column that there is another level?
Recommended answer
1) addNA If fac is a factor, addNA(fac) is the same factor but with NA added as a level. See ?addNA.
To force the NA level to be 88:
facna <- addNA(fac)
levels(facna) <- c(levels(fac), 88)
giving:
> facna
[1] 1 2 3 3 4 88 2 4 88 3
Levels: 1 2 3 4 88
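To see why this works, it can help to look at the integer codes behind the factor. The following is a minimal base R sketch, assuming the same fac as in the note at the end of this answer (rebuilt here with factor() so the block runs on its own):

```r
# Same values as the `fac` used throughout this answer
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))

facna <- addNA(fac)

# NA is now a real level (the fifth one) rather than a missing value
levels(facna)
anyNA(levels(facna))   # TRUE
as.integer(facna)      # the former NA entries now carry code 5

# Renaming the levels relabels code 5 from NA to "88"
levels(facna) <- c(levels(fac), "88")
as.character(facna)
anyNA(facna)           # FALSE
```

The values themselves are never touched; only the label attached to the fifth level changes.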
1a) This can be written in a single line as follows:
`levels<-`(addNA(fac), c(levels(fac), 88))
2) factor It can also be done in one line using the various arguments of factor, like this:
factor(fac, levels = levels(addNA(fac)), labels = c(levels(fac), 88), exclude = NULL)
2a) or equivalently:
factor(fac, levels = c(levels(fac), NA), labels = c(levels(fac), 88), exclude = NULL)
3) ifelse Another approach is:
factor(ifelse(is.na(fac), 88, paste(fac)), levels = c(levels(fac), 88))
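As a quick sanity check, the base R variants (1a), (2), (2a) and (3) can be compared against each other. This is only a sketch: same_factor is a helper name made up for illustration, and fac is rebuilt with factor() so the block is self-contained.

```r
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))

r1  <- `levels<-`(addNA(fac), c(levels(fac), 88))
r2  <- factor(fac, levels = levels(addNA(fac)),
              labels = c(levels(fac), 88), exclude = NULL)
r2a <- factor(fac, levels = c(levels(fac), NA),
              labels = c(levels(fac), 88), exclude = NULL)
r3  <- factor(ifelse(is.na(fac), 88, paste(fac)),
              levels = c(levels(fac), 88))

# Hypothetical helper: two factors count as equal here when both
# their levels and their values (as character) agree
same_factor <- function(x, y) {
  identical(levels(x), levels(y)) &&
    identical(as.character(x), as.character(y))
}

all_same <- all(vapply(list(r2, r2a, r3), same_factor, logical(1), y = r1))
all_same
```

Comparing levels and character values, rather than the objects directly, keeps the check independent of how each variant happens to store its attributes.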
4) forcats The forcats package has a function for this (from forcats 1.0.0 onward, fct_explicit_na is deprecated in favour of fct_na_value_to_level):
library(forcats)
fct_explicit_na(fac, "88")
## [1] 1 2 3 3 4 88 2 4 88 3
## Levels: 1 2 3 4 88
Note: We used the following for input fac:
fac <- structure(c(1L, 2L, 3L, 3L, 4L, NA, 2L, 4L, NA, 3L), .Label = c("1",
"2", "3", "4"), class = "factor")
Update: Have improved (1) and added (1a). Later added (4).
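The suspicion in the question is right: the assignment fails because 88 is not yet a level of the factor. A minimal sketch of the direct route, which registers the level first and then repeats the original assignment, is below. The data frame is an assumption, typed in by hand from the printout in the question rather than generated with sample(), so that it always contains <NA> entries.

```r
# Assumed data, copied from the example printout in the question
df <- data.frame(a = c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3),
                 b = c(29, 23, 23, 22, 28, 24, 21, 25, 29, 24))
df$a <- as.factor(df$a)

# Tell the factor that "88" is a valid level ...
levels(df$a) <- c(levels(df$a), "88")

# ... after which the assignment from the question succeeds
df[is.na(df$a), "a"] <- 88

df$a
```

Note that the result is still a factor whose labels happen to look like numbers. If a numeric column is what is ultimately needed, as.numeric(as.character(df$a)) converts the labels back to numbers.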