Problem description
I have a simple question to figure out:
value
1000
2500
5080
10009
I want to assign each value to an interval:
value Range
1000 0-1000
2500 1001-5000
5080 5001-10000
10009 10001-20000
I tried the following:
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value < 5001, "1001-5000", ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))
However, I get the error: unexpected '<' in "dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value <"
Any help?
Edit:
This question is not asking for the best way to convert a continuous variable to a factor. It is asking for debugging help with the reproducible example:
library(data.table)
dt <- data.table(value = c(1000, 2500, 5080, 10009))
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value < 5001, "1001-5000", ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))
# produces the error above
Recommended answer
Like many (some) errors, it means what it says. Unlike Python, R can't interpret 1000 < value < 5001. Instead you need to use 1000 < value & value < 5001
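To see the difference outside of data.table, here is a minimal sketch on a plain vector. The vector name `value` simply mirrors the column in the question, and the `tryCatch` wrapper is only there so the failed parse can be shown without stopping the script.

```r
# Sample values copied from the question
value <- c(1000, 2500, 5080, 10009)

# A chained comparison is rejected by the R parser before anything is evaluated
chained <- tryCatch(
  parse(text = "1000 < value < 5001"),
  error = function(e) "parse error"
)
print(chained)

# Two separate comparisons joined with the vectorized `&` work element by element
in_range <- 1000 < value & value < 5001
print(in_range)
```

Note that `&` is the elementwise operator, which is what `ifelse()` needs; `&&` is meant for single values, such as the condition of an `if` statement.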
library(data.table)
dt <- data.table(value = c(1000, 2500, 5080, 10009))
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value & value < 5001, "1001-5000", ifelse(5000 < value & value < 10001, "5001-10000", "10001-20000")))]
dt
value Range
1: 1000 0-1000
2: 2500 1001-5000
3: 5080 5001-10000
4: 10009 10001-20000
As @akrun mentioned, you may be better off with a factor. Here's an example:
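# right = FALSE gives left-closed bins [0, 1001), [1001, 5001), ... so the edges match the ifelse() conditions above
dt[, Range := cut(value, breaks = c(0, 1001, 5001, 10001, 20001), labels = c("0-1000", "1001-5000", "5001-10000", "10001-20000"), right = FALSE)]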
This produces a data.table that displays the same way, but extracting the Range column will give you a factor corresponding to the ranges.
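As a rough illustration of what "a factor corresponding to the ranges" means in practice, the sketch below runs `cut()` on a plain vector rather than a data.table column. The names `labels` and `range_factor` are made up for this example, and `right = FALSE` is my own choice so that the bin edges behave like the `value < 1001` style comparisons.

```r
# Sample values copied from the question
value <- c(1000, 2500, 5080, 10009)
labels <- c("0-1000", "1001-5000", "5001-10000", "10001-20000")

# right = FALSE makes each bin closed on the left: [0, 1001), [1001, 5001), ...
range_factor <- cut(value, breaks = c(0, 1001, 5001, 10001, 20001),
                    labels = labels, right = FALSE)

print(class(range_factor))         # a factor rather than a character vector
print(levels(range_factor))        # the labels, kept in break order
print(as.character(range_factor))  # plain strings, if that is what you need
print(table(range_factor))         # counts per range
```

Because the levels are stored in break order, sorting or tabulating by `Range` follows the numeric order of the intervals rather than alphabetical order, which is one practical reason to prefer the factor over strings built with `ifelse()`.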