This article covers how to code variable values into classes using R.

Problem description

I have a data set in which I need to code the values of certain numeric variables into 3 classes.

My data set is similar to this one, but has 60 more variables:

anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
wt <- c(181,179,180.5,201,201.5,245,246.4,189.3,301,354,369,205,199,394,231.3)
data <- data.frame(anim,wt)

> data
   anim    wt
1     1 181.0
2     2 179.0
3     3 180.5
4     4 201.0
5     5 201.5
6     6 245.0
7     7 246.4
8     8 189.3
9     9 301.0
10   10 354.0
11   11 369.0
12   12 205.0
13   13 199.0
14   14 394.0
15   15 231.3

I need to code the values of the variable wt into 3 classes: (wt >= 179 & wt < 200) = 1; (wt >= 200 & wt < 300) = 2; (wt > 300) = 3

This should give me:

> data2
   anim    wt SWT
1     1 181.0   1
2     2 179.0   1
3     3 180.5   1
4     4 201.0   2
5     5 201.5   2
6     6 245.0   2
7     7 246.4   2
8     8 189.3   1
9     9 301.0   3
10   10 354.0   3
11   11 369.0   3
12   12 205.0   2
13   13 199.0   1
14   14 394.0   3
15   15 231.3   2


Recommended answer

The cut method as outlined by @Greg is probably what you want here. One thing to note is that cut returns a factor by default; supply labels = FALSE to get the integer codes instead:

cut(data$wt, c(178, 200, 300, Inf), labels = FALSE)
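
One detail worth checking: by default cut() builds intervals that are closed on the right, so the breaks above give (178, 200], (200, 300] and (300, Inf], and a weight of exactly 200 would land in class 1 rather than class 2. The sample data has no value sitting on a boundary, so the result still matches the requested output. A minimal sketch that follows the stated rules more literally passes right = FALSE and stores the result in an SWT column. It assumes that a weight of exactly 300 belongs in class 3, which the question leaves open:

```r
# Sample data from the question
anim <- 1:15
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4, 189.3,
        301, 354, 369, 205, 199, 394, 231.3)
data <- data.frame(anim, wt)

# right = FALSE makes the intervals [179, 200), [200, 300), [300, Inf)
# (assumption: exactly 300 is treated as class 3)
data2 <- data
data2$SWT <- cut(data2$wt, breaks = c(179, 200, 300, Inf),
                 right = FALSE, labels = FALSE)

# Values below 179 fall outside every interval and come back as NA
data2
```

With these breaks, any value below 179 is returned as NA, which can be a convenient way to spot out-of-range weights.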

Alternatively, if your classes do not lend themselves to natural breaks, you can use ifelse(). You can nest ifelse() calls, much like nested IF formulas in Excel. I use with() to cut down on the typing needed:

data$group2 <- with(data, ifelse(wt >= 179 & wt < 200, 1,
  ifelse(wt >= 200 & wt < 300, 2, 3))
)
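
Note that the nested ifelse() above assigns 3 to everything the first two tests reject, which includes weights below 179. If that is not wanted, the last class can be tested explicitly. The sketch below is one possible variant, not part of the original answer: code_wt is a hypothetical helper name, and it assumes that exactly 300 belongs in class 3 and that out-of-range values should become NA.

```r
# code_wt is a hypothetical helper, not part of the original answer
code_wt <- function(wt) {
  ifelse(wt >= 179 & wt < 200, 1,
  ifelse(wt >= 200 & wt < 300, 2,
  ifelse(wt >= 300, 3, NA)))  # assumption: anything else is out of range
}

res <- code_wt(c(150, 179, 199.9, 200, 300, 394))
res
# NA 1 1 2 3 3
```

The same function can be used inside with(), for example data$group2 <- with(data, code_wt(wt)).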

