How to generate a "range" variable in R?

Problem description

I have a dataset that looks something like this:

    Subject Year X
    A       1990 1
    A       1991 1
    A       1992 2
    A       1993 3
    A       1994 4
    A       1995 4
    B       1990 0
    B       1991 1
    B       1992 1
    B       1993 2
    C       1991 1
    C       1992 2
    C       1993 3
    C       1994 3
    D       1991 1
    D       1992 2
    D       1993 3
    D       1994 4
    D       1995 5
    D       1996 5
    D       1997 6

I want to generate a binary (0/1) variable (let's say variable A) that indicates whether the X variable has reached 3 (or 1-3), for each Subject. If the X variable has reached 4 or more, A should not capture it.

It should look like this:

    Subject Year X A
    A       1990 1 0
    A       1991 1 0
    A       1992 2 0
    A       1993 3 0
    A       1994 4 0
    A       1995 4 0
    B       1990 0 0
    B       1991 1 0
    B       1992 1 0
    B       1993 2 0
    C       1991 1 1
    C       1992 2 1
    C       1993 3 1
    C       1994 3 1
    D       1991 1 0
    D       1992 2 0
    D       1993 3 0
    D       1994 4 0
    D       1995 5 0
    D       1996 5 0
    D       1997 6 0

I tried the following:

    mydata$A <- as.numeric(mydata$X %in% 1:3)

but it doesn't control for the continuation (what X does in a subject's later years).

A reproducible sample:

    > dput(mydata)
    structure(list(Subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
    2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L),
    .Label = c("A", "B", "C", "D"), class = "factor"),
    Year = c(1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1990L, 1991L,
    1992L, 1993L, 1991L, 1992L, 1993L, 1994L, 1991L, 1992L, 1993L,
    1994L, 1995L, 1996L, 1997L),
    X = c(1L, 1L, 2L, 3L, 4L, 4L, 0L, 1L, 1L, 2L, 1L, 2L, 3L, 3L,
    1L, 2L, 3L, 4L, 5L, 5L, 6L)),
    .Names = c("Subject", "Year", "X"), class = "data.frame",
    row.names = c(NA, -21L))

All suggestions are welcome, thanks!

Solution

Here's a base R one-liner using ave (the data frame is called df here rather than mydata):

    df$A <- ave(df$X, df$Subject, FUN = function(x) if (max(x) == 3) 1 else 0)

    > df
       Subject Year X A
    1        A 1990 1 0
    2        A 1991 1 0
    3        A 1992 2 0
    4        A 1993 3 0
    5        A 1994 4 0
    6        A 1995 4 0
    7        B 1990 0 0
    8        B 1991 1 0
    9        B 1992 1 0
    10       B 1993 2 0
    11       C 1991 1 1
    12       C 1992 2 1
    13       C 1993 3 1
    14       C 1994 3 1
    15       D 1991 1 0
    16       D 1992 2 0
    17       D 1993 3 0
    18       D 1994 4 0
    19       D 1995 5 0
    20       D 1996 5 0
    21       D 1997 6 0
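To see why the row-wise attempt and the group-wise ave call behave differently, here is a minimal sketch that rebuilds the sample data and compares both against the expected column. The names row_wise, group_wise and expected are illustrative only and do not come from the question or the answer; the data frame is rebuilt with data.frame rather than pasted from dput, on the assumption that the values shown in the question are complete.

```r
# Rebuild the sample data from the question (same values as the dput output).
mydata <- data.frame(
  Subject = factor(rep(c("A", "B", "C", "D"), times = c(6, 4, 4, 7))),
  Year    = c(1990:1995, 1990:1993, 1991:1994, 1991:1997),
  X       = c(1, 1, 2, 3, 4, 4, 0, 1, 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 5, 5, 6)
)

# The asker's attempt: each row is tested on its own, ignoring later years.
row_wise <- as.numeric(mydata$X %in% 1:3)

# Group-wise flag: 1 for every row of a subject whose maximum X is exactly 3.
# ave() recycles the single value returned by FUN across the whole group.
group_wise <- ave(mydata$X, mydata$Subject,
                  FUN = function(x) as.numeric(max(x) == 3))

# Column A as given in the question's expected output.
expected <- rep(c(0, 0, 1, 0), times = c(6, 4, 4, 7))

identical(group_wise, expected)  # TRUE
identical(row_wise, expected)    # FALSE
```

The condition max(x) == 3 encodes one reading of the question: a subject is flagged only if X reaches 3 and never goes higher. A looser reading such as "X never exceeds 3" would also flag subject B, which the expected output does not, so the stricter test appears to be the one that matches.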