This article covers how to shape data in R with data.table.

Problem description

data=data.frame("Student"=c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5),
       "Grade"=c(5,6,7,3,4,5,4,5,6,8,9,10,2,3,4),
       "Pass"=c(NA,0,1,0,1,1,0,1,0,0,NA,NA,0,0,0),
       "NEWPass"=c(0,0,1,0,1,1,0,1,1,0,0,0,0,0,0),
       "GradeNEWPass"=c(7,7,7,4,4,4,5,5,5,10,10,10,4,4,4),
       "GradeBeforeNEWPass"=c(6,6,6,3,3,3,4,4,4,10,10,10,4,4,4))

I have a data.frame called data. Its input columns are Student, Grade and Pass; the remaining three columns show the output I want. I wish to derive them as follows:

NEWPass: Take Pass and, for every Student, fill in NA values with the previous value. If the first value is NA, then put a zero. The result should then be a running maximum.

GradeNEWPass: Take the lowest value of Grade at which a Student has a one in NEWPass. If a Student never got a one in NEWPass, this equals the Student's maximum Grade.

GradeBeforeNEWPass: Take the value of Grade just BEFORE a Student first got a one in NEWPass. If a Student never got a one in NEWPass, this equals the Student's maximum Grade.

My attempt:

setDT(data)[, NEWPassTry := cummax(Pass), by = Student]
data$GradeNEWPass = data$NEWPassTry * data$Grade
data[, GradeNEWPass := min(GradeNEWPass), by = Student]
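For reference, here is a minimal base R sketch of why this attempt falls short, using Student 1's values from the example data. The variable names (pass, grade, new_pass_try, pass_filled, new_pass) are illustrative and not part of the original post. Replacing NA with 0 is used here as a shortcut: for a 0/1 column it yields the same running maximum as carrying the previous value forward.

```r
# Student 1's values from the example data: the first Pass is NA.
pass  <- c(NA, 0, 1)
grade <- c(5, 6, 7)

# cummax() propagates NA: the NA element and every element after it become NA.
new_pass_try <- cummax(pass)
print(new_pass_try)

# Replacing NA with 0 before cummax() gives the intended running maximum.
pass_filled <- ifelse(is.na(pass), 0, pass)
new_pass <- cummax(pass_filled)
print(new_pass)

# Multiplying by Grade turns every not-yet-passed row into 0,
# so min() returns that 0 rather than the grade at the first pass.
print(min(new_pass * grade))
```

So two things need to change: NA has to be handled before the running maximum, and the grade at the first pass has to be looked up by position rather than by multiplying and taking the minimum.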

Recommended answer

We can use data.table methods. Grouped by 'Student', create a logical index ('i1') that is TRUE where 'Pass' is 1 and not NA, then get the position of the first 1 with which and head ('i2'), and calculate the max of 'Grade' ('mx'). The three columns are then built from these: 'v1' is the cumulative maximum of the binary index; 'v2' is 'Grade' subset at 'i2' if there are any 1s, otherwise 'mx'; 'v3' is built the same way, except that 1 is subtracted from the index to get the 'Grade' value of the previous row.

library(data.table)
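setDT(data)[, c('NEWPass1', 'GradeNEWPass1', 'GradeBeforeNEWPass1') := {
              i1 <- Pass == 1 & !is.na(Pass)   # TRUE where Pass is 1; NA counts as FALSE
              i2 <- head(which(i1), 1)         # position of the first 1 (empty if none)
              mx <- max(Grade, na.rm = TRUE)   # fallback: the student's highest grade
              v1 <- cummax(+(i1))              # running maximum of the 0/1 indicator
              v2 <- if (any(i1)) Grade[i2] else mx              # grade at the first 1
              v3 <- if (any(i1)) Grade[max(1, i2 - 1)] else mx  # grade one row before it

              .(v1, v2, v3)
            }, by = Student]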
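To make the per-group logic easier to follow, the same expressions can be run on a single student's vectors in base R. This is only an illustrative walk-through using Student 3's values from the example data; inside data.table the same code runs once per group.

```r
# Student 3 from the example data.
Pass  <- c(0, 1, 0)
Grade <- c(4, 5, 6)

i1 <- Pass == 1 & !is.na(Pass)   # logical index of rows where Pass is 1
i2 <- head(which(i1), 1)         # position of the first 1
mx <- max(Grade, na.rm = TRUE)   # highest grade, used when there is no 1
v1 <- cummax(+(i1))              # unary + converts logical to 0/1 integers
v2 <- if (any(i1)) Grade[i2] else mx
v3 <- if (any(i1)) Grade[max(1, i2 - 1)] else mx

print(list(i1 = i1, i2 = i2, v1 = v1, v2 = v2, v3 = v3))
```

The max(1, i2 - 1) guard matters when the first 1 sits in a student's first row: without it the index would be 0 and Grade[0] would return an empty vector. With the guard, such a student gets the first row's Grade.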


data
#    Student Grade Pass NEWPass GradeNEWPass GradeBeforeNEWPass NEWPass1 GradeNEWPass1 GradeBeforeNEWPass1
# 1:       1     5   NA       0            7                  6        0             7                   6
# 2:       1     6    0       0            7                  6        0             7                   6
# 3:       1     7    1       1            7                  6        1             7                   6
# 4:       2     3    0       0            4                  3        0             4                   3
# 5:       2     4    1       1            4                  3        1             4                   3
# 6:       2     5    1       1            4                  3        1             4                   3
# 7:       3     4    0       0            5                  4        0             5                   4
# 8:       3     5    1       1            5                  4        1             5                   4
# 9:       3     6    0       1            5                  4        1             5                   4
#10:       4     8    0       0           10                 10        0            10                  10
#11:       4     9   NA       0           10                 10        0            10                  10
#12:       4    10   NA       0           10                 10        0            10                  10
#13:       5     2    0       0            4                  4        0             4                   4
#14:       5     3    0       0            4                  4        0             4                   4
#15:       5     4    0       0            4                  4        0             4                   4
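
The answer computes NEWPass1 from a 0/1 indicator rather than by literally filling NA values. As a cross-check, here is a hedged sketch that follows the wording of the question step by step using data.table's nafill(). It assumes a data.table version that provides nafill() (1.12.4 or later); the table dt and the column name NEWPassLiteral are made up for this illustration.

```r
library(data.table)

# Student, Pass and the desired NEWPass column, copied from the example data.
dt <- data.table(
  Student = rep(1:5, each = 3),
  Pass    = c(NA, 0, 1, 0, 1, 1, 0, 1, 0, 0, NA, NA, 0, 0, 0),
  NEWPass = c(0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0)
)

dt[, NEWPassLiteral := {
  filled <- nafill(Pass, type = "locf")               # carry the previous value forward
  filled <- nafill(filled, type = "const", fill = 0)  # a leading NA has no predecessor: use 0
  cummax(filled)                                      # running maximum within the student
}, by = Student]

# Compare against the desired column from the question.
print(dt[, all(NEWPass == NEWPassLiteral)])
```

For a 0/1 Pass column both routes give the same NEWPass, so the indicator-based version in the answer is a reasonable shortcut.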

