Assignment via `:=` in a `for` loop (R data.table)

Problem description

I'm trying to assign some new variables within a for loop (I'm trying to create some variables with common structure, but which are subsample-dependent). I've tried for the life of me to reproduce this error on sample data and I can't.
Here's code that works and gets the gist of what I want to do:

    dt <- data.table(id = rep(1:100, each = 20), period = rep(-9:10, 100),
                     grp = rep(sample(4, size = 100, replace = T), each = 20),
                     y = runif(2000, min = 0, max = 5),
                     key = c("id", "period"))[, x := cumsum(y), by = id]
    dt2 <- dt[id %in% seq(1, 100, by = 2), ]
    dt3 <- dt[id %in% seq(1, 100, by = 3), ]
    for (dd in list(dt, dt2, dt3)) {
      setkey(setkey(dd, grp)[dd[period == 0, sum(x), by = grp], x_at_0_by_grp := V1], id, period)
    }

This works fine. However, when I do this to my own code, it generates the Invalid .internal.selfref warning (and doesn't create the variable I want).

In fact, when I subset my data to only those columns needed within the merge, it also works fine on my data (though it doesn't save to the original data sets).

This suggests to me it's a problem with keying, but I'm explicitly setting the keys every step of the way. I'm completely lost on how to debug this from here because I can't get the error to repeat except on my full data set.

If I break out the operation into steps, the error arises at the merge step:

    for (dd in list(dt, dt2, dt3)) {
      dummy <- dd[period == 0, sum(x), by = grp]
      setkey(dd, grp)
      dd[dummy, x_at_0_by_grp := V1]  #***ERROR HERE***
      setkey(dd, id, period)
    }

Quick update: this also produces the error if I cast it with lapply instead of within a for loop.

Any ideas what on earth is going on here?

UPDATE: I've come up with a workaround by doing:

    nnames <- c("dt", "dt2", "dt3")
    dt_list <- list(dt, dt2, dt3)
    for (ii in 1:3) {
      dummy <- copy(dt_list[[ii]])
      dummy[, x_at_0_by_grp := sum(x[period == 0]), by = grp]
      assign(nnames[ii], dummy)
    }

I would still like to understand what's going on, and perhaps find a better way of assigning variables iteratively in situations like this.

Solution

With 20-30 criteria, keeping them outside of a list (with manual names like dt2, etc.) is too clunky, so I'll just assume you have them all in dt_list.
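For illustration, here is a minimal sketch of one way to end up with such a `dt_list` without naming each subsample by hand. It reuses the sample data from the question. The `steps` vector, the list names, and the fixed seed are assumptions made for this example, not part of the original post.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Sample data shaped like the question's: 100 ids, 20 periods each
dt <- data.table(
  id     = rep(1:100, each = 20),
  period = rep(-9:10, 100),
  grp    = rep(sample(4, size = 100, replace = TRUE), each = 20),
  y      = runif(2000, min = 0, max = 5),
  key    = c("id", "period")
)[, x := cumsum(y), by = id]

# Hypothetical subsampling rule: keep every s-th id (s = 1 keeps all ids)
steps <- c(1, 2, 3)

# Build all subsamples in one list instead of dt, dt2, dt3, ...
dt_list <- lapply(steps, function(s) dt[id %in% seq(1, 100, by = s)])
names(dt_list) <- paste0("by_", steps)
```

With 20-30 criteria, only `steps` (or whatever rule defines the subsamples) would need to grow; the rest of the code can index into `dt_list`.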
I suggest making tables with just the stat you're computing, and then rbinding them:

    xxt <- rbindlist(lapply(1:length(dt_list), function(i)
      dt_list[[i]][, list(cond = i, xx = sum(x[period == 0])), by = grp]))

which creates

        grp cond       xx
     1:   1    1 623.3448
     2:   2    1 784.8438
     3:   4    1 699.2362
     4:   3    1 367.7196
     5:   1    2 323.6268
     6:   4    2 307.0374
     7:   2    2 447.0753
     8:   3    2 185.7377
     9:   1    3 275.4897
    10:   4    3 243.0214
    11:   2    3 149.6041
    12:   3    3 166.3626

You can easily merge back if you really want those vars. For example, for dt2:

    myi = 2
    setkey(dt_list[[myi]], grp)[xxt[cond == myi, list(grp, xx)]]

This doesn't resolve the bug you're running into, but I think it is a better approach.
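As a minimal sketch of the summarise-then-merge approach end to end, the example below runs it on a small toy data set where the sums can be checked by hand. The toy data and the loop variable `k` are assumptions made for this example. The merge-back step uses an update join (`on =` together with `:=`, which needs data.table 1.9.6 or later) rather than the keyed join shown in the answer, so that the new column is added to each table in the list by reference.

```r
library(data.table)

# Hypothetical toy data: 4 ids, 3 periods each, two groups
dt <- data.table(
  id     = rep(1:4, each = 3),
  period = rep(-1:1, 4),
  grp    = rep(c(1L, 2L, 1L, 2L), each = 3),
  x      = as.numeric(1:12)
)

# Two "conditions": the full table and a subsample of ids 1 and 2
dt_list <- list(dt, dt[id %in% c(1, 2)])

# One summary row per (condition, group): sum of x at period 0
xxt <- rbindlist(lapply(seq_along(dt_list), function(i) {
  dt_list[[i]][, list(cond = i, xx = sum(x[period == 0])), by = grp]
}))
# xx is 10 and 16 for cond 1 (groups 1, 2), and 2 and 5 for cond 2

# Merge back: update join on grp, adding the column by reference
for (k in seq_along(dt_list)) {
  dt_list[[k]][xxt[cond == k], on = "grp", x_at_0_by_grp := i.xx]
}

dt_list[[2]]
```

Keeping the statistics in one long table (`xxt`) also makes it straightforward to compare conditions directly, and the merge-back loop can be skipped when the per-row column is not needed.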