Using a matrix instead

If you can afford to work with a matrix instead of a data.frame (possible here because the data is a single homogeneous type, "character"), you'll get considerably better performance:

    nr <- 100000
    nc <- 76
    mymtx2 <- matrix(sample(letters[1:4], nr*nc, replace=TRUE), ncol=nc)
    dim(mymtx2)
    ## [1] 100000     76
    # For each row, count the columns that match the next row
    system.time(
      matches2 <- vapply(seq.int(nrow(mymtx2)-1),
                         function(ii) sum(mymtx2[ii,] == mymtx2[ii+1,]),
                         integer(1))
    )
    ##    user  system elapsed 
    ##    0.81    0.00    0.81 

(Compare with 370.63 user from the previous run.) Scaling it up to full strength:

    nr <- 3.7e6
    nc <- 76
    mymtx3 <- matrix(sample(letters[1:4], nr*nc, replace=TRUE), ncol=nc)
    dim(mymtx3)
    ## [1] 3700000      76
    system.time(
      matches3 <- vapply(seq.int(nrow(mymtx3)-1),
                         function(ii) sum(mymtx3[ii,] == mymtx3[ii+1,]),
                         integer(1))
    )
    ##    user  system elapsed 
    ##   35.32    0.05   35.81 
    length(matches3)
    ## [1] 3699999
    # A pair of rows is a full match when all nc columns agree
    sum(matches3 == nc)
    ## [1] 0

Unfortunately, still no matches, but I think 36 seconds for 3.7M rows is considerably better than an hour for 100K. (Please correct me if I've made an incorrect assumption.)

(Ref: win7 x64, R-3.0.3-64bit, intel i7-2640M 2.8GHz, 8GB RAM)
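Once the data is in a matrix, the row-by-row `vapply` can in principle be replaced by a single whole-matrix comparison. The following is only a sketch, not part of the timings above: `count_adjacent_matches` is a hypothetical helper name, and the small 1000-row matrix is made up purely to check that both approaches agree.

```r
# Hypothetical helper (an assumption, not from the answer above): for each pair
# of consecutive rows, count how many columns hold the same value.
count_adjacent_matches <- function(m) {
  n <- nrow(m)
  if (n < 2L) return(integer(0))
  # Compare rows 1..(n-1) with rows 2..n in one vectorized operation.
  same <- m[-n, , drop = FALSE] == m[-1L, , drop = FALSE]
  as.integer(rowSums(same))
}

# Small made-up test matrix, same shape of data as in the answer.
set.seed(42)
nr <- 1000
nc <- 76
small <- matrix(sample(letters[1:4], nr * nc, replace = TRUE), ncol = nc)

# Reference result using the vapply approach.
loop_result <- vapply(seq_len(nrow(small) - 1),
                      function(ii) sum(small[ii, ] == small[ii + 1, ]),
                      integer(1))

vectorized_result <- count_adjacent_matches(small)
stopifnot(identical(loop_result, vectorized_result))
```

The trade-off is memory: the two shifted copies and the logical comparison matrix are each about as large as the input, so on a 3.7M-row matrix with 8GB of RAM this may need to be applied in chunks of rows rather than all at once.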