问题描述
我有一个像
A= [ 1 2 4
2 3 1
3 1 2 ]
我想通过行和列通过计算其累计总和,就是我想要的结果是
and I would like to calculate its cumulative sum by row and by column, that is, I want the result to be
B = [ 1 3 7
3 8 13
6 12 19 ]
如何在一个快速的方式使这个R中的任何想法? (可能使用功能cumsum)
(我有巨大的矩阵)
Any ideas of how to make this in R in a fast way? (Possibly using the function cumsum)(I have huge matrices)
谢谢!
推荐答案
一个单行:
t(apply(apply(A, 2, cumsum)), 1, cumsum))
底层的观察是,你可以先计算在列的累积和,然后这个矩阵在行的累计总和。
The underlying observation is that you can first compute the cumulative sums over the columns and then the cumulative sum of this matrix over the rows.
请注意:在做行,你要转的结果矩阵
Note: When doing the rows, you have to transpose the resulting matrix.
您例如:
> apply(A, 2, cumsum)
[,1] [,2] [,3]
[1,] 1 2 4
[2,] 3 5 5
[3,] 6 6 7
> t(apply(apply(A, 2, cumsum), 1, cumsum))
[,1] [,2] [,3]
[1,] 1 3 7
[2,] 3 8 13
[3,] 6 12 19
关于业绩:我现在已经知道这个方法有多好,扩展到大的矩阵。复杂性的角度来看,这应该是接近最优。通常情况下,适用
在性能上并不坏为好。
现在我是越来越好奇 - 什么方法是最好的一个?一个简短的风向标:
Now I was getting curious - what approach is the better one? A short benchmark:
> A <- matrix(runif(1000*1000, 1, 500), 1000)
>
> system.time(
+ B <- t(apply(apply(A, 2, cumsum), 1, cumsum))
+ )
User System elapsed
0.082 0.011 0.093
>
> system.time(
+ C <- lower.tri(diag(nrow(A)), diag = TRUE) %*% A %*% upper.tri(diag(ncol(A)), diag = TRUE)
+ )
User System elapsed
1.519 0.016 1.530
因此:应用由15倍优于矩阵乘法(只是为了对比:MATLAB需要0.10719秒),结果真的不意外,因为适用
- 版本可以在O进行(N ^ 2),而矩阵乘法将需要约为O(n ^ 2.7)计算。因此,如果n是足够大的矩阵乘法提供所有的优化应该会丢失。
Thus: Apply outperforms matrix multiplication by a factor of 15. (Just for comparision: MATLAB needed 0.10719 seconds.) The results do not really surprise, as the apply
-version can be done in O(n^2), while the matrix multiplication will need approx. O(n^2.7) computations. Thus, all optimizations that matrix multiplication offers should be lost if n is big enough.
这篇关于在矩阵累计总和的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!