Question
I was running a principal component analysis on my MacBook with Microsoft R 3.3.0 when I got some strange results. After double-checking with a colleague, I realised that the output of the svd function differs from what vanilla R produces.
To reproduce the result, load the file (~78 MB) here
With Microsoft R 3.3.0 (x86_64-apple-darwin14.5.0) I get:
>> sv <- svd(Cx)
>> print(sv$d[1:10])
[1] 122.73664 104.45759 90.52001 87.21890 81.28256 74.33418 73.29427 66.26472 63.51379
[10] 55.20763
On vanilla R (both R 3.3 and R 3.3.1, on two different Linux machines) I get instead:
>> sv <- svd(Cx)
>> print(sv$d[1:10])
[1] 122.73664 34.67177 18.50610 14.04483 8.35690 6.80784 6.14566
[8] 3.91788 3.76016 2.66381
This does not happen with all data: if I create a random matrix and apply svd to it, both installations give the same results. So it looks like some sort of numerical instability, doesn't it?
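One way to judge which set of singular values is trustworthy, without needing a second machine, is to check the decomposition against the matrix itself. The sketch below is only an illustration: check_svd is a hypothetical helper name, and a small random covariance matrix stands in for Cx, which is not available here. It tests two things: that U diag(d) V' reproduces the input, and that the singular values of a symmetric matrix match the absolute values of its eigenvalues. Note that eigen also goes through LAPACK, so the reconstruction error is the more independent of the two checks.

```r
# Hypothetical helper (not from the original post): sanity-check svd() on a
# symmetric matrix such as a covariance matrix.
check_svd <- function(M, tol = 1e-8) {
  sv <- svd(M)

  # Rebuild M from its factors; sv$d * t(sv$v) scales row i of t(V) by d[i],
  # which is the same as diag(d) %*% t(V).
  recon <- sv$u %*% (sv$d * t(sv$v))
  rel_err <- norm(M - recon, "F") / norm(M, "F")

  # For a symmetric matrix the singular values are the absolute eigenvalues.
  ev <- sort(abs(eigen(M, symmetric = TRUE, only.values = TRUE)$values),
             decreasing = TRUE)
  max_sv_diff <- max(abs(sv$d - ev))

  list(rel_err = rel_err,
       max_sv_diff = max_sv_diff,
       ok = rel_err < tol && max_sv_diff < tol * max(sv$d))
}

# Stand-in data (assumption): a random 20 x 20 covariance matrix instead of Cx.
set.seed(1)
X <- matrix(rnorm(200 * 20), nrow = 200, ncol = 20)
C <- cov(X)

res <- check_svd(C)
print(res)
```

Running check_svd(Cx) on both installations would presumably show a large rel_err or max_sv_diff on the one that returns the wrong singular values.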
UPDATE: I computed the SVD of the same matrix (Cx) on the same machine (the MacBook) with the same version of R, this time using the svd package, and I finally get the "right" numbers. So the problem seems to be due to the svd implementation used by Microsoft R Open.
UPDATE: The behaviour also occurs on MRO 3.3.1.
Recommended answer
It seems this is a bug, as confirmed on the microsoft-r-open GitHub. They say the behaviour is under investigation and is related to the Accelerate library in macOS.
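Since the suspected cause is the linear algebra backend rather than R itself, it can help to record which LAPACK and BLAS an installation is linked against when comparing machines. The following is a minimal sketch using base R functions only; the info list and its field names are chosen here for illustration, and the BLAS entry may be NA on older R versions where extSoftVersion() does not report it.

```r
# Collect details about the R build and its linear algebra libraries.
info <- list(
  r_version = R.version.string,            # e.g. the R release string
  platform  = R.version$platform,          # e.g. x86_64-apple-darwin14.5.0
  lapack    = La_version(),                # LAPACK version in use
  blas      = unname(extSoftVersion()["BLAS"])  # BLAS path; NA if not reported
)
print(info)
```

Comparing this output between the MacBook and the Linux machines would show whether the two installations really use different backends.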