Why does numpy std() give a different result to matlab std()?

Question

I am trying to convert MATLAB code to NumPy and found that NumPy's std function gives a different result from MATLAB's.

In MATLAB:

std([1,3,4,6])
ans =  2.0817

In NumPy:

>>> import numpy as np
>>> np.std([1,3,4,6])
1.8027756377319946

Is this normal? And how should I handle this?

Recommended answer

The NumPy function np.std takes an optional parameter ddof: "Delta Degrees of Freedom". By default, this is 0. Set it to 1 to get the MATLAB result:

>>> np.std([1,3,4,6], ddof=1)
2.0816659994661326

To add a little more context, in the calculation of the variance (of which the standard deviation is the square root) we typically divide by the number of values we have.

But if we select a random sample of N elements from a larger distribution and calculate the variance, dividing by N tends to underestimate the actual variance. To fix this, we can lower the number we divide by (the degrees of freedom) to a number less than N (usually N-1). The ddof parameter allows us to change the divisor by the amount we specify.
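The underestimate can be seen in a small simulation. The sketch below is only an illustration under assumed settings (a normal population with standard deviation 2, samples of size 5, a fixed seed); none of these values come from the question. It draws many samples and averages the variance computed with each divisor.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

true_variance = 4.0   # assumed population: normal with standard deviation 2
n = 5                 # assumed size of each random sample
trials = 200_000      # number of samples drawn

samples = rng.normal(loc=0.0, scale=2.0, size=(trials, n))

# Variance of each sample, dividing by N and by N - 1 respectively,
# then averaged over all samples.
mean_var_n = samples.var(axis=1, ddof=0).mean()
mean_var_n_minus_1 = samples.var(axis=1, ddof=1).mean()

print(f"true variance:            {true_variance}")
print(f"average, dividing by N:   {mean_var_n:.3f}")
print(f"average, dividing by N-1: {mean_var_n_minus_1:.3f}")
```

With these settings the average obtained by dividing by N should land near (N-1)/N times the true variance, while dividing by N-1 should land near the true variance itself.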

Unless told otherwise, NumPy will calculate the biased estimator for the variance (ddof=0, dividing by N). This is what you want if you are working with the entire distribution (and not a subset of values which have been randomly picked from a larger distribution). If the ddof parameter is given, NumPy divides by N - ddof instead.
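To make the N - ddof divisor concrete, here is a minimal sketch that computes the standard deviation by hand and compares it with np.std. The helper name std_with_ddof is hypothetical and exists only for this illustration; it is not part of NumPy.

```python
import math
import numpy as np

def std_with_ddof(values, ddof=0):
    """Hypothetical helper: standard deviation with divisor N - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))

data = [1, 3, 4, 6]

# Each line prints the hand-computed value next to NumPy's value.
print(std_with_ddof(data, ddof=0), np.std(data))          # divisor N
print(std_with_ddof(data, ddof=1), np.std(data, ddof=1))  # divisor N - 1
```

For the data in the question, the sum of squared deviations from the mean 3.5 is 13, so the two divisors give sqrt(13/4) and sqrt(13/3), which are the NumPy and MATLAB results shown earlier.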

By default, MATLAB's std corrects the bias in the sample variance by dividing by N-1. This removes some (but probably not all) of the bias in the standard deviation. This is likely to be what you want if you're using the function on a random sample from a larger distribution.
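The remark that dividing by N-1 removes only some of the bias in the standard deviation can also be checked numerically. The sketch below again assumes a normal population with standard deviation 2 and samples of size 5 (illustrative choices, not taken from the question) and averages the sample standard deviation for both divisors.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

true_std = 2.0    # assumed population standard deviation
n = 5             # assumed size of each random sample
trials = 200_000  # number of samples drawn

samples = rng.normal(loc=0.0, scale=true_std, size=(trials, n))

# Average sample standard deviation for each divisor.
mean_std_n = samples.std(axis=1, ddof=0).mean()          # NumPy default
mean_std_n_minus_1 = samples.std(axis=1, ddof=1).mean()  # MATLAB default

print(f"true standard deviation:  {true_std}")
print(f"average, dividing by N:   {mean_std_n:.3f}")
print(f"average, dividing by N-1: {mean_std_n_minus_1:.3f}")
```

Under these assumptions both averages should fall below the true value, with the N-1 version closer to it. The square root of an unbiased variance estimate is still, on average, slightly too small.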

The nice answer by @hbaderts gives further mathematical details.

08-29 11:54