Problem description

The data I'm using is pasted below. When I apply the basic formula for skewness to my data in R:

3*(mean(data) - median(data))/sd(data)

The result is -0.07949198. I get a very similar result in Python. The median is therefore greater than the mean, suggesting the left tail is longer.

However, when I apply the descdist function from the fitdistrplus package, the skewness is 0.3076471, suggesting the right tail is longer. The SciPy function skew likewise returns a positive skewness, 0.303.

Can I trust this simple formula, which gives me a negative skewness? What is going on here?

Thanks, Oliver

data = c(0.18941565600882029, 1.9861271676300578, -5.2022598870056491, 1.6826411075612353, 1.6826411075612353, -2.9502890173410403, -2.923253150057274, -2.9778296382730454, 0.71202396234488663, 0.71202396234488663, -3.1281373844121529, 1.8326831382748159, -5.2961554710604135, 2.7793190416141234, 0.46922759190417185, 7.0730158730158728, 1.1745152354570636, 2.8142292490118579, 2.037940379403794, 7.0607489597780866, 10.460258249641321, 11.894978479196554, 4.8334682860998655, 1.3884016973125886, 4.0940458015267174, 0.12592959841348539, -0.37022332506203476, 1.9713554987212274, -0.83774145616641893, -1.896978417266187, 6.4340675477239362, -6.4774193548387089, -0.31790393013100438, -4.4193265007320646, 5.7454545454545451, 2.5913432835820895, 0.86190724335591451, 0.95753781950965045, 6.8923556942277697, 1.7650659630606862, -2.4558421851289833, -2.390546528803545, 2.6355029585798815, 0.26983655274888557, 1.5032159264931086, 3.9839506172839503, -5.1404511278195484, -2.2477777777777779, 6.0604444444444443, -0.9691172451489477, 1.1383462670591382, -1.5281319661168078, 4.7775667118950702, 1.2223175965665234, 2.0563555555555553, -3.6153201970443352, -0.35731206188058978, -3.6265094676670238, 1.3053804930332262, -4.4604960677555958, -0.8933514246947083, 0.7622542595019659, 1.3892170651664322, 2.5725258493353031, -0.028006088280060883, 0.8933947772657449, 2.4907086614173228, 3.0914196567862717, 4.4222575516693157, 0.64568527918781726, 0.97095158597662778, -3.7409780775716697, -3.3472636815920396, -0.66307448494453247, -7.0384291725105186, -0.14540612516644474, -0.38161535029004906, 5.1076923076923082, 4.0237516869095806, 1.510099573257468, 1.5064083457526081, -0.025879043600562587, 4.5001414427156998, 3.2326264274061991, 1.0185639229422065, 2.66690518783542, 0.53032015065913374, 1.2117829457364342, 0.60861244019138749, -2.5248049921996878, 1.8666666666666669, -0.32978612415232139, 0.29055999999999998, 1.9150729335494328, 2.2988352745424296, 3.779225265235628, 
0.093884800811976657, 1.0097869890616005, 1.2220632081097198, 0.21164401128494487)

Recommended answer

I don't have access to the packages you mention right now, so I can't check which formula they apply. However, you seem to be using Pearson's second skewness coefficient (see Wikipedia), which compares the mean with the median. The estimator for the sample skewness given on the same page is instead based on the third central moment, which can be calculated simply by:

> S <- mean((data-mean(data))^3)/sd(data)^3
> S
[1] 0.2984792
> n <- length(data)
> S_alt <- S*n^2/((n-1)*(n-2))
> S_alt
[1] 0.3076471
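
For reference, the quantities computed above can be written out as follows. This is a sketch of the definitions as the R code uses them, assuming s is the sample standard deviation returned by sd() (divisor n - 1) and writing \tilde{x} for the median; the symbol names are mine, not the packages'.

```latex
% Pearson's second skewness coefficient (the "basic formula" in the question)
\mathrm{Sk}_2 = \frac{3\,(\bar{x} - \tilde{x})}{s}

% Third-moment estimator computed as S in the R session
S = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3}{s^3},
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2

% Alternative definition computed as S_alt
S_{\mathrm{alt}} = \frac{n^2}{(n-1)(n-2)}\, S

% With n = 100 the correction factor is 10000 / 9702, roughly 1.0307,
% so S_alt is roughly 1.0307 * 0.2984792, i.e. about 0.3076
```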

See the alternative definition on the Wikipedia page, which yields the same result as descdist in your example (0.3076471).
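
To illustrate why the two approaches can disagree in sign, here is a minimal Python sketch using only the standard library. The function names and the ten-value sample are hypothetical and chosen for illustration; they are not the data from the question. The moment-based function uses divide-by-n moments, which is what scipy.stats.skew computes with its default bias=True; the adjusted version applies the usual small-sample correction and is algebraically the same as S_alt in the R session above.

```python
import math
import statistics


def pearson_second_skewness(xs):
    """Pearson's second coefficient: 3 * (mean - median) / sample sd."""
    return 3 * (statistics.mean(xs) - statistics.median(xs)) / statistics.stdev(xs)


def moment_skewness(xs):
    """Moment-based skewness g1 = m3 / m2**1.5, using divide-by-n moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def adjusted_skewness(xs):
    """Adjusted estimator G1 = sqrt(n * (n - 1)) / (n - 2) * g1."""
    n = len(xs)
    return math.sqrt(n * (n - 1)) / (n - 2) * moment_skewness(xs)


# Hypothetical sample: most values sit just above the mean, a small cluster
# sits below it, and a single large value stretches the right tail.
sample = [-2, -2, -2, 1, 1, 1, 1, 1, 1, 5]

print("Pearson's second coefficient:", pearson_second_skewness(sample))
print("Moment-based skewness (g1):  ", moment_skewness(sample))
print("Adjusted skewness (G1):      ", adjusted_skewness(sample))
```

For this sample the median (1) exceeds the mean (0.5), so Pearson's coefficient is negative, while the single large value dominates the cubed deviations and makes the moment-based skewness positive. The two formulas measure different things, so a sign disagreement like the one in the question is possible. As a consistency check on the question's numbers, rescaling S = 0.2984792 by (n/(n-1))^1.5 with n = 100 gives about 0.303, which appears to match the SciPy value reported.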

