Problem description
I am working with a vector of strictly positive observations (they are distance measurements).
I passed this vector to ksdensity to get a feel for the density function and, surprisingly, the estimate extends over negative values. That implies a positive probability of observing a value in an interval that is entirely negative.
This cannot be correct, because I know all my observations are positive.
Why does ksdensity do this? My impression is that it completes the curve by assuming continuous differentiability. Is that a correct assumption?
Is there an option where MATLAB doesn't guess and just gives a "derivative" of the empirical cumulative distribution function?
Recommended answer
The probability density estimate that ksdensity returns is based on a normal kernel function by default. Each observation contributes a small normal curve centered on its value, and the estimate is built by summing these curves. If your data has values near zero, the kernels centered on those values extend below zero, so the summed estimate naturally overlaps into the negative range.
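To make the overlap concrete, here is a minimal hand-rolled sketch of a Gaussian kernel density estimate. It is not how ksdensity is implemented internally; the sample values in x and the bandwidth h are made up for illustration, and the subtraction of a row from a column relies on implicit expansion (R2016b or later).

```matlab
% Hand-rolled Gaussian KDE, only to show where the negative-side mass comes from.
x  = [0.2 0.5 0.9 1.4 2.0];          % hypothetical strictly positive sample (row)
h  = 0.4;                            % hypothetical bandwidth
xi = linspace(-1.5, 3.5, 501).';     % evaluation grid (column) that extends below zero

% One Gaussian bump per observation: rows = grid points, columns = observations
K = exp(-0.5 * ((xi - x) / h).^2) / (h * sqrt(2*pi));
f = mean(K, 2);                      % averaging the bumps gives the density estimate

neg = xi <= 0;                       % part of the grid at or below zero
massBelowZero = trapz(xi(neg), f(neg));
totalMass     = trapz(xi, f);
fprintf('mass below zero: %.4f of %.4f\n', massBelowZero, totalMass);
```

Every observation is positive, yet massBelowZero comes out greater than zero, because the bumps centered on the smallest observations are wider than their distance to zero.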
A histogram won't have this problem, since it only shows mass in bins where values actually exist. To remedy this, you can specify a different kernel (the 'kernel' option, which MathWorks describes as the type of kernel smoother), or even supply a custom one. For example:
[f,xi] = ksdensity(x,pts,'kernel','epanechnikov')
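replaces the normal kernel with an Epanechnikov kernel. This kernel has compact support, so it limits how far the estimate can spill below zero, but kernels centered on observations close to zero can still reach into the negative range.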
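Coming back to the histogram remark and to the question about a "derivative" of the empirical cumulative function: a histogram normalized as a density is a binned finite difference of the empirical CDF, since each bar height is (count in bin) / (n * bin width). A minimal sketch, assuming the same kind of gamma-distributed test data used later in this answer (gamrnd needs the Statistics and Machine Learning Toolbox):

```matlab
% Density-normalized histogram: no kernel, no smoothing across empty regions.
x = gamrnd(5,7,1000,1);                               % strictly positive example data

[f, edges] = histcounts(x, 'Normalization', 'pdf');   % bar areas sum to 1
centers    = edges(1:end-1) + diff(edges)/2;          % bin midpoints for plotting

figure
bar(centers, f, 1)                                    % width 1 = bars touch each other
xlabel('x'), ylabel('estimated density')
```

Bins that contain no observations get height zero, so no probability is assigned to intervals where there is no data. The price is a piecewise-constant estimate whose shape depends on the bin width.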
...and, proving that you should always read the documentation first, I just discovered that you can restrict the kernel density estimate to positive values only:
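x = gamrnd(5,7,1000,1);                       % example data: 1000 gamma-distributed (strictly positive) samples
[f,xi] = ksdensity(x,'support','positive');   % restrict the estimate to x > 0
figure
plot(xi,f,'linewidth',2)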
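One common way to get a density estimate that respects a positive support is to smooth the logarithm of the data and transform back. The sketch below does this by hand. Treat it as an assumption that this resembles what 'support','positive' does internally; it is written from the general technique, not from the MATLAB source, and the evaluation grid is an arbitrary choice.

```matlab
% Sketch: positive-support KDE via a log transform (assumed approach, see note above).
x  = gamrnd(5,7,1000,1);                    % strictly positive example data
xi = linspace(min(x)/2, max(x), 200).';     % hypothetical evaluation grid, all > 0

fy = ksdensity(log(x), log(xi));            % ordinary KDE of log(x), evaluated at log(xi)
f  = fy(:) ./ xi(:);                        % change of variables: f_X(x) = f_Y(log x) / x

figure
plot(xi, f, 'linewidth', 2)
```

Because the smoothing happens on log(x), which is unbounded, no kernel mass can end up below zero after transforming back.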