Using scipy's rv_continuous when creating a custom continuous distribution


Problem description

I am trying to calculate E[f(x)] for a pdf that I generated/estimated from data.

The documentation says:

New random variables can be defined by subclassing rv_continuous class and re-defining at least the _pdf or the _cdf method (normalized to location 0 and scale 1) which will be given clean arguments (in between a and b) and passing the argument check method.

If positive argument checking is not correct for your RV then you will also need to re-define the _argcheck method.

So I subclassed it and defined _pdf, but whenever I try to call:

print(my_continuous_rv.expect(lambda x: x))

scipy yells at me:

AttributeError: 'your_continuous_rv' object has no attribute 'a'

This makes sense, because I guess it is trying to figure out the lower bound of the integral; the error also prints this line:

lb = loc + self.a * scale

I tried defining the attributes self.a and self.b (which I believe are the limits of the interval on which the rv is defined) as:

self.a = float("-inf")
self.b = float("inf")

However, when I do that, it complains and says:

if N > self.numargs:
AttributeError: 'your_continuous_rv' object has no attribute 'numargs'

I was not really sure what numargs was supposed to be, but after checking scipy's code on GitHub, it looks like there is this piece of code:

if not hasattr(self, 'numargs'):
    # allows more general subclassing with *args
    self.numargs = len(shapes)

I assume this is the shape of the random variable my function is supposed to take.

Currently I am only working with a very simple random variable that takes a single float as its value, so I decided to hard-code numargs to 1. But that just led to more yelling from scipy.

What it boils down to is that the documentation does not make clear to me what I have to do when I subclass rv_continuous. I did what it said and overrode _pdf, but then it asked for self.a, which I hard-coded, and then it asked for numargs. At this point I have to conclude that I don't really know how I am expected to subclass rv_continuous. Does someone know? I can generate the pdf I want from the data I want to fit, and I just want to be able to get expected values and similar quantities from that pdf. What else do I have to initialize in rv_continuous so that it actually works?
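The class itself is not shown in the question, so the following is only a guess at the cause: an instance ends up without `a`, `b`, and `numargs` if the subclass defines its own `__init__` and never calls the base-class constructor. Below is a minimal sketch of a subclass that keeps a custom `__init__` but delegates to `rv_continuous.__init__`. The class name `FittedExponential`, the attribute `rate_hat`, the exponential model, and the sample data are all hypothetical stand-ins for "a pdf estimated from data".

```python
import numpy as np
from scipy import stats


class FittedExponential(stats.rv_continuous):
    # Hypothetical example: an exponential pdf whose rate is estimated from data.

    def __init__(self, data, **kwargs):
        # Delegate to rv_continuous so that a, b, numargs and the public
        # methods (pdf, cdf, expect, ...) are set up on the instance.
        super().__init__(a=0.0, **kwargs)
        # Maximum-likelihood rate for an exponential model: 1 / sample mean.
        self.rate_hat = 1.0 / np.mean(data)

    def _pdf(self, x):
        # No shape parameters: the fitted rate lives on the instance.
        return self.rate_hat * np.exp(-self.rate_hat * x)


data = np.array([0.2, 0.5, 1.1, 0.7, 0.3])    # made-up sample
fitted = FittedExponential(data, name='fitted_exp')

expected_x = fitted.expect(lambda x: x)       # should be close to data.mean()
print(fitted.a, fitted.b, fitted.numargs, expected_x)
```

Because `_pdf` takes no arguments besides `x`, scipy infers that there are no shape parameters, so nothing has to be hard-coded by hand.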

Recommended answer

For historical reasons, scipy distributions are instances, so you need to create an instance of your subclass. For example:

>>> import numpy as np
>>> from scipy import stats
>>> class MyRV(stats.rv_continuous):
...     def _pdf(self, x, k):
...         # exponential density with rate k on x >= 0
...         return k * np.exp(-k*x)
>>> my_rv = MyRV(name='exp', a=0.)     # instantiation

Notice the need to specify the limits of the support: the default values are a=-inf and b=inf, so here the lower limit is set to a=0.

>>> my_rv.a, my_rv.b
(0.0, inf)
>>> my_rv.numargs        # gets figured out automagically
1
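To make the role of the support limits concrete, here is a small self-contained sketch that rebuilds the same `MyRV` class and evaluates the public `pdf` method on either side of `a=0`. The variable names `below` and `inside` are only illustrative.

```python
import numpy as np
from scipy import stats


class MyRV(stats.rv_continuous):
    # Same exponential-type density as above; k is the single shape parameter.
    def _pdf(self, x, k):
        return k * np.exp(-k * x)


my_rv = MyRV(name='exp', a=0.0)    # support is [0, inf)

below = my_rv.pdf(-1.0, k=3)       # outside the support, so the public pdf is 0
inside = my_rv.pdf(0.5, k=3)       # inside the support: 3 * exp(-1.5)
print(my_rv.a, my_rv.b, my_rv.numargs, below, inside)
```

The public `pdf` wrapper checks the argument against `a` and `b` before using `_pdf`, which is why `_pdf` itself does not need to handle negative `x`.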

Once you've specified, say, _pdf, you have a working distribution instance:

>>> my_rv.cdf(4, k=3)
0.99999385578764677
>>> my_rv.rvs(k=3, size=4)
array([ 0.37696127,  1.10192779,  0.02632473,  0.25516446])
>>> my_rv.expect(lambda x: 1, args=(2,))    # k=2 here
0.9999999999999999
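Since the question was about E[f(x)], here is a short sketch that uses `expect` with a non-constant function on the same distribution. For this density the analytic values are E[x] = 1/k and E[x^2] = 2/k^2, which gives a way to sanity-check the numerical integration. The variable names are illustrative.

```python
import numpy as np
from scipy import stats


class MyRV(stats.rv_continuous):
    # Exponential-type density with rate k on x >= 0.
    def _pdf(self, x, k):
        return k * np.exp(-k * x)


my_rv = MyRV(name='exp', a=0.0)

k = 2.0
mean = my_rv.expect(lambda x: x, args=(k,))              # analytic value: 1 / k
second_moment = my_rv.expect(lambda x: x**2, args=(k,))  # analytic value: 2 / k**2
print(mean, second_moment)
```

The shape parameter is passed through `args`, in the same order as the extra arguments of `_pdf`.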
