


I am trying to calculate E[f(x)] for a pdf that I generated/estimated from data.


New random variables can be defined by subclassing rv_continuous class and re-defining at least the _pdf or the _cdf method (normalized to location 0 and scale 1) which will be given clean arguments (in between a and b) and passing the argument check method.

If positive argument checking is not correct for your RV then you will also need to re-define the _argcheck method.


So I subclassed rv_continuous and defined _pdf, but whenever I try to call:

print(my_continuous_rv.expect(lambda x: x))


scipy yells at me:

AttributeError: 'your_continuous_rv' object has no attribute 'a'


Which makes sense, because I guess it's trying to figure out the lower bound of the integral; the error also prints:

lb = loc + self.a * scale


I tried defining the attributes self.a and self.b (which I believe are the limits of the interval on which the rv is defined) as:

self.a = float("-inf")
self.b = float("inf")


However, when I do that, it complains:

if N > self.numargs:
AttributeError: 'your_continuous_rv' object has no attribute 'numargs'


I was not really sure what numargs was supposed to be, but after checking scipy's code on GitHub it looks like there is this piece of code:

if not hasattr(self, 'numargs'):
    # allows more general subclassing with *args
    self.numargs = len(shapes)


Which I assume is the shape of the random variable my function was supposed to take.


Currently I am only working with a very simple random variable that takes a single float as its value. So I decided to hard-code numargs to be 1. But that just led down the road to more yelling on scipy's part.


What it boils down to is that the documentation does not make clear to me what I have to do when I subclass. I did what it says and overrode _pdf, but then it asks for self.a, which I hard-coded, and then it asks for numargs. At this point I have to conclude that I don't really know how rv_continuous is meant to be subclassed. Does someone know? I can generate the pdf I want from the data I want to fit, and I just want to be able to get expected values and similar quantities from that pdf. What else do I have to initialize in rv_continuous so that it actually works?



For historical reasons, scipy distributions are instances, so you need to create an instance of your subclass. For example:

>>> import numpy as np
>>> from scipy import stats
>>> class MyRV(stats.rv_continuous):
...     def _pdf(self, x, k):
...         return k * np.exp(-k * x)
...
>>> my_rv = MyRV(name='exp', a=0.)     # instantiation


Notice the need to specify the limits of the support: the default values are a=-inf and b=inf, and this pdf is only valid for x >= 0, hence a=0.

>>> my_rv.a, my_rv.b
(0.0, inf)
>>> my_rv.numargs        # gets figured out automagically
1
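
As a minimal sketch of the point about the support limits, the snippet below instantiates the same kind of subclass with and without a=0 and inspects the attributes the question was trying to set by hand. The names ExpRV, default_rv and bounded_rv are made up for this illustration.

```python
import numpy as np
from scipy import stats


class ExpRV(stats.rv_continuous):
    # Only _pdf is overridden; k is the single shape parameter.
    def _pdf(self, x, k):
        return k * np.exp(-k * x)


# No limits given: the support defaults to the whole real line.
default_rv = ExpRV(name='exp_default')
# Lower limit passed to the constructor instead of assigning self.a by hand.
bounded_rv = ExpRV(name='exp_bounded', a=0.0)

print(default_rv.a, default_rv.b)
print(bounded_rv.a, bounded_rv.b)
# numargs is inferred from the signature of _pdf (one shape parameter, k).
print(bounded_rv.numargs)
```

Passing a and b to the constructor, rather than setting them inside the class, lets the base class __init__ run and fill in numargs and the other bookkeeping attributes.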


Once you've specified, say, _pdf, you have a working distribution instance:

>>> my_rv.cdf(4, k=3)                       # close to 1 - exp(-12)
>>> my_rv.rvs(k=3, size=4)
array([ 0.37696127,  1.10192779,  0.02632473,  0.25516446])
>>> my_rv.expect(lambda x: 1, args=(2,))    # k=2 here; the pdf integrates to 1, so this is close to 1
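
To tie this back to the original goal (E[f(x)] for a pdf estimated from data), here is a rough sketch under some assumptions. It first checks expect against the known mean 1/k of the exponential example, then wraps a scipy.stats.gaussian_kde fit in an rv_continuous subclass. KDEDist, data and the other variable names are hypothetical, and using a Gaussian KDE as the density estimate is an assumption of this sketch, not something from the question.

```python
import numpy as np
from scipy import stats


class MyRV(stats.rv_continuous):
    def _pdf(self, x, k):
        return k * np.exp(-k * x)


my_rv = MyRV(name='exp', a=0.0)
# E[x] for this pdf is 1/k, so with k=2 the result should be close to 0.5.
mean_exp = my_rv.expect(lambda x: x, args=(2.0,))


class KDEDist(stats.rv_continuous):
    """Hypothetical wrapper around a 1-D Gaussian KDE fitted to data."""

    def __init__(self, data, **kwargs):
        # Let the base class set up a, b, numargs, etc. first.
        super().__init__(**kwargs)
        self._kde = stats.gaussian_kde(data)

    def _pdf(self, x):
        # gaussian_kde instances are callable and evaluate the estimated pdf.
        return self._kde(np.ravel(x))

    def _cdf(self, x):
        # Optional: an explicit cdf avoids integrating the pdf numerically.
        return np.array([self._kde.integrate_box_1d(-np.inf, xi)
                         for xi in np.ravel(x)])


rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=200)   # stand-in for real data

kde_rv = KDEDist(data, name='kde')      # support defaults to (-inf, inf)
kde_mean = kde_rv.expect(lambda x: x)   # E[x] under the estimated pdf

print(mean_exp)
print(kde_mean, data.mean())
```

The mean of a Gaussian KDE equals the sample mean, which gives a quick sanity check on the numerical integration. Any other f can be passed to expect in the same way.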

