python - ValueError : A value in x_new is below the interpolation range

This is a scikit-learn error that I get when I run

my_estimator = LassoLarsCV(fit_intercept=False, normalize=False, positive=True, max_n_alphas=1e5)

Note that if I reduce max_n_alphas from 1e5 to 1e4, I no longer get this error.

Does anyone know what is going on?

The error occurs when I call

my_estimator.fit(x, y)

I have 40 data points in 40k dimensions.

The full stack trace is shown below

  File "/usr/lib64/python2.7/site-packages/sklearn/linear_model/least_angle.py", line 1113, in fit
    axis=0)(all_alphas)
  File "/usr/lib64/python2.7/site-packages/scipy/interpolate/polyint.py", line 79, in __call__
    y = self._evaluate(x)
  File "/usr/lib64/python2.7/site-packages/scipy/interpolate/interpolate.py", line 498, in _evaluate
    out_of_bounds = self._check_bounds(x_new)
  File "/usr/lib64/python2.7/site-packages/scipy/interpolate/interpolate.py", line 525, in _check_bounds
    raise ValueError("A value in x_new is below the interpolation "
ValueError: A value in x_new is below the interpolation range.

Best answer

There must be something particular about your data. LassoLarsCV() seems to work fine on this synthetic example with well-behaved data:

import numpy
import sklearn.linear_model

# create 40000 x 40 sample data from a linear model with a bit of noise
npoints = 40000
ndims = 40
numpy.random.seed(1)
X = numpy.random.random((npoints, ndims))
w = numpy.random.random(ndims)
y = X.dot(w) + numpy.random.random(npoints) * 0.1

# max_n_alphas is passed as an int; newer scikit-learn versions reject a float.
# normalize was removed in scikit-learn 1.2; drop that argument on newer versions.
clf = sklearn.linear_model.LassoLarsCV(fit_intercept=False, normalize=False, max_n_alphas=int(1e6))
clf.fit(X, y)

# coefficients are almost exactly recovered, this prints 0.00377
print(max(abs(clf.coef_ - w)))

# alphas actually used are 41 or ndims+1
print(clf.alphas_.shape)

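This is with sklearn 0.16, where I don't have the positive=True option.

I'm not sure why you would want to use such a large max_n_alphas anyway. I don't know why 1e+4 works and 1e+5 doesn't in your case, but I suspect that the paths you get with max_n_alphas=ndims+1 and max_n_alphas=1e+4 (or any other value) would be identical for well-behaved data. The optimal alpha estimated by cross-validation, clf.alpha_, would also be the same. See the "Lasso path using LARS" example for what alpha is trying to do.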
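
To see how many alphas a LARS Lasso path actually produces, here is a minimal sketch using sklearn.linear_model.lars_path directly. The data sizes and the random seed are my own assumptions for illustration, not values from the question, and the exact node count can vary with the data:

```python
import numpy as np
from sklearn.linear_model import lars_path

# Small, well-conditioned synthetic problem (sizes chosen only for illustration).
rng = np.random.RandomState(0)
npoints, ndims = 200, 5
X = rng.random_sample((npoints, ndims))
w = rng.random_sample(ndims)
y = X.dot(w) + 0.1 * rng.random_sample(npoints)

# method="lasso" computes the Lasso path with the LARS algorithm.
alphas, active, coefs = lars_path(X, y, method="lasso")

print(len(alphas))   # number of nodes on the path
print(coefs.shape)   # (ndims, number of nodes)
```

Each column of coefs holds the coefficients at one node of the path, and alphas starts at the largest value (all coefficients zero) and decreases from there. A max_n_alphas far above the number of nodes should not add anything to such a path.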

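Also, the LassoLars documentation describes alphas_ as an array of shape (n_alphas + 1,), where n_alphas is bounded by n_features (among other limits).

So it makes sense to end up with alphas_ of size ndims + 1 (i.e. n_features + 1) above.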

P.S. I also tested with sklearn 0.17.1 and positive=True, and with a mix of positive and negative coefficients. The result is the same: alphas_ has size ndims + 1 or smaller.
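
For reference, the exception in the traceback is raised by the bounds check in scipy.interpolate.interp1d, which LassoLarsCV.fit calls on all_alphas. The sketch below only reproduces that check in isolation with made-up sample points; it does not explain why your data triggers it:

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical sample points; by default interp1d only accepts queries in [1.0, 3.0].
x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])
f = interp1d(x, y)

inside = float(f(2.5))  # within range: plain linear interpolation

try:
    f(0.5)  # below min(x): the bounds check raises ValueError
    message = ""
except ValueError as exc:
    message = str(exc)

print(inside)
print(message)
```

The exact wording of the message differs between SciPy versions, but it reports a value in x_new below the interpolation range, as in the traceback above.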

Regarding python - ValueError : A value in x_new is below the interpolation range, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36320787/