python - How to fix "polyfit maybe poorly conditioned" in numpy?

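I am trying to fit a polynomial to a set of data with numpy's polyfit.
The code below runs successfully. When the order reaches 20 (which is very high), the fitted line appears to fit the data well. At the end, however, it reports "Polyfit may be poorly conditioned".
If I understand correctly, is that because the higher the degree, the more sensitive the fit becomes to the data, i.e. the more easily it is influenced by the data? How can I fix this?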

import numpy as np
import matplotlib.pyplot as plt

def gen_data_9(length=5000):
    # x is uniform on [-2*pi, 2*pi]; y is f(x) plus Gaussian noise
    x = 2.0 * (np.random.rand(length) - 0.5) * np.pi * 2.0
    f = lambda x: np.exp(-x**2) * (-x) * 5 + x / 3
    y = f(x) + np.random.randn(len(x)) * 0.5
    return x, y, f

x, y, f = gen_data_9()

fig, ax = plt.subplots(3, 3, figsize=(16, 16))

for n in range(3):
    for k in range(3):

        # orders 1, 11, 21 / 21, 31, 41 / 41, 51, 61
        order = 20*n + 10*k + 1
        z = np.polyfit(x, y, order)
        p = np.poly1d(z)

        ax[n, k].scatter(x, y, label="Real data", s=1)
        ax[n, k].scatter(x, p(x), label="Polynomial with order={}".format(order),
                         color='C1', s=1)
        ax[n, k].legend()

fig.show()

Best answer

TL;DR: In this case, the warning means: use a lower order!
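One way to see which orders are affected without relying on the warning text is to ask `np.polyfit` for its diagnostics. The sketch below assumes data generated the same way as in the question (the seed and the helper name `fit_rank_report` are made up for illustration). With `full=True`, `np.polyfit` returns the effective rank of the least-squares system instead of warning; a rank below the number of coefficients is the situation the warning reports.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.uniform(-2 * np.pi, 2 * np.pi, 5000)
y = np.exp(-x**2) * (-x) * 5 + x / 3 + rng.normal(scale=0.5, size=x.size)

def fit_rank_report(x, y, order):
    """Return (effective rank, number of coefficients) for a fit of this order."""
    # full=True returns diagnostics and suppresses the rank warning
    coeffs, residuals, rank, singular_values, rcond = np.polyfit(
        x, y, order, full=True
    )
    return int(rank), order + 1

for order in (1, 11, 21, 41, 61):
    rank, n_coeffs = fit_rank_report(x, y, order)
    status = "ok" if rank == n_coeffs else "rank deficient"
    print(f"order={order:2d}: rank {rank}/{n_coeffs} -> {status}")
```

A full-rank result does not by itself mean the fit is good; it only means the solver did not have to discard directions in the coefficient space.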

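The documentation for np.polyfit notes that fitting polynomial coefficients is inherently badly conditioned when the degree of the polynomial is large or the interval of sample points is badly centered, and that the quality of the fit should always be checked in these cases.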

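In other words, the warning tells you to double-check the results. If they look fine, don't worry. But are they fine? To answer that, you should not evaluate the quality of the fit only at the data points that were used for fitting (those usually match well, especially when overfitting). Consider this: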

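# Reuses x and y from the question; xp is a dense, evenly spaced grid on [-2*pi, 2*pi]
xp = np.linspace(-1, 1, 10000) * 2 * np.pi

fig, ax = plt.subplots(3, 3, figsize=(16, 16))

for n in range(3):
    for k in range(3):

        order = 20*n + 10*k + 1
        print(order)
        z = np.polyfit(x, y, order)
        p = np.poly1d(z)

        ax[n, k].scatter(x, y, label="Real data", s=1)
        # draw the fitted polynomial as a line on the dense grid, not only at the samples
        ax[n, k].plot(xp, p(xp), label="Polynomial with order={}".format(order), color='C1')
        ax[n, k].legend()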

Here we evaluate the polyfit on a grid of points spaced more finely than the sample data. This is the result:

[Figure: 3x3 grid of panels, one per order, showing the data and the fitted polynomial evaluated on xp]

You can see that at around order 40 the fit really breaks down. This is consistent with the warnings I get.
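To turn "use a lower order" into a number, one option is to hold out part of the data and compare the error on the held-out points across orders. The following is a minimal sketch under stated assumptions, not the method used above: the data is regenerated with a fixed seed, every fifth point is held out, and the fit uses `numpy.polynomial.Chebyshev.fit` rather than `np.polyfit`. That swap is deliberate: `Chebyshev.fit` maps the given domain onto [-1, 1] and uses a Chebyshev basis, which is generally better conditioned than raw powers of x. The names `validation_rmse`, `orders`, and `best_order` are illustrative.

```python
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.uniform(-2 * np.pi, 2 * np.pi, 5000)
y = np.exp(-x**2) * (-x) * 5 + x / 3 + rng.normal(scale=0.5, size=x.size)

# Hold out every fifth point; fit on the rest
held_out = np.arange(x.size) % 5 == 0
x_train, y_train = x[~held_out], y[~held_out]
x_val, y_val = x[held_out], y[held_out]

def validation_rmse(order):
    """Fit on the training points, return RMSE on the held-out points."""
    # domain is given explicitly so held-out points never fall outside it
    p = Chebyshev.fit(x_train, y_train, order,
                      domain=[-2 * np.pi, 2 * np.pi])
    return float(np.sqrt(np.mean((p(x_val) - y_val) ** 2)))

orders = [1, 5, 11, 21, 31, 41]
scores = {order: validation_rmse(order) for order in orders}
best_order = min(scores, key=scores.get)

for order, rmse in scores.items():
    print(f"order={order:2d}: held-out RMSE = {rmse:.3f}")
print("lowest held-out RMSE at order", best_order)
```

Since the noise was generated with standard deviation 0.5, a held-out RMSE close to 0.5 is about as good as any model can do on this data; among orders that reach that level, the lowest one is usually the safer choice.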