Python lmfit: reduced chi-square too small after a weighted fit

Problem Description

I am running a fit in Python 2.7 with lmfit, using some test data and the following code. I require a weighted fit with weights of 1/y (using the Levenberg-Marquardt routine). I have defined the weights and am using them here:

from __future__ import division
from numpy import array, var
from lmfit import Model
from lmfit.models import GaussianModel, LinearModel

import matplotlib.pyplot as plt
import seaborn as sns

xd = array([1267, 1268, 1269, 1270, 1271, 1272, 1273, 1274, 1275, 1276,
    1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288,
     1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300,
     1301, 1302, 1303, 1304, 1305, 1306, 1307, 1308, 1309, 1310, 1311, 1312,
     1313, 1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321, 1322, 1323, 1324,
     1325, 1326, 1327, 1328, 1329, 1330, 1331, 1332, 1333, 1334])
yd = array([238, 262, 255, 271, 270, 281, 261, 278, 280, 254, 289, 285, 304, 314,
    329, 342, 379, 450, 449, 564, 613, 705, 769, 899, 987, 1043, 1183, 1295, 1298,
    1521, 1502, 1605, 1639, 1572, 1659, 1558, 1476, 1397, 1267, 1193, 1016, 951,
    835, 741, 678, 558, 502, 480, 442, 399, 331, 334, 308, 283, 296, 265, 264,
    273, 258, 270, 262, 263, 239, 263, 251, 246, 246, 234])

# Composite model: Gaussian peak on a linear background
mod = GaussianModel() + LinearModel()
pars = mod.make_params(amplitude=25300, center=1299, sigma=7, slope=0, intercept=450)
# Weighted Levenberg-Marquardt fit with weights of 1/y
result = mod.fit(yd, pars, method='leastsq', x=xd, weights=1./yd)
# R^2; note that result.residual already includes the weights
rsq = 1 - result.residual.var() / var(yd)
print(result.fit_report())
print(rsq)

plt.plot(xd, yd,         'bo', label='raw')
plt.plot(xd, result.init_fit, 'k--', label='Initial_Guess')
plt.plot(xd, result.best_fit, 'r-', label='Best')
plt.legend()
plt.show()

The output is:

[[Model]]
    (Model(gaussian) + Model(linear))
[[Fit Statistics]]
    # function evals   = 27
    # data points      = 68
    # variables        = 5
    chi-square         = 0.099
    reduced chi-square = 0.002
    Akaike info crit   = -434.115
    Bayesian info crit = -423.017
[[Variables]]
    sigma:       7.57360038 +/- 0.063715 (0.84%) (init= 7)
    center:      1299.41410 +/- 0.071046 (0.01%) (init= 1299)
    amplitude:   25369.3304 +/- 263.0961 (1.04%) (init= 25300)
    slope:      -0.15015228 +/- 0.071540 (47.65%) (init= 0)
    intercept:   452.838215 +/- 93.28860 (20.60%) (init= 450)
    fwhm:        17.8344656 +/- 0.150037 (0.84%)  == '2.3548200*sigma'
    height:      1336.33919 +/- 17.28192 (1.29%)  == '0.3989423*amplitude/max(1.e-15, sigma)'
...
0.999999993313

The last line of the output (just above, printed by the print(rsq) statement immediately before plt.plot(xd, yd, 'bo', label='raw')) is the R^2; the resulting fit is attached here.

The R^2 and a visual inspection of the output suggest this is a reasonable fit. I am expecting a reduced chi-squared of order 1.00 (source). However, the returned reduced chi-squared is several orders of magnitude smaller than 1.00.
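For reference, the reduced chi-square lmfit reports is just the chi-square divided by the degrees of freedom; this can be checked directly on the fit result (ndata, nvarys, chisqr, and redchi are standard attributes of lmfit's ModelResult):

dof = result.ndata - result.nvarys   # 68 data points - 5 variables = 63
print(result.chisqr / dof)           # ~0.002, matching the reported reduced chi-square
print(result.redchi)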

Since the default in lmfit is no weights and I need a weighted fit, I have defined weights, but I think I need to be specifying them differently. My suspicion is that this specification of the weights might be causing the reduced chi-squared to be so small.

Is there a different way to specify weights, or some other parameter, such that the reduced chi-squared after the curve fit is close to, or of the same order of magnitude as, 1.00?

Recommended Answer

The weights in lmfit are a multiplicative factor applied to the residual that is minimized in the least-squares sense. That is, they replace

residual = model - data

with

residual = (model - data) * weights
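
This can be verified against the reported statistics: the chi-square in the fit report is the sum of these squared, weighted residuals. A minimal sketch using the result object from the question's script (best_fit and chisqr are standard ModelResult attributes):

import numpy as np

# chi-square = sum of squared weighted residuals
weighted_resid = (result.best_fit - yd) * (1. / yd)
print(np.sum(weighted_resid ** 2))   # ~0.099, matching result.chisqr

With weights of 1/y and y values in the hundreds to thousands, each weighted residual is tiny, so the chi-square (and hence the reduced chi-square) comes out orders of magnitude below 1.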

A common approach, and one that I think you might be intending, is to say that the weights should be 1.0/variance_in_data, as that is what usually yields a reduced chi-square around 1 for a good fit, as the excellent writeup you link to discusses.

As discussed there, the problem is determining the variance in the data. In many cases, such as when the signal is dominated by counting statistics, the variance can be estimated from the data itself: for Poisson counting noise the variance is approximately the counts, so the standard deviation is sqrt(data) and the weights are 1/sqrt(data). This ignores many other sources of noise, but is often a good starting point. As it happens, I believe using

result = model.fit(..., weights=np.sqrt(1.0/yd))

will lead to a reduced chi-square of around 0.8 for your case. I think that is probably what you want.
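
For completeness, a runnable version of that suggestion, reusing mod, pars, xd, and yd from the question's script:

import numpy as np

# Counting statistics: sigma ~ sqrt(y), so weight each point by 1/sigma = 1/sqrt(y)
result2 = mod.fit(yd, pars, method='leastsq', x=xd, weights=1.0/np.sqrt(yd))
print(result2.redchi)   # should now be of order 1 (~0.8 for this data)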

Also, to clarify a related point: the writeup you link discusses scaling the uncertainties in the fitted parameters when the reduced chi-square is far from 1. Lmfit does the scaling described there by default (the scale_covar option can turn this off), so changing the scale of the weights won't change the scale of the uncertainties in the parameters sigma, center, and so on. The uncertainties (and best-fit values) will change somewhat, because changing the weighting changes the emphasis placed on each data point, but the best-fit values won't change much, and the estimated uncertainties should stay the same order of magnitude even if your estimate of the variance in the data (and so the reduced chi-square) is off by a few orders of magnitude.

That is, changing your script to use weights=1.0/np.sqrt(yd) will bring reduced chi-square much closer to 1, but it will not change the uncertainties in the fitted variables very much.
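
As a quick check of that point, one can run the fit with scale_covar on (the default) and off and compare the reported standard errors; a sketch, again assuming the mod, pars, xd, and yd from the question's script:

# Default: the covariance (and thus each stderr) is rescaled by the reduced chi-square
res_scaled = mod.fit(yd, pars, x=xd, weights=1.0/yd)
# Unscaled: the stderr values now depend on the absolute scale of the weights
res_unscaled = mod.fit(yd, pars, x=xd, weights=1.0/yd, scale_covar=False)
print(res_scaled.params['sigma'].stderr, res_unscaled.params['sigma'].stderr)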
