问题描述
我在实验室中使用仪器测量了以下数据.由于该仪器会根据其直径将不同大小的颗粒收集在垃圾箱中,因此测量结果实际上是装箱"的:
I have the following data measured using an instrument in the lab. Since the instrument collects particles of different sizes in bins based upon their diameter the measurements are essentially "binned":
import numpy as np
import matplotlib.pylab as plt
from lmfit import models
y = np.array([196, 486, 968, 2262, 3321, 4203, 15072, 46789, 95201, 303494, 421484, 327507, 138931, 27973])
bins = np.array([0.0150, 0.0306, 0.0548, 0.0944, 0.1540, 0.2560, 0.3830, 0.6050, 0.9510, 1.6400, 2.4800, 3.6700, 5.3800, 9.9100, 15])
bin_width=np.diff(bins)
x_plot = np.add(bins[:-1],np.divide(bin_width,2))
x=x_plot
y=y
在绘制时,这里是数据的外观.以x标度为单位,有一个模式在0.1左右,另一种模式在2左右.
When plotted here is how the data look. There is one mode around 0.1 and another mode around 2 in the units of the x-scale.
在此研究领域中,通常将多峰"对数正态分布拟合到此类数据:鉴于此,我已使用LMFIT拟合了大约2的模式:
Within this research area it is common to fit "multimodal" lognormal distributions to such data: Given this I have fitted the mode around 2 using LMFIT:
model = models.LognormalModel()
params = model.make_params(center=1.5, sigma=0.6, amplitude=2214337)
result = model.fit(y, params, x=x)
print(result.fit_report())
plt.plot(x, y, label='data')
plt.plot(x, result.best_fit, label='fit')
plt.xscale("log")
plt.yscale("log")
plt.legend()
plt.show()
如预期的那样,这将非常适合第二模式在2附近.我的问题是,我还将如何着手在0.1附近适应第二模式(基本上这两种模式的总和应该适合数据)?我意识到也可以争论说三种模式会更好,但是我假设一旦我了解了如何使用两种模式,那么添加第三种就不那么容易了.
As expected this results in a good fit for the second mode around 2. My question is how would I also go about fitting a second mode around 0.1 (essentially a sum of the two modes should fit the data)? I realise it could also be argued that three modes would be better but I assume once I understand how to use two modes, the addition of a third should be trivial.
推荐答案
lmfit.Models
可以一起添加,就像:
lmfit.Models
can be added together, as with:
model = (models.LognormalModel(prefix='p1_') +
models.LognormalModel(prefix='p2_') +
models.LognormalModel(prefix='p3_') )
params = model.make_params(p1_center=0.3, p1_sigma=0.2, p1_amplitude=1e4,
p2_center=1.0, p2_sigma=0.4, p2_amplitude=1e6,
p3_center=1.5, p3_sigma=0.6, p3_amplitude=2e7)
在复合模型中,模型的每个组件都有其自己的前缀"(任何字符串),该前缀带有参数名称.符合以下条件后,您可以获取有关Models组件的字典:
In a composite model, each component of the Model gets its own "prefix" (any string) that prepends the parameter names. You can get a dictionary of a Models components after the fit with:
components = result.eval_components()
# {'p1_': Array, 'p2_': Array, 'p3_': Array}
for compname, comp in components.keys():
plt.plot(x, comp, label=compname)
对于拟合将在半对数或对数对数图中表示的数据,您可以考虑将模型拟合至 log(y)
.否则,对于 y
的非常低的值,拟合对失配不会非常敏感.
For fitting data that you would represent on a semi-log or log-log plot, you might consider fitting a model to log(y)
. Otherwise, the fit will not be very sensitive to misfit at very low values of y
.
请注意, lmfit
模型和参数支持边界.您可能想要或发现需要放置边界,例如
Note that lmfit
models and parameters support bounds. You may want to or find that you need to place bounds such as
params['p1_amplitude'].min = 0
params['p1_sigma'].min = 0
params['p1_center'].max = 0.5
params['p3_center'].min = 1.0
这篇关于适合“多式联运"使用python对数正态分布到数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!