问题描述
我有一个由特定温度曲线组成的数据集,我想在温度曲线上拟合或映射测量点,如下所示:
I have a dataset consisting of a certain temperature profile and I wanna fit or map the measurement points on temperature profile which is following:
停留时间: 30分钟
上架时间: 1分钟
周期数: 1000个周期
测量点时间段: 16分钟
测量点可以在高正则+150或低正则-40中发生
Measure points can be happened in either in high regim +150 or low regim -40
注意:T0(初始时间)不清楚,因此时间参考也不清楚. T0 = 0.
Note: The T0 (initial time) is not clear so time reference is not clear eg. T0=0 .
我已经在Pandas DataFrame中获取了数据:
I already fetched the data in Pandas DataFrame:
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit
df = pd.read_csv('D:\SOF.csv', header=None)
data = {'A': A[:,0], 'B': B[:,0], 'Temperature': Temperature[:,0],
'S':S, 'C':C , 'Measurement_Points':MP}
dff = pd.DataFrame(data, columns=['A','B','Temperature','S','C','MP'], index = id_set[:,0])
# Temperature's range is [-40,+150]
# MP's range is [0-3000] from 1st MP till last one
MP = int(len(dff)/480) # calculate number of measurement points
print(MP)
for cycle in range(MP):
j = cycle * 480
#use mean or average of each 480 values from temperature column of DataFrame to pass for fit on Thermal profile
Mean_temp = np.mean(df['Temperature'].iloc[j:j+480]) # by using Mean from numpy
#Mean_temp = df.groupby('Temperature').mean() #by using groupby
到目前为止,我只是基于答案和此 但是我想知道拟合过程如何在这里正常工作,我想将温度值仅舍入到最接近的或者 -40 或 +150 .如果有人可以帮助我,我会很好!
So far I just find curve_fit
from scipy.optimize
based on this answer and this post but I am wondering how the fitting process could work right here in the other hand I would like Temperature values round only to the nearest either -40 or +150 .I would be nice if someone can help me!
更新:标准周期性热剖面图如下:
Update: Standard periodic Thermal profile pic is following:
预期结果:
更新的数据示例: 数据
推荐答案
这就是我的出发点:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
### to generate test data
def temp( t , low, high, period, ramp ):
tRed = t % period
dwell = period / 2. - ramp
if tRed < dwell:
out = high
elif tRed < dwell + ramp:
out = high - ( tRed - dwell ) / ramp * ( high - low )
elif tRed < 2 * dwell + ramp:
out = low
elif tRed <= period:
out = low + ( tRed - 2 * dwell - ramp)/ramp * ( high -low )
else:
assert 0
return out + np.random.normal()
### A continuous function that somewhat fits the data
### but definitively gets the period and levels.
### The ramp is less well defined
def fit_func( t, low, high, period, s, delta):
return ( high + low ) / 2. + ( high - low )/2. * np.tanh( s * np.sin( 2 * np.pi * ( t - delta ) / period ) )
time1List = np.arange( 300 ) * 16
time2List = np.linspace( 0, 300 * 16, 7213 )
tempList = np.fromiter( ( temp(t - 6.3 , 41, 155, 63.3, 2.05 ) for t in time1List ), np.float )
funcList = np.fromiter( ( fit_func(t , 41, 155, 63.3, 10., 0 ) for t in time2List ), np.float )
sol, err = curve_fit( fit_func, time1List, tempList, [ 40, 150, 63, 10, 0 ] )
print sol
fittedLow, fittedHigh, fittedPeriod, fittedS, fittedOff = sol
realHigh = fit_func( fittedPeriod / 4., *sol)
realLow = fit_func( 3 / 4. * fittedPeriod, *sol)
print "high, low : ", [ realHigh, realLow ]
print "apprx ramp: ", fittedPeriod/( 2 * np.pi * fittedS ) * 2
realAmp = realHigh - realLow
rampX, rampY = zip( *[ [ t, d ] for t, d in zip( time1List, tempList ) if ( ( d < realHigh - 0.05 * realAmp ) and ( d > realLow + 0.05 * realAmp ) ) ] )
topX, topY = zip( *[ [ t, d ] for t, d in zip( time1List, tempList ) if ( ( d > realHigh - 0.05 * realAmp ) ) ] )
botX, botY = zip( *[ [ t, d ] for t, d in zip( time1List, tempList ) if ( ( d < realLow + 0.05 * realAmp ) ) ] )
fig = plt.figure()
ax = fig.add_subplot( 2, 1, 1 )
bx = fig.add_subplot( 2, 1, 2 )
ax.plot( time1List, tempList, marker='x', linestyle='', zorder=100 )
ax.plot( time2List, fit_func( time2List, *sol ), zorder=0 )
bx.plot( time1List, tempList, marker='x', linestyle='' )
bx.plot( time2List, fit_func( time2List, *sol ) )
bx.plot( rampX, rampY, linestyle='', marker='o', markersize=10, fillstyle='none', color='r')
bx.plot( topX, topY, linestyle='', marker='o', markersize=10, fillstyle='none', color='#00FFAA')
bx.plot( botX, botY, linestyle='', marker='o', markersize=10, fillstyle='none', color='#80DD00')
bx.set_xlim( [ 0, 800 ] )
plt.show()
提供:
>> [155.0445024 40.7417905 63.29983807 13.07677546 -26.36945489]
>> high, low : [155.04450237880076, 40.741790521444436]
>> apprx ramp: 1.540820542195840
有几件事要注意.如果斜率比停留时间小,我的拟合功能会更好.此外,在这里可以找到几篇文章,其中讨论了阶跃函数的拟合.通常,由于拟合需要有意义的导数,因此离散函数是一个问题.至少有两种解决方案. a)制作一个连续的版本,进行拟合,然后根据您的喜好使结果离散化;或者b)提供一个离散的函数和一个手动连续导数.
There is a few things to note. My fit function works better if the ramp is small compared to the dwell time. Moreover, one will find several posts here where the fitting of step functions is discussed. In general, as fitting requires a meaningful derivative, discrete functions are a problem. There are at least two solutions. a) make a continuous version, fit, and make the result discrete to your liking or b) provide a discrete function and a manual continuous derivative.
编辑
这就是我要处理的最新发布的数据集:
So here is what I get working with your newly posted data set:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit, minimize
def partition( inList, n ):
return zip( *[ iter( inList ) ] * n )
def temp( t, low, high, period, ramp, off ):
tRed = (t - off ) % period
dwell = period / 2. - ramp
if tRed < dwell:
out = high
elif tRed < dwell + ramp:
out = high - ( tRed - dwell ) / ramp * ( high - low )
elif tRed < 2 * dwell + ramp:
out = low
elif tRed <= period:
out = low + ( tRed - 2 * dwell - ramp)/ramp * ( high -low )
else:
assert 0
return out
def chi2( params, xData=None, yData=None, verbose=False ):
low, high, period, ramp, off = params
th = np.fromiter( ( temp( t, low, high, period, ramp, off ) for t in xData ), np.float )
diff = ( th - yData )
diff2 = diff**2
out = np.sum( diff2 )
if verbose:
print '-----------'
print th
print diff
print diff2
print '-----------'
return out
# ~ return th
def fit_func( t, low, high, period, s, delta):
return ( high + low ) / 2. + ( high - low )/2. * np.tanh( s * np.sin( 2 * np.pi * ( t - delta ) / period ) )
inData = np.loadtxt('SOF2.csv', skiprows=1, delimiter=',' )
inData2 = inData[ :, 2 ]
xList = np.arange( len(inData2) )
inData480 = partition( inData2, 480 )
xList480 = partition( xList, 480 )
inDataMean = np.fromiter( (np.mean( x ) for x in inData480 ), np.float )
xMean = np.arange( len( inDataMean) ) * 16
time1List = np.linspace( 0, 16 * len(inDataMean), 500 )
sol, err = curve_fit( fit_func, xMean, inDataMean, [ -40, 150, 60, 10, 10 ] )
print sol
# ~ print chi2([-49,155,62.5,1 , 8.6], xMean, inDataMean )
res = minimize( chi2, [-44.12, 150.0, 62.0, 8.015, 12.3 ], args=( xMean, inDataMean ), method='nelder-mead' )
# ~ print res
print res.x
# ~ print chi2( res.x, xMean, inDataMean, verbose=True )
# ~ print chi2( [-44.12, 150.0, 62.0, 8.015, 6.3], xMean, inDataMean, verbose=True )
fig = plt.figure()
ax = fig.add_subplot( 2, 1, 1 )
bx = fig.add_subplot( 2, 1, 2 )
for x,y in zip( xList480, inData480):
ax.plot( x, y, marker='x', linestyle='', zorder=100 )
bx.plot( xMean, inDataMean , marker='x', linestyle='' )
bx.plot( time1List, fit_func( time1List, *sol ) )
bx.plot( time1List, np.fromiter( ( temp( t , *res.x ) for t in time1List ), np.float) )
bx.plot( time1List, np.fromiter( ( temp( t , -44.12, 150.0, 62.0, 8.015, 12.3 ) for t in time1List ), np.float) )
plt.show()
>> [-49.53569904 166.92138068 62.56131027 1.8547409 8.75673747]
>> [-34.12188737 150.02194584 63.81464913 8.26491754 13.88344623]
如您所见,斜坡上的数据点不适合.因此,可能16分钟的时间不是那么恒定吗?这将是一个问题,因为这不是局部x误差,而是累积效应.
As you can see, the data point on the ramp does not fit in. So, it might be that the 16 min time is not that constant? That would be a problem as this is not a local x-error but an accumulating effect.
这篇关于如何拟合温度/热曲线上的数据?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!