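I have a data frame plotted in time order, as shown in the image below (the image shows two data frames plotted together).
The line plots are the thin orange and blue lines. The regression (trend) line for each data set is the thick orange or blue line, respectively.
How can I compute, in Python, the sum, minimum, maximum, and mean of the R values (the distances from the regression line) separately for the points above and below the regression line? That is, the sum, min, max, and mean of the positive R values and of the negative R values.
There may be a term for what I am trying to do, but I am new to statistics and do not know it. Can anyone guide me?
Update: the data I have looks like the following (the actual data is much longer). The overall trend is downward, with small upswings in between.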
Time Values
101 20.402
102 20.302
103 20.202
104 20.102
105 20.002
106 19.902
107 19.802
108 19.702
109 19.602
110 19.502
111 19.402
112 19.302
113 19.202
114 20.337
115 20.437
116 20.537
117 18.802
118 18.702
119 18.602
120 18.502
121 18.402
122 18.302
123 18.202
124 18.102
125 18.002
126 17.902
127 17.802
128 17.702
129 17.602
130 17.502
131 18.502
132 18.402
133 18.302
134 17.702
135 17.602
136 17.502
137 17.402
138 17.302
139 17.202
140 17.102
141 17.002
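Best answer
This example uses the data you posted and computes statistics for the positive and negative errors separately, excluding errors that are exactly zero.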
import numpy
xData = numpy.array([101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0, 108.0, 109.0, 110.0, 111.0, 112.0, 113.0, 114.0, 115.0, 116.0, 117.0, 118.0, 119.0, 120.0, 121.0, 122.0, 123.0, 124.0, 125.0, 126.0, 127.0, 128.0, 129.0, 130.0, 131.0, 132.0, 133.0, 134.0, 135.0, 136.0, 137.0, 138.0, 139.0, 140.0, 141.0])
yData = numpy.array([20.402, 20.302, 20.202, 20.102, 20.002, 19.902, 19.802, 19.702, 19.602, 19.502, 19.402, 19.302, 19.202, 20.337, 20.437, 20.537, 18.802, 18.702, 18.602, 18.502, 18.402, 18.302, 18.202, 18.102, 18.002, 17.902, 17.802, 17.702, 17.602, 17.502, 18.502, 18.402, 18.302, 17.702, 17.602, 17.502, 17.402, 17.302, 17.202, 17.102, 17.002])
polynomialOrder = 1 # example straight line
# curve fit the test data
fittedParameters = numpy.polyfit(xData, yData, polynomialOrder)
print('Fitted Parameters:', fittedParameters)
modelPredictions = numpy.polyval(fittedParameters, xData)
# error = prediction - data, so a positive error means the data point lies BELOW the fitted line
fitErrors = modelPredictions - yData
positiveErrors = []
negativeErrors = []
# this logic excludes errors of exactly zero
for error in fitErrors:
    if error < 0.0:
        negativeErrors.append(error)
    if error > 0.0:
        positiveErrors.append(error)
print('Positive error statistics:')
print(' sum =', numpy.sum(positiveErrors))
print(' min =', numpy.min(positiveErrors))
print(' max =', numpy.max(positiveErrors))
print(' mean =', numpy.mean(positiveErrors))
print()
print('Negative error statistics:')
print(' sum =', numpy.sum(negativeErrors))
print(' min =', numpy.min(negativeErrors))
print(' max =', numpy.max(negativeErrors))
print(' mean =', numpy.mean(negativeErrors))
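The term usually used for these distances is "residuals", conventionally defined as observed value minus fitted value, so a positive residual means the point lies above the regression line. That is the opposite sign from `fitErrors` in the code above. As a minimal sketch of the same calculation under that convention, using boolean masks instead of a loop: the function name `residual_stats`, its `order` parameter, and the small demo arrays are my own illustrative assumptions, not part of the question.

```python
import numpy as np

def residual_stats(x, y, order=1):
    """Summarize residuals (observed minus fitted) above and below a polynomial fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coefficients = np.polyfit(x, y, order)
    residuals = y - np.polyval(coefficients, x)  # > 0: point lies above the line

    def summarize(values):
        # A side with no points has no min, max, or mean to report.
        if values.size == 0:
            return None
        return {
            "count": int(values.size),
            "sum": float(values.sum()),
            "min": float(values.min()),
            "max": float(values.max()),
            "mean": float(values.mean()),
        }

    # Residuals of exactly zero are excluded from both groups.
    return {
        "above": summarize(residuals[residuals > 0.0]),
        "below": summarize(residuals[residuals < 0.0]),
    }

# Small made-up sample for illustration, not the data from the question
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [0.0, 2.0, 1.0, 3.0]
stats = residual_stats(x_demo, y_demo)
print("above:", stats["above"])
print("below:", stats["below"])
```

Calling `residual_stats(xData, yData)` with the arrays from the code above should give the same magnitudes with the positive and negative groups swapped. For a least-squares fit that includes an intercept, the residuals sum to (approximately) zero, so the two sums will be close to equal and opposite; the min, max, mean, and count are the more informative numbers.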
Source: the Stack Overflow question "Sum, min, max, and mean of R values above the regression line in Python": https://stackoverflow.com/questions/57025209/