Problem description
I tried to estimate the prediction time of my keras model and realised something strange. While it is fairly fast most of the time, every once in a while the model takes quite long to come up with a prediction. And not only that, those times also increase the longer the model runs. I added a minimal working example to reproduce the error.
import time
import numpy as np
from sklearn.datasets import make_classification
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
# Make a dummy classification problem
X, y = make_classification()
# Make a dummy model
model = Sequential()
model.add(Dense(10, activation='relu',name='input',input_shape=(X.shape[1],)))
model.add(Dense(2, activation='softmax',name='predictions'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, verbose=0, batch_size=20, epochs=100)
for i in range(1000):
    # Pick a random sample
    sample = np.expand_dims(X[np.random.randint(99), :], axis=0)
    # Record the prediction time 10x and then take the average
    start = time.time()
    for j in range(10):
        y_pred = model.predict_classes(sample)
    end = time.time()
    print('%d, %0.7f' % (i, (end-start)/10))
The time does not depend on the sample (it is being picked randomly). If the test is repeated, the indices in the for loop where the prediction takes longer are going to be (nearly) the same again.
I'm using:
tensorflow 2.0.0
python 3.7.4
For my application I need to guarantee that execution completes within a certain time. That is, however, impossible given this behaviour. What is going wrong? Is it a bug in Keras or a bug in the TensorFlow backend?
EDIT: predict_on_batch shows the same behavior, although more sparsely. y_pred = model(sample, training=False).numpy() shows some heavy outliers as well, but they are not increasing.
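For reference, a rough sketch (not part of the original post) of how the two alternative calls mentioned in this edit can be dropped into the same timing loop as the example above; it reuses model, X, time and np from that example:

sample = np.expand_dims(X[np.random.randint(99), :], axis=0)

# predict_on_batch: same spikes as predict_classes, but sparser
start = time.time()
for j in range(10):
    y_pred = model.predict_on_batch(sample)
print('predict_on_batch: %0.7f' % ((time.time() - start) / 10))

# direct call: heavy outliers, but the times do not keep growing
start = time.time()
for j in range(10):
    y_pred = model(sample, training=False).numpy()
print('direct call:      %0.7f' % ((time.time() - start) / 10))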
EDIT 2: I downgraded to the latest tensorflow 1 version (1.15). Not only is the problem gone, the "normal" prediction time also improved significantly! I do not see the two spikes as problematic, as they didn't appear when I repeated the test (at least not at the same indices and linearly increasing), and they are, percentage-wise, not as large as in the first plot.
We can thus conclude that this seems to be a problem inherent to tensorflow 2.0, which shows similar behaviour in other situations as @OverLordGoldDragon mentions.
TF2 generally exhibits poor and bug-like memory management in several instances I've encountered - brief description here and here. With prediction in particular, the most performant feeding method is via model(x) directly - see here, and its linked discussions.
In a nutshell: model(x) acts via its __call__ method (which it inherits from base_layer.Layer), whereas predict(), predict_classes(), etc. involve a dedicated loop function via _select_training_loop(); each utilizes different data pre- and post-processing methods suited to different use cases, and model(x) in 2.1 was designed specifically to yield the fastest small-model / small-batch (and maybe any-size) performance (and it is still the fastest option in 2.0).
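As a rough illustration of that direct-call path (a sketch rather than code from the answer; it reuses model and X from the question, and the argmax step is only a stand-in for predict_classes):

sample = np.expand_dims(X[0], axis=0)        # a single-sample batch
probs = model(sample, training=False)        # goes through __call__, returns a tf.Tensor of shape (1, 2)
y_pred = np.argmax(probs.numpy(), axis=-1)   # class index per sample, equivalent to predict_classes here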
Quoting a TensorFlow dev from linked discussions:
Note: this should be less of an issue in 2.1, and especially 2.2 - but test each method anyway. I also realize this doesn't directly answer your question about the time spikes; I suspect it's related to Eager caching mechanisms, but the surest way to find out is via the TF Profiler, which is broken in 2.1.
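As a side note, a minimal sketch of driving the profiler programmatically, assuming a later release (TF 2.2+) where the tf.profiler.experimental API works; this is not from the original answer, and it reuses model and sample from above:

import tensorflow as tf

tf.profiler.experimental.start('logdir')     # writes a trace to ./logdir, viewable in TensorBoard
for j in range(100):
    _ = model(sample, training=False)        # the prediction calls to be profiled
tf.profiler.experimental.stop()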
Update: regarding the increasing spikes - possibly GPU throttling; you've done ~1000 iterations, so try 10,000 instead - eventually the increase should stop. As you noted in your comments, this doesn't occur with model(x); that makes sense, as one less GPU step is involved ("conversion to dataset").
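A hedged sketch of that longer run (again reusing model, X, time and np from the question, and predict_classes as in the original example), keeping per-iteration times so it is easy to see whether the slow iterations eventually level off:

times = []
for i in range(10000):                      # ~10x the original 1000 iterations
    sample = np.expand_dims(X[np.random.randint(99), :], axis=0)
    start = time.time()
    y_pred = model.predict_classes(sample)
    times.append(time.time() - start)
# If GPU throttling is the cause, the spikes should stop growing at some point.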
Update 2: you could bug the devs here about it if you face this issue; it's mostly me singing there.