Problem Description
I quantized a Keras h5 model (TF 1.13; keras_vggface model) with TensorFlow 1.15.3 in order to use it with an NPU. The code I used for the conversion is:
import tensorflow as tf  # TF 1.15.3

converter = tf.lite.TFLiteConverter.from_keras_model_file(saved_model_dir + modelname)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen  # generator yielding sample inputs, shown further below
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()
The quantized model I get looks good at first sight: the input type of the layers is int8, the filters are int8, the bias is int32, and the output is int8.
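(For reference, one way to verify those tensor types and quantization parameters is to load the converted flatbuffer with the TFLite Interpreter; a minimal sketch, assuming the model bytes from the conversion above are still held in tflite_quant_model:)

import tensorflow as tf

# Load the converted flatbuffer directly from memory and inspect its I/O tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_quant_model)
interpreter.allocate_tensors()

for detail in interpreter.get_input_details():
    # dtype should be numpy.int8 once the input is fully quantized;
    # 'quantization' holds the (scale, zero_point) pair of the tensor.
    print('input :', detail['dtype'], detail['quantization'])

for detail in interpreter.get_output_details():
    print('output:', detail['dtype'], detail['quantization'])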
However, the model has a quantize layer after the input layer, and the input layer itself is float32 [see image below]. It seems that the NPU also needs the input to be int8.
Is there a way to fully quantize the model without that conversion layer, so that the input is int8 as well?
As you can see above, I used:
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
Edit
Solution from user dtlam
Even though the model still does not run with the Google NNAPI, the following quantizes the model with int8 input and output, using either TF 1.15.3 or TF 2.2.0 (thanks to dtlam):
...
import cv2
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model_file(saved_model_dir + modelname)

def representative_dataset_gen():
    for _ in range(10):
        pfad = 'pathtoimage/000001.jpg'
        img = cv2.imread(pfad)
        img = np.expand_dims(img, 0).astype(np.float32)
        # Get sample input data as a numpy array in a method of your choosing.
        yield [img]

converter.representative_dataset = representative_dataset_gen
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.experimental_new_converter = True
converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quantized_tflite_model = converter.convert()

if tf.__version__.startswith('1.'):
    with open("test153.tflite", "wb") as f:
        f.write(quantized_tflite_model)
if tf.__version__.startswith('2.'):
    with open("test220.tflite", "wb") as f:
        f.write(quantized_tflite_model)
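For completeness, here is a minimal sketch of how such a fully int8 model can be run with the TFLite Interpreter, quantizing the input and dequantizing the output by hand with the (scale, zero_point) pairs the interpreter reports; the image path is reused from the generator above, everything else is an assumption:

import cv2
import numpy as np
import tensorflow as tf

# Load the quantized model produced above (file name from the TF 1.x branch).
interpreter = tf.lite.Interpreter(model_path="test153.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize the test image to the model's expected input size.
height, width = input_details[0]['shape'][1:3]
img = cv2.resize(cv2.imread('pathtoimage/000001.jpg'), (width, height))
img = np.expand_dims(img, 0).astype(np.float32)

# Quantize the float image with the input tensor's scale and zero point.
scale, zero_point = input_details[0]['quantization']
img_int8 = np.clip(np.round(img / scale + zero_point), -128, 127).astype(np.int8)

interpreter.set_tensor(input_details[0]['index'], img_int8)
interpreter.invoke()

# Dequantize the int8 output back to float for interpretation.
out_int8 = interpreter.get_tensor(output_details[0]['index'])
out_scale, out_zero_point = output_details[0]['quantization']
out_float = (out_int8.astype(np.float32) - out_zero_point) * out_scale
print(out_float)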
Recommended Answer
If you applied post-training quantization, you have to make sure your representative dataset is not in float32. Furthermore, if you want to be certain the model is quantized with int8 or uint8 input/output, you should consider quantization-aware training. This also gives you better quantization results.
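As a rough illustration of the quantization-aware-training route suggested here, a minimal sketch using the tensorflow_model_optimization package with the TF 2.x Keras API; the load/compile/fit details are placeholders, not taken from the question:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder for the base network (standing in for the keras_vggface model).
base_model = tf.keras.models.load_model(saved_model_dir + modelname)

# Insert fake-quantization nodes so training learns int8-friendly weight ranges.
qat_model = tfmot.quantization.keras.quantize_model(base_model)
qat_model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
# Fine-tune for a few epochs on your own training data (placeholder names):
# qat_model.fit(train_images, train_labels, epochs=3)

# Convert as before; the QAT model already carries its quantization ranges.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quantized_qat_model = converter.convert()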
I also tried to quantize your model from the image and code you gave me, and it is quantized after all.