
Problem description

I am trying to convert a trained model from a checkpoint file to tflite, using tf.lite.TFLiteConverter. The float conversion went fine, with reasonable inference speed, but inference with the INT8 conversion is very slow. To debug, I tried converting a very small network, and found that the INT8 model is generally slower than the float model.
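To put numbers on the float versus INT8 comparison, a small timing helper along these lines could be used. This is only a sketch: `time_invoke` and its parameters are hypothetical names, not part of TFLite, and the helper simply times any zero-argument callable.

```python
import statistics
import time


def time_invoke(invoke, runs=50, warmup=5):
    """Return the median wall-clock time, in milliseconds, of calling `invoke`.

    `invoke` is any zero-argument callable, for example the bound `invoke`
    method of a `tf.lite.Interpreter` whose input tensor is already set.
    """
    for _ in range(warmup):  # warm-up calls are executed but not timed
        invoke()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs without TensorFlow installed.
print(f"median: {time_invoke(lambda: sum(range(10_000))):.3f} ms")
```

With TensorFlow installed, one would build a `tf.lite.Interpreter` for each model, call `allocate_tensors()`, set the input with `set_tensor`, and pass `interpreter.invoke` to the helper, once for the float model and once for the INT8 model on the same machine.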

In the INT8 tflite file, I found some tensors called ReadVariableOp, which don't exist in TensorFlow's official MobileNet tflite model.
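One way to list such tensors is to filter the tensor details by name. The sketch below assumes a list of dicts with "index" and "name" keys, which is the shape returned by `tf.lite.Interpreter.get_tensor_details()`; `find_tensors` is a hypothetical helper and the sample tensor names are invented for illustration.

```python
def find_tensors(tensor_details, needle="ReadVariableOp"):
    """Return (index, name) pairs for tensors whose name contains `needle`.

    `tensor_details` is a list of dicts with at least "index" and "name"
    keys, as returned by `tf.lite.Interpreter.get_tensor_details()`.
    """
    return [(d["index"], d["name"]) for d in tensor_details if needle in d["name"]]


# Hand-written sample data standing in for real interpreter output.
sample = [
    {"index": 0, "name": "input"},
    {"index": 1, "name": "conv1/Conv2D/ReadVariableOp"},
    {"index": 2, "name": "output"},
]
print(find_tensors(sample))  # prints [(1, 'conv1/Conv2D/ReadVariableOp')]
```

Running the same filter over the float model and the INT8 model shows whether the tensors with that name appear in only one of them.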

I wonder what makes the INT8 inference so slow.

Recommended answer

You are probably running inference on an x86 CPU rather than on a CPU with ARM instructions. TFLite's quantized kernels have been optimized mainly for ARM, so an INT8 model can end up slower than the float model on x86. For details, see https://github.com/tensorflow/tensorflow/issues/21698#issuecomment-414764709
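A quick way to check which architecture the benchmark is running on is Python's standard `platform` module. This is a minimal sketch: `is_arm` and `ARM_PREFIXES` are hypothetical names, and the prefix list is an assumption covering the machine strings commonly reported by ARM systems.

```python
import platform

# Assumed prefixes: "armv7l", "arm64" and "aarch64" are typical ARM values.
ARM_PREFIXES = ("arm", "aarch64")


def is_arm(machine=None):
    """Return True if the machine string looks like an ARM architecture."""
    machine = (machine or platform.machine()).lower()
    return machine.startswith(ARM_PREFIXES)


print(platform.machine(), "-> ARM" if is_arm() else "-> not ARM")
```

If this reports a non-ARM machine such as x86_64, the slow INT8 timings are better re-measured on the ARM device the model is meant for before drawing conclusions.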

