This article covers TF Object Detection: returning a subset of the inference payload. The question and the recommended answer below may be a useful reference for anyone facing the same problem.

Problem Description

I'm working on training and deploying an instance segmentation model using TF's object detection API. I'm able to successfully train the model, package it into a TF Serving Docker image (latest tag as of Oct 2020), and process inference requests via the REST interface. However, the amount of data returned from an inference request is very large (hundreds of Mb). This is a big problem when the inference request and processing don't happen on the same machine because all that returned data has to go over the network.

Is there a way to reduce the number of outputs (either during model export or within the TF Serving image) so that round-trip times during inference are shorter?

I'm using TF OD API (with TF2) to train a Mask RCNN model, which is a modified version of this config. I believe the full list of outputs is described in code here. The list of items I get during inference is also pasted below. For a model with 100 object proposals, that information is ~270 Mb if I just write the returned inference as json to disk.

inference_payload['outputs'].keys()

dict_keys(['detection_masks', 'rpn_features_to_crop', 'detection_anchor_indices', 'refined_box_encodings', 'final_anchors', 'mask_predictions', 'detection_classes', 'num_detections', 'rpn_box_predictor_features', 'class_predictions_with_background', 'proposal_boxes', 'raw_detection_boxes', 'rpn_box_encodings', 'box_classifier_features', 'raw_detection_scores', 'proposal_boxes_normalized', 'detection_multiclass_scores', 'anchors', 'num_proposals', 'detection_boxes', 'image_shape', 'rpn_objectness_predictions_with_background', 'detection_scores'])

I already encode the images within my inference requests as base64, so the request payload is not too large when going over the network. It's just that the inference response is gigantic in comparison. I only need 4 or 5 of the items out of this response, so it'd be great to exclude the rest and avoid passing such a large package of bits over the network.
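
For reference, below is a minimal sketch of the request side, assuming TF Serving is listening on its default REST port 8501, the model is deployed under the hypothetical name mask_rcnn, the export accepts a base64/string image input, and the requests package is available; the URL, file name, and payload format are placeholders to adjust to your deployment.

    import base64
    import json

    import requests

    # Hypothetical endpoint and input image; adjust to your deployment.
    URL = 'http://localhost:8501/v1/models/mask_rcnn:predict'

    with open('test_image.jpg', 'rb') as f:
        encoded = base64.b64encode(f.read()).decode('utf-8')

    # Columnar REST format: a single string input carrying the base64 image.
    request_payload = {'inputs': [{'b64': encoded}]}
    response = requests.post(URL, data=json.dumps(request_payload))
    inference_payload = response.json()

    # Every output tensor of the model is returned, which is why the response
    # is hundreds of MB even though the request itself is small.
    print(inference_payload['outputs'].keys())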

  1. I've tried setting the score_threshold to a higher value during the export (code example here) to reduce the number of outputs; a hedged sketch of that kind of override appears after this list. However, this seems to just threshold the detection_scores. All the extraneous inference information is still returned.
  2. I also tried just manually excluding some of these inference outputs by adding the names of keys to remove here. That also didn't seem to have any effect, and I'm worried this is a bad idea because some of those keys might be needed during scoring/evaluation.
  3. I also searched here and on tensorflow/models repo, but I wasn't able to find anything.
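
For completeness, here is a rough sketch of how that score_threshold override can be expressed with the OD API's config utilities before re-exporting; the pipeline path, output directory, and threshold value are placeholders, it assumes a Faster/Mask R-CNN pipeline, and, as noted above, this only filters detection_scores rather than shrinking the payload.

    from object_detection.utils import config_util

    # Hypothetical paths and threshold value.
    configs = config_util.get_configs_from_pipeline_file('pipeline.config')
    nms = (configs['model'].faster_rcnn
           .second_stage_post_processing.batch_non_max_suppression)
    nms.score_threshold = 0.5  # raise from the config's default

    # Write out the modified config, then point exporter_main_v2.py at it.
    pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
    config_util.save_pipeline_config(pipeline_proto, 'modified_config_dir')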

Recommended Answer

I was able to find a hacky workaround. In the export process (here), some of the components of the prediction dict are deleted. I added additional items to the non_tensor_predictions list, which contains all keys that will get removed during the postprocess step. Augmenting this list cut down my inference outputs from ~200MB to ~12MB.

Full code of the if self._number_of_stages == 3 block:

    if self._number_of_stages == 3:

      non_tensor_predictions = [
          k for k, v in prediction_dict.items() if not isinstance(v, tf.Tensor)]

      # Add additional keys to delete during postprocessing
      non_tensor_predictions = non_tensor_predictions + [
          'raw_detection_scores', 'detection_multiclass_scores', 'anchors',
          'rpn_objectness_predictions_with_background', 'detection_anchor_indices',
          'refined_box_encodings', 'class_predictions_with_background',
          'raw_detection_boxes', 'final_anchors', 'rpn_box_encodings',
          'box_classifier_features']

      for k in non_tensor_predictions:
        tf.logging.info('Removing {0} from prediction_dict'.format(k))
        prediction_dict.pop(k)

      return prediction_dict

I think there's a more "proper" way to deal with this using signature definitions during the creation of the TF Serving image, but this worked for a quick and dirty fix.
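
As a rough illustration of that signature-based approach, the sketch below reloads the exported SavedModel and re-saves it with a serving_default signature that returns only a handful of outputs; the paths, input spec, and kept keys are assumptions for illustration and would need to match your export.

    import tensorflow as tf

    # Hypothetical paths and kept output keys; adjust to match your export.
    EXPORT_DIR = 'exported_model/saved_model'
    SLIM_EXPORT_DIR = 'exported_model_slim/saved_model'
    KEEP_KEYS = ['num_detections', 'detection_boxes', 'detection_classes',
                 'detection_scores', 'detection_masks']

    loaded = tf.saved_model.load(EXPORT_DIR)
    full_fn = loaded.signatures['serving_default']

    # Assumes the model was exported with input_type=image_tensor, i.e. a uint8
    # [1, H, W, 3] input named 'input_tensor'.
    @tf.function(input_signature=[
        tf.TensorSpec([1, None, None, 3], tf.uint8, name='input_tensor')])
    def slim_serving_fn(input_tensor):
      outputs = full_fn(input_tensor=input_tensor)
      return {k: outputs[k] for k in KEEP_KEYS}

    tf.saved_model.save(loaded, SLIM_EXPORT_DIR,
                        signatures={'serving_default': slim_serving_fn})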
