Problem Description
I am using vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION
to extract some dense text in a pdf document. Here is my code:
from google.cloud import vision

vision_client = vision.ImageAnnotatorClient()

def extract_text(bucket, filename, mimetype):
    """OCR with PDF/TIFF as source files on GCS."""
    print('Looking for text in PDF {}'.format(filename))
    # Detect text
    feature = vision.types.Feature(
        type=vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
    # Extract text from the source bucket
    gcs_source_uri = 'gs://{}/{}'.format(bucket, filename)
    gcs_source = vision.types.GcsSource(uri=gcs_source_uri)
    input_config = vision.types.InputConfig(
        gcs_source=gcs_source, mime_type=mimetype)
    request = vision.types.AnnotateFileRequest(
        features=[feature], input_config=input_config)
    print('Waiting for the OCR operation to finish.')
    ocr_response = vision_client.batch_annotate_files(requests=[request])
    print('OCR completed.')
In the response, I am expecting ocr_response.responses[1...n].pages[1...n].blocks[1...n].bounding_box to contain a filled-in list of vertices, but this list is empty. Instead, there is a normalized_vertices list, whose coordinates are normalized to values between 0 and 1. Why is that? Why is the vertices structure empty? I am following this article, and the author there uses vertices, but I don't understand why I don't get them. To convert them to the non-normalized form, I multiply each normalized vertex by the height and width, but the result is awful: the boxes are badly positioned.
Recommended Answer
To convert a NormalizedVertex to a Vertex, multiply the x field of the NormalizedVertex by the page width to get the x field of the Vertex, and multiply its y field by the page height to get the y field of the Vertex.
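A minimal sketch of that conversion (the NormalizedVertex namedtuple here is a stand-in for the protobuf type; in a real response the width and height would come from the page object of the OCR response):

```python
from collections import namedtuple

# Stand-in for the protobuf NormalizedVertex type, for illustration only.
NormalizedVertex = namedtuple('NormalizedVertex', ['x', 'y'])

def denormalize(normalized_vertices, page_width, page_height):
    """Scale NormalizedVertex coords in [0, 1] back to absolute coordinates."""
    return [(round(v.x * page_width), round(v.y * page_height))
            for v in normalized_vertices]

# Example: a box on a 612 x 792 point PDF page (US Letter).
box = [NormalizedVertex(0.1, 0.1), NormalizedVertex(0.5, 0.1),
       NormalizedVertex(0.5, 0.2), NormalizedVertex(0.1, 0.2)]
print(denormalize(box, 612, 792))  # → [(61, 79), (306, 79), (306, 158), (61, 158)]
```

If the converted boxes come out misplaced, one thing to check is that the width and height you multiply by are the ones reported for each page in the OCR response itself, not the dimensions of a separately rendered image of the page.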
The reason you get NormalizedVertex while the author of the Medium article gets Vertex is that the TEXT_DETECTION and DOCUMENT_TEXT_DETECTION models have been upgraded to newer versions since May 15, 2020, and the Medium article was written on Dec 25, 2018.
To get results from the legacy model, you must specify "builtin/legacy_20190601" in the model field of the Feature object.
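A minimal sketch of such a Feature, written in the dict form that the Python client also accepts in place of vision.types.Feature (field names mirror the Vision API's Feature message):

```python
# Feature requesting DOCUMENT_TEXT_DETECTION from the legacy model.
# The dict form mirrors the JSON Feature message and can be passed to
# batch_annotate_files where vision.types.Feature was used above.
feature = {
    'type': 'DOCUMENT_TEXT_DETECTION',
    'model': 'builtin/legacy_20190601',  # legacy model id from the answer above
}
print(feature['model'])  # → builtin/legacy_20190601
```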
However, Google's documentation mentions that after November 15, 2020 the old models will no longer be offered.