Problem description
I'm trying to process an image with the ABBYY OCR SDK, using the sample code posted in this question, but I'm not able to get the coordinates right for a specific word, say "OCR" in the screenshot below.
I want to draw an overlay (a yellow rectangle over the word "OCR"), and sometimes the rectangle is placed very far away from the actual word.
The XML you get is synthesised according to this schema.
For each recognized character it will contain an instance of the charParams element, as shown in the answer you linked to. The element contains the coordinates in page pixels. The same XML also contains a page element:
<page width="..." height="..." resolution="..." originalCoords="...">
where the image width and height are stored. So the l and r attributes of each charParams element are in the range 0..width-1 of the corresponding page, and the t and b attributes of each charParams element are in the range 0..height-1 of the corresponding page.
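As a rough illustration of the layout described above, the following Python sketch collects the charParams boxes of one page and merges the boxes of the characters that spell a given word. The sample XML is hypothetical and heavily trimmed, the helper names local_name and find_word_boxes are made up for this example, and it assumes each charParams element holds exactly one character as its text. It also does a plain substring match and ignores line breaks, so treat it as a starting point rather than a finished lookup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample; a real result file has more nesting and attributes.
SAMPLE_XML = """<document>
  <page width="200" height="100" resolution="300">
    <charParams l="10" t="20" r="18" b="40">A</charParams>
    <charParams l="19" t="20" r="24" b="40"> </charParams>
    <charParams l="25" t="18" r="40" b="40">O</charParams>
    <charParams l="42" t="18" r="55" b="40">C</charParams>
    <charParams l="57" t="19" r="70" b="41">R</charParams>
  </page>
</document>"""


def local_name(tag):
    """Drop a '{namespace}' prefix so lookups work with or without one."""
    return tag.rsplit("}", 1)[-1]


def find_word_boxes(xml_text, word):
    """Return (page_width, page_height, (l, t, r, b)) for each match of word."""
    results = []
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        # One (char, l, t, r, b) tuple per recognized character, in document order.
        chars = [
            ((el.text or " ")[0], *(int(el.get(k)) for k in ("l", "t", "r", "b")))
            for el in page.iter()
            if local_name(el.tag) == "charParams"
        ]
        text = "".join(c[0] for c in chars)
        start = text.find(word)
        while start != -1:
            span = chars[start:start + len(word)]
            box = (
                min(c[1] for c in span),  # leftmost l
                min(c[2] for c in span),  # topmost t
                max(c[3] for c in span),  # rightmost r
                max(c[4] for c in span),  # bottommost b
            )
            results.append((int(page.get("width")), int(page.get("height")), box))
            start = text.find(word, start + 1)
    return results


print(find_word_boxes(SAMPLE_XML, "OCR"))
```

The returned box is still in page pixels, so it only lines up with the picture when the picture is shown at its full width and height.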
It is also worth stating explicitly that all coordinates are in pixels of the page image, so they are completely resolution-agnostic. This is why you have to take zoom into account whenever you highlight anything on an image: your device software will often not display the image at its original size but downscaled, so you have to map the page coordinates onto the coordinates of the zoomed-out image and highlight there.
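A minimal sketch of that mapping in Python, assuming the image is stretched to a known displayed size and optionally shifted by an offset inside its container. The function name page_rect_to_view and its parameters are hypothetical and not part of the ABBYY SDK.

```python
def page_rect_to_view(box, page_size, view_size, offset=(0.0, 0.0)):
    """Map an (l, t, r, b) box in page pixels onto the displayed image.

    page_size is the (width, height) from the page element,
    view_size is the (width, height) at which the image is actually drawn,
    offset is where the top-left corner of the drawn image sits in the view.
    """
    l, t, r, b = box
    page_w, page_h = page_size
    view_w, view_h = view_size
    sx = view_w / page_w  # horizontal zoom factor
    sy = view_h / page_h  # vertical zoom factor
    ox, oy = offset
    return (ox + l * sx, oy + t * sy, ox + r * sx, oy + b * sy)


# A 200x100 page shown at half size: every coordinate is halved.
print(page_rect_to_view((25, 18, 70, 41), (200, 100), (100, 50)))
```

If the overlay lands far from the word, comparing the page width and height from the XML with the size at which the image is actually drawn is a reasonable first check.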