Problem description

I'm writing an OCR application to read characters from a screenshot image. Currently, I'm focusing only on digits. I'm partially basing my approach on this blog post: http://blog.damiles.com/2008/11/basic-ocr-in-opencv/.

I can successfully extract each individual character using some clever thresholding. Where things get a bit tricky is matching the characters. Even with fixed font face and size, there are some variables such as background color and kerning that cause the same digit to appear in slightly different shapes. For example, the below image is segmented into 3 parts:

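  1. Top: the target digit that I successfully extracted from a screenshot
  2. Middle: the template, a digit from my training set
  3. Bottom: the error (absolute difference) between the top and middle images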

The parts have all been scaled (the distance between the two green horizontal lines represents one pixel).

You can see that despite both the top and middle images clearly representing a 2, the error between them is quite high. This causes false positives when matching other digits -- for example, it's not hard to see how a well-placed 7 can match the target digit in the image above better than the middle image can.
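
To make the failure mode concrete, here is a minimal sketch of the absolute-difference error. It uses plain Python lists rather than the `cv` wrappers so that it runs without OpenCV, and the tiny 5x5 glyphs (`stroke`, `shifted`, `bar`) are hypothetical stand-ins, not the digits from the screenshot.

```python
def abs_diff_error(a, b):
    """Sum of absolute pixel differences between two equally sized images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


# Hypothetical binary glyph: a vertical stroke in column 1.
stroke = [[0, 1, 0, 0, 0] for _ in range(5)]
# The same stroke shifted one pixel to the right.
shifted = [[0, 0, 1, 0, 0] for _ in range(5)]
# A clearly different shape: a horizontal bar across the top row.
bar = [[1, 1, 1, 1, 1]] + [[0] * 5 for _ in range(4)]

print(abs_diff_error(stroke, stroke))   # 0
print(abs_diff_error(stroke, shifted))  # 10
print(abs_diff_error(stroke, bar))      # 8
```

In this toy case the one-pixel shift of the same shape scores worse (10) than a different shape (8), which is the kind of false positive described above.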

Currently, I'm handling this by having a heap of training images for each digit, and matching the target digit against those images, one-by-one. I tried taking the average image of the training set, but that doesn't resolve the problem (false positives on other digits).
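
The one-by-one matching described here amounts to a nearest-template search. The sketch below is one possible reading of it, not the actual application code: the `classify` helper, the `(label, image)` pair layout, and the 3x3 glyphs are all assumptions made for illustration.

```python
def abs_diff_error(a, b):
    """Sum of absolute pixel differences between two equally sized images."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def classify(target, templates):
    """Return (label, error) of the training image closest to the target.

    templates is a list of (label, image) pairs; a label may appear
    several times, once per training image for that digit.
    """
    best_label, best_error = None, None
    for label, image in templates:
        error = abs_diff_error(target, image)
        if best_error is None or error < best_error:
            best_label, best_error = label, error
    return best_label, best_error


# Hypothetical 3x3 training images: two variants of "1" and one "7".
templates = [
    ("1", [[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("1", [[1, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("7", [[1, 1, 1], [0, 0, 1], [0, 0, 1]]),
]
target = [[1, 1, 0], [0, 1, 0], [0, 1, 1]]

print(classify(target, templates))  # ('1', 1)
```

Keeping several images per digit helps only as long as some training image happens to line up with the target; it does not change the metric itself.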

I'm a bit reluctant to perform matching using a shifted template (it'd be essentially the same as what I'm doing now). Is there a better way to compare the two images than simple absolute difference? I was thinking of maybe something like the EMD (earth mover's distance, http://en.wikipedia.org/wiki/Earth_mover's_distance) in 2D: basically, I need a comparison method that isn't as sensitive to global shifting and small local changes (pixels next to a white pixel becoming white, or pixels next to a black pixel becoming black), but is sensitive to global changes (pixels that are nowhere near a white pixel becoming white, and vice versa).
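
One metric with roughly these properties is a symmetric chamfer distance, where each foreground pixel is charged the distance to the nearest foreground pixel in the other image. This is a swapped-in technique offered as a sketch, not EMD and not something proposed in the thread. The brute-force search below is only practical for small glyphs, and the helper names and 5x5 test shapes are hypothetical.

```python
import math


def foreground(image):
    """Coordinates (row, col) of all non-zero pixels."""
    return [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]


def directed_chamfer(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    if not src or not dst:
        raise ValueError("both images need at least one foreground pixel")
    total = sum(min(math.hypot(r - r2, c - c2) for r2, c2 in dst) for r, c in src)
    return total / len(src)


def chamfer_distance(a, b):
    """Symmetric chamfer distance between two binary images."""
    pa, pb = foreground(a), foreground(b)
    return 0.5 * (directed_chamfer(pa, pb) + directed_chamfer(pb, pa))


# Hypothetical glyphs: a vertical stroke, the same stroke shifted by one
# pixel, and a horizontal bar across the top row.
stroke = [[0, 1, 0, 0, 0] for _ in range(5)]
shifted = [[0, 0, 1, 0, 0] for _ in range(5)]
bar = [[1, 1, 1, 1, 1]] + [[0] * 5 for _ in range(4)]

print(f"{chamfer_distance(stroke, shifted):.2f}")  # 1.00
print(f"{chamfer_distance(stroke, bar):.2f}")      # 1.70
```

Here the shifted copy of the same shape scores lower than the different shape, whereas plain absolute difference would rank these two the other way round. A pixel appearing next to an existing stroke costs about one unit, while a pixel far from any stroke costs its full distance.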

Can anybody suggest a more effective matching method than absolute difference?

I'm doing all this in OpenCV using the C-style Python wrappers (import cv).

Recommended answer

I would look into using Haar cascades. I've used them for face detection/head tracking, and it seems like you could build up a pretty good set of cascades with enough '2's, '3's, '4's, and so on.

http://alereimondo.no-ip.org/OpenCV/34

http://en.wikipedia.org/wiki/Haar-like_features
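
For readers unfamiliar with the building block behind a Haar cascade, the sketch below computes a single two-rectangle Haar-like feature from an integral image (summed-area table). It is only an illustration of the feature described in the second link. It does not train or run a cascade, which in practice would be done with OpenCV's own cascade tooling, and the function names and the 4x4 test images are assumptions.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of image[0:r][0:c].
    """
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += image[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def vertical_edge_feature(ii, top, left, height, width):
    """Two-rectangle feature: left half minus right half (width must be even)."""
    if width % 2:
        raise ValueError("width must be even")
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


# Hypothetical 4x4 images: a bright left half, and a uniform patch.
edge = [[1, 1, 0, 0] for _ in range(4)]
flat = [[1, 1, 1, 1] for _ in range(4)]

print(vertical_edge_feature(integral_image(edge), 0, 0, 4, 4))  # 8
print(vertical_edge_feature(integral_image(flat), 0, 0, 4, 4))  # 0
```

A cascade combines many such features at different positions and scales, so each rectangle sum needs to be cheap; the integral image is what makes it a constant-time lookup.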
