

I'm writing an OCR application to read characters from a screenshot image. Currently, I'm focusing only on digits. I'm partially basing my approach on this blog post: http://blog.damiles.com/2008/11/basic-ocr-in-opencv/.


I can successfully extract each individual character using some clever thresholding. Where things get a bit tricky is matching the characters. Even with fixed font face and size, there are some variables such as background color and kerning that cause the same digit to appear in slightly different shapes. For example, the below image is segmented into 3 parts:

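  1. Top: the target digit that I successfully extracted from a screenshot
  2. Middle: the template, one of the digits from my training set
  3. Bottom: the error (absolute difference) between the top and middle images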


The parts have all been scaled (the distance between the two green horizontal lines represents one pixel).


You can see that despite both the top and middle images clearly representing a 2, the error between them is quite high. This causes false positives when matching other digits -- for example, it's not hard to see how a well-placed 7 can match the target digit in the image above better than the middle image can.


Currently, I'm handling this by having a heap of training images for each digit, and matching the target digit against those images, one-by-one. I tried taking the average image of the training set, but that doesn't resolve the problem (false positives on other digits).
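To make the current approach concrete, here is a minimal sketch of it. It uses NumPy arrays rather than the `cv` wrappers so that it is self-contained, and it assumes every image is grayscale and already scaled to the same size. The names `abs_diff_error` and `classify`, and the dict layout of `training_set`, are hypothetical and only for illustration.

```python
import numpy as np


def abs_diff_error(target, template):
    """Sum of absolute per-pixel differences between two same-sized images."""
    # Cast to a signed type first so uint8 subtraction cannot wrap around.
    a = np.asarray(target, dtype=np.int32)
    b = np.asarray(template, dtype=np.int32)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return int(np.abs(a - b).sum())


def classify(target, training_set):
    """Return (label, error) of the closest training image.

    training_set is assumed to map a digit label to a list of template
    images, e.g. {"2": [img_a, img_b], "7": [img_c]}.
    """
    best_label, best_error = None, None
    for label, templates in training_set.items():
        for template in templates:
            err = abs_diff_error(target, template)
            if best_error is None or err < best_error:
                best_label, best_error = label, err
    return best_label, best_error
```

Every training image is compared one by one, so the cost grows linearly with the size of the training set.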

I'm a bit reluctant to perform matching using a shifted template (it'd be essentially the same as what I'm doing now). Is there a better way to compare the two images than simple absolute difference? I was thinking of maybe something like the EMD (earth mover's distance, http://en.wikipedia.org/wiki/Earth_mover's_distance) in 2D: basically, I need a comparison method that isn't as sensitive to global shifting and small local changes (pixels next to a white pixel becoming white, or pixels next to a black pixel becoming black), but is sensitive to global changes (black pixels that are nowhere near white pixels becoming white, and vice versa).
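To illustrate the property I'm after, here is a rough sketch of one possible measure. It is not EMD; it is a simple neighbourhood-tolerant mismatch count, written with NumPy and assuming binarized images of equal size. A pixel counts as an error only if the other image has no pixel of the same colour within `radius` pixels of it. The names `dilate` and `tolerant_error` and the `radius` parameter are made up for this example.

```python
import numpy as np


def dilate(mask, radius=1):
    """Binary dilation with a (2*radius+1)-wide square neighbourhood."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    # Pad with False so pixels outside the image never count as a match.
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def tolerant_error(target, template, radius=1):
    """Count pixels that have no same-coloured pixel nearby in the other image."""
    a = np.asarray(target, dtype=bool)
    b = np.asarray(template, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # White pixels with no white pixel within `radius` in the other image.
    white_miss = (a & ~dilate(b, radius)).sum() + (b & ~dilate(a, radius)).sum()
    # Black pixels with no black pixel within `radius` in the other image.
    black_miss = (~a & ~dilate(~b, radius)).sum() + (~b & ~dilate(~a, radius)).sum()
    return int(white_miss + black_miss)
```

With `radius=1`, a stroke shifted by one pixel gives an error of zero, while a stroke moved two or more pixels away is still counted in full. A larger `radius` tolerates bigger shifts, at the cost of making different digits harder to tell apart.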


Can anybody suggest a more effective matching method than absolute difference?

I'm doing all this in OpenCV using the C-style Python wrappers (import cv).



I would look into using Haar cascades. I've used them for face detection/head tracking, and it seems like you could build up a pretty good set of cascades with enough '2's, '3's, '4's, and so on.




08-31 09:23