Reading text from images using Tesseract and OpenCV (Java)

Question

I'm trying to make a program that can read the information off a nutritional label, but Tesseract is struggling to read anything at all. I've tried a number of different image processing techniques using OpenCV, but not much seems to help.

Here are some of my better-looking attempts (which happen to be the simplest):

Tango bottle label, unedited

Tango bottle label, edited

Output:

Irn Bru bottle label, unedited

Irn Bru bottle label, edited

Output:

This is just converting the images to greyscale, applying a 3x3 Gaussian blur, and then Otsu binarisation.
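For readers unfamiliar with the last step: Otsu's method picks the threshold that maximises the between-class variance of the grey-level histogram. The sketch below is a plain-Java illustration of that idea, not the OpenCV implementation; the class and method names are hypothetical, and the input is assumed to be a flat array of 8-bit grey values. In OpenCV the equivalent is a call to Imgproc.threshold with the flags Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU.

```java
// Illustrative only: OtsuSketch and its methods are hypothetical names.
class OtsuSketch {

    /** Returns the Otsu threshold t in 0..255; values greater than t count as foreground. */
    static int otsuThreshold(int[] grey) {
        int[] hist = new int[256];
        for (int v : grey) {
            hist[v]++;
        }
        long total = grey.length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }

        double sumBackground = 0;
        long weightBackground = 0;
        double bestVariance = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightBackground += hist[t];
            if (weightBackground == 0) {
                continue; // no pixels at or below t yet
            }
            long weightForeground = total - weightBackground;
            if (weightForeground == 0) {
                break; // every pixel is background from here on
            }
            sumBackground += (double) t * hist[t];
            double meanB = sumBackground / weightBackground;
            double meanF = (sumAll - sumBackground) / weightForeground;
            // Between-class variance, up to a constant factor
            double between = (double) weightBackground * weightForeground
                    * (meanB - meanF) * (meanB - meanF);
            if (between > bestVariance) {
                bestVariance = between;
                best = t;
            }
        }
        return best;
    }

    /** Maps each pixel to 255 if it is above the threshold, otherwise 0. */
    static int[] binarize(int[] grey, int threshold) {
        int[] out = new int[grey.length];
        for (int i = 0; i < grey.length; i++) {
            out[i] = grey[i] > threshold ? 255 : 0;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] grey = {10, 10, 10, 200, 200, 200};
        int t = otsuThreshold(grey);
        System.out.println("threshold = " + t);
        System.out.println(java.util.Arrays.toString(binarize(grey, t)));
    }
}
```

Because the threshold comes from the global histogram, a label with uneven lighting or a coloured background may still binarise poorly.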

I would appreciate any help on how to make the text more readable using OpenCV or any other image processing library.

Would it be simpler to forgo Tesseract and use machine learning for this?

Recommended answer

First of all, read this StackOverflow answer regarding OCR preprocessing.

The most important steps described there are image binarization and image denoising.
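As a rough illustration of what denoising can mean here, the following plain-Java sketch applies a 3x3 median filter, one simple technique for removing isolated speckles before binarization. This is an assumption about a suitable filter, not something the linked answer prescribes; the class and method names are hypothetical, and the image is assumed to be a 2D array of 8-bit grey values. In OpenCV the comparable call is Imgproc.medianBlur(src, dst, 3).

```java
import java.util.Arrays;

// Illustrative only: MedianDenoiseSketch is a hypothetical name.
class MedianDenoiseSketch {

    /** Replaces every pixel with the median of its 3x3 neighbourhood (borders are replicated). */
    static int[][] median3x3(int[][] img) {
        int h = img.length;
        int w = img[0].length;
        int[][] out = new int[h][w];
        int[] window = new int[9];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int n = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        // Clamp coordinates so edge pixels reuse their nearest neighbour
                        int yy = Math.min(h - 1, Math.max(0, y + dy));
                        int xx = Math.min(w - 1, Math.max(0, x + dx));
                        window[n++] = img[yy][xx];
                    }
                }
                Arrays.sort(window);
                out[y][x] = window[4]; // middle of 9 sorted values
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A white patch with a single dark speckle in the centre
        int[][] img = {
            {255, 255, 255},
            {255, 0, 255},
            {255, 255, 255}
        };
        System.out.println(Arrays.deepToString(median3x3(img)));
    }
}
```

A median filter suits isolated speckles; it will not help with blur or low contrast, which is what the sharpening step in the example addresses.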

Here is an example:

Original image

Greyscale

Unsharp mask

Binarization

The image is now ready for OCR.

Java code

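// Assumes `original` is an already loaded Mat and `api` is an initialized
// TessBaseAPI from the JavaCPP Tesseract presets (implied by lept.PIX / pixRead).
Mat grey = new Mat(), blur = new Mat(), unsharp = new Mat(), binary = new Mat();

// 1. Greyscale (use COLOR_BGR2GRAY if the Mat was loaded with Imgcodecs.imread)
Imgproc.cvtColor(original, grey, Imgproc.COLOR_RGB2GRAY, 0);

// 2. Unsharp mask: blur, then compute 1.5 * grey - 0.5 * blur
Imgproc.GaussianBlur(grey, blur, new Size(0, 0), 3);
Core.addWeighted(grey, 1.5, blur, -0.5, 0, unsharp);

// 3. Binarize with a fixed threshold of 127
Imgproc.threshold(unsharp, binary, 127, 255, Imgproc.THRESH_BINARY);

// 4. Save as PNG. imwrite expects a String path and (flag, value) pairs;
//    the compression level 3 is an arbitrary choice in the range 0-9.
MatOfInt params = new MatOfInt(Imgcodecs.IMWRITE_PNG_COMPRESSION, 3);
File ocrImage = new File("ocrImage.png");
Imgcodecs.imwrite(ocrImage.getAbsolutePath(), binary, params);

/*initialize OCR ...*/
lept.PIX image = pixRead(ocrImage.getAbsolutePath());
api.SetImage(image);
// In the JavaCPP presets GetUTF8Text() returns a BytePointer, so convert it
String ocrOutput = api.GetUTF8Text().getString();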
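To make the sharpening step concrete: an unsharp mask computes a weighted sum of the greyscale image and its blurred copy, here 1.5 * grey - 0.5 * blur, clamped to the 0..255 range. The plain-Java sketch below shows that per-pixel arithmetic on flat arrays. It only approximates what Core.addWeighted does on 8-bit images (rounding of exact .5 values may differ), and the class and method names are hypothetical.

```java
// Illustrative only: UnsharpSketch is a hypothetical name.
class UnsharpSketch {

    /** Per pixel: clamp(wGrey * grey + wBlur * blur) to 0..255. */
    static int[] weightedSum(int[] grey, int[] blur, double wGrey, double wBlur) {
        if (grey.length != blur.length) {
            throw new IllegalArgumentException("images must have the same size");
        }
        int[] out = new int[grey.length];
        for (int i = 0; i < grey.length; i++) {
            long v = Math.round(wGrey * grey[i] + wBlur * blur[i]);
            out[i] = (int) Math.max(0, Math.min(255, v)); // saturate like an 8-bit image
        }
        return out;
    }

    public static void main(String[] args) {
        int[] grey = {100, 200, 20, 250};
        int[] blur = {120, 100, 100, 100};
        // Weights 1.5 and -0.5 match the addWeighted call in the answer's code
        int[] sharp = weightedSum(grey, blur, 1.5, -0.5);
        System.out.println(java.util.Arrays.toString(sharp));
    }
}
```

Pixels that differ from their blurred surroundings are pushed further away from them, which increases edge contrast before thresholding.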

