Java OpenCV + Tesseract OCR "code" recognition

Question

I'm trying to automate a process in which someone manually converts a code into a digital one.

Then I started reading about OCR, so I installed Tesseract OCR and tried it on some images. It doesn't detect anything even close to the code.

After reading some questions on Stack Overflow, I figured that the images need some preprocessing, such as deskewing the image so that it is horizontal, which can be done with OpenCV, for example.

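Now my questions are:

Which preprocessing or other methods should be used for images like the ones above?

Secondly, can I rely on the output? Will it always work for images like the ones above?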

I hope someone can help me!

Recommended answer

I have decided to capture the whole card instead of only the code. Once the whole card is captured, it can be transformed to a plain perspective, and then I can easily extract the "code" region.

I also learned a lot along the way, especially regarding speed. The detection function below is slow on high-resolution images: at a size of 3264 x 1836 it can take up to 10 seconds.

What I did to speed things up was resize the input matrix by a factor of 1/4. That made it 4^2 times faster with only a minimal loss of precision. The next step is to scale the quadrangle we found back to the original size, so that we can transform the quadrangle to a plain perspective using the original source image.
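The scale-back step can be sketched in plain Java without OpenCV. In the snippet below, `QuadScaling`, `scaleQuad`, `pixelRatio`, and the corner coordinates are hypothetical names and values chosen for illustration. In the actual pipeline the downscaling would presumably be done with something like `Imgproc.resize`, and the corners would come from the detected contour.

```java
import java.util.Arrays;

class QuadScaling {
    // Factor by which the input was shrunk before detection (1/4 in the answer).
    static final double DOWNSCALE = 0.25;

    /** Maps corner points found in the downscaled image back to original-image coordinates. */
    static double[][] scaleQuad(double[][] quad, double downscale) {
        double[][] scaled = new double[quad.length][2];
        for (int i = 0; i < quad.length; i++) {
            scaled[i][0] = quad[i][0] / downscale; // x
            scaled[i][1] = quad[i][1] / downscale; // y
        }
        return scaled;
    }

    /** Ratio of pixel counts between the original and the downscaled image. */
    static double pixelRatio(double downscale) {
        return 1.0 / (downscale * downscale);
    }

    public static void main(String[] args) {
        // Hypothetical corners detected in an 816 x 459 image (3264 x 1836 shrunk by 1/4).
        double[][] small = {{100, 50}, {700, 60}, {690, 400}, {110, 390}};
        double[][] full = scaleQuad(small, DOWNSCALE);
        System.out.println(Arrays.deepToString(full));
        System.out.println(pixelRatio(DOWNSCALE));
    }
}
```

A quarter-size image has 16 times fewer pixels, which is where the 4^2 figure comes from. The actual speedup depends on how each processing step scales with image size.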

The code I wrote for detecting the largest area is heavily based on code I found on Stack Overflow. Unfortunately, those snippets didn't work as expected for me, so I combined several of them and modified them a lot. This is what I ended up with:

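    import java.util.ArrayList;
    import java.util.List;

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    // Cosine of the angle at p0 between the vectors p0->p1 and p0->p2.
    // The 1e-10 term guards against division by zero for degenerate points.
    private static double angle(Point p1, Point p2, Point p0) {
        double dx1 = p1.x - p0.x;
        double dy1 = p1.y - p0.y;
        double dx2 = p2.x - p0.x;
        double dy2 = p2.y - p0.y;
        return (dx1 * dx2 + dy1 * dy2) / Math.sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
    }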



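    // Returns the largest roughly rectangular contour in src, or null if none is found.
    private static MatOfPoint find(Mat src) throws Exception {
        // Median blur suppresses noise and fine detail before edge detection.
        Mat blurred = src.clone();
        Imgproc.medianBlur(src, blurred, 9);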

        Mat gray0 = new Mat(blurred.size(), CvType.CV_8U), gray = new Mat();

        List<MatOfPoint> contours = new ArrayList<>();

        List<Mat> blurredChannel = new ArrayList<>();
        blurredChannel.add(blurred);
        List<Mat> gray0Channel = new ArrayList<>();
        gray0Channel.add(gray0);

        MatOfPoint2f approxCurve;

        double maxArea = 0;
        int maxId = -1;

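        // Look for contours in each of the three colour channels separately.
        for (int c = 0; c < 3; c++) {
            int[] ch = {c, 0}; // copy channel c of the blurred image into channel 0 of gray0
            Core.mixChannels(blurredChannel, gray0Channel, new MatOfInt(ch));

            int thresholdLevel = 1; // with this value only the Canny pass (t == 0) runs
            for (int t = 0; t < thresholdLevel; t++) {
                if (t == 0) {
                    // Canny edges; the last argument (L2gradient = true) selects the more accurate L2 gradient norm.
                    Imgproc.Canny(gray0, gray, 10, 20, 3, true);
                    // Dilate once with the default 3x3 kernel to close small gaps between edge segments.
                    Imgproc.dilate(gray, gray, new Mat(), new Point(-1, -1), 1);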
                } else {
                    Imgproc.adaptiveThreshold(gray0, gray, thresholdLevel, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, (src.width() + src.height()) / 200, t);
                }

                Imgproc.findContours(gray, contours, new Mat(), Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);

                for (MatOfPoint contour : contours) {
                    MatOfPoint2f temp = new MatOfPoint2f(contour.toArray());

                    double area = Imgproc.contourArea(contour);
                    approxCurve = new MatOfPoint2f();
                    Imgproc.approxPolyDP(temp, approxCurve, Imgproc.arcLength(temp, true) * 0.02, true);

                    if (approxCurve.total() == 4 && area >= maxArea) {
                        double maxCosine = 0;

                        List<Point> curves = approxCurve.toList();
                        for (int j = 2; j < 5; j++)
                        {

                            double cosine = Math.abs(angle(curves.get(j % 4), curves.get(j - 2), curves.get(j - 1)));
                            maxCosine = Math.max(maxCosine, cosine);
                        }

                        if (maxCosine < 0.3) {
                            maxArea = area;
                            maxId = contours.indexOf(contour);
                            //contours.set(maxId, getHull(contour));
                        }
                    }
                }
            }
        }

        if (maxId >= 0) {
            return contours.get(maxId);
            //Imgproc.drawContours(src, contours, maxId, new Scalar(255, 0, 0, .8), 8);
        }
        return null;
    }
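To make the `maxCosine < 0.3` test more concrete, here is a minimal sketch in plain Java that applies the same cosine formula as `angle()` above to two hand-made quadrangles. The class `CornerCheck`, its method names, and the sample coordinates are hypothetical and not part of the answer's code. Note one deliberate difference: the sketch checks all four corners, whereas the loop in `find` checks three of them.

```java
class CornerCheck {
    /** Cosine of the angle at p0 between the vectors p0->p1 and p0->p2. */
    static double cosine(double[] p1, double[] p2, double[] p0) {
        double dx1 = p1[0] - p0[0];
        double dy1 = p1[1] - p0[1];
        double dx2 = p2[0] - p0[0];
        double dy2 = p2[1] - p0[1];
        return (dx1 * dx2 + dy1 * dy2)
                / Math.sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
    }

    /** Largest absolute corner cosine of a four-point polygon given in order. */
    static double maxCosine(double[][] quad) {
        double max = 0;
        for (int i = 0; i < 4; i++) {
            double[] prev = quad[(i + 3) % 4];
            double[] next = quad[(i + 1) % 4];
            max = Math.max(max, Math.abs(cosine(prev, next, quad[i])));
        }
        return max;
    }

    /** True if every corner is close enough to a right angle for the given threshold. */
    static boolean looksRectangular(double[][] quad, double threshold) {
        return maxCosine(quad) < threshold;
    }

    public static void main(String[] args) {
        double[][] rectangle = {{0, 0}, {4, 0}, {4, 2}, {0, 2}};
        double[][] parallelogram = {{0, 0}, {4, 0}, {6, 2}, {2, 2}};
        System.out.println(looksRectangular(rectangle, 0.3));
        System.out.println(looksRectangular(parallelogram, 0.3));
    }
}
```

A cosine below 0.3 in absolute value corresponds to corner angles between roughly 72.5 and 107.5 degrees, so the check accepts cards seen at a moderate tilt but rejects strongly sheared shapes such as the parallelogram, whose corner cosine is about 0.71.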

You can call it like this:

    MatOfPoint contour = find(src);

See this answer for detecting the quadrangle from a contour and transforming it to a plain perspective: Java OpenCV deskewing a contour
