问题描述
我有发票图片,我想检测上面的文字。所以我计划使用两个步骤:首先是识别文本区域,然后使用OCR识别文本。
I have an invoice image, and I want to detect the text on it. So I plan to use 2 steps: first is to identify the text areas, and then using OCR to recognize the text.
我在python中使用OpenCV 3.0。我能够识别文本(包括一些非文本区域),但我还想从图像中识别文本框(也不包括非文本区域)。
I am using OpenCV 3.0 in python for that. I am able to identify the text(including some non text areas) but I further want to identify text boxes from the image(also excluding the non-text areas).
我的输入图片是:,输出为:
我正在使用以下代码:
My input image is: and the output is:and I am using the below code for this:
img = cv2.imread('/home/mis/Text_Recognition/bill.jpg')
mser = cv2.MSER_create()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #Converting to GrayScale
gray_img = img.copy()
regions = mser.detectRegions(gray, None)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(gray_img, hulls, 1, (0, 0, 255), 2)
cv2.imwrite('/home/mis/Text_Recognition/amit.jpg', gray_img) #Saving
现在,我想识别文本框,并删除/取消识别发票上的任何非文本区域。我是OpenCV的新手,也是Python的初学者。我可以在和,但如果我将它们转换为python,它我会花很多时间。
Now, I want to identify the text boxes, and remove/unidentify any non-text areas on the invoice. I am new to OpenCV and am a beginner in Python. I am able to find some examples in MATAB example and C++ example, but If I convert them to python, it will take a lot of time for me.
有没有使用OpenCV的python示例,或者有人可以帮我这个吗?
Is there any example with python using OpenCV, or can anyone help me with this?
推荐答案
以下是代码
导入包
Below is the codeImport packages
import cv2
import numpy as np
#Create MSER object
mser = cv2.MSER_create()
#Your image path i-e receipt path
img = cv2.imread('/home/rafiullah/PycharmProjects/python-ocr-master/receipts/73.jpg')
#Convert to gray scale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
#detect regions in gray scale image
regions, _ = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, 1, (0, 255, 0))
cv2.imshow('img', vis)
cv2.waitKey(0)
mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)
for contour in hulls:
cv2.drawContours(mask, [contour], -1, (255, 255, 255), -1)
#this is used to find only text regions, remaining are ignored
text_only = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow("text only", text_only)
cv2.waitKey(0)
这篇关于OpenCV MSER检测文本区域 - Python的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!