Problem description
I am trying to scan a passport page with a phone camera using OpenCV.
In the image above, the contour marked in red is my ROI (I need a top view of it). By performing segmentation I can detect the MRZ area, and the pages should have a fixed aspect ratio. Is there a way to scale the green contour using the aspect ratio so that it approximates the red one? I have tried finding the corners of the green rectangle using approxPolyDP, then scaling that rectangle, and finally doing a perspective warp to get the top view. The problem is that the perspective rotation is not accounted for during the rectangular scaling, so the final rectangle is often wrong.
Often I get the output marked in the following image.
Update: added more explanation
Regarding the first image (assuming the red rectangle always has a constant aspect ratio):
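- My objective: crop the part marked in red and get a top view of it
- My approach: detect the MRZ (the green rectangle) -> assume the bottom edge of the green rectangle is the same as (close enough to) the bottom edge of the red rectangle -> this gives me the width and two corners of the rectangle -> compute the other two corners using the height / aspect ratio
- Problem: the calculation above does not produce the red rectangle; instead it produces the green rectangle shown in the second image (probably because these quadrilaterals are not rectangles: the angles between their edges are not 0 or 90 degrees)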
Recommended answer
As far as I understand, your main goal is to get a top view of the passport page when its photo is taken from an arbitrary angle. Also, as I understand it, your approach is the following:
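- Find the MRZ and its enclosing polygon
- Extend the MRZ polygon toward the top; this gives you the page polygon
- Apply a perspective warp to get the top view.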
The main obstacle at the moment is extending the polygon.
Please correct me if I have understood the goal incorrectly.
Extending a polygon is quite easy from a mathematical perspective. The points on each side of the polygon define a line. If you extend that line further, you can place a new point on it. Programmatically it may look like this:
new_left_top_x = old_left_bottom_x + (old_left_top_x - old_left_bottom_x) * pass_height_to_MRZ_height_ratio
new_left_top_y = old_left_bottom_y + (old_left_top_y - old_left_bottom_y) * pass_height_to_MRZ_height_ratio
The same can be done for the right side. This approach would also work with rotations of up to 45 degrees.
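The two formulas above can be wrapped into a small helper that extends both sides at once. This is only a minimal sketch: the function names, the corner ordering (top-left, top-right, bottom-right, bottom-left) and the ratio of 4.0 are assumptions made for illustration, not values measured from a real passport.

```python
def extend_point(bottom, top, ratio):
    """Move `top` along the line bottom -> top so the side becomes `ratio` times longer."""
    bx, by = bottom
    tx, ty = top
    return (bx + (tx - bx) * ratio, by + (ty - by) * ratio)


def extend_quad_upward(quad, ratio):
    """Extend a quadrilateral upward while keeping its bottom edge fixed.

    `quad` is assumed to be ordered (top_left, top_right, bottom_right, bottom_left).
    """
    tl, tr, br, bl = quad
    new_tl = extend_point(bl, tl, ratio)  # left side, as in the formulas above
    new_tr = extend_point(br, tr, ratio)  # right side, same operation
    return (new_tl, new_tr, br, bl)


# Hypothetical MRZ corners in pixels and a hypothetical page/MRZ height ratio
mrz = ((10.0, 80.0), (110.0, 90.0), (112.0, 110.0), (8.0, 100.0))
page = extend_quad_upward(mrz, 4.0)
print(page)
```

Note that this scales each side linearly in image coordinates, so it does not model perspective foreshortening along the side.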
However, I'm afraid this approach would not give accurate results. I would suggest detecting the passport page itself instead of the MRZ. The reason is that the page itself is quite a noticeable object in the photo and can easily be found with the findContours function.
I wrote some code to illustrate the idea that detecting the MRZ is not really necessary.
import os
import imutils
import numpy as np
import argparse
import cv2
# Thresholds
passport_page_aspect_ratio = 1.44
passport_page_coverage_ratio_threshold = 0.6
morph_size = (4, 4)

def pre_process_image(image):
    # Let's get rid of color first
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Then apply Otsu threshold to reveal important areas
    ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # erode white areas to "disconnect" them
    # and dilate back to restore their original shape
    morph_struct = cv2.getStructuringElement(cv2.MORPH_RECT, morph_size)
    thresh = cv2.erode(thresh, morph_struct, anchor=(-1, -1), iterations=1)
    thresh = cv2.dilate(thresh, morph_struct, anchor=(-1, -1), iterations=1)
    return thresh

def find_passport_page_polygon(image):
    cnts = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    # examine the largest contours first
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
    for cnt in cnts:
        # compute the aspect ratio of the bounding box and the ratio of
        # the bounding box width to the width of the image
        (x, y, w, h) = cv2.boundingRect(cnt)
        ar = w / float(h)
        cr_width = w / float(image.shape[1])
        # check to see if the aspect ratio and coverage width are within thresholds
        if ar > passport_page_aspect_ratio and cr_width > passport_page_coverage_ratio_threshold:
            # approximate the contour with a polygon
            epsilon = 0.02 * cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, epsilon, True)
            # order_points() expects exactly 4 corners, so skip other shapes
            if len(approx) == 4:
                return approx
    return None

def order_points(pts):
    # initialize a list of coordinates that will be ordered in the order:
    # top-left, top-right, bottom-right, bottom-left
    rect = np.zeros((4, 2), dtype="float32")
    pts = pts.reshape(4, 2)
    # the top-left point will have the smallest sum, whereas
    # the bottom-right point will have the largest sum
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    # now, compute the difference (y - x) between the points, the
    # top-right point will have the smallest difference,
    # whereas the bottom-left will have the largest difference
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]
    return rect

def get_passport_top_vew(image, pts):
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    # compute the height of the new image, which will be the
    # maximum of the lengths of the right side (top-right to bottom-right)
    # and the left side (top-left to bottom-left)
    height_a = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    height_b = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    max_height = max(int(height_a), int(height_b))
    # compute the width using standard passport page aspect ratio
    max_width = int(max_height * passport_page_aspect_ratio)
    # construct the set of destination points to obtain the top view, specifying points
    # in the top-left, top-right, bottom-right, and bottom-left order
    dst = np.array([
        [0, 0],
        [max_width - 1, 0],
        [max_width - 1, max_height - 1],
        [0, max_height - 1]], dtype="float32")
    # compute the perspective transform matrix and apply it
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (max_width, max_height))
    return warped

if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True, help="path to the input image")
    args = vars(ap.parse_args())
    in_file = args["image"]
    # output files are written next to the input, with a different suffix
    filename_base = os.path.splitext(in_file)[0]
    img = cv2.imread(in_file)
    if img is None:
        raise SystemExit("Could not read image: " + in_file)
    pre_processed = pre_process_image(img)
    # Visualizing pre-processed image
    cv2.imwrite(filename_base + ".pre.png", pre_processed)
    page_polygon = find_passport_page_polygon(pre_processed)
    if page_polygon is not None:
        # Visualizing found page polygon
        vis = img.copy()
        cv2.polylines(vis, [page_polygon], True, (0, 255, 0), 2)
        cv2.imwrite(filename_base + ".bounds.png", vis)
        # Visualizing the warped top view of the passport page
        top_view_page = get_passport_top_vew(img, page_polygon)
        cv2.imwrite(filename_base + ".top.png", top_view_page)
The results I got:
For better results, it would also be good to compensate for the camera's lens distortion.