WO2021051939A1 - Method and device for locating a certificate area - Google Patents

Method and device for locating a certificate area

Info

Publication number
WO2021051939A1
Authority
WO
WIPO (PCT)
Prior art keywords
line
lines
groups
vertical
horizontal
Prior art date
Application number
PCT/CN2020/099260
Other languages
English (en)
French (fr)
Inventor
黄泽浩
熊冬根
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021051939A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Definitions

  • This application relates to the computer field, and in particular to a method and device for locating a certificate area.
  • ID cards play a vital role in personal identity verification, especially in the lending business, where borrowers are required to upload ID card photos.
  • After the server receives the ID card image uploaded by the borrower, it usually locates the ID card area and enhances the clarity of the image to obtain a clear ID card image.
  • The inventor realizes that photos uploaded by ordinary users are taken in various complicated environments, and their image quality is uneven.
  • Precise positioning of the ID card area in the image directly affects the subsequent clarity enhancement of the ID card area.
  • At present, after receiving the image, the server usually uses text recognition technology to recognize the text on the image and determine the text on the ID card, so as to locate the ID card area according to the position of that text.
  • However, extracting the ID card text with this method is easily affected by other text in the image, which reduces the positioning accuracy.
  • the embodiments of the present application provide a method and device for locating a document area, which can detect and accurately locate the document area, improve the accuracy of document positioning, and facilitate subsequent image processing.
  • an embodiment of the present application provides a method for locating a credential area, including:
  • the area corresponding to the four lines is determined as the area of the credential image.
  • an embodiment of the present application provides a device for locating a document area, including:
  • An acquiring unit configured to acquire an image to be processed, where the image to be processed includes a credential image
  • a processing unit configured to perform binarization processing on the image to be processed to obtain an edge detection image including edge lines
  • the grouping unit is used to group the lines according to the type and position of the line to obtain four groups of lines;
  • the selection unit is used to select one line from each of the four groups of lines to obtain four lines;
  • the determining unit is configured to determine the area corresponding to the four lines as the area of the credential image.
  • an embodiment of the present application provides an electronic device.
  • the electronic device includes a processor and a memory, and the processor and the memory are connected to each other.
  • the memory is used to store a computer program that supports the terminal device in executing the method provided by the foregoing first aspect and/or any one of the possible implementations of the first aspect; the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following steps:
  • the area corresponding to the four lines is determined as the area of the credential image.
  • an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program includes program instructions for implementing the following steps:
  • the area corresponding to the four lines is determined as the area of the credential image.
  • Binarization processing of the image yields a binary image composed of the edge lines in the image.
  • The lines in that edge image are then identified and, according to their position and type, grouped and classified, so that four groups of lines containing the document area are obtained.
  • One line is selected from each of the four groups of lines to obtain four lines.
  • These four lines delimit the document area, which realizes precise positioning of the document area in the image.
  • It is also possible to judge whether the lines can be divided into four groups according to the rules, thereby achieving the effect of detecting whether there is a certificate in the image.
  • Fig. 1 is a flowchart of a method for locating a document area proposed in this application;
  • Fig. 2 is a flowchart of another method for locating a document area proposed in this application;
  • Fig. 3 is a schematic structural diagram of a device for locating a document area proposed in this application;
  • Fig. 4 is a schematic structural diagram of an electronic device proposed in this application.
  • FIG. 1 is a flowchart of a method for locating a document area according to an embodiment of the present application. As shown in Figure 1, the method for locating the document area includes:
  • the device for locating the certificate area may include a server, a mobile phone, a tablet computer, a personal digital assistant (PDA), a mobile internet device (MID), a smart wearable device (such as a smart watch or a smart bracelet), and other electronic equipment capable of image processing.
  • the image to be processed can be obtained from an image containing the certificate uploaded by the user, selected from stored images containing the certificate, or captured as a photo containing the certificate by turning on the camera.
  • There is no restriction here on the way the image to be processed is obtained.
  • the shape of the certificate and the type of the certificate are not limited.
  • edge detection is performed on the acquired image to be processed to obtain a binary image: the image to be processed is input to a holistically-nested edge detection (HED) neural network, and an edge detection image including the edge lines is obtained.
  • in the binary image, the edge lines can be rendered in black and everything other than the edge lines in white, or the edge lines can be rendered in white and everything other than the edge lines in black.
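The two rendering conventions for the binary image can be illustrated with a minimal sketch. It assumes the edge detector outputs a per-pixel edge probability in [0, 1]; the function name `binarize_edges` and the cut-off of 0.5 are hypothetical, since the source does not state how the network output is thresholded.

```python
import numpy as np

def binarize_edges(prob, threshold=0.5, edges_black=True):
    """Turn an edge-probability map (values in [0, 1]) into a 0/255 binary image.

    threshold=0.5 is an illustrative assumption, not a value from the source.
    """
    is_edge = prob >= threshold
    # Pick which of the two colours marks the edge lines.
    edge_val, bg_val = (0, 255) if edges_black else (255, 0)
    return np.where(is_edge, edge_val, bg_val).astype(np.uint8)
```

Either convention carries the same information; later steps only need to know which value marks an edge pixel.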
  • the type of certificate in the image can be obtained.
  • the type can be determined by recognizing the text in the image with text recognition technology, it can be the type selected by the user when uploading the image containing the certificate, or it can be the type determined from the stored image type.
  • a large number of images containing the type of the certificate are used to train the above-mentioned neural network to obtain the trained neural network.
  • the trained neural network is used to perform edge detection on the image to be processed to obtain an edge detection image including edge lines.
  • the middle convolutional layers of the above-mentioned HED network also upsample their outputs to pictures of the same size as the original picture and calculate a loss function (loss) against the label data (ground truth).
  • the output maps of these middle convolutional layers are called side-outputs.
  • the loss from the multiple side-outputs is propagated directly back to the corresponding convolutional layers, which avoids vanishing gradients to a certain extent.
  • features at different scales are learned in different convolutional layers (with different receptive fields), so as to achieve the effect of edge detection.
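The idea of comparing every upsampled side-output with the ground truth can be sketched numerically. This is only an illustration of the forward loss value under stated assumptions: nearest-neighbour upsampling and a plain, unweighted binary cross-entropy are stand-ins chosen for brevity, and the names `upsample_nearest`, `bce`, and `side_output_loss` are hypothetical. A real network would be trained in a differentiable framework and may upsample and weight the loss differently.

```python
import numpy as np

def upsample_nearest(side, factor):
    """Enlarge a 2-D map by an integer factor using nearest-neighbour repetition."""
    return np.repeat(np.repeat(side, factor, axis=0), factor, axis=1)

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    p = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))

def side_output_loss(side_outputs, ground_truth):
    """Sum the loss of every side-output after bringing it to ground-truth size.

    Assumes square-compatible shapes where each side-output is an integer
    factor smaller than the ground truth.
    """
    total = 0.0
    for side in side_outputs:
        factor = ground_truth.shape[0] // side.shape[0]
        total += bce(upsample_nearest(side, factor), ground_truth)
    return total
```

Because each side-output contributes its own loss term, every intermediate layer receives a direct training signal instead of relying only on the final layer.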
  • line recognition is performed on the detected edge lines to identify the lines in the edge detection image.
  • the Hough transform can be used to extract features from the image, and thereby extract the lines in the above-mentioned edge detection image.
  • voting is performed in the Hough space: each time a straight line equation is satisfied at a point (r, θ), that is, at an intersection of multiple plane lines, the pixel value at that point is increased by 1.
  • the whiter a certain point is (the larger its pixel value), the more points lie on the corresponding straight line, so it is likely to be a line that should be recognized.
  • the intersections are further filtered: a first threshold on the number of intersecting lines is set, and the intersections whose pixel value, that is, the number of plane lines passing through them, is greater than or equal to the first threshold are taken as the selected intersections. The coordinates (k, b) of a selected intersection represent the slope and intercept of a line in the edge detection image, from which the expression of that line can be obtained, so as to determine the line in the edge detection image.
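The voting and filtering steps can be sketched as follows. The source mentions both an (r, θ) and a (k, b) parameterization; this sketch uses (r, θ) with 1-degree steps because it also represents vertical lines, which is a choice made here rather than something the source specifies. The function name `hough_lines` is hypothetical; `first_threshold` follows the naming in the text.

```python
import numpy as np

def hough_lines(edge_mask, first_threshold):
    """Vote in (r, theta) space and keep cells with at least first_threshold votes.

    edge_mask: 2-D boolean array, True where a pixel lies on an edge line.
    Returns a list of (r, theta_in_degrees, votes).
    """
    h, w = edge_mask.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(180))              # 0, 1, ..., 179 degrees
    cols = np.arange(180)
    acc = np.zeros((2 * diag + 1, 180), dtype=np.int64)
    ys, xs = np.nonzero(edge_mask)
    for x, y in zip(xs, ys):
        # Each edge pixel votes once per theta for r = x*cos(theta) + y*sin(theta).
        r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[r, cols] += 1
    peaks = np.argwhere(acc >= first_threshold)
    return [(int(ri) - diag, int(ti), int(acc[ri, ti])) for ri, ti in peaks]
```

A straight edge usually produces several neighbouring accumulator cells above the threshold, which is one reason the later steps group near-duplicate lines by angle and position.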
  • after the lines are identified, they are first classified.
  • the types of lines include horizontal lines and vertical lines. According to the size of its inclination angle, each obtained line is assigned to one of these two types. According to their positions, the horizontal lines are divided into multiple horizontal line groups. The lines in each horizontal line group are similar in position and also similar in length, so that lines that are too long or too short do not appear in a group.
  • the vertical lines are treated in the same way: they are divided into multiple vertical line groups according to their positions, and the lines in each vertical line group are similar in position and similar in length.
  • the two horizontal line groups and two vertical line groups that contain the document area are determined according to the difference between the aspect ratio of the document and the ratio of the lengths of the lines in a horizontal line group to the lines in a vertical line group, and according to how far the angle between the lines in the horizontal line group and the lines in the vertical line group deviates from a right angle; four groups of lines are thus obtained.
  • one line is selected from each of the above four groups of lines, and the four selected lines are the four lines that make up the document area.
  • the selection method can be to randomly select a line in each group as a boundary of the certificate area, to select the outermost line in each group according to the positions of the lines, or to select the innermost line in each group according to the positions of the lines, which is not limited here.
  • the area enclosed by the obtained four lines is determined as the area of the document image.
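Once four lines have been chosen, the enclosed area can be described by its four corners, each being the intersection of one horizontal and one vertical line. The sketch below is one way to compute them, assuming each line is given as a segment `(x1, y1, x2, y2)`; that representation and the names `line_through`, `intersect`, and `document_corners` are hypothetical.

```python
def line_through(x1, y1, x2, y2):
    """Coefficients (a, b, c) of the line a*x + b*y = c through two points."""
    a, b = y2 - y1, x1 - x2
    return a, b, a * x1 + b * y1

def intersect(l1, l2):
    """Intersection of two lines in (a, b, c) form, or None if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule for the 2x2 linear system.
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def document_corners(top, bottom, left, right):
    """Corners in the order top-left, top-right, bottom-right, bottom-left."""
    lt, lb, ll, lr = (line_through(*s) for s in (top, bottom, left, right))
    return [intersect(lt, ll), intersect(lt, lr), intersect(lb, lr), intersect(lb, ll)]
```

The general form a*x + b*y = c is used instead of slope and intercept so that perfectly vertical lines need no special case.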
  • FIG. 2 is a flowchart of another method for locating a document area according to an embodiment of the present application. As shown in Figure 2, the method for locating the document area includes:
  • the obtained lines are grouped. First, they are divided into horizontal lines and vertical lines according to their type.
  • the inclination angle is measured with the x-axis as the reference: it is the angle formed between the positive direction of the x-axis and the upward direction of the line. Whether a line is a horizontal line or a vertical line is determined by whether its inclination angle is within the threshold interval.
  • the inclination angle of the line can be obtained according to the slope of the line, or can be obtained according to the coordinates of the pixel points on the line, which is not limited here.
  • a line whose inclination angle is within the threshold interval is determined to be a vertical line, where the threshold interval can be 45 degrees to 135 degrees, 60 degrees to 120 degrees, or 30 degrees to 150 degrees; there is no limit here.
  • the lines outside the above threshold interval are classified as horizontal lines, giving multiple horizontal lines. That is, the lines remaining after the vertical lines are removed from all the lines are the horizontal lines.
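The type split can be sketched directly from the rule above. The sketch assumes each line is a segment `(x1, y1, x2, y2)` and uses the 45 to 135 degree interval mentioned in the text as the default; the function names `inclination_deg` and `split_by_type` are hypothetical.

```python
import math

def inclination_deg(x1, y1, x2, y2):
    """Inclination angle in [0, 180) relative to the positive x-axis."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def split_by_type(segments, interval=(45.0, 135.0)):
    """Return (horizontal, vertical): a line is vertical if its angle is in the interval."""
    lo, hi = interval
    horizontal, vertical = [], []
    for seg in segments:
        target = vertical if lo <= inclination_deg(*seg) <= hi else horizontal
        target.append(seg)
    return horizontal, vertical
```

Because the interval is symmetric around 90 degrees, the result does not depend on whether the image y-axis points up or down.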
  • the lines classified as horizontal lines are grouped: for every two horizontal lines, it is calculated whether the difference between their inclination angles is less than a second threshold, where the second threshold may be 5 degrees, 6 degrees, or 10 degrees. It can be understood that the second threshold is a manually set angle threshold. It is also calculated whether the distance between the midpoints of the two horizontal lines is less than a third threshold. If both conditions are met, the two horizontal lines are placed in the same horizontal line group. It can be understood that the third threshold is a distance threshold, which can be set according to the size of the image.
  • each horizontal line group contains at least one horizontal line.
  • the lines classified as vertical lines are grouped: for every two vertical lines, it is calculated whether the difference between their inclination angles is less than the second threshold, where the second threshold may be 5 degrees, 6 degrees, or 10 degrees. It can be understood that the second threshold is a manually set angle threshold. It is also calculated whether the distance between the midpoints of the two vertical lines is less than the third threshold, which is a distance threshold and can be set manually. If both conditions are met, the two vertical lines are placed in the same vertical line group.
  • each vertical line group contains at least two vertical lines.
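The pairwise grouping rule applies equally to horizontal and vertical lines and can be sketched as below. Two assumptions are made here and are not stated in the source: pairwise matches are merged transitively (with a small union-find), and the default distance threshold of 20 pixels is purely illustrative. The 5-degree default comes from the text; the names `angle_diff` and `group_lines` are hypothetical.

```python
import math

def angle_diff(a, b):
    """Smallest difference between two inclination angles, in degrees (0..90)."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def group_lines(segments, second_threshold=5.0, third_threshold=20.0):
    """Group segments whose angles and midpoints are both close to each other."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    feats = []
    for x1, y1, x2, y2 in segments:
        ang = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        feats.append((ang, (x1 + x2) / 2, (y1 + y2) / 2))

    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            (ai, xi, yi), (aj, xj, yj) = feats[i], feats[j]
            close_angle = angle_diff(ai, aj) < second_threshold
            close_mid = math.hypot(xi - xj, yi - yj) < third_threshold
            if close_angle and close_mid:
                parent[find(j)] = find(i)   # merge the two groups

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    return list(groups.values())
```

With transitive merging, a line that matches no other line still forms a group of its own.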
  • the i-th horizontal line is a line in one horizontal line group of the plurality of horizontal line groups, and the j-th vertical line is a line in one vertical line group of the plurality of vertical line groups.
  • the ratio of the length of the i-th horizontal line to the length of the j-th vertical line is calculated, and the absolute value of the difference between this length ratio and the fourth threshold is taken as the ij-th length difference, where the fourth threshold is the aspect ratio of the determined document type, which is a known value.
  • the above-mentioned fifth threshold is 90 degrees. It can be understood that the line segments in the four groups must satisfy two conditions: their length ratio differs from the document's aspect ratio by less than the sixth threshold, and the angle between the horizontal lines and the vertical lines is close to or equal to 90 degrees.
  • if the ij-th length difference is smaller than the sixth threshold, the ij-th angle difference is smaller than the seventh threshold, and the i-th horizontal line and the j-th vertical line intersect, the four groups of lines are obtained based on the horizontal line group where the i-th horizontal line is located and the vertical line group where the j-th vertical line is located.
  • the sixth threshold is the tolerance on the aspect-ratio difference, and the seventh threshold is the tolerance around 90 degrees.
  • the sixth threshold and the seventh threshold may be manually preset.
  • when each of the four groups of lines has a line that satisfies the above angle-difference and length-difference conditions and the intersection condition, these four groups are taken as the four groups of lines, and one line is selected from each of them to determine the area of the document.
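The length and angle conditions for one horizontal/vertical pair can be sketched as follows. Several points are assumptions made for illustration: the ij-th angle difference is taken to be the absolute deviation of the angle between the two lines from the fifth threshold (90 degrees), lines are segments `(x1, y1, x2, y2)`, and the intersection condition is left out because the source does not say how it is tested. The names `seg_length`, `seg_angle`, and `pair_matches` are hypothetical.

```python
import math

def seg_length(seg):
    x1, y1, x2, y2 = seg
    return math.hypot(x2 - x1, y2 - y1)

def seg_angle(seg):
    """Inclination angle in [0, 180) degrees."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def pair_matches(h_seg, v_seg, fourth_threshold, sixth_threshold, seventh_threshold):
    """Check the length-ratio and right-angle conditions for one line pair.

    fourth_threshold: known aspect ratio of the document type.
    sixth_threshold: tolerance on the length-ratio difference.
    seventh_threshold: tolerance, in degrees, around 90 degrees.
    """
    length_diff = abs(seg_length(h_seg) / seg_length(v_seg) - fourth_threshold)
    between = abs(seg_angle(h_seg) - seg_angle(v_seg))
    between = min(between, 180.0 - between)      # angle between the lines, 0..90
    angle_diff = abs(between - 90.0)             # fifth threshold: 90 degrees
    return length_diff < sixth_threshold and angle_diff < seventh_threshold
```

As a usage example, a card in the ID-1 format (85.6 mm by 54 mm) would use a `fourth_threshold` of about 1.585; the tolerance values would have to be tuned for the images at hand.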
  • FIG. 3 is a schematic diagram of the structure of a device for locating a document area proposed in this application.
  • the device 3000 for locating the document area includes:
  • the acquiring unit 301 is configured to acquire an image to be processed, and the image to be processed includes a certificate image;
  • the processing unit 302 is configured to perform binarization processing on the image to be processed to obtain an edge detection image including edge lines;
  • the recognition unit 303 is configured to recognize the lines in the above-mentioned edge detection image
  • the grouping unit 304 is configured to group the above-mentioned lines according to the type of the line and the position of the line to obtain four groups of lines;
  • the selecting unit 305 is configured to select one line from each of the above four groups of lines to obtain four lines;
  • the determining unit 306 is configured to determine the area corresponding to the aforementioned four lines as the area of the aforementioned certificate image.
  • the aforementioned identification unit 303 is specifically configured to:
  • Determining the intersection of the multiple plane lines, and the number of plane lines passing through the intersection is greater than a first threshold
  • the line in the edge detection image is determined according to the straight line expression.
  • the foregoing grouping unit 304 is specifically configured to:
  • Two horizontal line groups are selected from the above-mentioned multiple horizontal line groups, and two vertical line groups are selected from the above-mentioned multiple vertical line groups to obtain four groups of lines.
  • the foregoing grouping unit 304 is specifically configured to:
  • grouping the plurality of horizontal lines according to the positions of the lines to obtain a plurality of horizontal line groups includes: if the difference between the inclination angle of the first line and the inclination angle of the second line is less than the second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is less than the third threshold, determining that the first line and the second line belong to the same horizontal line group;
  • grouping the plurality of vertical lines according to the positions of the lines to obtain a plurality of vertical line groups includes: if the difference between the inclination angle of the third line and the inclination angle of the fourth line is less than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is less than the third threshold, determining that the third line and the fourth line belong to the same vertical line group.
  • the foregoing apparatus 3000 further includes:
  • the culling unit 307 is configured to remove lines whose line lengths are outside the threshold interval in the multiple horizontal line groups and the multiple vertical line groups after the multiple horizontal line groups and the multiple vertical line groups are obtained;
  • the above selection unit 305 is specifically used for:
  • Two horizontal line groups are selected from the multiple horizontal line groups after culling, and two vertical line groups are selected from the multiple vertical line groups after culling, to obtain four groups of lines.
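The culling step performed before group selection can be sketched as a simple length filter. The segment representation `(x1, y1, x2, y2)`, the function name `cull_by_length`, and the decision to drop groups that end up empty are assumptions; the source only says that lines whose lengths fall outside the threshold interval are removed.

```python
import math

def cull_by_length(groups, min_len, max_len):
    """Remove lines whose length is outside [min_len, max_len] from every group."""
    kept = []
    for group in groups:
        survivors = [
            s for s in group
            if min_len <= math.hypot(s[2] - s[0], s[3] - s[1]) <= max_len
        ]
        if survivors:            # assumption: a group with no lines left is discarded
            kept.append(survivors)
    return kept
```

Filtering first keeps very short noise segments and image-spanning lines from being chosen as document boundaries.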
  • the aforementioned selecting unit 305 is specifically configured to:
  • the i-th horizontal line is a line in one horizontal line group of the plurality of horizontal line groups, and the j-th vertical line is a line in one vertical line group of the plurality of vertical line groups;
  • if the ij-th length difference is less than the sixth threshold, the ij-th angle difference is less than the seventh threshold, and the i-th horizontal line and the j-th vertical line intersect, four groups of lines are obtained based on the horizontal line group where the i-th horizontal line is located and the vertical line group where the j-th vertical line is located.
  • the foregoing processing unit 302 is specifically configured to:
  • the above-mentioned image to be processed is input into the overall nested neural network to obtain an edge detection image including edge lines.
  • Binarization processing of the image yields a binary image composed of the edge lines in the image. The lines in the edge image obtained after binarization are then identified and, according to their position and type, grouped and classified to obtain four groups of lines containing the document area. One line is selected from each of the four groups to obtain four lines, and these four lines delimit the document area, which enables precise positioning of the document area in the image. At the same time, judging whether the lines can be divided into four groups according to the rules achieves the effect of detecting whether there is a certificate in the image.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may include:
  • processors 401 and memory 402. The aforementioned processor 401 and memory 402 are connected through a bus 403.
  • the memory 402 is configured to store a computer program, and the computer program includes program instructions.
  • the processor 401 is configured to execute the program instructions stored in the memory 402, where the processor 401 is configured to call the program instructions to perform the following steps:
  • the area corresponding to the above four lines is determined as the area of the above document image.
  • the processor 401 identifying lines in the edge detection image includes:
  • Determining the intersection of the multiple plane lines, and the number of plane lines passing through the intersection is greater than a first threshold
  • the line in the edge detection image is determined according to the straight line expression.
  • the processor 401 groups the lines according to the type and position of the line to obtain four groups of lines, including:
  • Two horizontal line groups are selected from the above-mentioned multiple horizontal line groups, and two vertical line groups are selected from the above-mentioned multiple vertical line groups to obtain four groups of lines.
  • the processor 401 grouping the plurality of horizontal lines according to the positions of the lines to obtain a plurality of horizontal line groups includes: if the difference between the inclination angle of the first line and the inclination angle of the second line is less than the second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is less than the third threshold, determining that the first line and the second line belong to the same horizontal line group;
  • grouping the plurality of vertical lines according to the positions of the lines to obtain a plurality of vertical line groups includes: if the difference between the inclination angle of the third line and the inclination angle of the fourth line is less than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is less than the third threshold, determining that the third line and the fourth line belong to the same vertical line group.
  • the processor 401 is also called to perform the following step:
  • after the multiple horizontal line groups and the multiple vertical line groups are obtained, removing the lines whose lengths are outside the threshold interval from the multiple horizontal line groups and the multiple vertical line groups;
  • selecting two horizontal line groups from the multiple horizontal line groups and two vertical line groups from the multiple vertical line groups to obtain four groups of lines then includes: selecting two horizontal line groups from the multiple horizontal line groups after removal, and selecting two vertical line groups from the multiple vertical line groups after removal, to obtain four groups of lines.
  • the processor 401 selects two horizontal line groups from the multiple horizontal line groups, and selects two vertical line groups from the multiple vertical line groups to obtain four groups of lines, including :
  • the i-th horizontal line is a line in one horizontal line group of the plurality of horizontal line groups, and the j-th vertical line is a line in one vertical line group of the plurality of vertical line groups;
  • if the ij-th length difference is less than the sixth threshold, the ij-th angle difference is less than the seventh threshold, and the i-th horizontal line and the j-th vertical line intersect, four groups of lines are obtained based on the horizontal line group where the i-th horizontal line is located and the vertical line group where the j-th vertical line is located.
  • the processor 401 performs binarization processing on the image to be processed to obtain an edge detection image including edge lines, including:
  • the above-mentioned image to be processed is input into the overall nested neural network to obtain an edge detection image including edge lines.
  • the aforementioned processor 401 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401.
  • a part of the memory 402 may also include a non-volatile random access memory.
  • the memory 402 may also store device type information.
  • the above-mentioned terminal device can execute, through its built-in functional modules, the implementation manners provided in the steps in Figures 1 to 2.
  • for details, refer to the implementation manners provided in the above-mentioned steps, which will not be repeated here.
  • the embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, the computer program includes program instructions, when the program instructions are executed by a processor, the steps shown in FIGS. 1 to 3 are implemented.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the foregoing computer-readable storage medium may be the task processing apparatus provided in any of the foregoing embodiments or the internal storage unit of the foregoing terminal device, such as a hard disk or memory of an electronic device.
  • the computer-readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the electronic device.
  • the above-mentioned computer-readable storage medium may also include magnetic disks, optical disks, read-only memory (ROM) or random access memory (RAM), etc.
  • the computer-readable storage medium may also include both an internal storage unit of the electronic device and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the electronic device.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

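A method and apparatus for locating an ID document region. The method comprises: acquiring an image to be processed, the image containing a document image (101); binarizing the image to be processed to obtain an edge detection image that includes edge lines (102); identifying the lines in the edge detection image (103); grouping the lines according to line type and line position to obtain four groups of lines (104); selecting one line from each of the four groups to obtain four lines (105); and determining the region corresponding to the four lines as the region of the document image (106). The method can improve the accuracy of document localization and facilitates subsequent image processing.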

Description

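Method and apparatus for locating an ID document region
This application claims priority to Chinese patent application No. 2019108807435, entitled "Method and apparatus for locating an ID document region" and filed with the China Patent Office on September 18, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computers, and in particular to a method and apparatus for locating an ID document region.
Background
As people take part in more and more social activities, the identity card plays a vital role in verifying personal identity. In lending services in particular, the borrower is required to upload a photo of their identity card. After receiving the uploaded image, the server usually locates the identity card region and sharpens it to obtain a clear identity card image. The inventors realized that photos uploaded by ordinary users are taken in all kinds of complex environments and vary widely in quality, and that how precisely the identity card region is located directly affects the subsequent sharpening of that region. At present, after receiving an image, a server typically applies text recognition to the image to find the text on the identity card and then locates the identity card region from the position of that text. However, extracting the identity card text in this way is easily disturbed by other text in the image, which lowers the localization accuracy.
Summary
Embodiments of this application provide a method and apparatus for locating an ID document region, which can detect and accurately locate the document region, improve the accuracy of document localization, and facilitate subsequent image processing.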
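In a first aspect, an embodiment of this application provides a method for locating an ID document region, comprising:
acquiring an image to be processed, the image to be processed containing a document image;
binarizing the image to be processed to obtain an edge detection image that includes edge lines;
identifying the lines in the edge detection image;
grouping the lines according to line type and line position to obtain four groups of lines;
selecting one line from each of the four groups of lines to obtain four lines;
determining the region corresponding to the four lines as the region of the document image.
In a second aspect, an embodiment of this application provides an apparatus for locating an ID document region, comprising:
an acquisition unit, configured to acquire an image to be processed, the image to be processed containing a document image;
a processing unit, configured to binarize the image to be processed to obtain an edge detection image that includes edge lines;
an identification unit, configured to identify the lines in the edge detection image;
a grouping unit, configured to group the lines according to line type and line position to obtain four groups of lines;
a selection unit, configured to select one line from each of the four groups of lines to obtain four lines;
a determination unit, configured to determine the region corresponding to the four lines as the region of the document image.
In a third aspect, an embodiment of this application provides an electronic device comprising a processor and a memory that are connected to each other. The memory is configured to store a computer program that supports the electronic device in performing the method provided by the first aspect and/or any possible implementation of the first aspect. The computer program includes program instructions, and the processor is configured to invoke the program instructions to perform the following steps:
acquiring an image to be processed, the image to be processed containing a document image;
binarizing the image to be processed to obtain an edge detection image that includes edge lines;
identifying the lines in the edge detection image;
grouping the lines according to line type and line position to obtain four groups of lines;
selecting one line from each of the four groups of lines to obtain four lines;
determining the region corresponding to the four lines as the region of the document image.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program, the computer program including program instructions for implementing the following steps:
acquiring an image to be processed, the image to be processed containing a document image;
binarizing the image to be processed to obtain an edge detection image that includes edge lines;
identifying the lines in the edge detection image;
grouping the lines according to line type and line position to obtain four groups of lines;
selecting one line from each of the four groups of lines to obtain four lines;
determining the region corresponding to the four lines as the region of the document image.
In the embodiments of this application, the acquired image to be processed is binarized to obtain a binary image made up of the edge lines in the image. The lines in this edge image are then identified and grouped by position and type, yielding four groups of lines that contain the document region. One line is selected from each of the four groups, and the resulting four lines delimit the region of the document image, so that the document region is located precisely in the image. In addition, checking whether the lines can be divided into four groups according to the rules makes it possible to detect whether the image contains a document at all.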
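Brief Description of the Drawings
Figure 1 is a flowchart of a method for locating an ID document region proposed in this application;
Figure 2 is a flowchart of another method for locating an ID document region proposed in this application;
Figure 3 is a schematic structural diagram of an apparatus for locating an ID document region proposed in this application;
Figure 4 is a schematic structural diagram of an electronic device proposed in this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings.
Refer to Figure 1, which is a flowchart of the method for locating an ID document region provided by an embodiment of this application. As shown in Figure 1, the method includes:
101. Acquire an image to be processed, the image containing a document image.
In the embodiments of this application, the apparatus for locating the document region may be any electronic device capable of image processing, such as a server, a mobile phone, a tablet computer, a personal digital assistant (PDA), a mobile internet device (MID), or a smart wearable device (for example a smart watch or a smart band).
In a possible implementation, the image to be processed may be taken from images containing a document that users have uploaded, picked from stored images containing a document, or captured as a photo of a document with a camera. The way the image is acquired is not limited here, and neither are the shape and type of the document.
102. Binarize the image to be processed to obtain an edge detection image that includes edge lines.
In a possible implementation, edge detection is performed on the acquired image to obtain a binary image: the image to be processed is fed into a holistically-nested edge detection (HED) network, which outputs an edge detection image that includes the edge lines. The edge lines may be drawn in black and everything else in white, or the edge lines in white and everything else in black.
Specifically, once the image to be processed has been determined, the type of document in the image can be obtained. The type may be determined by recognizing the text in the image with text recognition, from the document type the user selected when uploading the image, or from the type recorded for the stored image. Once the document type is known, the neural network is trained on a large number of images containing documents of that type, and the trained network is used to perform edge detection on the image to be processed, yielding the edge detection image that includes the edge lines.
In the HED network, the outputs of the intermediate convolutional layers are also upsampled to maps of the same size as the original image, and a loss is computed between each of them and the ground-truth annotation. The maps output by these intermediate layers are called side outputs. The losses produced by the side outputs are back-propagated directly to the corresponding convolutional layers, which mitigates vanishing gradients to some extent, and the different convolutional layers (with different receptive fields) learn features at different scales, which together produces the edge detection result.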
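As a minimal illustration of the binarization in step 102, the sketch below assumes that the edge network outputs a per-pixel edge probability in [0, 1] and turns that map into a two-level image. The function name `binarize_edge_map`, the 0.5 cut-off and the 0/255 pixel values are illustrative assumptions, not values given in this application.

```python
def binarize_edge_map(prob_map, threshold=0.5, edge_value=0, background_value=255):
    """Turn an edge-probability map (rows of floats) into a two-level image.

    Pixels whose probability reaches the threshold are treated as edge pixels.
    By default edges are black (0) on a white (255) background; swapping the
    two values gives white edges on black.
    """
    return [
        [edge_value if p >= threshold else background_value for p in row]
        for row in prob_map
    ]


# Hypothetical 2x3 probability map standing in for a network output.
prob_map = [
    [0.1, 0.9, 0.2],
    [0.8, 0.7, 0.1],
]
black_on_white = binarize_edge_map(prob_map)
white_on_black = binarize_edge_map(prob_map, edge_value=255, background_value=0)
```

Either polarity works for the later steps as long as the line detector knows which value marks an edge pixel.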
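103. Identify the lines in the edge detection image.
In a possible implementation, once the edge detection image containing the edge lines has been obtained, line recognition is performed on the detected edge lines, because most documents are rectangular. The Hough transform can be used to extract features from the image, namely the lines in the edge detection image.
Specifically, for each pixel (x, y) on an edge line in the edge detection image, every straight line y = kx + b passing through it maps to a point in k-b space, the Hough space, so the pixel itself maps to a straight line there and the edge pixels together yield multiple such lines. Because the slope of a line parallel to the y-axis cannot be expressed, the parametric form r = x cos θ + y sin θ is used instead, where (x, y) are the coordinates of an edge pixel, r is the distance from the origin to the line through that pixel, and θ is the angle between the normal of that line and the x-axis. After every edge pixel has been mapped, votes are cast in Hough space: each time a line equation passes through a point (r, θ), that is, an intersection of several of the mapped curves, the value at that point is incremented by 1. The brighter a point is (the larger its value), the more edge pixels lie on the corresponding line, and the more likely it is a line that should be recognized. The intersections are then filtered with a first threshold on the vote count: only intersections whose value is greater than or equal to the first threshold are kept. The coordinates of a kept intersection, (k, b) in slope-intercept form or (r, θ) in parametric form, give the parameters of a line in the edge detection image, from which the expression of the line is obtained and the line in the edge detection image is determined.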
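The voting procedure can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the implementation of this application: the function name `hough_lines`, the 1-degree and 1-pixel bin sizes, and the sample points are all hypothetical, and `min_votes` plays the role of the first threshold.

```python
import math
from collections import Counter


def hough_lines(edge_points, min_votes, theta_step_deg=1, r_step=1.0):
    """Vote in (r, theta) space and return the cells with enough votes.

    Each edge pixel (x, y) votes for every line r = x*cos(theta) + y*sin(theta)
    that passes through it, with theta sampled over [0, 180) degrees.
    Returns a list of (r, theta_in_degrees, vote_count) tuples.
    """
    votes = Counter()
    for x, y in edge_points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            r = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(r / r_step), theta_deg)] += 1
    return [
        (r_bin * r_step, theta_deg, count)
        for (r_bin, theta_deg), count in votes.items()
        if count >= min_votes
    ]


# Ten pixels on the vertical line x = 5 and ten on the horizontal line y = 3.
points = [(5, y) for y in range(10, 20)] + [(x, 3) for x in range(10, 20)]
lines = hough_lines(points, min_votes=10)
```

Cells next to a true peak can also pass the threshold because of the bin width, so a practical detector would usually keep only local maxima of the accumulator.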
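104. Group the lines according to line type and line position to obtain four groups of lines.
In a possible implementation, the recognized lines are first classified. There are two line types, horizontal and vertical, and the lines are divided into these two classes by their tilt angle. The horizontal lines are then divided by position into multiple horizontal groups. The lines in each horizontal group are close to each other in position and similar in length, with no lines that are much too long or much too short. The vertical lines are handled the same way: they are divided by position into multiple vertical groups, and the lines in each vertical group are close in position and similar in length.
Further, two horizontal groups and two vertical groups that enclose the document region are determined, giving four groups of lines. The selection is based on the difference between the aspect ratio of the document and the ratio of the lengths of the lines in a horizontal group to those in a vertical group, and on the difference between a right angle and the angle formed by the lines in a horizontal group and the lines in a vertical group.
In a possible implementation, if no four lines that delimit a document region can be found among the groups of lines, it is determined that the current image does not contain a document, and prompt information indicating that no document is present in the current image is output.
105. Select one line from each of the four groups of lines to obtain four lines.
In a possible implementation, one line is selected from each of the four groups, and the four selected lines form the boundary of the document region. The line in each group may be chosen at random, or by position as the outermost line of the group, or by position as the innermost line of the group; this is not limited here.
106. Determine the region corresponding to the four lines as the region of the document image.
In a possible implementation, the region enclosed by the four lines is determined as the region of the document image.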
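Steps 105 and 106 can be illustrated with a short sketch that picks the outermost line of each group and intersects neighbouring lines to get the four corners of the region. The segment representation `(x1, y1, x2, y2)`, the function names, and the assumption that the groups are already labelled top, bottom, left and right are all hypothetical choices made for this example.

```python
def mid_x(seg):
    return (seg[0] + seg[2]) / 2.0


def mid_y(seg):
    return (seg[1] + seg[3]) / 2.0


def intersection(a, b):
    """Intersection of the infinite lines through segments a and b (None if parallel)."""
    x1, y1, x2, y2 = a
    x3, y3, x4, y4 = b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    px = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
    py = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
    return (px, py)


def locate_region(top_group, bottom_group, left_group, right_group):
    """Pick the outermost line of each group and return the four corners,
    ordered top-left, top-right, bottom-right, bottom-left."""
    top = min(top_group, key=mid_y)        # image y grows downward
    bottom = max(bottom_group, key=mid_y)
    left = min(left_group, key=mid_x)
    right = max(right_group, key=mid_x)
    return [
        intersection(top, left),
        intersection(top, right),
        intersection(bottom, right),
        intersection(bottom, left),
    ]


corners = locate_region(
    top_group=[(20, 24, 180, 24), (20, 20, 180, 20)],
    bottom_group=[(20, 120, 180, 120)],
    left_group=[(20, 20, 20, 120)],
    right_group=[(176, 20, 176, 120), (180, 20, 180, 120)],
)
```

Choosing the innermost line instead only requires swapping `min` and `max` in `locate_region`.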
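Further, the determined region may be cropped, denoised and otherwise sharpened to obtain the final document image.
In the embodiments of this application, the acquired image to be processed is binarized to obtain a binary image made up of the edge lines in the image. The lines in this edge image are then identified and grouped by position and type, yielding four groups of lines that contain the document region. One line is selected from each of the four groups, and the resulting four lines delimit the region of the document image, so that the document region is located precisely in the image. In addition, checking whether the lines can be divided into four groups according to the rules makes it possible to detect whether the image contains a document at all.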
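Refer to Figure 2, which is a flowchart of another method for locating an ID document region provided by an embodiment of this application. As shown in Figure 2, the method includes:
201. Calculate the tilt angle of each line.
In a possible implementation, the recognized lines are grouped in order to find the lines that enclose the document region. They are first divided by type into horizontal and vertical lines. To do so, the tilt angle of each line is calculated: taking the x-axis as the reference, it is the angle between the positive direction of the x-axis and the upward direction of the line. Whether a line is horizontal or vertical is decided by whether its tilt angle lies within a threshold interval. The tilt angle may be derived from the slope of the line or from the coordinates of the pixels on the line; this is not limited here.
202. Determine the lines whose tilt angle lies within the threshold interval as vertical lines, obtaining multiple vertical lines.
In a possible implementation, the lines whose tilt angle lies within the threshold interval are determined as vertical lines. The threshold interval may be 45 to 135 degrees, 60 to 120 degrees, or 30 to 150 degrees; this is not limited here. All lines within the threshold interval are classed as vertical.
203. Determine the lines other than the vertical lines as horizontal lines, obtaining multiple horizontal lines.
In a possible implementation, the lines outside the threshold interval are classed as horizontal, giving multiple horizontal lines. In other words, whatever remains after the vertical lines are removed is classed as horizontal.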
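Steps 201 to 203 amount to computing an angle in [0, 180) degrees and testing it against an interval. A minimal sketch follows, assuming each line is given by two endpoints `(x1, y1, x2, y2)`; the function names are hypothetical and the default interval of 45 to 135 degrees is one of the examples named above.

```python
import math


def tilt_angle_deg(x1, y1, x2, y2):
    """Angle between the positive x-axis and the line, folded into [0, 180) degrees."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def split_by_orientation(segments, low=45.0, high=135.0):
    """Return (vertical, horizontal): lines inside [low, high] are vertical,
    all remaining lines are horizontal."""
    vertical, horizontal = [], []
    for seg in segments:
        angle = tilt_angle_deg(*seg)
        (vertical if low <= angle <= high else horizontal).append(seg)
    return vertical, horizontal


segments = [(0, 0, 100, 5), (0, 0, 3, 80), (10, 60, 110, 58), (100, 0, 98, 80)]
vertical, horizontal = split_by_orientation(segments)
```

Folding the angle with `% 180.0` makes the result independent of the order in which the two endpoints are given.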
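204. Group the horizontal lines according to line position to obtain multiple horizontal groups.
In a possible implementation, the lines classed as horizontal are grouped. For every pair of horizontal lines, it is checked whether the difference between their tilt angles is smaller than a second threshold. The second threshold may be 5 degrees, 6 degrees, or 10 degrees; it is a manually set angle threshold. It is also checked whether the distance between the midpoints of the two lines is smaller than a third threshold. The third threshold is a distance threshold, which may be set according to the size of the image. If both conditions hold, the two horizontal lines are placed in the same horizontal group.
Further, lines whose length lies outside a threshold interval are removed from each horizontal group, so that the horizontal lines in each group have similar tilt angles, nearby midpoints, and similar lengths. Each horizontal group contains at least one horizontal line.
205. Group the vertical lines according to line position to obtain multiple vertical groups.
In a possible implementation, the vertical lines are grouped in the same way as the horizontal lines. For every pair of vertical lines, it is checked whether the difference between their tilt angles is smaller than the second threshold. The second threshold may be 5 degrees, 6 degrees, or 10 degrees; it is a manually set angle threshold. It is also checked whether the distance between the midpoints of the two lines is smaller than the third threshold, a distance threshold that may be set manually. If both conditions hold, the two vertical lines are placed in the same vertical group.
Further, lines whose length lies outside the threshold interval are removed from each vertical group, so that the vertical lines in each group have similar tilt angles, nearby midpoints, and similar lengths. Each vertical group contains at least two vertical lines.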
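The grouping in steps 204 and 205 can be sketched as follows. This is a simplified greedy variant under stated assumptions: a line joins the first existing group that already contains a line it is close to, which is one possible way to apply the pairwise rule. The function names, the 5-degree and 20-pixel thresholds, and the length interval are illustrative only.

```python
import math


def angle_deg(seg):
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def midpoint(seg):
    x1, y1, x2, y2 = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def length(seg):
    x1, y1, x2, y2 = seg
    return math.hypot(x2 - x1, y2 - y1)


def is_close(a, b, max_angle_diff, max_mid_dist):
    """Second threshold on the tilt-angle difference, third on the midpoint distance."""
    diff = abs(angle_deg(a) - angle_deg(b))
    diff = min(diff, 180.0 - diff)  # 179 degrees and 1 degree are only 2 degrees apart
    return diff < max_angle_diff and math.dist(midpoint(a), midpoint(b)) < max_mid_dist


def group_segments(segments, max_angle_diff=5.0, max_mid_dist=20.0):
    groups = []
    for seg in segments:
        for group in groups:
            if any(is_close(seg, other, max_angle_diff, max_mid_dist) for other in group):
                group.append(seg)
                break
        else:
            groups.append([seg])  # no existing group is close enough
    return groups


def drop_length_outliers(group, min_len, max_len):
    """Remove lines whose length lies outside the interval [min_len, max_len]."""
    return [seg for seg in group if min_len <= length(seg) <= max_len]


horizontal = [(0, 0, 100, 0), (0, 4, 100, 6), (0, 60, 100, 60), (40, 2, 60, 2)]
groups = group_segments(horizontal)
cleaned = [drop_length_outliers(g, min_len=50, max_len=150) for g in groups]
```

The same two functions apply unchanged to the vertical lines, since the tests use only angles, midpoints and lengths.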
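206. Select two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines.
In a possible implementation, after the horizontal and vertical groups have been obtained, one horizontal group and one vertical group are taken and the ratio of horizontal line length to vertical line length is calculated. The absolute difference between the ratio of the length of the i-th horizontal line to the length of the j-th vertical line and a fourth threshold is determined, giving the ij-th length difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line is a line in one of the horizontal groups, and the j-th vertical line is a line in one of the vertical groups. That is, the ratio of the length of one horizontal line to the length of each of the two vertical lines is calculated, as is the ratio of the length of each of the two horizontal lines to the length of one vertical line, and the absolute difference between each ratio and the fourth threshold gives the ij-th length difference. The fourth threshold is the aspect ratio of the determined document type and is a known value.
After the length differences have been obtained, the angle differences are determined. The absolute difference between the ratio of the tilt angle of the i-th horizontal line to the tilt angle of the j-th vertical line and a fifth threshold is determined, giving the ij-th angle difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line is a line in one of the horizontal groups, and the j-th vertical line is a line in one of the vertical groups. The fifth threshold is 90 degrees. That is, the four groups are chosen so that the length ratio of their line segments differs from the aspect ratio of the document by less than a sixth threshold and the angle between the horizontal and vertical lines is equal or close to 90 degrees.
If the ij-th length difference is smaller than the sixth threshold, the ij-th angle difference is smaller than a seventh threshold, and the i-th horizontal line intersects the j-th vertical line, the horizontal groups containing the i-th horizontal lines and the vertical groups containing the j-th vertical lines are determined as the four groups of lines. The sixth threshold is a manually chosen tolerance on the deviation from the aspect ratio, and the seventh threshold is the tolerance on the deviation from 90 degrees; both may be preset manually. When each of the four groups contains a line that satisfies the angle difference, length difference and intersection conditions above, these four groups are taken as the four resulting groups, and one line is then selected from each of them to determine the document region.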
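One way to read step 206 is as a search over pairs of horizontal groups and pairs of vertical groups. The sketch below is an interpretation, not the implementation of this application. It assumes that the angle check compares the angle between the two lines with 90 degrees (following the right-angle wording of step 104), and that "intersect" means the extended lines meet inside the image. The function names, tolerances and sample coordinates are hypothetical.

```python
import math
from itertools import combinations


def angle_deg(seg):
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def length(seg):
    x1, y1, x2, y2 = seg
    return math.hypot(x2 - x1, y2 - y1)


def intersection(a, b):
    """Intersection of the infinite lines through a and b (None if parallel)."""
    x1, y1, x2, y2 = a
    x3, y3, x4, y4 = b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    return ((det_a * (x3 - x4) - (x1 - x2) * det_b) / denom,
            (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom)


def compatible(h, v, aspect_ratio, ratio_tol, angle_tol, width, height):
    """Length-difference, angle-difference and intersection checks for one pair."""
    length_diff = abs(length(h) / length(v) - aspect_ratio)
    between = abs(angle_deg(h) - angle_deg(v))
    between = min(between, 180.0 - between)   # angle between the lines, 0..90
    angle_diff = abs(between - 90.0)
    point = intersection(h, v)
    inside = point is not None and 0 <= point[0] <= width and 0 <= point[1] <= height
    return length_diff < ratio_tol and angle_diff < angle_tol and inside


def pick_four_groups(h_groups, v_groups, aspect_ratio, width, height,
                     ratio_tol=0.15, angle_tol=5.0):
    """Return (h1, h2, v1, v2), or None when no document-like combination exists."""
    def groups_match(h_group, v_group):
        return any(compatible(h, v, aspect_ratio, ratio_tol, angle_tol, width, height)
                   for h in h_group for v in v_group)

    for h1, h2 in combinations(h_groups, 2):
        for v1, v2 in combinations(v_groups, 2):
            if all(groups_match(hg, vg) for hg in (h1, h2) for vg in (v1, v2)):
                return h1, h2, v1, v2
    return None


h_groups = [[(0, 200, 60, 200)], [(20, 20, 180, 20)], [(20, 120, 180, 120)]]
v_groups = [[(20, 20, 20, 120)], [(180, 20, 180, 120)]]
result = pick_four_groups(h_groups, v_groups, aspect_ratio=1.585, width=200, height=220)
```

A return value of `None` corresponds to the case described in step 104 in which no four lines delimiting a document can be found, so the image is reported as containing no document.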
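In the embodiments of this application, the acquired image to be processed is binarized to obtain a binary image made up of the edge lines in the image. The lines in this edge image are then identified and grouped by position and type, yielding four groups of lines that contain the document region. One line is selected from each of the four groups, and the resulting four lines delimit the region of the document image, so that the document region is located precisely in the image. In addition, checking whether the lines can be divided into four groups according to the rules makes it possible to detect whether the image contains a document at all.
Refer to Figure 3, which is a schematic structural diagram of an apparatus for locating an ID document region proposed in this application. As shown in Figure 3, the apparatus 3000 includes:
an acquisition unit 301, configured to acquire an image to be processed, the image to be processed containing a document image;
a processing unit 302, configured to binarize the image to be processed to obtain an edge detection image that includes edge lines;
an identification unit 303, configured to identify the lines in the edge detection image;
a grouping unit 304, configured to group the lines according to line type and line position to obtain four groups of lines;
a selection unit 305, configured to select one line from each of the four groups of lines to obtain four lines;
a determination unit 306, configured to determine the region corresponding to the four lines as the region of the document image.
In a possible implementation, the identification unit 303 is specifically configured to:
transform the edge lines into a planar rectangular coordinate system to obtain multiple plane lines;
determine an intersection of the multiple plane lines, the number of plane lines passing through the intersection being greater than a first threshold;
determine, from the coordinates of the intersection, the straight-line expression of the edge line in the pixel coordinate system;
determine the lines in the edge detection image from the straight-line expression.
In a possible implementation, the grouping unit 304 is specifically configured to:
calculate the tilt angle of each line;
determine the lines whose tilt angle lies within a threshold interval as vertical lines, to obtain multiple vertical lines;
determine the lines other than the vertical lines as horizontal lines, to obtain multiple horizontal lines;
group the multiple horizontal lines according to line position to obtain multiple horizontal groups;
group the multiple vertical lines according to line position to obtain multiple vertical groups;
select two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups, to obtain four groups of lines.
In a possible implementation, the grouping unit 304 is specifically configured such that:
grouping the multiple horizontal lines according to line position to obtain multiple horizontal groups includes: when the difference between the tilt angle of a first line and the tilt angle of a second line is smaller than a second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is smaller than a third threshold, determining that the first line and the second line belong to the same horizontal group;
grouping the multiple vertical lines according to line position to obtain multiple vertical groups includes: when the difference between the tilt angle of a third line and the tilt angle of a fourth line is smaller than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is smaller than the third threshold, determining that the third line and the fourth line belong to the same vertical group.
In a possible implementation, the apparatus 3000 further includes:
a removal unit 307, configured to, after the multiple horizontal groups and the multiple vertical groups have been obtained, remove from them the lines whose length lies outside a threshold interval;
the selection unit 305 is specifically configured to:
select two horizontal groups from the horizontal groups remaining after the removal and two vertical groups from the vertical groups remaining after the removal, to obtain four groups of lines.
In a possible implementation, the selection unit 305 is specifically configured to:
determine the absolute difference between a fourth threshold and the ratio of the length of an i-th horizontal line to the length of a j-th vertical line, to obtain an ij-th length difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
determine the absolute difference between a fifth threshold and the ratio of the tilt angle of the i-th horizontal line to the tilt angle of the j-th vertical line, to obtain an ij-th angle difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
when the ij-th length difference is smaller than a sixth threshold, the ij-th angle difference is smaller than a seventh threshold, and the i-th horizontal line intersects the j-th vertical line, obtain the four groups of lines from the horizontal group containing the i-th horizontal line and the vertical group containing the j-th vertical line.
In a possible implementation, the processing unit 302 is specifically configured to:
input the image to be processed into a holistically-nested neural network to obtain an edge detection image that includes edge lines.
It can be understood that, for the specific implementation of the apparatus shown in Figure 3, reference may also be made to the methods shown in Figures 1 and 2, which are not detailed again here.
With the apparatus proposed in the embodiments of this application, the acquired image to be processed is binarized to obtain a binary image made up of the edge lines in the image. The lines in this edge image are then identified and grouped by position and type, yielding four groups of lines that contain the document region. One line is selected from each of the four groups, and the resulting four lines delimit the region of the document image, so that the document region is located precisely in the image. In addition, checking whether the lines can be divided into four groups according to the rules makes it possible to detect whether the image contains a document at all.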
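Refer to Figure 4, which is a schematic structural diagram of an electronic device provided by an embodiment of this application. As shown in Figure 4, the electronic device may include:
one or more processors 401 and a memory 402. The processor 401 and the memory 402 are connected by a bus 403. The memory 402 is configured to store a computer program that includes program instructions, and the processor 401 is configured to execute the program instructions stored in the memory 402, the processor 401 being configured to invoke the program instructions to perform the following steps:
acquiring an image to be processed, the image to be processed containing a document image;
binarizing the image to be processed to obtain an edge detection image that includes edge lines;
identifying the lines in the edge detection image;
grouping the lines according to line type and line position to obtain four groups of lines;
selecting one line from each of the four groups of lines to obtain four lines;
determining the region corresponding to the four lines as the region of the document image.
In a possible implementation, the processor 401 identifying the lines in the edge detection image includes:
transforming the edge lines into a planar rectangular coordinate system to obtain multiple plane lines;
determining an intersection of the multiple plane lines, the number of plane lines passing through the intersection being greater than a first threshold;
determining, from the coordinates of the intersection, the straight-line expression of the edge line in the pixel coordinate system;
determining the lines in the edge detection image from the straight-line expression.
In a possible implementation, the processor 401 grouping the lines according to line type and line position to obtain four groups of lines includes:
calculating the tilt angle of each line;
determining the lines whose tilt angle lies within a threshold interval as vertical lines, to obtain multiple vertical lines;
determining the lines other than the vertical lines as horizontal lines, to obtain multiple horizontal lines;
grouping the multiple horizontal lines according to line position to obtain multiple horizontal groups;
grouping the multiple vertical lines according to line position to obtain multiple vertical groups;
selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups, to obtain four groups of lines.
In a possible implementation, the processor 401 grouping the multiple horizontal lines according to line position to obtain multiple horizontal groups includes: when the difference between the tilt angle of a first line and the tilt angle of a second line is smaller than a second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is smaller than a third threshold, determining that the first line and the second line belong to the same horizontal group;
grouping the multiple vertical lines according to line position to obtain multiple vertical groups includes: when the difference between the tilt angle of a third line and the tilt angle of a fourth line is smaller than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is smaller than the third threshold, determining that the third line and the fourth line belong to the same vertical group.
In a possible implementation, the processor 401 is further invoked to perform the following steps:
after the multiple horizontal groups and the multiple vertical groups have been obtained, removing from them the lines whose length lies outside a threshold interval;
selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines includes: selecting two horizontal groups from the horizontal groups remaining after the removal and two vertical groups from the vertical groups remaining after the removal, to obtain four groups of lines.
In a possible implementation, the processor 401 selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines includes:
determining the absolute difference between a fourth threshold and the ratio of the length of an i-th horizontal line to the length of a j-th vertical line, to obtain an ij-th length difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
determining the absolute difference between a fifth threshold and the ratio of the tilt angle of the i-th horizontal line to the tilt angle of the j-th vertical line, to obtain an ij-th angle difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
when the ij-th length difference is smaller than a sixth threshold, the ij-th angle difference is smaller than a seventh threshold, and the i-th horizontal line intersects the j-th vertical line, obtaining the four groups of lines from the horizontal group containing the i-th horizontal line and the vertical group containing the j-th vertical line.
In a possible implementation, the processor 401 binarizing the image to be processed to obtain an edge detection image that includes edge lines includes:
inputting the image to be processed into a holistically-nested neural network to obtain an edge detection image that includes edge lines.
It should be understood that, in some feasible implementations, the processor 401 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 402 may include read-only memory and random access memory, and provides instructions and data to the processor 401. Part of the memory 402 may also include non-volatile random access memory. For example, the memory 402 may also store information about the device type.
In a specific implementation, the electronic device can execute, through its built-in functional modules, the implementations provided by the steps in Figures 1 to 2; for details, refer to the implementations provided by those steps, which are not repeated here.
In the embodiments of this application, the acquired image to be processed is binarized to obtain a binary image made up of the edge lines in the image. The lines in this edge image are then identified and grouped by position and type, yielding four groups of lines that contain the document region. One line is selected from each of the four groups, and the resulting four lines delimit the region of the document image, so that the document region is located precisely in the image. In addition, checking whether the lines can be divided into four groups according to the rules makes it possible to detect whether the image contains a document at all.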
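An embodiment of this application further provides a computer-readable storage medium storing a computer program. The computer program includes program instructions which, when executed by a processor, implement the methods provided by the steps in Figures 1 to 3; for details, refer to the implementations provided by those steps, which are not repeated here. The computer-readable storage medium may be non-volatile or volatile.
The computer-readable storage medium may be an internal storage unit of the apparatus provided in any of the foregoing embodiments or of the electronic device, such as a hard disk or memory of the electronic device. It may also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the electronic device. The computer-readable storage medium may also include a magnetic disk, an optical disc, read-only memory (ROM), random access memory (RAM), and the like. Further, it may include both an internal storage unit and an external storage device of the electronic device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is about to be output.
The terms "first", "second" and so on in the claims, description and drawings of this application are used to distinguish different objects, not to describe a particular order. The terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product or device. A reference to an "embodiment" herein means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places in the description does not necessarily refer to the same embodiment, nor to a separate or alternative embodiment that is mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments. The term "and/or" used in the description and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
The above are merely specific implementations of this application, but the scope of protection of this application is not limited to them. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application. The scope of protection of this application shall therefore be subject to the scope of protection of the claims.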

Claims (20)

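  1. A method for locating an ID document region, comprising:
    acquiring an image to be processed, the image to be processed containing a document image;
    binarizing the image to be processed to obtain an edge detection image that includes edge lines;
    identifying the lines in the edge detection image;
    grouping the lines according to line type and line position to obtain four groups of lines;
    selecting one line from each of the four groups of lines to obtain four lines;
    determining the region corresponding to the four lines as the region of the document image.
  2. The method according to claim 1, wherein identifying the lines in the edge detection image comprises:
    transforming the edge lines into a planar rectangular coordinate system to obtain multiple plane lines;
    determining an intersection of the multiple plane lines, the number of plane lines passing through the intersection being greater than a first threshold;
    determining, from the coordinates of the intersection, the straight-line expression of the edge line in the pixel coordinate system;
    determining the lines in the edge detection image from the straight-line expression.
  3. The method according to claim 1, wherein grouping the lines according to line type and line position to obtain four groups of lines comprises:
    calculating the tilt angle of each line;
    determining the lines whose tilt angle lies within a threshold interval as vertical lines, to obtain multiple vertical lines;
    determining the lines other than the vertical lines as horizontal lines, to obtain multiple horizontal lines;
    grouping the multiple horizontal lines according to line position to obtain multiple horizontal groups;
    grouping the multiple vertical lines according to line position to obtain multiple vertical groups;
    selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups, to obtain four groups of lines.
  4. The method according to claim 3, wherein:
    grouping the multiple horizontal lines according to line position to obtain multiple horizontal groups comprises: when the difference between the tilt angle of a first line and the tilt angle of a second line is smaller than a second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is smaller than a third threshold, determining that the first line and the second line belong to the same horizontal group;
    grouping the multiple vertical lines according to line position to obtain multiple vertical groups comprises: when the difference between the tilt angle of a third line and the tilt angle of a fourth line is smaller than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is smaller than the third threshold, determining that the third line and the fourth line belong to the same vertical group.
  5. The method according to claim 4, wherein:
    after the multiple horizontal groups and the multiple vertical groups are obtained, the method further comprises removing, from the multiple horizontal groups and the multiple vertical groups, the lines whose length lies outside a threshold interval;
    selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines comprises: selecting two horizontal groups from the horizontal groups remaining after the removal and two vertical groups from the vertical groups remaining after the removal, to obtain four groups of lines.
  6. The method according to claim 3, wherein selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines comprises:
    determining the absolute difference between a fourth threshold and the ratio of the length of an i-th horizontal line to the length of a j-th vertical line, to obtain an ij-th length difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
    determining the absolute difference between a fifth threshold and the ratio of the tilt angle of the i-th horizontal line to the tilt angle of the j-th vertical line, to obtain an ij-th angle difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
    when the ij-th length difference is smaller than a sixth threshold, the ij-th angle difference is smaller than a seventh threshold, and the i-th horizontal line intersects the j-th vertical line, obtaining the four groups of lines from the horizontal group containing the i-th horizontal line and the vertical group containing the j-th vertical line.
  7. The method according to any one of claims 1 to 6, wherein binarizing the image to be processed to obtain an edge detection image that includes edge lines comprises:
    inputting the image to be processed into a holistically-nested neural network to obtain an edge detection image that includes edge lines.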
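  8. An apparatus for locating an ID document region, comprising:
    an acquisition unit, configured to acquire an image to be processed, the image to be processed containing a document image;
    a processing unit, configured to binarize the image to be processed to obtain an edge detection image that includes edge lines;
    an identification unit, configured to identify the lines in the edge detection image;
    a grouping unit, configured to group the lines according to line type and line position to obtain four groups of lines;
    a selection unit, configured to select one line from each of the four groups of lines to obtain four lines;
    a determination unit, configured to determine the region corresponding to the four lines as the region of the document image.
  9. An electronic device, comprising a processor, a memory and a bus, the processor and the memory being connected to each other by the bus, wherein the memory is configured to store a computer program, the computer program includes program instructions, and the processor is configured to invoke the program instructions to perform the following steps:
    acquiring an image to be processed, the image to be processed containing a document image;
    binarizing the image to be processed to obtain an edge detection image that includes edge lines;
    identifying the lines in the edge detection image;
    grouping the lines according to line type and line position to obtain four groups of lines;
    selecting one line from each of the four groups of lines to obtain four lines;
    determining the region corresponding to the four lines as the region of the document image.
  10. The electronic device according to claim 9, wherein the processor is configured to:
    transform the edge lines into a planar rectangular coordinate system to obtain multiple plane lines;
    determine an intersection of the multiple plane lines, the number of plane lines passing through the intersection being greater than a first threshold;
    determine, from the coordinates of the intersection, the straight-line expression of the edge line in the pixel coordinate system;
    determine the lines in the edge detection image from the straight-line expression.
  11. The electronic device according to claim 9, wherein the processor is configured to:
    calculate the tilt angle of each line;
    determine the lines whose tilt angle lies within a threshold interval as vertical lines, to obtain multiple vertical lines;
    determine the lines other than the vertical lines as horizontal lines, to obtain multiple horizontal lines;
    group the multiple horizontal lines according to line position to obtain multiple horizontal groups;
    group the multiple vertical lines according to line position to obtain multiple vertical groups;
    select two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups, to obtain four groups of lines.
  12. The electronic device according to claim 11, wherein the processor is configured to:
    when the difference between the tilt angle of a first line and the tilt angle of a second line is smaller than a second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is smaller than a third threshold, determine that the first line and the second line belong to the same horizontal group;
    grouping the multiple vertical lines according to line position to obtain multiple vertical groups comprises: when the difference between the tilt angle of a third line and the tilt angle of a fourth line is smaller than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is smaller than the third threshold, determining that the third line and the fourth line belong to the same vertical group.
  13. The electronic device according to claim 12, wherein the processor is configured to:
    remove, from the multiple horizontal groups and the multiple vertical groups, the lines whose length lies outside a threshold interval;
    selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines comprises: selecting two horizontal groups from the horizontal groups remaining after the removal and two vertical groups from the vertical groups remaining after the removal, to obtain four groups of lines.
  14. The electronic device according to claim 11, wherein the processor is configured to:
    determine the absolute difference between a fourth threshold and the ratio of the length of an i-th horizontal line to the length of a j-th vertical line, to obtain an ij-th length difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
    determine the absolute difference between a fifth threshold and the ratio of the tilt angle of the i-th horizontal line to the tilt angle of the j-th vertical line, to obtain an ij-th angle difference, where i = 1, 2 and j = 1, 2, the i-th horizontal line being a line in one of the multiple horizontal groups and the j-th vertical line being a line in one of the multiple vertical groups;
    when the ij-th length difference is smaller than a sixth threshold, the ij-th angle difference is smaller than a seventh threshold, and the i-th horizontal line intersects the j-th vertical line, obtain the four groups of lines from the horizontal group containing the i-th horizontal line and the vertical group containing the j-th vertical line.
  15. The electronic device according to any one of claims 9 to 14, wherein the processor is configured to:
    input the image to be processed into a holistically-nested neural network to obtain an edge detection image that includes edge lines.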
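  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, implement the following steps:
    acquiring an image to be processed, the image to be processed containing a document image;
    binarizing the image to be processed to obtain an edge detection image that includes edge lines;
    identifying the lines in the edge detection image;
    grouping the lines according to line type and line position to obtain four groups of lines;
    selecting one line from each of the four groups of lines to obtain four lines;
    determining the region corresponding to the four lines as the region of the document image.
  17. The computer-readable storage medium according to claim 16, wherein the program instructions, when executed by the processor, further implement the following steps:
    transforming the edge lines into a planar rectangular coordinate system to obtain multiple plane lines;
    determining an intersection of the multiple plane lines, the number of plane lines passing through the intersection being greater than a first threshold;
    determining, from the coordinates of the intersection, the straight-line expression of the edge line in the pixel coordinate system;
    determining the lines in the edge detection image from the straight-line expression.
  18. The computer-readable storage medium according to claim 16, wherein the program instructions, when executed by the processor, further implement the following steps:
    calculating the tilt angle of each line;
    determining the lines whose tilt angle lies within a threshold interval as vertical lines, to obtain multiple vertical lines;
    determining the lines other than the vertical lines as horizontal lines, to obtain multiple horizontal lines;
    grouping the multiple horizontal lines according to line position to obtain multiple horizontal groups;
    grouping the multiple vertical lines according to line position to obtain multiple vertical groups;
    selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups, to obtain four groups of lines.
  19. The computer-readable storage medium according to claim 18, wherein the program instructions, when executed by the processor, further implement the following steps:
    when the difference between the tilt angle of a first line and the tilt angle of a second line is smaller than a second threshold, and the distance between the midpoint of the first line and the midpoint of the second line is smaller than a third threshold, determining that the first line and the second line belong to the same horizontal group;
    grouping the multiple vertical lines according to line position to obtain multiple vertical groups comprises: when the difference between the tilt angle of a third line and the tilt angle of a fourth line is smaller than the second threshold, and the distance between the midpoint of the third line and the midpoint of the fourth line is smaller than the third threshold, determining that the third line and the fourth line belong to the same vertical group.
  20. The computer-readable storage medium according to claim 19, wherein the program instructions, when executed by the processor, further implement the following steps:
    removing, from the multiple horizontal groups and the multiple vertical groups, the lines whose length lies outside a threshold interval;
    selecting two horizontal groups from the multiple horizontal groups and two vertical groups from the multiple vertical groups to obtain four groups of lines comprises: selecting two horizontal groups from the horizontal groups remaining after the removal and two vertical groups from the vertical groups remaining after the removal, to obtain four groups of lines.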
PCT/CN2020/099260 2019-09-18 2020-06-30 Method and apparatus for locating an ID document region WO2021051939A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910880743.5A CN110738204B (zh) 2019-09-18 2019-09-18 Method and apparatus for locating an ID document region
CN201910880743.5 2019-09-18

Publications (1)

Publication Number Publication Date
WO2021051939A1 true WO2021051939A1 (zh) 2021-03-25

Family

ID=69268086

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/099260 WO2021051939A1 (zh) 2019-09-18 2020-06-30 Method and apparatus for locating an ID document region

Country Status (2)

Country Link
CN (1) CN110738204B (zh)
WO (1) WO2021051939A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
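CN110738204B (zh) * 2019-09-18 2023-07-25 平安科技(深圳)有限公司 Method and apparatus for locating an ID document region
CN113012060A (zh) * 2021-02-07 2021-06-22 深圳柔果信息科技有限公司 Image processing method, image processing system and electronic device
CN113139399B (zh) * 2021-05-13 2024-04-12 阳光电源股份有限公司 Image wireframe recognition method and server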

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104217444A (zh) * 2013-06-03 2014-12-17 支付宝(中国)网络技术有限公司 Method and device for locating a card region
CN105550633A (zh) * 2015-10-30 2016-05-04 小米科技有限责任公司 Region recognition method and apparatus
CN109711415A (zh) * 2018-11-13 2019-05-03 平安科技(深圳)有限公司 Document contour determination method, apparatus, storage medium and server
US10325374B1 (en) * 2016-07-06 2019-06-18 Morphotrust Usa, Llc System and method for segmenting ID cards with curved corners
CN110738204A (zh) * 2019-09-18 2020-01-31 平安科技(深圳)有限公司 Method and apparatus for locating an ID document region

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729645A (zh) * 2013-12-20 2014-04-16 湖北微模式科技发展有限公司 Method and apparatus for locating and extracting the second-generation ID card region based on a monocular camera
SG11201811691RA (en) * 2017-06-30 2019-01-30 Beijing Didi Infinity Technology & Development Co Ltd Systems and methods for verifying authenticity of id photo

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI ZHI-JIE, YUAN PENG-TAI, LI BO-HAN: "A Segmentation Algorithm of Identification Card Characters of Mobile Phone Based on Perspective Transform", JISUANJI JISHU YU FAZHAN - COMPUTER TECHNOLOGY AND DEVELOPMENT, JISUANJI JISHU YU FAZHAN BIANJIBU - SHAANXI COMPUTER SOCIETY, CN, vol. 28, no. 7, 1 July 2018 (2018-07-01), CN, pages 58 - 62, XP055793058, ISSN: 1673-629X, DOI: 10.3969/j.issn.1673-629X.2018.07.013 *

Also Published As

Publication number Publication date
CN110738204B (zh) 2023-07-25
CN110738204A (zh) 2020-01-31

Similar Documents

Publication Publication Date Title
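WO2021051939A1 (zh) Method and apparatus for locating an ID document region
WO2019169532A1 (zh) License plate recognition method and cloud system
CN109345553B (zh) Palm and palm key point detection method, apparatus and terminal device
CN109086734B (zh) Method and apparatus for locating the pupil image in a human eye image
WO2022089124A1 (zh) Method and apparatus for identifying the authenticity of certificates and licenses, computer-readable medium and electronic device
CN111242124B (zh) Certificate classification method, apparatus and device
CN109086753B (zh) Traffic sign recognition method and apparatus based on a dual-channel convolutional neural network
CN108197644A (zh) Image recognition method and apparatus
CN110852311A (zh) Three-dimensional human hand key point locating method and apparatus
CN110503682B (zh) Rectangular control recognition method, apparatus, terminal and storage medium
CN110502694B (zh) Lawyer recommendation method based on big data analysis and related device
CN109447080B (zh) Character recognition method and apparatus
WO2021042562A1 (zh) Handwritten-signature-based user identification method, apparatus and terminal device
WO2017161636A1 (zh) Fingerprint-based terminal payment method and apparatus
WO2020224296A1 (zh) Certificate verification and identity verification method, apparatus and device
CN113627428A (zh) Document image correction method, apparatus, storage medium and smart terminal device
CN113111880A (zh) Certificate image correction method, apparatus, electronic device and storage medium
CN110443184A (zh) Identity card information extraction method, apparatus and computer storage medium
Kumar et al. Salient keypoint-based copy–move image forgery detection
CN108960246B (zh) Binarization processing apparatus and method for image recognition
CN114511857A (zh) OCR recognition result processing method, apparatus, device and storage medium
CN112396060B (zh) Identity card recognition method based on an identity card segmentation model and related device
CN108090728B (zh) Express delivery information entry method and entry system based on a smart terminal
CN113486715A (zh) Image recapture recognition method, smart terminal and computer storage medium
WO2020244076A1 (zh) Face recognition method, apparatus, electronic device and storage medium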

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 20864385; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 20864385; Country of ref document: EP; Kind code of ref document: A1