CN113962906A - Identity card image correction method and system for multi-task detection

Info

Publication number: CN113962906A
Application number: CN202111470874.XA
Authority: CN
Inventors: 王博
Original assignee: Sichuan XW Bank Co Ltd
Current assignee: Sichuan XW Bank Co Ltd
Priority date: 2021-12-03
Filing date: 2021-12-03
Publication date: 2022-01-21
Anticipated expiration: 2041-12-03
Also published as: CN113962906B

Abstract

The invention discloses an identity card image correction method and system based on multi-task detection. An identity card image to be corrected is input into a trained identity card multi-task detection model, which extracts a deep feature map of the image. Certificate corner point combination coordinates and target position coordinates are regressed from the deep feature map, and the category of the target image is judged at the same time. The orientation of the identity card image to be corrected is judged from the certificate corner point combination coordinates, the target position coordinates and the category of the target image. Perspective transformation is then applied according to that orientation and the certificate corner point combination coordinates to obtain the corrected identity card image. The double-branch structure of the multi-task detection model performs certificate key point detection and target detection of the face and the national emblem simultaneously, and combining these results with the prior of the identity card layout achieves perspective correction of the identity card image and effectively improves the correction effect and efficiency.

Description

Identity card image correction method and system for multi-task detection

Technical Field

The invention relates to the technical field of identity card image processing, in particular to an identity card image correction method and system based on multi-task detection.

Background

With the development of the internet and big data technology, customers must complete real-name identity card authentication when handling various kinds of business with an online bank. In addition, steps such as account opening and credit granting require an uploaded identity card copy as auxiliary auditing material, and all certificate materials are audited automatically and their information entered through an identity card OCR service.

The basic processing flow of the identity card OCR service comprises 3 steps: identity card position detection, identity card area correction, and identity card character content recognition. The first 2 steps can be regarded as the preprocessing flow of identity card OCR.

Two preprocessing approaches are common for identity card OCR. In the first, an identity card detection model locates the identity card target, and a character orientation model then flips the target area among four directions (up, down, left and right) to correct the direction of the identity card. The border lines and four corner coordinates of the identity card are then extracted by traditional vision techniques such as the Hough transform, and the identified corner coordinates are used for perspective correction to obtain an upright, regular identity card image. This approach effectively reduces the difficulty of information extraction in the subsequent identity card OCR, but in scenes such as photocopies and dim light, methods such as the Hough transform are not robust and cannot extract valid border lines and corner points.

In the second, an identity card segmentation model segments the identity card target area, a character orientation model classifies the target area into one of four directions (up, down, left and right) to obtain direction information, and the direction and the segmented area are used together to perform perspective correction and obtain an upright, regular identity card image. This alleviates the difficulty of correcting photocopies and dim-light scenes to some extent, but segmentation models are usually computationally expensive and correction takes a long time.

In the prior art, a character orientation recognition model is also used to determine which of 4 orientations the original image has and to rotate it accordingly, after which a certificate detection model locates the identity card area to correct the identity card image. Although this corrects the identity card image to some extent, photographs taken with mobile phones contain many small-angle rotations, and perspective transformation is required to correct such images.

Disclosure of Invention

The technical problem to be solved by the invention is how to improve the efficiency of correcting an identity card image, and the invention aims to provide an identity card image correction method and system based on multi-task detection.

The invention is realized by the following technical scheme:

in one aspect, the invention provides an identity card image correction method for multitask detection, which comprises the following steps:

inputting an identity card image to be corrected into a trained identity card multi-task detection model, and extracting deep feature maps of the image key points and of the target image of the identity card image to be corrected, respectively;

regressing certificate corner point combination coordinates from the deep feature map of the image key points; meanwhile, regressing target position coordinates from the deep feature map of the target image and judging the category of the target image;

judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinates, the target position coordinates and the class to which the target image belongs;

and carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.

Before an identity card is recognized, the collected identity card image can be corrected to improve recognition accuracy. Conventionally, a character orientation recognition model identifies the orientation of the original image, and the image is corrected according to that orientation and a perspective analysis. Identifying the orientation of the original image in this way involves a great deal of redundancy, consumes substantial computing resources and reduces the processing efficiency of the whole recognition flow. The character orientation model is also not robust, and errors occur easily on identity card images captured in complex environments. The multi-task detection model of the invention therefore outputs two branches: a certificate key point detection branch and a target detection branch for the face and the national emblem. Processing the tasks in multiple branches improves detection efficiency and reduces the amount of computation. The certificate key point detection branch locates the four corner points of the identity card, and the face and national emblem target detection branch locates the face and the national emblem within the identity card. Combining the prior of the identity card layout with the position of the face or national emblem relative to the certificate corner points allows the orientation of the identity card in the image to be judged and perspective correction of the certificate to be carried out, so that a regular, upright certificate image can be cropped out afterwards. During character recognition, the content of the identity card can be recognized directly from the key point coordinates and the target positions, without keyword matching. On the one hand this reduces the loss of recognition accuracy caused by blurring, small-angle tilt and the like; on the other hand, omitting the matching step also improves the efficiency of identity card information recognition.

Further, the identity card multi-task detection model adopts a YOLOv5s structure, and each network head of the YOLOv5s structure has its last convolution layer removed.

Further, the specific process of extracting the deep feature map of the image key points by using the YOLOv5s structure is as follows:

carrying out scaling processing on the identity card image to be corrected to obtain an image to be detected;

processing the image to be detected by using each network head of the YOLOv5s structure, and outputting the deep feature map corresponding to each network head;

and extracting a key point heatmap from each deep feature map by using the key point feature convolution group, and regressing the certificate corner point combination coordinates from the key point heatmaps.

Further, the deep feature map is processed by a key point feature convolution group with an hourglass structure, and a CBAM attention mechanism module enhances the line segment area features of the deep feature map, so that the key point heatmap is obtained.

Further, the specific process of regressing the certificate corner point combination coordinates comprises the following steps:

screening line segment peaks and heat peak points from the key point heatmap to serve as candidate corner points;

combining every 4 candidate corner points into 1 candidate corner combination to obtain a plurality of candidate corner combinations, wherein each candidate corner combination forms a quadrangle;

calculating the internal angles of the quadrangle formed by each candidate corner combination;

screening out the candidate corner combinations in which an internal angle of the quadrangle is larger than 135 degrees, and collecting the remaining candidate corner combinations into a candidate combination set;

calculating the overlapping area between any two candidate corner combinations in the candidate combination set, deleting the one with lower confidence when the overlapping area is greater than 0.5, repeating this process until only two candidate corner combinations remain, and outputting the one with higher confidence of those two;

and scaling the output candidate corner combination to obtain the certificate corner point combination coordinates.
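The combination and internal-angle screening steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the candidate points are assumed to be distinct, and the points of each combination are put into ring order by sorting them around their centroid, a detail the text does not specify. The angle is measured between the two edges meeting at a vertex, so it is at most 180 degrees.

```python
import math
from itertools import combinations


def ring_order(points):
    """Sort points by angle around their centroid so neighbours share an edge."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))


def interior_angles(quad):
    """Angles in degrees between the two edges meeting at each of the 4 vertices."""
    angles = []
    for i in range(4):
        prev_pt, pt, next_pt = quad[i - 1], quad[i], quad[(i + 1) % 4]
        v1 = (prev_pt[0] - pt[0], prev_pt[1] - pt[1])
        v2 = (next_pt[0] - pt[0], next_pt[1] - pt[1])
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        # Clamp to [-1, 1] to guard against floating-point drift before acos.
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles


def candidate_quads(points, max_angle=135.0):
    """Build every 4-point combination and drop those with an angle above max_angle."""
    kept = []
    for combo in combinations(points, 4):
        quad = ring_order(combo)
        if max(interior_angles(quad)) <= max_angle:
            kept.append(quad)
    return kept


rectangle = [(0, 0), (100, 0), (100, 60), (0, 60)]
print(len(candidate_quads(rectangle)))  # 1
```

A combination containing a nearly collinear vertex, such as `[(0, 0), (50, 1), (100, 0), (50, 60)]`, is rejected because the angle at `(50, 1)` is close to 180 degrees.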

Further, the specific process of extracting the deep feature map of the target image by using the YOLOv5s structure is as follows:

processing the image to be detected by using each network head of the YOLOv5s structure, and outputting the deep feature map of the target image at each network head;

processing the deep feature map of the target image by using a target detection feature convolution group to obtain a target feature map;

regressing the center point coordinates and the width and height of the target position from the target feature map; and meanwhile, judging the category of the target image according to the target feature map, wherein the target image comprises a human face target and a national emblem target.

Further, the specific process of regressing the target position coordinates is as follows:

screening a salient region from the target feature map as a target region;

calculating the overlapping area between any two different target regions, and deleting the target region with lower confidence when the overlapping area is greater than 0.5;

calculating the center point and the width and height of the target region on the target feature map as the target position;

and scaling the target position to obtain the target position coordinates.
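The overlap-based deletion and the scaling of the target position can be sketched as below. The sketch assumes that the "overlapping area greater than 0.5" criterion is an intersection-over-union ratio, which the text does not state explicitly, and that boxes are stored as (center x, center y, width, height) on the feature map. The function names and the detection tuple layout are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(detections, threshold=0.5):
    """Keep the higher-confidence box of every pair that overlaps above threshold.

    detections: list of (box, confidence, label) tuples.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(det[0], k[0]) <= threshold for k in kept):
            kept.append(det)
    return kept


def scale_box(box, factor):
    """Map a feature-map box back to input-image coordinates."""
    return tuple(v * factor for v in box)


detections = [
    ((10, 10, 4, 4), 0.9, "face"),
    ((10.5, 10, 4, 4), 0.6, "face"),    # overlaps the first box, lower confidence
    ((30, 30, 4, 4), 0.8, "emblem"),
]
for box, conf, label in suppress_overlaps(detections):
    print(label, conf, scale_box(box, 8))
```

Sorting by confidence first means each box only has to be compared against boxes already kept, which is the usual greedy form of this kind of suppression.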

Further, the specific process of judging the orientation of the identity card is as follows:

judging the front side and the back side of the identity card image to be corrected according to the category of the target image, wherein when the target image is a face target, the identity card image to be corrected is the front side, and when the target image is a national emblem target, the identity card image to be corrected is the back side;

calculating and comparing the distance between the center point of the target position coordinate and each corner point in the certificate corner point combination coordinate, and screening out the corner point closest to the target image to obtain the position of the target image in the identity card image to be corrected;

and obtaining the orientation of the identity card image to be corrected according to the front side and the back side of the identity card image to be corrected and the position of the target image in the identity card image to be corrected.

In another aspect, the present invention provides a system for correcting an image of an identification card with multitask detection, including:

the deep feature extraction module is used for extracting deep feature maps of the image key points and of the target image of the identity card image to be corrected, respectively, through the shared-feature detection network in the identity card multi-task detection model;

the key point regression branch module is used for processing the deep feature map of the image key points to obtain certificate corner point combination coordinates;

the target detection branch module is used for processing the deep feature map of the target image to obtain target position coordinates and judging the category of the target image;

the orientation judging module is used for judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinate, the target position coordinate and the class to which the target image belongs;

and the correction module is used for carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.

Compared with the prior art, the invention has the following advantages and beneficial effects:

According to the identity card image correction method and system based on multi-task detection, the double-branch structure of the multi-task detection model simultaneously detects the key points of the identity card and the target images of the face and the national emblem. The prior of the identity card layout and the positions of the identity card corner points are combined to judge the orientation of the identity card image, which is then corrected by perspective transformation, effectively improving the correction effect and efficiency. Character information of the identity card can be recognized directly from the corrected image, the keyword matching process is omitted, and the accuracy of the whole identity card information recognition is improved.

Drawings

In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and that for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort. In the drawings:

FIG. 1 is a flow chart of a method for correcting an image of an identification card with multitasking detection according to the present invention;

FIG. 2 is a schematic structural diagram of a multi-task ID card detection model according to an embodiment of the present invention;

FIG. 3 is a network architecture for processing deep feature maps in accordance with one embodiment of the present invention;

FIG. 4 is a block diagram of a multitasking ID card image rectification system according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.

Example 1

As shown in fig. 1, the identity card image rectification method for multitask detection in this embodiment includes the following steps:

Step 1, inputting an identity card image to be corrected into a trained identity card multi-task detection model, and extracting deep feature maps of the image key points and of the target image of the identity card image to be corrected, respectively;

Specifically, as shown in fig. 2, the identity card multi-task detection model adopts a YOLOv5s structure, which is divided into a shared-feature detection network and two detection branches: a key point regression branch and a target detection branch. The shared-feature detection network extracts the deep feature maps of the image key points and of the target image of the identity card image to be corrected; the key point regression branch then regresses the certificate corner point coordinates, and the target detection branch regresses the position coordinates of the face and national emblem target images. In this embodiment, each network head of the YOLOv5s structure has its last convolution layer removed, and the size of the image input to each network head is (608, 608, 3). The identity card image to be corrected therefore needs to be scaled, in proportion to its width and height, so that its long edge is 608 pixels; the scaled image can then be processed by the convolutional neural network to extract the deep feature maps. The sizes of the deep feature maps output by the 3 network heads are (76, 76, 255), (38, 38, 255) and (19, 19, 255), respectively.
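The arithmetic behind the scaling step and the three feature map sizes can be checked with a short sketch. The function names are hypothetical, and the down-sampling factors 8, 16 and 32 are taken from the magnifications used later in this embodiment. How the shorter side is padded to fill the network input is not described in the text and is left out here.

```python
def scale_to_long_edge(width, height, long_edge=608):
    """Scale (width, height) proportionally so the longer side equals long_edge."""
    ratio = long_edge / max(width, height)
    return max(1, round(width * ratio)), max(1, round(height * ratio)), ratio


def head_grid_sizes(input_size=608, strides=(8, 16, 32)):
    """Side length of the feature map at each network head for a square input."""
    return [input_size // s for s in strides]


new_w, new_h, ratio = scale_to_long_edge(1920, 1080)
print(new_w, new_h)        # 608 342
print(head_grid_sizes())   # [76, 38, 19]
```

The returned ratio is kept so that coordinates found on the scaled image can later be mapped back to the original photograph by dividing by it.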

Step 2, in the key point regression branch, regressing the certificate corner point combination coordinates from the deep feature map of the image key points; meanwhile, in the target detection branch, regressing the target position coordinates from the deep feature map of the target image and judging the category of the target image;

step 2.1, in the key point regression branch, extracting at each network head a key point heatmap from the deep feature map of the image key points, and regressing the key point positions from the heatmap, wherein the specific process is as follows:

A key point feature convolution group extracts the key point heatmap from the deep feature map of the image key points, and the certificate corner point combination coordinates are regressed from the key point heatmap. Optionally, as shown in fig. 3, the deep feature map is processed by a key point feature convolution group with an hourglass structure comprising three convolution layers, and at the second convolution layer a CBAM attention mechanism module enhances the line segment region features of the deep feature map to obtain the key point heatmap.

The specific process of regressing the certificate corner point combination coordinates from the obtained key point heatmap is as follows:

step 2.1.1, screening line segment peaks and heat peak points from the key point heatmap to serve as candidate corner points;

step 2.1.2, because the identity card is a quadrangle formed by 4 corner points, every 4 candidate corner points are combined into 1 candidate corner combination to obtain a plurality of candidate corner combinations, and the candidate corner points in each combination are connected to form a quadrangle;

step 2.1.3, calculating the four internal angles of the quadrangle formed by each candidate corner combination;

step 2.1.4, screening out any candidate corner combination with an internal angle of the quadrangle larger than 135 degrees, and collecting the remaining candidate corner combinations into a candidate combination set;

step 2.1.5, calculating the overlapping area between any two different candidate corner combinations in the candidate combination set, and deleting the candidate corner combination with lower confidence when the overlapping area is greater than 0.5;

step 2.1.6, repeating step 2.1.5 until only two candidate corner combinations remain, and outputting the one with higher confidence; according to the scaling in step 1, the output combination is magnified by 8, 16 and 32 times, for the respective network heads, to give the certificate corner point combination coordinates on the input image (the image scaled to long side 608 in step 1).
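Step 2.1.1 and the magnification at the end of step 2.1.6 can be illustrated with a small NumPy sketch. It covers only the heat peak points (the "thermal peak points" of the heatmap, also rendered as "thermodynamic diagram"); line segment peaks are not modelled. The 3x3 neighbourhood, the 0.5 score threshold and the function names are assumptions made for illustration.

```python
import numpy as np


def heatmap_peaks(heatmap, threshold=0.5):
    """Return (x, y, score) for cells that are maxima of their 3x3 neighbourhood."""
    heatmap = np.asarray(heatmap, dtype=float)
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the 9 shifted views so the max over axis 0 is the neighbourhood max.
    neighbours = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    is_peak = (heatmap >= neighbours.max(axis=0)) & (heatmap > threshold)
    ys, xs = np.nonzero(is_peak)
    return [(int(x), int(y), float(heatmap[y, x])) for y, x in zip(ys, xs)]


def to_input_coords(peaks, stride):
    """Magnify feature-map positions by the head's stride (8, 16 or 32)."""
    return [(x * stride, y * stride) for x, y, _ in peaks]


hm = np.zeros((19, 19))
hm[2, 3] = 0.9
hm[10, 15] = 0.8
peaks = heatmap_peaks(hm)
print(to_input_coords(peaks, 32))  # [(96, 64), (480, 320)]
```

Cells of equal value on a plateau are all reported as peaks in this sketch; a production implementation would typically refine or merge them.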

step 2.2, in the target detection branch, extracting at each network head a target feature map from the deep feature map of the target image, regressing the target position coordinates from the target feature map, and judging the category of the target; specifically, because the images within the region need to be identified in the target detection branch, a target detection feature convolution group processes the deep feature map of the target image to obtain the target feature map;

The specific process of regressing the coordinates of the center point and the width and height of the target position from the target feature map of the target image is as follows:

step 2.2.1, screening out a salient region from the target feature map as a target region;

step 2.2.2, calculating the overlapping area between any two different target regions, and deleting the target region with lower confidence when the overlapping area is greater than 0.5;

step 2.2.3, calculating the center point and the width and height of the target region on the feature map as the target position;

step 2.2.4, according to the scaling in step 1, magnifying the target position by 8, 16 and 32 times, respectively, to give the target position coordinates on the input image (the image scaled to long side 608 in step 1).

Step 3, judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinate, the target position coordinate and the class of the target image;

step 3.1, judging the front side and the back side of the identity card image to be corrected according to the category of the target image, wherein when the target image is a human face target, the identity card image to be corrected is the front side, and when the target image is a national emblem target, the identity card image to be corrected is the back side;

step 3.2, calculating and comparing the distance between the center point of the target position coordinate and each corner point in the certificate corner point combination coordinate, and screening out the corner point closest to the target image to obtain the position of the target image in the identity card image to be corrected;

and 3.3, obtaining the orientation of the identity card image to be corrected according to the front side and the back side of the identity card image to be corrected and the position of the target image in the identity card image to be corrected.

Since step 3.1 has determined from the target image whether the identity card image to be corrected is the front side or the back side, the corner point closest to the face target can be defined as the upper right corner point of the front side of the identity card, and the corner point closest to the national emblem target as the upper left corner point of the back side. Based on this prior definition, the position of the target image in the identity card image to be corrected (upper left, lower left, upper right or lower right) is obtained from the corner point in the certificate corner point combination coordinates that is closest to the target image.
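The orientation judgment of step 3 can be sketched as follows. The sketch assumes the four detected corner points are given in clockwise order as seen in the image and that the card is not mirrored; the class labels "face" and "emblem", the function name and the canonical slot order (upper left, upper right, lower right, lower left) are hypothetical choices made for illustration.

```python
import math

# Canonical slot of the corner nearest to each target, per the layout prior:
# face -> upper right (slot 1) on the front, national emblem -> upper left (slot 0) on the back.
ANCHOR_SLOT = {"face": 1, "emblem": 0}


def judge_orientation(corners, target_center, target_class):
    """Return (side, nearest_index, corners reordered as UL, UR, LR, LL of the card)."""
    corners = list(corners)
    side = "front" if target_class == "face" else "back"
    nearest = min(range(4), key=lambda i: math.dist(corners[i], target_center))
    # Rotate the ring so that the nearest corner lands in its canonical slot.
    shift = (nearest - ANCHOR_SLOT[target_class]) % 4
    ordered = corners[shift:] + corners[:shift]
    return side, nearest, ordered


# A front-side card photographed upside down: the face appears at the lower left.
image_corners = [(0, 0), (100, 0), (100, 60), (0, 60)]
side, nearest, ordered = judge_orientation(image_corners, (20, 40), "face")
print(side, nearest)  # front 3
print(ordered)        # [(100, 60), (0, 60), (0, 0), (100, 0)]
```

In the example the card's upper left corner is found at the image's lower right, which is what a 180 degree rotation produces, so the reordered corners can be passed directly to the perspective transformation.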

Step 4, carrying out perspective transformation on the area where the identity card is located, which is defined by the certificate corner point combination coordinates, according to the orientation of the identity card image to be corrected, to obtain the corrected identity card image.

Specifically, according to the orientation of the to-be-corrected identification card image judged in the step 3, a perspective transformation matrix is calculated by using 4 corners of the certificate corner combination coordinates, and the area of the to-be-corrected identification card image is transformed to obtain the corrected identification card image.
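How a perspective transformation matrix follows from 4 corner correspondences can be shown with a small NumPy sketch that solves the 8 unknowns of a 3x3 homography directly. The function names, the example corner coordinates and the 428 x 270 output size are illustrative assumptions, not values from the text. In practice a library routine such as OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective` would normally be used for the same purpose.

```python
import numpy as np


def perspective_matrix(src, dst):
    """Solve the 3x3 homography H (with H[2, 2] = 1) mapping 4 src points to 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), and likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, point):
    """Map one (x, y) point through H, including the perspective division."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# Corners already ordered as upper left, upper right, lower right, lower left of the card.
card_corners = [(120, 80), (500, 110), (480, 350), (100, 300)]
out_w, out_h = 428, 270  # hypothetical output size
target = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]

H = perspective_matrix(card_corners, target)
for corner in card_corners:
    print(np.round(apply_homography(H, corner), 3))
```

Each detected corner maps onto the matching corner of the output rectangle, so warping the input image with H yields the upright, cropped card.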

Example 2

On the other hand, as shown in fig. 4, the present invention provides an identity card image rectification system with multitask detection, to which the identity card image rectification method in embodiment 1 is applied, including:

the identity card multi-task detection model adopts a YOLOv5s structure and comprises a shared-feature detection network and two detection branches, namely a key point regression branch and a target detection branch, and each network head of the YOLOv5s structure has its last convolution layer removed.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium and, when executed, performs the corresponding method steps; the storage medium may be a ROM/RAM, a magnetic disk, an optical disk, etc.

The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A multitask detection identity card image correction method is characterized by comprising the following steps:

2. The method as claimed in claim 1, wherein the identity card multi-task detection model adopts a YOLOv5s structure, and each network head of the YOLOv5s structure has its last convolution layer removed.

3. The identity card image rectification method based on multitask detection as claimed in claim 1, wherein the specific process of extracting the deep feature map of the image key points by using the YOLOv5s structure is as follows:

4. The identity card image rectification method for multitask detection according to claim 3, wherein the deep feature map is processed by a key point feature convolution group with an hourglass structure, and the key point heatmap is obtained by using a CBAM attention mechanism module to enhance the line segment area features of the deep feature map.

5. The method for correcting the image of the identity card based on the multi-task detection as claimed in claim 3, wherein the specific process of regressing the combined coordinates of the corner points of the identity card is as follows:

6. The method for correcting the identity card image with multitask detection according to claim 1, wherein the specific process of extracting the deep feature map of the target image by using the YOLOv5s structure is as follows:

7. The identity card image correction method of multitask detection according to claim 6, wherein the specific process of regressing the target position coordinates is as follows:

8. The identity card image rectification method based on multitask detection as claimed in claim 1, characterized in that the specific process of judging the orientation of the identity card is as follows:

9. An identity card image rectification system with multitask detection is characterized by comprising:

10. The system of claim 9, wherein the identity card multi-tasking detection model employs a YOLOv5s structure, and each of the net head structures of the YOLOv5s structure is configured to remove the last layer of convolution.