CN113962906A - Identity card image correction method and system for multi-task detection - Google Patents
- Publication number
- CN113962906A CN202111470874.XA
- Authority
- CN
- China
- Prior art keywords
- image
- identity card
- target
- corrected
- card image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/80—Geometric correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20164—Salient point detection; Corner detection
Abstract
The invention discloses an identity card image correction method and system for multi-task detection. An identity card image to be corrected is input into a trained identity card multi-task detection model, which extracts deep feature maps of the image. Certificate corner point combination coordinates and target position coordinates are regressed from the deep feature maps, and the category of the target image is judged at the same time. The orientation of the identity card image to be corrected is judged from the certificate corner point combination coordinates, the target position coordinates and the category of the target image. A perspective transformation is then performed using the certificate corner point combination coordinates, according to that orientation, to obtain the corrected identity card image. The double-branch structure of the multi-task detection model detects the certificate key points and the face and national emblem target images simultaneously and, combined with the prior of the identity card layout, achieves perspective correction of the identity card image, effectively improving the image correction effect and efficiency.
Description
Technical Field
The invention relates to the technical field of identity card image processing, in particular to an identity card image correction method and system based on multi-task detection.
Background
With the development of Internet and big data technology, customers must complete real-name identity card authentication when handling various kinds of business in online banking. In addition, steps such as account opening and credit granting require an uploaded copy of the identity card as supporting audit material, and all certificate materials are audited and the information on the certificate is entered automatically through an identity card OCR service.
The basic processing flow of an identity card OCR service comprises three steps: identity card position detection, identity card area correction, and recognition of the text content of the identity card. The first two steps can be defined as the preprocessing flow of identity card OCR.
Two preprocessing methods are commonly used for identity card OCR. In the first, an identity card detection model locates the identity card target, a character orientation model then flips the target area among four directions (up, down, left and right) to correct the direction of the card, and the border lines and four corner coordinates of the card are extracted by traditional vision techniques such as the Hough transform; the recognized corner coordinates are used for a perspective transformation that yields an upright, regular identity card image. This method effectively reduces the difficulty of information extraction in the subsequent identity card OCR, but in scenes such as photocopies and dim light, methods such as the Hough transform are not robust and cannot extract usable border lines and corner points. In the second, an identity card segmentation model segments the target area of the card, a character orientation model classifies the target area into one of four directions (up, down, left and right) to obtain direction information, and the direction and the segmented area are used together for perspective correction, yielding an upright, regular identity card image. This alleviates, to some extent, the difficulty of correcting photocopies and dim-light scenes, but segmentation models are usually computationally expensive and correction takes a long time. In another prior-art approach, a character orientation recognition model determines which of four orientations the original image has, the image is rotated accordingly, and a certificate detection model then locates the identity card area, thereby correcting the identity card image. Although this corrects the identity card image to a certain extent, many images shot with a mobile phone contain small-angle rotations, and a perspective transformation is required to correct them.
Disclosure of Invention
The invention aims to solve the technical problem of how to improve the efficiency of correcting an identity card image, and aims to provide an identity card image correction method and system for multi-task detection.
The invention is realized by the following technical scheme:
in one aspect, the invention provides an identity card image correction method for multi-task detection, which comprises the following steps:
inputting an identity card image to be corrected into a trained identity card multi-task detection model, and extracting deep feature maps of the image key points and of the target image of the identity card image to be corrected, respectively;
regressing certificate corner point combination coordinates according to the deep feature map of the image key points; meanwhile, regressing target position coordinates according to the deep feature map of the target image and judging the category of the target image;
judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinates, the target position coordinates and the class to which the target image belongs;
and carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.
Before an identity card is recognized, the collected identity card image can be corrected to improve recognition accuracy. Correction usually relies on a character orientation recognition model to identify the orientation of the original image, which is then corrected according to that orientation and a perspective analysis. However, identifying the orientation of the original image involves a large amount of redundancy and consumes substantial computing resources, which reduces the processing efficiency of the whole recognition flow; the character orientation model is also not robust, and errors occur easily for identity card images captured in complex environments. The multi-task detection model of the invention therefore outputs two branches, a certificate key point detection branch and a target detection branch for the face and the national emblem; processing the tasks in multiple branches improves detection efficiency and reduces the amount of computation. The certificate key point detection branch locates the four corner points of the identity card, and the face and national emblem target detection branch locates the face and the national emblem on the card. Combined with the prior of the identity card layout, the face or national emblem identifies which certificate corner point is which, so the orientation of the identity card within the image can be judged and the certificate can be perspective-corrected, allowing an upright, rectangular certificate image to be cropped afterwards. During text recognition, the content of the identity card can be recognized directly from the key point coordinates and the target position, without keyword matching. On the one hand, this reduces the loss of recognition accuracy caused by blur, small-angle tilt and the like; on the other hand, omitting the matching step improves the efficiency of identity card information recognition.
Further, the identity card multi-task detection model adopts a YOLOv5s structure, and the last convolution layer is removed from each network head of the YOLOv5s structure.
Further, the specific process of extracting the deep feature map of the image key points by using the YOLOv5s structure is as follows:
scaling the identity card image to be corrected to obtain an image to be detected;
processing the image to be detected with each network head of the YOLOv5s structure, and outputting the deep feature map corresponding to each network head;
and extracting a key point heatmap from each deep feature map by using the key point feature convolution group, and regressing the certificate corner point combination coordinates according to the key point heatmaps.
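As a rough illustration of how corner candidates might be read out of a key point heatmap (thermodynamic diagram), the sketch below keeps cells that exceed a threshold and are the maximum of their 3x3 neighbourhood. The patent does not state how peaks are selected, so the local-maximum rule, the `threshold` value and the function name `heatmap_peaks` are assumptions made only for this example.

```python
def heatmap_peaks(heatmap, threshold=0.5):
    """Return (x, y, score) for every cell that is a 3x3 local maximum above threshold.

    Assumption: the heatmap is a 2D list of scores in [0, 1]; the patent does not
    specify the peak-selection rule, so this is only one plausible choice.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(rows):
        for x in range(cols):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Collect the values of the (up to 8) neighbouring cells.
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(rows, y + 2))
                for nx in range(max(0, x - 1), min(cols, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((x, y, score))
    return peaks


# Toy 4x4 heatmap with two clear peaks (values are made up for illustration).
toy = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.8],
]
print(heatmap_peaks(toy))  # [(1, 1, 0.9), (3, 3, 0.8)]
```

The returned points are in heatmap cells, not image pixels; mapping them back to the input image is a separate scaling step.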
Further, the deep feature map is processed by a key point feature convolution group with an hourglass structure, and the line segment region features of the deep feature map are enhanced by a CBAM attention mechanism module, so as to obtain the key point heatmap.
Further, the specific process of regressing the certificate corner point combination coordinates comprises the following steps:
screening line segment peaks and heat peak points from the key point heatmap to serve as candidate corner points;
combining every 4 candidate corner points into 1 candidate corner combination to obtain a plurality of candidate corner combinations, wherein each candidate corner combination forms a quadrangle;
calculating the interior angles of the quadrangle formed by each candidate corner combination;
screening out candidate corner combinations whose quadrangle has an interior angle larger than 135 degrees, and collecting the remaining candidate corner combinations to obtain a candidate combination set;
calculating the overlapping area between any two candidate corner combinations in the candidate combination set, deleting the candidate corner combination with the lower confidence when the overlapping area is greater than 0.5, and repeating this process until only two candidate corner combinations remain, then outputting the one with the higher confidence;
and scaling the output candidate corner combination to obtain the certificate corner point combination coordinates.
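The combination and interior-angle screening steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: ordering the four points by angle around their centroid and rejecting non-convex or degenerate quadrangles are assumptions added so that "interior angle" is well defined, and the function names are hypothetical.

```python
import math
from itertools import combinations


def order_around_centroid(points):
    """Sort points by angle around their centroid so they trace a simple quadrangle."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))


def interior_angles(quad):
    """Angles in degrees at each vertex of an ordered quadrangle (None if degenerate)."""
    angles = []
    for i in range(len(quad)):
        px, py = quad[i - 1]
        cx, cy = quad[i]
        nx, ny = quad[(i + 1) % len(quad)]
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            return None  # two candidate points coincide
        cos = max(-1.0, min(1.0, (v1[0] * v2[0] + v1[1] * v2[1]) / norm))
        angles.append(math.degrees(math.acos(cos)))
    return angles


def screen_combinations(candidates, max_angle=135.0):
    """Keep 4-point combinations whose quadrangle has no interior angle above max_angle."""
    kept = []
    for combo in combinations(candidates, 4):
        quad = order_around_centroid(list(combo))
        angles = interior_angles(quad)
        # Assumption: for a convex quadrangle the four angles sum to 360 degrees;
        # anything else is treated as non-convex and discarded.
        if angles is None or abs(sum(angles) - 360.0) > 1e-3:
            continue
        if max(angles) > max_angle:
            continue
        kept.append(quad)
    return kept


# Four corners of a rectangle plus one spurious candidate point.
candidates = [(0, 0), (100, 0), (100, 60), (0, 60), (50, 5)]
kept = screen_combinations(candidates)
print(len(kept), kept)
```

With these toy points, only the rectangle survives: every combination that includes the spurious point either is non-convex or has an interior angle above 135 degrees.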
Further, the specific process of extracting the deep feature map of the target image by using the YOLOv5s structure is as follows:
scaling the identity card image to be corrected to obtain an image to be detected;
processing the image to be detected with each network head of the YOLOv5s structure, and outputting the deep feature map of the target image at each network head;
processing the deep feature map of the target image with a target detection feature convolution group to obtain a target feature map;
regressing the center point coordinates, width and height of the target position from the target feature map; and meanwhile, judging the category of the target image according to the target feature map, wherein the target image comprises a face target and a national emblem target.
Further, the specific process of regressing the target position coordinates is as follows:
screening a salient region from the target feature map as a target region;
calculating the overlapping area between any two different target regions, and deleting the target region with the lower confidence when the overlapping area is greater than 0.5;
calculating the center point, width and height of the target region on the target feature map as the target position;
and scaling the target position to obtain the target position coordinates.
Further, the specific process of judging the orientation of the identity card is as follows:
judging the front side and the back side of the identity card image to be corrected according to the category of the target image, wherein when the target image is a face target, the identity card image to be corrected is the front side, and when the target image is a national emblem target, the identity card image to be corrected is the back side;
calculating and comparing the distance between the center point of the target position coordinate and each corner point in the certificate corner point combination coordinate, and screening out the corner point closest to the target image to obtain the position of the target image in the identity card image to be corrected;
and obtaining the orientation of the identity card image to be corrected according to the front side and the back side of the identity card image to be corrected and the position of the target image in the identity card image to be corrected.
In another aspect, the present invention provides a system for correcting an image of an identification card with multitask detection, including:
the deep feature extraction module is used for extracting deep feature maps of the image key points and of the target image of the identity card image to be corrected, respectively, by means of the shared-feature detection network in the identity card multi-task detection model;
the key point regression branch module is used for processing the deep feature map of the image key points to obtain certificate corner point combination coordinates;
the target detection branch module is used for processing the deep feature map of the target image to obtain target position coordinates and judging the category of the target image;
the orientation judging module is used for judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinate, the target position coordinate and the class to which the target image belongs;
and the correction module is used for carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.
Further, the identity card multi-task detection model adopts a YOLOv5s structure, and the last convolution layer is removed from each network head of the YOLOv5s structure.
Compared with the prior art, the invention has the following advantages and beneficial effects:
according to the identity card image correction method and system based on multi-task detection, the double-branch structure of the multi-task detection model detects the key points of the identity card and the target images of the face and the national emblem simultaneously. The prior of the identity card layout and the positions of the identity card corner points are combined to judge the orientation of the identity card image, and the image is corrected by perspective transformation, which effectively improves the image correction effect and efficiency. The text information of the identity card can be recognized directly from the corrected image, the keyword matching process is omitted, and the accuracy of the overall identity card information recognition is improved.
Drawings
In order to illustrate the technical solutions of the exemplary embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort. In the drawings:
FIG. 1 is a flow chart of a method for correcting an image of an identification card with multitasking detection according to the present invention;
FIG. 2 is a schematic structural diagram of a multi-task ID card detection model according to an embodiment of the present invention;
FIG. 3 is a network architecture for processing deep feature maps in accordance with one embodiment of the present invention;
FIG. 4 is a block diagram of a multitasking ID card image rectification system according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Example 1
As shown in fig. 1, the identity card image rectification method for multitask detection in this embodiment includes the following steps:
step 1, inputting an identity card image to be corrected into a trained identity card multi-task detection model, and respectively extracting image key points of the identity card image to be corrected and deep feature maps of a target image;
specifically, as shown in fig. 2, the identity card multi-task detection model adopts a YOLOv5s structure, which is divided into a shared-feature detection network and two detection branches: a key point regression branch and a target detection branch. The shared-feature detection network extracts the deep feature maps of the image key points and of the target image from the identity card image to be corrected; the key point regression branch and the target detection branch then regress the certificate corner point coordinates and the position coordinates of the face and national emblem target images. In this embodiment, the last convolution layer is removed from each network head of the YOLOv5s structure, and the input image size of each network head is (608, 608, 3). The identity card image to be corrected is therefore scaled, in proportion to its width and height, so that its long edge is 608 pixels; the scaled image can then be processed by the convolutional neural network to extract the deep feature maps. The deep feature maps output by the 3 network heads have sizes (76, 76, 255), (38, 38, 255) and (19, 19, 255), respectively.
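The scaling arithmetic in this step can be checked with a short sketch. It assumes the aspect ratio is preserved when the long edge is brought to 608 pixels and that the three heads downsample by 8, 16 and 32 (the magnifications quoted later in this embodiment); the function names and the sample photo size are hypothetical, and any padding to a square input is not covered.

```python
def scale_to_long_edge(width, height, long_edge=608):
    """Return (new_width, new_height, ratio) so that the longer side equals long_edge."""
    ratio = long_edge / max(width, height)
    return round(width * ratio), round(height * ratio), ratio


def head_grid_sizes(input_size=608, strides=(8, 16, 32)):
    """Spatial size of each head's feature map for a square input of input_size."""
    return [input_size // s for s in strides]


# Hypothetical 1920x1080 phone photo.
new_w, new_h, ratio = scale_to_long_edge(1920, 1080)
print(new_w, new_h)        # 608 342
print(head_grid_sizes())   # [76, 38, 19]
```

The grid sizes 76, 38 and 19 match the feature map sizes given for the three network heads.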
Step 2, in the key point regression branch, regressing the certificate corner point combination coordinates according to the deep feature map of the image key points; meanwhile, in the target detection branch, regressing the target position coordinates according to the deep feature map of the target image and judging the category of the target image;
step 2.1, in the key point regression branch, extracting at each network head a key point heatmap from the deep feature map of the image key points, and regressing the key point positions through the heatmap, wherein the specific process is as follows:
extracting a key point heatmap from the deep feature map of the image key points by using a key point feature convolution group, and regressing the certificate corner point combination coordinates according to the key point heatmap. Optionally, as shown in fig. 3, the deep feature map is processed by a key point feature convolution group with an hourglass structure, which comprises three convolution layers; during the second convolution layer, a CBAM attention mechanism module enhances the line segment region features of the deep feature map to obtain the key point heatmap.
The specific process of regressing the certificate corner point combination coordinates according to the obtained key point heatmap of the image key points comprises the following steps:
step 2.1.1, screening line segment peaks and heat peak points from the key point heatmap to serve as candidate corner points;
step 2.1.2, because the quadrangular identity card is formed by 4 corner points, combining every 4 candidate corner points into 1 candidate corner combination to obtain a plurality of candidate corner combinations, the candidate corner points in each combination being connected to form a quadrangle;
step 2.1.3, calculating the four interior angles of the quadrangle formed by each candidate corner combination;
step 2.1.4, screening out any candidate corner combination whose quadrangle has an interior angle larger than 135 degrees, and collecting the remaining candidate corner combinations to obtain a candidate combination set;
step 2.1.5, calculating the overlapping area between any two different candidate corner combinations in the candidate combination set, and deleting the candidate corner combination with the lower confidence when the overlapping area is greater than 0.5;
step 2.1.6, repeating step 2.1.5 until only two candidate corner combinations remain, and outputting the one with the higher confidence; according to the scaling in step 1, the output combination is scaled by the factor 8, 16 or 32 corresponding to its network head to give the certificate corner point combination coordinates on the input image (the image scaled to a long edge of 608 in step 1).
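Steps 2.1.5 and 2.1.6 resemble non-maximum suppression over quadrangles. The sketch below is one way to express them under stated assumptions: the "overlapping area" threshold of 0.5 is interpreted as intersection over union, the quadrangles are convex, the intersection is computed with Sutherland-Hodgman clipping, and a standard greedy suppression loop stands in for the pairwise procedure described in the text. How a combination's confidence is computed is not given in the source, so the scores here are made up.

```python
def polygon_area(poly):
    """Absolute polygon area by the shoelace formula."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ensure_ccw(poly):
    """Return the polygon with counter-clockwise vertex order."""
    signed = sum(
        poly[i][0] * poly[(i + 1) % len(poly)][1] - poly[(i + 1) % len(poly)][0] * poly[i][1]
        for i in range(len(poly))
    )
    return list(poly) if signed >= 0 else list(reversed(poly))


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex, counter-clockwise `clipper`."""
    def inside(p, a, b):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        for j in range(len(current)):
            p, q = current[j - 1], current[j]
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersect(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(intersect(p, q, a, b))
        if not output:
            break
    return output


def quad_iou(q1, q2):
    """Intersection over union of two convex quadrangles."""
    inter_poly = clip_convex(ensure_ccw(q1), ensure_ccw(q2))
    inter = polygon_area(inter_poly) if len(inter_poly) >= 3 else 0.0
    union = polygon_area(q1) + polygon_area(q2) - inter
    return inter / union if union > 0 else 0.0


def suppress(candidates, threshold=0.5):
    """Greedy suppression over (quad, confidence) pairs, highest confidence first."""
    remaining = sorted(candidates, key=lambda c: c[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [c for c in remaining if quad_iou(best[0], c[0]) <= threshold]
    return kept


def to_input_coords(quad, stride):
    """Scale feature-map coordinates back to the 608-pixel input image."""
    return [(x * stride, y * stride) for x, y in quad]


q1 = [(0, 0), (10, 0), (10, 10), (0, 10)]
q2 = [(1, 0), (11, 0), (11, 10), (1, 10)]      # overlaps q1 heavily
q3 = [(20, 20), (30, 20), (30, 30), (20, 30)]  # disjoint from q1
kept = suppress([(q1, 0.9), (q2, 0.8), (q3, 0.5)])
best_quad = kept[0][0]
print(to_input_coords(best_quad, 8))
```

Taking `kept[0]` corresponds to outputting the surviving combination with the highest confidence; the stride passed to `to_input_coords` would be 8, 16 or 32 depending on which head produced the combination.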
step 2.2, in the target detection branch, extracting at each network head a target feature map from the deep feature map of the target image, regressing the target position coordinates through the target feature map, and judging the category of the target; specifically, because the image inside the region needs to be identified in the target detection branch, the deep feature map of the target image is processed by a target detection feature convolution group to obtain the target feature map;
regressing the center point coordinates, width and height of the target position from the target feature map; and meanwhile, judging the category of the target image according to the target feature map, wherein the target image comprises a face target and a national emblem target.
The specific process of regressing the center point coordinates, width and height of the target position according to the target feature map of the target image is as follows:
step 2.2.1, screening out a salient region from the target feature map as a target region;
step 2.2.2, calculating the overlapping area between any two different target regions, and deleting the target region with the lower confidence when the overlapping area is greater than 0.5;
step 2.2.3, calculating the center point, width and height of the target region on the feature map as the target position;
step 2.2.4, according to the scaling in step 1, scaling the target position by the factor 8, 16 or 32 corresponding to its network head to give the target position coordinates on the input image (the image scaled to a long edge of 608 in step 1).
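For the target detection branch, steps 2.2.2 to 2.2.4 can be sketched with axis-aligned boxes. As assumptions for this example only, boxes are stored as (center x, center y, width, height) in feature-map cells, the 0.5 overlap threshold is read as intersection over union, and the labels "face" and "emblem" and all scores are placeholders.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def suppress_boxes(detections, threshold=0.5):
    """Keep the higher-confidence box whenever two boxes overlap by more than threshold.

    detections: list of (box, score, label) tuples.
    """
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if box_iou(best[0], d[0]) <= threshold]
    return kept


def box_to_input_coords(box, stride):
    """Scale a feature-map box to the 608-pixel input image (stride 8, 16 or 32)."""
    return tuple(v * stride for v in box)


detections = [
    ((10, 10, 4, 4), 0.9, "face"),
    ((10.5, 10, 4, 4), 0.6, "face"),    # near-duplicate of the first box
    ((30, 30, 4, 4), 0.7, "emblem"),
]
kept_boxes = suppress_boxes(detections)
print([(label, box_to_input_coords(box, 8)) for box, _, label in kept_boxes])
```

In a real identity card image, a single face or a single national emblem would normally remain after suppression; both appear here only to exercise the code.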
Step 3, judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinate, the target position coordinate and the class of the target image;
step 3.1, judging the front side and the back side of the identity card image to be corrected according to the category of the target image, wherein when the target image is a human face target, the identity card image to be corrected is the front side, and when the target image is a national emblem target, the identity card image to be corrected is the back side;
step 3.2, calculating and comparing the distance between the center point of the target position coordinate and each corner point in the certificate corner point combination coordinate, and screening out the corner point closest to the target image to obtain the position of the target image in the identity card image to be corrected;
step 3.3, obtaining the orientation of the identity card image to be corrected according to the front side and the back side of the identity card image to be corrected and the position of the target image in the identity card image to be corrected.
Since step 3.1 determines from the target image whether the identity card image to be corrected shows the front side or the back side, the corner point closest to the face target can be defined as the upper right corner point of the front side of the identity card, and the corner point closest to the national emblem target as the upper left corner point of the back side. Using this prior definition, the position of the target image in the identity card image to be corrected (upper left, lower left, upper right or lower right) is obtained from the corner point in the certificate corner point combination coordinates that is closest to the target image.
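The orientation rule of step 3 can be sketched as a re-labelling of the four detected corners. The sketch assumes the corners are supplied in clockwise order as seen in the image (y axis pointing down) and that the class labels are the strings "face" and "emblem"; both conventions, and the function name, are hypothetical.

```python
import math


def order_card_corners(corners, target_center, target_class):
    """Return (side, corners ordered as [top-left, top-right, bottom-right, bottom-left]).

    corners: four (x, y) points in clockwise image order (assumption).
    target_class: "face" (front side) or "emblem" (back side) (assumed labels).
    """
    side = "front" if target_class == "face" else "back"
    # Index of the corner closest to the center of the detected target.
    nearest = min(range(4), key=lambda i: math.dist(corners[i], target_center))
    # Prior from the text: the face sits nearest the top-right corner of the front,
    # the national emblem nearest the top-left corner of the back.
    anchor_slot = 1 if target_class == "face" else 0
    shift = nearest - anchor_slot
    ordered = [corners[(i + shift) % 4] for i in range(4)]
    return side, ordered


# A front-side card photographed upside down: the face appears near the
# bottom-left corner of the detected quadrangle.
corners = [(0, 0), (100, 0), (100, 60), (0, 60)]
side, ordered = order_card_corners(corners, target_center=(20, 35), target_class="face")
print(side, ordered)  # front [(100, 60), (0, 60), (0, 0), (100, 0)]
```

The ordered list states which image point plays the role of each card corner, which is what the perspective transformation in the next step needs.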
And S4, carrying out perspective transformation on the area where the identity card image to be corrected is located according to the orientation of the identity card image to be corrected and the certificate corner combined coordinates, wherein the area is defined by the certificate corner combined coordinates of the identity card, and obtaining the corrected identity card image.
Specifically, according to the orientation of the to-be-corrected identification card image judged in the step 3, a perspective transformation matrix is calculated by using 4 corners of the certificate corner combination coordinates, and the area of the to-be-corrected identification card image is transformed to obtain the corrected identification card image.
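As an illustration of how a perspective transformation matrix can be calculated from the 4 corner points, the following Python sketch solves the standard 8-unknown linear system for a 3x3 matrix whose last entry is fixed to 1. It is a sketch under stated assumptions, not the patent's code: the function names are hypothetical, the source corners are assumed to be already ordered by orientation (upper-left, upper-right, lower-right, lower-left), and the output size is chosen by the caller.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def perspective_matrix(src, dst):
    """Return the 3x3 matrix mapping each src (x, y) to its dst (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_perspective(h, point):
    """Map one (x, y) point through the 3x3 perspective matrix h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

With the ordered corners as `src` and the corners of an upright output rectangle as `dst`, the resulting matrix maps the card area onto that rectangle. Resampling the pixels is left to an image library; in OpenCV, for example, `cv2.getPerspectiveTransform` and `cv2.warpPerspective` cover both steps.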
Example 2
In another aspect, as shown in Fig. 4, the present invention provides a multi-task detection identity card image correction system that applies the identity card image correction method of Example 1, comprising:
the deep feature extraction module is used for respectively extracting the deep feature maps of the image key points and of the target image of the identity card image to be corrected according to the feature-sharing detection network in the identity card multi-task detection model;
wherein the identity card multi-task detection model adopts a YOLOv5s structure and comprises a feature-sharing detection network and two detection branches, the two detection branches being a key point regression branch and a target detection branch, and the last convolution layer is removed from each network head of the YOLOv5s structure;
the key point regression branch module is used for processing the deep feature map of the image key points to obtain the certificate corner point combination coordinates;
the target detection branch module is used for processing the deep feature map of the target image to obtain the target position coordinates and judging the category of the target image;
the orientation judging module is used for judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinates, the target position coordinates and the category to which the target image belongs;
and the correction module is used for carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the steps of the above methods can be implemented by a program instructing the relevant hardware. The program can be stored in a computer readable storage medium and, when executed, performs the corresponding method steps described above. The storage medium may be a ROM/RAM, a magnetic disk, an optical disk, etc.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A multitask detection identity card image correction method is characterized by comprising the following steps:
inputting an identity card image to be corrected into a trained identity card multi-task detection model, and respectively extracting deep feature maps of the image key points and of the target image of the identity card image to be corrected;
regressing certificate corner point combination coordinates according to the deep feature map of the image key points; meanwhile, regressing target position coordinates according to the deep feature map of the target image and judging the category of the target image;
judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinates, the target position coordinates and the class to which the target image belongs;
and carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.
2. The method as claimed in claim 1, wherein the identity card multi-task detection model adopts a YOLOv5s structure, and the last convolution layer is removed from each network head of the YOLOv5s structure.
3. The identity card image rectification method based on multitask detection as claimed in claim 1, wherein the specific process of extracting the deep feature map of the image key points by using the YOLOv5s structure is as follows:
carrying out scaling processing on the identity card image to be corrected to obtain an image to be detected;
processing the image to be detected by using each network head of the YOLOv5s structure, and outputting a deep feature map corresponding to each network head;
and extracting a key point heat map from each deep feature map by using the key point feature convolution group, and regressing the certificate corner point combination coordinates according to the key point heat maps.
4. The identity card image rectification method for multitask detection according to claim 3, wherein the deep feature map is processed by a key point feature convolution group with an hourglass structure, and a CBAM attention mechanism module is used to enhance the line segment area features of the deep feature map, so as to obtain the key point heat map.
5. The method for correcting the image of the identity card based on the multi-task detection as claimed in claim 3, wherein the specific process of regressing the combined coordinates of the corner points of the identity card is as follows:
screening line segment peaks and heat peak points from the key point heat map to serve as candidate corner points;
combining every 4 candidate corner points into 1 candidate corner combination to obtain a plurality of candidate corner combinations, wherein each candidate corner combination forms a quadrangle;
calculating the interior angles of the quadrangle formed by each candidate corner combination;
screening out the candidate corner combinations whose quadrangle has an interior angle larger than 135 degrees, and collecting the remaining candidate corner combinations to obtain a candidate combination set;
calculating the overlapping area between any two candidate corner combinations in the candidate combination set, deleting the candidate corner combination with the lower confidence when the overlapping area is larger than 0.5, repeating the process, and outputting the candidate corner combination with the higher confidence of the last two candidate corner combinations;
and carrying out scaling processing on the output candidate corner combination to obtain certificate corner combination coordinates.
6. The method for correcting the identity card image with multitask detection according to claim 1, wherein the specific process of extracting the deep feature map of the target image by using the YOLOv5s structure is as follows:
carrying out scaling processing on the identity card image to be corrected to obtain an image to be detected;
processing the image to be detected by using each network head of the YOLOv5s structure, and outputting a deep feature map of the target image from each network head;
processing the deep feature map of the target image by using a target detection feature convolution group to obtain a target feature map;
regressing the center point coordinates and the width and height of the target position by using the target feature map; and meanwhile, judging the category of the target image according to the target feature map, wherein the target image comprises a human face target and a national emblem target.
7. The identity card image correction method of multitask detection according to claim 6, wherein the specific process of regressing the target position coordinates is as follows:
screening a salient region from the target feature map as a target region;
calculating the overlapping area between any two different target regions, and deleting the target region with the lower confidence when the overlapping area is larger than 0.5;
calculating the center point and the width and height of the target region on the target feature map as the target position;
and carrying out scaling processing on the target position to obtain the target position coordinates.
8. The identity card image rectification method based on multitask detection as claimed in claim 1, characterized in that the specific process of judging the orientation of the identity card is as follows:
judging the front side and the back side of the identity card image to be corrected according to the category of the target image, wherein when the target image is a face target, the identity card image to be corrected is the front side, and when the target image is a national emblem target, the identity card image to be corrected is the back side;
calculating and comparing the distance between the center point of the target position coordinate and each corner point in the certificate corner point combination coordinate, and screening out the corner point closest to the target image to obtain the position of the target image in the identity card image to be corrected;
and obtaining the orientation of the identity card image to be corrected according to the front side and the back side of the identity card image to be corrected and the position of the target image in the identity card image to be corrected.
9. An identity card image rectification system with multitask detection is characterized by comprising:
the deep feature extraction module is used for respectively extracting the deep feature maps of the image key points and of the target image of the identity card image to be corrected according to the feature-sharing detection network in the identity card multi-task detection model;
the key point regression branch module is used for processing the deep feature map of the image key points to obtain the certificate corner point combination coordinates;
the target detection branch module is used for processing the deep feature map of the target image to obtain the target position coordinates and judging the category of the target image;
the orientation judging module is used for judging the orientation of the identity card image to be corrected according to the certificate corner point combination coordinate, the target position coordinate and the class to which the target image belongs;
and the correction module is used for carrying out perspective transformation on the certificate corner point combination coordinates according to the orientation of the identity card image to be corrected to obtain a corrected identity card image.
10. The system of claim 9, wherein the identity card multi-task detection model adopts a YOLOv5s structure, and the last convolution layer is removed from each network head of the YOLOv5s structure.
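For illustration only, the interior-angle screening of candidate corner combinations recited in claim 5 can be sketched in Python as follows. The sketch is not part of the claims and is not the patented implementation. The function names are hypothetical, ordering each group of 4 points around its centroid is an assumed way of forming the quadrangle, and the angle at each vertex is measured between its two adjacent edges, which equals the interior angle only for a convex quadrangle.

```python
import math
from itertools import combinations

def order_around_centroid(points):
    """Sort points by polar angle around their centroid to form a polygon."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def interior_angles(quad):
    """Angles in degrees between the two edges meeting at each vertex."""
    angles = []
    for i in range(4):
        px, py = quad[i - 1]
        cx, cy = quad[i]
        nx, ny = quad[(i + 1) % 4]
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (
            math.hypot(*v1) * math.hypot(*v2))
        # Clamp against rounding error before taking the arc cosine.
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles

def candidate_quads(points, max_angle=135.0):
    """Keep every 4-point combination whose angles are all <= max_angle."""
    kept = []
    for combo in combinations(points, 4):
        quad = order_around_centroid(combo)
        if max(interior_angles(quad)) <= max_angle:
            kept.append(quad)
    return kept
```

The surviving combinations would then go through the overlap-based suppression of claim 5, which is not shown here because the claims do not specify how the overlapping area and the confidence are computed.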
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111470874.XA CN113962906B (en) | 2021-12-03 | 2021-12-03 | Identity card image correction method and system for multitasking detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113962906A (en) | 2022-01-21 |
CN113962906B (en) | 2024-07-12 |
Family
ID=79472942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111470874.XA Active CN113962906B (en) | 2021-12-03 | 2021-12-03 | Identity card image correction method and system for multitasking detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113962906B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118097099A (en) * | 2022-11-25 | 2024-05-28 | 唯思电子商务(深圳)有限公司 | Certificate quality detection and orientation correction method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020252920A1 (en) * | 2019-06-18 | 2020-12-24 | 平安科技(深圳)有限公司 | Picture correction method and apparatus, computer device and computer-readable storage medium |
CN112837263A (en) * | 2020-12-21 | 2021-05-25 | 上海致宇信息技术有限公司 | Identity card information positioning method under complex background |
Non-Patent Citations (2)
Title |
---|
TRAN PHUONG NAM 等: "A Pose Estimation Method for Multiple Identity Cards based on Corner Heatmaps and Part Affinity Fields", 《2020 7TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE》, 2 February 2021 (2021-02-02), pages 338 - 343 * |
杨航: "身份证号码识别算法与研究", 《中国优秀硕士学位论文全文数据库信息科技辑》, 15 April 2018 (2018-04-15), pages 138 - 3051 * |
Also Published As
Publication number | Publication date |
---|---|
CN113962906B (en) | 2024-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110738207B (en) | Character detection method for fusing character area edge information in character image | |
Yuliang et al. | Detecting curve text in the wild: New dataset and new solution | |
CN110414507B (en) | License plate recognition method and device, computer equipment and storage medium | |
CN112016547A (en) | Image character recognition method, system and medium based on deep learning | |
CN112541443B (en) | Invoice information extraction method, invoice information extraction device, computer equipment and storage medium | |
CN110334709B (en) | License plate detection method based on end-to-end multi-task deep learning | |
CN112949455B (en) | Value-added tax invoice recognition system and method | |
CN113673519B (en) | Character recognition method based on character detection model and related equipment thereof | |
CN114038004A (en) | Certificate information extraction method, device, equipment and storage medium | |
CN111027538A (en) | Container detection method based on instance segmentation model | |
CN112200191B (en) | Image processing method, image processing device, computing equipment and medium | |
CN110751154A (en) | Complex environment multi-shape text detection method based on pixel-level segmentation | |
Cheng et al. | A direct regression scene text detector with position-sensitive segmentation | |
CN114648756B (en) | Book character recognition and reading method and system based on pointing vector | |
CN112883926A (en) | Identification method and device for table medical images | |
Zao et al. | Richer U-Net: Learning more details for road detection in remote sensing images | |
CN112232336A (en) | Certificate identification method, device, equipment and storage medium | |
CN113962906A (en) | Identity card image correction method and system for multi-task detection | |
CN114494751A (en) | License information identification method, device, equipment and medium | |
CN112364863B (en) | Character positioning method and system for license document | |
CN110909816B (en) | Picture identification method and device | |
CN115937537A (en) | Intelligent identification method, device and equipment for target image and storage medium | |
Zhang et al. | Implementation of high performance hardware architecture of face recognition algorithm based on local binary pattern on FPGA | |
CN111914836B (en) | Method, device, equipment and medium for extracting identity card information | |
CN114547437A (en) | Image retrieval method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||