CN112307937A - Deep learning-based identity card quality inspection method and system

Info

Publication number: CN112307937A
Application number: CN202011171318.8A
Authority: CN
Inventors: 范轲; 张磊; 林康; 谭则涛; 张汉林; 柯学
Original assignee: GF Securities Co., Ltd.
Current assignee: GF Securities Co., Ltd.
Priority date: 2020-10-28
Filing date: 2020-10-28
Publication date: 2021-02-02

Abstract

The invention provides an identity card quality inspection method and system based on deep learning. The method comprises: acquiring a first image comprising an identity card image and a background image; preprocessing the first image to obtain a second image containing only the identity card image; inputting the second image into a set reflection quality inspection model for reflection quality inspection to obtain a third image; and inputting the third image into a set fuzzy quality inspection model for fuzzy (blur) quality inspection to obtain a fourth image, wherein both the set reflection quality inspection model and the set fuzzy quality inspection model are obtained according to the basic structure of the SqueezeNet. The invention addresses two problems: traditional image processing techniques require tuning several index thresholds when inspecting identity cards, and it is difficult to balance recall on positive and negative samples; and traditional image processing methods that separate reflective from non-reflective areas using brightness, gradient and color cast cannot distinguish reflection in the key areas of the identity card (characters and head portrait) from reflection in non-key areas, and cannot identify the Great Wall watermark, which resembles reflection.

Description

Deep learning-based identity card quality inspection method and system

Technical Field

The invention relates to the technical field of image processing, in particular to an identity card quality inspection method and system based on deep learning.

Background

With the development of internet finance, on the one hand, the population of securities investors keeps expanding, and the number of accounts opened at securities companies has increased markedly; on the other hand, with the expansion of off-site services, customers in many cases no longer need to visit a securities branch office to transact business, but can handle it directly through a mobile terminal or a website, online account opening being a typical scenario. When handling online account opening, a client needs to submit materials such as an identity card and an account opening declaration video, and the securities company needs to judge whether the quality of the submitted materials is qualified, for example whether the certificate is reflective, blurred, occluded, or missing corners or edges, and whether the images and videos are reflective, blurred or occluded or the account opening declaration fails to meet requirements. The client can open the account only after the materials pass the audit; if they do not pass, the materials are rejected and the client must resubmit them.

Currently, securities companies mostly audit manually, so the workload is large and the audit response is often too slow; a slow audit response or too many rejections weakens clients' willingness to open an account and causes client loss. How to improve the quality, efficiency and response speed of account opening audits has become the key to improving the account opening success rate.

For detecting reflection and blur in identity card pictures, traditional image processing techniques require tuning several index thresholds when inspecting identity card quality, and it is difficult to balance recall on positive and negative samples. In addition, traditional techniques mostly evaluate image quality with indexes such as brightness, gradient and color cast, but the audit standard is difficult to quantify with these indexes. In identity card quality inspection, reflection is defined as reflection over characters in a key area that makes the identification information illegible, and only such images are judged reflective; images in which character recognition is not affected are uniformly judged normal. Furthermore, character recognition failure caused by the Great Wall watermark on the front of the second-generation identity card must not be judged as reflection. The traditional image processing method of separating reflective from non-reflective areas by brightness is therefore unsuitable in this case.

Disclosure of Invention

The invention provides an identity card quality inspection method and system based on deep learning, which address two problems: traditional image processing techniques require tuning several index thresholds when inspecting identity card quality, and it is difficult to balance recall on positive and negative samples; and traditional image processing methods that separate reflective from non-reflective areas using brightness, gradient and color cast cannot identify reflection in the key areas of the identity card, such as characters and the head portrait, and cannot identify the Great Wall watermark, which resembles reflection.

One embodiment of the invention provides an identity card quality inspection method based on deep learning, which comprises the following steps:

acquiring a first image; wherein the first image comprises an identity card image and a background image;

preprocessing the first image to obtain a second image; wherein the second image comprises only an identification card image;

inputting the second image into a set reflection quality inspection model for reflection quality inspection to obtain a third image; the set reflection quality inspection model is obtained according to the basic structure of the SqueezeNet;

inputting the third image into a set fuzzy quality inspection model for fuzzy quality inspection to obtain a fourth image; and the set fuzzy quality inspection model is obtained according to the basic structure of the SqueezeNet.
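The four steps above can be read as a sequential gate. The following is a minimal sketch of that control flow, not the patented implementation. It assumes each quality inspection model is exposed as a boolean predicate, and that the "third" and "fourth" images are simply the card image after it passes each check. The function name, parameter names and status strings are hypothetical.

```python
from typing import Callable, Optional, Tuple

Image = list  # placeholder image type for this sketch (assumption)


def inspect_id_card(
    first_image: Image,
    preprocess: Callable[[Image], Image],
    passes_reflection: Callable[[Image], bool],
    passes_blur: Callable[[Image], bool],
) -> Tuple[str, Optional[Image]]:
    """Run preprocessing, then the reflection check, then the blur check."""
    # Step 2: keep only the identity card, dropping the background.
    second_image = preprocess(first_image)

    # Step 3: reflection quality inspection.
    if not passes_reflection(second_image):
        return "reflective", None
    third_image = second_image  # assumed: the image that passed the check

    # Step 4: fuzzy (blur) quality inspection.
    if not passes_blur(third_image):
        return "fuzzy", None
    fourth_image = third_image  # assumed: the image that passed both checks

    return "pass", fourth_image
```

Running the reflection check first means a blurred but non-reflective card is reported as "fuzzy", which matches the order in which the steps are listed.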

Further, the set reflection quality inspection model is obtained according to the basic structure of the SqueezeNet, and includes:

respectively taking the identity card image marked as the passing of the light reflection audit and the identity card image marked as the failing of the light reflection audit as light reflection positive sample data and light reflection negative sample data to be input into the light reflection quality inspection model; the quantity of the light reflection positive sample data is the same as that of the light reflection negative sample data;

and training a reflection quality inspection model according to the reflection positive sample data and the reflection negative sample data based on a caffe deep learning framework.

Further, the set fuzzy quality inspection model is obtained according to the basic structure of the SqueezeNet, and includes:

respectively taking the identity card image marked as the fuzzy audit pass and the identity card image marked as the fuzzy audit fail as fuzzy positive sample data and fuzzy negative sample data and inputting the fuzzy positive sample data and the fuzzy negative sample data into a fuzzy quality inspection model; wherein the number of the fuzzy positive sample data is the same as the number of the fuzzy negative sample data;

and training a fuzzy quality inspection model according to the fuzzy positive sample data and the fuzzy negative sample data based on a caffe deep learning framework.

Further, the basic structure of the SqueezeNet comprises:

changing the 3 × 3 convolution kernel into a 3 × 3-2 hole (dilated) convolution kernel to enlarge the receptive field and enrich the feature information;

changing the last ordinary max pooling layer into a spatial pyramid pooling (SPP) layer to merge feature information of different scales in the last layers;

connecting the feature maps of the first and last layers so that the last layers of the network are provided with shallow texture detail features, namely character strokes and edge features;

constraining the weights with the L1 norm so that most weights become 0, which improves the inference speed of the model;

and adjusting the positions of the intermediate max pooling layers and the number of convolution kernels so that the last consecutive convolution layers retain more feature information.
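To illustrate why a hole (dilated) kernel widens the receptive field without adding weights, the sketch below computes the effective span of a kernel and applies a dilated kernel to a 1-D signal. It reads "3 × 3-2" as a 3 × 3 kernel with dilation rate 2, which is an assumption because the text does not define the notation. The function names are hypothetical.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D cross-correlation whose taps are `dilation` samples apart."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Each weight reads the input at a stride of `dilation`.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out
```

With three weights and dilation 2, each output sees a span of five input samples, the same span as an ordinary size-5 kernel but with three parameters instead of five.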

Further, the preprocessing the first image to obtain a second image includes:

processing the first image through a Gaussian blur algorithm to obtain a first characteristic image;

carrying out graying and binarization processing on the first characteristic image to obtain a second characteristic image;

and processing the second characteristic image through an Isotropic Sobel operator to separate the identity card image and the background image to obtain a second image.

One embodiment of the present invention provides an identity card quality inspection system based on deep learning, including:

the first image acquisition module is used for acquiring a first image; wherein the first image comprises an identity card image and a background image;

the first image preprocessing module is used for preprocessing the first image to obtain a second image; wherein the second image comprises only an identification card image;

the reflective quality inspection module is used for inputting the second image into a set reflective quality inspection model for reflective quality inspection to obtain a third image; wherein the set reflective quality inspection model is obtained according to the basic structure of the SqueezeNet;

the fuzzy quality inspection module is used for inputting the third image into a set fuzzy quality inspection model for fuzzy quality inspection to obtain a fourth image; and the set fuzzy quality inspection model is obtained according to the basic structure of the SqueezeNet.

Further, the reflection quality inspection module comprises:

the light reflection sample input module is used for inputting the identity card image marked as light reflection examination passing and the identity card image marked as light reflection examination failing as light reflection positive sample data and light reflection negative sample data into the light reflection quality inspection model respectively; the quantity of the light reflection positive sample data is the same as that of the light reflection negative sample data;

and the reflecting quality inspection model training module is used for training a reflecting quality inspection model according to the reflecting positive sample data and the reflecting negative sample data based on a caffe deep learning framework.

Further, the fuzzy quality inspection module comprises:

the fuzzy sample input module is used for inputting the identity card image marked as the passing of the fuzzy audit and the identity card image marked as the failing of the fuzzy audit as fuzzy positive sample data and fuzzy negative sample data into the fuzzy quality inspection model respectively; wherein the number of the fuzzy positive sample data is the same as the number of the fuzzy negative sample data;

and the fuzzy quality inspection model training module is used for training a fuzzy quality inspection model according to the fuzzy positive sample data and the fuzzy negative sample data based on a caffe deep learning framework.

Further, the first image preprocessing module comprises:

the Gaussian blur algorithm processing submodule is used for processing the first image through a Gaussian blur algorithm to obtain a first characteristic image;

the graying and binarization processing submodule is used for performing graying and binarization processing on the first characteristic image to obtain a second characteristic image;

and the image separation sub-module is used for processing the second characteristic image through an Isotropic Sobel operator so as to separate the identity card image and the background image and obtain a second image.

An embodiment of the present invention provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, the computer-readable storage medium is controlled to implement any one of the deep learning-based identity card quality inspection methods.

Compared with the prior art, the embodiment of the invention has the beneficial effects that:

One embodiment of the invention provides an identity card quality inspection method and system based on deep learning, wherein the method comprises the following steps: acquiring a first image, wherein the first image comprises an identity card image and a background image; preprocessing the first image to obtain a second image, wherein the second image comprises only an identity card image; inputting the second image into a set reflection quality inspection model for reflection quality inspection to obtain a third image, the set reflection quality inspection model being obtained according to the basic structure of the SqueezeNet; and inputting the third image into a set fuzzy quality inspection model for fuzzy quality inspection to obtain a fourth image, the set fuzzy quality inspection model being obtained according to the basic structure of the SqueezeNet. The invention addresses two problems: traditional image processing techniques require tuning several index thresholds when inspecting identity cards, and it is difficult to balance recall on positive and negative samples; and traditional image processing methods that separate reflective from non-reflective areas using brightness, gradient and color cast cannot distinguish reflection in the key areas of the identity card (characters and head portrait) from reflection in non-key areas, and cannot identify the Great Wall watermark, which resembles reflection.

Drawings

In order to more clearly illustrate the technical solution of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flowchart of a deep learning-based identity card quality inspection method according to a first embodiment of the present invention;

FIG. 2 is a flowchart of a deep learning-based identity card quality inspection method according to a second embodiment of the present invention;

FIG. 3 is a flowchart of a deep learning-based identity card quality inspection method according to a third embodiment of the present invention;

FIG. 4 is a flowchart of a deep learning-based identity card quality inspection method according to a fourth embodiment of the present invention;

FIG. 5 is a flowchart of a deep learning-based identity card quality inspection method according to a fifth embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a SqueezeNet according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a spatial pyramid pooling configuration provided by an embodiment of the present invention;

FIG. 8 is an apparatus diagram of a deep learning-based identity card quality inspection system according to a first embodiment of the present invention;

FIG. 9 is an apparatus diagram of a deep learning-based identity card quality inspection system according to a second embodiment of the present invention;

FIG. 10 is an apparatus diagram of a deep learning-based identity card quality inspection system according to a third embodiment of the present invention;

FIG. 11 is an apparatus diagram of a deep learning-based identity card quality inspection system according to a fourth embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be understood that the step numbers used herein are for convenience of description only and are not intended as limitations on the order in which the steps are performed.

It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

The terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.

This patent provides a deep learning-based identity card quality inspection method for securities account opening. A computer quickly judges whether an identity card image is reflective, blurred and so on, so that unqualified images can be fed back and normal images passed on to account opening steps such as identity card quality inspection, public security verification and identity card image archiving, thereby reducing the complexity of the account opening business, improving processing efficiency and lowering the account opening loss rate.

Traditional image processing technology is relatively mature. For detecting reflection and blur in identity card images, it evaluates image quality with indexes such as brightness, gradient and color cast, but the audit standard is difficult to quantify with these indexes. In identity card quality inspection, reflection is defined as reflection over characters in a key area that makes the identification information illegible, and only such images are judged reflective; images in which character recognition is not affected are uniformly judged normal. In addition, character recognition failure caused by the Great Wall watermark on the front of the second-generation identity card must not be judged as reflection. The traditional image processing method of separating reflective from non-reflective areas by brightness is therefore unsuitable in this case.

Computer-based statistical learning methods and traditional machine learning techniques have long been widely used in practice. In recent years, with the accumulation of data and breakthroughs in hardware performance, deep learning has begun to outperform traditional machine learning algorithms, prompting renewed research into deep learning and its applications. With continuous development and innovation in algorithms, deep learning is applied in more and more fields, most prominently image processing, speech recognition and NLP. The best-performing models for general image detection, classification, OCR and face recognition are all trained with deep learning; deep learning networks such as VGGNet, long short-term memory (LSTM) models, feedforward sequential memory networks and bidirectional recurrent neural networks are often used in smart home, mobile voice command and content supervision products, greatly improving product performance. The successful application of deep learning in the image field shows the possibility of intelligently auditing account opening materials with deep learning technology.

On the one hand, manual auditing involves a heavy workload, which often makes the audit response too slow; a slow audit response or too many rejections weakens clients' willingness to open an account and causes client loss.

On the other hand, traditional image processing techniques mostly evaluate image quality with indexes such as brightness, gradient and color cast, and cannot solve the problem that the audit standard is difficult to quantify; and the traditional method of separating reflective from non-reflective areas by brightness cannot identify reflection in key areas or the Great Wall watermark, which resembles reflection.

The invention provides a deep learning-based identity card quality inspection method for securities account opening. A computer quickly judges whether an identity card image is reflective or blurred, so that unqualified images can be fed back and normal images passed on to account opening steps such as identity card quality inspection, public security verification and identity card image archiving, thereby reducing the complexity of the account opening business, improving processing efficiency and lowering the account opening loss rate. Subsequently, misclassified samples are re-labeled and fed back into the model for iterative learning, forming a closed loop so that the identity card quality inspection model can be continuously optimized and repetitive manual work can eventually be replaced intelligently.

A first aspect.

Referring to fig. 1 to 4, an embodiment of the present invention provides a method for quality inspection of an identity card based on deep learning, including:

S10, acquiring a first image; wherein the first image comprises an identity card image and a background image.

In a specific embodiment, a computer acquires a first image, where the first image is an identification card picture taken by a user through a shooting device, and the identification card picture includes an identification card image and a background image.

S20, preprocessing the first image to obtain a second image; wherein the second image only includes an identification card image.

Referring to fig. 2, in one embodiment, the preprocessing includes:

S21, processing the first image through a Gaussian blur algorithm to obtain a first characteristic image.

In one embodiment, Gaussian blur, also known as Gaussian smoothing, is typically used to reduce image noise and reduce the level of detail. The visual effect of this blurring technique resembles viewing the image through a translucent screen, which is distinctly different from the bokeh produced by an out-of-focus lens or the shadow of an object under ordinary illumination. The first image is processed through the Gaussian blur algorithm to obtain a blurred image, namely the first characteristic image.
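As an illustration of step S21, the sketch below builds a normalized 1-D Gaussian kernel and applies it along rows and then columns, relying on the fact that a 2-D Gaussian is separable. It is a pure-Python sketch on a single-channel image stored as a list of rows. The radius, the sigma and the border clamping are assumptions, since the text does not specify the blur parameters.

```python
import math


def gaussian_kernel_1d(radius: int, sigma: float) -> list:
    """Sampled Gaussian weights, normalized so that they sum to 1."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, radius=1, sigma=1.0):
    """Separable Gaussian blur: horizontal pass, then vertical pass."""
    kernel = gaussian_kernel_1d(radius, sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))
```

A flat region is left unchanged because the weights sum to 1, while an isolated bright pixel is spread over its neighbours, which is the noise-reducing behaviour described above.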

S22, carrying out graying and binarization processing on the first characteristic image to obtain a second characteristic image.

In one embodiment, the gray scale is a quantization of the brightness variation of the image, which is used to represent the lightness and darkness of the brightness, and is usually quantized to 256 gray scales, i.e. the gray scale ranges from 0 to 255, and the variation of 0 to 255 represents the lightness from dark to light, and corresponds to the color in the image from black to white. The gray scale range of the image can also be represented as 0-1, with 0 representing black, 1 representing white, and values between 0-1 representing gray from dark to light.

In a specific embodiment, binarization sets the gray value of every pixel in the pixel matrix of the image to either 0 (black) or 255 (white), so that the whole image contains only black and white. Gray values in the grayed image range from 0 to 255, whereas gray values in the binarized image are either 0 or 255.
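Step S22 can be sketched as two small functions. The graying step below uses the common ITU-R BT.601 luma weights and the binarization step uses a fixed threshold of 128. Both choices are assumptions for illustration, because the text states neither the graying formula nor how the threshold is chosen.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to gray levels in 0..255."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b))
             for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Map every gray level to 0 (black) or 255 (white)."""
    return [[255 if v >= threshold else 0 for v in row]
            for row in gray_image]
```

In practice an adaptive threshold may separate the card from an unevenly lit background more reliably than a fixed one. The fixed value here only keeps the example short.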

S23, processing the second characteristic image through an Isotropic Sobel operator to separate the identity card image and the background image to obtain a second image.

In a specific embodiment, the Isotropic Sobel operator is mainly used for edge detection. Technically, it is a discrete difference operator that computes an approximation of the gradient of the image brightness function. The Sobel operator is a typical edge detection operator based on the first derivative; because it incorporates an operation similar to local averaging, it has a smoothing effect on noise and suppresses the influence of noise well.
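The sketch below shows the isotropic Sobel kernels, which use a weight of √2 where the ordinary Sobel kernels use 2, and computes the gradient magnitude of a single-channel image. The text does not say how the card is cut out once edges are found. The `bounding_box` helper, which takes the smallest box containing all strong edges, is therefore an assumption, and its name and threshold are hypothetical.

```python
import math

R2 = math.sqrt(2.0)
# Isotropic Sobel kernels for the horizontal and vertical derivatives.
KX = [[-1.0, 0.0, 1.0], [-R2, 0.0, R2], [-1.0, 0.0, 1.0]]
KY = [[-1.0, -R2, -1.0], [0.0, 0.0, 0.0], [1.0, R2, 1.0]]


def gradient_magnitude(image):
    """Gradient magnitude per pixel; the one-pixel border is left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    v = image[y + j - 1][x + i - 1]
                    gx += KX[j][i] * v
                    gy += KY[j][i] * v
            out[y][x] = math.hypot(gx, gy)
    return out


def bounding_box(edges, threshold):
    """(top, left, bottom, right) of all pixels above threshold, or None."""
    points = [(y, x) for y, row in enumerate(edges)
              for x, v in enumerate(row) if v > threshold]
    if not points:
        return None
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    return min(ys), min(xs), max(ys), max(xs)
```

On a binarized image the gradient is zero inside flat regions and large along the card outline, so the box found from the strong edges gives a crop region for the second image.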

S30, inputting the second image into a set reflection quality inspection model for reflection quality inspection to obtain a third image; the set reflection quality inspection model is obtained according to the basic structure of the SqueezeNet.

Referring to fig. 3, in one embodiment, step S30 includes:

S31, inputting the identity card image marked as passing light reflection audit and the identity card image marked as failing light reflection audit as light reflection positive sample data and light reflection negative sample data to the light reflection quality inspection model respectively; wherein the quantity of the light reflection positive sample data is the same as that of the light reflection negative sample data.

S32, training a light reflection quality inspection model according to the light reflection positive sample data and the light reflection negative sample data based on a caffe deep learning framework.
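The requirement that positive and negative samples be equal in number can be met in several ways. The sketch below uses the simplest one, randomly downsampling the larger class. That choice is an assumption, since the text only states that the two counts are the same. The function name, the label encoding (1 for pass, 0 for fail) and the file names in the usage note are hypothetical.

```python
import random


def balance_samples(positive, negative, seed=0):
    """Return a shuffled list of (sample, label) with equal class counts."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    n = min(len(positive), len(negative))
    pos = rng.sample(positive, n)  # downsample the larger class
    neg = rng.sample(negative, n)
    labeled = [(p, 1) for p in pos] + [(q, 0) for q in neg]
    rng.shuffle(labeled)
    return labeled
```

For example, `balance_samples(["pass_0.jpg", "pass_1.jpg", "pass_2.jpg"], ["fail_0.jpg"])` yields two labeled entries, one from each class.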

In a specific embodiment, we select a caffe deep learning framework for model construction and training. Following the basic structure of the SqueezeNet described in the SqueezeNet paper, the network parameters are entered in the prototxt file that defines the network structure, and the training parameters are entered in the prototxt file that defines the training settings. Finally, the training file and the network file are invoked to train an optimal model (denoted model one) and obtain its parameter information.

The preliminary model provides a certain classification accuracy but still falls short of the business go-live target: it classifies positive samples and negative samples with similar accuracy, whereas the go-live target places more weight on recall of normal samples. The following optimization is therefore performed.
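Because the go-live target is stated in terms of recall on normal samples rather than overall accuracy, it helps to measure recall separately for each class. A minimal sketch, with a hypothetical function name and the label encoding 1 for normal (pass) and 0 for defective:

```python
def recall_per_class(labels, predictions):
    """Fraction of each true class that the model predicted correctly."""
    recalls = {}
    for cls in set(labels):
        total = sum(1 for y in labels if y == cls)
        hit = sum(1 for y, p in zip(labels, predictions)
                  if y == cls and p == cls)
        recalls[cls] = hit / total
    return recalls
```

Two models with the same overall accuracy can differ sharply on these per-class figures, which is why the text treats similar accuracy on both classes as insufficient.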

S40, inputting the third image into a set fuzzy quality inspection model for fuzzy quality inspection to obtain a fourth image; and the set fuzzy quality inspection model is obtained according to the basic structure of the SqueezeNet.

Referring to fig. 4, in one embodiment, step S40 includes:

S41, inputting the identity card image marked as fuzzy audit pass and the identity card image marked as fuzzy audit fail as fuzzy positive sample data and fuzzy negative sample data to the fuzzy quality inspection model respectively; wherein the number of the fuzzy positive sample data is the same as the number of the fuzzy negative sample data.

S42, training a fuzzy quality inspection model according to the fuzzy positive sample data and the fuzzy negative sample data based on a caffe deep learning framework.

In a specific embodiment, the basic structure of the SqueezeNet includes:

In a specific embodiment, only the SqueezeNet with the basic structure was used when the model was first built. It enriches features only through convolution kernels of different sizes and a deeper network, and does not fully account for the complex ways in which reflection occludes characters, the Great Wall watermark, or recognition speed, so the accuracy of reflection and fuzzy judgment was not high. The network structure was therefore adjusted to improve model one, specifically: changing the 3 × 3 convolution kernel in each fire module into a 3 × 3-2 hole (dilated) convolution kernel to enlarge the receptive field and enrich the feature information; changing the last ordinary max pooling layer into a spatial pyramid pooling (SPP) structure to better merge feature information of different scales in the last layers; connecting the feature maps of the first and last layers so that the last layers of the network are provided with shallow texture detail features, namely character strokes and edge features; constraining the weights with the L1 norm so that most weights become 0, which improves the inference speed of the model; and adjusting the positions of the intermediate max pooling layers and the number of convolution kernels based on experience, placing max pooling layers after conv1, fire3 and fire5 instead, so that the last consecutive convolution layers retain more feature information. The model was then retrained and its effect verified.
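Spatial pyramid pooling replaces a single pooling window with several grids of different coarseness and concatenates the results. A minimal pure-Python sketch for one feature-map channel is shown below. The pyramid levels (1, 2, 4) are an assumption, since the text does not give the levels used, and the function name is hypothetical.

```python
import math


def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool the map over an n x n grid for each level and concatenate."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for by in range(n):
            # Bin edges: floor for the start, ceil for the end, so bins
            # cover the whole map and are never empty.
            y0, y1 = (by * h) // n, math.ceil((by + 1) * h / n)
            for bx in range(n):
                x0, x1 = (bx * w) // n, math.ceil((bx + 1) * w / n)
                pooled.append(max(feature_map[y][x]
                                  for y in range(y0, y1)
                                  for x in range(x0, x1)))
    return pooled
```

The output length depends only on the levels (1 + 4 + 16 = 21 values per channel here) and not on the input size. The coarse level summarizes the whole map while the finer levels keep local detail, which is how the structure merges information at different scales.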

The model results improved considerably over the initial version but still did not reach the go-live target. Analysis of the classification results showed that, although the samples are materials actually submitted by clients, the sample volume is large and the audit and labeling results were produced in batches by different auditors, whose image audit standards and audit quality may differ. Optimization was therefore considered from the perspective of the samples.

The invention provides an identity card quality inspection method based on deep learning, which solves the problem that reflection over characters in key areas of the identity card and the reflection-like Great Wall watermark cannot be identified, and provides a relatively accurate identity card quality inspection model. Subsequently, misclassified samples are re-labeled and fed back into the model for iterative learning, forming a closed loop so that the identity card quality inspection model can be continuously optimized and repetitive manual work can eventually be replaced intelligently.

Referring to fig. 5, a method for quality inspection of an identity card based on deep learning according to a first embodiment of the present invention includes:

and collecting the identity card picture, wherein the part mainly collects the identity card picture data uploaded by the client in the service system, associates the corresponding auditing result and distinguishes the auditing result into approved auditing, reflected light and fuzzy type.

In a specific embodiment, the historical account-opening audit records are analyzed. For records that failed the audit, the rejection reason is extracted and classified; a certain proportion of the records that passed the audit is then randomly sampled for the subsequent training of the reflection and fuzzy quality inspection models; finally, the relevant identity card pictures are collated.

Preprocessing identity card pictures: this part mainly rotates and crops the identity card pictures to remove inconsistencies in sample size, background and the like caused by differing shooting conditions.

In a specific embodiment, because clients use different devices, shooting modes and backgrounds when photographing their identity cards, the background, size and the like of the pictures need to be preprocessed; the original pictures are mainly rotated and cropped to output a uniform format.

Labeling identity card pictures: the pictures are first labeled based on the historical auditing results, and the classification results are then manually corrected after the initial training of the model.

In one embodiment, there are no clear and strict standards or rules for whether an identity card image is qualified; for example, the influence of reflection and blur on character recognition cannot be quantified. Although real samples and auditing results are available, the sample volume is large and different auditors audited and labeled in batches, so the image auditing standards and auditing quality of different auditors may differ.

To solve these problems, in a specific implementation the identity card pictures are labeled based on the historical auditing results, the corresponding reflection and fuzzy quality inspection models are preliminarily trained on these labels, and samples whose model classification result is inconsistent with the real situation are then manually corrected. In subsequent model iterations, the process can return to this step for re-checking and re-labeling according to the model effect.
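As an illustrative, non-limiting sketch of this correction step, the following Python code collects the samples whose model prediction disagrees with the historical audit label so that they can be handed to a reviewer. The function names, the label encoding (1 = rejected for the defect, 0 = passed) and the 0.5 threshold are assumptions made for the example and are not specified by the embodiment.

```python
from typing import Callable, Dict, List, Tuple


def find_disagreements(
    labels: Dict[str, int],
    predict: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, int, float]]:
    """Return (image_id, historical_label, score) where model and label disagree.

    `predict` is a hypothetical callable returning the defect probability.
    """
    flagged = []
    for image_id, label in labels.items():
        score = predict(image_id)
        predicted = 1 if score >= threshold else 0
        if predicted != label:
            flagged.append((image_id, label, score))
    # Most confident disagreements first: likeliest labeling errors.
    flagged.sort(key=lambda item: abs(item[2] - threshold), reverse=True)
    return flagged


def apply_corrections(labels: Dict[str, int], corrections: Dict[str, int]) -> Dict[str, int]:
    """Return a new label table with the reviewer's corrections applied."""
    updated = dict(labels)
    updated.update(corrections)
    return updated


# Toy data standing in for audit labels and model scores.
historical = {"a.jpg": 1, "b.jpg": 0, "c.jpg": 0, "d.jpg": 1}
toy_scores = {"a.jpg": 0.9, "b.jpg": 0.95, "c.jpg": 0.1, "d.jpg": 0.4}
to_review = find_disagreements(historical, toy_scores.get)
corrected = apply_corrections(historical, {"b.jpg": 1})
```

The corrected label table would then be used for the next round of training, and the step repeated as long as the model effect calls for it.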

Training the identity card reflection quality inspection model: this part mainly trains a model that judges whether an identity card picture is reflective. It is essentially an image classification model whose structure is improved from the basic SqueezeNet structure. Samples that passed the audit and samples rejected for reflection are used, divided into a training set and a test set for training and model evaluation, and the final output is a reflection quality inspection model whose indexes meet expectations.

In a specific embodiment, this part likewise trains the reflection quality inspection model as an image classification model improved from the basic SqueezeNet structure, using samples that passed the audit and samples rejected for reflection, divided into a training set and a test set for training and model evaluation, until the output indexes of the reflection quality inspection model meet expectations.

The main difficulties in judging reflection in identity card images are:

1. the reflective area varies in size, so the network needs receptive fields of different scales to capture reflection anomalies;

2. the identity card quality inspection model is part of the intelligent account-opening business and must inspect identity card pictures in real time; the business requires the reflection judgment for each image to complete within 1 s, so the network structure needs to be compressed and accelerated;

3. the boundary at which a reflection occluding characters begins to affect character recognition is abstract; different combinations of reflection position, size and degree affect character recognition differently, so the deep learning network needs a stronger capability for fusing features across levels;

4. the front of the identity card carries a Great Wall watermark that looks similar to a reflection and, like a reflection, is bright enough to affect character recognition; the network needs to learn the high-level morphological features of the Great Wall watermark and distinguish the watermark from reflection.

In view of the above problems, in a specific implementation we first train the quality inspection model on the basic SqueezeNet structure and, after analyzing the training results, optimize the fire module of SqueezeNet with hole (dilated) convolution. Hole convolution addresses the loss of resolution and information caused by pooling: it enlarges the receptive field by inserting holes (gaps) between the elements of the convolution kernel. For example, a 3 × 3 convolution with dilation rate 2 spans a 5 × 5 window (a 7 × 7 receptive field once stacked on an ordinary 3 × 3 convolution) with no change in the number of parameters or the amount of computation, so downsampling is no longer needed to enlarge the receptive field. Replacing one 3 × 3 convolution in the fire module of SqueezeNet with a 3 × 3 hole convolution with dilation rate 2 enriches the receptive field of the convolutional layer, concatenates features of different scales, and further enriches the feature information.

Training the identity card fuzzy quality inspection model: this part mainly trains a model that judges whether an identity card picture is blurred. It is essentially an image classification model whose structure is improved from the basic SqueezeNet structure. Samples that passed the audit and samples rejected for blur are used, divided into a training set and a test set for training and model evaluation, and the final output is a fuzzy quality inspection model whose indexes meet expectations.

In a specific embodiment, this part likewise trains the fuzzy quality inspection model as an image classification model improved from the basic SqueezeNet structure, using samples that passed the audit and samples rejected for blur, divided into a training set and a test set for training and model evaluation, until the output indexes of the fuzzy quality inspection model meet expectations.

For the multi-label application requirement there are two solutions: the first is to train a single multi-label deep learning network; the second is to train one classifier per label and then fuse the results of the classifiers at the score level or the decision level to obtain the final classification result.

In the actual project the second scheme is adopted: separate networks are trained for illumination (reflection) and for blur, and the classification result is then obtained by decision-level fusion. The advantages are:

1) Obtaining both the illumination label and the blur label for the same identity card image requires a great deal of labeling work; the second scheme saves much of this labeling workload.

2) As the business develops, more quality inspection items are planned, such as whether key text information is occluded and whether the second-generation identity card is an original or a copied image. The functions of the quality inspection algorithm can be extended conveniently by simply training a classifier for each new inspection item.
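The decision-level fusion of independently trained classifiers can be sketched as follows. This is a minimal illustration under stated assumptions: each classifier is a hypothetical callable returning a defect probability, the per-item thresholds are placeholders, and an image passes only when no classifier reports a defect. The embodiment does not specify the fusion rule in this detail.

```python
from typing import Callable, Dict, List

Classifier = Callable[[dict], float]  # returns the probability of the defect


def inspect(
    image: dict,
    classifiers: Dict[str, Classifier],
    thresholds: Dict[str, float],
) -> dict:
    """Run every registered classifier and fuse their decisions."""
    scores = {name: clf(image) for name, clf in classifiers.items()}
    defects: List[str] = [
        name for name, score in scores.items()
        if score >= thresholds.get(name, 0.5)
    ]
    return {"passed": not defects, "defects": defects, "scores": scores}


# Stub classifiers reading precomputed scores from a toy "image" record.
classifiers = {
    "reflection": lambda img: img["glare"],
    "blur": lambda img: img["blur"],
}
thresholds = {"reflection": 0.5, "blur": 0.5}
result = inspect({"glare": 0.8, "blur": 0.2, "occluded": 0.9}, classifiers, thresholds)

# A new inspection item is added by registering one more classifier.
classifiers["occlusion"] = lambda img: img["occluded"]
extended = inspect({"glare": 0.1, "blur": 0.2, "occluded": 0.9}, classifiers, thresholds)
```

Because each inspection item is a separate entry in the table, adding an item does not require re-labeling or retraining the existing classifiers.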

The implementation process and the model used for training the identity card fuzzy model are the same as for the identity card reflection model.

Index testing of the identity card quality inspection models: this part mainly performs batch index tests on the reflection and fuzzy quality inspection models trained by module 4 and module 5, judges whether the models reach the standard, and can return samples with inconsistent recognition results to the identity card picture labeling module for manual correction.

In a specific embodiment, image classification by a deep learning model is at present essentially probabilistic; it cannot be completely accurate and cannot be accepted in the same way as the development of a system function. The way the quality inspection model is put into production therefore needs to be agreed in advance, and quantifiable acceptance criteria need to be determined. In the specific implementation, the model is verified in the following two ways:

1. sampling and extracting a batch of samples from each category, and performing overall index evaluation;

2. determining boundary samples for each category, manually determining quality inspection results, and carrying out boundary detection on the model;

Whether the model reaches the standard is judged from the two test results, and the process then either returns to the identity card picture labeling module for label checking or moves to the corresponding model training module for iterative training.
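The two verification modes can be sketched as a simple acceptance check. The code below is illustrative only: the per-class recall as the overall index, the accuracy on manually judged boundary samples, and both minimum values are assumptions, since the embodiment does not state the concrete acceptance criteria.

```python
from typing import Dict, Sequence


def recall_per_class(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Recall of each class: fraction of its samples predicted as that class."""
    recalls = {}
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls[cls] = hits / len(idx)
    return recalls


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def model_reaches_standard(
    y_true, y_pred, boundary_true, boundary_pred,
    min_recall: float, min_boundary_accuracy: float,
) -> bool:
    """Pass only if both the sampled-batch test and the boundary test pass."""
    overall_ok = all(r >= min_recall for r in recall_per_class(y_true, y_pred).values())
    boundary_ok = accuracy(boundary_true, boundary_pred) >= min_boundary_accuracy
    return overall_ok and boundary_ok


# Toy results: 1 = normal (positive sample), 0 = defective (negative sample).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
boundary_true = [1, 0, 1, 0]
boundary_pred = [1, 0, 0, 0]
recalls = recall_per_class(y_true, y_pred)
```

If the check fails, the disagreeing samples go back to labeling or the model goes back to training, as described above.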

In another embodiment, the invention provides a deep learning-based identity card quality inspection model: a computer quickly judges whether an identity card image is reflective or blurred, feeds back the information for unqualified images, and passes normal images on to the account-opening steps such as identity card recognition, public security verification and identity card image archiving, thereby reducing the complexity of the account-opening business, improving processing efficiency and reducing the account-opening loss rate.

Construction and difficulty of identity card quality inspection model

The core of establishing the identity card quality inspection model is to train the quality inspection model to judge the quality of the identity card image uploaded when a customer opens an account on line based on the existing customer material and the auditing result.

This core raises two major problems. First, there are no clear and strict standards or rules for whether an identity card image is qualified; for example, the severity of the influence of reflection and blur on character recognition cannot be quantified. For this reason, the auditors' manual labels in the historical account-opening data, which record whether a material is qualified, are used as the classification basis, and a self-learning neural network method, namely deep learning, is used to classify whether an image is reflective or blurred. This leads to the second problem: although the samples are materials submitted by real clients, the sample volume is large and the results were audited and labeled in batches by different auditors, whose auditing standards and auditing quality may differ. To address this, on the one hand the algorithm is continuously optimized according to experimental results, including parameter tuning and the introduction of relevant processing methods or network structures; on the other hand, the labels of abnormal samples are corrected according to the initial model results and the model is retrained. This cycle is repeated until the judgment standard becomes consistent.

Technical principle

Judging reflection and blur in the quality inspection of second-generation identity cards is a very challenging problem. A user may photograph the identity card under any lighting conditions, and the Great Wall watermark located between the name and the gender on the front of the second-generation identity card looks similar to a reflection, so the traditional image processing approach of separating reflective and non-reflective areas by brightness is unsuitable here; for image blur, the standard likewise cannot be quantified. Deep learning is therefore a better choice for judging reflection and blur. Although the blur judgment and the reflection judgment of the second-generation identity card are similar, in the actual project the data conditions were not met and Caffe does not support multiple labels, so two separate deep learning networks were finally trained, one for illumination (reflection) classification and one for blur classification. The reflection model is taken as the example below.

Based on the business requirement of extracting the client identity data needed for account opening, the model defines reflection as follows: an image is judged to be reflective only when a reflection covers characters in a key area so that the identity information cannot be clearly recognized; an image whose reflection does not affect character recognition is uniformly judged to be normal; and a character recognition failure caused by the Great Wall watermark on the front of the second-generation identity card must not be judged as reflection. A blurred image is defined as one in which blur over all or part of the image affects character recognition; an image that is only slightly blurred and does not affect character recognition must be judged as normal.

Taking the reflection model as an example, the difficulties that distinguish this model from ordinary certificate image recognition are analyzed first, and the deep learning techniques are then selected according to these difficulties.

The reflective area varies in size, so the network needs receptive fields of different scales to capture reflection anomalies.

Because the identity card quality inspection model is part of the intelligent account-opening business, the business requires the reflection judgment for each image to complete within 100 ms, so the network structure needs to be compressed and accelerated.

The boundary at which a reflection occluding characters begins to affect character recognition is abstract; different combinations of reflection position, size and degree affect character recognition differently, so the deep learning network needs a stronger capability for fusing features across levels.

The front of the identity card carries a Great Wall watermark that looks similar to a reflection and, like a reflection, is bright enough to affect character recognition; the network needs to learn the high-level morphological features of the Great Wall watermark and distinguish the watermark from reflection.

Image recognition typically uses convolutional neural networks (CNNs). An artificial neural network (ANN) model typically comprises an input layer, hidden layers and an output layer. The hidden layers of a CNN are of several types, such as convolutional layers, activation layers, pooling layers and fully connected layers, each playing a different role. A convolutional layer uses convolution kernels to extract local features from the image and then combines the local results to obtain global information, while keeping the number of parameters as small as possible; this is similar to the human visual system taking in the whole scene in front of the eyes and feeding it back to the brain. An activation layer applies a nonlinear transformation to the output of the convolutional layer. A pooling layer performs downsampling: it condenses a large amount of information to reduce the feature dimension and compress parameters and data, and it also reduces overfitting and so improves the generalization of the model, much as the visual system picks out separate objects in a scene from differences in distance and three-dimensional shape. The fully connected layer performs the learning: from the output of the preceding layer it determines the important classification features and their probabilities and classifies the result; the output classification is compared with the true value, and the error is propagated backward to adjust the weights (the parameters of the convolution kernels).

As image processing requirements grow, the number of network layers keeps increasing in order to improve network performance, so network structures become complex and carry many parameters. Deep network structures therefore have serious efficiency problems, including the storage of the model and the speed of model prediction. The solutions fall into two categories: lightweight models and model compression. A lightweight model is a more efficient way of computing the convolutions, designed to reduce the network parameters while keeping good network performance; an example is the SqueezeNet model (from the paper "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size"). Model compression is applied to a trained model, for example by using L1 pruning to constrain the weights and reduce the network parameters, thereby addressing the storage and speed problems, or by optimizing the pooling layers and the like.

What distinguishes SqueezeNet from a conventional convolutional neural network is that ordinary convolutional layers are replaced by fire modules. A conventional fire module contains two convolutional layers: the squeeze layer and the expand layer. The squeeze layer convolves the feature maps output by the previous layer with s1 1 × 1 convolution kernels instead of the usual 3 × 3 kernels, and the number of kernels is smaller than the number of feature maps of the previous layer, which reduces the feature dimension (compresses the number of feature map channels). The expand layer convolves with e1 1 × 1 kernels and e3 3 × 3 kernels respectively, and the resulting feature maps are concatenated along the feature dimension (the number of channels), so that the feature maps are re-fused and re-expressed; the resolution of the feature maps is unchanged and only their dimension changes. The three hyper-parameters (numbers of convolution kernels) s1, e1 and e3 of the fire modules increase gradually and satisfy 4·s1 = e1 = e3, so the squeeze layer also reduces the number of input channels received from the previous layer. Because reflections have no fixed shape or size, convolution kernels of different sizes can be used to concatenate features of different scales, which greatly enriches the feature information.
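To make the channel compression concrete, the following sketch counts the parameters of one fire module and compares them with a single ordinary 3 × 3 convolution producing the same number of output channels. The channel numbers (96 input channels, s1 = 16, e1 = e3 = 64) are illustrative values chosen to satisfy 4·s1 = e1 = e3; they are not stated in this description.

```python
def conv_params(in_channels: int, out_channels: int, kernel: int) -> int:
    """Weights plus biases of a kernel x kernel convolution layer."""
    return in_channels * out_channels * kernel * kernel + out_channels


def fire_params(in_channels: int, s1: int, e1: int, e3: int):
    """Return (parameter count, output channels) of a fire module."""
    squeeze = conv_params(in_channels, s1, 1)        # 1x1 squeeze layer
    expand_1x1 = conv_params(s1, e1, 1)              # 1x1 branch of the expand layer
    expand_3x3 = conv_params(s1, e3, 3)              # 3x3 branch of the expand layer
    # The two expand branches are concatenated along the channel axis.
    return squeeze + expand_1x1 + expand_3x3, e1 + e3


fire_total, fire_out = fire_params(in_channels=96, s1=16, e1=64, e3=64)
plain_total = conv_params(in_channels=96, out_channels=fire_out, kernel=3)
print(fire_total, fire_out, plain_total)
```

With these example values the fire module needs roughly one ninth of the parameters of the plain 3 × 3 layer while producing the same number of output channels.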

The SqueezeNet proposed in the paper starts with an ordinary convolutional layer (conv1), follows it with 8 fire modules in series (fire2 to fire9), and ends with an ordinary convolutional layer (conv10).

Hole (dilated) convolution is a convolution method that addresses the loss of resolution and information caused by pooling. It enlarges the receptive field by inserting holes (gaps) between the elements of the convolution kernel. For example, a 3 × 3 convolution with dilation rate 2 spans a 5 × 5 window (a 7 × 7 receptive field once stacked on an ordinary 3 × 3 convolution) with no change in the number of parameters or the amount of computation, so downsampling is no longer needed to enlarge the receptive field. Replacing one 3 × 3 convolution in the fire module of SqueezeNet with a 3 × 3 hole convolution with dilation rate 2 enriches the receptive field of the convolutional layer, concatenates features of different scales, and further enriches the feature information.
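A minimal sketch of hole convolution on plain nested lists is given below for illustration. Under the usual definition, a k × k kernel with dilation rate d spans k + (k − 1)(d − 1) input pixels per side, so a 3 × 3 kernel with d = 2 covers a 5 × 5 window while still using 9 weights. The function names are assumptions, and the code uses no padding and stride 1 for simplicity.

```python
def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Side length of the input window covered by a dilated kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """Dilated cross-correlation, no padding, stride 1, on 2-D lists."""
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    height, width = len(image), len(image[0])
    output = []
    for y in range(height - span + 1):
        row = []
        for x in range(width - span + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    acc += kernel[i][j] * image[y + i * dilation][x + j * dilation]
            row.append(acc)
        output.append(row)
    return output


image = [[y * 5 + x for x in range(5)] for y in range(5)]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
dense = dilated_conv2d(image, ones, dilation=1)   # 3x3 output
holed = dilated_conv2d(image, ones, dilation=2)   # 1x1 output, 5x5 window
```

Both calls use the same nine weights; only the spacing of the sampled input pixels differs.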

Spatial pyramid pooling (SPP, from the paper "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition") is a method that downsamples a picture of arbitrary size into an output of fixed size that is then fed to the fully connected layer. The input is divided at three different scales to obtain 21 (16 + 4 + 1) blocks, and one feature is extracted from each block, giving a 21-dimensional feature vector; an image of any size is thus converted into a feature of fixed size. Each scale is one level of the pyramid; the levels are divided into 16, 4 and 1 blocks, corresponding to ws = 4, 2 and 1 windows per side, so that the size of each block in a level is (w/ws, h/ws), where w and h are the width and height of the original image. Using an SPP structure instead of max pooling after the last convolutional layer of the network allows feature information of different scales to be merged better.
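The pooling over 16 + 4 + 1 blocks can be sketched as follows for a single feature channel. This is an illustrative sketch, not the implementation used in the embodiment: the block boundaries use floor and ceiling division so that inputs whose sides are not multiples of 4 are still fully covered, and max pooling is assumed as the feature extracted from each block.

```python
def spatial_pyramid_pool(feature_map, levels=(4, 2, 1)):
    """Max-pool one 2-D feature map into sum(n*n for n in levels) values."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:                      # n windows per side at this level
        for i in range(n):
            y0 = (i * height) // n                    # floor
            y1 = ((i + 1) * height + n - 1) // n      # ceiling
            for j in range(n):
                x0 = (j * width) // n
                x1 = ((j + 1) * width + n - 1) // n
                pooled.append(max(
                    feature_map[y][x]
                    for y in range(y0, y1)
                    for x in range(x0, x1)
                ))
    return pooled


square = [[y * 8 + x for x in range(8)] for y in range(8)]     # 8 x 8 input
oblong = [[y * 7 + x for x in range(7)] for y in range(5)]     # 5 x 7 input
vec_square = spatial_pyramid_pool(square)
vec_oblong = spatial_pyramid_pool(oblong)
```

Both inputs yield a vector of the same length, which is what allows a fully connected layer to follow regardless of the input size.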

L1-regularized pruning removes parameters that have little influence on the model, so that most weights become 0. This reduces the number of model parameters without changing the model performance, and it speeds up computation to meet the business requirement.
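One common way in which an L1 penalty drives small weights to exactly zero is the soft-thresholding step shown below. This is offered only as a sketch of the general technique; the description does not state which pruning procedure or which penalty strength was used, so the function names and the value of `lam` are assumptions.

```python
def soft_threshold(weight: float, lam: float) -> float:
    """Shrink a weight toward zero by lam; weights within [-lam, lam] become 0."""
    if weight > lam:
        return weight - lam
    if weight < -lam:
        return weight + lam
    return 0.0


def prune(weights, lam):
    return [soft_threshold(w, lam) for w in weights]


def sparsity(weights) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


weights = [0.8, -0.05, 0.02, -0.6, 0.01, 0.03, -0.02, 0.3]
pruned = prune(weights, lam=0.1)
print(sparsity(weights), sparsity(pruned))
```

Zeroed weights can be skipped at inference time or stored in a sparse format, which is where the saving in storage and computation comes from.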

Because the ways in which reflections occlude characters are complex, the network needs to learn the detail features of the occluded characters, so its shallow features should be preserved. We concatenate the feature maps of the last layers with those of the earlier layers, so that the shallow texture detail features usually extracted by the early layers, namely character strokes and edges, are also available at the end of the network.
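The concatenation of shallow and deep feature maps can be illustrated as follows. Feature maps are represented as lists of channels, each channel a 2-D list. How the spatial sizes are matched is not specified in the description; the sketch assumes that the shallow maps are reduced by max pooling with an integer factor before being concatenated along the channel axis.

```python
def max_pool(channel, size):
    """Non-overlapping size x size max pooling of one 2-D channel."""
    height, width = len(channel), len(channel[0])
    return [
        [max(channel[y + dy][x + dx] for dy in range(size) for dx in range(size))
         for x in range(0, width - size + 1, size)]
        for y in range(0, height - size + 1, size)
    ]


def concat_shallow_deep(shallow, deep):
    """Downsample shallow channels to the deep resolution and concatenate."""
    factor = len(shallow[0]) // len(deep[0])
    resized = [max_pool(channel, factor) for channel in shallow]
    if (len(resized[0]), len(resized[0][0])) != (len(deep[0]), len(deep[0][0])):
        raise ValueError("shallow and deep feature maps cannot be aligned")
    return resized + deep   # channel-wise concatenation


shallow = [[[y * 4 + x for x in range(4)] for y in range(4)] for _ in range(2)]  # 2 x 4 x 4
deep = [[[0, 1], [2, 3]] for _ in range(3)]                                      # 3 x 2 x 2
fused = concat_shallow_deep(shallow, deep)
```

The fused tensor keeps the deep semantic channels and adds the stroke and edge channels from the early layers.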

Development process and effect:

1. Image input and preprocessing

Identity card images submitted for online account opening by the company's clients over the last three years are extracted and divided into two types, approved and rejected. Based on the rejection reasons, reflective (or blurred) images are selected as negative samples and normal images as positive samples. The final samples are drawn at random with a positive-to-negative ratio of 1:1 and randomly divided into a training set, a test set and a validation set. Because the shooting conditions differ, the size, background and the like of the samples are preprocessed.
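The 1:1 sampling and the random division can be sketched as follows. The 80/10/10 split, the fixed random seed and the file names are assumptions introduced for the example; the description does not give the split proportions.

```python
import random


def balanced_split(positives, negatives, percents=(80, 10, 10), seed=0):
    """Draw a 1:1 sample and split it into train, test and validation sets.

    Labels: 1 = positive sample (normal image), 0 = negative sample
    (reflective or blurred image).
    """
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    samples = [(path, 1) for path in rng.sample(positives, n)]
    samples += [(path, 0) for path in rng.sample(negatives, n)]
    rng.shuffle(samples)
    n_train = len(samples) * percents[0] // 100
    n_test = len(samples) * percents[1] // 100
    train = samples[:n_train]
    test = samples[n_train:n_train + n_test]
    validation = samples[n_train + n_test:]
    return train, test, validation


normal_images = [f"ok_{i}.jpg" for i in range(100)]       # hypothetical file names
reflective_images = [f"glare_{i}.jpg" for i in range(30)]
train, test, validation = balanced_split(normal_images, reflective_images)
```

The class with more samples is subsampled to the size of the smaller class, so the combined set is balanced before it is divided.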

2. Model training and validation

We chose the Caffe deep learning framework for model construction and training. Following the basic SqueezeNet structure in the SqueezeNet paper, the network parameters are entered in the prototxt file that defines the network structure, and the training parameters are entered in the prototxt file that defines the training settings. Finally, the training file and the network file are invoked to train the optimal model (denoted model one) and obtain its parameter information.

3. Improvement of model

1) Model two

As shown in fig. 6-7, the initial model used only the basic SqueezeNet structure, which enriches the features only through convolution kernels of different sizes and a deeper network. It did not fully account for the complexity of reflections occluding characters, the Great Wall watermark, or the recognition speed, so the accuracy of the reflection and blur judgments was not high. We therefore adjusted the network structure to improve model one. Specifically: the 3 × 3 convolution kernel in each fire module is changed to a 3 × 3 hole (dilated) convolution kernel with dilation rate 2, which enlarges the receptive field and enriches the feature information; the last ordinary max pooling layer is changed to a spatial pyramid pooling (SPP) structure, which better merges feature information of different scales in the last layers; the feature maps of the early layers and the last layers are concatenated, so that the last layers of the network also provide shallow texture detail features, namely character strokes and edges; the L1 norm is used to constrain the weights so that most weights become 0, which increases the inference speed of the model; and the positions of the intermediate max pooling layers and the numbers of convolution kernels are adjusted empirically, with max pooling placed after conv1, fire3 and fire5 instead, so that more feature information is retained in the final consecutive convolution layers. The model is then retrained and its effect verified.

2) Model three

The results of model two show that its judgment effect is limited and many samples are still misclassified. We therefore started from the samples and found that a likely cause of the misjudgments is the dependent variable of the samples, that is, the manual labels of whether an image is reflective or blurred follow inconsistent standards. This is one of the modeling difficulties mentioned above, and it makes the decision boundary for reflection and blur in the samples unclear.

The samples were therefore processed while keeping the SqueezeNet network structure of model two unchanged. On the one hand, business staff re-checked and corrected the misclassified samples; on the other hand, a variety of high-quality reflective and blurred samples were synthesized, which applies a unified judgment standard and at the same time increases the diversity of reflection and blur in the samples. After multiple rounds of iterative training, the target expected by the business for going online was finally reached.

A second aspect.

Referring to fig. 8-11, an embodiment of the invention provides a deep learning-based identity card quality inspection system, which includes:

a first image obtaining module 10, configured to obtain a first image; the first image comprises an identity card image and a background image.

A first image preprocessing module 20, configured to preprocess the first image to obtain a second image; wherein the second image only includes an identification card image.

Referring to fig. 7, in an embodiment, the first image preprocessing module 20 includes:

A Gaussian blur algorithm processing submodule 21, configured to process the first image through a Gaussian blur algorithm to obtain a first characteristic image.

A graying and binarization processing submodule 22, configured to perform graying and binarization processing on the first characteristic image to obtain a second characteristic image.

An image separation submodule 23, configured to process the second characteristic image through an Isotropic Sobel operator to separate the identity card image from the background image, so as to obtain the second image.
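The three submodules can be illustrated with the following sketch, which works on a small RGB image stored as nested lists. Several points are assumptions made for the example: the 3 × 3 Gaussian kernel, the fixed binarization threshold of 128, a card that is brighter than its background, and the use of the bounding box of the detected edge pixels as the crop. Rotation correction is not covered.

```python
import math

GAUSS = [[1 / 16, 2 / 16, 1 / 16], [2 / 16, 4 / 16, 2 / 16], [1 / 16, 2 / 16, 1 / 16]]
R2 = math.sqrt(2)
ISO_SOBEL_X = [[-1, 0, 1], [-R2, 0, R2], [-1, 0, 1]]     # Isotropic Sobel, horizontal
ISO_SOBEL_Y = [[-1, -R2, -1], [0, 0, 0], [1, R2, 1]]     # Isotropic Sobel, vertical


def filter3x3(img, kernel):
    """Apply a 3x3 kernel to a 2-D list, replicating the border pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    yy = min(max(y + i - 1, 0), h - 1)
                    xx = min(max(x + j - 1, 0), w - 1)
                    acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out


def crop_card(rgb, threshold=128, edge_min=1.0):
    """Blur, gray and binarize, detect edges, then crop to the card."""
    h, w = len(rgb), len(rgb[0])
    # Submodule 21: Gaussian blur, applied to each color channel.
    blurred = [filter3x3([[px[c] for px in row] for row in rgb], GAUSS) for c in range(3)]
    # Submodule 22: graying (luma weights) followed by binarization.
    binary = [[255 if (0.299 * blurred[0][y][x] + 0.587 * blurred[1][y][x]
                       + 0.114 * blurred[2][y][x]) >= threshold else 0
               for x in range(w)] for y in range(h)]
    # Submodule 23: Isotropic Sobel gradient magnitude on the binary image.
    gx = filter3x3(binary, ISO_SOBEL_X)
    gy = filter3x3(binary, ISO_SOBEL_Y)
    edge = [(y, x) for y in range(h) for x in range(w)
            if binary[y][x] and math.hypot(gx[y][x], gy[y][x]) >= edge_min]
    if not edge:
        return rgb                      # nothing detected: keep the input
    top, bottom = min(y for y, _ in edge), max(y for y, _ in edge)
    left, right = min(x for _, x in edge), max(x for _, x in edge)
    return [row[left:right + 1] for row in rgb[top:bottom + 1]]


# Synthetic 12 x 16 picture: bright card (rows 3-8, cols 4-11) on a dark background.
picture = [[(200, 200, 200) if 3 <= y <= 8 and 4 <= x <= 11 else (10, 10, 10)
            for x in range(16)] for y in range(12)]
card = crop_card(picture)
```

In practice an image library would replace the hand-written filters; the sketch only shows the order of the operations and the role of each submodule.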

The reflective quality inspection module 30 is configured to input the second image into a set reflection quality inspection model for reflection quality inspection to obtain a third image; wherein the set reflection quality inspection model is obtained according to the basic structure of the SqueezeNet.

Referring to fig. 8, in an embodiment, the reflective quality inspection module 30 includes:

A reflection sample input module 31, configured to input identity card images marked as having passed the reflection audit and identity card images marked as having failed the reflection audit into the reflection quality inspection model as reflection positive sample data and reflection negative sample data respectively, wherein the quantity of the reflection positive sample data is the same as that of the reflection negative sample data.

And the reflection quality inspection model training module 32 is used for training a reflection quality inspection model according to the reflection positive sample data and the reflection negative sample data based on a caffe deep learning framework.

The fuzzy quality inspection module 40 is used for inputting the third image into a set fuzzy quality inspection model for fuzzy quality inspection to obtain a fourth image; and the set fuzzy quality inspection model is obtained according to the basic structure of the SqueezeNet.

Referring to fig. 9, in an embodiment, the fuzzy quality inspection module 40 includes:

A fuzzy sample input module 41, configured to input identity card images marked as having passed the fuzzy audit and identity card images marked as having failed the fuzzy audit into the fuzzy quality inspection model as fuzzy positive sample data and fuzzy negative sample data respectively, wherein the quantity of the fuzzy positive sample data is the same as that of the fuzzy negative sample data.

And the fuzzy quality inspection model training module 42 is configured to perform training of a fuzzy quality inspection model according to the fuzzy positive sample data and the fuzzy negative sample data based on a caffe deep learning framework.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. An identity card quality inspection method based on deep learning is characterized by comprising the following steps:

2. The deep learning-based quality inspection method for identity cards as claimed in claim 1, wherein the set reflective quality inspection model is obtained according to the basic structure of SqueezeNet, and comprises:

3. The method as claimed in claim 1, wherein the step of setting the fuzzy quality inspection model according to the basic structure of SqueezeNet comprises:

4. The deep learning-based identity card quality inspection method according to claim 2 or 3, wherein the basic structure of the SqueezeNet comprises:

5. The method for quality control of ID card based on deep learning of claim 1, wherein the preprocessing the first image to obtain the second image comprises:

6. An identity card quality inspection system based on deep learning, comprising:

7. The deep learning-based identity card quality inspection system of claim 6, wherein the reflective quality inspection module comprises:

8. The deep learning-based identity card quality inspection system of claim 6, wherein the fuzzy quality inspection module comprises:

9. The deep learning-based quality inspection system for identity cards of claim 6, wherein the first image preprocessing module comprises:

10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method for quality inspection of identity cards based on deep learning according to any one of claims 1 to 5.