CN111062338A - Certificate portrait consistency comparison method and system - Google Patents
- Publication number
- CN111062338A (application CN201911319574.4A)
- Authority
- CN
- China
- Prior art keywords
- portrait
- license
- comparison model
- deep learning
- different
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
Abstract
The invention relates to a method and a system for comparing the consistency of certificate portraits. Portraits are cropped via face feature recognition to build a sample set; a deep-learning certificate portrait comparison model is built and trained; a threshold that correctly separates same-type samples from different-type samples is calculated; the two certificate portraits to be checked are fed into the model to obtain two feature vectors; the cosine distance between the two vectors is computed and the threshold is subtracted from it to decide whether the two portraits are consistent. Beneficial effects of the invention: compared with traditional manual comparison, the method expresses the image characteristics of a portrait more accurately, sensitively distinguishes clearly different clothing and hairstyles, reliably matches the same portrait across images with different illumination, backgrounds and scales, and reduces interference from adverse conditions such as contamination and blur.
Description
Technical Field
The invention relates to a method and a system for comparing the consistency of certificate portraits, belonging to the field of image recognition.
Background
When a customer transacts business at a bank, the customer must provide the original identity card and a copy of it; the bank keeps the copy on file. When subsequent business is transacted, bank staff must photograph the customer's identity card and compare whether it and the certificate picture stored in the bank's electronic system show the same certificate.
In traditional banking, certificate image comparison is done by visual inspection by dedicated reviewers. Most banking business requires identity card review and verification, which consumes substantial human resources, and reviewer fatigue degrades comparison accuracy and slows customer service. For example, when the portrait on the identity card and the portrait on the filed copy are similar in hairstyle, clothing and facial shape, a reviewer may wrongly judge them to be the same certificate or require a second review to be sure. Conversely, when ambient light changes or the camera shakes, the captured identity card image may be blurred, distorted or contaminated, and after repeated comparisons the reviewer judges the certificates different and requires the photo to be retaken.
Disclosure of Invention
To solve these technical problems, the invention provides a method and a system for comparing the consistency of certificate portraits: a deep learning system is built, and a neural network audits certificate portraits automatically in place of manual review, with high accuracy and high speed.
The technical scheme of the invention is as follows:
a method for comparing the consistency of certificate portraits comprises the following steps:
s1: obtaining a picture containing a license portrait, wherein the picture containing the license portrait comprises: different characteristic images of the same certificate and images of different certificates, wherein the different characteristics are different in shooting parameters;
s2: identifying the portrait in the picture containing the certificate portrait, cutting the portrait according to a set size and storing the portrait as a sample file, marking the sample files belonging to the same certificate as the same type, and marking the sample files not belonging to the same certificate as different types;
s3: dividing all sample files into a training set and a testing set;
s4: building a deep learning portrait comparison model, the neural network backbone of which is a 50-layer residual convolutional network: a sample file is input and, after multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector feeds a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample files are of the same type or of different types; a loss function is attached to the fully connected layer, and minimizing it drives the model's output toward the optimal solution;
s5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the correct type is highest, yielding a certificate portrait comparison model that correctly expresses certificate portrait type characteristics;
s6: inputting the test-set sample files into the certificate portrait comparison model; pairing up the 512-dimensional feature vectors output by the model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, such that a pair scoring above the threshold is judged same-type and a pair scoring below it is judged different-type;
s7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting each into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when negative, inconsistent.
The step of S2 is specifically:
s21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes an outer bounding box around them; the key points are the two eye centers, the nose tip and the two mouth corners;
s22: expanding the key-point bounding box outwards by set width and height proportions to obtain four corner coordinates, then connecting the four corners and cropping to obtain the portrait, where the outward expansion proportions for width and height are 30% and 40% respectively.
In step S1, the shooting device obtains different-characteristic images of the same certificate by varying the shooting angle, the ambient light intensity and the shooting distance.
In step S4, the loss function is an ArcFace loss function, specifically:
L = -(1/N) · Σ_{i=1..N} log( e^{s·cos(θ_{yi} + m)} / ( e^{s·cos(θ_{yi} + m)} + Σ_{j≠yi} e^{s·cos θ_j} ) ), with cos θ_j = (W_j^T · X_i) / (||W_j|| ||X_i||),
where L is the loss value, N is the number of samples, i denotes the i-th sample, j denotes the j-th class, yi denotes the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, X is the input feature vector, θ is the angle between the input vector X and the weight W, and T denotes the vector transpose.
In step S6, the method for calculating the difference between the two eigenvectors in each combination is a cosine distance calculation method.
Technical scheme two
A certificate portrait consistency comparison system comprising a memory and a processor, the memory storing instructions adapted to be loaded by the processor and to perform the steps of:
s1: obtaining a picture containing a license portrait, wherein the picture containing the license portrait comprises: different characteristic images of the same certificate and images of different certificates, wherein the different characteristic images have different shooting parameters;
s2: identifying the portrait in the picture containing the certificate portrait, cutting the portrait according to a set size and storing the portrait as a sample file, marking the sample files belonging to the same certificate as the same type, and marking the sample files not belonging to the same certificate as different types;
s3: dividing all sample files into a training set and a testing set;
s4: building a deep learning portrait comparison model, the neural network backbone of which is a 50-layer residual convolutional network: a sample file is input and, after multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector feeds a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample files are of the same type or of different types; a loss function is attached to the fully connected layer, and minimizing it drives the model's output toward the optimal solution;
s5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the correct type is highest, yielding a certificate portrait comparison model that correctly expresses certificate portrait type characteristics;
s6: inputting the test-set sample files into the certificate portrait comparison model; pairing up the 512-dimensional feature vectors output by the model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, such that a pair scoring above the threshold is judged same-type and a pair scoring below it is judged different-type;
s7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting each into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when negative, inconsistent.
The step of S2 is specifically:
s21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes an outer bounding box around them; the key points are the two eye centers, the nose tip and the two mouth corners;
s22: expanding the key-point bounding box outwards by set width and height proportions to obtain four corner coordinates, then connecting the four corners and cropping to obtain the portrait, where the outward expansion proportions for width and height are 30% and 40% respectively.
In step S1, the shooting device obtains different-characteristic images of the same certificate by varying the shooting angle, the ambient light intensity and the shooting distance.
In the step S4, the loss function is an Arcface loss function, which specifically includes:
L = -(1/N) · Σ_{i=1..N} log( e^{s·cos(θ_{yi} + m)} / ( e^{s·cos(θ_{yi} + m)} + Σ_{j≠yi} e^{s·cos θ_j} ) ), with cos θ_j = (W_j^T · X_i) / (||W_j|| ||X_i||),
where L is the loss value, N is the number of samples, i denotes the i-th sample, j denotes the j-th class, yi denotes the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, X is the input feature vector, θ is the angle between the input vector X and the weight W, and T denotes the vector transpose.
In step S6, the method for calculating the difference between the two eigenvectors in each combination is a cosine distance calculation method.
The invention has the following beneficial effects:
1. The certificate portrait consistency comparison method and system realize face feature recognition through the MTCNN multi-task cascaded convolutional neural network, ensuring consistent portrait cropping positions;
2. The method and system build a neural network to extract and compare certificate portrait features, achieving high accuracy and high efficiency;
3. The constructed neural network adopts the ArcFace loss function, which directly targets the angular separability of features; the classification margin m is set manually to reduce the intra-class distance and enlarge the inter-class gap, so the model fits better and is more accurate;
4. The optimal threshold is obtained by the cosine distance calculation and the ROC curve algorithm, giving high accuracy.
Drawings
FIG. 1 is a flowchart of the certificate portrait consistency comparison method and system of the present invention;
FIG. 2 is a schematic diagram of MTCNN face key point detection in the certificate portrait consistency comparison method and system of the present invention;
FIG. 3 is a schematic diagram of portrait cropping in the certificate portrait consistency comparison method and system of the present invention;
FIG. 4 is a schematic diagram of the ROC curve algorithm of the certificate portrait consistency comparison method and system of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
Example one
As shown in fig. 1, a method for comparing the consistency of certificate portraits comprises the following steps:
s1: obtaining a picture containing a license portrait, wherein the picture containing the license portrait comprises: different characteristic images of the same certificate and images of different certificates, wherein the different characteristics are different in shooting parameters.
Certificate consistency comparison checks whether two certificates are exactly the same certificate, not merely whether they show the same person: two head shots of the same person with different hairstyles do not count as consistent. In the bank workflow the customer's identity card copy does not change; it is re-scanned and compared against the original image stored in the system. The samples obtained may therefore come from the same certificate photographed repeatedly with adjusted shooting parameters, or processed with scaling, cropping and similar image operations; these are labeled as one type, and samples from other certificates as other types, so that through training the neural network learns to recognize images as same-type or different-type.
In the step S1, the shooting device obtains different characteristic images of the same document by changing the shooting angle, the ambient light intensity and the shooting distance.
In addition to obtaining different images of the same document by changing the shooting conditions during shooting, the images can also be changed by subjecting the reference image of the same document to image processing, such as blurring, adding noise, warping, etc.
S2: and identifying the portrait in the picture containing the certificate portrait, cutting the portrait according to a set size and storing the portrait as a sample file, marking the sample files belonging to the same certificate as the same type, and marking the sample files not belonging to the same certificate as different types.
During training, the certificate images can be divided into several groups, each group having a reference certificate image together with same-type images of that certificate and different-type images of other certificates. For example, in the training group for certificate A, the several certificate photos of A are labeled same-type, while certificate photos of other certificates (B, C, D, and so on) are labeled different-type; likewise, in the training group for certificate B, the several certificate photos of B are labeled same-type and those of the other certificates different-type. Training with multiple such groups improves model accuracy.
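The grouping just described can be sketched minimally as follows; the certificate ids and file names are hypothetical, not disclosed in the patent:

```python
def label_samples(groups, anchor_id):
    """Build one training group: every cropped sample of the anchor
    certificate is labeled same-type (1), samples of all other
    certificates different-type (0).  `groups` maps a certificate id
    to its list of cropped sample files."""
    labeled = []
    for cert_id, files in groups.items():
        label = 1 if cert_id == anchor_id else 0
        labeled.extend((f, label) for f in files)
    return labeled

# Group A's training set: A's photos same-type, everyone else's different-type.
group_a = label_samples(
    {"A": ["a1.jpg", "a2.jpg"], "B": ["b1.jpg"], "C": ["c1.jpg"]}, "A")
```

Calling `label_samples` once per anchor certificate yields the multiple training groups the text mentions.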
S3: all sample files are divided into a training set and a test set.
In this embodiment, all sample files are divided into a training set and a test set at an 8:2 ratio, keeping the composition of the training and test data balanced.
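The 8:2 split can be sketched as follows; the sample file names are illustrative:

```python
import random

def split_samples(samples, train_ratio=0.8, seed=42):
    """Shuffle the sample files and split them into a training set and a
    test set at the stated 8:2 ratio; the fixed seed keeps the split
    repeatable across runs."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_samples([f"sample_{i:03d}.jpg" for i in range(100)])
```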
S4: building a deep learning portrait comparison model: the neural network backbone is a 50-layer residual convolutional network. A sample file is input and, after multiple convolution and pooling layers, a 512-dimensional feature vector is output. This vector feeds a fully connected layer that converts it into an array whose values represent the probabilities that the input sample files are same-type or different-type. A loss function is attached to the fully connected layer, and minimizing it drives the model's output toward the optimal solution.
The 512-dimensional vector characterizes the various features of an image and is the machine's own representation of that image; it can be understood as the unique ID of the current image. Training the neural network yields a model that best represents image features. The role of the fully connected layer is that, through continual training, its prediction accuracy rises, which in turn makes the 512-dimensional vector representing the image more and more accurate.
The fully connected layer converts the multi-dimensional vector output by convolution and pooling into an array and performs the prediction.
In the step S4, the loss function is an Arcface loss function, which specifically includes:
L = -(1/N) · Σ_{i=1..N} log( e^{s·cos(θ_{yi} + m)} / ( e^{s·cos(θ_{yi} + m)} + Σ_{j≠yi} e^{s·cos θ_j} ) ), with cos θ_j = (W_j^T · X_i) / (||W_j|| ||X_i||),
where L is the loss value, N is the number of samples, i denotes the i-th sample, j denotes the j-th class, yi denotes the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, X is the input feature vector, θ is the angle between the input vector X and the weight W, and T denotes the vector transpose.
A loss function maps the value of a random event (or its associated random variables) to a non-negative real number representing the "risk" or "loss" of that event. In applications the loss function is usually tied to an optimization problem as the learning criterion: the model is solved and evaluated by minimizing the loss. In deep learning, the loss value represents the gap between the convolutional network's output and the ground-truth labels. A commonly used loss is softmax, but softmax only learns features in Euclidean space: it considers only whether samples are correctly classified, ignores intra-class and inter-class distances, and cannot optimize features so that positive pairs get high similarity and negative pairs low similarity. This embodiment adopts the ArcFace loss, which directly targets the angular separability of features; the classification margin m is set manually to reduce intra-class distance and enlarge the inter-class gap, making the model fit better.
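A minimal NumPy sketch of the ArcFace computation described above; the shapes, variable names and hyper-parameter defaults (s=64, m=0.5) follow the common ArcFace formulation, not code disclosed in the patent:

```python
import numpy as np

def arcface_logits(X, W, labels, s=64.0, m=0.5):
    """ArcFace margin logits: L2-normalize features and class weights so
    their dot product is cos(theta), add the angular margin m to the
    target-class angle, then rescale every logit by s."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # (N, d) unit rows
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)   # (d, C) unit cols
    cos = np.clip(Xn @ Wn, -1.0, 1.0)                   # cos(theta_ij)
    rows = np.arange(len(labels))
    theta = np.arccos(cos[rows, labels])                # target-class angles
    cos[rows, labels] = np.cos(theta + m)               # apply margin to yi only
    return s * cos

def arcface_loss(logits, labels):
    """Cross-entropy over the margin-adjusted logits (the loss L above)."""
    z = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels])))
```

Only the target-class logit is shifted by the margin; all other class logits are the plain scaled cosines, which is what makes the loss enlarge the inter-class gap.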
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the correct type is highest, yielding a certificate portrait comparison model that correctly expresses certificate portrait type characteristics.
S6: inputting the test-set sample files into the certificate portrait comparison model; pairing up the 512-dimensional feature vectors output by the model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array.
Pairing up the 512-dimensional feature vectors output by the model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, such that a pair scoring above the threshold is judged same-type and a pair scoring below it is judged different-type.
In step S6, the method for calculating the difference between the two eigenvectors in each combination is a cosine distance calculation method.
The difference between two feature vectors can be computed with various metrics, such as the Euclidean distance, standardized Euclidean distance, Mahalanobis distance, cosine distance, Hamming distance and Manhattan distance. Among these, the cosine distance calculation is more accurate for 512-dimensional vectors and produces clearly separated results.
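A minimal sketch of the cosine calculation for two feature vectors (any dimension, including 512):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors: 1.0 for identical
    direction, near 0 for unrelated vectors, -1.0 for opposite."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Note that the score depends only on direction, not magnitude, which is why it tolerates scale changes between the two portrait images.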
Each of the two samples in a pair corresponds to a 512-dimensional output vector. All same-type cosine distances and all different-type cosine distances are computed, and the threshold is then derived algorithmically. Plotting all the values, one finds a point above which most pairs are same-type and below which most pairs are different-type; that point is the threshold. The actual ROC computation is not quite this simple, but it is an established mathematical method designed precisely for this purpose.
The ROC curve (receiver operating characteristic curve) is a comprehensive index reflecting the continuous variables of sensitivity and specificity, revealing their interrelation by a compositional method: a series of sensitivities and specificities are calculated by setting the continuous variable to a number of different cut-off values, and a curve is drawn with sensitivity as the ordinate and (1 - specificity) as the abscissa. The larger the area under the curve, the higher the diagnostic accuracy. On the ROC curve, the point closest to the top-left corner of the graph is the cut-off value with both high sensitivity and high specificity.
Substituting the cosine distances of the feature-vector pairs of all same-type and different-type samples into the ROC algorithm yields an ROC curve; as shown in FIG. 4, if the ordinate (true positive rate) of the point nearest the top-left corner is 0.87, the threshold is set to 0.87.
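The patent does not disclose the exact ROC procedure, so the sketch below uses one standard reading: sweep candidate thresholds over the observed similarity scores and maximize Youden's J (TPR - FPR), which corresponds to the ROC point nearest the top-left corner. The score arrays are illustrative:

```python
import numpy as np

def best_threshold(same_scores, diff_scores):
    """Pick the similarity threshold maximizing Youden's J = TPR - FPR.
    same_scores are pair similarities for same-type pairs, diff_scores
    for different-type pairs; higher score means more alike."""
    same = np.asarray(same_scores, dtype=float)
    diff = np.asarray(diff_scores, dtype=float)
    candidates = np.unique(np.concatenate([same, diff]))
    best_t, best_j = float(candidates[0]), -1.0
    for t in candidates:
        tpr = np.mean(same >= t)   # same-type pairs accepted
        fpr = np.mean(diff >= t)   # different-type pairs wrongly accepted
        j = tpr - fpr
        if j > best_j:
            best_j, best_t = j, float(t)
    return best_t
```

With well-separated score distributions the chosen threshold sits just below the lowest same-type score, accepting all same-type pairs while rejecting the different-type ones.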
S7: processing a pair of to-be-verified license portrait images through the step S2 to obtain normalized images, respectively inputting the normalized images into the license portrait comparison model, outputting 512-dimensional feature description vectors of two license portraits, calculating a difference value of the two feature vectors, subtracting the difference value from the threshold value obtained in the step S6, judging that the two license portraits are consistent when the result is positive, and judging that the two license portraits are inconsistent when the result is negative.
As shown in fig. 3, the step S2 specifically includes:
s21: detecting facial key points: the picture containing the license portrait is input into an MTCNN (Multi-Task Cascaded Convolutional Neural Network), which identifies 5 facial key points through convolution and pooling and establishes a bounding box around them, the key points being the two eyeballs, the nose tip and the two mouth corners of the face;
s22: the keypoint bounding box is expanded outward by set width and height ratios to obtain four corner coordinates, which are connected and cropped to obtain the portrait, the outward expansion ratios being 30% in width and 40% in height.
By locating the facial key points and expanding the box proportionally, the whole face image is obtained while the interference of redundant regions with subsequent recognition is reduced. Moreover, the output image has the same proportions regardless of the size of the original image.
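A sketch of the s22 expansion, under the assumption that the 30%/40% ratios are the total outward growth split evenly between the two sides (the text does not specify the per-side allocation, so the function name and that interpretation are ours):

```python
def expand_crop(box, img_w, img_h, w_ratio=0.30, h_ratio=0.40):
    """Expand a keypoint bounding box outward by the set width/height ratios
    (interpreted as the total added fraction, half on each side), clamping
    the four corner coordinates to the image borders."""
    x1, y1, x2, y2 = box
    dw = (x2 - x1) * w_ratio / 2.0
    dh = (y2 - y1) * h_ratio / 2.0
    return (max(0.0, x1 - dw), max(0.0, y1 - dh),
            min(float(img_w), x2 + dw), min(float(img_h), y2 + dh))

# A 100x100 keypoint box grows to 130x140 inside a 1000x1000 image.
corners = expand_crop((100, 100, 200, 200), 1000, 1000)
```

Because the expansion is proportional to the box, the cropped portrait keeps the same framing whatever the original image resolution, which matches the consistency property claimed above.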
In step S21, as shown in FIG. 2, the picture containing the license portrait is input into the MTCNN multitask convolutional neural network, which identifies the 5 facial key points through convolution and pooling and obtains the keypoint bounding box.
When MTCNN performs facial keypoint detection, the bounding box is drawn automatically from the lines connecting the key points.
In the method and system for comparing license portrait consistency described above, facial feature recognition is realized through the MTCNN multitask convolutional neural network, ensuring consistent portrait cropping positions; a neural network is built to extract and compare license portrait features with high accuracy and high efficiency; the network adopts the ArcFace loss function, which directly addresses the angular separability of the features and artificially sets a classification margin m to reduce the intra-class spread and enlarge the inter-class difference, so that the model fits better and is more accurate; and the optimal threshold is obtained by the cosine distance together with the ROC curve algorithm, also with high accuracy.
Example two
A certificate portrait consistency comparison system comprising a memory and a processor, the memory storing instructions adapted to be loaded by the processor and to perform the steps of:
s1: obtaining a picture containing a license portrait, wherein the picture containing the license portrait comprises: different characteristic images of the same certificate and images of different certificates, wherein the different characteristic images have different shooting parameters;
s2: identifying the portrait in the picture containing the certificate portrait, cutting the portrait according to a set size and storing the portrait as a sample file, marking the sample files belonging to the same certificate as the same type, and marking the sample files not belonging to the same certificate as different types;
s3: dividing all sample files into a training set and a testing set;
s4: building a deep learning portrait comparison model: the neural network backbone of the model adopts a 50-layer residual convolutional network; the sample file is input and, through the multilayer convolution and pooling of the network, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed into a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample files are of the same type or of different types; the network is provided with a loss function connected to the fully connected layer, and through the loss function operation the result of the deep learning portrait comparison model converges toward the optimal solution;
s5: inputting the sample files in the training set obtained in the step S3 into the deep learning portrait comparison model set up in the step S4 for training until the probability that the array prediction type output by the full connection layer is correct is the highest, and obtaining a license portrait comparison model capable of correctly expressing the license portrait type characteristics;
s6: inputting the sample files in the test set into the license portrait comparison model; combining, two by two, the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each combination, and recording the results as the same-type array;
combining, two by two, the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each combination, and recording the results as the different-type array;
and substituting the two arrays into the ROC curve algorithm to obtain a threshold value, two samples scoring above the threshold being of the same type and two samples scoring below it being of different types.
S7: processing a pair of to-be-verified license portrait images through the step S2 to obtain normalized images, respectively inputting the normalized images into the license portrait comparison model, outputting 512-dimensional feature description vectors of two license portraits, calculating a difference value of the two feature vectors, subtracting the difference value from the threshold value obtained in the step S6, judging that the two license portraits are consistent when the result is positive, and judging that the two license portraits are inconsistent when the result is negative.
The step of S2 is specifically:
s21: detecting facial key points: the picture containing the license portrait is input into an MTCNN (Multi-Task Cascaded Convolutional Neural Network), which identifies 5 facial key points through convolution and pooling and establishes a bounding box around them, the key points being the two eyeballs, the nose tip and the two mouth corners of the face;
s22: the keypoint bounding box is expanded outward by set width and height ratios to obtain four corner coordinates, which are connected and cropped to obtain the portrait, the outward expansion ratios being 30% in width and 40% in height.
In the step S1, the shooting device obtains different characteristic images of the same document by changing the shooting angle, the ambient light intensity and the shooting distance.
In the step S4, the loss function is the ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\ne y_i}e^{s\cos\theta_j}},\qquad \cos\theta_j = W_j^{T}x_i

where L is the loss value, N is the number of samples, i denotes the i-th sample, j denotes the j-th class, y_i denotes the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the included angle between the input vector x and the weight W, and T denotes the transpose of the vector.
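The ArcFace loss of step S4 can be sketched in NumPy as follows (forward computation only, no gradients; the function name and the hyper-parameter values used in the example are illustrative, not values fixed by the patent):

```python
import numpy as np

def arcface_loss(X, W, y, s=64.0, m=0.5):
    """ArcFace forward pass: normalize features and class weights so the raw
    logits equal cos(theta_j), add the angular margin m to the target-class
    angle only, scale by s, then take softmax cross-entropy."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)
    cos = np.clip(Xn @ Wn, -1.0, 1.0)            # cos(theta_j) = W_j^T x_i
    idx = np.arange(len(y))
    logits = s * cos
    logits[idx, y] = s * np.cos(np.arccos(cos[idx, y]) + m)  # margin on target
    logits -= logits.max(axis=1, keepdims=True)              # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[idx, y].mean())
```

Adding the margin m shrinks the target-class logit, so a well-classified sample still incurs extra loss; this is what forces the intra-class angles tighter and the inter-class angles wider, as described above.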
In step S6, the method for calculating the difference between the two eigenvectors in each combination is a cosine distance calculation method.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A method for comparing the consistency of a certificate portrait is characterized by comprising the following steps:
s1: obtaining a picture containing a license portrait, wherein the picture containing the license portrait comprises: different characteristic images of the same certificate and images of different certificates, wherein the different characteristics are different in shooting parameters;
s2: identifying the portrait in the picture containing the certificate portrait, cutting the portrait according to a set size and storing the portrait as a sample file, marking the sample files belonging to the same certificate as the same type, and marking the sample files not belonging to the same certificate as different types;
s3: dividing all sample files into a training set and a testing set;
s4: building a deep learning portrait comparison model, wherein a neural network backbone of the deep learning portrait comparison model adopts a 50-layer residual error convolution network, the sample file is input, a 512-dimensional characteristic vector is output through multilayer convolution and pooling of the convolution network, the 512-dimensional characteristic vector is accessed to a full connection layer, the full connection layer converts the 512-dimensional characteristic vector into an array, and the value of the array represents the probability that the input sample file is of the same type and different types, wherein the neural network is provided with a loss function connected with the full connection layer, and the operation result of the deep learning portrait comparison model tends to the optimal solution through the loss function operation;
s5: inputting the sample files in the training set obtained in the step S3 into the deep learning portrait comparison model set up in the step S4 for training until the probability that the array prediction type output by the full connection layer is correct is the highest, and obtaining a license portrait comparison model capable of correctly expressing the license portrait type characteristics;
s6: inputting the sample files in the test set into the license portrait comparison model, combining every two 512-dimensional feature vectors output by all similar sample files through the deep learning portrait comparison model, calculating the difference value of the two feature vectors in each combination, and recording as a similar array;
combining every two 512-dimensional feature vectors output by the deep learning portrait comparison model of all different types of sample files, calculating the difference value of the two feature vectors in each combination, and recording as a different array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold value, wherein two samples higher than the threshold value are similar samples, and two samples lower than the threshold value are different samples;
s7: processing a pair of to-be-verified license portrait images through the step S2 to obtain normalized images, respectively inputting the normalized images into the license portrait comparison model, outputting 512-dimensional feature description vectors of two license portraits, calculating a difference value of the two feature vectors, subtracting the difference value from the threshold value obtained in the step S6, judging that the two license portraits are consistent when the result is positive, and judging that the two license portraits are inconsistent when the result is negative.
2. The method for comparing the consistency of the license portrait according to claim 1, which is characterized in that: the step of S2 is specifically:
s21: detecting facial key points: the picture containing the license portrait is input into an MTCNN (Multi-Task Cascaded Convolutional Neural Network), which identifies 5 facial key points through convolution and pooling and establishes a bounding box around them, the key points being the two eyeballs, the nose tip and the two mouth corners of the face;
s22: the keypoint bounding box is expanded outward by set width and height ratios to obtain four corner coordinates, which are connected and cropped to obtain the portrait, the outward expansion ratios being 30% in width and 40% in height.
3. The method for comparing the consistency of the license portrait according to claim 1, which is characterized in that: in the step S1, the shooting device obtains different characteristic images of the same document by changing the shooting angle, the ambient light intensity and the shooting distance.
4. The method for comparing the consistency of the license portrait according to claim 1, which is characterized in that: in the step S4, the loss function is the ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\ne y_i}e^{s\cos\theta_j}},\qquad \cos\theta_j = W_j^{T}x_i

where L is the loss value, N is the number of samples, i denotes the i-th sample, j denotes the j-th class, y_i denotes the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the included angle between the input vector x and the weight W, and T denotes the transpose of the vector.
5. The method for comparing the consistency of the license portrait according to claim 1, which is characterized in that: in step S6, the method for calculating the difference between the two eigenvectors in each combination is a cosine distance calculation method.
6. A certificate portrait consistency comparison system, comprising a memory and a processor, wherein the memory stores instructions adapted to be loaded by the processor and to perform the steps of:
s1: obtaining a picture containing a license portrait, wherein the picture containing the license portrait comprises: different characteristic images of the same certificate and images of different certificates, wherein the different characteristic images have different shooting parameters;
s2: identifying the portrait in the picture containing the certificate portrait, cutting the portrait according to a set size and storing the portrait as a sample file, marking the sample files belonging to the same certificate as the same type, and marking the sample files not belonging to the same certificate as different types;
s3: dividing all sample files into a training set and a testing set;
s4: building a deep learning portrait comparison model, wherein a neural network backbone of the deep learning portrait comparison model adopts a 50-layer residual error convolution network, the sample file is input, a 512-dimensional characteristic vector is output through multilayer convolution and pooling of the convolution network, the 512-dimensional characteristic vector is accessed to a full connection layer, the full connection layer converts the 512-dimensional characteristic vector into an array, and the value of the array represents the probability that the input sample file is of the same type and different types, wherein the neural network is provided with a loss function connected with the full connection layer, and the operation result of the deep learning portrait comparison model tends to the optimal solution through the loss function operation;
s5: inputting the sample files in the training set obtained in the step S3 into the deep learning portrait comparison model set up in the step S4 for training until the probability that the array prediction type output by the full connection layer is correct is the highest, and obtaining a license portrait comparison model capable of correctly expressing the license portrait type characteristics;
s6: inputting the sample files in the test set into the license portrait comparison model, combining every two 512-dimensional feature vectors output by all similar sample files through the deep learning portrait comparison model, calculating the difference value of the two feature vectors in each combination, and recording as a similar array;
combining every two 512-dimensional feature vectors output by the deep learning portrait comparison model of all different types of sample files, calculating the difference value of the two feature vectors in each combination, and recording as a different array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold value, wherein two samples higher than the threshold value are similar samples, and two samples lower than the threshold value are different samples;
s7: processing a pair of to-be-verified license portrait images through the step S2 to obtain normalized images, respectively inputting the normalized images into the license portrait comparison model, outputting 512-dimensional feature description vectors of two license portraits, calculating a difference value of the two feature vectors, subtracting the difference value from the threshold value obtained in the step S6, judging that the two license portraits are consistent when the result is positive, and judging that the two license portraits are inconsistent when the result is negative.
7. The system according to claim 6, wherein the system comprises: the step of S2 is specifically:
s21: detecting facial key points: the picture containing the license portrait is input into an MTCNN (Multi-Task Cascaded Convolutional Neural Network), which identifies 5 facial key points through convolution and pooling and establishes a bounding box around them, the key points being the two eyeballs, the nose tip and the two mouth corners of the face;
s22: the keypoint bounding box is expanded outward by set width and height ratios to obtain four corner coordinates, which are connected and cropped to obtain the portrait, the outward expansion ratios being 30% in width and 40% in height.
8. The system according to claim 6, wherein the system comprises: in the step S1, the shooting device obtains different characteristic images of the same document by changing the shooting angle, the ambient light intensity and the shooting distance.
9. The system according to claim 6, characterized in that: in the step S4, the loss function is the ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\ne y_i}e^{s\cos\theta_j}},\qquad \cos\theta_j = W_j^{T}x_i

where L is the loss value, N is the number of samples, i denotes the i-th sample, j denotes the j-th class, y_i denotes the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the included angle between the input vector x and the weight W, and T denotes the transpose of the vector.
10. The system according to claim 6, wherein the system comprises: in step S6, the method for calculating the difference between the two eigenvectors in each combination is a cosine distance calculation method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911319574.4A CN111062338B (en) | 2019-12-19 | 2019-12-19 | License and portrait consistency comparison method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911319574.4A CN111062338B (en) | 2019-12-19 | 2019-12-19 | License and portrait consistency comparison method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111062338A true CN111062338A (en) | 2020-04-24 |
CN111062338B CN111062338B (en) | 2023-11-17 |
Family
ID=70302451
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911319574.4A Active CN111062338B (en) | 2019-12-19 | 2019-12-19 | License and portrait consistency comparison method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111062338B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111652285A (en) * | 2020-05-09 | 2020-09-11 | 济南浪潮高新科技投资发展有限公司 | Tea cake category identification method, equipment and medium |
CN118230344A (en) * | 2024-05-22 | 2024-06-21 | 盛视科技股份有限公司 | Full-page recognition method for multilingual certificate based on word recognition |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127103A (en) * | 2016-06-12 | 2016-11-16 | 广州广电运通金融电子股份有限公司 | A kind of off-line identity authentication method and device |
CN106934408A (en) * | 2015-12-29 | 2017-07-07 | 北京大唐高鸿数据网络技术有限公司 | Identity card picture sorting technique based on convolutional neural networks |
WO2017133009A1 (en) * | 2016-02-04 | 2017-08-10 | 广州新节奏智能科技有限公司 | Method for positioning human joint using depth image of convolutional neural network |
CN108009528A (en) * | 2017-12-26 | 2018-05-08 | 广州广电运通金融电子股份有限公司 | Face authentication method, device, computer equipment and storage medium based on Triplet Loss |
Also Published As
Publication number | Publication date |
---|---|
CN111062338B (en) | 2023-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348319B (en) | Face anti-counterfeiting method based on face depth information and edge image fusion | |
CN105389593B (en) | Image object recognition methods based on SURF feature | |
WO2018086543A1 (en) | Living body identification method, identity authentication method, terminal, server and storage medium | |
US8064653B2 (en) | Method and system of person identification by facial image | |
CN103136504B (en) | Face identification method and device | |
CN100423020C (en) | Human face identifying method based on structural principal element analysis | |
US8666122B2 (en) | Assessing biometric sample quality using wavelets and a boosted classifier | |
CN104680144B (en) | Based on the lip reading recognition methods and device for projecting very fast learning machine | |
CN111126240B (en) | Three-channel feature fusion face recognition method | |
CN104464079A (en) | Multi-currency-type and face value recognition method based on template feature points and topological structures of template feature points | |
CN112488211A (en) | Fabric image flaw classification method | |
CN101739555A (en) | Method and system for detecting false face, and method and system for training false face model | |
CN107862267A (en) | Face recognition features' extraction algorithm based on full symmetric local weber description | |
CN109740572A (en) | A kind of human face in-vivo detection method based on partial color textural characteristics | |
CN108960142B (en) | Pedestrian re-identification method based on global feature loss function | |
CN111832405A (en) | Face recognition method based on HOG and depth residual error network | |
CN111062338B (en) | License and portrait consistency comparison method and system | |
CN111274883A (en) | Synthetic sketch face recognition method based on multi-scale HOG (histogram of oriented gradient) features and deep features | |
CN115240280A (en) | Construction method of human face living body detection classification model, detection classification method and device | |
CN113095158A (en) | Handwriting generation method and device based on countermeasure generation network | |
CN112115835A (en) | Face key point-based certificate photo local anomaly detection method | |
CN113436735A (en) | Body weight index prediction method, device and storage medium based on face structure measurement | |
CN108010015A (en) | One kind refers to vein video quality evaluation method and its system | |
Andiani et al. | Face recognition for work attendance using multitask convolutional neural network (MTCNN) and pre-trained facenet | |
JP4749884B2 (en) | Learning method of face discriminating apparatus, face discriminating method and apparatus, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||