CN111062338A - Certificate portrait consistency comparison method and system - Google Patents


Info

Publication number
CN111062338A
CN111062338A (application number CN201911319574.4A)
Authority
CN
China
Prior art keywords
portrait
license
comparison model
deep learning
different
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911319574.4A
Other languages
Chinese (zh)
Other versions
CN111062338B (en)
Inventor
林玉玲
郝占龙
陈文传
庄国金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Shangji Network Technology Co ltd
Original Assignee
Xiamen Shangji Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Shangji Network Technology Co ltd
Priority to CN201911319574.4A
Publication of CN111062338A
Application granted
Publication of CN111062338B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and a system for comparing the consistency of certificate portraits. Portraits are cropped by face feature recognition to establish a sample set; a deep learning certificate portrait comparison model is built and trained; a threshold that correctly separates different-type samples from same-type samples is calculated; the two certificate portraits to be checked are input into the model to obtain two feature vectors; and the cosine distance between the two feature vectors is calculated and compared against the threshold to decide whether the two certificate portraits are consistent. The invention has the following beneficial effects: compared with traditional manual comparison, the method expresses portrait image features more accurately, sensitively distinguishes clearly different clothing and hairstyles, reliably matches the same portrait across images with different illumination, backgrounds, and scales, and reduces interference from adverse conditions such as smudging and blur.

Description

Certificate portrait consistency comparison method and system
Technical Field
The invention relates to a method and a system for comparing the consistency of certificate portraits, and belongs to the field of image recognition.
Background
When a customer transacts business at a bank, the customer must provide the original identity card and a copy of it, and the bank keeps the copy on file. When subsequent business is transacted, bank staff must photograph the customer's identity card and compare the photograph with the certificate image already stored in the bank's electronic system to determine whether the two show the same certificate.
In traditional banking, certificate image comparison is performed by visual inspection by dedicated auditors. Most banking business requires identity card review and verification, which consumes substantial human resources, and auditor fatigue degrades comparison accuracy and delays customer service. For example, when the portrait on the identity card and the portrait on the filed copy are similar in hairstyle, clothing, and facial shape, an auditor may wrongly judge them to be the same certificate or need a second review to confirm; in another situation, when changes in ambient light or camera shake leave the newly captured identity card image blurred, distorted, or smudged, an auditor may conclude that the certificates differ only after repeated comparison and require the image to be captured again.
Disclosure of Invention
To solve the above technical problems, the invention provides a method and a system for comparing the consistency of certificate portraits: a deep learning system is established, and a neural network replaces manual review with automatic machine review of certificate portraits, with high accuracy and high speed.
The technical scheme of the invention is as follows:
A method for comparing the consistency of certificate portraits comprises the following steps:
S1: obtaining pictures containing certificate portraits, the pictures comprising images of the same certificate with different characteristics and images of different certificates, the different characteristics being differences in shooting parameters;
S2: recognizing the portrait in each picture containing a certificate portrait, cropping the portrait to a set size and saving it as a sample file, labeling sample files belonging to the same certificate as the same type and sample files not belonging to the same certificate as different types;
S3: dividing all sample files into a training set and a test set;
S4: building a deep learning portrait comparison model, the neural network backbone of which adopts a 50-layer residual convolutional network; a sample file is input and, through the network's multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed to a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample file is of the same type or of a different type; the neural network attaches a loss function to the fully connected layer, and the loss function computation drives the model's output toward the optimal solution;
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the type correctly is highest, obtaining a certificate portrait comparison model that correctly expresses certificate portrait type features;
S6: inputting the test-set sample files into the certificate portrait comparison model, pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, two samples scoring above the threshold being same-type samples and two samples scoring below it being different-type samples;
S7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting them separately into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two certificate portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when the result is negative they are judged inconsistent.
Step S2 is specifically:
S21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes a bounding box around the key points, the key points being the two eye centers, the nose tip, and the two mouth corners of the face;
S22: expanding the key-point bounding box outward by set width and height proportions to obtain four corner coordinates, and connecting the four corners and cropping to obtain the portrait, the outward expansion proportions being 30% in width and 40% in height.
In step S1, the shooting device obtains images of the same certificate with different characteristics by changing the shooting angle, the ambient light intensity, and the shooting distance.
In step S4, the loss function is an ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}

where

\cos\theta_j=\frac{W_j^{T}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}

and L is the loss value, N is the number of samples, i indexes the i-th sample, j indexes the j-th class, y_i is the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the angle between the input vector x and the weight W, and T denotes the vector transpose.
In step S6, the difference value of the two feature vectors in each pair is calculated by the cosine distance method.
Technical Scheme Two
A certificate portrait consistency comparison system comprises a memory and a processor, the memory storing instructions adapted to be loaded by the processor to perform the following steps:
S1: obtaining pictures containing certificate portraits, the pictures comprising images of the same certificate with different characteristics and images of different certificates, the different characteristics being differences in shooting parameters;
S2: recognizing the portrait in each picture containing a certificate portrait, cropping the portrait to a set size and saving it as a sample file, labeling sample files belonging to the same certificate as the same type and sample files not belonging to the same certificate as different types;
S3: dividing all sample files into a training set and a test set;
S4: building a deep learning portrait comparison model, the neural network backbone of which adopts a 50-layer residual convolutional network; a sample file is input and, through the network's multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed to a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample file is of the same type or of a different type; the neural network attaches a loss function to the fully connected layer, and the loss function computation drives the model's output toward the optimal solution;
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the type correctly is highest, obtaining a certificate portrait comparison model that correctly expresses certificate portrait type features;
S6: inputting the test-set sample files into the certificate portrait comparison model, pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, two samples scoring above the threshold being same-type samples and two samples scoring below it being different-type samples;
S7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting them separately into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two certificate portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when the result is negative they are judged inconsistent.
Step S2 is specifically:
S21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes a bounding box around the key points, the key points being the two eye centers, the nose tip, and the two mouth corners of the face;
S22: expanding the key-point bounding box outward by set width and height proportions to obtain four corner coordinates, and connecting the four corners and cropping to obtain the portrait, the outward expansion proportions being 30% in width and 40% in height.
In step S1, the shooting device obtains images of the same certificate with different characteristics by changing the shooting angle, the ambient light intensity, and the shooting distance.
In step S4, the loss function is an ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}

where

\cos\theta_j=\frac{W_j^{T}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}

and L is the loss value, N is the number of samples, i indexes the i-th sample, j indexes the j-th class, y_i is the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the angle between the input vector x and the weight W, and T denotes the vector transpose.
In step S6, the difference value of the two feature vectors in each pair is calculated by the cosine distance method.
The invention has the following beneficial effects:
1. The method and system for comparing the consistency of certificate portraits realize face feature recognition through the MTCNN multi-task cascaded convolutional neural network, ensuring consistent portrait cropping positions;
2. The neural network built to extract and compare certificate portrait features is highly accurate and efficient;
3. The constructed neural network adopts the ArcFace loss function, which directly addresses the angular separability of features; the artificially set classification margin m reduces intra-class distance and enlarges inter-class difference, so the model fits better and is more accurate;
4. The optimal threshold is obtained by the cosine distance and the ROC curve algorithm, giving high accuracy.
Drawings
FIG. 1 is a flowchart of the certificate portrait consistency comparison method and system of the present invention;
FIG. 2 is a schematic diagram of MTCNN face key point detection in the certificate portrait consistency comparison method and system of the present invention;
FIG. 3 is a schematic diagram of portrait cropping in the certificate portrait consistency comparison method and system of the present invention;
FIG. 4 is a schematic diagram of the ROC curve algorithm of the certificate portrait consistency comparison method and system of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
Example one
As shown in FIG. 1, a method for comparing the consistency of certificate portraits comprises the following steps:
S1: obtaining pictures containing certificate portraits, the pictures comprising images of the same certificate with different characteristics and images of different certificates, the different characteristics being differences in shooting parameters.
Certificate consistency comparison determines whether two certificates are exactly the same certificate, not merely whether they show the same person: "the same certificate" means the identical document, so two head portraits of the same person with different hairstyles do not count as consistent. In the bank workflow the customer's identity card original and copy do not change, and a fresh scan is compared with the original image stored in the system; therefore samples derived from the same certificate, whether obtained by repeatedly adjusting the shooting parameters or by image processing such as scaling and cropping, are labeled as one type, samples from other certificates are labeled as other types, and the neural network is trained to recognize same-type and different-type images.
In step S1, the shooting device obtains images of the same certificate with different characteristics by changing the shooting angle, the ambient light intensity, and the shooting distance.
Besides obtaining different images of the same certificate by changing the shooting conditions, variant images can also be produced by applying image processing to a reference image of the certificate, such as blurring, adding noise, and warping.
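As an illustration, the following sketch shows how such variants might be generated with OpenCV; the function name and the specific blur, noise, and warp parameters are assumptions for demonstration, not values given by the patent.

```python
# Hypothetical augmentation sketch: producing "different characteristic"
# variants of one certificate image by image processing instead of re-shooting.
import cv2
import numpy as np

def augment_certificate(img: np.ndarray) -> list:
    """Return blurred, noisy, and perspective-warped variants (assumed parameters)."""
    h, w = img.shape[:2]
    blurred = cv2.GaussianBlur(img, (7, 7), 0)  # simulates defocus / camera shake
    noise = np.random.normal(0, 12, img.shape).astype(np.float32)
    noisy = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    src = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
    dst = np.float32([[0.03 * w, 0.02 * h], [0.97 * w, 0], [0, h], [w, 0.98 * h]])
    warped = cv2.warpPerspective(img, cv2.getPerspectiveTransform(src, dst), (w, h))
    return [blurred, noisy, warped]
```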
S2: identifying the portrait in each picture containing a certificate portrait, cropping the portrait to a set size and saving it as a sample file, labeling sample files belonging to the same certificate as the same type and sample files not belonging to the same certificate as different types.
During training, the certificate images can be divided into several groups, each with a reference certificate image, the same-type images of that reference certificate, and different-type images of other certificates. For example, in the training group for certificate A, the several certificate photo images of A are labeled the same type, while the certificate photo images of B, C, and the other certificates are labeled different types; in the training group for certificate B, the several certificate photo images of B are labeled the same type, and the certificate photo images of A, C, and the other certificates are labeled different types. Training over multiple such groups improves the accuracy of the model.
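A minimal sketch of this labeling scheme follows; it assumes a data layout not specified by the patent, namely that each certificate's cropped samples are stored in their own folder, so that pairs drawn from one folder are same-type and pairs drawn across folders are different-type.

```python
# Assumed data layout: root/<certificate_id>/*.jpg, one folder per certificate.
import itertools
from pathlib import Path

def build_labeled_pairs(root: str):
    groups = {d.name: sorted(d.glob("*.jpg"))
              for d in Path(root).iterdir() if d.is_dir()}
    # Same-type pairs: two crops of the same certificate, label 1.
    same = [(a, b, 1) for files in groups.values()
            for a, b in itertools.combinations(files, 2)]
    # Different-type pairs: crops of two different certificates, label 0.
    diff = [(a, b, 0) for ga, gb in itertools.combinations(groups, 2)
            for a in groups[ga] for b in groups[gb]]
    return same + diff
```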
S3: all sample files are divided into a training set and a test set.
In this embodiment, all sample files are divided into the training set and the test set at a ratio of 8:2, with the composition of the training data and the test data kept balanced.
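For instance, the split could be done with scikit-learn as below; using the stratify argument is one way (our assumption) of keeping the type composition balanced as described, and sample_files and labels are assumed to carry one type label per cropped sample from step S2.

```python
# Sketch of the 8:2 train/test split described in this embodiment.
from sklearn.model_selection import train_test_split

def split_samples(sample_files, labels):
    return train_test_split(sample_files, labels,
                            test_size=0.2,     # the 8:2 ratio of this embodiment
                            stratify=labels,   # keep type balance between the sets
                            random_state=0)
```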
S4: building a deep learning portrait comparison model, the neural network backbone of which adopts a 50-layer residual convolutional network; a sample file is input and, through the network's multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed to a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample file is of the same type or of a different type; the neural network attaches a loss function to the fully connected layer, and the loss function computation drives the model's output toward the optimal solution.
The 512-dimensional vector characterizes the various features of an image and is the machine's own representation of that image; it can be understood as the image's unique ID. The neural network is trained to obtain the model that best represents the image features. The role of the fully connected layer is that, through continuous training, its prediction accuracy rises, which in turn makes the 512-dimensional vector representing the image more and more accurate.
The fully connected layer converts the multi-dimensional vector output after convolution and pooling into an array and performs the prediction.
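The described architecture could be sketched in PyTorch as follows; the framework choice, the 2048-to-512 projection, and the class names are our assumptions, since the patent specifies only a 50-layer residual backbone, a 512-dimensional feature vector, and a fully connected classification head.

```python
# Sketch of the step S4 model: ResNet-50 backbone -> 512-d embedding ->
# fully connected head that outputs per-type scores (the "array").
import torch
import torch.nn as nn
from torchvision.models import resnet50

class PortraitComparisonModel(nn.Module):
    def __init__(self, num_types: int, embed_dim: int = 512):
        super().__init__()
        backbone = resnet50(weights=None)
        backbone.fc = nn.Identity()                  # drop the stock 1000-way head
        self.backbone = backbone                     # pooled output is 2048-d
        self.embed = nn.Linear(2048, embed_dim)      # the 512-d "image ID" vector
        self.head = nn.Linear(embed_dim, num_types)  # training-time classifier

    def forward(self, x: torch.Tensor):
        feat = self.embed(self.backbone(x))  # 512-d feature vector
        return feat, self.head(feat)         # features plus per-type scores
```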
In step S4, the loss function is an ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}

where

\cos\theta_j=\frac{W_j^{T}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}

and L is the loss value, N is the number of samples, i indexes the i-th sample, j indexes the j-th class, y_i is the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the angle between the input vector x and the weight W, and T denotes the vector transpose.
A loss function maps the value of a random event, or of its related random variables, to a non-negative real number representing the "risk" or "loss" of that event. In applications the loss function is usually tied to the optimization problem as the learning criterion: the model is solved and evaluated by minimizing the loss. In deep learning, the value of the loss function represents the gap between the output of the convolutional network and the ground-truth labels. A commonly used loss function is softmax, but softmax only learns features in Euclidean space: it considers only whether samples are classified correctly, ignores intra-class and inter-class distances, and cannot shape the features so that positive pairs obtain high similarity and negative pairs low similarity. This embodiment therefore adopts the ArcFace loss, which directly addresses the angular separability of features; the artificially set classification margin m reduces intra-class distance and enlarges the inter-class gap, so the model fits better.
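A compact implementation of this loss, matching the formula above, might look like the following; s = 64 and m = 0.5 are common defaults from the ArcFace literature, not values stated in the patent.

```python
# ArcFace loss sketch: normalize features and class weights so each logit is
# cos(theta), add the angular margin m to the target class, then scale by s.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceLoss(nn.Module):
    def __init__(self, embed_dim: int, num_classes: int,
                 s: float = 64.0, m: float = 0.5):
        super().__init__()
        self.s, self.m = s, m
        self.W = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # cos(theta_j) = W_j^T x / (||W_j|| ||x||), via L2-normalized vectors
        cos = F.linear(F.normalize(x), F.normalize(self.W))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = F.one_hot(y, cos.size(1)).bool()
        # add the margin m only on the ground-truth class y_i
        logits = self.s * torch.where(target, torch.cos(theta + self.m), cos)
        return F.cross_entropy(logits, y)
```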
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the type correctly is highest, obtaining a certificate portrait comparison model that correctly expresses certificate portrait type features.
S6: inputting the test-set sample files into the certificate portrait comparison model, pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array.
Pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array.
The two arrays are substituted into an ROC curve algorithm to obtain a threshold, two samples scoring above the threshold being same-type samples and two samples scoring below it being different-type samples.
In step S6, the difference value of the two feature vectors in each pair is calculated by the cosine distance method.
The difference between two feature vectors can be calculated by many methods, such as the Euclidean distance, standardized Euclidean distance, Mahalanobis distance, cosine distance, Hamming distance, and Manhattan distance. Among these, the cosine distance is more accurate for 512-dimensional vectors and separates the results clearly.
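As a sketch, the cosine score of two 512-dimensional feature vectors can be computed as below; treating a higher score as more similar follows the convention stated in step S6.

```python
# Cosine score of two feature vectors (a minimal sketch using numpy).
import numpy as np

def cosine_score(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```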
Each of the two samples corresponds to a 512-dimensional output vector. All same-type cosine distances and all different-type cosine distances are computed, and the threshold is then derived algorithmically: plotting all the values reveals a point above which the pairs are mostly same-type and below which they are mostly different-type, and that point is the threshold. The ROC computation is not quite this simple, but it is an established mathematical method dedicated to finding this cut-off.
The ROC curve (receiver operating characteristic curve) is a comprehensive index reflecting the continuous variables of sensitivity and specificity, revealing their interrelation by a compositional method: a series of sensitivities and specificities is calculated by setting many different critical values for the continuous variable, and a curve is drawn with sensitivity as the ordinate and (1 - specificity) as the abscissa; the larger the area under the curve, the higher the diagnostic accuracy. On the ROC curve, the point closest to the top-left of the plot is the critical value with both high sensitivity and high specificity.
The cosine distances of the feature-vector pairs of all same-type samples and all different-type samples are substituted into the ROC algorithm to obtain the ROC curve; as shown in FIG. 4, if the ordinate (true positive rate) of the point nearest the top-left corner is 0.87, the threshold is set to 0.87.
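One way to carry out this computation, sketched below with scikit-learn, is to score every pair, label same-type pairs 1 and different-type pairs 0, and pick the cut-off at the ROC point nearest the top-left corner; reading the "nearest point" criterion this way is our assumption.

```python
# ROC-based threshold selection sketch for step S6.
import numpy as np
from sklearn.metrics import roc_curve

def choose_threshold(scores: np.ndarray, labels: np.ndarray) -> float:
    """scores: cosine scores of all pairs; labels: 1 = same type, 0 = different."""
    fpr, tpr, thresholds = roc_curve(labels, scores)
    idx = np.argmin(fpr ** 2 + (1.0 - tpr) ** 2)  # distance to the (0, 1) corner
    return float(thresholds[idx])
```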
S7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting them separately into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two certificate portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when the result is negative they are judged inconsistent.
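Putting the pieces together, the verification decision of step S7 reduces to the one-line check below (cosine_score and the threshold come from the sketches above); it follows the step S6 convention that pairs scoring above the threshold are the same certificate.

```python
# Final decision sketch for step S7.
def portraits_consistent(feat_a, feat_b, threshold: float) -> bool:
    # A positive (score - threshold) means "same certificate" under the S6 convention.
    return cosine_score(feat_a, feat_b) - threshold > 0
```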
As shown in FIG. 3, step S2 is specifically:
S21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes a bounding box around the key points, the key points being the two eye centers, the nose tip, and the two mouth corners of the face;
S22: expanding the key-point bounding box outward by set width and height proportions to obtain four corner coordinates, and connecting the four corners and cropping to obtain the portrait, the outward expansion proportions being 30% in width and 40% in height.
By locating the face key points and expanding by fixed proportions, the whole face image is obtained while interference from redundant regions in subsequent recognition is reduced, and the output image has the same proportions regardless of the size of the original image.
In step S21, as shown in FIG. 2, the picture containing the certificate portrait is input into the MTCNN multi-task cascaded convolutional neural network, which identifies the 5 face key points through convolution and pooling and obtains the key-point bounding box.
When the MTCNN performs face key point detection, the bounding box is drawn automatically from the connections between the key points.
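Steps S21 and S22 could be realized, for example, with the facenet-pytorch MTCNN as sketched below; the library choice and the decision to expand each side by 30% of the key-point box width and 40% of its height are assumptions, since the patent names neither an implementation nor whether the proportions apply per side.

```python
# Sketch of steps S21-S22: detect the 5 face key points with MTCNN, take
# their bounding box, expand it, and crop the portrait.
from facenet_pytorch import MTCNN
from PIL import Image

mtcnn = MTCNN(keep_all=False)

def crop_portrait(path: str) -> Image.Image:
    img = Image.open(path).convert("RGB")
    # landmarks[0] holds 5 (x, y) points: eyes, nose tip, mouth corners
    # (a face is assumed to be found; production code would check for None).
    _, _, landmarks = mtcnn.detect(img, landmarks=True)
    xs, ys = landmarks[0][:, 0], landmarks[0][:, 1]
    x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
    dx, dy = 0.30 * (x1 - x0), 0.40 * (y1 - y0)  # the 30% / 40% expansion
    return img.crop((int(x0 - dx), int(y0 - dy), int(x1 + dx), int(y1 + dy)))
```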
In summary, the method and system for comparing the consistency of certificate portraits realize face feature recognition through the MTCNN multi-task cascaded convolutional neural network, ensuring consistent portrait cropping positions; the neural network built to extract and compare certificate portrait features is accurate and efficient; the network adopts the ArcFace loss function, which directly addresses the angular separability of features, and the artificially set classification margin m reduces intra-class distance and enlarges inter-class difference, so the model fits better and is more accurate; and the optimal threshold is obtained by the cosine distance and the ROC curve algorithm, with high accuracy.
Example two
A certificate portrait consistency comparison system comprises a memory and a processor, the memory storing instructions adapted to be loaded by the processor to perform the following steps:
S1: obtaining pictures containing certificate portraits, the pictures comprising images of the same certificate with different characteristics and images of different certificates, the different characteristics being differences in shooting parameters;
S2: recognizing the portrait in each picture containing a certificate portrait, cropping the portrait to a set size and saving it as a sample file, labeling sample files belonging to the same certificate as the same type and sample files not belonging to the same certificate as different types;
S3: dividing all sample files into a training set and a test set;
S4: building a deep learning portrait comparison model, the neural network backbone of which adopts a 50-layer residual convolutional network; a sample file is input and, through the network's multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed to a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample file is of the same type or of a different type; the neural network attaches a loss function to the fully connected layer, and the loss function computation drives the model's output toward the optimal solution;
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the type correctly is highest, obtaining a certificate portrait comparison model that correctly expresses certificate portrait type features;
S6: inputting the test-set sample files into the certificate portrait comparison model, pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, two samples scoring above the threshold being same-type samples and two samples scoring below it being different-type samples;
S7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting them separately into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two certificate portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when the result is negative they are judged inconsistent.
Step S2 is specifically:
S21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes a bounding box around the key points, the key points being the two eye centers, the nose tip, and the two mouth corners of the face;
S22: expanding the key-point bounding box outward by set width and height proportions to obtain four corner coordinates, and connecting the four corners and cropping to obtain the portrait, the outward expansion proportions being 30% in width and 40% in height.
In step S1, the shooting device obtains images of the same certificate with different characteristics by changing the shooting angle, the ambient light intensity, and the shooting distance.
In step S4, the loss function is an ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}

where

\cos\theta_j=\frac{W_j^{T}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}

and L is the loss value, N is the number of samples, i indexes the i-th sample, j indexes the j-th class, y_i is the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the angle between the input vector x and the weight W, and T denotes the vector transpose.
In step S6, the difference value of the two feature vectors in each pair is calculated by the cosine distance method.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the invention; all equivalent structural or process modifications made using the present specification and drawings, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A method for comparing the consistency of certificate portraits, characterized by comprising the following steps:
S1: obtaining pictures containing certificate portraits, the pictures comprising images of the same certificate with different characteristics and images of different certificates, the different characteristics being differences in shooting parameters;
S2: recognizing the portrait in each picture containing a certificate portrait, cropping the portrait to a set size and saving it as a sample file, labeling sample files belonging to the same certificate as the same type and sample files not belonging to the same certificate as different types;
S3: dividing all sample files into a training set and a test set;
S4: building a deep learning portrait comparison model, the neural network backbone of which adopts a 50-layer residual convolutional network; a sample file is input and, through the network's multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed to a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample file is of the same type or of a different type; the neural network attaches a loss function to the fully connected layer, and the loss function computation drives the model's output toward the optimal solution;
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the type correctly is highest, obtaining a certificate portrait comparison model that correctly expresses certificate portrait type features;
S6: inputting the test-set sample files into the certificate portrait comparison model, pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, two samples scoring above the threshold being same-type samples and two samples scoring below it being different-type samples;
S7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting them separately into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two certificate portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when the result is negative they are judged inconsistent.
2. The method for comparing the consistency of certificate portraits according to claim 1, characterized in that step S2 is specifically:
S21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes a bounding box around the key points, the key points being the two eye centers, the nose tip, and the two mouth corners of the face;
S22: expanding the key-point bounding box outward by set width and height proportions to obtain four corner coordinates, and connecting the four corners and cropping to obtain the portrait, the outward expansion proportions being 30% in width and 40% in height.
3. The method for comparing the consistency of certificate portraits according to claim 1, characterized in that: in step S1, the shooting device obtains images of the same certificate with different characteristics by changing the shooting angle, the ambient light intensity, and the shooting distance.
4. The method for comparing the consistency of certificate portraits according to claim 1, characterized in that: in step S4, the loss function is an ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}

where

\cos\theta_j=\frac{W_j^{T}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}

and L is the loss value, N is the number of samples, i indexes the i-th sample, j indexes the j-th class, y_i is the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the angle between the input vector x and the weight W, and T denotes the vector transpose.
5. The method for comparing the consistency of certificate portraits according to claim 1, characterized in that: in step S6, the difference value of the two feature vectors in each pair is calculated by the cosine distance method.
6. A certificate portrait consistency comparison system, comprising a memory and a processor, characterized in that the memory stores instructions adapted to be loaded by the processor to perform the following steps:
S1: obtaining pictures containing certificate portraits, the pictures comprising images of the same certificate with different characteristics and images of different certificates, the different characteristics being differences in shooting parameters;
S2: recognizing the portrait in each picture containing a certificate portrait, cropping the portrait to a set size and saving it as a sample file, labeling sample files belonging to the same certificate as the same type and sample files not belonging to the same certificate as different types;
S3: dividing all sample files into a training set and a test set;
S4: building a deep learning portrait comparison model, the neural network backbone of which adopts a 50-layer residual convolutional network; a sample file is input and, through the network's multiple convolution and pooling layers, a 512-dimensional feature vector is output; the 512-dimensional feature vector is fed to a fully connected layer, which converts it into an array whose values represent the probabilities that the input sample file is of the same type or of a different type; the neural network attaches a loss function to the fully connected layer, and the loss function computation drives the model's output toward the optimal solution;
S5: inputting the training-set sample files obtained in step S3 into the deep learning portrait comparison model built in step S4 and training until the probability that the array output by the fully connected layer predicts the type correctly is highest, obtaining a certificate portrait comparison model that correctly expresses certificate portrait type features;
S6: inputting the test-set sample files into the certificate portrait comparison model, pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all same-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the same-type array;
pairing up the 512-dimensional feature vectors output by the deep learning portrait comparison model for all different-type sample files, calculating the difference value of the two feature vectors in each pair, and recording the results as the different-type array;
substituting the two arrays into an ROC curve algorithm to obtain a threshold, two samples scoring above the threshold being same-type samples and two samples scoring below it being different-type samples;
S7: processing a pair of certificate portrait images to be verified through step S2 to obtain normalized images, inputting them separately into the certificate portrait comparison model, outputting the 512-dimensional feature description vectors of the two certificate portraits, calculating the difference value of the two feature vectors, and subtracting the threshold obtained in step S6 from it; when the result is positive the two certificate portraits are judged consistent, and when the result is negative they are judged inconsistent.
7. The system according to claim 6, characterized in that step S2 is specifically:
S21: detecting face key points: the picture containing the certificate portrait is input into an MTCNN (multi-task cascaded convolutional neural network), which identifies 5 face key points through convolution and pooling and establishes a bounding box around the key points, the key points being the two eye centers, the nose tip, and the two mouth corners of the face;
S22: expanding the key-point bounding box outward by set width and height proportions to obtain four corner coordinates, and connecting the four corners and cropping to obtain the portrait, the outward expansion proportions being 30% in width and 40% in height.
8. The system according to claim 6, characterized in that: in step S1, the shooting device obtains images of the same certificate with different characteristics by changing the shooting angle, the ambient light intensity, and the shooting distance.
9. The system according to claim 6, characterized in that: in step S4, the loss function is an ArcFace loss function, specifically:

L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j\neq y_i}e^{s\cos\theta_j}}

where

\cos\theta_j=\frac{W_j^{T}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}

and L is the loss value, N is the number of samples, i indexes the i-th sample, j indexes the j-th class, y_i is the class to which the i-th sample belongs, s and m are hyper-parameters of the model, W is the weight of the deep model, x is the input feature vector, θ is the angle between the input vector x and the weight W, and T denotes the vector transpose.
10. The system according to claim 6, characterized in that: in step S6, the difference value of the two feature vectors in each pair is calculated by the cosine distance method.
CN201911319574.4A 2019-12-19 2019-12-19 License and portrait consistency comparison method and system Active CN111062338B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911319574.4A CN111062338B (en) 2019-12-19 2019-12-19 License and portrait consistency comparison method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911319574.4A CN111062338B (en) 2019-12-19 2019-12-19 License and portrait consistency comparison method and system

Publications (2)

Publication Number Publication Date
CN111062338A true CN111062338A (en) 2020-04-24
CN111062338B CN111062338B (en) 2023-11-17

Family

ID=70302451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911319574.4A Active CN111062338B (en) 2019-12-19 2019-12-19 License and portrait consistency comparison method and system

Country Status (1)

Country Link
CN (1) CN111062338B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652285A (en) * 2020-05-09 2020-09-11 济南浪潮高新科技投资发展有限公司 Tea cake category identification method, equipment and medium
CN118230344A (en) * 2024-05-22 2024-06-21 盛视科技股份有限公司 Full-page recognition method for multilingual certificate based on word recognition

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127103A (en) * 2016-06-12 2016-11-16 广州广电运通金融电子股份有限公司 A kind of off-line identity authentication method and device
CN106934408A (en) * 2015-12-29 2017-07-07 北京大唐高鸿数据网络技术有限公司 Identity card picture sorting technique based on convolutional neural networks
WO2017133009A1 (en) * 2016-02-04 2017-08-10 广州新节奏智能科技有限公司 Method for positioning human joint using depth image of convolutional neural network
CN108009528A (en) * 2017-12-26 2018-05-08 广州广电运通金融电子股份有限公司 Face authentication method, device, computer equipment and storage medium based on Triplet Loss

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934408A (en) * 2015-12-29 2017-07-07 北京大唐高鸿数据网络技术有限公司 Identity card picture sorting technique based on convolutional neural networks
WO2017133009A1 (en) * 2016-02-04 2017-08-10 广州新节奏智能科技有限公司 Method for positioning human joint using depth image of convolutional neural network
CN106127103A (en) * 2016-06-12 2016-11-16 广州广电运通金融电子股份有限公司 A kind of off-line identity authentication method and device
CN108009528A (en) * 2017-12-26 2018-05-08 广州广电运通金融电子股份有限公司 Face authentication method, device, computer equipment and storage medium based on Triplet Loss

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652285A (en) * 2020-05-09 2020-09-11 济南浪潮高新科技投资发展有限公司 Tea cake category identification method, equipment and medium
CN118230344A (en) * 2024-05-22 2024-06-21 盛视科技股份有限公司 Full-page recognition method for multilingual certificate based on word recognition

Also Published As

Publication number Publication date
CN111062338B (en) 2023-11-17

Similar Documents

Publication Publication Date Title
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN105389593B (en) Image object recognition methods based on SURF feature
WO2018086543A1 (en) Living body identification method, identity authentication method, terminal, server and storage medium
US8064653B2 (en) Method and system of person identification by facial image
CN103136504B (en) Face identification method and device
CN100423020C (en) Human face identifying method based on structural principal element analysis
US8666122B2 (en) Assessing biometric sample quality using wavelets and a boosted classifier
CN104680144B (en) Based on the lip reading recognition methods and device for projecting very fast learning machine
CN111126240B (en) Three-channel feature fusion face recognition method
CN104464079A (en) Multi-currency-type and face value recognition method based on template feature points and topological structures of template feature points
CN112488211A (en) Fabric image flaw classification method
CN101739555A (en) Method and system for detecting false face, and method and system for training false face model
CN107862267A (en) Face recognition features' extraction algorithm based on full symmetric local weber description
CN109740572A (en) A kind of human face in-vivo detection method based on partial color textural characteristics
CN108960142B (en) Pedestrian re-identification method based on global feature loss function
CN111832405A (en) Face recognition method based on HOG and depth residual error network
CN111062338B (en) License and portrait consistency comparison method and system
CN111274883A (en) Synthetic sketch face recognition method based on multi-scale HOG (histogram of oriented gradient) features and deep features
CN115240280A (en) Construction method of human face living body detection classification model, detection classification method and device
CN113095158A (en) Handwriting generation method and device based on countermeasure generation network
CN112115835A (en) Face key point-based certificate photo local anomaly detection method
CN113436735A (en) Body weight index prediction method, device and storage medium based on face structure measurement
CN108010015A (en) One kind refers to vein video quality evaluation method and its system
Andiani et al. Face recognition for work attendance using multitask convolutional neural network (MTCNN) and pre-trained facenet
JP4749884B2 (en) Learning method of face discriminating apparatus, face discriminating method and apparatus, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant