WO2018054283A1

WO2018054283A1 - Face model training method and device, and face authentication method and device

Info

Publication number: WO2018054283A1
Application number: PCT/CN2017/102255
Authority: WO
Inventors: 王洋; 张伟琳; 陆小军
Original assignee: 北京眼神科技有限公司
Priority date: 2016-09-23
Filing date: 2017-09-19
Publication date: 2018-03-29
Also published as: CN107871100A; CN107871100B

Abstract

A face model training method and device, and a face authentication method and device. The training method comprises: obtaining a training sample (101), the training sample comprising training image data and credential image data; obtaining a training face image and a credential face image according to the training image data and the credential image data (102); training a face feature model by using the training face image (103); and adjusting the face feature model by using the paired training face image and credential face image (104). A model is trained by means of a method for pre-training an identification signal and fine-adjusting an authentication signal, the problem of unbalanced number of samples is solved, and the performance of the model is improved, thereby improving the accuracy of face authentication.

Description

Face model training method and device, face authentication method and device

The present application claims priority to Chinese Patent Application No. 201610848965.5, entitled "Surface Model Training Method and Apparatus, Face Authentication Method and Apparatus", filed on September 23, 2016, the entire contents of which are hereby incorporated by reference. This is incorporated herein by reference.

Technical field

The embodiments of the present application relate to the technical field of biological data, and in particular, to a training method for a face model, a face authentication method based on a face model, a training device for a face model, and a face model based on a face model. Face authentication device.

Background technique

With the widespread application of second-generation ID cards, residence permits and other documents in the fields of finance and commerce, there have been more and more problems such as theft of documents and forgery of documents.

Face authentication has the characteristics of low user cooperation, non-contact, and non-mandatory in use. Face authentication assists in the verification of documents in the fields of finance and commerce.

However, face authentication is also highly susceptible to the external environment (such as lighting, gestures, expressions, etc.), and the images in the document are compressed, the resolution is low, and the age difference between the current video image is large, and the background difference is obvious.

At present, the method of authentication processing based on documents is mainly based on traditional statistical learning and machine learning methods, for example, MMP-PCA method, LGBP-PCA-LDA method, BSF-PCA-LDA method, and the like.

Most of these face authentication methods use hand-crafted features, which are less robust to illumination, attitude, and age changes, and the training process requires a large number of ID photos and video photos as samples, but The number of photo IDs is generally small, often only one, and there is a problem of the imbalance of the number of sample images during the training process, that is, the number of video photos is large, and the number of photo photos is small. The model is trained by using the above-mentioned number of unbalanced sample images. The performance of the model leading to training is poor, and the accuracy of face authentication is low.

Summary of the invention

In view of the above problems, in order to solve the above features, the robustness is poor, the number of samples is large, and the model is The problem that the accuracy of the difference and the face authentication is low, the embodiment of the present application proposes a training method of the face model, a face authentication method based on the face model and a corresponding training of the face model. A device, a face authentication device based on a face model.

In order to solve the above problem, the embodiment of the present application discloses a training method for a face model, including:

Obtaining a training sample, the training sample including training image data and document image data;

Obtaining a training face image and a document face image according to the training image data and the document image data;

Training the facial feature model by using the trained face image;

The face feature model is adjusted by using a paired training face image and a document face image.

The embodiment of the present application further discloses a face authentication method based on a face model, wherein the face model is a face model obtained by the above training method, and the face model includes a face feature model, and the face is Certification methods include:

Collecting target image data when receiving an instruction for face authentication;

Extracting a target face image in the target image data;

Extracting the target face image into a pre-trained face feature model to extract a target face feature;

The authentication process is performed according to the target face feature and the specified document image data.

The embodiment of the present application further discloses a training device for a face model, including:

a training sample obtaining module, configured to acquire a training sample, where the training sample includes training image data and document image data;

a sample face image extraction module, configured to obtain a training face image and a document face image according to the training image data and the document image data;

a face model training module, configured to train a face feature model by using the trained face image;

Face model adjustment module for pairing training face images and document face images, The face feature model is adjusted.

The embodiment of the present application further discloses a face authentication device based on a face model, wherein the face model is a face model obtained by the training device, and the face model includes a face feature model, and the face is The authentication device includes:

a target image data module, configured to collect target image data when receiving an instruction for face authentication;

a target face image extraction module, configured to extract a target face image in the target image data;

a target facial feature extraction module, configured to input the target facial image into a pre-trained facial feature model to extract a target facial feature;

The authentication processing module is configured to perform an authentication process according to the target facial feature and the specified document image data.

On the other hand, an embodiment of the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;

a memory for storing a computer program;

The training method of the face model according to any one of the embodiments of the present application is implemented when the processor is configured to execute a computer program stored in the memory.

On the other hand, the embodiment of the present application provides a computer program for a training method that is executed to perform the face model according to any one of the embodiments of the present application.

On the other hand, an embodiment of the present application provides a storage medium for storing a computer program, which is executed to perform training of a face model according to any one of the embodiments of the present application. method.

a memory for storing a computer program;

The face recognition method based on any of the face models provided by the embodiments of the present application is implemented when the processor is configured to execute a computer program stored in the memory.

On the other hand, the embodiment of the present application provides a computer program, which is used to execute a face-based face authentication method according to any one of the embodiments of the present application.

On the other hand, the embodiment of the present application provides a storage medium, where the storage medium is used to store a computer program, and the computer program is executed to execute the face model based on any of the embodiments provided by the embodiments of the present application. Face authentication method.

Embodiments of the present application include the following advantages:

In the embodiment of the present application, the training face image and the document face image are extracted from the training image data and the document image data, and the face model is trained by training the face image, and the paired training face image and the document face image are used to The face feature model is adjusted to identify the signal training model for pre-training and authentication signal fine-tuning, to solve the problem of unbalanced sample quantity, improve the performance of the model, and thus improve the accuracy of face authentication.

Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows good robustness to factors such as age, posture and illumination.

DRAWINGS

1 is a flow chart of steps of an embodiment of a training method for a face model according to an embodiment of the present application;

2 is a diagram showing an example of a training sample according to an embodiment of the present application;

3 is a flow chart showing the steps of an embodiment of a training method for a face model according to an embodiment of the present application;

4 is a flowchart of processing of a convolutional neural network according to an embodiment of the present application;

FIG. 5 is a diagram showing an example of the structure of an Inception according to an embodiment of the present application; FIG.

6 is a flow chart of steps of an embodiment of a face authentication method based on a face model according to an embodiment of the present application;

7A-7D are diagrams showing an example of an image of a database according to an embodiment of the present application;

8 is a comparison diagram of a test ROC curve according to an embodiment of the present application;

9 is a structural block diagram of an embodiment of a training apparatus for a face model according to an embodiment of the present application;

10 is a structural block diagram of an embodiment of a face authentication device based on a face model according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;

FIG. 12 is a schematic structural diagram of another electronic device according to an embodiment of the present application.

detailed description

The above described objects, features and advantages of the present application will become more apparent and understood.

Referring to FIG. 1 , a flow chart of steps of a method for training a face model of the present application is shown, which may specifically include the following steps:

In step 101, a training sample is obtained.

In one implementation, the training samples include training image data and document image data.

The document image data may be image data stored in a certain document, for example, a second-generation ID card, a residence permit, a driver's license, etc., and the image data of the certificate is generally subjected to high-intensity compression, the resolution is low, and the number is generally Less, usually only one pair of documents, the background is relatively pure (such as white, blue, red, etc.).

The training image data may be image data different from the document image data, such as video image data. The training image data is generally not subjected to high-intensity compression, and the resolution is higher than the document image data, and may be collected by a camera or the like, and the number is generally There are many image data in the certificate, and the background is more complicated (such as containing environmental information).

For example, as shown in FIG. 2, the leftmost image data is the document image data, and the remaining image data is the training image data.

Step 102: Obtain a training face image and a document face image according to the training image data and the document image data.

The training image data and the document image data generally have a user's face, from which the training face image and the document face image are extracted, and the face feature model is trained.

In an embodiment of the present application, step 102 may include the following sub-steps:

Sub-step S11, performing face detection on the training image data and the document image data respectively, and determining the training face image and the document face image;

Sub-step S12, performing face feature point positioning in the training face image and the document face image respectively, and determining training eye data and document eye data;

Sub-step S13, aligning the position of the training eye data and the position of the document eye data with the preset template position;

Sub-step S14, performing similar transformation on the training face image other than the training eye data according to the positional relationship of the training eye data, to obtain a normalized training face image;

Sub-step S15, similarly transforming the document face image other than the document eye data according to the positional relationship of the document eye data to obtain a normalized document face image.

In the embodiment of the present application, AdaBoost (Adaptive Lifting Method) may be used to perform face detection on the training samples, and coase-to-fine is adopted on the detected face images (ie, training face images and document face images). The (CF, cascading depth model) method locates the face image and normalizes it using the similarity transformation using the position coordinates of the positioned eye data. For example, the normalized face image has a size of 100×100.

In step 103, the face feature model is trained by training the face image.

In one implementation, the trained face model includes a face feature model, which may be a model for extracting face features.

In an embodiment of the present application, step 103 may include the following sub-steps:

Sub-step S21, training the face feature model based on face recognition using the trained face image to train the initial parameter values of the model parameters of the face feature model.

For neural network models such as convolutional neural network models, the quantity and quality of training data often directly affect the ability of the model to extract features and the effect of classification.

However, since the image data of the ID card and other documents are mostly single samples, that is, only one face image is stored in one ID card, and the training image data and the ID image appear when the data set is constructed. The problem of an imbalance in the amount of data.

Therefore, the embodiment of the present application uses a method of identifying signal pre-training and authentication signal fine-tuning to train the model, thereby solving the problem of imbalance in the number of samples.

That is to say, in the embodiment of the present application, the face feature model is first trained based on the training face image as the identification signal, and subsequently, the training face image and the document face image as the pairing of the authentication signal are used, and the training result is obtained. The face feature model is adjusted, and then the face model is obtained based on the adjusted face feature model. That is, the embodiment of the present application uses a method of identifying signal pre-training and authentication signal fine-tuning to train the model, thereby solving the problem of imbalance in the number of samples.

In one implementation, the face feature model can be trained by random gradient descent, the minibatch (training batch) size is 64, and the impulse is 0.9. The goal is to obtain the model parameters of the face feature model through dual-signal supervised training. θ _c .

In the first stage, the training face image is used for the identification signal to be supervised and trained to obtain the model parameter θ _id , which is the initial parameter of the second stage.

In an embodiment of the present application, sub-step S21 may include the following sub-steps:

Sub-step S211, randomly extracting a training face image;

Sub-step S212, the randomly extracted training face image is input into the preset facial feature model to extract the training facial feature;

Sub-step S213, calculating a first loss rate when the training face feature is used for face recognition;

Sub-step S214, it is determined whether the first loss rate converges; if not, then sub-step S215 is performed, after which, return to the execution sub-step S216;

Sub-step S215, taking the parameter value of the model parameter of the current iteration as the initial parameter value;

Sub-step S216, calculating a first gradient by using a first loss rate;

Sub-step S217, the parameter value of the model parameter is decreased by using the first gradient and the preset learning rate, and the process returns to the sub-step S211.

The parameter values of the first phase are initialized to random parameters obeying the Gaussian distribution N(0, σ ² ), where

In the first phase, the input training data set is {(x _i , y _i ), i=1, 2,..., N}, where x _i represents the training face image and y _i is the user tag (ie the category tag) , which user belongs to).

Before training, the model parameters θ _id (where id represents the initial parameter value), the learning rate η(t), and the number of iterations t are initialized in the face feature model, and the initial values are configured, such as the learning rate η(t The initial value is 0.1, and the initial value of t is 0 (t←0).

The training process is as follows:

In the t+1th iteration (t←t+1), the training samples {(x _i , y _i )} are randomly extracted from the training data set.

Calculate the forward process and obtain training face features:

f _i =Conv(x _i ,θ _id )

Among them, Conv () represents the face feature model.

Calculating the first loss rate when the trained face feature is used for face recognition, and calculating the first gradient by using the first loss rate to derive the partial deviation of the model parameters:

Among them, IdentificationLoss represents the first loss rate when training face features for face recognition.

In one implementation, the probability that the training facial feature f _i belongs to the preset user tag is calculated by means of multiple regression.

The first loss rate IdentificationLoss for face recognition using the probabilistic calculation to train the face features.

Where p _i is the probability distribution of the target (ie, the probability distribution of the target user tag),

The probability distribution for the prediction (ie, the probability distribution of the predicted user labels).

If the first loss rate does not converge (eg, the difference between multiple consecutive first loss rates is greater than or equal to At the preset difference threshold), the model parameters of the face feature model are updated for the next iteration:

Conversely, if the first loss rate converges (eg, the difference between the plurality of consecutive first loss rates is less than the preset difference threshold), the training is ended and the model parameter θ _{id is} output.

Of course, in addition to whether the first loss rate converges as an iterative judgment condition, other conditions may be used as the judgment condition of the iteration, such as whether the first gradient converges, whether the number of iterations reaches the iteration threshold, and the like. There is no restriction on this.

In step 104, the face feature model is adjusted by using the paired training face image and the document face image.

In an implementation manner, the face feature model can be adaptively adjusted according to the characteristics of the face image of the document.

In an implementation manner, after the face model is adjusted by using the paired training face image and the document face image, the adjusted face feature model can be obtained. In one case, the adjusted person The face feature model belongs to the face model, that is, the face model includes the above-mentioned adjusted face feature model.

In an embodiment of the present application, step 104 may include the following sub-steps:

Sub-step S31, using the paired training face image and the document face image to train the face feature model based on face authentication to adjust the model parameter from the initial parameter value to the target parameter value.

In the second stage, the training face image and the fingerprint face image paired sample are used for the authentication signal to be supervised and trained to obtain the final model parameter θ _c = θ _ve .

In an embodiment of the present application, sub-step S31 may include the following sub-steps:

Sub-step S311, pairing the training face image and the document face image belonging to the same user;

Sub-step S312, randomly extracting the paired training face image and the document face image;

Sub-step S313, the randomly extracted, paired training face image and the document face image are input into the face feature model to extract the training face feature and the document face feature;

Sub-step S314, calculating a training face feature and a document face feature for the second face authentication Loss rate

Sub-step S315, it is determined whether the second loss rate converges; if so, sub-step S316 is performed, and if not, sub-step S317 is performed;

Sub-step S316, taking the parameter value of the model parameter of the current iteration as the target parameter value;

Sub-step S317, calculating a second gradient by using a second loss rate;

Sub-step S318, the parameter value of the model parameter is decreased by using the second gradient and the preset learning rate, and the process returns to the sub-step S312.

In one implementation, the facial feature model can be trained by random gradient descent.

In the second phase, the input training data set is {(X _ij , l _ij ), i=1, 2, . . . , M, j=1, 2, . . . , N, where Xij=xi, xj represents a For training face images and document face images, lij is a binary label, l _ij is a classification label, l _ij =1 indicates that the training face image and the document face image are from the same person, and l _ij =-1 indicates training the face. Image and document face images come from different people.

For example, as shown in FIG. 2, the first document face image and the second training face image can be paired, and the first document face image and the third training face image can be paired, first The image of the document face and the image of the training face of the fourth frame can be paired, and so on.

Before the adjustment, the model parameters θ _ve (where ve is the target parameter value), the learning rate η(t), and the number of iterations t in the face feature model are initialized, and the initial values, such as θ _ve = θ _id , are configured. The initial value of the learning rate η(t) is 0.1, and the initial value of t is 0 (t←0).

The adjustment process is as follows:

In the t+1th iteration (t←t+1), the training samples {(X _ij , l _ij )} are randomly extracted from the training data set.

Calculate the forward process to obtain training face features and document face features:

f _ij =Conv(X _ij , θ _ve )

Among them, Conv () represents the face feature model.

Calculate the second loss rate of the face features and the face of the document specially used for face authentication, and calculate the second gradient by using the second loss rate to obtain the partial deviation of the model parameters:

Among them, VerificationLoss indicates the second loss rate when the face feature is used for face authentication.

In one implementation, the distance between the training face feature and the document face feature can be calculated.

The loss rate VerificationLoss for face authentication using distance calculation to train face features and document face features.

among them,

Represents the distance between the training face feature f _i and the document face feature f _j , σ represents the weight, w represents the slope, and b represents the intercept.

If the second loss rate does not converge (eg, the difference between the plurality of consecutive second loss rates is greater than or equal to the preset difference threshold), the model parameters of the face feature model are updated for the next iteration:

Conversely, if the second loss rate converges (eg, the difference between the plurality of consecutive second loss rates is less than the preset difference threshold), the adjustment is ended and the model parameter θ _c = θ _{ve is} output.

Of course, in addition to whether the second loss rate converges as an iterative judgment condition, other conditions may be used as the judgment condition of the iteration, such as whether the second gradient converges, whether the number of iterations reaches the iteration threshold, and the like. There is no restriction on this.

Referring to FIG. 3, a flow chart of steps of a method for training a face model of another application of the present application is shown, which may specifically include the following steps:

In step 301, a training sample is obtained.

Wherein, the training samples include training image data and document image data.

Step 302, extracting a training face image and a document face image in the training image data and the document image data.

In step 303, the face feature model is trained by training the face image.

In step 304, the face feature model is adjusted by using the paired training face image and the document face image.

Step 305, using the paired training face image and the document face image, and training the face authentication model according to the joint Bayesian.

In one implementation, the trained face model includes a face authentication model, which may be used to calculate similarities between facial features.

That is to say, in one implementation manner, the trained face model may include not only a face feature model but also a face authentication model.

In an embodiment of the present application, in order to further enhance the discriminability of the face feature and perform the authentication process, the Joint Bayesian (JB, Joint Bayesian) classifier may be trained by training the face image and the document face image.

Among them, the joint Bayesian is a classifier based on the Bayesian method. By classifying a pair of features by the logarithm of the ratio of the two posterior probabilities, the inter-class error can be increased and the intra-class error can be reduced.

At training, the input training data set is {(f _ij ,l _ij )}(i=1,2,...,m _i ,j=1,2,...,N), where f _ij =Conv(x _Ij ; θ _conv ), X _ij = (x _i , x _j ) represents a pair of trained face images and document face images, Conv () represents a face feature model, l _ij is a classification label, and l _ij =1 represents training The face image and the document face image are from the same person, and l _ij = -1 indicates that the training face image and the document face image are from different people.

The training process is as follows:

Sub-step S41, initializing the covariance matrices S _μ and S _ε :

Sub-step S42, calculating matrices F and G:

F=S _ε ^-1

G=-(m _i S _μ +S _ε ) ^-1 S _μ S _ε ^-1

Sub-step S43, calculating μ _i and ε _ij :

Sub-step S44, updating the covariance matrices S _μ and S _ε :

Sub-step S44, it is judged whether S _μ and S _ε converge, and if so, sub-step S45 is performed, and if not, the execution sub-step S42 is returned.

Sub-step S45, the matrices F, G and A are respectively calculated according to the following formula:

F=S _ε ^-1

G=-(2S _μ +S _ε ) ^-1 S _μ S _ε ^-1

A=(S _μ +S _ε ) ^-1 -(F+G)

Sub-step S46, outputting a face authentication model r(x ₁ , x ₂ )

In the embodiment of the present application, the face feature model includes a network model such as a Convolutional Neural Network (CNN) or a Deep Neural Networks (DNN).

In the embodiment of the present application, the facial feature model may be a network model such as a Convolutional Neural Network (CNN) model or a Deep Neural Networks (DNN) model.

Among them, convolutional neural networks introduce convolutional structures in artificial neural networks. By means of local weight sharing, on the one hand, the amount of computation can be reduced, and on the other hand, more abstract features can be extracted.

In one implementation, the convolutional neural network model can include an input layer, one or more convolutional layers, one or more sampling layers, and an output layer.

Each layer of a convolutional neural network is generally composed of multiple maps. Each map is composed of multiple neural units. All neural units of the same map share a convolution kernel (ie, weight), and the convolution kernel often represents a feature. For example, if a convolution kernel represents an arc, then the convolution kernel is rolled over the entire picture, and the area with a larger convolution value is likely to be an arc.

Input layer: The input layer has no input value and has an output vector. The size of this vector is the size of the tiled face image, such as a matrix of 100×100.

Convolutional layer: The input of the convolutional layer can be derived from the input layer and can be derived from the sampling layer. Each map of the convolutional layer has a convolution kernel of the same size.

Subsampling (Pooling): The sampling layer is a sampling process of the upper layer map. The sampling method is to collect statistics on adjacent small areas of the upper layer map.

In the embodiment of the present application, the model parameters of the convolutional neural network model include a convolution kernel whose parameter value is a value of a convolution kernel, that is, when the face feature model is trained and adjusted, the value of the convolution kernel can be performed. Training and adjustment.

Referring to FIG. 4, a processing flowchart of a convolutional neural network model of an embodiment of the present application is shown, which may specifically include the following steps:

Step 401: When the convolutional layer belongs to the first depth range, the convolution operation is performed by using the specified single convolution kernel.

In the embodiment of the present application, the face image may be input into the convolutional neural network model, and the face image may include a training face image during offline training, a document face image, and may also include a target person in online face authentication. Face images can also include other face images, and so on.

In the shallow layer (ie, the first depth range), convolution kernels can be directly used for convolution, reducing the amount of calculation.

After the first depth range convolution is completed, the normalization operation and the activation operation may be performed by means of a BN (Batch Normalization) operator, a ReLU (Rectified Linear Units) function, or the like.

Step 402: When the convolutional layer belongs to the second depth range, the convolution operation is performed by using the hierarchical linear model Inception.

The number of layers in the second depth range is greater than the number of layers in the first depth range.

In the embodiment of the present application, Inception can be used for convolution in the deep layer (ie, the second depth range). On the one hand, the width and depth of the convolutional neural network model can be increased under the condition of constant calculation, thereby enhancing convolution. The performance of the neural network model; on the other hand, multi-scale facial features can be extracted due to the use of convolution kernels of different sizes (eg, 1×1, 3×3, 5×5).

In an embodiment of the present application, the hierarchical linear model Inception includes a first layer, a second layer, a third layer, and a fourth layer in parallel. In the embodiment of the present application, step 402 may include the following sub-steps:

Sub-step S51, in the first layer, performing convolution operation on the image data input to the hierarchical linear model Inception by using the specified first convolution kernel and the first step length to obtain first feature image data;

In the first layer, the first feature image data can be normalized by means of a BN operator or the like. Work.

It should be noted that, since the face image of the input convolutional neural network model may be a training face image or a document face image during offline training, it may also be a target face image during online face authentication, and therefore, the input point is The image data of the layer linear model Inception also differs in these cases.

Sub-step S52, in the second layer, performing convolution operation on the image data of the hierarchical linear model Inception by using the specified second convolution kernel and the second step to obtain second feature image data;

In the second layer, the second feature image data may be normalized and activated by a BN operator, a ReLU function, or the like.

Sub-step S53, performing convolution operation on the second feature image data by using the specified third convolution kernel and the third step to obtain third feature image data;

In an implementation manner, the third feature image data may be normalized by a BN operator or the like.

Sub-step S54, in the third layer, performing convolution operation on the image data input to the hierarchical linear model Inception by using the specified fourth convolution kernel and the fourth step to obtain fourth feature image data;

In the third layer, the fourth feature image data can be normalized and activated by a BN operator, a ReLU function, or the like.

Sub-step S55, performing a convolution operation on the fourth feature image data by using the specified fifth convolution kernel and the fifth step to obtain the fifth feature image data;

In an implementation manner, the fifth feature image data may be normalized by a BN operator or the like.

Sub-step S56, in the fourth layer, performing convolution operation on the image data of the input hierarchical linear model Inception by using the specified sixth convolution kernel and the sixth step to obtain the sixth feature image data;

In the fourth layer, the sixth feature image data can be normalized by a BN operator or the like.

Sub-step S57, performing a maximum downsampling operation on the sixth feature image data to obtain seventh feature image data;

In the embodiment of the present application, the eighth feature image data activation operation may be performed by a ReLU function or the like.

Sub-step S58, connecting the first feature image data, the third feature image data, the fifth feature image data, and the seventh feature image data to obtain the eighth feature image data.

In one case, the sizes of the first convolution kernel, the second convolution kernel, the third convolution kernel, the fourth convolution kernel, the fifth convolution kernel, and the sixth convolution kernel may be the same or different; The size of the first step, the second step, the third step, the fourth step, the fifth step, and the sixth step may be the same or different, and the comparison of the embodiments of the present application is not limited.

Further, in the hierarchical linear model Inception, the processing of the first layer (sub-step S51), the processing of the second layer (sub-step S52 and sub-step S53), processing of the third layer (sub-step S54 and sub-step S55) The processing of the fourth layer (sub-step S56 and sub-step S57) can be performed in parallel, regardless of the order.

To enable a person skilled in the art to better understand the embodiments of the present application, the following examples are used to illustrate the Inception in the embodiments of the present application.

As shown in Figure 5, for input image data (such as a tiled face image):

In the first layer, a 1×1 convolution kernel can be used, a convolution operation is performed with a step size of 1, and then BN normalization is performed.

In the second layer, a 1×1 convolution kernel can be used, the convolution operation is performed with a step size of 1, and then BN normalization and ReLU activation are performed.

Then, using a 5×5 convolution kernel, the convolution operation is performed with a step size of 1, and then BN normalization is performed.

In the third layer, a 1×1 convolution kernel can be used, the convolution operation is performed with a step size of 1, and then BN normalization and ReLU activation are performed.

Then, using a 3×3 convolution kernel, the convolution operation is performed with a step size of 1, and then BN normalization is performed.

In the fourth layer, a 1×1 convolution kernel can be used, the convolution operation is performed with a step size of 1, and the BN normalization is performed, followed by maximization (Max) downsampling.

The image data outputted from the first layer to the fourth layer are connected together, and then ReLu is activated to obtain the output of Inception.

In step 403, maximum downsampling is performed in the sampling layer.

Step 404: Obtain a feature vector according to the plurality of image data outputted by the convolutional neural network model as a face feature of the face image.

In an implementation manner, in the embodiment of the present application, there is no fixed execution order between step 401, step 402 and step 403, and the execution order may be determined according to the actual structure of the convolutional neural network.

In order to enable those skilled in the art to better understand the embodiments of the present application, the convolutional neural network model in the embodiment of the present application is described below by way of specific examples.

Table 1

In this example, as shown in Table 1, the convolutional neural network has 17 convolutional layers and a sampling layer, of which 1, 3, 4, 6, 7, 9, 10, 11, 12, 13, 15 16 layers are convoluted layers, and the first, third, and fourth layers are shallow, the sixth, seventh, ninth, tenth, eleventh, eleventh, thirteenth, and thirteenth, and sixteenth; Pick Sample layer.

Convolution layer 1:

Suppose that a 100×100 gray-scale block face image is normalized after inputting a frame. First, a 5×5 convolution kernel is used, and the step size is 2, which is convolved to obtain 64 50×50 feature images. Then, the 64 50×50 feature images are first BN normalized, and then ReLU is activated.

Sampling layer 1:

The 64 50×50 feature images outputted by the convolutional layer 1 are subjected to 3×3 maximization downsampling with a step size of 2 to obtain 64 14×14 feature images.

Convolution 2:

For the 64 14×14 feature images output by the sampling layer 1, a 1×1 convolution kernel is used, and a convolution operation is performed with a step size of 1, to obtain 64 14×14 feature images, and then the 64 frames 14× The feature image of 14 is first normalized by BN, and then ReLU is activated.

Convolution layer 3

For the 64 14×14 feature images outputted by the convolutional layer 2, a 3×3 convolution kernel is used, and a convolution operation is performed with a step size of 1, to obtain 92 14×14 feature images, and then to the 92 frames 14 The feature image of ×14 is first normalized by BN, and then ReLU is activated.

Sampling layer 2

The 92 14×14 feature images outputted by the convolution layer 3 are subjected to 3×3 maximization downsampling with a step size of 1, and 92 14×14 feature images are obtained.

Convolution layer 4

For the 92 14×14 feature images output by the sampling layer 2, the following operations are performed using Inception as shown in FIG. 5 to obtain 256 14×14 feature images:

Step 1, using a 1×1 convolution kernel for 92 14×14 feature images output by the sampling layer 2, performing a convolution operation with a step size of 1, obtaining 64 14×14 feature images, and then 64 The feature image of 14×14 is subjected to BN normalization.

Step 2, using a 1×1 convolution kernel for 92 14×14 feature images output by the sampling layer 2, The convolution operation is performed with a step size of 1, and 96 14×14 feature images are obtained. Then, the 96 14×14 feature images are first BN normalized, and then ReLU is activated.

Then, using a 3×3 convolution kernel, the convolution operation is performed with a step size of 1, and 128 14×14 feature images are obtained, and then the 128 14×14 feature images are BN normalized.

Step 3: Using a 1×1 convolution kernel for 92 14×14 feature images output by the sampling layer 2, performing a convolution operation with a step size of 1, obtaining 16 14×14 feature images, and then 16 The feature image of 14×14 is first normalized by BN, and then ReLU is activated.

Then, using a 5×5 convolution kernel, the convolution operation is performed with a step size of 1, and 32 14×14 feature images are obtained, and then the 32 14×14 feature images are BN normalized.

Step 4: Using a 1×1 convolution kernel for 92 14×14 feature images output by the sampling layer 2, performing a convolution operation with a step size of 1, obtaining 32 14×14 feature images, and then 32 The feature image of 14×14 is subjected to BN normalization.

Then, the maximum downsampling operation is performed on the 32 14×14 feature images to obtain 32 14×14 feature images.

Step 5: Connect the feature images output in steps 1 to 4 to obtain 256 14×14 feature images, and perform ReLu activation on the connected 256 14×14 feature images to obtain the output of the convolution layer 4. .

For the operation of the convolutional layer 5-convolution layer 12, the sampling layer 3-sampling layer 5, the processes of convolutional layer 1-4, sampling layer 1-2 can be referred to.

Finally, the sampling layer 15 outputs 1024 1×1 feature images, and sequentially arranges the 1024 1×1 feature images into a feature vector with a dimension of 1024 dimensions, which is a frame 100×100 person. The original face feature of the face image through the convolution network.

Referring to FIG. 6 , a flow chart of a method for a face authentication method based on a face model of the present application is shown. The face model includes a face feature model, and the method may specifically include the following steps:

Step 601: When receiving an instruction of face authentication, collecting target image data.

In an actual application, the embodiment of the present application can be applied to a face recognition system, such as an access control system, a monitoring system, a payment system, etc., to perform authentication processing on a user.

If an instruction for face authentication is received in the face recognition system, the target image data can be acquired by a camera or the like.

Step 602: Extract a target face image in the target image data.

In one embodiment of the present application, step 602 can include the following sub-steps:

Sub-step S61, performing face detection in the target image data to determine a target face image;

Sub-step S62, performing face feature point positioning in the target face image to determine target eye data;

Sub-step S63, aligning the target eye data;

Sub-step S64, the target face image other than the target eye data is similarly transformed according to the positional relationship of the target eye data, and the normalized target face image is obtained.

In the embodiment of the present application, AdaBoost may be used to perform face detection on the target image data, and the target face image is mapped on the detected target face image by using a coase-to-fine method, and the target eye data after the positioning is utilized. The position coordinates are normalized by a similar transformation. For example, the normalized target face image has a size of 100×100.

Step 603: Input the target face image into the pre-trained face feature model to extract the target face feature.

Applying the embodiment of the present application, the face feature model can be trained as follows:

Sub-step 6031, acquiring training samples, where the training samples include training image data and document image data;

Sub-step 6032, extracting a training face image and a document face image in the training image data and the document image data;

Sub-step 6033, training the face feature model by training the face image;

Sub-step 6034, the face feature model is adjusted by using the paired training face image and the document face image.

Step 604, performing authentication processing according to the target facial feature and the specified document image data.

In one embodiment of the present application, step 604 can include the following sub-steps:

Sub-step S71, acquiring the document face feature of the document face image in the specified document image data;

The document image data may be image data in a user ID that needs to be authenticated.

For example, in the payment system, the certificate image data of the ID card of the user to which the account belongs is specified to perform authentication processing.

The document face features of the document face image can be extracted in advance and stored in the database, and can be directly extracted when the face is authenticated.

Sub-step S72, the target face feature and the document face feature are input according to the face authentication model of the joint Bayesian training, and the similarity is obtained;

In an implementation manner, the face model may further include a face authentication model, and the target face feature and the document face feature may be input according to the face recognition model of the joint Bayesian training to obtain the similarity.

Applying the embodiment of the present application, the face authentication model can be trained as follows:

Sub-step S721, using the paired training face image and the document face image, and training the face authentication model according to the joint Bayesian;

Sub-step S73, it is determined whether the similarity is greater than or equal to the preset similarity threshold; if so, sub-step S74 is performed, and if not, sub-step S75 is performed;

Sub-step S74, determining that the target face image and the document face image belong to the same person;

Sub-step S75, determining that the target face image and the document face image do not belong to the same person.

In the embodiment of the present application, a similarity threshold T may be preset.

If the similarity ≥ T, the target face image is similar to the document face image, and the larger one may come from the same person, and the face authentication is successful.

If the similarity <T, it means that the target face image is far away from the document face image, which may come from different people, and the face authentication fails.

In the embodiment of the present application, since the training methods of the face feature model and the face authentication model are basically similar to the application method of the face model training method, the description is relatively simple, and the related method can be found in the training method of the face model. The description of the embodiments may be omitted, and the embodiments of the present application are not described in detail herein.

The database used in the training of the embodiment of the present application is the NEU_Web database as shown in FIG. 7A.

The databases used in the tests were three ID database databases ID_454, ID_55, and ID_229, that is, the database used during training did not overlap with the database used during the test.

As shown in FIG. 7B, ID_454 is a database constructed of 445 video images and corresponding ID images collected in an indoor environment, and has strong control over changes in illumination, posture, and expression.

As shown in FIG. 7C, ID_55 is an identity card database of 55 people, and each person in the database contains 9 different photos, different facial expressions and corresponding ID photos.

As shown in FIG. 7D, ID_229 is an ID card database collected under the bank usage scenario, and has more complicated illumination, posture and expression changes.

Calculate the authentication rate on the three databases when the equal error rate is 1%, as shown in Table 2.

Table 2 Face recognition rate of second generation ID card (FRR=1%)

In addition, the EBGM (Elastic Bunch Graph Matching) algorithm, the LGBP (Local Gabor Binary Patterns) algorithm, and the BSF (Block Statistical Features) algorithm are respectively used in the embodiments of the present application. The algorithm, when the equal error rate is 1%, the results are shown in Table 3, and the corresponding ROC (receiver operating characteristic curve) curve is shown in Fig. 8.

Table 3 Comparison of certification results (FRR=1%)

The curve 801 is the ROC curve of the embodiment of the present application, the curve 802 is the ROC curve of the BSF, the curve 803 is the ROC curve of the LGBP, and the curve 804 is the ROC curve of the EBGM.

It can be seen from FIG. 8 that the ROC curve of the embodiment of the present application is closer to the upper left corner than the EBGM, LGBP, and BSF algorithms, that is, the face of the embodiment of the present application is compared with the three algorithms of EBGM, LGBP, and BSF. The accuracy of the certification is higher.

It should be noted that, for the method embodiments, for the sake of simple description, they are all expressed as a series of action combinations, but those skilled in the art should understand that the embodiments of the present application are not limited by the described action sequence, because In accordance with embodiments of the present application, certain steps may be performed in other sequences or concurrently. In the following, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required in the embodiments of the present application.

Referring to FIG. 9, a structural block diagram of an embodiment of a training apparatus for a face model of the present application is shown, which may specifically include the following modules:

a training sample obtaining module 901, configured to acquire training samples, where the training samples include training image data and document image data;

The sample face image extraction module 902 is configured to obtain a training face image and a document face image according to the training image data and the document image data;

a face model training module 903, configured to train a face feature model by using the trained face image;

The face model adjustment module 904 is configured to adjust the face feature model by using the paired training face image and the document face image.

In an embodiment of the present application, the sample face image extraction module 902 may include:

a sample face detection sub-module, configured to perform face detection on the training image data and the document image data respectively, and determine a training face image and a document face image;

a sample face positioning sub-module for use in the training face image and the document face image Perform facial feature point positioning separately to determine training eye data and document eye data;

a sample face alignment submodule, configured to align a position of the training eye data and a position of the document eye data with a preset template position;

Training a face normalization sub-module for performing a similar transformation on the training face image other than the training eye data according to the positional relationship of the training eye data to obtain a normalized training face image;

The document face normalization sub-module is configured to perform a similar transformation on the document face image other than the document eye data according to the positional relationship of the document eye data to obtain a normalized document face image.

In an embodiment of the present application, the face model training module 903 includes:

The training sub-module is configured to train the preset facial feature model based on the face recognition using the trained facial image to train an initial parameter value of the model parameter of the facial feature model.

In an embodiment of the present application, the face model adjustment module 904 includes:

The authentication training sub-module is configured to train the facial feature model based on face authentication by using the paired training face image and the document face image to adjust the model parameter from an initial parameter value to a target parameter value.

In an embodiment of the present application, the identifying the training submodule includes:

a first random sampling unit, configured to randomly extract a training face image;

a first sample face feature extraction unit, configured to input a randomly extracted training face image into a preset face feature model to extract a trained face feature;

a first loss rate calculation unit, configured to calculate a first loss rate when the training face feature is used for face recognition;

a first convergence determining unit, configured to determine whether the first loss rate converges; if yes, call an initial parameter value setting unit, and if not, invoke a first gradient calculating module;

An initial parameter value setting unit, configured to use a parameter value of the model parameter of the current iteration as an initial parameter value;

a first gradient calculating unit, configured to calculate a first gradient by using the first loss rate;

The first gradient descent sub-module is configured to: use the first gradient and the preset learning rate to decrease a parameter value of the model parameter, and return to invoke the first random sampling sub-module.

In an embodiment of the present application, the first loss rate calculation unit includes:

a probability calculation subunit, configured to calculate a probability that the training face feature belongs to a preset user tag;

A face recognition loss rate calculation subunit is configured to calculate a first loss rate for face recognition using the probability to calculate the training face feature.

In an embodiment of the present application, the authentication training sub-module includes:

a data matching unit, configured to pair the training face image and the document face image belonging to the same user;

a second random sampling unit, configured to randomly extract the paired training face image and the document face image;

a second sample face feature extraction unit, configured to input a randomly extracted, paired training face image and a document face image into the face feature model to extract a training face feature and a document face feature;

a second loss rate calculation unit, configured to calculate a loss rate when the training face feature and the document face feature are used for face authentication;

a second convergence determining unit, configured to determine whether the second loss rate converges; if yes, the target parameter value setting unit is called, and if not, the second gradient calculating unit is invoked;

a target parameter value setting unit, configured to use a parameter value of the model parameter of the current iteration as a target parameter value;

a second gradient calculating unit, configured to calculate a second gradient by using the second loss rate;

And a second gradient descent sub-module, configured to decrease a parameter value of the model parameter by using the second gradient and a preset learning rate, and return to call the second random sampling sub-module.

In an embodiment of the present application, the second loss rate calculation submodule includes:

a distance calculation unit, configured to calculate a distance between the training face feature and the document face feature;

And a second authentication loss rate calculation unit, configured to calculate, by using the distance, the second loss rate for the face authentication of the training face feature and the document face feature.

In an embodiment of the present application, the method further includes:

The face authentication model training module is used to train the face authentication model according to the joint Bayesian training face image and the document face image.

In an embodiment of the present application, the face feature model may be a convolutional neural network model, and the convolutional neural network model may include one or more convolution layers, one or more sampling layers, and the volume The model parameters of the product neural network include a convolution kernel;

The convolutional neural network model can include:

a shallow convolution module, configured to perform a convolution operation by using a specified single convolution kernel when the convolutional layer belongs to a first depth range;

a deep convolution module, configured to perform a convolution operation using a hierarchical linear model Inception when the convolutional layer belongs to a second depth range, wherein a number of layers of the second depth range is greater than the first depth range Number of layers

Maximizing the downsampling module for maximizing downsampling in the sampling layer of the person;

And a feature obtaining module, configured to obtain a feature vector according to the plurality of image data output by the convolutional neural network model, as a face feature of the face image.

In an embodiment of the present application, the convolutional neural network model may further include:

The first convolution auxiliary module is configured to perform normalization operation and activation operation after the first depth range convolution is completed.

In an embodiment of the present application, the hierarchical linear model Inception includes a first layer, a second layer, a third layer, and a fourth layer;

The deep convolution module can include:

a first convolution sub-module, configured to perform a convolution operation on the image data input to the hierarchical linear model Inception by using a specified first convolution kernel and a first step length in the first layer to obtain a first feature map Like data;

a second convolution sub-module, configured to perform convolution operation on the image data input to the hierarchical linear model Inception by using the specified second convolution kernel and the second step in the second layer to obtain the second feature image data;

a third convolution sub-module, configured to perform a convolution operation on the second feature image data by using a specified third convolution kernel and a third step size to obtain third feature image data;

a fourth convolution sub-module, configured to perform a convolution operation on the image data input to the hierarchical linear model Inception by using a specified fourth convolution kernel and a fourth step length in the third layer to obtain a fourth characteristic image data;

a fifth convolution sub-module, configured to perform a convolution operation on the fourth feature image data by using a specified fifth convolution kernel and a fifth step to obtain fifth feature image data;

a sixth convolution sub-module, configured to perform a convolution operation on the image data input to the hierarchical linear model Inception by using a specified sixth convolution kernel and a sixth step length in the fourth layer to obtain a sixth characteristic image data;

a sampling sub-module, configured to perform a maximum downsampling operation on the sixth feature image data to obtain seventh feature image data;

And an image connection submodule, configured to connect the first feature image data, the third feature image data, the fifth feature image data, and the seventh feature image data to obtain eighth feature image data.

In an embodiment of the present application, the deep convolution module may further include:

a second convolution auxiliary submodule, configured to perform normalization operation on the first feature image data in the first layer;

a third convolution auxiliary submodule, configured to perform normalization operation and an activation operation on the second feature image data in the second layer;

a fourth convolution auxiliary submodule, configured to perform normalization operation on the third feature image data;

a fifth convolution auxiliary submodule, configured to perform the fourth feature image data in the third layer Normalized operations and activation operations;

a sixth convolution auxiliary submodule, configured to perform normalization operation on the fifth feature image data;

a seventh convolution auxiliary submodule, configured to perform normalization operation on the sixth feature image data in the fourth layer;

And an eighth convolution auxiliary submodule, configured to activate the eighth feature image data.

Referring to FIG. 10, a block diagram of a face face device-based face authentication device according to the present application is shown. The face model includes a face feature model, and the device may specifically include the following modules:

The target image data module 1001 is configured to collect target image data when receiving an instruction of face authentication;

a target face image extraction module 1002, configured to extract a target face image in the target image data;

The target facial feature extraction module 1003 is configured to input the target facial image into the pre-trained facial feature model to extract the target facial feature;

The authentication processing module 1004 is configured to perform an authentication process according to the target facial feature and the specified document image data;

In one implementation, the face model invokes the following module training:

a sample face image extraction module, configured to extract a training face image and a document face image in the training image data and the document image data;

The face model adjustment module is configured to adjust the face feature model by using the paired training face image and the document face image.

In an embodiment of the present application, the target face image extraction module 1002 may include:

a target face detection sub-module, configured to perform face detection in the target image data to determine a target face image;

a target face positioning sub-module, configured to perform face feature point positioning in the target face image to determine target eye data;

a target face alignment submodule for aligning the target eye data;

The target face normalization sub-module is configured to perform similar transformation on the target facial image other than the target eye data according to the positional relationship of the target eye data to obtain a normalized target facial image.

In an embodiment of the present application, the face model further includes a face authentication model, and the authentication processing module 1004 may include:

a document face feature acquisition sub-module, configured to obtain a document face feature of the document face image in the specified document image data;

a similarity calculation sub-module, configured to input the target facial feature and the document face feature according to a face authentication model of joint Bayesian training to obtain a similarity;

a similarity threshold determining sub-module, configured to determine whether the similarity is greater than or equal to a preset similarity threshold; if yes, calling the first determining sub-module; if not, calling the second determining sub-module;

a first determining submodule, configured to determine that the target face image and the document face image belong to the same person;

And a second determining submodule, configured to determine that the target face image and the document face image do not belong to the same person.

For the device embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.

The embodiment of the present application further provides an electronic device, as shown in FIG. 11, including a processor 111, a communication interface 112, a memory 113, and a communication bus 114. The processor 111, the communication interface 112, and the memory 113 pass through the communication bus 114. Completing communication with each other; a memory 113 for storing a computer program;

The processor 111 is configured to implement the training method of any face model provided by the embodiment of the present application, where the training method of the face model may include the following steps:

Training the facial feature model by using the trained face image;

Applying the embodiment of the present application, the processor of the electronic device runs the computer program stored in the memory to perform the training method of any face model provided by the embodiment of the present application, thereby achieving the problem of solving the problem of imbalance in the number of samples and improving The performance of the model improves the accuracy of face authentication. Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows good robustness to factors such as age, posture and illumination.

The embodiment of the present application further provides a computer program for a training method that is executed to perform any of the face models provided by the embodiments of the present application, wherein the training method of the face model may include the following steps:

Training the facial feature model by using the trained face image;

Applying the embodiment of the present application, the computer program can execute the training method of any face model provided by the embodiment of the present application at runtime, so that the problem of unbalanced sample quantity can be solved, and the performance of the model is improved, thereby improving the person. The accuracy of face authentication. Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows the factors such as age, posture and illumination. Good robustness.

The embodiment of the present application provides a storage medium for storing a computer program, where the computer program is executed to perform a training method for any face model provided by the embodiment of the present application, wherein the face model The training method can include the steps of:

Training the facial feature model by using the trained face image;

Applying the embodiment of the present application, the storage medium stores a computer program that executes the training method of any of the face models provided by the embodiments of the present application at runtime, and thus can solve the problem of unbalanced sample quantity and improve the performance of the model. Thereby improving the accuracy of face authentication. Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows good robustness to factors such as age, posture and illumination.

The embodiment of the present application further provides an electronic device, as shown in FIG. 12, including a processor 121, a communication interface 122, a memory 123, and a communication bus 124. The processor 121, the communication interface 122, and the memory 123 pass through the communication bus 124. Completing communication with each other; a memory for storing computer programs;

The processor 121 is configured to implement the face authentication method based on the face model provided by the embodiment of the present application when the computer program stored in the memory 123 is executed, wherein the face authentication method based on the face model can be Including steps:

Extracting a target face image in the target image data;

Applying the embodiment of the present application, the processor of the electronic device runs the computer program stored in the memory In order to implement the face authentication method based on the face model provided by the embodiment of the present application, it is possible to solve the problem of unbalanced sample quantity and improve the performance of the model, thereby improving the accuracy of face authentication. . Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows good robustness to factors such as age, posture and illumination.

The embodiment of the present application further provides a computer program for executing a face authentication method based on a face model provided by an embodiment of the present application, wherein the face model based person The face authentication method may include the steps of:

Extracting a target face image in the target image data;

Applying the embodiment of the present application, the computer program can execute any face authentication method based on the face model provided by the embodiment of the present application at runtime, so that the problem of unbalanced sample quantity can be solved, and the performance of the model is improved. Thereby improving the accuracy of face authentication. Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows good robustness to factors such as age, posture and illumination.

The embodiment of the present application provides a storage medium, where the storage medium is used to execute a computer face program, and the computer program is executed to execute any face recognition method based on the face model provided by the embodiment of the present application, where The face authentication method based on the face model may include the steps of:

Extracting a target face image in the target image data;

Applying the embodiment of the present application, the storage medium stores a computer program for executing the face authentication method based on the face model provided by the embodiment of the present application at the time of running, thereby implementing the problem of solving the problem of imbalance in the number of samples and improving the problem. The performance of the model, which improves the accuracy of face authentication. Moreover, the feature expression of the face does not depend on the artificial selection of features, and shows good robustness to factors such as age, posture and illumination.

The various embodiments in the present specification are described in a progressive manner, and each embodiment focuses on differences from other embodiments, and the same similar parts between the various embodiments can be referred to each other.

Those skilled in the art will appreciate that embodiments of the embodiments of the present application can be provided as a method, apparatus, or computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.

Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowchart illustrations and/or FIG. These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing terminal device to produce a machine such that instructions are executed by a processor of a computer or other programmable data processing terminal device Means are provided for implementing the functions specified in one or more of the flow or in one or more blocks of the flow chart.

The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising the instruction device. The instruction device implements the functions specified in one or more blocks of the flowchart or in a flow or block of the flowchart.

These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device such that a series of operational steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, such that the computer or other programmable terminal device The instructions executed above provide steps for implementing the functions specified in one or more blocks of the flowchart or in a block or blocks of the flowchart.

While a preferred embodiment of the embodiments of the present application has been described, those skilled in the art can make further changes and modifications to the embodiments once they are aware of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including all the modifications and the modifications

Finally, it should also be noted that in this context, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply these entities. There is any such actual relationship or order between operations. Furthermore, the terms "comprises" or "comprising" or "comprising" or any other variations are intended to encompass a non-exclusive inclusion, such that a process, method, article, or terminal device that includes a plurality of elements includes not only those elements but also Other elements that are included, or include elements inherent to such a process, method, article, or terminal device. An element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that comprises the element, without further limitation.

The above is a training method for a face model provided by the present application, a face authentication method based on a face model, a training device for a face model, and a face authentication device based on a face model. DETAILED DESCRIPTION OF THE INVENTION The principles and embodiments of the present application have been described with reference to specific examples. The description of the above embodiments is only for helping to understand the method of the present application and its core ideas. Meanwhile, for those skilled in the art, In view of the idea of the present application, there are variations in the specific embodiments and the scope of application, and the contents of the present specification should not be construed as limiting the present application.

Claims

A training method for a face model, comprising:

Obtaining a training sample, the training sample including training image data and document image data;

Obtaining a training face image and a document face image according to the training image data and the document image data;

Training the facial feature model by using the trained face image;

The face feature model is adjusted by using a paired training face image and a document face image.
The method according to claim 1, wherein the step of training the facial feature model using the trained face image comprises:

The preset face feature model is trained based on face recognition using the trained face image to train an initial parameter value of the model parameter of the face feature model.
The method according to claim 2, wherein the step of adjusting the face feature model by using the paired training face image and the document face image comprises:

The face feature model is trained based on face authentication using the paired training face image and the document face image to adjust the model parameter from the initial parameter value to the target parameter value.
The method according to claim 2, wherein the training the face feature model based on face recognition using the trained face image to train an initial model parameter of the face feature model The parameter values for training the face image include:

Randomly extract training face images;

Extracting the trained face image into the preset face feature model to extract the trained face feature;

Calculating a first loss rate when the training face feature is used for face recognition;

Determining whether the first loss rate converges;

If yes, the parameter value of the model parameter of the current iteration is used as the initial parameter value;

If not, calculating the first gradient by using the first loss rate; using the first gradient and the preset learning rate to decrease the parameter value of the model parameter, and returning to performing the random extraction training face image step.
The method according to claim 4, wherein the step of calculating the first loss rate when the training face feature is used for face recognition comprises:

Calculating a probability that the training face feature belongs to a preset user tag;

The first loss rate for face recognition of the trained face feature is calculated using the probability.
The method according to claim 3, wherein said training of the face image and the document face image using the pairing are performed based on face authentication to train the face feature model to The steps of adjusting the initial parameter values to the target parameter values include:

Pairing training face images and document face images belonging to the same user;

Randomly extracting paired training face images and document face images;

Importing the randomly extracted, paired training face image and the document face image into the face feature model to extract the training face feature and the document face feature;

Calculating a second loss rate when the training face feature and the document face feature are used for face authentication;

Determining whether the second loss rate converges;

If yes, the parameter value of the model parameter of the current iteration is taken as the target parameter value;

If not, calculating the second gradient by using the second loss rate;

And decreasing the parameter value of the model parameter by using the second gradient and the preset learning rate, and returning to the step of performing the random extraction pairing of the training face image and the document face image.
The method according to claim 6, wherein the step of calculating the second loss rate when the training face feature and the document face feature are used for face authentication comprises:

Calculating a distance between the training face feature and the document face feature;

The second loss rate for face authentication of the training face feature and the document face feature is calculated using the distance.
The method of any of claims 1-6, further comprising:

The paired training face image and the document face image are used to train the face authentication model according to the joint Bayesian.
The method according to any one of claims 1 to 6, wherein the face feature model is a convolutional neural network model, the convolutional neural network model comprising one or more convolution layers, one or more a sampling layer, the model parameters of the convolutional neural network including a convolution kernel;

The convolutional neural network model processes the input face image as follows:

When the convolutional layer belongs to the first depth range, the convolution operation is performed using the specified single convolution kernel;

When the convolutional layer belongs to the second depth range, the convolution operation is performed by using a hierarchical linear model Inception, wherein the number of layers of the second depth range is greater than the number of layers of the first depth range;

In the sampling layer, performing maximum downsampling;

A feature vector is obtained from the plurality of image data output from the convolutional neural network model as a face feature of the face image.
The method according to claim 9, wherein the hierarchical linear model Inception comprises a first layer, a second layer, a third layer, and a fourth layer;

The steps of performing a convolution operation using a hierarchical linear model Inception include:

In the first layer, performing convolution operation on the image data input to the hierarchical linear model Inception by using the specified first convolution kernel and the first step length to obtain first feature image data;

In the second layer, convoluting the image data input to the hierarchical linear model Inception with a specified second convolution kernel and a second step to obtain second feature image data; using the specified third convolution Performing a convolution operation on the second feature image data by the core and the third step to obtain third feature image data;

In the third layer, the image data input to the hierarchical linear model Inception is convoluted by using the specified fourth convolution kernel and the fourth step to obtain the fourth feature image data; and the specified fifth convolution is adopted. Performing a convolution operation on the fourth feature image data by the kernel and the fifth step to obtain the fifth feature Image data

In the fourth layer, performing convolution operation on the image data input to the hierarchical linear model Inception by using the specified sixth convolution kernel and the sixth step to obtain the sixth feature image data;

Performing a maximum downsampling operation on the sixth feature image data to obtain seventh feature image data;

The first feature image data, the third feature image data, the fifth feature image data, and the seventh feature image data are connected to obtain eighth feature image data.
A face authentication method based on a face model, wherein the face model is a face model obtained by the training method according to any one of claims 1 to 10, wherein the face model includes a face a feature model, the face authentication method includes:

Collecting target image data when receiving an instruction for face authentication;

Extracting a target face image in the target image data;

Extracting the target face image into a pre-trained face feature model to extract a target face feature;

The authentication process is performed according to the target face feature and the specified document image data.
The method according to claim 11, wherein the face model further comprises a face authentication model, and the step of performing authentication processing according to the target face feature and the specified document image data comprises:

Obtaining a document face feature of the document face image in the specified document image data;

And inputting the target facial feature and the document face feature according to a face authentication model of joint Bayesian training to obtain a similarity;

Determining whether the similarity is greater than or equal to a preset similarity threshold;

If yes, determining that the target face image and the document face image belong to the same person;

If not, it is determined that the target face image and the document face image do not belong to the same person.
A training device for a face model, comprising:

a training sample obtaining module, configured to acquire a training sample, where the training sample includes training image data and document image data;

a sample face image extraction module, configured to obtain a training face image and a document face image according to the training image data and the document image data;

a face model training module, configured to train a face feature model by using the trained face image;

The face model adjustment module is configured to adjust the face feature model by using the paired training face image and the document face image.
A face authentication device based on a face model, wherein the face model is a face model obtained by the training device according to claim 13, the face model includes a face feature model, The face authentication device includes:

a target image data module, configured to collect target image data when receiving an instruction for face authentication;

a target face image extraction module, configured to extract a target face image in the target image data;

a target facial feature extraction module, configured to input the target facial image into a pre-trained facial feature model to extract a target facial feature;

The authentication processing module is configured to perform an authentication process according to the target facial feature and the specified document image data.
An electronic device, comprising: a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;

a memory for storing a computer program;

A training method for implementing the face model according to any one of claims 1 to 10 when the processor is configured to execute a computer program stored on the memory.
A computer program for use in a training method executed to perform the face model of any of claims 1-10.
A storage medium, characterized in that the storage medium is for storing a computer program, the computer program being executed to perform the training method of the face model according to any one of claims 1-10.
An electronic device, comprising: a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;

a memory for storing a computer program;

The face recognition method based on the face model according to claim 11 or 12 is implemented when the processor is configured to execute a computer program stored on the memory.
A computer program, characterized in that the computer program is used to execute a face model based face authentication method according to claim 11 or 12.
A storage medium, characterized in that the storage medium is for storing a computer program, the computer program being executed to execute the face model based face authentication method according to claim 11 or 12.