CN109711358B - Neural network training method, face recognition system and storage medium - Google Patents


Info

Publication number: CN109711358B
Application number: CN201811630038.1A
Authority: CN (China)
Legal status: Active (granted)
Other versions: CN109711358A
Inventors: 孔彦, 吴富章, 赵宇航, 赵玉军, 王黎明
Original and current assignee: Beijing Yuanjian Information Technology Co Ltd
Application filed by Beijing Yuanjian Information Technology Co Ltd; priority to CN201811630038.1A; published as CN109711358A, granted as CN109711358B

Landscapes

  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The application provides a neural network training method, a face recognition system and a storage medium, and relates to the field of face recognition. The neural network training method of the application includes: adjusting the weight vectors of the fully-connected layer according to the output vectors obtained by passing the reference feature vectors through the convolutional layer; and optimizing the parameters of the convolutional layer according to the loss values obtained by passing the feature vectors through the loss function layer and a specified optimization algorithm, to obtain the final convolutional-layer parameters. Correspondingly, based on the neural network training method provided by the application, the application also provides a face recognition method. Compared with the prior art, the face recognition method provided by the application greatly improves face recognition effect and accuracy when the numbers of reference images and sample images in the training process are severely unbalanced or the shooting scenes of the reference images and the sample images differ significantly; meanwhile, in some general face recognition application scenarios, the face recognition method of the embodiments of the application also achieves a good face recognition effect.

Description

Neural network training method, face recognition system and storage medium
Technical Field
The present application relates to the field of face recognition technology, and in particular, to a neural network training method, a face recognition system, and a storage medium.
Background
Face recognition is a technology that identifies different people based on facial appearance features; its application scenarios are broad, and related research and applications have been under way for decades. With the development of related technologies such as big data and deep learning in recent years, face recognition performance has improved dramatically, and the technology is applied in scenarios such as identity authentication, video surveillance, and beauty and entertainment. The problem of person-ID comparison, i.e., face recognition between a standard certificate photo and a living photo, has attracted increasing attention, because recognizing a target person only requires deploying the certificate photo in a database, sparing the target person the trouble of registering a living photo in the system.
Face recognition models in the prior art achieve good face recognition effect and accuracy in some general face recognition scenarios. However, in some application scenarios, for example person-ID comparison, when the numbers of reference images and sample images are severely unbalanced, or the shooting scenes of the reference images and the sample images differ significantly, the face recognition effect and accuracy of prior-art methods are very low.
Disclosure of Invention
The neural network training method, face recognition system and storage medium provided by the embodiments of the invention can solve the prior-art problem of low face recognition accuracy when the numbers of reference images and sample images are severely unbalanced or the shooting scenes of the reference images and the sample images differ significantly.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present application provides a neural network training method, where the neural network includes a convolutional layer, a fully-connected layer, and a loss function module, which are connected in sequence, and in a single iterative training process, the method includes:
acquiring T feature vectors, where the T feature vectors comprise feature vectors of D reference face images and feature vectors of M sample face images, the number D of feature vectors of reference face images is the same as the number of weight vectors of the fully-connected layer, each sample face image among the M sample face images and one reference face image among the D reference face images are face images of the same person, T is equal to the sum of D and M, and T, D, M are positive integers; setting the learning rate of the fully-connected layer to zero; inputting the t-th feature vector among the T feature vectors into the neural network, and obtaining the t-th output feature vector through the processing of the convolutional layer, where t is an integer greater than or equal to 1 and less than or equal to T, and the t-th feature vector is a feature vector of a reference face image or a feature vector of a sample face image among the T feature vectors; normalizing the t-th output feature vector in the fully-connected layer to obtain the t-th normalized output feature vector; judging whether the t-th feature vector is the feature vector of a reference face image, and, when it is, adjusting the normalized weight matrix of the fully-connected layer according to the t-th normalized output feature vector, where the normalized weight matrix of the fully-connected layer consists of D column vectors; in the fully-connected layer, obtaining the t-th classification vector from the normalized output vector and the normalized weight matrix of the fully-connected layer; inputting the t-th classification vector into the loss function module, and judging whether the number of classification vectors in the loss function module has reached $N_n$, where $N_n$ is an integer greater than or equal to 1 and less than or equal to T, and the sum of the $N_n$ is equal to T; when the number of classification vectors in the loss function module reaches $N_n$, obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors, optimizing the parameters of the convolutional layer according to the n-th loss function value and a specified optimization algorithm, and emptying the classification vectors of the loss function module; and executing the above steps with t sequentially taking 1 to T, to obtain the target parameters of the convolutional layer.
In the embodiment of the present application, in a single iteration, the weight matrix of the fully-connected layer is adjusted according to the output vector obtained by passing a reference face image through the convolutional layer; therefore, in the training process, the learning rate of the fully-connected layer of the neural network needs to be set to zero, to ensure that the fully-connected layer is not affected by the back propagation of the neural network when the reference feature vectors and sample feature vectors are used to train it. In a single iteration, the feature vectors are sequentially input into the neural network; when the feature vector input into the neural network is the feature vector of a reference face image, it is processed by the convolutional layer to obtain a reference output feature vector, and the weight matrix of the fully-connected layer is adjusted according to this reference output feature vector. When the number of reference images is far smaller than the number of sample images, or the shooting scenes of the reference images and the sample images differ significantly, adjusting the weight vectors of the fully-connected layer according to the reference output feature vectors helps improve the performance of the neural network in reference-image/sample-image comparison scenarios, and can accelerate the convergence of the neural network training process.
Further, in a single iteration, the feature vectors of the reference face images and the sample face images are sequentially processed by the convolutional layer, the fully-connected layer and the loss function module to obtain multiple loss values, and the parameters of the convolutional layer are adjusted to their final values according to each loss value and a specified optimization algorithm. In face recognition applications, these final parameters replace the convolutional-layer parameters of a face recognition system, which can then recognize whether the faces corresponding to a reference image and a sample image are the same face. When the number of reference face images is far smaller than the number of sample face images in the training process, because the weight matrix of the fully-connected layer is adjusted according to the output feature vectors of the reference images while the convolutional-layer parameters are trained, the output vectors of the sample face images and of the reference face images produced by the convolutional layer represent the sample face features and reference face features more accurately; therefore, based on these convolutional-layer parameters, comparing reference image feature vectors and sample image feature vectors through the face recognition system has good accuracy.
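For illustration only, the single-iteration control flow described above can be summarized in a minimal PyTorch-style sketch. The names conv_net, fc_weight, ref_index, loss_fn and batch_size are hypothetical, the fixed batch size stands in for the $N_n$ of the description, and the weight vectors are assumed to be stored as columns of fc_weight:

```python
import torch
import torch.nn.functional as F

def train_one_iteration(conv_net, fc_weight, feature_vectors, is_reference,
                        ref_index, optimizer, loss_fn, batch_size):
    """One pass over the T feature vectors (D reference + M sample).
    fc_weight: (feature_dim, D) fully-connected weight matrix with
    requires_grad=False, i.e. its learning rate is zero."""
    buffered, labels = [], []
    for t, vec in enumerate(feature_vectors):
        x_t = conv_net(vec.unsqueeze(0)).squeeze(0)       # convolutional layers
        x_hat = F.normalize(x_t, dim=0)                   # t-th normalized output vector
        if is_reference[t]:
            with torch.no_grad():                         # keep the update out of autograd
                fc_weight[:, ref_index[t]] = x_hat.detach()  # replace d-th weight vector
        W_hat = F.normalize(fc_weight, dim=0)             # normalized weight matrix
        f_t = W_hat.t() @ x_hat                           # t-th classification vector
        buffered.append(f_t)
        labels.append(ref_index[t])
        if len(buffered) == batch_size:                   # N_n classification vectors gathered
            loss = loss_fn(torch.stack(buffered), torch.tensor(labels))
            optimizer.zero_grad()
            loss.backward()                               # n-th loss value drives the update
            optimizer.step()                              # adjust convolutional-layer parameters
            buffered, labels = [], []                     # empty the loss module
```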
With reference to the first aspect, an embodiment of the present application provides a first possible implementation manner of the first aspect, where, when the t-th feature vector is the feature vector of a reference face image, adjusting the normalized weight matrix $\hat{W}$ of the fully-connected layer according to the t-th normalized output feature vector $\hat{x}_t$ includes: when the t-th feature vector is the feature vector of the d-th reference face image, replacing the d-th weight vector of the fully-connected layer with the t-th normalized reference-image output vector $\hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all D reference face feature vectors, obtaining a fully-connected-layer normalized weight matrix $\hat{W}$ composed of the output vectors of the D normalized reference images.
In the neural network training method, replacing the weight vectors of the fully-connected layer with the normalized reference-image output vectors lets the optimization process of the neural network pay more attention to changes of the angular part of the feature vectors, thereby improving the performance of the neural network.
With reference to the first aspect, an embodiment of the present application provides a second possible implementation manner of the first aspect, where, in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer includes: multiplying the normalized sample-face-image output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$ to obtain the classification vector $f_t$, where $f_t = \hat{W}^T \hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all T feature vectors, the T classification vectors are obtained.
In the embodiment of the application, obtaining the t-th classification vector $f_t$ from the normalized output vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer includes multiplying the normalized sample-face-image output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$, i.e. the classification vector $f_t = \hat{W}^T \hat{x}_t$. The obtained classification vector $f_t$ expresses the classification characteristics of the sample face feature vector well.
With reference to the second possible implementation manner of the first aspect, an embodiment of the present application provides a third possible implementation manner of the first aspect, where obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors includes: setting, according to a specified rule, a cosine-measure scaling parameter s in the loss function and a cosine-measure margin parameter m in the loss function, where m is greater than or equal to 0 and less than or equal to 1; and obtaining the n-th loss value $L_n$ from the $N_n$ classification vectors $f^{n_j}$ in the loss function module according to the following formula, where j sequentially takes the integers from 1 to $N_n$, $N_n$ is the number of classification vectors required to calculate the n-th loss value, and, correspondingly, $n_j$ sequentially takes the integers from $\sum_{k=1}^{n-1} N_k + 1$ to $\sum_{k=1}^{n} N_k$:

$$L_n = -\frac{1}{N_n} \sum_{j=1}^{N_n} \log \frac{e^{s\left(f^{n_j}_{y_j} - m\right)}}{e^{s\left(f^{n_j}_{y_j} - m\right)} + \sum_{i=1,\, i \neq y_j}^{D} e^{s f^{n_j}_{i}}}$$

where the $n_j$-th classification vector $f^{n_j}$ corresponds to the same face as the $y_j$-th reference face image, $f^{n_j}_{y_j}$ is the $y_j$-th value in the classification vector $f^{n_j}$, and $y_j$ is an integer from 1 to D.
In the embodiment of the application, the classification vectors of a certain number of feature vectors are input into the loss function module, and loss values are obtained through its processing; the obtained loss values reflect the performance of the convolutional layer in the neural network well. In a single iteration, multiple loss values may be obtained. The scaling parameters of the cosine measure in the loss function help the network enlarge the feature distance between sample faces and reference faces, so the performance of the recognition model can be improved.
With reference to the third possible implementation manner of the first aspect, an embodiment of the present application provides a fourth possible implementation manner of the first aspect, where the specified optimization algorithm is a stochastic gradient descent method, and optimizing the parameters of the convolutional layer according to the n-th loss function value and the specified optimization algorithm includes: setting the parameters of the stochastic gradient descent method according to preset conditions; and obtaining the parameters of the convolutional layer according to the n-th loss value and the stochastic gradient descent method, and adjusting the parameters of the convolutional layer.
In the embodiment of the present application, an optimization algorithm for optimizing the neural network convolutional layer needs to be specified, parameters of the neural network convolutional layer are optimized according to the loss value and the specified optimization algorithm, and after multiple iterations, final parameters of the neural network convolutional layer can be obtained.
With reference to the first aspect, an embodiment of the present application provides a fifth possible implementation manner of the first aspect, where, before performing the initial iterative training, the method further includes: initializing the parameters of the convolutional layer and the weights of the fully-connected layer.

In the embodiment of the present application, before the initial iterative training is performed, the parameters of the convolutional layer and the weights of the fully-connected layer need to be initialized to ensure that the initial iteration can start normally.
With reference to the first aspect, an embodiment of the present application provides a sixth possible implementation manner of the first aspect, including: executing multiple iterations until a preset iteration termination condition is met, where the T feature vectors obtained in any two of the multiple iterations are the same.
In the embodiment of the application, one training process needs to be executed for multiple iterations, and the feature vectors used in each iteration are the same.
In a second aspect, a face recognition method provided in an embodiment of the present application includes: performing feature extraction on a reference image feature vector and a sample image feature vector, respectively, by the convolutional layer obtained by performing at least one iterative training with the method of the first aspect, to obtain a reference feature and a sample feature; calculating the similarity between the reference feature and the sample feature; and judging, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
In the embodiment of the application, feature extraction is performed on the reference image feature vector and the sample image feature vector, respectively, by the convolutional layer obtained in the first aspect, to obtain a reference feature and a sample feature; the similarity between the reference feature and the sample feature is calculated; and whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person is then judged according to that similarity. The face recognition system obtained in the embodiment of the application has good effect and accuracy when judging whether the faces corresponding to the reference image feature vector and the sample image feature vector are the same face.
In a third aspect, a face recognition system provided in an embodiment of the present application includes: a convolutional layer obtained by performing at least one iterative training with the method of the first aspect, configured to perform feature extraction on a reference image feature vector and a sample image feature vector, respectively, to obtain a reference feature and a sample feature; a cosine distance calculation module, configured to calculate the similarity between the reference feature and the sample feature; and a judging module, configured to judge, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
In a fourth aspect, the present application provides a storage medium having stored thereon instructions that, when executed on a computer, cause the computer to perform the method of any one of the possible implementations of the first or second aspect.
Additional features and advantages of the disclosure will be set forth in the description which follows, or in part may be learned by the practice of the above-described techniques of the disclosure, or may be learned by practice of the disclosure.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a functional block diagram of a neural network in an embodiment of the present application;
FIG. 2 is a flow chart of a neural network training method in an embodiment of the present application;
fig. 3 is a functional block diagram of a face recognition system in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without inventive step, are within the scope of the present application.
First embodiment
This embodiment provides a neural network training method and a face recognition method. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as one executing a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in a different order. The present embodiment is described in detail below.
Referring to fig. 1, in the neural network training method provided in this embodiment, the neural network includes a convolutional layer, a fully-connected layer and a loss function module connected in sequence.
The method may include multiple iterative trainings.
Referring to fig. 2, a single iteration of the training process includes: step S100, step S200, step S300, step S400, step S500, step S600, step S700, step S800, and step S900.
Step S100: acquiring T feature vectors, wherein the T feature vectors comprise feature vectors of D reference face images and feature vectors of M sample face images, the number D of feature vectors of reference face images is the same as the number of weight vectors of the fully-connected layer, each sample face image among the M sample face images and one reference face image among the D reference face images are face images of the same person, T is equal to the sum of D and M, and T, D, M are positive integers.
Referring to fig. 2, in a single iteration of the neural network training process, feature vectors of reference faces and a feature vector set of sample faces need to be obtained, the number of the feature vectors of the reference face images is the same as the number of the weight vectors of the full connection layer, and each sample face in the feature vector set of the sample faces corresponds to one face in the reference faces.
For example, in an application scenario of comparing a certificate photo and a living photo, a single iteration of the neural network training process needs to acquire the feature vectors of D certificate photos and the feature vectors of M living photos, forming a set of T feature vectors. The number D of certificate photos is the same as the number of weight vectors of the fully-connected layer, each of the M living photos corresponds to the same face as one of the D certificate photos, and the total number of certificate-photo and living-photo feature vectors is T. Specifically, D may be 50000 and M may be 1280000; that is, the feature vectors of 50000 certificate photos and 1280000 living photos are obtained, forming 1330000 feature vectors.
Step S200: setting the learning rate of the full connection layer to zero.
Referring to fig. 2, in a single iteration of the neural network training process, the learning rate of the fully-connected layer needs to be set to zero to ensure that the fully-connected layer is not affected by back propagation in the neural network training process.
For example, in the application scenario of the certificate photo and the living photo comparison, the learning rate of the fully-connected layer is set to zero in a single iteration of the neural network training process to ensure that the fully-connected layer is not affected by back propagation in the neural network training process.
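In a PyTorch-style implementation, setting the learning rate of the fully-connected layer to zero could be achieved by freezing its parameters, as in the following sketch; the nn.Linear module and its sizes are illustrative assumptions, not details taken from the application:

```python
import torch.nn as nn

# Hypothetical sizes: 512-dimensional features, D = 50000 reference identities.
fc_layer = nn.Linear(512, 50000, bias=False)
for param in fc_layer.parameters():
    param.requires_grad = False  # learning rate effectively zero: backprop leaves the layer untouched
```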
Step S300: the T pieces are processedInputting the t-th feature vector in the feature vectors into a neural network, and obtaining the t-th output feature vector x through the processing of the convolutional layertAnd T is an integer greater than or equal to 1 and less than or equal to T, and the T-th feature vector is a feature vector of a reference face image or a feature vector of a sample face image in the T feature vectors.
Referring to fig. 2, in a single iteration of the neural network training process, the T feature vectors are sequentially fed into the neural network and sequentially processed by the convolutional layer, where the t-th output feature vector is $x_t$.

For example, in the application scenario of comparing the certificate photo and the living photo, the 1330000 feature vectors are sequentially fed into the neural network, and 1330000 output feature vectors are sequentially obtained through the processing of the convolutional layer, where the t-th output feature vector is $x_t$.
Step S400: normalizing the t-th output feature vector in the fully-connected layer to obtain the t-th normalized output feature vector $\hat{x}_t$.
Referring to fig. 2, in a single iteration of the neural network training process, the output vectors produced by passing the feature vectors through the convolutional layer are normalized in the fully-connected layer to obtain normalized output feature vectors, where the t-th normalized output feature vector is $\hat{x}_t$.

For example, in the application scenario of comparing the certificate photo and the living photo, the output vectors produced by passing the feature vectors of the certificate photos and living photos through the convolutional layer are normalized in the fully-connected layer to obtain the normalized output feature vectors of the certificate photos and living photos, where the t-th normalized output feature vector is $\hat{x}_t$.
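A minimal sketch of this normalization step, assuming an L2 norm (the application does not name the norm, but L2 is consistent with the cosine measures used later):

```python
import torch
import torch.nn.functional as F

x_t = torch.randn(512)                  # stand-in for the t-th convolutional output
x_hat_t = F.normalize(x_t, p=2, dim=0)  # t-th normalized output feature vector
assert torch.isclose(x_hat_t.norm(), torch.tensor(1.0))
```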
Step S500: judging whether the t-th feature vector is the feature vector of a reference face image, and, when the t-th feature vector is the feature vector of a reference face image, adjusting the normalized weight matrix $\hat{W}$ of the fully-connected layer according to the t-th normalized output feature vector $\hat{x}_t$, wherein the normalized weight matrix $\hat{W}$ of the fully-connected layer consists of D column vectors.
Referring to fig. 2, in a single iteration of the neural network training process, the weight matrix of the fully-connected layer needs to be adjusted according to the feature vector of the reference face image.
For example, in an application scenario of comparing a certificate photo and a living photo, a weight matrix composed of 50000 weight vectors of a fully-connected layer needs to be adjusted according to the feature vectors of 50000 certificate photos.
Optionally, step S500 includes: when the t-th feature vector is the feature vector of the d-th reference face image, replacing the d-th weight vector of the fully-connected layer with the t-th normalized reference-image output vector $\hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all D reference face feature vectors, obtaining a fully-connected-layer normalized weight matrix $\hat{W}$ composed of the output vectors of the D normalized reference images.

In a single iteration of the neural network training process, the method for adjusting the weight matrix of the fully-connected layer according to the feature vector of a reference face image is: when the t-th feature vector is the feature vector of the d-th reference face image, replace the d-th weight vector of the fully-connected layer with the t-th normalized reference-image output vector $\hat{x}_t$. After a single iteration has been performed, the normalized weight matrix $\hat{W}$ of the fully-connected layer consists of the output vectors of the D normalized reference images. For example, in the application scenario of comparing a certificate photo and a living photo, the method for adjusting the weight matrix of the fully-connected layer according to the feature vectors of the certificate photos is: replace the d-th weight vector of the fully-connected layer with the normalized output vector of the d-th certificate photo; after a single iteration has been performed, the normalized weight matrix $\hat{W}$ of the fully-connected layer consists of the D normalized certificate-photo output vectors.
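A sketch of this replacement step, under the assumption that the weight vectors are stored as columns of a matrix fc_weight; the names fc_weight, x_hat_t and d are illustrative:

```python
import torch

# fc_weight: (feature_dim, D) weight matrix of the frozen fully-connected layer;
# x_hat_t: t-th normalized reference-image output vector; d: reference-identity index.
with torch.no_grad():                   # keep the replacement out of autograd
    fc_weight[:, d] = x_hat_t.detach()  # d-th weight vector := normalized output vector
```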
Step S600: in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer.
Referring to fig. 2, in a single iteration of the neural network training process, after normalization processing is performed on the output vector obtained through convolutional layer processing in the fully-connected layer, a classification vector needs to be obtained according to the normalized output vector and the normalized weight matrix.
For example, in an application scenario of comparing a certificate photo and a living photo, in a single iteration of a neural network training process, after normalization processing is performed on an output vector obtained through convolution layer processing in a full connection layer, a classification vector needs to be obtained according to the output vector after normalization of the certificate photo or the living photo and a weight matrix after normalization.
Optionally, step S600 includes: multiplying the normalized sample-face-image output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$ to obtain the classification vector $f_t$, where $f_t = \hat{W}^T \hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all T feature vectors, the T classification vectors are obtained. In a single iteration of the neural network training process, the t-th classification vector $f_t$ is obtained by multiplying the normalized output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$, in other words $f_t = \hat{W}^T \hat{x}_t$. For example, in the application scenario of comparing the certificate photo and the living photo, the t-th classification vector $f_t$ is obtained in a single iteration of the neural network training process by multiplying the normalized output vector $\hat{x}_t$ of the certificate photo or living photo by the normalized weight matrix $\hat{W}$.
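Continuing the illustrative sketch, the classification vector $f_t = \hat{W}^T \hat{x}_t$ could be computed as follows; because both factors are unit-length, each component of $f_t$ is a cosine similarity:

```python
import torch
import torch.nn.functional as F

W_hat = F.normalize(fc_weight, p=2, dim=0)  # normalize every weight vector (column)
f_t = W_hat.t() @ x_hat_t                   # t-th classification vector, shape (D,)
# f_t[d] is the cosine between the t-th normalized output vector and the d-th weight vector.
```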
Step S700: inputting the t-th classification vector into the loss function module, and judging whether the number of classification vectors in the loss function module has reached $N_n$, wherein $N_n$ is an integer greater than or equal to 1 and less than or equal to T, and the sum of the $N_n$ is equal to T.
Referring to fig. 2, in a single iteration of the neural network training process, the parameters of the convolutional layer need to be optimized according to a specified optimization algorithm and the loss function values output by the loss function, and each loss function value is calculated in the loss function module from multiple classification vectors; therefore, when the t-th classification vector is input into the loss function module, it is necessary to judge whether the number of classification vectors in the loss function module has reached the number specified for calculating a loss function value.
For example, in an application scenario of comparing a certificate photo and a life photo, when a classification vector of the certificate photo or the life photo is input into a loss function module, it is necessary to determine whether the number of the classification vectors in the loss function module reaches a specified number for calculating a loss function value.
Step S800: when the number of classification vectors in the loss function module reaches $N_n$, the loss function module obtains the n-th loss function value from the $N_n$ classification vectors, optimizes the parameters of the convolutional layer according to the n-th loss function value and a specified optimization algorithm, and empties the classification vectors of the loss function module.
Referring to fig. 2, in a single iteration of the neural network training process, multiple loss function values need to be obtained from the T classification vectors input to the loss function module, where the n-th loss function value is calculated in the loss function module from $N_n$ classification vectors. After the n-th loss function value is obtained, the parameters of the convolutional layer are optimized according to the specified optimization algorithm, and the classification vectors of the loss function module are emptied, so that the number of classification vectors in the loss function module can be counted afresh. After one iteration has been performed, the loss function module has output multiple loss function values and, accordingly, the parameters of the convolutional layer have been updated multiple times.

For example, in the application scenario of comparing the certificate photo and the living photo, the set of 1330000 feature vectors consisting of feature vectors of certificate photos and feature vectors of living photos is input into the neural network; the loss function module outputs multiple loss function values in sequence, and each time a loss value is output, the parameters of the convolutional layer are updated according to that loss value.
Optionally, step S800 includes: setting, according to a specified rule, the cosine-measure scaling parameter s in the loss function and the cosine-measure margin parameter m in the loss function, where m is greater than or equal to 0 and less than or equal to 1; and obtaining the n-th loss value $L_n$ from the $N_n$ classification vectors $f^{n_j}$ in the loss function module according to the following formula, where j sequentially takes the integers from 1 to $N_n$, $N_n$ is the number of classification vectors required to calculate the n-th loss value, and, correspondingly, $n_j$ sequentially takes the integers from $\sum_{k=1}^{n-1} N_k + 1$ to $\sum_{k=1}^{n} N_k$:

$$L_n = -\frac{1}{N_n} \sum_{j=1}^{N_n} \log \frac{e^{s\left(f^{n_j}_{y_j} - m\right)}}{e^{s\left(f^{n_j}_{y_j} - m\right)} + \sum_{i=1,\, i \neq y_j}^{D} e^{s f^{n_j}_{i}}}$$

where the $n_j$-th classification vector $f^{n_j}$ corresponds to the same face as the $y_j$-th reference face image, $f^{n_j}_{y_j}$ is the $y_j$-th value in the classification vector $f^{n_j}$, and $y_j$ is an integer from 1 to D. For example, in the application scenario of the certificate photo and the living photo, the cosine-measure scaling parameter s in the loss function is set to 45 and m is set to 0.35. The classification vector $f^{n_j}$ of a living photo or certificate photo is input into the loss function module, and j sequentially takes 1 to $N_n$ to obtain the n-th loss value $L_n$. The weight matrix $\hat{W}$ consists of 50000 weight vectors, where the $y_j$-th weight vector is the weight vector of the fully-connected layer corresponding to the $y_j$-th certificate photo among the 50000 certificate photos.
Optionally, the specified optimization algorithm is a stochastic gradient descent method, and optimizing the parameters of the convolutional layer according to the n-th loss function value and the specified optimization algorithm includes: setting the parameters of the stochastic gradient descent method according to preset conditions; and obtaining the parameters of the convolutional layer according to the n-th loss value and the stochastic gradient descent method, and adjusting the parameters of the convolutional layer. For example, in the application scenario of comparing a certificate photo and a living photo, the momentum parameter of the stochastic gradient descent method may be set to 0.9 and the learning rate of the convolutional layer to 0.1; the n-th parameters of the convolutional layer are obtained according to the n-th loss value and the stochastic gradient descent method, and the parameters of the neural network's convolutional layer are adjusted to the n-th parameters.
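A sketch of the corresponding optimizer setup, using the momentum 0.9 and learning rate 0.1 from the example; conv_net and loss_n are illustrative names:

```python
import torch

# Only the convolutional layers are handed to the optimizer;
# the frozen fully-connected layer is excluded.
optimizer = torch.optim.SGD(conv_net.parameters(), lr=0.1, momentum=0.9)

optimizer.zero_grad()
loss_n.backward()   # gradients of the n-th loss value
optimizer.step()    # n-th adjustment of the convolutional-layer parameters
```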
Step S900: with t sequentially taking 1 to T, executing steps S100, S200, S300, S400, S500, S600, S700 and S800 to obtain the target parameters of the convolutional layer.
Referring to fig. 2, in a single iteration of the neural network training process, T needs to be sequentially taken from 1 to T, so as to ensure that all samples obtained in the single iteration are input into the neural network for processing.
For example, in the application scenario of comparing the certificate photo and the life photo, T is taken from 1 to T, so as to ensure that the feature vectors of all the certificate photos and the feature vectors of the life photo obtained in a single iteration are input into the neural network.
Optionally, before the initial iterative training, the method further includes: initializing the parameters of the convolutional layer and the weights of the fully-connected layer. For example, in an application scenario of comparing a certificate photo and a living photo, before performing the initial iterative training, the parameters of the convolutional layer and the weights of the fully-connected layer may be set to random values.
Optionally, after step S900, the method further includes: executing multiple iterations until a preset iteration termination condition is met, where the T feature vectors obtained in any two of the multiple iterations are the same. For example, in the application scenario of comparing a certificate photo and a living photo, multiple iterations are required, and the feature vectors of the certificate photos and living photos used in each iteration are the same.
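A hypothetical outer loop over iterations might look as follows, reusing the single-iteration sketch given earlier; max_iterations stands in for the unspecified preset termination condition:

```python
# The same T feature vectors are reused in every iteration.
max_iterations = 100
for _ in range(max_iterations):
    train_one_iteration(conv_net, fc_weight, feature_vectors, is_reference,
                        ref_index, optimizer, cosine_margin_loss, batch_size)
```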
Second embodiment
The face recognition method provided by this embodiment includes the following steps: performing feature extraction on a reference image feature vector and a sample image feature vector, respectively, by the convolutional layer obtained by performing at least one iterative training with the method of the first embodiment, to obtain a reference feature and a sample feature; calculating the similarity between the reference feature and the sample feature; and judging, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person. For example, in the application scenario of comparing a certificate photo and a living photo, feature extraction is performed on the feature vector of the certificate photo and the feature vector of the living photo, respectively, by the convolutional layer obtained by at least one iterative training according to the method of the first embodiment, and the similarity between the certificate-photo feature and the living-photo feature is calculated; whether the face corresponding to the feature vector of the certificate photo and the face corresponding to the feature vector of the living photo are the face of the same person is then judged according to that similarity. Specifically, at a false alarm rate of one thousandth, when the convolutional layer is obtained by 120 hours of iterative training according to the method described in the first embodiment and feature extraction is performed on the feature vectors of the certificate photos and living photos through the obtained convolutional layer, the accuracy of judging whether the face corresponding to the feature vector of a certificate photo and the face corresponding to the feature vector of a living photo are the face of the same person can reach more than 60%, whereas in the prior art the accuracies of NormFace and CosFace are 9.2% and 32.2%, respectively. Correspondingly, at a false alarm rate of one ten-thousandth, when the convolutional layer is obtained by 120 hours of iterative training according to the method described in the first embodiment and feature extraction is performed on the feature vectors of the certificate photos and living photos through the obtained convolutional layer, the accuracy of judging whether the face corresponding to the feature vector of a certificate photo and the face corresponding to the feature vector of a living photo are the face of the same person can reach more than 40%, whereas in the prior art the accuracies of NormFace and CosFace are 2.6% and 19.3%, respectively.
In a traditional living-photo comparison scenario, feature extraction is performed on the feature vectors of different living photos, respectively, by the convolutional layer obtained by at least one iterative training according to the method of the first embodiment, the similarity between the features of the different living photos is calculated, and whether the faces corresponding to the feature vectors of the different living photos are the face of the same person is judged according to that similarity. Specifically, at a false alarm rate of one thousandth, when the convolutional layer is obtained by 120 hours of iterative training according to the method described in the first embodiment and feature extraction is performed on the feature vectors of the different living photos through the obtained convolutional layer, the accuracy of judging whether the faces corresponding to the feature vectors of the different living photos are the face of the same person is 94.8%, while the accuracies of NormFace and CosFace in the prior art are 85.0% and 91.9%, respectively; at a false alarm rate of one ten-thousandth, the corresponding accuracy is 92.2%, while the accuracies of NormFace and CosFace in the prior art are 73.2% and 87.9%, respectively.
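The recognition step itself reduces to a cosine-similarity comparison against a threshold; the following sketch assumes the trained convolutional layers are available as conv_net, and the 0.5 decision threshold is illustrative, not a value from the application:

```python
import torch
import torch.nn.functional as F

def same_person(conv_net, ref_vec, sample_vec, threshold=0.5):
    """Judge whether a reference image and a sample image show the same face."""
    with torch.no_grad():
        ref_feat = conv_net(ref_vec.unsqueeze(0)).squeeze(0)
        sample_feat = conv_net(sample_vec.unsqueeze(0)).squeeze(0)
    similarity = F.cosine_similarity(ref_feat, sample_feat, dim=0)
    return similarity.item() >= threshold
```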
Third embodiment
An embodiment of the present application provides a face recognition system, please refer to fig. 3, including:
a convolutional layer obtained by performing at least one iterative training by the method according to the first embodiment, where the convolutional layer is used to perform feature extraction on a reference image feature vector and a sample image feature vector respectively to obtain a reference feature and a sample feature;
a cosine distance calculation module for calculating the similarity of the reference feature and the sample feature;
and a judging module, configured to judge, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person. The judging module may determine, using a preset threshold, whether the faces corresponding to the input reference image feature vector and sample image feature vector are the same face.
In this embodiment, the reference feature and the sample feature, obtained by performing feature extraction on the reference image feature vector and the sample image feature vector, respectively, with the convolutional layer obtained by at least one iterative training according to the method of the first embodiment, express the features of the reference image and the sample image well, ensuring that the outputs of the cosine distance calculation module and the judging module have high accuracy.
Fourth embodiment
An embodiment of the present application provides a storage medium having instructions stored thereon; when the instructions are run on a computer, they cause the computer to execute the neural network training method of the first embodiment and the face recognition method of the second embodiment.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by hardware, or by software plus a necessary general hardware platform. Based on this understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive or removable hard disk) and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods of the various implementation scenarios of the present invention.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A neural network training method is characterized in that the neural network comprises a convolutional layer, a fully-connected layer and a loss function module which are connected in sequence, and in the process of single iteration training, the method comprises the following steps:
acquiring T feature vectors, wherein the T feature vectors comprise feature vectors of D reference face images and feature vectors of M sample face images, the number D of the feature vectors of the reference face images is the same as that of the weight vectors of the full-connection layer, each sample face image in the M sample face images and one reference face image in the D reference face images are face images of the same person, T is equal to the sum of D and M, and T, D, M are positive integers;
setting the learning rate of the full connection layer to be zero;
inputting the t-th feature vector among the T feature vectors into a neural network, and obtaining the t-th output feature vector $x_t$, wherein t is an integer greater than or equal to 1 and less than or equal to T, and the t-th feature vector is a feature vector of a reference face image or a feature vector of a sample face image among the T feature vectors;
normalizing the t-th output feature vector in the fully-connected layer to obtain the t-th normalized output feature vector $\hat{x}_t$;
judging whether the t-th feature vector is the feature vector of a reference face image, and, when the t-th feature vector is the feature vector of the d-th reference face image among the D reference face images, replacing the d-th weight vector of the fully-connected layer with the t-th normalized output feature vector $\hat{x}_t$, wherein d is a positive integer greater than 1 and less than or equal to D, and, in the single iteration, after the feature vectors of the D reference face images have been replaced, obtaining a fully-connected-layer normalized weight matrix $\hat{W}$ composed of the output vectors of the D normalized reference face images;
in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output feature vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer;
inputting the t-th classification vector into a loss function module, and judging whether the number of classification vectors in the loss function module has reached $N_n$, wherein $N_n$ denotes the number of samples in the n-th batch after the total of T samples is divided into batches, $N_n$ is an integer greater than or equal to 1 and less than or equal to T, and the sum $N_1 + N_2 + \cdots + N_n$ is equal to T;
when the number of classification vectors in the loss function module reaches $N_n$, obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors, optimizing the parameters of the convolutional layer according to the n-th loss function value and a specified optimization algorithm, and emptying the classification vectors of the loss function module;
and executing the above steps with t sequentially taking 1 to T, to obtain the target parameters of the convolutional layer.
2. The method of claim 1, wherein, in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output feature vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer comprises:

multiplying the normalized output feature vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$ to obtain the classification vector $f_t$, wherein $f_t = \hat{W}^T \hat{x}_t$;

correspondingly, in the single iteration, after the above steps have been performed for all T feature vectors, obtaining T classification vectors.
3. The method of claim 2, wherein obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors comprises:

setting a cosine-measure scaling parameter s in the loss function and another cosine-measure parameter m in the loss function according to a specified rule, wherein m is greater than or equal to 0 and less than or equal to 1;

obtaining the n-th loss value $L_n$ from the $N_n$ classification vectors $f^{n_j}$ in the loss function module according to the following formula, wherein j sequentially takes the integers from 1 to $N_n$, $N_n$ is the number of classification vectors required to calculate the n-th loss value, and, correspondingly, $n_j$ sequentially takes the integers from $\sum_{k=1}^{n-1} N_k + 1$ to $\sum_{k=1}^{n} N_k$:

$$L_n = -\frac{1}{N_n} \sum_{j=1}^{N_n} \log \frac{e^{s\left(f^{n_j}_{y_j} - m\right)}}{e^{s\left(f^{n_j}_{y_j} - m\right)} + \sum_{i=1,\, i \neq y_j}^{D} e^{s f^{n_j}_{i}}}$$

wherein the classification vector $f^{n_j}$ corresponds to the same face as the $y_j$-th reference face image, $f^{n_j}_{y_j}$ is the $y_j$-th value in the classification vector $f^{n_j}$, and $y_j$ is an integer from 1 to D.
4. The method of claim 3, wherein the specified optimization algorithm is a stochastic gradient descent method, and wherein optimizing the parameters of the convolutional layer according to the specified optimization algorithm based on the nth loss function value comprises:
setting parameters of the stochastic gradient descent method according to preset conditions;

and obtaining parameters of the convolutional layer according to the n-th loss value and the stochastic gradient descent method, and adjusting the parameters of the convolutional layer.
5. The method of claim 1, wherein prior to performing the first iterative training, the method further comprises:
initializing parameters of the convolutional layer and weights of the fully-connected layer.
6. The method of claim 1, further comprising:
and executing multiple iterations until a preset iteration termination condition is met, wherein the T feature vectors obtained in any two of the multiple iterations are the same.
7. A method of face recognition, comprising:
respectively performing feature extraction on a reference image feature vector and a sample image feature vector by the convolutional layer obtained by performing at least one iterative training through the method according to any one of claims 1 to 6 to obtain a reference feature and a sample feature;
calculating the similarity of the reference feature and the sample feature;
and judging, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
8. A face recognition system, comprising:
a convolutional layer obtained by performing at least one iterative training through the method according to any one of claims 1 to 6, wherein the convolutional layer is used for performing feature extraction on the reference image feature vector and the sample image feature vector respectively to obtain a reference feature and a sample feature;
a cosine distance calculation module for calculating the similarity of the reference feature and the sample feature;
and a judging module, configured to judge, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
9. A storage medium having stored thereon instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 6.
CN201811630038.1A 2018-12-28 2018-12-28 Neural network training method, face recognition system and storage medium Active CN109711358B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811630038.1A CN109711358B (en) 2018-12-28 2018-12-28 Neural network training method, face recognition system and storage medium

Publications (2)

Publication Number Publication Date
CN109711358A CN109711358A (en) 2019-05-03
CN109711358B true CN109711358B (en) 2020-09-04

Family

ID=66259348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811630038.1A Active CN109711358B (en) 2018-12-28 2018-12-28 Neural network training method, face recognition system and storage medium

Country Status (1)

Country Link
CN (1) CN109711358B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378092B (en) * 2019-07-26 2020-12-04 北京积加科技有限公司 Identity recognition system, client, server and method
CN110610140B (en) * 2019-08-23 2024-01-19 平安科技(深圳)有限公司 Training method, device and equipment of face recognition model and readable storage medium
CN112949672B (en) * 2019-12-11 2024-10-15 顺丰科技有限公司 Commodity identification method, commodity identification device, commodity identification equipment and computer readable storage medium
CN111242162B (en) * 2019-12-27 2023-06-20 北京地平线机器人技术研发有限公司 Training method and device of image classification model, medium and electronic equipment
CN113269010B (en) * 2020-02-14 2024-03-26 深圳云天励飞技术有限公司 Training method and related device for human face living body detection model
CN113554145B (en) * 2020-04-26 2024-03-29 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for determining output of neural network
CN112509154B (en) * 2020-11-26 2024-03-22 北京达佳互联信息技术有限公司 Training method of image generation model, image generation method and device
CN112862096A (en) * 2021-02-04 2021-05-28 百果园技术(新加坡)有限公司 Model training and data processing method, device, equipment and medium
CN112906676A (en) * 2021-05-06 2021-06-04 北京远鉴信息技术有限公司 Face image source identification method and device, storage medium and electronic equipment
WO2023283805A1 (en) * 2021-07-13 2023-01-19 深圳大学 Face image clustering method, apparatus and device, and computer-readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850825A (en) * 2015-04-18 2015-08-19 中国计量学院 Facial image face score calculating method based on convolutional neural network
CN105138993A (en) * 2015-08-31 2015-12-09 小米科技有限责任公司 Method and device for building face recognition model
CN106951825A (en) * 2017-02-13 2017-07-14 北京飞搜科技有限公司 A kind of quality of human face image assessment system and implementation method
CN107871106A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Face detection method and device
CN107992842A (en) * 2017-12-13 2018-05-04 深圳云天励飞技术有限公司 Biopsy method, computer installation and computer-readable recording medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018022821A1 (en) * 2016-07-29 2018-02-01 Arizona Board Of Regents On Behalf Of Arizona State University Memory compression in a deep neural network
CN108009625B (en) * 2016-11-01 2020-11-06 赛灵思公司 Fine adjustment method and device after artificial neural network fixed point
CN106780906B (en) * 2016-12-28 2019-06-21 北京品恩科技股份有限公司 A kind of testimony of a witness unification recognition methods and system based on depth convolutional neural networks
CN106991474B (en) * 2017-03-28 2019-09-24 华中科技大学 The parallel full articulamentum method for interchanging data of deep neural network model and system
CN107609634A (en) * 2017-08-21 2018-01-19 哈尔滨工程大学 A kind of convolutional neural networks training method based on the very fast study of enhancing
CN108229298A (en) * 2017-09-30 2018-06-29 北京市商汤科技开发有限公司 The training of neural network and face identification method and device, equipment, storage medium
CN107886064B (en) * 2017-11-06 2021-10-22 安徽大学 Face recognition scene adaptation method based on convolutional neural network
CN107944399A (en) * 2017-11-28 2018-04-20 广州大学 A kind of pedestrian's recognition methods again based on convolutional neural networks target's center model
CN108009528B (en) * 2017-12-26 2020-04-07 广州广电运通金融电子股份有限公司 Triple Loss-based face authentication method and device, computer equipment and storage medium
CN108090565A (en) * 2018-01-16 2018-05-29 电子科技大学 Accelerated method is trained in a kind of convolutional neural networks parallelization
CN108416343B (en) * 2018-06-14 2020-12-22 北京远鉴信息技术有限公司 Face image recognition method and device

Similar Documents

Publication Publication Date Title
CN109711358B (en) Neural network training method, face recognition system and storage medium
CN108182394B (en) Convolutional neural network training method, face recognition method and face recognition device
CN110210560B (en) Incremental training method, classification method and device, equipment and medium of classification network
CN107529650B (en) Closed loop detection method and device and computer equipment
WO2019237846A1 (en) Image processing method and apparatus, face recognition method and apparatus, and computer device
CN109584884B (en) Voice identity feature extractor, classifier training method and related equipment
CN112800876B (en) Super-spherical feature embedding method and system for re-identification
CN109460793A (en) A kind of method of node-classification, the method and device of model training
CN111291817B (en) Image recognition method, image recognition device, electronic equipment and computer readable medium
CN107679572B (en) Image distinguishing method, storage device and mobile terminal
KR20160072768A (en) Method and apparatus for recognizing and verifying image, and method and apparatus for learning image recognizing and verifying
CN110929836B (en) Neural network training and image processing method and device, electronic equipment and medium
TWI803243B (en) Method for expanding images, computer device and storage medium
CN111914908A (en) Image recognition model training method, image recognition method and related equipment
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
CN113011532A (en) Classification model training method and device, computing equipment and storage medium
CN111382791A (en) Deep learning task processing method, image recognition task processing method and device
US9928408B2 (en) Signal processing
CN112446428B (en) Image data processing method and device
CN114547365A (en) Image retrieval method and device
CN114359993A (en) Model training method, face recognition device, face recognition equipment, face recognition medium and product
CN112257689A (en) Training and recognition method of face recognition model, storage medium and related equipment
CN117237757A (en) Face recognition model training method and device, electronic equipment and medium
CN111507218A (en) Matching method and device of voice and face image, storage medium and electronic equipment
CN116503670A (en) Image classification and model training method, device and equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 80001-2, floor 7, building 1, No. 158, West Fourth Ring North Road, Haidian District, Beijing 100097

Applicant after: Beijing Yuanjian Information Technology Co.,Ltd.

Address before: 615000 3 people's West Road, new town, Zhaojue County, Liangshan Yi Autonomous Prefecture, Sichuan 1-1

Applicant before: SICHUAN YUANJIAN TECHNOLOGY Co.,Ltd.

GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190503

Assignee: RUN TECHNOLOGIES Co.,Ltd. BEIJING

Assignor: Beijing Yuanjian Information Technology Co.,Ltd.

Contract record no.: X2022990000639

Denomination of invention: Neural network training method, face recognition method and system and storage medium

Granted publication date: 20200904

License type: Common License

Record date: 20220913
