CN109711358B - Neural network training method, face recognition system and storage medium - Google Patents


Info

Publication number: CN109711358B
Application number: CN201811630038.1A
Authority: CN (China)
Legal status: Active (granted)
Other versions: CN109711358A
Inventors: 孔彦, 吴富章, 赵宇航, 赵玉军, 王黎明
Original and current assignee: Beijing Yuanjian Information Technology Co Ltd
Application filed by Beijing Yuanjian Information Technology Co Ltd; priority to CN201811630038.1A; published as CN109711358A, granted as CN109711358B

Landscapes

  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The application provides a neural network training method, a face recognition system and a storage medium, and relates to the field of face recognition. The neural network training method of the application includes: adjusting the weight vectors of the fully-connected layer according to the output vectors obtained by passing the reference feature vectors through the convolutional layer; and optimizing the parameters of the convolutional layer according to the loss values obtained by passing the feature vectors through the loss function layer and a specified optimization algorithm, to obtain the final convolutional-layer parameters. Correspondingly, based on the neural network training method provided by the application, the application also provides a face recognition method. Compared with the prior art, the face recognition method provided by the application greatly improves face recognition effect and accuracy when the numbers of reference images and sample images in the training process are severely unbalanced or the shooting scenes of the reference images and the sample images differ significantly; meanwhile, in some general face recognition application scenarios, the face recognition method of the embodiments of the application also achieves a good face recognition effect.

Description

Neural network training method, face recognition system and storage medium
Technical Field
The present application relates to the field of face recognition technology, and in particular, to a neural network training method, a face recognition system, and a storage medium.
Background
Face recognition is a technology that identifies different people based on facial appearance features; its application scenarios are broad, and related research and applications have been under way for decades. With the development of related technologies such as big data and deep learning in recent years, face recognition performance has improved dramatically, and the technology is applied in scenarios such as identity authentication, video surveillance, and beauty and entertainment. The problem of person-ID comparison, i.e., face recognition between a standard certificate photo and a living photo, has attracted increasing attention, because recognizing a target person only requires deploying the certificate photo in a database, sparing the target person the trouble of registering a living photo in the system.
Face recognition models in the prior art achieve good face recognition effect and accuracy in some general face recognition scenarios. However, in some application scenarios, for example person-ID comparison, when the numbers of reference images and sample images are severely unbalanced, or the shooting scenes of the reference images and the sample images differ significantly, the face recognition effect and accuracy of prior-art methods are very low.
Disclosure of Invention
The neural network training method, face recognition system and storage medium provided by the embodiments of the invention can solve the prior-art problem of low face recognition accuracy when the numbers of reference images and sample images are severely unbalanced or the shooting scenes of the reference images and the sample images differ significantly.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present application provides a neural network training method, where the neural network includes a convolutional layer, a fully-connected layer, and a loss function module, which are connected in sequence, and in a single iterative training process, the method includes:
acquiring T feature vectors, where the T feature vectors comprise feature vectors of D reference face images and feature vectors of M sample face images, the number D of feature vectors of reference face images is the same as the number of weight vectors of the fully-connected layer, each sample face image among the M sample face images and one reference face image among the D reference face images are face images of the same person, T is equal to the sum of D and M, and T, D, M are positive integers; setting the learning rate of the fully-connected layer to zero; inputting the t-th feature vector among the T feature vectors into the neural network, and obtaining the t-th output feature vector through the processing of the convolutional layer, where t is an integer greater than or equal to 1 and less than or equal to T, and the t-th feature vector is a feature vector of a reference face image or a feature vector of a sample face image among the T feature vectors; normalizing the t-th output feature vector in the fully-connected layer to obtain the t-th normalized output feature vector; judging whether the t-th feature vector is the feature vector of a reference face image, and, when it is, adjusting the normalized weight matrix of the fully-connected layer according to the t-th normalized output feature vector, where the normalized weight matrix of the fully-connected layer consists of D column vectors; in the fully-connected layer, obtaining the t-th classification vector from the normalized output vector and the normalized weight matrix of the fully-connected layer; inputting the t-th classification vector into the loss function module, and judging whether the number of classification vectors in the loss function module has reached $N_n$, where $N_n$ is an integer greater than or equal to 1 and less than or equal to T, and the sum of the $N_n$ is equal to T; when the number of classification vectors in the loss function module reaches $N_n$, obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors, optimizing the parameters of the convolutional layer according to the n-th loss function value and a specified optimization algorithm, and emptying the classification vectors of the loss function module; and executing the above steps with t sequentially taking 1 to T, to obtain the target parameters of the convolutional layer.
In the embodiment of the present application, in a single iteration, the weight matrix of the fully-connected layer is adjusted according to the output vector obtained by passing a reference face image through the convolutional layer; therefore, in the training process, the learning rate of the fully-connected layer of the neural network needs to be set to zero, to ensure that the fully-connected layer is not affected by the back propagation of the neural network when the reference feature vectors and sample feature vectors are used to train it. In a single iteration, the feature vectors are sequentially input into the neural network; when the feature vector input into the neural network is the feature vector of a reference face image, it is processed by the convolutional layer to obtain a reference output feature vector, and the weight matrix of the fully-connected layer is adjusted according to this reference output feature vector. When the number of reference images is far smaller than the number of sample images, or the shooting scenes of the reference images and the sample images differ significantly, adjusting the weight vectors of the fully-connected layer according to the reference output feature vectors helps improve the performance of the neural network in reference-image/sample-image comparison scenarios, and can accelerate the convergence of the neural network training process.
Further, in a single iteration, the feature vectors of the reference face images and the sample face images are sequentially processed by the convolutional layer, the fully-connected layer and the loss function module to obtain multiple loss values, and the parameters of the convolutional layer are adjusted to their final values according to each loss value and a specified optimization algorithm. In face recognition applications, these final parameters replace the convolutional-layer parameters of a face recognition system, which can then recognize whether the faces corresponding to a reference image and a sample image are the same face. When the number of reference face images is far smaller than the number of sample face images in the training process, because the weight matrix of the fully-connected layer is adjusted according to the output feature vectors of the reference images while the convolutional-layer parameters are trained, the output vectors of the sample face images and of the reference face images produced by the convolutional layer represent the sample face features and reference face features more accurately; therefore, based on these convolutional-layer parameters, comparing reference image feature vectors and sample image feature vectors through the face recognition system has good accuracy.
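For illustration only, the single-iteration control flow described above can be summarized in a minimal PyTorch-style sketch. The names conv_net, fc_weight, ref_index, loss_fn and batch_size are hypothetical, the fixed batch size stands in for the $N_n$ of the description, and the weight vectors are assumed to be stored as columns of fc_weight:

```python
import torch
import torch.nn.functional as F

def train_one_iteration(conv_net, fc_weight, feature_vectors, is_reference,
                        ref_index, optimizer, loss_fn, batch_size):
    """One pass over the T feature vectors (D reference + M sample).
    fc_weight: (feature_dim, D) fully-connected weight matrix with
    requires_grad=False, i.e. its learning rate is zero."""
    buffered, labels = [], []
    for t, vec in enumerate(feature_vectors):
        x_t = conv_net(vec.unsqueeze(0)).squeeze(0)       # convolutional layers
        x_hat = F.normalize(x_t, dim=0)                   # t-th normalized output vector
        if is_reference[t]:
            with torch.no_grad():                         # keep the update out of autograd
                fc_weight[:, ref_index[t]] = x_hat.detach()  # replace d-th weight vector
        W_hat = F.normalize(fc_weight, dim=0)             # normalized weight matrix
        f_t = W_hat.t() @ x_hat                           # t-th classification vector
        buffered.append(f_t)
        labels.append(ref_index[t])
        if len(buffered) == batch_size:                   # N_n classification vectors gathered
            loss = loss_fn(torch.stack(buffered), torch.tensor(labels))
            optimizer.zero_grad()
            loss.backward()                               # n-th loss value drives the update
            optimizer.step()                              # adjust convolutional-layer parameters
            buffered, labels = [], []                     # empty the loss module
```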
With reference to the first aspect, an embodiment of the present application provides a first possible implementation manner of the first aspect, where, when the t-th feature vector is the feature vector of a reference face image, adjusting the normalized weight matrix $\hat{W}$ of the fully-connected layer according to the t-th normalized output feature vector $\hat{x}_t$ includes: when the t-th feature vector is the feature vector of the d-th reference face image, replacing the d-th weight vector of the fully-connected layer with the t-th normalized reference-image output vector $\hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all D reference face feature vectors, obtaining a fully-connected-layer normalized weight matrix $\hat{W}$ composed of the output vectors of the D normalized reference images.
In the neural network training method, replacing the weight vectors of the fully-connected layer with the normalized reference-image output vectors lets the optimization process of the neural network pay more attention to changes of the angular part of the feature vectors, thereby improving the performance of the neural network.
With reference to the first aspect, an embodiment of the present application provides a second possible implementation manner of the first aspect, where, in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer includes: multiplying the normalized sample-face-image output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$ to obtain the classification vector $f_t$, where $f_t = \hat{W}^T \hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all T feature vectors, the T classification vectors are obtained.
In the embodiment of the application, obtaining the t-th classification vector $f_t$ from the normalized output vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer includes multiplying the normalized sample-face-image output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$, i.e. the classification vector $f_t = \hat{W}^T \hat{x}_t$. The obtained classification vector $f_t$ expresses the classification characteristics of the sample face feature vector well.
With reference to the second possible implementation manner of the first aspect, an embodiment of the present application provides a third possible implementation manner of the first aspect, where obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors includes: setting, according to a specified rule, a cosine-measure scaling parameter s in the loss function and a cosine-measure margin parameter m in the loss function, where m is greater than or equal to 0 and less than or equal to 1; and obtaining the n-th loss value $L_n$ from the $N_n$ classification vectors $f^{n_j}$ in the loss function module according to the following formula, where j sequentially takes the integers from 1 to $N_n$, $N_n$ is the number of classification vectors required to calculate the n-th loss value, and, correspondingly, $n_j$ sequentially takes the integers from $\sum_{k=1}^{n-1} N_k + 1$ to $\sum_{k=1}^{n} N_k$:

$$L_n = -\frac{1}{N_n} \sum_{j=1}^{N_n} \log \frac{e^{s\left(f^{n_j}_{y_j} - m\right)}}{e^{s\left(f^{n_j}_{y_j} - m\right)} + \sum_{i=1,\, i \neq y_j}^{D} e^{s f^{n_j}_{i}}}$$

where the $n_j$-th classification vector $f^{n_j}$ corresponds to the same face as the $y_j$-th reference face image, $f^{n_j}_{y_j}$ is the $y_j$-th value in the classification vector $f^{n_j}$, and $y_j$ is an integer from 1 to D.
In the embodiment of the application, the classification vectors of a certain number of feature vectors are input into the loss function module, and loss values are obtained through its processing; the obtained loss values reflect the performance of the convolutional layer in the neural network well. In a single iteration, multiple loss values may be obtained. The scaling parameters of the cosine measure in the loss function help the network enlarge the feature distance between sample faces and reference faces, so the performance of the recognition model can be improved.
With reference to the third possible implementation manner of the first aspect, an embodiment of the present application provides a fourth possible implementation manner of the first aspect, where the specified optimization algorithm is a stochastic gradient descent method, and optimizing the parameters of the convolutional layer according to the n-th loss function value and the specified optimization algorithm includes: setting the parameters of the stochastic gradient descent method according to preset conditions; and obtaining the parameters of the convolutional layer according to the n-th loss value and the stochastic gradient descent method, and adjusting the parameters of the convolutional layer.
In the embodiment of the present application, an optimization algorithm for optimizing the neural network convolutional layer needs to be specified, parameters of the neural network convolutional layer are optimized according to the loss value and the specified optimization algorithm, and after multiple iterations, final parameters of the neural network convolutional layer can be obtained.
With reference to the first aspect, an embodiment of the present application provides a fifth possible implementation manner of the first aspect, where, before performing the initial iterative training, the method further includes: initializing the parameters of the convolutional layer and the weights of the fully-connected layer.

In the embodiment of the present application, before the initial iterative training is performed, the parameters of the convolutional layer and the weights of the fully-connected layer need to be initialized to ensure that the initial iteration can start normally.
With reference to the first aspect, an embodiment of the present application provides a sixth possible implementation manner of the first aspect, including: executing multiple iterations until a preset iteration termination condition is met, where the T feature vectors obtained in any two of the multiple iterations are the same.
In the embodiment of the application, one training process needs to be executed for multiple iterations, and the feature vectors used in each iteration are the same.
In a second aspect, a face recognition method provided in an embodiment of the present application includes: performing feature extraction on a reference image feature vector and a sample image feature vector, respectively, by the convolutional layer obtained by performing at least one iterative training with the method of the first aspect, to obtain a reference feature and a sample feature; calculating the similarity between the reference feature and the sample feature; and judging, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
In the embodiment of the application, feature extraction is performed on the reference image feature vector and the sample image feature vector, respectively, by the convolutional layer obtained in the first aspect, to obtain a reference feature and a sample feature; the similarity between the reference feature and the sample feature is calculated; and whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person is then judged according to that similarity. The face recognition system obtained in the embodiment of the application has good effect and accuracy when judging whether the faces corresponding to the reference image feature vector and the sample image feature vector are the same face.
In a third aspect, a face recognition system provided in an embodiment of the present application includes: a convolutional layer obtained by performing at least one iterative training with the method of the first aspect, configured to perform feature extraction on a reference image feature vector and a sample image feature vector, respectively, to obtain a reference feature and a sample feature; a cosine distance calculation module, configured to calculate the similarity between the reference feature and the sample feature; and a judging module, configured to judge, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
In a fourth aspect, the present application provides a storage medium having stored thereon instructions that, when executed on a computer, cause the computer to perform the method of any one of the possible implementations of the first or second aspect.
Additional features and advantages of the disclosure will be set forth in the description which follows, or in part may be learned by the practice of the above-described techniques of the disclosure, or may be learned by practice of the disclosure.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a functional block diagram of a neural network in an embodiment of the present application;
FIG. 2 is a flow chart of a neural network training method in an embodiment of the present application;
fig. 3 is a functional block diagram of a face recognition system in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without inventive step, are within the scope of the present application.
First embodiment
This embodiment provides a neural network training method and a face recognition method. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as one executing a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in a different order. The present embodiment is described in detail below.
Referring to fig. 1, in the neural network training method provided in this embodiment, the neural network includes a convolutional layer, a fully-connected layer and a loss function module connected in sequence.
The method may include multiple iterative trainings.
Referring to fig. 2, a single iteration of the training process includes: step S100, step S200, step S300, step S400, step S500, step S600, step S700, step S800, and step S900.
Step S100: acquiring T feature vectors, wherein the T feature vectors comprise feature vectors of D reference face images and feature vectors of M sample face images, the number D of feature vectors of reference face images is the same as the number of weight vectors of the fully-connected layer, each sample face image among the M sample face images and one reference face image among the D reference face images are face images of the same person, T is equal to the sum of D and M, and T, D, M are positive integers.
Referring to fig. 2, in a single iteration of the neural network training process, feature vectors of reference faces and a feature vector set of sample faces need to be obtained, the number of the feature vectors of the reference face images is the same as the number of the weight vectors of the full connection layer, and each sample face in the feature vector set of the sample faces corresponds to one face in the reference faces.
For example, in an application scenario of comparing a certificate photo and a living photo, a single iteration of the neural network training process needs to acquire the feature vectors of D certificate photos and the feature vectors of M living photos, forming a set of T feature vectors. The number D of certificate photos is the same as the number of weight vectors of the fully-connected layer, each of the M living photos corresponds to the same face as one of the D certificate photos, and the total number of certificate-photo and living-photo feature vectors is T. Specifically, D may be 50000 and M may be 1280000; that is, the feature vectors of 50000 certificate photos and 1280000 living photos are obtained, forming 1330000 feature vectors.
Step S200: setting the learning rate of the full connection layer to zero.
Referring to fig. 2, in a single iteration of the neural network training process, the learning rate of the fully-connected layer needs to be set to zero to ensure that the fully-connected layer is not affected by back propagation in the neural network training process.
For example, in the application scenario of the certificate photo and the living photo comparison, the learning rate of the fully-connected layer is set to zero in a single iteration of the neural network training process to ensure that the fully-connected layer is not affected by back propagation in the neural network training process.
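In a PyTorch-style implementation, setting the learning rate of the fully-connected layer to zero could be achieved by freezing its parameters, as in the following sketch; the nn.Linear module and its sizes are illustrative assumptions, not details taken from the application:

```python
import torch.nn as nn

# Hypothetical sizes: 512-dimensional features, D = 50000 reference identities.
fc_layer = nn.Linear(512, 50000, bias=False)
for param in fc_layer.parameters():
    param.requires_grad = False  # learning rate effectively zero: backprop leaves the layer untouched
```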
Step S300: the T pieces are processedInputting the t-th feature vector in the feature vectors into a neural network, and obtaining the t-th output feature vector x through the processing of the convolutional layertAnd T is an integer greater than or equal to 1 and less than or equal to T, and the T-th feature vector is a feature vector of a reference face image or a feature vector of a sample face image in the T feature vectors.
Referring to fig. 2, in a single iteration of the neural network training process, the T feature vectors are sequentially fed into the neural network and sequentially processed by the convolutional layer, where the t-th output feature vector is $x_t$.

For example, in the application scenario of comparing the certificate photo and the living photo, the 1330000 feature vectors are sequentially fed into the neural network, and 1330000 output feature vectors are sequentially obtained through the processing of the convolutional layer, where the t-th output feature vector is $x_t$.
Step S400: normalizing the t-th output feature vector in the fully-connected layer to obtain the t-th normalized output feature vector $\hat{x}_t$.
Referring to fig. 2, in a single iteration of the neural network training process, the output vectors produced by passing the feature vectors through the convolutional layer are normalized in the fully-connected layer to obtain normalized output feature vectors, where the t-th normalized output feature vector is $\hat{x}_t$.

For example, in the application scenario of comparing the certificate photo and the living photo, the output vectors produced by passing the feature vectors of the certificate photos and living photos through the convolutional layer are normalized in the fully-connected layer to obtain the normalized output feature vectors of the certificate photos and living photos, where the t-th normalized output feature vector is $\hat{x}_t$.
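A minimal sketch of this normalization step, assuming an L2 norm (the application does not name the norm, but L2 is consistent with the cosine measures used later):

```python
import torch
import torch.nn.functional as F

x_t = torch.randn(512)                  # stand-in for the t-th convolutional output
x_hat_t = F.normalize(x_t, p=2, dim=0)  # t-th normalized output feature vector
assert torch.isclose(x_hat_t.norm(), torch.tensor(1.0))
```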
Step S500: judging whether the t-th feature vector is the feature vector of a reference face image, and, when the t-th feature vector is the feature vector of a reference face image, adjusting the normalized weight matrix $\hat{W}$ of the fully-connected layer according to the t-th normalized output feature vector $\hat{x}_t$, wherein the normalized weight matrix $\hat{W}$ of the fully-connected layer consists of D column vectors.
Referring to fig. 2, in a single iteration of the neural network training process, the weight matrix of the fully-connected layer needs to be adjusted according to the feature vector of the reference face image.
For example, in an application scenario of comparing a certificate photo and a living photo, a weight matrix composed of 50000 weight vectors of a fully-connected layer needs to be adjusted according to the feature vectors of 50000 certificate photos.
Optionally, step S500 includes: when the t-th feature vector is the feature vector of the d-th reference face image, replacing the d-th weight vector of the fully-connected layer with the t-th normalized reference-image output vector $\hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all D reference face feature vectors, obtaining a fully-connected-layer normalized weight matrix $\hat{W}$ composed of the output vectors of the D normalized reference images.

In a single iteration of the neural network training process, the method for adjusting the weight matrix of the fully-connected layer according to the feature vector of a reference face image is: when the t-th feature vector is the feature vector of the d-th reference face image, replace the d-th weight vector of the fully-connected layer with the t-th normalized reference-image output vector $\hat{x}_t$. After a single iteration has been performed, the normalized weight matrix $\hat{W}$ of the fully-connected layer consists of the output vectors of the D normalized reference images. For example, in the application scenario of comparing a certificate photo and a living photo, the method for adjusting the weight matrix of the fully-connected layer according to the feature vectors of the certificate photos is: replace the d-th weight vector of the fully-connected layer with the normalized output vector of the d-th certificate photo; after a single iteration has been performed, the normalized weight matrix $\hat{W}$ of the fully-connected layer consists of the D normalized certificate-photo output vectors.
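A sketch of this replacement step, under the assumption that the weight vectors are stored as columns of a matrix fc_weight; the names fc_weight, x_hat_t and d are illustrative:

```python
import torch

# fc_weight: (feature_dim, D) weight matrix of the frozen fully-connected layer;
# x_hat_t: t-th normalized reference-image output vector; d: reference-identity index.
with torch.no_grad():                   # keep the replacement out of autograd
    fc_weight[:, d] = x_hat_t.detach()  # d-th weight vector := normalized output vector
```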
Step S600: in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer.
Referring to fig. 2, in a single iteration of the neural network training process, after normalization processing is performed on the output vector obtained through convolutional layer processing in the fully-connected layer, a classification vector needs to be obtained according to the normalized output vector and the normalized weight matrix.
For example, in an application scenario of comparing a certificate photo and a living photo, in a single iteration of a neural network training process, after normalization processing is performed on an output vector obtained through convolution layer processing in a full connection layer, a classification vector needs to be obtained according to the output vector after normalization of the certificate photo or the living photo and a weight matrix after normalization.
Optionally, step S600 includes: multiplying the normalized sample-face-image output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$ to obtain the classification vector $f_t$, where $f_t = \hat{W}^T \hat{x}_t$; correspondingly, in the single iteration, after the above steps have been performed for all T feature vectors, the T classification vectors are obtained. In a single iteration of the neural network training process, the t-th classification vector $f_t$ is obtained by multiplying the normalized output vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$, in other words $f_t = \hat{W}^T \hat{x}_t$. For example, in the application scenario of comparing the certificate photo and the living photo, the t-th classification vector $f_t$ is obtained in a single iteration of the neural network training process by multiplying the normalized output vector $\hat{x}_t$ of the certificate photo or living photo by the normalized weight matrix $\hat{W}$.
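Continuing the illustrative sketch, the classification vector $f_t = \hat{W}^T \hat{x}_t$ could be computed as follows; because both factors are unit-length, each component of $f_t$ is a cosine similarity:

```python
import torch
import torch.nn.functional as F

W_hat = F.normalize(fc_weight, p=2, dim=0)  # normalize every weight vector (column)
f_t = W_hat.t() @ x_hat_t                   # t-th classification vector, shape (D,)
# f_t[d] is the cosine between the t-th normalized output vector and the d-th weight vector.
```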
Step S700: inputting the t-th classification vector into the loss function module, and judging whether the number of classification vectors in the loss function module has reached $N_n$, wherein $N_n$ is an integer greater than or equal to 1 and less than or equal to T, and the sum of the $N_n$ is equal to T.
Referring to fig. 2, in a single iteration of the neural network training process, the parameters of the convolutional layer need to be optimized according to a specified optimization algorithm and the loss function values output by the loss function, and each loss function value is calculated in the loss function module from multiple classification vectors; therefore, when the t-th classification vector is input into the loss function module, it is necessary to judge whether the number of classification vectors in the loss function module has reached the number specified for calculating a loss function value.
For example, in an application scenario of comparing a certificate photo and a life photo, when a classification vector of the certificate photo or the life photo is input into a loss function module, it is necessary to determine whether the number of the classification vectors in the loss function module reaches a specified number for calculating a loss function value.
Step S800: when the number of classification vectors in the loss function module reaches $N_n$, the loss function module obtains the n-th loss function value from the $N_n$ classification vectors, optimizes the parameters of the convolutional layer according to the n-th loss function value and a specified optimization algorithm, and empties the classification vectors of the loss function module.
Referring to fig. 2, in a single iteration of the neural network training process, multiple loss function values need to be obtained from the T classification vectors input to the loss function module, where the n-th loss function value is calculated in the loss function module from $N_n$ classification vectors. After the n-th loss function value is obtained, the parameters of the convolutional layer are optimized according to the specified optimization algorithm, and the classification vectors of the loss function module are emptied, so that the number of classification vectors in the loss function module can be counted afresh. After one iteration has been performed, the loss function module has output multiple loss function values and, accordingly, the parameters of the convolutional layer have been updated multiple times.

For example, in the application scenario of comparing the certificate photo and the living photo, the set of 1330000 feature vectors consisting of feature vectors of certificate photos and feature vectors of living photos is input into the neural network; the loss function module outputs multiple loss function values in sequence, and each time a loss value is output, the parameters of the convolutional layer are updated according to that loss value.
Optionally, step S800 includes: setting, according to a specified rule, the cosine-measure scaling parameter s in the loss function and the cosine-measure margin parameter m in the loss function, where m is greater than or equal to 0 and less than or equal to 1; and obtaining the n-th loss value $L_n$ from the $N_n$ classification vectors $f^{n_j}$ in the loss function module according to the following formula, where j sequentially takes the integers from 1 to $N_n$, $N_n$ is the number of classification vectors required to calculate the n-th loss value, and, correspondingly, $n_j$ sequentially takes the integers from $\sum_{k=1}^{n-1} N_k + 1$ to $\sum_{k=1}^{n} N_k$:

$$L_n = -\frac{1}{N_n} \sum_{j=1}^{N_n} \log \frac{e^{s\left(f^{n_j}_{y_j} - m\right)}}{e^{s\left(f^{n_j}_{y_j} - m\right)} + \sum_{i=1,\, i \neq y_j}^{D} e^{s f^{n_j}_{i}}}$$

where the $n_j$-th classification vector $f^{n_j}$ corresponds to the same face as the $y_j$-th reference face image, $f^{n_j}_{y_j}$ is the $y_j$-th value in the classification vector $f^{n_j}$, and $y_j$ is an integer from 1 to D. For example, in the application scenario of the certificate photo and the living photo, the cosine-measure scaling parameter s in the loss function is set to 45 and m is set to 0.35. The classification vector $f^{n_j}$ of a living photo or certificate photo is input into the loss function module, and j sequentially takes 1 to $N_n$ to obtain the n-th loss value $L_n$. The weight matrix $\hat{W}$ consists of 50000 weight vectors, where the $y_j$-th weight vector is the weight vector of the fully-connected layer corresponding to the $y_j$-th certificate photo among the 50000 certificate photos.
Optionally, the specified optimization algorithm is a stochastic gradient descent method, and optimizing the parameters of the convolutional layer according to the n-th loss function value and the specified optimization algorithm includes: setting the parameters of the stochastic gradient descent method according to preset conditions; and obtaining the parameters of the convolutional layer according to the n-th loss value and the stochastic gradient descent method, and adjusting the parameters of the convolutional layer. For example, in the application scenario of comparing a certificate photo and a living photo, the momentum parameter of the stochastic gradient descent method may be set to 0.9 and the learning rate of the convolutional layer to 0.1; the n-th parameters of the convolutional layer are obtained according to the n-th loss value and the stochastic gradient descent method, and the parameters of the neural network's convolutional layer are adjusted to the n-th parameters.
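A sketch of the corresponding optimizer setup, using the momentum 0.9 and learning rate 0.1 from the example; conv_net and loss_n are illustrative names:

```python
import torch

# Only the convolutional layers are handed to the optimizer;
# the frozen fully-connected layer is excluded.
optimizer = torch.optim.SGD(conv_net.parameters(), lr=0.1, momentum=0.9)

optimizer.zero_grad()
loss_n.backward()   # gradients of the n-th loss value
optimizer.step()    # n-th adjustment of the convolutional-layer parameters
```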
Step S900: with t sequentially taking 1 to T, executing steps S100, S200, S300, S400, S500, S600, S700 and S800 to obtain the target parameters of the convolutional layer.
Referring to fig. 2, in a single iteration of the neural network training process, T needs to be sequentially taken from 1 to T, so as to ensure that all samples obtained in the single iteration are input into the neural network for processing.
For example, in the application scenario of comparing the certificate photo and the life photo, T is taken from 1 to T, so as to ensure that the feature vectors of all the certificate photos and the feature vectors of the life photo obtained in a single iteration are input into the neural network.
Optionally, before the initial iterative training, the method further includes: initializing the parameters of the convolutional layer and the weights of the fully-connected layer. For example, in an application scenario of comparing a certificate photo and a living photo, before performing the initial iterative training, the parameters of the convolutional layer and the weights of the fully-connected layer may be set to random values.
Optionally, after step S900, the method further includes: executing multiple iterations until a preset iteration termination condition is met, where the T feature vectors obtained in any two of the multiple iterations are the same. For example, in the application scenario of comparing a certificate photo and a living photo, multiple iterations are required, and the feature vectors of the certificate photos and living photos used in each iteration are the same.
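A hypothetical outer loop over iterations might look as follows, reusing the single-iteration sketch given earlier; max_iterations stands in for the unspecified preset termination condition:

```python
# The same T feature vectors are reused in every iteration.
max_iterations = 100
for _ in range(max_iterations):
    train_one_iteration(conv_net, fc_weight, feature_vectors, is_reference,
                        ref_index, optimizer, cosine_margin_loss, batch_size)
```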
Second embodiment
The face recognition method provided by this embodiment includes the following steps: performing feature extraction on a reference image feature vector and a sample image feature vector, respectively, by the convolutional layer obtained by performing at least one iterative training with the method of the first embodiment, to obtain a reference feature and a sample feature; calculating the similarity between the reference feature and the sample feature; and judging, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person. For example, in the application scenario of comparing a certificate photo and a living photo, feature extraction is performed on the feature vector of the certificate photo and the feature vector of the living photo, respectively, by the convolutional layer obtained by at least one iterative training according to the method of the first embodiment, and the similarity between the certificate-photo feature and the living-photo feature is calculated; whether the face corresponding to the feature vector of the certificate photo and the face corresponding to the feature vector of the living photo are the face of the same person is then judged according to that similarity. Specifically, at a false alarm rate of one thousandth, when the convolutional layer is obtained by 120 hours of iterative training according to the method described in the first embodiment and feature extraction is performed on the feature vectors of the certificate photos and living photos through the obtained convolutional layer, the accuracy of judging whether the face corresponding to the feature vector of a certificate photo and the face corresponding to the feature vector of a living photo are the face of the same person can reach more than 60%, whereas in the prior art the accuracies of NormFace and CosFace are 9.2% and 32.2%, respectively. Correspondingly, at a false alarm rate of one ten-thousandth, when the convolutional layer is obtained by 120 hours of iterative training according to the method described in the first embodiment and feature extraction is performed on the feature vectors of the certificate photos and living photos through the obtained convolutional layer, the accuracy of judging whether the face corresponding to the feature vector of a certificate photo and the face corresponding to the feature vector of a living photo are the face of the same person can reach more than 40%, whereas in the prior art the accuracies of NormFace and CosFace are 2.6% and 19.3%, respectively.
In a traditional living-photo comparison scenario, feature extraction is performed on the feature vectors of different living photos, respectively, by the convolutional layer obtained by at least one iterative training according to the method of the first embodiment, the similarity between the features of the different living photos is calculated, and whether the faces corresponding to the feature vectors of the different living photos are the face of the same person is judged according to that similarity. Specifically, at a false alarm rate of one thousandth, when the convolutional layer is obtained by 120 hours of iterative training according to the method described in the first embodiment and feature extraction is performed on the feature vectors of the different living photos through the obtained convolutional layer, the accuracy of judging whether the faces corresponding to the feature vectors of the different living photos are the face of the same person is 94.8%, while the accuracies of NormFace and CosFace in the prior art are 85.0% and 91.9%, respectively; at a false alarm rate of one ten-thousandth, the corresponding accuracy is 92.2%, while the accuracies of NormFace and CosFace in the prior art are 73.2% and 87.9%, respectively.
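The recognition step itself reduces to a cosine-similarity comparison against a threshold; the following sketch assumes the trained convolutional layers are available as conv_net, and the 0.5 decision threshold is illustrative, not a value from the application:

```python
import torch
import torch.nn.functional as F

def same_person(conv_net, ref_vec, sample_vec, threshold=0.5):
    """Judge whether a reference image and a sample image show the same face."""
    with torch.no_grad():
        ref_feat = conv_net(ref_vec.unsqueeze(0)).squeeze(0)
        sample_feat = conv_net(sample_vec.unsqueeze(0)).squeeze(0)
    similarity = F.cosine_similarity(ref_feat, sample_feat, dim=0)
    return similarity.item() >= threshold
```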
Third embodiment
An embodiment of the present application provides a face recognition system, please refer to fig. 3, including:
a convolutional layer obtained by performing at least one iterative training by the method according to the first embodiment, where the convolutional layer is used to perform feature extraction on a reference image feature vector and a sample image feature vector respectively to obtain a reference feature and a sample feature;
a cosine distance calculation module for calculating the similarity of the reference feature and the sample feature;
and a judging module, configured to judge, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person. The judging module may determine, using a preset threshold, whether the faces corresponding to the input reference image feature vector and sample image feature vector are the same face.
In this embodiment, the reference feature and the sample feature, obtained by performing feature extraction on the reference image feature vector and the sample image feature vector, respectively, with the convolutional layer obtained by at least one iterative training according to the method of the first embodiment, express the features of the reference image and the sample image well, ensuring that the outputs of the cosine distance calculation module and the judging module have high accuracy.
Fourth embodiment
An embodiment of the present application provides a storage medium having instructions stored thereon; when the instructions are run on a computer, they cause the computer to execute the neural network training method of the first embodiment and the face recognition method of the second embodiment.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by hardware, or by software plus a necessary general hardware platform. Based on this understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive or removable hard disk) and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods of the various implementation scenarios of the present invention.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A neural network training method is characterized in that the neural network comprises a convolutional layer, a fully-connected layer and a loss function module which are connected in sequence, and in the process of single iteration training, the method comprises the following steps:
acquiring T feature vectors, wherein the T feature vectors comprise feature vectors of D reference face images and feature vectors of M sample face images, the number D of the feature vectors of the reference face images is the same as that of the weight vectors of the full-connection layer, each sample face image in the M sample face images and one reference face image in the D reference face images are face images of the same person, T is equal to the sum of D and M, and T, D, M are positive integers;
setting the learning rate of the full connection layer to be zero;
inputting the t-th feature vector among the T feature vectors into a neural network, and obtaining the t-th output feature vector $x_t$, wherein t is an integer greater than or equal to 1 and less than or equal to T, and the t-th feature vector is a feature vector of a reference face image or a feature vector of a sample face image among the T feature vectors;
normalizing the t-th output feature vector in the fully-connected layer to obtain the t-th normalized output feature vector $\hat{x}_t$;
judging whether the t-th feature vector is the feature vector of a reference face image, and, when the t-th feature vector is the feature vector of the d-th reference face image among the D reference face images, replacing the d-th weight vector of the fully-connected layer with the t-th normalized output feature vector $\hat{x}_t$, wherein d is a positive integer greater than 1 and less than or equal to D, and, in the single iteration, after the feature vectors of the D reference face images have been replaced, obtaining a fully-connected-layer normalized weight matrix $\hat{W}$ composed of the output vectors of the D normalized reference face images;
in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output feature vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer;
inputting the t-th classification vector into a loss function module, and judging whether the number of classification vectors in the loss function module has reached $N_n$, wherein $N_n$ denotes the number of samples in the n-th batch after the total of T samples is divided into batches, $N_n$ is an integer greater than or equal to 1 and less than or equal to T, and the sum $N_1 + N_2 + \cdots + N_n$ is equal to T;
when the number of classification vectors in the loss function module reaches $N_n$, obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors, optimizing the parameters of the convolutional layer according to the n-th loss function value and a specified optimization algorithm, and emptying the classification vectors of the loss function module;
and executing the above steps with t sequentially taking 1 to T, to obtain the target parameters of the convolutional layer.
2. The method of claim 1, wherein, in the fully-connected layer, obtaining the t-th classification vector $f_t$ from the normalized output feature vector $\hat{x}_t$ and the normalized weight matrix $\hat{W}$ of the fully-connected layer comprises:

multiplying the normalized output feature vector $\hat{x}_t$ by the normalized weight matrix $\hat{W}$ to obtain the classification vector $f_t$, wherein $f_t = \hat{W}^T \hat{x}_t$;

correspondingly, in the single iteration, after the above steps have been performed for all T feature vectors, obtaining T classification vectors.
3. The method of claim 2, wherein obtaining, by the loss function module, the n-th loss function value from the $N_n$ classification vectors comprises:

setting a cosine-measure scaling parameter s in the loss function and another cosine-measure parameter m in the loss function according to a specified rule, wherein m is greater than or equal to 0 and less than or equal to 1;

obtaining the n-th loss value $L_n$ from the $N_n$ classification vectors $f^{n_j}$ in the loss function module according to the following formula, wherein j sequentially takes the integers from 1 to $N_n$, $N_n$ is the number of classification vectors required to calculate the n-th loss value, and, correspondingly, $n_j$ sequentially takes the integers from $\sum_{k=1}^{n-1} N_k + 1$ to $\sum_{k=1}^{n} N_k$:

$$L_n = -\frac{1}{N_n} \sum_{j=1}^{N_n} \log \frac{e^{s\left(f^{n_j}_{y_j} - m\right)}}{e^{s\left(f^{n_j}_{y_j} - m\right)} + \sum_{i=1,\, i \neq y_j}^{D} e^{s f^{n_j}_{i}}}$$

wherein the classification vector $f^{n_j}$ corresponds to the same face as the $y_j$-th reference face image, $f^{n_j}_{y_j}$ is the $y_j$-th value in the classification vector $f^{n_j}$, and $y_j$ is an integer from 1 to D.
4. The method of claim 3, wherein the specified optimization algorithm is a stochastic gradient descent method, and wherein optimizing the parameters of the convolutional layer according to the specified optimization algorithm based on the nth loss function value comprises:
setting parameters of the stochastic gradient descent method according to preset conditions;

and obtaining parameters of the convolutional layer according to the n-th loss value and the stochastic gradient descent method, and adjusting the parameters of the convolutional layer.
5. The method of claim 1, wherein prior to performing the first iterative training, the method further comprises:
initializing parameters of the convolutional layer and weights of the fully-connected layer.
6. The method of claim 1, further comprising:
and executing multiple iterations until a preset iteration termination condition is met, wherein the T feature vectors obtained in any two of the multiple iterations are the same.
7. A method of face recognition, comprising:
respectively performing feature extraction on a reference image feature vector and a sample image feature vector by the convolutional layer obtained by performing at least one iterative training through the method according to any one of claims 1 to 6 to obtain a reference feature and a sample feature;
calculating the similarity of the reference feature and the sample feature;
and judging, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
8. A face recognition system, comprising:
a convolutional layer obtained by performing at least one iterative training through the method according to any one of claims 1 to 6, wherein the convolutional layer is used for performing feature extraction on the reference image feature vector and the sample image feature vector respectively to obtain a reference feature and a sample feature;
a cosine distance calculation module for calculating the similarity of the reference feature and the sample feature;
and a judging module, configured to judge, according to the similarity between the reference feature and the sample feature, whether the face corresponding to the sample image feature vector and the face corresponding to the reference image feature vector are the face of the same person.
9. A storage medium having stored thereon instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 6.
CN201811630038.1A 2018-12-28 2018-12-28 Neural network training method, face recognition system and storage medium Active CN109711358B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811630038.1A CN109711358B (en) 2018-12-28 2018-12-28 Neural network training method, face recognition system and storage medium

Publications (2)

Publication Number Publication Date
CN109711358A CN109711358A (en) 2019-05-03
CN109711358B true CN109711358B (en) 2020-09-04

Family

ID=66259348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811630038.1A Active CN109711358B (en) 2018-12-28 2018-12-28 Neural network training method, face recognition system and storage medium

Country Status (1)

Country Link
CN (1) CN109711358B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378092B (en) * 2019-07-26 2020-12-04 北京积加科技有限公司 Identity recognition system, client, server and method
CN110610140B (en) * 2019-08-23 2024-01-19 平安科技(深圳)有限公司 Training method, device and equipment of face recognition model and readable storage medium
CN112949672B (en) * 2019-12-11 2024-10-15 顺丰科技有限公司 Commodity identification method, commodity identification device, commodity identification equipment and computer readable storage medium
CN111242162B (en) * 2019-12-27 2023-06-20 北京地平线机器人技术研发有限公司 Training method and device of image classification model, medium and electronic equipment
CN113269010B (en) * 2020-02-14 2024-03-26 深圳云天励飞技术有限公司 Training method and related device for human face living body detection model
CN113554145B (en) * 2020-04-26 2024-03-29 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for determining output of neural network
CN112509154B (en) * 2020-11-26 2024-03-22 北京达佳互联信息技术有限公司 Training method of image generation model, image generation method and device
CN112862096A (en) * 2021-02-04 2021-05-28 百果园技术(新加坡)有限公司 Model training and data processing method, device, equipment and medium
CN112906676A (en) * 2021-05-06 2021-06-04 北京远鉴信息技术有限公司 Face image source identification method and device, storage medium and electronic equipment
WO2023283805A1 (en) * 2021-07-13 2023-01-19 深圳大学 Face image clustering method, apparatus and device, and computer-readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850825A (en) * 2015-04-18 2015-08-19 中国计量学院 Facial image face score calculating method based on convolutional neural network
CN105138993A (en) * 2015-08-31 2015-12-09 小米科技有限责任公司 Method and device for building face recognition model
CN106951825A (en) * 2017-02-13 2017-07-14 北京飞搜科技有限公司 A kind of quality of human face image assessment system and implementation method
CN107871106A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Face detection method and device
CN107992842A (en) * 2017-12-13 2018-05-04 深圳云天励飞技术有限公司 Biopsy method, computer installation and computer-readable recording medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018022821A1 (en) * 2016-07-29 2018-02-01 Arizona Board Of Regents On Behalf Of Arizona State University Memory compression in a deep neural network
CN108009625B (en) * 2016-11-01 2020-11-06 赛灵思公司 Fine adjustment method and device after artificial neural network fixed point
CN106780906B (en) * 2016-12-28 2019-06-21 北京品恩科技股份有限公司 A kind of testimony of a witness unification recognition methods and system based on depth convolutional neural networks
CN106991474B (en) * 2017-03-28 2019-09-24 华中科技大学 The parallel full articulamentum method for interchanging data of deep neural network model and system
CN107609634A (en) * 2017-08-21 2018-01-19 哈尔滨工程大学 A kind of convolutional neural networks training method based on the very fast study of enhancing
CN108229298A (en) * 2017-09-30 2018-06-29 北京市商汤科技开发有限公司 The training of neural network and face identification method and device, equipment, storage medium
CN107886064B (en) * 2017-11-06 2021-10-22 安徽大学 Face recognition scene adaptation method based on convolutional neural network
CN107944399A (en) * 2017-11-28 2018-04-20 广州大学 A kind of pedestrian's recognition methods again based on convolutional neural networks target's center model
CN108009528B (en) * 2017-12-26 2020-04-07 广州广电运通金融电子股份有限公司 Triple Loss-based face authentication method and device, computer equipment and storage medium
CN108090565A (en) * 2018-01-16 2018-05-29 电子科技大学 Accelerated method is trained in a kind of convolutional neural networks parallelization
CN108416343B (en) * 2018-06-14 2020-12-22 北京远鉴信息技术有限公司 Face image recognition method and device

Similar Documents

Publication Publication Date Title
CN109711358B (en) Neural network training method, face recognition system and storage medium
CN108182394B (en) Convolutional neural network training method, face recognition method and face recognition device
CN110210560B (en) Incremental training method, classification method and device, equipment and medium of classification network
CN107529650B (en) Closed loop detection method and device and computer equipment
WO2019237846A1 (en) Image processing method and apparatus, face recognition method and apparatus, and computer device
CN109584884B (en) Voice identity feature extractor, classifier training method and related equipment
CN112800876B (en) Super-spherical feature embedding method and system for re-identification
CN109460793A (en) A kind of method of node-classification, the method and device of model training
CN111291817B (en) Image recognition method, image recognition device, electronic equipment and computer readable medium
CN107679572B (en) Image distinguishing method, storage device and mobile terminal
KR20160072768A (en) Method and apparatus for recognizing and verifying image, and method and apparatus for learning image recognizing and verifying
CN110929836B (en) Neural network training and image processing method and device, electronic equipment and medium
TWI803243B (en) Method for expanding images, computer device and storage medium
CN111914908A (en) Image recognition model training method, image recognition method and related equipment
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
CN113011532A (en) Classification model training method and device, computing equipment and storage medium
CN111382791A (en) Deep learning task processing method, image recognition task processing method and device
US9928408B2 (en) Signal processing
CN112446428B (en) Image data processing method and device
CN114547365A (en) Image retrieval method and device
CN114359993A (en) Model training method, face recognition device, face recognition equipment, face recognition medium and product
CN112257689A (en) Training and recognition method of face recognition model, storage medium and related equipment
CN117237757A (en) Face recognition model training method and device, electronic equipment and medium
CN111507218A (en) Matching method and device of voice and face image, storage medium and electronic equipment
CN116503670A (en) Image classification and model training method, device and equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 80001-2, floor 7, building 1, No. 158, West Fourth Ring North Road, Haidian District, Beijing 100097

Applicant after: Beijing Yuanjian Information Technology Co.,Ltd.

Address before: 615000 3 people's West Road, new town, Zhaojue County, Liangshan Yi Autonomous Prefecture, Sichuan 1-1

Applicant before: SICHUAN YUANJIAN TECHNOLOGY Co.,Ltd.

GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190503

Assignee: RUN TECHNOLOGIES Co.,Ltd. BEIJING

Assignor: Beijing Yuanjian Information Technology Co.,Ltd.

Contract record no.: X2022990000639

Denomination of invention: Neural network training method, face recognition method and system and storage medium

Granted publication date: 20200904

License type: Common License

Record date: 20220913
