In a specific implementation, the source image set and the target image set are used to train the face recognition model. During training, the source images and the target images are subjected to domain fusion so that features of the mixed domains are learned; as a result, the trained face recognition model learns some features of the target images while learning the features of the source images.
Preferably, the comparison images in the comparison image set may be the same as the source images in the source image set. Because they are the same, the distance between the second feature vector extracted by the second neural network and the first feature vector extracted by the first neural network better measures the degree to which the second neural network is influenced by the target images, yielding a better training result for the second neural network and improving the accuracy of face recognition.
S102: the comparison image set is input into a first neural network, and a first feature vector is extracted for each comparison image in the input comparison image set by using the first neural network.
In a specific implementation, the first neural network may perform feature extraction on each comparison image in the comparison image set by using a Convolutional Neural Network (CNN), so as to obtain a first feature vector corresponding to each comparison image.
Specifically, global feature extraction may be performed on each comparison image in the comparison image set through the first neural network to obtain a global first feature vector, or the first neural network may perform local feature extraction on each input comparison image to generate local first feature vectors for each comparison image.
Here, when local feature extraction is performed on an input comparison image using the first neural network, the parts to undergo local feature extraction are determined first. For example, the local positions of the eyes, mouth, nose, eyebrows and forehead in each comparison image may be determined first, and feature extraction is then performed on the selected local positions in each comparison image using a CNN, generating a corresponding first feature vector for each local position. It should be noted that each comparison image therefore has a plurality of local first feature vectors, one for each part; together, these local first feature vectors form the first feature vector of the comparison image.
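As a minimal illustration of how the local first feature vectors jointly form an image-level first feature vector, the following sketch concatenates per-part features; the part boxes and the `extract_part_feature` helper are hypothetical stand-ins for a trained CNN:

```python
# Sketch: build an image-level first feature vector by concatenating
# local part vectors. The part crop boxes and the mean-pooling
# "feature extractor" are illustrative placeholders for a trained CNN.

PART_BOXES = {           # hypothetical (row0, row1, col0, col1) crops
    "eyes":  (2, 4, 1, 7),
    "nose":  (4, 6, 3, 5),
    "mouth": (6, 8, 2, 6),
}

def extract_part_feature(image, box):
    """Stand-in for a CNN: mean intensity of the cropped region."""
    r0, r1, c0, c1 = box
    pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return [sum(pixels) / len(pixels)]   # one-dimensional "feature" per part

def first_feature_vector(image):
    # The local first feature vectors jointly form the image's vector.
    vec = []
    for part, box in PART_BOXES.items():
        vec.extend(extract_part_feature(image, box))
    return vec

image = [[(r * 8 + c) % 16 for c in range(8)] for r in range(8)]
fv = first_feature_vector(image)
print(len(fv))   # one entry per part with these toy one-dimensional features
```

A real implementation would produce a multi-dimensional vector per part, but the concatenation pattern is the same.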
S103: inputting the source image set and the target image set into a second neural network, and extracting a second feature vector for each source image in the source image set after performing feature learning on the source images in the source image set and the target images in the target image set; and inputting the second feature vector into a face classifier to obtain a classification result.
In a specific implementation, each source image in the source image set carries a label, and the label indicates the identity corresponding to the face in the source image; the target images in the target image set may or may not carry labels. After the source image set and the target image set are input into the second neural network, the second neural network performs feature learning with shared parameters on the source images and the target images. Because the second neural network performs supervised learning on the source images and unsupervised learning on the target images, the parameters of the second neural network are continuously adjusted while the same network performs parameter-sharing learning on both kinds of images; the parameters of the second neural network are therefore influenced by the target image set during training. As a result, when the second neural network performs feature learning on the source images and the target images and then extracts a second feature vector for each source image, the second feature vector is perturbed by the target images, so that inter-domain mixing of the source images and the target images is realized.
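The effect of shared-parameter learning, where an unsupervised signal from the unlabelled target domain also shapes the parameters fitted to the labelled source domain, can be sketched with a toy scalar model; the losses and learning rate here are illustrative assumptions, not the patent's actual networks:

```python
# Sketch: one shared parameter updated by a supervised loss on the
# labelled source domain and an unsupervised loss on the unlabelled
# target domain, so the target data also shapes the shared parameters.

w = 0.5                                  # shared "encoder" parameter
source = [(1.0, 2.0), (2.0, 4.0)]        # (x, label) pairs
target = [3.0, -1.0]                     # unlabelled samples
lr = 0.01

for _ in range(200):
    # supervised gradient from the source domain: d/dw (w*x - y)^2
    g = sum(2 * (w * x - y) * x for x, y in source)
    # unsupervised gradient from the target domain: shrink feature
    # magnitude, d/dw 0.1*(w*x)^2 (a stand-in unsupervised signal)
    g += sum(0.1 * 2 * (w * x) * x for x in target)
    w -= lr * g

# Without the target term, w would converge to exactly 2.0; the
# target-domain loss pulls it toward 5/3, illustrating the influence
# of the target image set on the shared parameters.
print(round(w, 3))
```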
S104: and performing the current round of training on the second neural network and the face classifier based on the comparison result of the first feature vector and the corresponding second feature vector and the classification result.
In a specific implementation, the first feature vectors extracted from the comparison images in the comparison image set are used to constrain the influence of the target images on the second neural network: the parameters of the second neural network are influenced by the target images, but the target images must not influence those parameters too greatly. Therefore, after the first feature vector and the second feature vector are obtained, they are compared. If the difference between them is too large, the influence of the target images on the parameters of the second neural network is too large; the parameters are then readjusted to reduce that influence.
Meanwhile, because the parameters of the second neural network are influenced by the target images, the second feature vectors that the second neural network extracts from the source images are affected to a certain extent. These second feature vectors are used to perform supervised training on the classifier, so that the trained classifier can correctly classify the source images using the second feature vectors extracted from them.
Further, the comparison image from which the first feature vector is extracted and the source image from which the corresponding second feature vector is extracted are images of the same person; that is, in one round of training, the image sets input into the first neural network and the second neural network may contain images of a plurality of persons, and corresponding feature vectors are output for each person.
Specifically, in an alternative embodiment of the present invention, the current round of training on the second neural network and the face classifier based on the first neural network is completed by performing the following distance determination operation and classification operation until the distance is not greater than a preset distance threshold and the obtained classification result is correct.
As shown in fig. 2, the distance determining operation includes:
S201: A distance between the first feature vector and the currently determined corresponding second feature vector is determined.
Here, the distance between the first feature vector and the second feature vector is actually used to measure the similarity between the comparison image and the source image: the smaller the distance, the more similar the two images are, and the greater the probability that they belong to the same face.
S202: and generating first feedback information aiming at the condition that the distance is greater than a preset distance threshold value, and carrying out parameter adjustment on the second neural network based on the first feedback information.
Here, since the distance between the first feature vector and the second feature vector is used to constrain the second neural network, a distance threshold is preset; when the distance between the first feature vector and the second feature vector is greater than the preset distance threshold, the first feedback information is generated. The first feedback information is fed back to the second neural network, which can adjust its parameters based on it.
S203: based on the adjusted parameters, a new second feature vector is extracted for each source image in the source image set using the second neural network, and the distance determination operation is performed again, until the distance between the first feature vector and the second feature vector is not greater than the preset distance threshold.
It should be noted here that when the distance between the first feature vector and the second feature vector is not greater than the preset distance threshold, another piece of feedback information is still generated and fed back to the second neural network, so that the second neural network makes a smaller-magnitude adjustment to its parameters and the gradient descends toward a local optimum.
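The distance determination operation (S201 to S203) can be sketched as the following loop; the toy vectors, the blend-factor update standing in for a real parameter adjustment, and the threshold are all assumptions for illustration:

```python
# Sketch of the distance determination operation (S201-S203): keep
# adjusting the second network's output until its feature vector for
# a source image is close enough to the first network's vector for
# the matching comparison image.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

first_vec = [1.0, 0.0, 2.0]        # from the (frozen) first network
second_vec = [4.0, 3.0, -1.0]      # initial output of the second network
threshold = 0.5
rounds = 0

while euclidean(first_vec, second_vec) > threshold:
    # "first feedback information": nudge the second network so its
    # next extraction moves toward the first feature vector
    second_vec = [s + 0.3 * (f - s) for f, s in zip(first_vec, second_vec)]
    rounds += 1

assert euclidean(first_vec, second_vec) <= threshold
print(rounds)   # 7 with these toy numbers
```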
Referring to fig. 3, the classification operation includes:
S301: and classifying the currently determined second feature vector by using a face classifier.
S302: and generating second feedback information aiming at the condition that the classification result is wrong, and adjusting parameters of the second neural network and the face classifier based on the second feedback information.
Specifically, when the classifier's result for the currently determined second feature vector is wrong, the parameters of the second neural network are considered to have been influenced too strongly by the target images during domain mixing of the source images and the target images. Therefore, when the classification result is wrong, the classifier generates second feedback information and feeds it back to the second neural network and the face classifier, which then adjust their respective parameters based on this feedback information.
S303: based on the adjusted parameters, a new second feature vector is extracted for each source image in the source image set using the second neural network, and the classification operation is performed again, until the classification result of the face classifier is correct.
It should be noted here that when the classification result is correct, corresponding feedback information is still generated and fed back to the second neural network and the face classifier, which then adjust their parameters with a smaller magnitude based on this correctly-classified feedback, so that the gradient descends toward a local optimum.
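The classification operation (S301 to S303) can be sketched similarly; the nearest-centroid classifier and the feedback update below are hypothetical stand-ins for a trained face classifier and a real gradient step:

```python
# Sketch of the classification operation (S301-S303): while the face
# classifier mislabels the second feature vector, feed "second
# feedback information" back and adjust; stop once the result is
# correct.

centroids = {"person_a": [0.0, 0.0], "person_b": [5.0, 5.0]}

def classify(vec):
    """Nearest-centroid stand-in for the face classifier."""
    return min(centroids, key=lambda k: sum(
        (v - c) ** 2 for v, c in zip(vec, centroids[k])))

true_label = "person_a"
second_vec = [4.0, 4.0]            # initially lands on the wrong side

while classify(second_vec) != true_label:
    # second feedback: shift the extracted feature toward its class
    second_vec = [v + 0.5 * (c - v)
                  for v, c in zip(second_vec, centroids[true_label])]

assert classify(second_vec) == true_label
```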
S105: and performing multi-round training on the second neural network and the face classifier to obtain a face recognition model.
In a specific implementation, performing one round of training on the second neural network and the face classifier means training them with one group consisting of a comparison image set, a source image set and a target image set. Then, further groups of comparison image sets, source image sets and target image sets are input in succession to train the second neural network and the face classifier until a second neural network and a face classifier that meet the requirements are obtained; the resulting second neural network and face classifier are taken as the trained face recognition model.
In the face recognition model training method provided by the embodiment of the present invention, the source image set and the target image set are input into the same second neural network, which performs parameter-sharing feature learning on the source images and the target images. The second neural network is therefore influenced by the target images during training and learns some of their features, so the second feature vector obtained when the second neural network extracts features from a source image not only captures the features of the source image but also introduces preset features of the target images. When the second feature vectors are used to train the face classifier, the classifier is likewise influenced by the preset features of the target images. As a result, the face recognition model obtained by this training method achieves a better recognition effect when performing face recognition on images to be recognized that have the preset features, for example, images of poor quality.
In another embodiment of the present invention, referring to fig. 4, after inputting the source image set and the target image set into the second neural network and performing feature learning on the source image and the target image, the method further includes:
S401: A third feature vector is extracted for each target image in the target image set.
S402: and performing gradient reversal processing on the second feature vector and the third feature vector.
S403: and inputting the second feature vector and the third feature vector subjected to gradient reversal processing into a domain classifier.
S404: and adjusting the parameters of the second neural network according to the domain classifier's domain classification results for the source image set and the target image set respectively characterized by the second feature vector and the third feature vector.
In a specific implementation, the process of training the second neural network with the source images and the target images is actually a process of domain mixing the source images and the target images. The second feature vector obtained by feature extraction on a source image using the second neural network is influenced by features of the target images, that is, the second feature vector moves closer to the target-image features; meanwhile, the third feature vector obtained by feature extraction on a target image using the second neural network is influenced by features of the source images, that is, the third feature vector moves closer to the source-image features. Therefore, to realize domain mixing of the source images and the target images, after a third feature vector is extracted for each target image in the target image set, gradient reversal processing is performed on the second feature vectors and the third feature vectors; the processed vectors are then input into a domain classifier, which performs domain classification on them.
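The gradient reversal step can be sketched as a layer that is the identity on the forward pass and negates the gradient on the backward pass, so the feature extractor is pushed to confuse the domain classifier; the scalar domain classifier below is an illustrative assumption:

```python
# Sketch of gradient reversal: identity forward, gradient multiplied
# by -1 backward, so minimizing the domain classifier's loss with
# respect to the feature extractor actually *confuses* the domain
# classifier, which is what drives domain mixing.

def grad_reverse_forward(feature):
    return feature                    # identity going forward

def grad_reverse_backward(upstream_grad, lam=1.0):
    return -lam * upstream_grad       # flipped going backward

# toy: domain classifier loss L = (d(f) - y)^2 with d(f) = w*f
w, f, y = 0.8, 2.0, 1.0               # classifier weight, feature, domain label
pred = w * grad_reverse_forward(f)
dL_dpred = 2 * (pred - y)
dL_df_classifier = dL_dpred * w       # gradient the classifier sends down
dL_df_extractor = grad_reverse_backward(dL_df_classifier)

# the feature extractor receives the negated gradient:
assert dL_df_extractor == -dL_df_classifier
```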
The higher the probability that the domain classifier correctly classifies the second feature vectors and the third feature vectors, the smaller the degree of domain mixing; conversely, the lower the probability of correct classification, the larger the degree of domain mixing. Therefore, parameter adjustment is performed on the second neural network based on the domain classifier's classification results for the source image set and the target image set respectively characterized by the second feature vectors and the third feature vectors.
Specifically, referring to fig. 5, the parameter adjustment of the second neural network according to the domain classification result can be realized by performing the following domain classification loss determination operation:
S501: and determining the domain classification loss of the current domain classification of the source image set and the target image set respectively characterized by the current second feature vector and the third feature vector.
Here, the degree of domain mixing is characterized by the domain classification loss. The domain classification loss of the source image set refers to the number of source images classified into the target image set when the source images and the target images are classified based on the second feature vectors and the third feature vectors; the domain classification loss of the target image set refers to the number of target images classified into the source image set in the same process. After the domain classifier performs domain classification on the source image set and the target image set respectively characterized by the second feature vectors and the third feature vectors, a domain classification result is obtained, and the domain classification losses corresponding to the source image set and the target image set are then determined from that result.
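Under the counting definition given above, the two domain classification losses can be sketched as follows; the predicted domain labels are made-up examples:

```python
# Sketch of the domain classification loss described above: for each
# set, the loss counts how many of its images the domain classifier
# assigned to the other domain. The predicted labels are hypothetical.

source_true = ["source"] * 4
source_pred = ["source", "target", "source", "target"]   # made up
target_true = ["target"] * 3
target_pred = ["target", "source", "target"]             # made up

def domain_loss(true_labels, predicted):
    """Number of images classified into the wrong domain."""
    return sum(t != p for t, p in zip(true_labels, predicted))

source_set_loss = domain_loss(source_true, source_pred)
target_set_loss = domain_loss(target_true, target_pred)
print(source_set_loss, target_set_loss)   # 2 1
```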
S502: and generating third feedback information aiming at the fact that the difference between the domain classification losses of the latest preset times is not smaller than a preset difference threshold value, and carrying out parameter adjustment on the second neural network based on the third feedback information.
Here, the preset difference threshold is used to constrain the degree of domain mixing. The domain classifier stores in advance the distribution of the domains to which the second feature vectors and the third feature vectors respectively belong. When the difference between the domain classification losses of the latest preset number of times is not smaller than the preset difference threshold, the domain classification is considered to be in an unstable state: in some classifications the domain classifier can correctly distinguish the domains to which the second feature vectors and the third feature vectors respectively belong, while in others it cannot, so the degree of domain mixing is not stable. The parameters of the second neural network therefore need to be adjusted, and third feedback information indicating an overlarge domain classification loss difference is generated and fed back to the second neural network, which adjusts its parameters after receiving it.
S503: based on the adjusted parameters, a second neural network is used for extracting a new second feature vector for each source image in the source image set, a new third feature vector is extracted for a target image in the target image set, domain classification loss determination operation is carried out until the difference is not greater than a preset difference threshold value, and the current round of training of the second neural network based on the domain classifier is completed.
Here, it should be noted that when the difference between the domain classification losses of the latest preset number of times is smaller than the preset difference threshold, feedback information indicating an appropriate domain classification loss difference is still generated and fed back to the second neural network. After receiving it, the second neural network also adjusts its parameters with a smaller magnitude, so that the gradient descends toward a local optimum.
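The stopping test in the domain classification loss determination operation can be sketched as a sliding-window check; the window size, threshold, and per-round losses are illustrative assumptions:

```python
# Sketch of the stopping test in S502-S503: the round ends once the
# differences between the most recent domain classification losses
# stay within a preset threshold.
from collections import deque

PRESET_TIMES = 3
DIFF_THRESHOLD = 0.05

def losses_stable(recent):
    """True when the last PRESET_TIMES losses differ by no more than
    the preset difference threshold (domain mixing has stabilized)."""
    if len(recent) < PRESET_TIMES:
        return False
    window = list(recent)[-PRESET_TIMES:]
    return max(window) - min(window) <= DIFF_THRESHOLD

history = deque(maxlen=PRESET_TIMES)
for loss in [0.9, 0.4, 0.7, 0.52, 0.50, 0.49]:   # per-round domain losses
    history.append(loss)
    if losses_stable(history):
        break
    # otherwise a real trainer would generate third feedback
    # information and adjust the second neural network's parameters

print(losses_stable(history))   # True once the window settles
```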
Therefore, through the triple constraint of the domain classification loss, the distance between the first feature vector and the currently determined corresponding second feature vector, and the face classifier's classification result on the second feature vector, a more optimized face recognition model can be obtained when the second neural network and the face classifier are trained.
Referring to fig. 6, an embodiment of the present invention further provides a specific example of a face recognition model training method, including:
s601: and acquiring a comparison image set, a source image set and a target image set. Jumping to S602 and S603. Wherein, S602 and S603 are executed after S601 is executed, and the execution order of S602 and S603 is not limited.
S602: the comparison image set is input into a first neural network, and a first feature vector is extracted for each comparison image in the input comparison image set by using the first neural network. Jumping to S605.
S603: and inputting the source image set and the target image set into a second neural network, and performing feature learning on the source images in the source image set and the target images in the target image set. Jump to S604.
S604: and extracting a second feature vector for each source image in the source image set by using the second neural network after feature learning. Jumping to S605, S608 and S612. S605, S608, and S612 are executed after S604, and the execution order of S605, S608, and S612 is not limited.
S605: a distance between the first feature vector and a currently determined corresponding second feature vector is determined. Jump to S606.
S606: whether the distance is larger than a preset distance threshold value is detected. If yes, jumping to S607; if not, jumping to S619.
S607: and generating first feedback information, and performing parameter adjustment on the second neural network based on the first feedback information. Jump to S604.
S608: and inputting the second feature vector into a face classifier. Jump to S609.
S609: and classifying the currently determined second feature vector by using a face classifier. Jump to S610.
S610: and detecting whether the classification result is correct. If not, jumping to S611, if yes, jumping to S619.
S611: and generating second feedback information, and performing parameter adjustment on the second neural network and the face classifier based on the second feedback information. Jump to S604.
S612: and extracting a third feature vector for each target image in the target image set by using the second neural network subjected to feature learning. Jumping to S613.
S613: and performing gradient reversal processing on the second feature vector and the third feature vector. Jump to S614.
S614: and inputting the second feature vector and the third feature vector subjected to gradient reversal processing into a domain classifier. Jumping to S615.
S615: and performing domain classification on the source image set and the target image set respectively characterized by the second feature vector and the third feature vector by using a domain classifier. Jump to S616.
S616: and determining the domain classification loss of the current domain classification of the source image set and the target image set respectively characterized by the current second feature vector and the third feature vector. Jump to S617.
S617: detecting whether the difference between the domain classification losses of the latest preset times is smaller than a preset difference threshold value. If not, then jump to S618. If so, it jumps to S619.
S618: and generating third feedback information, and performing parameter adjustment on the second neural network based on the third feedback information. Jump to S604.
S619: the training of the current round is completed.
And when the second neural network and the face classifier are subjected to multi-round training and the parameters in the obtained second neural network and the face classifier tend to be stable, the obtained second neural network and the face classifier are used as a face recognition model. The face recognition model can effectively recognize face images with poor quality.
Further, the process actually jumps to S619 only when the completion conditions checked in S606, S610 and S617 are all satisfied at the same time; when any one of the three is not satisfied, the current round of training is not really completed, and information continues to be fed back to the neural network to fine-tune it as described above.
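The termination logic of S606, S610 and S617 can be sketched as a single predicate; the boolean inputs stand in for the three checks:

```python
# Sketch of the termination logic described above: a round is only
# complete (S619) when all three checks pass at once: the feature
# distance is small enough, the face classification is correct, and
# the domain classification loss has stabilized.

def round_complete(distance, distance_threshold,
                   classification_correct, domain_loss_stable):
    return (distance <= distance_threshold       # S606: "no" branch
            and classification_correct           # S610: "yes" branch
            and domain_loss_stable)              # S617: "yes" branch

assert not round_complete(0.9, 0.5, True, True)   # distance still too big
assert not round_complete(0.3, 0.5, False, True)  # classifier still wrong
assert round_complete(0.3, 0.5, True, True)       # all three satisfied
```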
Referring to fig. 7, an embodiment of the present invention further provides a face recognition method, where the face recognition method specifically includes:
S701: acquiring an image to be recognized;
S702: and performing face recognition on the image to be recognized by using the face recognition model to generate a face recognition result corresponding to the image to be recognized.
The face recognition model is obtained by adopting the face recognition model training method provided by the embodiment of the invention.
Specifically, the face recognition model includes the second neural network and the face classifier. When the face recognition model performs face recognition on an image to be recognized to generate the corresponding face recognition result, the image to be recognized is first input into the second neural network, which performs feature extraction to obtain a feature vector of the image to be recognized; the face classifier then classifies the image to be recognized based on that feature vector to obtain the classification result.
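The recognition flow of S701 and S702 can be sketched as follows; the row-mean "second network", the nearest-neighbour "classifier", and the enrolled gallery are hypothetical placeholders for the trained components:

```python
# Sketch of the recognition flow in S701-S702: the trained second
# network extracts a feature vector from the image to be recognized,
# and the face classifier labels that vector.

def second_network(image):
    """Toy feature extractor: per-row means of the image."""
    return [sum(row) / len(row) for row in image]

def face_classifier(feature, gallery):
    """Nearest-neighbour over enrolled identity features."""
    return min(gallery, key=lambda name: sum(
        (f - g) ** 2 for f, g in zip(feature, gallery[name])))

gallery = {"alice": [1.0, 2.0], "bob": [6.0, 5.0]}   # hypothetical identities
image_to_recognize = [[0.5, 1.5], [2.0, 2.0]]        # rows of pixels

feature = second_network(image_to_recognize)
result = face_classifier(feature, gallery)
print(result)   # "alice" with these toy numbers
```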
In the face recognition method provided by the embodiment of the present invention, when the face recognition model is trained, the source image set and the target image set are input into the same second neural network, which performs parameter-sharing feature learning on the source images and the target images. The second neural network is therefore influenced by the target images during training and learns some of their features, so the second feature vector obtained when the second neural network extracts features from a source image not only captures the features of the source image but also introduces the features of the poor-quality target images. When the second feature vectors are used to train the face classifier, the classifier is likewise influenced by the features of the poor-quality target images. As a result, the face recognition model obtained by this training method achieves a good recognition effect when performing face recognition on poor-quality images to be recognized.
Based on the same inventive concept, the embodiment of the invention also provides a face recognition model training device corresponding to the face recognition model training method.
Referring to fig. 8, a face recognition model training apparatus provided in an embodiment of the present invention includes:
the acquisition module is used for acquiring the comparison image set, the source image set and the target image set;
the first processing module is used for inputting the comparison image set into a first neural network and extracting a first feature vector for each comparison image in the input comparison image set by using the first neural network;
the second processing module is used for inputting the source image set and the target image set into a second neural network, and extracting a second feature vector for each source image in the source image set after performing feature learning on the source images in the source image set and the target images in the target image set; inputting the second feature vector into a face classifier to obtain a classification result;
the training module is used for carrying out the training of the current round on the second neural network and the face classifier based on the comparison result of the first feature vector and the corresponding second feature vector and the classification result; performing multi-round training on a second neural network and a face classifier to obtain a face recognition model;
wherein, the comparison image set comprises at least one comparison image with a label; the source image set comprises at least one source image with a label; and the comparison image from the first feature vector and the source image from the corresponding second feature vector are images of the same person.
Optionally, the second processing module is further configured to: input the source image set and the target image set into the second neural network, and extract a third feature vector for each target image in the target image set after performing feature learning on the source images in the source image set and the target images in the target image set;
perform gradient reversal processing on the second feature vector and the third feature vector;
input the gradient-reversed second feature vector and third feature vector into a domain classifier;
and adjust the parameters of the second neural network according to the domain classifier's domain classification results for the source image set and the target image set respectively characterized by the second feature vector and the third feature vector.
Optionally, the training module is specifically configured to: performing the following distance determination operation and classification operation until the distance between the first feature vector and the second feature vector is not greater than a preset distance threshold and the obtained classification result is correct, and finishing the current round of training on a second neural network and a face classifier based on the first neural network;
the distance determining operation includes:
determining a distance between the first feature vector and a currently determined corresponding second feature vector;
generating first feedback information aiming at the condition that the distance is greater than a preset distance threshold value, and carrying out parameter adjustment on the second neural network based on the first feedback information;
based on the adjusted parameters, extracting a new second feature vector for each source image in the source image set by using a second neural network, and executing distance determination operation again;
the classification operation includes:
classifying the second feature vector determined currently by using a face classifier;
generating second feedback information aiming at the condition that the classification result is wrong, and adjusting parameters of the second neural network and the face classifier based on the second feedback information;
based on the adjusted parameters, a new second feature vector is extracted for each source image in the source image set using the second neural network, and the classification operation is performed again.
Optionally, the second processing module is specifically configured to: perform the following domain classification loss determination operation:
determining the domain classification loss of the current domain classification of the source image set and the target image set, which are respectively represented by the current second feature vectors and third feature vectors;
when the difference between the domain classification losses of the most recent preset number of iterations is not smaller than a preset difference threshold, generating third feedback information, and adjusting the parameters of the second neural network based on the third feedback information;
based on the adjusted parameters, extracting a new second feature vector for each source image in the source image set and a new third feature vector for each target image in the target image set by using the second neural network, and executing the domain classification loss determination operation again until the difference is smaller than the preset difference threshold, thereby completing the current round of training of the second neural network based on the domain classifier.
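For illustration only, the stopping criterion of the domain classification loss determination operation may be sketched as follows. This is a hypothetical minimal example: a logistic-regression domain classifier separates source features (second feature vectors) from target features (third feature vectors), and iteration stops once the loss changes by less than a preset difference threshold. Note that in the described method the third feedback information adjusts the second neural network adversarially; here, for brevity, only the domain classifier is updated, to illustrate the loss-difference convergence check. All sizes, rates, and thresholds are assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)

source_feats = rng.normal(loc=0.5, size=(32, 3))    # stand-in second feature vectors
target_feats = rng.normal(loc=-0.5, size=(32, 3))   # stand-in third feature vectors
feats = np.vstack([source_feats, target_feats])
domains = np.concatenate([np.ones(32), np.zeros(32)])  # 1 = source domain, 0 = target domain

w = np.zeros(3)
b = 0.0                     # domain classifier parameters
diff_threshold = 1e-4       # preset difference threshold (assumed value)
losses = []

for step in range(5000):
    logits = feats @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    # domain classification loss: binary cross-entropy over both domains
    loss = -np.mean(domains * np.log(probs + 1e-9)
                    + (1 - domains) * np.log(1 - probs + 1e-9))
    losses.append(loss)
    # stop when the loss difference over the latest iterations falls below the threshold
    if len(losses) > 1 and abs(losses[-1] - losses[-2]) < diff_threshold:
        break
    # gradient step (here on the classifier; in the method, feedback adjusts the second network)
    grad_w = feats.T @ (probs - domains) / len(domains)
    grad_b = np.mean(probs - domains)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b
```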
Optionally, the first processing module is specifically configured to: perform local feature extraction on the input comparison images by using the first neural network, and generate a local first feature vector for each comparison image.
Optionally, the source image set is the same as the comparison image set.
According to the face recognition model training device provided by the embodiment of the invention, when the face recognition model is trained, the source image set and the target image set are input into the same second neural network, and the second neural network performs feature learning with shared parameters on the source images in the source image set and the target images in the target image set. The second neural network is therefore influenced by the target images during training and learns some of their features, so that the second feature vector obtained when the second neural network performs feature extraction on a source image not only captures the features of the source image but also introduces features of the poor-quality target images. Because the face classifier is trained with second feature vectors that carry the influence of the poor-quality target images, the face recognition model obtained by the face recognition model training method can achieve a good recognition effect when performing face recognition on a poor-quality image to be recognized.
Referring to fig. 9, an embodiment of the present invention further provides a face recognition apparatus, where the apparatus includes:
the image acquisition module is used for acquiring an image to be identified;
the recognition module is used for carrying out face recognition on the image to be recognized by using the face recognition model obtained by the face recognition model training method provided by the embodiment of the invention so as to generate a face recognition result corresponding to the image to be recognized.
Specifically, the face recognition model includes: a second neural network and a face classifier. The recognition module is specifically configured to: input the image to be recognized into the second neural network, and perform feature extraction on the image to be recognized by using the second neural network to obtain a feature vector of the image to be recognized; and classify the image to be recognized by using the face classifier based on the feature vector of the image to be recognized, so as to obtain a classification result of the image to be recognized.
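For illustration only, the recognition module described above may be sketched as follows. This is a hypothetical minimal example: the trained second neural network and face classifier are stood in for by two linear maps with a softmax over known identities; the shapes and the number of identities are illustrative assumptions, not part of the embodiment.

```python
import numpy as np

rng = np.random.default_rng(2)

W_feat = rng.normal(scale=0.1, size=(3, 8))  # stand-in for the trained second neural network
W_cls = rng.normal(scale=0.1, size=(5, 3))   # stand-in face classifier over 5 identities

def recognize(image):
    # feature extraction: obtain the feature vector of the image to be recognized
    feature = W_feat @ image
    # classification: score the feature vector against each known identity
    logits = W_cls @ feature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.argmax(probs)), probs      # classification result and class probabilities

image_to_recognize = rng.normal(size=8)      # stand-in for the image to be recognized
identity, probs = recognize(image_to_recognize)
print(f"predicted identity: {identity}")
```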
In the face recognition device provided by the embodiment of the invention, when the face recognition model is trained, the source image set and the target image set are input into the same second neural network, and the second neural network performs feature learning with shared parameters on the source images in the source image set and the target images in the target image set. The second neural network is therefore influenced by the target images during training and learns some of their features, so that the second feature vector obtained when the second neural network performs feature extraction on a source image not only captures the features of the source image but also introduces features of the poor-quality target images. Because the face classifier is trained with second feature vectors that carry the influence of the poor-quality target images, the face recognition model obtained by the face recognition model training method can achieve a good recognition effect when performing face recognition on a poor-quality image to be recognized.
Corresponding to the face recognition model training method in fig. 1 to fig. 6, an embodiment of the present invention further provides a computer device, as shown in fig. 10, the device includes a memory 1000, a processor 2000 and a computer program stored on the memory 1000 and operable on the processor 2000, wherein the processor 2000 implements the steps of the face recognition model training method when executing the computer program.
Specifically, the memory 1000 and the processor 2000 may be a general-purpose memory and processor, which are not limited herein. When the processor 2000 runs the computer program stored in the memory 1000, the face recognition model training method can be executed, thereby solving the problem that existing face recognition methods cannot recognize face images of poor quality, and achieving the effect of effectively recognizing poor-quality face images.
Corresponding to the face recognition model training method in fig. 1 to fig. 6, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the face recognition model training method.
Specifically, the storage medium may be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the face recognition model training method can be executed, thereby solving the problem that existing face recognition methods cannot recognize face images of poor quality, and achieving the effect of effectively recognizing poor-quality face images.
The computer program products of the face recognition model training method and apparatus and of the face recognition method and apparatus provided in the embodiments of the present invention include a computer-readable storage medium storing program code, and the instructions included in the program code may be used to execute the methods in the foregoing method embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.