CN113361384A - Face recognition model compression method, device, medium, and computer program product - Google Patents

Info

Publication number
CN113361384A
Authority
CN
China
Prior art keywords
face
face recognition
model
training
similarity
Prior art date
Legal status
Pending
Application number
CN202110617959.XA
Other languages
Chinese (zh)
Inventor
朱振文
吴泽衡
周古月
徐倩
杨强
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202110617959.XA
Publication of CN113361384A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The application discloses a face recognition model compression method, device, medium and computer program product. The face recognition model compression method includes: acquiring a training face sample and the sample class label corresponding to the training face sample, and performing feature extraction on the training face sample based on a face recognition teacher model and a to-be-trained face recognition student model respectively, to obtain a first training face feature representation and a second training face feature representation; extracting the center representation of each face class from the face recognition teacher model, and generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation; and optimizing the to-be-trained face recognition student model based on the target model loss, to obtain a face recognition compression model corresponding to the face recognition teacher model. This solves the technical problem of the poor compression effect of face recognition models.

Description

Face recognition model compression method, device, medium, and computer program product
Technical Field
The present application relates to the field of artificial intelligence in financial technology (Fintech), and in particular, to a method, apparatus, medium, and computer program product for compressing a face recognition model.
Background
With the continuous development of financial technology, especially internet technology, more and more technologies (such as distributed computing and artificial intelligence) are being applied in the financial field; at the same time, the financial industry places ever higher requirements on these technologies.
With the continuous development of computer technology, artificial intelligence is applied ever more widely. In recent years face recognition technology has matured and the requirements on recognition accuracy keep rising, so face recognition models have become increasingly complex. As model complexity grows, the hardware and memory required to deploy a face recognition model also grow, making deployment increasingly difficult. At present, a trained face recognition model is usually compressed by pruning. When the pruning ratio is high, the accuracy of the pruned, compressed model suffers; when the pruning ratio is low, the compressed model remains complex and inconvenient to deploy. As a result, the current model compression approach compresses face recognition models poorly.
Disclosure of Invention
The present application mainly aims to provide a face recognition model compression method, device, medium and computer program product, so as to solve the technical problem in the prior art that face recognition models are compressed poorly.
In order to achieve the above object, the present application provides a face recognition model compression method, which is applied to a face recognition model compression device, and the face recognition model compression method includes:
acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively extracting the characteristics of the training face sample based on a face recognition teacher model and a face recognition student model to be trained to obtain a first training face characteristic representation and a second training face characteristic representation;
extracting each face class center representation from the face recognition teacher model, and generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
and optimizing the face recognition student model to be trained based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model.
In order to achieve the above object, the present application provides a face recognition method, where the face recognition method is applied to a face recognition device, and the face recognition method includes:
acquiring a face image to be recognized, and performing feature extraction on the face image to be recognized based on a face recognition compression model optimized with each face class center representation of a face recognition teacher model, to obtain a target face feature representation;
and generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face.
The present application further provides a face recognition model compression apparatus; the face recognition model compression apparatus is a virtual device applied to face recognition model compression equipment, and the face recognition model compression apparatus includes:
the characteristic extraction module is used for acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively extracting the characteristics of the training face sample based on a face recognition teacher model and a face recognition student model to be trained to obtain a first training face characteristic representation and a second training face characteristic representation;
the loss calculation module is used for extracting each face class center representation from the face recognition teacher model and generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
and the model optimization module is used for optimizing the to-be-trained face recognition student model based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model.
The present application further provides a face recognition apparatus; the face recognition apparatus is a virtual device applied to face recognition equipment, and the face recognition apparatus includes:
the characteristic extraction module is used for acquiring a face image to be recognized, and performing feature extraction on the face image to be recognized based on a face recognition compression model optimized with each face class center representation of a face recognition teacher model, to obtain a target face feature representation;
and the generating module is used for generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face.
The present application further provides a face recognition model compression device, the face recognition model compression device is an entity device, the face recognition model compression device includes: a memory, a processor and a program of the face recognition model compression method stored on the memory and executable on the processor, the program of the face recognition model compression method being executable by the processor to implement the steps of the face recognition model compression method as described above.
The present application further provides a face recognition device, the face recognition device is an entity device, the face recognition device includes: a memory, a processor and a program of the face recognition method stored on the memory and executable on the processor, the program of the face recognition method being executable by the processor to implement the steps of the face recognition method as described above.
The present application further provides a medium, which is a readable storage medium, wherein a program for implementing the face recognition model compression method is stored on the readable storage medium, and when executed by a processor, the program for implementing the face recognition model compression method implements the steps of the face recognition model compression method as described above.
The present application further provides a medium, which is a readable storage medium, on which a program for implementing the face recognition method is stored, and when the program for implementing the face recognition method is executed by a processor, the steps of the face recognition method are implemented as described above.
The present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the face recognition model compression method as described above.
The present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the face recognition method as described above.
The application provides a face recognition model compression method, equipment, medium and computer program product. Compared with the prior-art technique of compressing a face recognition model by pruning a trained model, the application first acquires a training face sample and the sample class label corresponding to the training face sample, and performs feature extraction on the training face sample based on a face recognition teacher model and a to-be-trained face recognition student model respectively, obtaining a first training face feature representation and a second training face feature representation. It then extracts each face class center representation from the face recognition teacher model and generates a target model loss according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation. The class center representations of the teacher model are thereby shared directly with the to-be-trained student model. Because the face recognition teacher model is a complex model, optimizing the student model directly with a loss built on the teacher's class center representations lets the student learn the teacher's feature distribution on the basis of those class center representations. On the premise that the to-be-trained student model is a lightweight model, this improves the efficiency and accuracy with which the student model learns the teacher model's knowledge, yields a face recognition compression model corresponding to the teacher model, and improves the accuracy of the compression model as a lightweight model. The application thereby overcomes the technical defects that, with a high pruning ratio, the accuracy of the pruned compressed model suffers and, with a low pruning ratio, the compressed model remains complex and inconvenient to deploy, so the compression effect is poor; the compression effect of the face recognition model is thus improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; it will be obvious that other drawings can be obtained by those skilled in the art from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of an embodiment of a face recognition model compression method according to the present application;
FIG. 2 is a schematic flow chart of iterative training of a face recognition student model to be trained in the face recognition model compression method of the present application;
FIG. 3 is a schematic flow chart illustrating an embodiment of a face recognition method according to the present application;
FIG. 4 is a schematic structural diagram of a hardware operating environment related to a face recognition model compression method in an embodiment of the present application;
fig. 5 is a schematic structural diagram of a hardware operating environment related to a face recognition method in the embodiment of the present application.
The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In a first embodiment of the face recognition model compression method, referring to fig. 1, the face recognition model compression method is applied to a face recognition model compression device, and the face recognition model compression method includes:
step S10, acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively extracting the features of the training face sample based on a face recognition teacher model and a face recognition student model to be trained to obtain a first training face feature representation and a second training face feature representation;
In this embodiment, it should be noted that the training face sample is sample data representing a face image, such as a pixel matrix of the face image, and the sample class label is an identifier of the sample class to which the training face sample belongs. The face recognition teacher model is a trained face recognition model with a complex network structure; such a model has high accuracy but high hardware resource requirements, and the complex network structure includes structures such as ResNet101, Inception-v4 and SENet. The to-be-trained face recognition student model is an untrained face recognition model with a lightweight network structure; such a model has low hardware resource requirements, and its accuracy is lower than that of a face recognition model with a complex network structure. The lightweight network structure includes structures such as MobileFaceNet, VarGFaceNet and ShuffleNet. The first training face feature representation is a vector of a preset dimension, generated by the face recognition teacher model, representing the face features of the training face sample; the second training face feature representation is the corresponding vector generated by the to-be-trained face recognition student model. The preset dimension may be set to 512, 1024 and the like.
A training face sample and the sample class label corresponding to the training face sample are acquired, and feature extraction is performed on the training face sample based on the face recognition teacher model and the to-be-trained face recognition student model respectively, to obtain the first and second training face feature representations. Specifically, the training face sample, which is a face image pixel matrix representing a face image, and its sample class label are acquired. Feature extraction is then performed on the face image pixel matrix based on the face recognition teacher model, mapping the pixel matrix to a vector of a preset dimension and obtaining the first training face feature representation; feature extraction is likewise performed on the pixel matrix based on the to-be-trained face recognition student model, mapping it to a vector of the preset dimension and obtaining the second training face feature representation.
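As a concrete illustration of this step, the following is a minimal sketch in PyTorch (an assumption of this example; the patent does not name a framework), where teacher_model and student_model are hypothetical torch.nn.Module backbones (e.g. a ResNet101 teacher and a MobileFaceNet student) that map a pixel matrix to an embedding of the preset dimension:

```python
import torch

def extract_training_features(teacher_model, student_model, face_batch):
    # face_batch: (N, 3, H, W) pixel matrices of the training face samples
    teacher_model.eval()
    with torch.no_grad():  # the teacher is already trained, so no gradients are needed for it
        feat_t = teacher_model(face_batch)  # first training face feature representation, (N, D)
    feat_s = student_model(face_batch)      # second training face feature representation, (N, D)
    return feat_t, feat_s
```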
Step S20, extracting each face class center representation from the face recognition teacher model, and generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
In this embodiment, it should be noted that the face class center representations are model parameters of the classification layer in the face recognition teacher model and are used to represent the face class centers. For example, if the model parameters of the classification layer in the face recognition teacher model form a matrix (A, B, C), then vector A is the face class center representation of face class a, vector B that of face class b, and vector C that of face class c.
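For illustration, if the teacher's classification layer is a fully connected layer, its weight matrix holds one row per face class and the class center representations can be read off directly; a sketch under that assumption (the function and variable names are hypothetical):

```python
import torch.nn.functional as F

def extract_class_centers(teacher_classifier):
    # teacher_classifier: torch.nn.Linear(embed_dim, num_classes); its weight is (num_classes, embed_dim)
    centers = teacher_classifier.weight.detach()
    # normalized, since the similarities used later in this method are cosine similarities
    return F.normalize(centers, dim=1)
```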
Each face class center representation is extracted from the face recognition teacher model, and the target model loss corresponding to the to-be-trained face recognition student model is generated according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation. Specifically, the model parameters of the classification layer in the face recognition teacher model are extracted as the face class center representations, and a face recognition loss, a class distribution similarity loss and a feature representation similarity loss are then generated from these representations, the sample class label and the two training face feature representations. The face recognition loss is a model loss used to guide the to-be-trained face recognition student model to improve its discrimination ability, making the inter-class similarity of the feature representations it outputs smaller than a first preset similarity threshold and the intra-class similarity larger than a second preset similarity threshold. Here the inter-class similarity is the similarity between feature representations that do not belong to the same face class, the intra-class similarity is the similarity between feature representations that belong to the same face class, and the first preset similarity threshold is not larger than the second preset similarity threshold. The class distribution similarity loss is a model loss used to guide the feature representations output by the to-be-trained face recognition student model to match the teacher model's distribution of similarities to each face class center representation; it includes losses such as the KL divergence loss. The feature representation similarity loss is a model loss used to guide the to-be-trained face recognition student model to learn the feature representations output by the face recognition teacher model, so that, for the same face image, the similarity between the student's and the teacher's output feature representations is larger than a third preset similarity threshold.
The step of generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation includes:
step S21, generating the face recognition loss based on the first similarity between the second training face feature representation and each face class center representation and the sample class label;
In this embodiment, the face recognition loss is generated based on the first similarities between the second training face feature representation and each face class center representation and on the sample class label. Specifically, the cosine similarity between the second training face feature representation and each face class center representation is calculated to obtain each first similarity; the similarity corresponding to the sample class label is selected from the first similarities as the intra-class similarity, and the remaining first similarities are used as the inter-class similarities. The face recognition loss is then calculated based on the intra-class similarity and each inter-class similarity.
Wherein the first similarities include an intra-class similarity and inter-class similarities, the step of generating the face recognition loss based on the first similarities between the second training face feature representation and each face class center representation and on the sample class label includes:
step S211, determining, in each face class center representation, an affiliated class center representation corresponding to the second training face feature representation and each non-affiliated class center representation corresponding to the second training face feature representation based on the sample class label;
in this embodiment, it should be noted that the sample class label is an identifier of a sample class to which a training face sample belongs, where one sample class corresponds to a face class center representation.
The belonging class center representation and each non-belonging class center representation corresponding to the second training face feature representation are determined among the face class center representations based on the sample class label. Specifically, using the correspondence between sample classes and face class center representations, the face class center representation corresponding to the sample class label is taken as the belonging class center representation, and the other face class center representations are taken as the non-belonging class center representations.
Step S212, calculating the similarity between the second training face feature representation and the belonging class center representation to obtain the intra-class similarity;
In this embodiment, the similarity between the second training face feature representation and the belonging class center representation is calculated to obtain the intra-class similarity; specifically, the intra-class similarity is generated by calculating the cosine value between the second training face feature representation and the belonging class center representation. It should be noted that, preferably, the second training face feature representation and the belonging class center representation are normalized before this cosine value is calculated.
Step S213, respectively calculating the similarity between the second training face feature representation and the central representation of each non-affiliated category to obtain the similarity between the categories;
In this embodiment, the similarities between the second training face feature representation and each non-belonging class center representation are calculated to obtain the inter-class similarities; specifically, each inter-class similarity is generated by calculating the cosine value between the second training face feature representation and the corresponding non-belonging class center representation. Preferably, the second training face feature representation and each non-belonging class center representation are normalized before these similarities are calculated.
Step S214, generating the face recognition loss based on the intra-class similarity and the inter-class similarity.
In this embodiment, based on the intra-class similarity and each inter-class similarity, the face recognition loss is calculated as follows:

$$L_{reg}=-\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\left(\cos\left(m_{1}\theta_{y_{i}}+m_{2}\right)-m_{3}\right)}}{e^{s\left(\cos\left(m_{1}\theta_{y_{i}}+m_{2}\right)-m_{3}\right)}+\sum_{j\neq y_{i}}e^{s\cos\theta_{j}}}$$

wherein $L_{reg}$ is the face recognition loss; N is the number of training face samples in one iteration when the to-be-trained face recognition student model is iteratively trained; s is a scale factor, here s = 64; and $m_1$, $m_2$ and $m_3$ are margins in three dimensions, with $m_1 = 1$, $m_2 = 0.5$ and $m_3 = 0$. $\cos(m_1\theta_{y_i}+m_2)$ is the intra-class similarity and $\cos(\theta_j)$ is an inter-class similarity, where $\theta_{y_i}$ is the angle between the second training face feature representation and the face class center representation corresponding to the sample class to which it belongs, and $\theta_j$ is the angle between the second training face feature representation and a face class center representation corresponding to a sample class to which it does not belong. The sample class to which the second training face feature representation belongs is the sample class of the training face sample.
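A sketch of this loss in PyTorch, assuming the margins and scale stated above (with m1 = 1, m2 = 0.5, m3 = 0 the formula reduces to an additive angular margin softmax); feat_s, centers and labels stand for the second training face feature representations, the face class center representations and the sample class labels:

```python
import torch
import torch.nn.functional as F

def face_recognition_loss(feat_s, centers, labels, s=64.0, m1=1.0, m2=0.5, m3=0.0):
    # first similarities: cosine between the student features and every class center, (N, C)
    cos = F.normalize(feat_s, dim=1) @ F.normalize(centers, dim=1).T
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))  # angles, clamped for numerical stability
    # margin-adjusted intra-class term: cos(m1 * theta_yi + m2) - m3
    target = torch.cos(m1 * theta.gather(1, labels.view(-1, 1)) + m2) - m3
    logits = cos.clone()
    logits.scatter_(1, labels.view(-1, 1), target)
    # cross-entropy over the s-scaled logits is exactly the -log softmax ratio in L_reg
    return F.cross_entropy(s * logits, labels)
```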
Step S22, generating the category distribution similarity loss based on each of the first similarities and the second similarities between the first training face feature representations and each of the face category centers;
In this embodiment, the class distribution similarity loss is generated based on each first similarity and the second similarities between the first training face feature representation and each face class center representation. Specifically, the cosine values between the first training face feature representation and each face class center representation are calculated to obtain the second similarities, and the KL divergence loss is then calculated based on the first and second similarities and taken as the class distribution similarity loss. The KL divergence loss is calculated as follows:

$$L_{kl}=\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{C}\sigma_{j}\!\left(\frac{t_{ij}}{T}\right)\log\frac{\sigma_{j}\left(t_{ij}/T\right)}{\sigma_{j}\left(s_{ij}/T\right)}$$

wherein $L_{kl}$ is the KL divergence loss; N is the number of training face samples in one iteration when the to-be-trained face recognition student model is iteratively trained; C is the number of face class center representations; $t_{ij}$ is the second similarity and $s_{ij}$ is the first similarity; $\sigma_j$ denotes the softmax taken over the C face classes; and T is a temperature variable used to map the first and second similarities to a preset value range.
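A sketch of this loss, assuming the standard temperature-softened softmax over the C class similarities; t_sim holds the second similarities t_ij (teacher side) and s_sim the first similarities s_ij (student side), both of shape (N, C):

```python
import torch.nn.functional as F

def class_distribution_loss(t_sim, s_sim, T=4.0):  # T = 4.0 is a hypothetical temperature value
    p_teacher = F.softmax(t_sim / T, dim=1)          # softened teacher class distribution
    log_p_student = F.log_softmax(s_sim / T, dim=1)  # softened student class distribution (log)
    # KL(teacher || student), averaged over the N training face samples
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")
```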
Step S23, generating the feature representation similarity loss based on the first training face feature representation and the second training face feature representation;
In this embodiment, the feature representation similarity loss is generated based on the first training face feature representation and the second training face feature representation. Specifically, the first and second training face feature representations are substituted into a preset feature representation similarity loss calculation formula, which is as follows:

$$L_{cos}=\frac{1}{N}\sum_{i=1}^{N}\left(1-\frac{feat\_t_{i}\cdot feat\_s_{i}}{\lVert feat\_t_{i}\rVert\,\lVert feat\_s_{i}\rVert}\right)$$

wherein $L_{cos}$ is the feature representation similarity loss; N is the number of training face samples in one iteration when the to-be-trained face recognition student model is iteratively trained; $feat\_t$ is the first training face feature representation and $feat\_s$ is the second training face feature representation.
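A sketch of this loss: for each training face sample it penalizes the gap between the cosine similarity of feat_t and feat_s and its maximum value of 1:

```python
import torch.nn.functional as F

def feature_similarity_loss(feat_t, feat_s):
    # 1 - cos(feat_t_i, feat_s_i), averaged over the N samples in the batch
    return (1.0 - F.cosine_similarity(feat_t, feat_s, dim=1)).mean()
```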
Step S24, generating the target model loss based on the face recognition loss, the category distribution similarity loss, and the feature representation similarity loss.
In this embodiment, the target model loss is generated based on the face recognition loss, the class distribution similarity loss and the feature representation similarity loss. Specifically, the three losses are weighted and summed to obtain the target model loss:

$$Loss=\alpha L_{reg}+\beta L_{kl}+\gamma L_{cos}$$

wherein Loss is the target model loss; $L_{reg}$ is the face recognition loss; $L_{kl}$ is the class distribution similarity loss; $L_{cos}$ is the feature representation similarity loss; and α, β and γ are the weighting weights.
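Wiring the pieces together, a sketch of computing the target model loss for one batch with the helper functions sketched above; the weights alpha, beta, gamma and the temperature T are hypothetical values:

```python
import torch.nn.functional as F

def compute_target_loss(teacher_model, student_model, centers, images, labels,
                        alpha=1.0, beta=1.0, gamma=1.0, T=4.0):
    feat_t, feat_s = extract_training_features(teacher_model, student_model, images)
    centers_n = F.normalize(centers, dim=1)
    s_sim = F.normalize(feat_s, dim=1) @ centers_n.T  # first similarities (student vs. centers)
    t_sim = F.normalize(feat_t, dim=1) @ centers_n.T  # second similarities (teacher vs. centers)
    l_reg = face_recognition_loss(feat_s, centers, labels)
    l_kl = class_distribution_loss(t_sim, s_sim, T)
    l_cos = feature_similarity_loss(feat_t, feat_s)
    return alpha * l_reg + beta * l_kl + gamma * l_cos  # Loss = αLreg + βLkl + γLcos
```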
Wherein the step of generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation further comprises:
step A10, generating the face recognition loss based on the first similarity between the second training face feature representation and each face class center representation and the sample class label;
In this embodiment, the face recognition loss is generated based on the first similarities between the second training face feature representation and each face class center representation and on the sample class label. Specifically, the cosine similarity between the second training face feature representation and each face class center representation is calculated to obtain each first similarity; the similarity corresponding to the sample class label is selected from the first similarities as the intra-class similarity, and the remaining first similarities are used as the inter-class similarities; the face recognition loss is then calculated based on the intra-class similarity and each inter-class similarity. The calculation of the face recognition loss is specifically described in steps S211 to S214 and is not repeated here.
Step A20, generating the category distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face category center;
In this embodiment, the class distribution similarity loss is generated based on each first similarity and each second similarity between the first training face feature representation and each face class center representation. Specifically, the cosine values between the first training face feature representation and each face class center representation are calculated to obtain the second similarities; the KL divergence loss is then calculated from the first and second similarities and taken as the class distribution similarity loss. The method for calculating the KL divergence loss is described in step S22 and is not repeated here.
Step A30, generating the target model loss based on the face recognition loss and the category distribution similarity loss.
In this embodiment, the target model loss is generated based on the face recognition loss and the class distribution similarity loss; specifically, the two losses are weighted and summed to obtain the target model loss, and the to-be-trained face recognition student model is then optimized based on the target model loss. This achieves combined distillation of the face recognition teacher model, based on the face recognition loss and the class distribution similarity loss together with the teacher model's face class center representations. Compared with direct model distillation, the to-be-trained student model can learn the teacher model's knowledge more efficiently and more completely, the face recognition accuracy of the compression model obtained after the iterative training is higher, and the effect of compressing the face recognition model is improved.
And step S30, based on the target model loss, optimizing the to-be-trained face recognition student model to obtain a face recognition compression model corresponding to the face recognition teacher model.
In this embodiment, the to-be-trained face recognition student model is optimized based on the target model loss to obtain the face recognition compression model corresponding to the face recognition teacher model. Specifically, the student model is optimized based on the model gradient corresponding to the target model loss, and whether the optimized student model meets a preset iterative-training end condition is determined. If so, the student model is taken as the face recognition compression model corresponding to the teacher model; if not, the method returns to the step of acquiring a training face sample and the corresponding sample class label. The preset iterative-training end condition includes reaching a maximum iteration count threshold, model loss convergence, and the like.
It should be noted that, because the number of classification classes involved in a face recognition model is extremely high, if the to-be-trained face recognition student model were optimized based on the class distribution similarity loss alone, it would be difficult for it to learn the teacher model's knowledge comprehensively, and hence difficult for it to converge during iterative training. The embodiment of the present application instead optimizes the student model with the target model loss generated by combining the face recognition loss, the class distribution similarity loss and the feature representation similarity loss, so the student model converges faster during iterative training and can learn the teacher model's knowledge comprehensively. Compared with direct model distillation, the student model therefore learns the teacher model's knowledge more efficiently and more completely, the face recognition accuracy of the compression model obtained after iterative training is higher, and the effect of compressing the face recognition model is improved. Fig. 2 is a schematic flow chart of iteratively training the to-be-trained face recognition student model in the embodiment of the present application, wherein I is the training face sample, Label is the sample class label, teacher model is the face recognition teacher model, feat_t is the first training face feature representation, feat_s is the second training face feature representation, student model is the to-be-trained face recognition student model, class center weight is the center representation of each face class, Lreg is the face recognition loss, Lkl is the KL divergence loss, Lcos is the feature representation similarity loss, total loss function is the target model loss, and model parameters are the model parameters of the to-be-trained face recognition student model.
The step of optimizing the to-be-trained face recognition student model based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model comprises the following steps of:
step S31, updating the model parameters of the to-be-trained face recognition student model based on the model gradient corresponding to the target model loss, and judging whether the updated to-be-trained face recognition student model meets the preset iterative training end condition;
In this embodiment, the model parameters of the to-be-trained face recognition student model are updated based on the model gradient corresponding to the target model loss, and whether the updated student model meets the preset iterative-training end condition is determined. Specifically, the model gradient of the student model is calculated from the target model loss, the model parameters are updated with a preset parameter-update method based on this gradient, and whether the student model with updated parameters meets the preset iterative-training end condition is then determined. The preset parameter-update method includes gradient descent, gradient ascent, and the like.
Step S32, if yes, the to-be-trained face recognition student model is used as the face recognition compression model;
and step S33, if not, returning to the step of obtaining the training face sample and the sample class label corresponding to the training face sample.
In this embodiment, if so, the to-be-trained face recognition student model is taken as the face recognition compression model; if not, the method returns to the step of acquiring a training face sample and the corresponding sample class label, and iterative training of the student model continues until it meets the preset iterative-training end condition.
After the step of optimizing the to-be-trained face recognition student model based on the target model loss to obtain the face recognition compression model corresponding to the face recognition teacher model, the face recognition model compression method further includes:
step S40, acquiring a face image to be recognized, and extracting the features of the face image to be recognized based on the face recognition compression model to obtain a target face feature representation;
In this embodiment, a face image to be recognized is acquired, and feature extraction is performed on it based on the face recognition compression model to obtain the target face feature representation. Specifically, the face image to be recognized is acquired and input into the face recognition compression model, which performs feature extraction and maps the image to a vector of a preset dimension, the target face feature representation.
And step S50, generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the feature representation corresponding to each pre-recorded face image.
In this embodiment, it should be noted that a pre-recorded face is a face entered in advance, used for comparison with the input face to be recognized in order to realize face recognition.
A face recognition result corresponding to the face image to be recognized is generated by calculating the similarity between the target face feature representation and the feature representation corresponding to each pre-recorded face image. Specifically, the cosine similarity between the target face feature representation and each pre-recorded face's feature representation is calculated, and the face identity label of the pre-recorded face image with the highest cosine similarity is selected as the face recognition result for the face image to be recognized. The face identity label is an identifier of the identity of a face, such as an identity card number or a mobile phone number. Because the face recognition compression model is generated by jointly distilling the face recognition teacher model based on the face recognition loss, the class distribution similarity loss and the feature representation similarity loss, in combination with the teacher model's face class center representations, the to-be-trained student model learns the teacher model's knowledge more efficiently and more completely, and the accuracy of face recognition is improved.
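A sketch of steps S40 and S50: embed the face image to be recognized with the compression model and return the identity label of the most similar pre-recorded face; enrolled_feats and enrolled_ids are hypothetical names for the pre-recorded feature representations and their face identity labels:

```python
import torch
import torch.nn.functional as F

def recognize(compression_model, image, enrolled_feats, enrolled_ids):
    compression_model.eval()
    with torch.no_grad():
        target = compression_model(image.unsqueeze(0))         # target face feature representation, (1, D)
    sims = F.cosine_similarity(target, enrolled_feats, dim=1)  # similarity to each pre-recorded face, (M,)
    return enrolled_ids[sims.argmax().item()]                  # e.g. an identity card or phone number
```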
The embodiment of the present application provides a face recognition model compression method. Compared with the prior-art technique of compressing a face recognition model by pruning a trained model, the embodiment first acquires a training face sample and the corresponding sample class label, and performs feature extraction on the training face sample based on a face recognition teacher model and a to-be-trained face recognition student model respectively, obtaining the first and second training face feature representations. It then extracts each face class center representation from the teacher model and generates the target model loss according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation, thereby sharing the teacher model's class center representations directly with the to-be-trained student model. Because the teacher model is a complex model, optimizing the student model directly with a loss built on the teacher's class center representations lets the student learn the teacher's feature distribution on the basis of those representations. On the premise that the to-be-trained student model is a lightweight model, this improves the efficiency and accuracy with which it learns the teacher model's knowledge, yields the face recognition compression model corresponding to the teacher model, and improves the accuracy of the compression model as a lightweight model. It thereby overcomes the technical defects that, with a high pruning ratio, the accuracy of the pruned compressed model suffers and, with a low pruning ratio, the compressed model remains complex and inconvenient to deploy, so the compression effect is poor; the compression effect of the face recognition model is thus improved.
Further, referring to fig. 3, in another embodiment of the present application, the face recognition method is applied to a face recognition device, and the face recognition method includes:
step B10, acquiring a face image to be recognized, and performing feature extraction on the face image to be recognized based on a face recognition compression model optimized with each face class center representation of a face recognition teacher model, to obtain a target face feature representation;
In this embodiment, it should be noted that the face recognition compression model is generated by directly sharing the class center representations of the face recognition teacher model with the to-be-trained face recognition student model. Because the teacher model is a complex model, the compression model thereby learns the teacher model's knowledge on the basis of its class center representations, so that, on the premise that the to-be-trained student model is a lightweight model, the efficiency and accuracy with which it learns the teacher model's knowledge are improved.
A face image to be recognized is acquired, and feature extraction is performed on it based on the face recognition compression model optimized with each face class center representation of the face recognition teacher model, to obtain the target face feature representation. Specifically, the face image to be recognized is acquired and mapped, through feature extraction by the compression model, to a vector of a preset dimension, the target face feature representation.
Before the step of performing feature extraction on the face image to be recognized based on the face recognition compression model optimized with each face class center representation of the face recognition teacher model to obtain the target face feature representation, the face recognition method further includes:
step C10, acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively extracting the features of the training face sample based on the face recognition teacher model and the face recognition student model to be trained to obtain a first training face feature representation and a second training face feature representation;
In this embodiment, a training face sample and the corresponding sample class label are acquired, and feature extraction is performed on the training face sample based on the face recognition teacher model and the to-be-trained face recognition student model respectively, to obtain the first and second training face feature representations. Specifically, the training face sample, a face image pixel matrix representing a face image, and its sample class label are acquired; feature extraction is performed on the pixel matrix based on the teacher model, mapping it to a vector of a preset dimension and obtaining the first training face feature representation, and feature extraction is likewise performed based on the student model, obtaining the second training face feature representation. A specific explanation of step C10 can be found in step S10 and is not repeated here.
Step C20, generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
In this embodiment, the target model loss corresponding to the to-be-trained face recognition student model is generated according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation. Specifically, a face recognition loss, a class distribution similarity loss and a feature representation similarity loss are generated from these inputs, and the three losses are then weighted and summed to obtain the target model loss. The specifics of generating the target model loss are described in step S20, steps S21 to S24 and steps A10 to A30, and are not repeated here.
The step of generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation includes:
step C21, generating the face recognition loss based on the first similarity between the second training face feature representation and each face class center representation and the sample class label;
In this embodiment, the face recognition loss is generated based on the first similarities between the second training face feature representation and each face class center representation and on the sample class label. Specifically, the cosine similarity between the second training face feature representation and each face class center representation is calculated to obtain each first similarity; the similarity corresponding to the sample class label is selected as the intra-class similarity, the remaining first similarities are used as inter-class similarities, and the face recognition loss is then calculated from the intra-class similarity and each inter-class similarity. The specifics of calculating the face recognition loss are described in step S21 and steps S211 to S214 and are not repeated here.
Step C22, generating the category distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face category center;
In this embodiment, the class distribution similarity loss is generated based on each first similarity and each second similarity between the first training face feature representation and each face class center representation. Specifically, the cosine values between the first training face feature representation and each face class center representation are calculated to obtain the second similarities, and the KL divergence loss is then calculated from the first and second similarities and taken as the class distribution similarity loss. The specifics of calculating the KL divergence loss are described in step S22 and are not repeated here.
Step C23, generating the feature representation similarity loss based on the first training face feature representation and the second training face feature representation;
In this embodiment, the first and second training face feature representations are substituted into the preset feature representation similarity loss calculation formula to calculate the feature representation similarity loss. The specifics of calculating the feature representation similarity loss are described in step S23 and are not repeated here.
And step C24, generating the target model loss based on the face recognition loss, the category distribution similarity loss and the feature representation similarity loss.
In this embodiment, the target model loss is generated based on the face recognition loss, the class distribution similarity loss, and the feature representation similarity loss, and specifically, the face recognition loss, the class distribution similarity loss, and the feature representation similarity loss are weighted and summed to obtain the target model loss.
And step C30, based on the target model loss, optimizing the to-be-trained face recognition student model to obtain a face recognition compression model corresponding to the face recognition teacher model.
In this embodiment, the to-be-trained face recognition student model is optimized based on the target model loss to obtain the face recognition compression model corresponding to the face recognition teacher model. Specifically, the student model is optimized based on the model gradient corresponding to the target model loss, and whether the optimized student model meets the preset iterative-training end condition is determined; if so, the student model is taken as the face recognition compression model, and if not, the method returns to the step of acquiring a training face sample and the corresponding sample class label. The preset iterative-training end condition includes reaching a maximum iteration count threshold, model loss convergence, and the like.
And step B20, generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face.
In this embodiment, the face recognition result corresponding to the face image to be recognized is generated by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face. Specifically, the cosine similarity between the target face feature representation and the feature representation corresponding to each pre-recorded face is calculated, and the face identification tag of the pre-recorded face image with the highest cosine similarity is selected as the face recognition result, where a face identification tag is an identifier representing the identity of a face, such as an identity card number or a mobile phone number. Because the face recognition compression model is generated by jointly distilling the face recognition teacher model, combining the face class center representations of the face recognition teacher model with the face recognition loss, the class distribution similarity loss, and the feature representation similarity loss, the to-be-trained face recognition student model can learn the knowledge of the face recognition teacher model more efficiently and more completely, which improves the accuracy of face recognition.
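A sketch of this matching step follows; the gallery tensors and tags stand in for the pre-recorded face data, which the patent assumes to exist but does not specify at this level of detail.

```python
def recognize(face_image, model, gallery_feats, gallery_tags):
    # face_image:    (C, H, W) tensor for the face image to be recognized
    # gallery_feats: (N, D) feature representations of pre-recorded faces
    # gallery_tags:  list of N face identification tags (e.g. ID numbers)
    with torch.no_grad():
        query = model(face_image.unsqueeze(0))        # target face feature representation, (1, D)
    sims = F.cosine_similarity(query, gallery_feats)  # similarity to each pre-recorded face
    return gallery_tags[int(sims.argmax())]           # tag of the most similar face
```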
The embodiment of the application thus provides a face recognition method: a face image to be recognized is first obtained; feature extraction is then performed on the face image to be recognized based on a face recognition compression model optimized with each face class center representation of a face recognition teacher model, obtaining a target face feature representation; and a face recognition result corresponding to the face image to be recognized is generated by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face. The face recognition compression model is a model with a lightweight network structure, generated by jointly distilling the face recognition teacher model based on the face recognition loss, the class distribution similarity loss, and the feature representation similarity loss. The to-be-trained face recognition student model can therefore learn the knowledge of the face recognition teacher model, which has a complex network structure, more efficiently, improving the accuracy of the lightweight face recognition compression model and, in turn, the accuracy of face recognition.
Referring to fig. 4, fig. 4 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 4, the face recognition model compression device may include: a processor 1001 (e.g., a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM or a non-volatile memory (e.g., disk storage), and may alternatively be a storage device independent of the aforementioned processor 1001.
Optionally, the face recognition model compression device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like. The user interface may include a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and may optionally also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
It will be appreciated by those skilled in the art that the configuration of the face recognition model compression device shown in FIG. 4 does not constitute a limitation of the face recognition model compression device; the device may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in fig. 4, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, and a face recognition model compression program. The operating system is a program that manages and controls the hardware and software resources of the face recognition model compression device, and supports the operation of the face recognition model compression program as well as other software and/or programs. The network communication module is used to implement communication between the various components within the memory 1005, as well as communication with other hardware and software in the face recognition model compression system.
In the face recognition model compression apparatus shown in fig. 4, the processor 1001 is configured to execute a face recognition model compression program stored in the memory 1005, and implement the steps of the face recognition model compression method described in any one of the above.
The specific implementation of the face recognition model compression device of the present application is substantially the same as that of each embodiment of the face recognition model compression method, and is not described herein again.
Referring to fig. 5, fig. 5 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 5, the face recognition device may include: a processor 1001 (e.g., a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM or a non-volatile memory (e.g., disk storage), and may alternatively be a storage device independent of the aforementioned processor 1001.
Optionally, the face recognition device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like. The user interface may include a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and may optionally also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
Those skilled in the art will appreciate that the face recognition device configuration shown in fig. 5 does not constitute a limitation of the face recognition device; the device may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in fig. 5, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, and a face recognition program. The operating system is a program that manages and controls the hardware and software resources of the face recognition device, supporting the operation of the face recognition program as well as other software and/or programs. The network communication module is used to enable communication between the various components within the memory 1005, as well as with other hardware and software in the face recognition system.
In the face recognition apparatus shown in fig. 5, the processor 1001 is configured to execute a face recognition program stored in the memory 1005, and implement the steps of the face recognition method according to any one of the above.
The specific implementation of the face recognition device of the present application is basically the same as that of the above-mentioned face recognition method, and is not described herein again.
The embodiment of the present application further provides a face recognition model compression apparatus. The face recognition model compression apparatus is applied to a face recognition model compression device and includes:
the feature extraction module is used for acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively performing feature extraction on the training face sample based on a face recognition teacher model and a face recognition student model to be trained to obtain a first training face feature representation and a second training face feature representation;
the loss calculation module is used for extracting each face class center representation from the face recognition teacher model and generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
and the model optimization module is used for optimizing the to-be-trained face recognition student model based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model.
Optionally, the loss calculation module is further configured to:
generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label;
generating the class distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face class center representation;
generating the feature representation similarity loss based on the first training face feature representation and the second training face feature representation;
and generating the target model loss based on the face recognition loss, the class distribution similarity loss and the feature representation similarity loss.
Optionally, the loss calculation module is further configured to:
determining, based on the sample class label, the belonging-class center representation corresponding to the second training face feature representation and each corresponding non-belonging-class center representation among each face class center representation;
calculating the similarity between the second training face feature representation and the belonging-class center representation to obtain the intra-class similarity;
respectively calculating the similarity between the second training face feature representation and each non-belonging-class center representation to obtain each inter-class similarity;
and generating the face recognition loss based on the intra-class similarity and each inter-class similarity.
Optionally, the loss calculation module is further configured to:
generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label;
generating the class distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face class center representation;
and generating the target model loss based on the face recognition loss and the class distribution similarity loss.
Optionally, the model optimization module is further configured to:
updating model parameters of the face recognition student model to be trained based on the model gradient corresponding to the target model loss, and judging whether the updated face recognition student model to be trained meets a preset iterative training end condition;
if so, taking the to-be-trained face recognition student model as the face recognition compression model;
and if not, returning to the step of obtaining the training face sample and the sample class label corresponding to the training face sample.
Optionally, the face recognition model compression apparatus is further configured to:
acquiring a face image to be recognized, and extracting the features of the face image to be recognized based on the face recognition compression model to obtain a target face feature representation;
and generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the feature representation corresponding to each pre-recorded face image.
The specific implementation of the face recognition model compression apparatus of the present application is substantially the same as that of each embodiment of the face recognition model compression method, and is not described herein again.
The embodiment of the present application further provides a face recognition apparatus. The face recognition apparatus is applied to a face recognition device and includes:
the feature extraction module is used for acquiring a face image to be recognized, and performing feature extraction on the face image to be recognized based on a face recognition compression model optimized with each face class center representation of a face recognition teacher model, to obtain a target face feature representation;
and the generating module is used for generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face.
Optionally, the face recognition apparatus is further configured to:
acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively performing feature extraction on the training face sample based on the face recognition teacher model and the face recognition student model to be trained to obtain a first training face feature representation and a second training face feature representation;
generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
and optimizing the face recognition student model to be trained based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model.
Optionally, the face recognition apparatus is further configured to:
generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label;
generating the class distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face class center representation;
generating the feature representation similarity loss based on the first training face feature representation and the second training face feature representation;
and generating the target model loss based on the face recognition loss, the class distribution similarity loss and the feature representation similarity loss.
The specific implementation of the face recognition device of the present application is basically the same as that of the above-mentioned face recognition method, and is not described herein again.
The present application provides a medium, the medium being a readable storage medium, and the readable storage medium stores one or more programs that are executable by one or more processors to implement the steps of the face recognition model compression method described in any one of the above.
The specific implementation of the readable storage medium of the present application is substantially the same as that of each embodiment of the face recognition model compression method, and is not described herein again.
The present application provides a medium, the medium being a readable storage medium, and the readable storage medium stores one or more programs that are executable by one or more processors to implement the steps of the face recognition method described in any one of the above.
The specific implementation of the readable storage medium of the present application is substantially the same as that of each embodiment of the face recognition method, and is not described herein again.
The present application provides a computer program product, the computer program product including one or more computer programs that are executable by one or more processors to implement the steps of the face recognition model compression method described in any one of the above.
The specific implementation of the computer program product of the present application is substantially the same as that of each embodiment of the face recognition model compression method, and is not described herein again.
The present application provides a computer program product, the computer program product including one or more computer programs that are executable by one or more processors to implement the steps of the face recognition method described in any one of the above.
The specific implementation of the computer program product of the present application is substantially the same as that of each embodiment of the face recognition method, and is not described herein again.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims (14)

1. A face recognition model compression method is characterized by comprising the following steps:
acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively performing feature extraction on the training face sample based on a face recognition teacher model and a face recognition student model to be trained to obtain a first training face feature representation and a second training face feature representation;
extracting each face class center representation from the face recognition teacher model, and generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
and optimizing the face recognition student model to be trained based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model.
2. The method for compressing a face recognition model according to claim 1, wherein the step of generating a target model loss corresponding to the student model for face recognition to be trained according to each of the face class center representation, the sample class label, the first training face feature representation and the second training face feature representation comprises:
generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label;
generating the class distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face class center representation;
generating the feature representation similarity loss based on the first training face feature representation and the second training face feature representation;
and generating the target model loss based on the face recognition loss, the class distribution similarity loss and the feature representation similarity loss.
3. The face recognition model compression method of claim 2, wherein the first similarity includes an intra-class similarity and an inter-class similarity,
the step of generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label comprises:
determining, based on the sample class label, the belonging-class center representation corresponding to the second training face feature representation and each corresponding non-belonging-class center representation among each face class center representation;
calculating the similarity between the second training face feature representation and the belonging-class center representation to obtain the intra-class similarity;
respectively calculating the similarity between the second training face feature representation and each non-belonging-class center representation to obtain each inter-class similarity;
and generating the face recognition loss based on the intra-class similarity and each inter-class similarity.
4. The method for compressing a face recognition model according to claim 1, wherein the step of generating a target model loss corresponding to the student model for face recognition to be trained according to each of the face class center representation, the sample class label, the first training face feature representation and the second training face feature representation further comprises:
generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label;
generating the class distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face class center representation;
and generating the target model loss based on the face recognition loss and the class distribution similarity loss.
5. The face recognition model compression method of claim 1, wherein the step of optimizing the face recognition student model to be trained based on the target model loss to obtain the face recognition compression model corresponding to the face recognition teacher model comprises:
updating model parameters of the face recognition student model to be trained based on the model gradient corresponding to the target model loss, and judging whether the updated face recognition student model to be trained meets a preset iterative training end condition;
if so, taking the to-be-trained face recognition student model as the face recognition compression model;
and if not, returning to the step of obtaining the training face sample and the sample class label corresponding to the training face sample.
6. The face recognition model compression method of claim 1, wherein after the step of optimizing the face recognition student model to be trained based on the target model loss to obtain the face recognition compression model corresponding to the face recognition teacher model, the face recognition model compression method further comprises:
acquiring a face image to be recognized, and extracting the features of the face image to be recognized based on the face recognition compression model to obtain a target face feature representation;
and generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the feature representation corresponding to each pre-recorded face image.
7. A face recognition method is characterized by comprising the following steps:
acquiring a face image to be recognized, and performing feature extraction on the face image to be recognized based on a face recognition compression model optimized with each face class center representation of a face recognition teacher model, to obtain a target face feature representation;
and generating a face recognition result corresponding to the face image to be recognized by calculating the similarity between the target face feature representation and the face feature representation corresponding to each pre-recorded face.
8. The face recognition method of claim 7, wherein before the step of performing feature extraction on the face image to be recognized, based on the face recognition compression model optimized with each face class center representation of the face recognition teacher model, to obtain the target face feature representation, the face recognition method further comprises:
acquiring a training face sample and a sample class label corresponding to the training face sample, and respectively performing feature extraction on the training face sample based on the face recognition teacher model and the face recognition student model to be trained to obtain a first training face feature representation and a second training face feature representation;
generating a target model loss corresponding to the to-be-trained face recognition student model according to each face class center representation, the sample class label, the first training face feature representation and the second training face feature representation;
and optimizing the face recognition student model to be trained based on the target model loss to obtain a face recognition compression model corresponding to the face recognition teacher model.
9. The face recognition method of claim 8, wherein the step of generating a target model loss corresponding to the student model for face recognition to be trained according to each of the face class center representation, the sample class label, the first training face feature representation and the second training face feature representation comprises:
generating the face recognition loss based on a first similarity between the second training face feature representation and each face class center representation and the sample class label;
generating the class distribution similarity loss based on each first similarity and a second similarity between the first training face feature representation and each face class center representation;
generating the feature representation similarity loss based on the first training face feature representation and the second training face feature representation;
and generating the target model loss based on the face recognition loss, the class distribution similarity loss and the feature representation similarity loss.
10. A face recognition model compression device, the face recognition model compression device comprising: a memory, a processor and a program stored on the memory for implementing the face recognition model compression method,
the memory is used for storing a program for realizing the face recognition model compression method;
the processor is configured to execute a program implementing the face recognition model compression method to implement the steps of the face recognition model compression method according to any one of claims 1 to 6.
11. A face recognition apparatus characterized by comprising: a memory, a processor and a program stored on the memory for implementing the face recognition method,
the memory is used for storing a program for realizing the face recognition method;
the processor is configured to execute a program implementing the face recognition method to implement the steps of the face recognition method according to any one of claims 7 to 9.
12. A medium which is a readable storage medium, characterized in that the readable storage medium has stored thereon a program for implementing a face recognition model compression method, the program for implementing the face recognition model compression method being executed by a processor to implement the steps of the face recognition model compression method according to any one of claims 1 to 6.
13. A medium which is a readable storage medium, characterized in that the readable storage medium has stored thereon a program for implementing a face recognition method, the program being executed by a processor to implement the steps of the face recognition method according to any one of claims 7 to 9.
14. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, carries out the steps of the face recognition model compression method according to one of claims 1 to 6 or the steps of the face recognition method according to one of claims 7 to 9.
CN202110617959.XA 2021-06-03 2021-06-03 Face recognition model compression method, device, medium, and computer program product Pending CN113361384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110617959.XA CN113361384A (en) 2021-06-03 2021-06-03 Face recognition model compression method, device, medium, and computer program product

Publications (1)

Publication Number Publication Date
CN113361384A true CN113361384A (en) 2021-09-07

Family

ID=77531616


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190034764A1 (en) * 2017-07-31 2019-01-31 Samsung Electronics Co., Ltd. Method and apparatus for generating training data to train student model using teacher model
CN110674688A (en) * 2019-08-19 2020-01-10 深圳力维智联技术有限公司 Face recognition model acquisition method, system and medium for video monitoring scene
WO2021047185A1 (en) * 2019-09-12 2021-03-18 深圳壹账通智能科技有限公司 Monitoring method and apparatus based on facial recognition, and storage medium and computer device
CN111062489A (en) * 2019-12-11 2020-04-24 北京知道智慧信息技术有限公司 Knowledge distillation-based multi-language model compression method and device
CN111241992A (en) * 2020-01-08 2020-06-05 科大讯飞股份有限公司 Face recognition model construction method, recognition method, device, equipment and storage medium
CN111259738A (en) * 2020-01-08 2020-06-09 科大讯飞股份有限公司 Face recognition model construction method, face recognition method and related device
CN112184508A (en) * 2020-10-13 2021-01-05 上海依图网络科技有限公司 Student model training method and device for image processing
CN112329619A (en) * 2020-11-04 2021-02-05 济南博观智能科技有限公司 Face recognition method and device, electronic equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
康凯; 赵志岩; 范英; 施一琳; 宰旭昕: "Research on large-scale face recognition training algorithms for population information applications" (面向人口信息应用的大规模人脸识别训练算法研究), China Security & Protection Technology and Application (中国安全防范技术与应用), no. 04, 31 August 2020 (2020-08-31) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610069A (en) * 2021-10-11 2021-11-05 北京文安智能技术股份有限公司 Knowledge distillation-based target detection model training method
CN113947801A (en) * 2021-12-21 2022-01-18 中科视语(北京)科技有限公司 Face recognition method and device and electronic equipment
CN113947801B (en) * 2021-12-21 2022-07-26 中科视语(北京)科技有限公司 Face recognition method and device and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination