CN114267060A - Face age identification method and system based on uncertain suppression network model - Google Patents

Face age identification method and system based on uncertain suppression network model Download PDF

Info

Publication number
CN114267060A
Authority
CN
China
Prior art keywords
uncertainty
network model
age
face
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111376110.4A
Other languages
Chinese (zh)
Inventor
张利军
徐勇
曹士平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN202111376110.4A priority Critical patent/CN114267060A/en
Publication of CN114267060A publication Critical patent/CN114267060A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face age identification method and system based on an uncertainty suppression network model. The face age identification method comprises the following steps: a training set acquisition step, preprocessing face images containing age labels to obtain training images with label distributions; a model training step, inputting the training images into the uncertainty suppression network model for iterative training until the prediction accuracy of the model on the verification set no longer rises, and obtaining the weight file of the trained uncertainty suppression network model; and an age prediction step, identifying the face image to be identified with the trained uncertainty suppression network model weight file to obtain a face age prediction result. The model uses a lightweight Resnet network as the backbone to extract image features, uses an L-Net branch network to eliminate the influence of the facial contour on age prediction, and at the same time calculates the uncertainty of each image to reduce the interference of uncertain data with age prediction.

Description

Face age identification method and system based on uncertain suppression network model
Technical Field
The invention belongs to the field of face attribute identification, and particularly relates to a face age identification method and system based on an uncertain suppression network model.
Background
Traditional methods divide face age estimation into two stages, feature extraction and age estimation. The feature extraction stage relies on manually designed features, which have poor stability and low robustness and are extremely sensitive to changes in illumination, pose and expression in natural scenes; the age estimation stage uses classification or regression models whose discriminative power over such features is limited. In recent years, the rise of deep neural networks has advanced the computer vision field to a new level. Related research shows that convolutional neural networks construct high-level semantic features of images in a way that hand-crafted features cannot match; moreover, their feature extraction capability adapts to different visual scenes and generalizes well, and the task of face age estimation is no exception.
Deep learning models are difficult to deploy in practical applications because of their large weight files and computational complexity, while lightweight networks such as MobileNet do not perform well enough in face attribute recognition, especially in the age estimation task; effectively reducing model size and computational complexity while preserving accuracy has therefore become a hot spot in age estimation. In addition, the particularity of age labels and the large individual differences in facial aging make the learning process difficult and the accuracy in practical applications low; avoiding the influence of individual differences and constructing a distinct age label distribution for each individual are both hard problems.
Disclosure of Invention
Aiming at the above problems, the invention provides a face age identification method, system and storage medium based on an uncertainty suppression network model, which can effectively reduce the weight and computational complexity of the recognition model and improve the accuracy of face age recognition.
The invention provides a face age identification method based on an uncertain suppression network model, which comprises the following steps:
a training set acquisition step, namely preprocessing a face image containing an age label to obtain a training image with label distribution;
a model training step, inputting a training image into the uncertainty suppression network model for iterative training until the prediction accuracy of the uncertainty suppression network model on the verification set does not rise any more in the iterative process, and obtaining a weight file of the trained uncertainty suppression network model;
an age prediction step, namely identifying a face image to be identified by using a trained uncertainty suppression network model weight file to obtain a face age prediction result;
the uncertainty suppression network model comprises a lightweight Resnet network for image feature extraction and an L-Net branch network for eliminating facial contour differences based on face key point information, and the model training step comprises the following steps:
calculating a smooth first-order regular loss function value L by using a global pooling feature vector obtained by a lightweight Resnet network and a facial contour feature vector obtained by an L-Net branch networksa
Calculating KLD loss function value L by utilizing predicted probability density distribution and label distribution obtained by uncertainty suppression network modelr
Calculating smooth first-order regular loss function value L by using prediction result obtained by uncertainty suppression network model and real agea
Using two smoothed first order canonical loss function values Ls、LaAnd a KLD loss function value LrAnd carrying out back propagation to update the weight file of the uncertainty suppression network model.
According to some embodiments of the invention, the uncertainty suppression network model further comprises an image uncertainty evaluation module based on a batch attention mechanism. The image uncertainty evaluation module comprises a fully connected layer, a batch pooling layer and a normalization layer; the fully connected layer performs a global feature conversion on each training image. The module calculates an uncertainty score for each training image sample with a Query-Key matching mechanism: the Query is obtained by averaging the global pooling features of all training images in the batch, and each sample corresponds to a Key value vector obtained through a fully connected layer whose input and output dimensions are 512. The image uncertainty evaluation module mainly reduces the interference brought to the network by uncertain images during training and accelerates the convergence of the network.
According to some embodiments of the present invention, the preprocessing the face image containing the age tag includes:
label coding, converting the age label into a label distribution: specifically, the label distribution of the face age is assumed in advance to conform to a normal distribution, the mean of the normal distribution is set to the real age label, the variance is set a priori, and the age label is converted from a specific numerical value into the corresponding label distribution; this smooths the change process of face age while effectively learning the correlation among age labels;
data enhancement: the key point detection tool is used for obtaining key point information and pupil coordinates of the face, and the face alignment is carried out according to the pupil coordinates.
According to some embodiments of the invention, before the model training step, the weight file pre-trained on MS-Celeb-1M data is loaded into the lightweight Resnet network, and Kaiming Normal is used to initialize the weights of the L-Net branch network and the image uncertainty evaluation module.
According to some embodiments of the invention, the lightweight Resnet network is based on a Resnet18 network and is improved, including:
resizing the input image to 112 x 112;
reducing the repetition times of all the residual blocks in the Resnet18 network to 1;
the number of channels of all convolutional layers in the Resnet18 network is reduced to 1/2.
According to some embodiments of the invention, the L-Net branch network is composed of a fully connected network and an orthogonal separation plane. The L-Net branch network uses the coordinates of face key points as input features; the orthogonal separation plane is formed from the feature vector extracted from the face key point coordinates and the feature vector extracted by the lightweight Resnet network, and is a dot product operation between the two feature vectors. The beneficial effect of the L-Net branch network is that the facial contour can be constructed from the key point auxiliary information, so that differences in facial contour among individuals can be learned and this differential information eliminated.
According to some embodiments of the invention, the KLD loss function value calculating step is as follows:
calculating KLD loss function values between the prediction probability density distribution and the label distribution of the uncertain suppression network model in a first iteration;
recording the prediction result of the uncertain suppression network model in each iteration process;
and calculating a first KLD loss function value between the uncertain suppression network model prediction probability density distribution and the label distribution and a second KLD loss function value between the uncertain suppression network model prediction probability density distribution and the uncertain suppression network model prediction probability density distribution in the previous iteration from the second iteration, and balancing the first KLD loss function value and the second KLD loss function value through a hyper-parameter. Based on the KLD loss function, the label distribution after label coding is taken as a learning target in the training process, and meanwhile, the learned knowledge of the model is considered. The method has the advantages that the difference between the variance set a priori and the true variance of the sample can be corrected, and meanwhile, the interference of the age label error on the learning direction of the model is reduced.
According to some embodiments of the invention, the age prediction step is specifically:
setting the uncertainty suppression network model as a prediction mode, wherein the L-Net branch network and the image uncertainty evaluation module do not participate in the prediction mode;
loading a weight file obtained when the uncertainty suppression network model is trained;
inputting the face image to be recognized into the loaded uncertainty suppression network model to obtain age prediction probability density distribution;
and obtaining a face age prediction result output by the uncertainty suppression network model by calculating an expected value of the age prediction probability density distribution.
In a second aspect of the present invention, a face age recognition system based on an uncertainty suppression network model is provided, including:
the training set acquisition module is used for preprocessing the face image containing the age label to obtain a training image with label distribution;
the model training module is used for inputting a training image into the uncertainty suppression network model for iterative training until the prediction accuracy of the uncertainty suppression network model on the verification set does not rise any more in the iterative process, and a weight file of the trained uncertainty suppression network model is obtained;
the age prediction module is used for identifying the face image to be identified by using the trained uncertainty suppression network model weight file to obtain a face age prediction result;
the uncertainty suppression network model comprises a lightweight Resnet network for image feature extraction, an L-Net branch network for eliminating facial contour differences based on face key point information, and an image uncertainty evaluation module based on an annotation mechanism, and the model training module specifically comprises:
calculating a smooth first-order regular loss function value L by using a global pooling feature vector obtained by a lightweight Resnet network and a facial contour feature vector obtained by an L-Net branch networksa
Calculating KLD loss function value L by utilizing predicted probability density distribution and label distribution obtained by uncertainty suppression network modelr
Prediction results obtained by using uncertainty suppression network modelFirst-order regular loss function value L smoothed by calculation with real agea
Using two smoothed first order canonical loss function values Ls、LaAnd a KLD loss function value LrAnd carrying out back propagation to update the weight file of the uncertainty suppression network model.
In a third aspect of the present invention, there is provided a computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, cause the processor to execute the method for face age recognition based on an uncertainty suppression network model as described above.
The invention provides a face age identification method, system and storage medium based on an uncertainty suppression network model. Based on the uncertainty suppression network model, label distribution is adopted as the basic mode of age estimation, converting age estimation into the prediction of a probability density distribution. The model uses a lightweight Resnet network as the backbone for image feature extraction, uses an L-Net branch network to eliminate the influence of the facial contour on age estimation, and at the same time an image uncertainty evaluation module based on an attention mechanism calculates the uncertainty of each image, reducing the interference of uncertain data with age estimation. The invention provides an image uncertainty evaluation algorithm based on an attention mechanism and, in order to learn the corresponding probability density distribution for each image, an improved KLD loss function for supervised learning. The beneficial effects finally achieved are:
1. the method has the advantages that the method can help the model to be rapidly converged and relieve the problem of insufficient training data by using the weight file trained in MS-Celeb-1M, loading the weight file into a lightweight Resnet backbone network and initializing the L-Net and image uncertainty evaluation module by using Kaiming normal;
2. the age labels are converted into the corresponding label distribution from a specific numerical value, and the method has the advantages that the change process of the human face age is smoothed, and meanwhile, the correlation among the age labels is effectively learned;
3. the key point detection tool is used for acquiring key point information and pupil coordinates of the face, and the face alignment is carried out according to the pupil coordinates, so that the method has the advantages that data expansion can be carried out through data enhancement, and the robustness of the model is improved;
4. the L-Net branch network has the advantages that the face contour can be constructed through the auxiliary information of the key points, so that the difference of the face contour among individuals can be learned, and the effect of eliminating differential information is achieved;
5. based on the KLD loss function, the label distribution after label coding is taken as a learning target in the training process, and meanwhile, the learned knowledge of the model is considered. The method has the advantages that the difference between the variance set a priori and the true variance of the sample can be corrected, and meanwhile, the interference of the age label error on the learning direction of the model is reduced.
Drawings
FIG. 1 is a schematic flow chart of a face age identification method based on an uncertainty suppression network model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an L-Net branch network structure for eliminating facial contour differences based on face key point information in the embodiment of the present invention;
FIG. 3 is a flowchart illustrating an uncertain suppression network model training process according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating age prediction according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a face age recognition system based on an uncertainty suppression network model in the embodiment of the present invention.
Detailed Description
In order to further describe the technical scheme of the present invention in detail, the present embodiment is implemented on the premise of the technical scheme of the present invention, and detailed implementation modes and specific steps are given.
As shown in fig. 1, a face age identification method based on an uncertainty suppression network model is provided, which includes the following steps:
s01, a training set acquisition step, namely preprocessing the face image containing the age label to obtain a training image with label distribution;
in the specific implementation process, the preprocessed face image with key points and age labels is obtained, and the process comprises the following steps:
label coding, converting an age label into label distribution, specifically, assuming that the label distribution of the face age conforms to normal distribution in advance, setting the mean value of the normal distribution as a real age label, setting the variance as a priori, setting an initial value of the variance as 3 in specific implementation, converting the age label from a specific numerical value into the corresponding label distribution, and correcting the difference between the variance set by the priori and the real value in the training process by using a subsequent uncertainty suppression network model; the method has the advantages that the change process of the age of the face is smoothed, and meanwhile, the correlation among the age labels is effectively learned;
data enhancement: this has the advantage that data expansion can be carried out through data enhancement, improving the robustness of the model. Finally, the facial key points are normalized to P_norm with the following formula, where P represents the key point coordinates, P_c denotes the interpupillary center point and d denotes the interpupillary distance:

P_norm = (P − P_c) / d
in addition, the preprocessing also comprises face image color adjustment, brightness adjustment, image normalization, random center cropping and proportional scaling.
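An illustrative sketch of the key point normalization part of this preprocessing (NumPy assumed; the 68-point array layout and the eye-corner indices used to approximate pupil positions are assumptions for illustration, not part of the patent):

```python
import numpy as np

def normalize_keypoints(points, left_pupil, right_pupil):
    """Normalize facial key points by the interpupillary distance.

    points:      (68, 2) array of (x, y) key point coordinates P
    left_pupil:  (2,) coordinate of the left pupil
    right_pupil: (2,) coordinate of the right pupil
    Returns P_norm = (P - P_c) / d, with P_c the interpupillary center
    point and d the interpupillary distance.
    """
    p_c = (left_pupil + right_pupil) / 2.0        # interpupillary center point
    d = np.linalg.norm(right_pupil - left_pupil)  # interpupillary distance
    return (points - p_c) / d

# Usage with key points from some 68-point detector (values here are placeholders);
# indices 36 and 45 only approximate pupil positions and are purely illustrative.
pts = np.random.rand(68, 2) * 112
p_norm = normalize_keypoints(pts, pts[36], pts[45])
```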
S02, a model training step, namely inputting a training image into the uncertainty suppression network model for iterative training until the prediction accuracy of the uncertainty suppression network model on the verification set does not rise any more in the iterative process, and obtaining a weight file of the trained uncertainty suppression network model;
in the specific implementation process, before the model training step, a weight file pre-trained on MS-Celeb-1M data is loaded into a lightweight Resnet network, and a Kaiming Normal is used for weight initialization of an L-Net branch network and an image uncertainty evaluation module. Wherein MS-Celeb-1M is a human face data set disclosed by Microsoft, and Kaiming Normal is a model weight initialization method.
Specifically, the uncertainty suppression network model structure comprises three parts: a lightweight Resnet network for image feature extraction, an L-Net branch network for eliminating facial contour differences based on face key point information, and an image uncertainty evaluation module based on an attention mechanism. The lightweight Resnet network serves as the backbone for extracting image features; it is based on the Resnet18 network with the following modifications: the input image size is adjusted to 112 × 112; the repetition count of every residual block in the Resnet18 network is reduced to 1; and the number of channels of all convolutional layers in the Resnet18 network is reduced to 1/2. Skip connections between the convolutional layers give the model the ability to learn identity mappings, which greatly improves the feature extraction capability of the convolutional network, and the final model weight is only about 1/10 that of the Resnet18 network. The L-Net branch network for removing facial contour differences based on face key point information consists of a fully connected network and an orthogonal separation plane, as shown in FIG. 2. The fully connected network has four layers with dimensions 136, 512, 256 and 512, where 136 is the one-dimensional vector obtained by flattening the 68 face key points, and the 256-dimensional middle layer helps remove noise information. The L-Net branch network uses the coordinates of the 68 face key points as input features, normalized by the interpupillary distance. The orthogonal separation plane is formed from the feature vector extracted from the face key point coordinates and the feature vector extracted by the lightweight Resnet network; it is a dot product operation between the two feature vectors, and removes facial contour information from the global features by reducing the correlation between them. The beneficial effect of the L-Net branch network is that the facial contour can be constructed from the key point auxiliary information, so that differences in facial contour among individuals can be learned and this differential information eliminated. The image uncertainty evaluation module comprises a fully connected layer, a batch pooling layer and a normalization layer; the fully connected layer performs a global feature conversion on each training image in the mini-batch. Similar to an attention mechanism, the module adopts a Query-Key matching mechanism to calculate the uncertainty score of each training image sample: the uncertainty of each sample in the batch is computed by a dot product between Query and Key, the Query is obtained by averaging the global pooling features of all training images in the batch, and each sample corresponds to a Key value vector obtained through a fully connected layer whose input and output dimensions are both 512; specifically, the fully connected layer converts the global pooling feature of each image in the batch into its Key value.
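For reference, a minimal PyTorch sketch of a backbone in the spirit of the lightweight Resnet described above (one residual block per stage, channel counts halved); the exact stem, channel widths and the final 512-dimensional projection are assumptions, not the patented architecture:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Residual block: two 3x3 convolutions with a skip connection."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.down = None
        if stride != 1 or in_ch != out_ch:
            self.down = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch))

    def forward(self, x):
        identity = x if self.down is None else self.down(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)

class LightResNet(nn.Module):
    """ResNet-18 variant: one block per stage, channel counts halved."""
    def __init__(self, feat_dim=512):
        super().__init__()
        chans = [32, 64, 128, 256]              # half of ResNet-18's 64..512
        self.stem = nn.Sequential(
            nn.Conv2d(3, chans[0], 7, 2, 3, bias=False),
            nn.BatchNorm2d(chans[0]), nn.ReLU(inplace=True),
            nn.MaxPool2d(3, 2, 1))
        self.layers = nn.Sequential(
            BasicBlock(chans[0], chans[0], 1),
            BasicBlock(chans[0], chans[1], 2),
            BasicBlock(chans[1], chans[2], 2),
            BasicBlock(chans[2], chans[3], 2))
        self.pool = nn.AdaptiveAvgPool2d(1)     # global pooling feature
        self.proj = nn.Linear(chans[3], feat_dim)

    def forward(self, x):                       # x: (N, 3, 112, 112)
        f = self.pool(self.layers(self.stem(x))).flatten(1)
        return self.proj(f)                     # 512-d global pooled feature

# f_global = LightResNet()(torch.randn(8, 3, 112, 112))  # -> (8, 512)
```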
The image uncertainty evaluation module is mainly used for reducing interference of uncertain images to the network in the training process and accelerating the convergence speed of the network. The image uncertainty evaluation module adopts a Query and Key matching mechanism and is used for screening out samples with low image quality and high uncertainty from a batch of samples and reducing the guiding effect of the samples in the weight updating process. The uncertainty score of the image sample is obtained by performing dot product operation on Query and Key through a Sigmoid activation function, and is used for measuring the deviation degree of the input sample to the overall distribution of the batch processing sample, wherein the closer the score is to 1, the more the sample conforms to the overall distribution, and the closer the score is to 0, the more the sample does not conform to the overall distribution, so that the uncertainty is larger.
The specific training steps of the uncertainty suppression network model are shown in fig. 3. The process of obtaining the prediction result of a training image after network forward propagation is as follows: according to the network structure, the image first passes through the lightweight Resnet network for feature extraction; the resulting features are then sent both to the L-Net branch network, where an orthogonality calculation is performed against the learned facial contour features, and to the image uncertainty evaluation module, where the degree to which the sample deviates from the overall batch distribution is computed through the attention mechanism. Specifically, let X = {x₁, x₂, x₃, ..., x_b} be the training image data input to the model in each iteration, and F = {f₁, f₂, f₃, ..., f_b} the facial image feature vectors obtained through the lightweight Resnet feature extraction network;

f_a = (1/b) · Σ_{i=1..b} f_i

represents the mean of all feature vectors. Then for each input image x_i the uncertainty is calculated as α_i = sigmoid(f_a · Wᵀ f_i), where W is the weight matrix that converts the feature vector f_i into its Key value vector and f_a represents the Query vector. In addition, the global pooling feature of the image in the lightweight Resnet network is converted through a fully connected layer into the number of categories corresponding to the age labels and then into a predicted probability density distribution through a softmax function; to obtain the predicted age of the model, the expectation of this predicted probability density distribution is calculated.
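A small PyTorch sketch of the batch-attention uncertainty score α_i = sigmoid(f_a · Wᵀ f_i); the 512-dimensional feature size follows the text, while the class and variable names are illustrative:

```python
import torch
import torch.nn as nn

class UncertaintyScore(nn.Module):
    """Query-Key uncertainty scoring over a mini-batch of pooled features."""
    def __init__(self, dim=512):
        super().__init__()
        self.key = nn.Linear(dim, dim)        # W: converts f_i into its Key value vector

    def forward(self, feats):                 # feats: (b, dim) global pooled features f_i
        query = feats.mean(dim=0)             # f_a: batch-average Query vector
        keys = self.key(feats)                # one Key value vector per sample
        scores = torch.sigmoid(keys @ query)  # alpha_i = sigmoid(f_a · W^T f_i)
        return scores                         # (b,): near 1 = fits the batch distribution

# alpha = UncertaintyScore()(torch.randn(16, 512))
```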
As shown in fig. 3, in the concrete steps of training the uncertainty suppression network model, the loss value of the prediction result is calculated against the real label as follows: each image corresponds to a true age and a prior distribution. For the encoding of the age label, the label distribution of the face age is assumed to conform to a normal distribution, the age label of each image is taken as the mean of the normal distribution, and the variance is set a priori (specifically 3). For i ∈ [1, 100], each age corresponds to a probability value ρ_i, and the specific encoding formula is:

ρ_i = exp(−(i − y)² / (2σ²)) / (√(2π)·σ)
wherein y is the real age corresponding to the training image, and σ is the artificially set normal distribution variance.
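A sketch of this label encoding in NumPy, assuming ages 1–100 and σ = 3; the final renormalization so that the discrete distribution sums to one is an assumption, not stated in the text:

```python
import numpy as np

def encode_age_label(y, sigma=3.0, ages=np.arange(1, 101)):
    """Convert a scalar age label y into a discrete label distribution.

    Each age i in [1, 100] receives a Gaussian probability centred at the
    real age y with prior variance sigma**2; the vector is then normalized.
    """
    rho = np.exp(-(ages - y) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    return rho / rho.sum()

p = encode_age_label(25)   # p[24] is the probability mass assigned to age 25
```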
The specific process of calculating the loss function is as follows:
calculating a smooth first-order regular loss function value by using the global pooling feature vector obtained by the lightweight Resnet network and the facial contour feature vector obtained by the L-Net branch network: the orthogonal loss is constructed from the output features of the last fully connected layer of the L-Net branch network and the global pooling features of the lightweight Resnet network, specifically the dot product of the two 512-dimensional features, and the loss is computed with a smooth first-order regular loss. The orthogonal loss removes facial contour information from the global pooling features of the face image and constrains the correlation between the two feature vectors. Specifically, let f_global denote the global pooling feature vector of the face image obtained by the lightweight Resnet network and f_shape the facial contour feature vector output by the last fully connected layer of the L-Net branch network, with s = f_global · f_shape the dot product between the two vectors; the loss function is:

L_s = 0.5·s²      if |s| < 1
L_s = |s| − 0.5   otherwise
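A hedged sketch of this orthogonal loss in PyTorch; `smooth_l1_loss` against a zero target reproduces the smooth first-order regular loss applied to the dot product s:

```python
import torch
import torch.nn.functional as F

def orthogonal_loss(f_global, f_shape):
    """Smooth-L1 penalty on the dot product between the global pooled feature
    and the L-Net facial-contour feature, computed per sample."""
    s = (f_global * f_shape).sum(dim=1)                   # s = f_global · f_shape
    return F.smooth_l1_loss(s, torch.zeros_like(s), reduction='none')

# L_s = orthogonal_loss(torch.randn(8, 512), torch.randn(8, 512))  # shape (8,)
```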
calculating a KLD loss function value by utilizing the prediction probability density distribution and the label distribution of the uncertainty suppression network model; the prediction probability density distribution and the label distribution obtained by the age label coding are calculated by adopting a KLD loss function, meanwhile, in the t-th iteration process of the model, the prediction output at the t-1 moment is taken as a learning target, and the KLD loss is also adopted to smooth the learning process of the sample, and the method specifically comprises the following steps: calculating KLD loss function values between the prediction probability density distribution and the label distribution of the uncertain suppression network model in a first iteration; recording the prediction result of the uncertain suppression network model in each iteration process; and calculating a first KLD loss function value between the uncertain suppression network model prediction probability density distribution and the label distribution and a second KLD loss function value between the uncertain suppression network model prediction probability density distribution and the uncertain suppression network model prediction probability density distribution in the previous iteration from the second iteration, and balancing the first KLD loss function value and the second KLD loss function value through a hyper-parameter. Based on the KLD loss function, the label distribution after label coding is taken as a learning target in the training process, and meanwhile, the learned knowledge of the model is considered. The method has the advantages that the difference between the variance set a priori and the true variance of the sample can be corrected, and meanwhile, the interference of the age label error on the learning direction of the model is reduced. The KLD loss function is defined as follows:
L_r = (1 − η)·KL(p ‖ p_t) + η·KL(p_{t−1} ‖ p_t),   where KL(p ‖ q) = Σ_k p_k · log(p_k / q_k)
where η is a hyperparameter used to balance the loss function, preferably set to 0.1, so that the label distribution obtained from the age label encoding remains the leading factor guiding the learning process of the network. p_{t−1} represents the predicted probability density distribution of a sample at time t−1, p_t represents the predicted probability density distribution of the sample at time t, p represents the label distribution of the sample obtained by label coding, k is a specific age label, and p_k represents the probability that label k is taken as the sample's estimated age.
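A sketch of the improved KLD loss under the reconstruction above (placing η on the previous-iteration term is an assumption drawn from the description):

```python
import torch

def improved_kld_loss(p_t, p_label, p_prev=None, eta=0.1, eps=1e-8):
    """KL(label || prediction) blended with KL(previous prediction || prediction).

    p_t:     (b, K) predicted probability density distribution at iteration t
    p_label: (b, K) encoded label distribution
    p_prev:  (b, K) prediction recorded at iteration t-1 (None in the first iteration)
    """
    def kld(p, q):
        return (p * (torch.log(p + eps) - torch.log(q + eps))).sum(dim=1)

    loss = kld(p_label, p_t)
    if p_prev is not None:
        loss = (1.0 - eta) * loss + eta * kld(p_prev, p_t)
    return loss     # per-sample loss, shape (b,)
```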
In the forward propagation process, the predicted age y' of the model is calculated from the expectation of the predicted probability density distribution, and the formula is as follows:
y' = Σ_{k=1..K} k · p_k
where K represents the number of categories of the age label.
And finally, calculating the difference between the predicted age y' and the real age y of the model by adopting a smooth first-order regular loss function. The formula is as follows:
L_a = 0.5·(y' − y)²      if |y' − y| < 1
L_a = |y' − y| − 0.5     otherwise
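The expectation-based prediction and the age loss can be sketched together as follows (PyTorch assumed; ages indexed 1..K, with y given as a float tensor):

```python
import torch
import torch.nn.functional as F

def expected_age(p_t, ages=None):
    """Predicted age y' as the expectation of the probability distribution p_t."""
    if ages is None:
        ages = torch.arange(1, p_t.size(1) + 1, dtype=p_t.dtype, device=p_t.device)
    return (p_t * ages).sum(dim=1)              # y' = sum_k k * p_k

def age_loss(p_t, y_true):
    """Smooth first-order regular (smooth-L1) loss between y' and the real age y."""
    return F.smooth_l1_loss(expected_age(p_t), y_true, reduction='none')
```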
the total loss value is formed by the two smoothed first-order regular loss function values Ls、LaAnd a KLD loss function value LrThe sum is obtained by summing up the results,considering the size of the loss function and the importance balance relation of tasks, calculating a smooth first-order regular loss function value L by using a global pooling feature vector obtained by a lightweight Resnet network and a facial contour feature vector obtained by an L-Net branch networksWeight set to 0.5, two other terms La、LrIs set to 1. Multiplying the uncertainty score alpha calculated by the image uncertainty evaluation module on the basis of the obtained summation loss valueiAnd obtaining the final loss function value.
Figure BDA0003364010560000092
Where N is the number of training image samples.
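Combining the pieces, a sketch of the final loss; the 0.5/1/1 weights follow the description, and the per-sample loss arguments refer to the sketches above:

```python
def total_loss(alpha, L_s, L_a, L_r):
    """Final objective: uncertainty-weighted sum of the three per-sample losses.

    alpha:          (N,) uncertainty scores from the batch-attention module
    L_s, L_a, L_r:  (N,) per-sample orthogonal, age and KLD losses
    """
    per_sample = 0.5 * L_s + 1.0 * L_a + 1.0 * L_r
    return (alpha * per_sample).mean()   # average over the N training samples
```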
As shown in fig. 3, in the concrete procedure of training the uncertainty suppression network model, updating the weights with the back propagation algorithm means back-propagating the two smooth first-order regular loss function values and the KLD loss function value to update the weight file of the uncertainty suppression network model. The specific process is as follows: the calculated total loss function value is propagated layer by layer according to the chain rule until the input layer is reached, and the weight parameters of the model are updated during this propagation. The entire set of training samples is divided into a number of batches, and the global optimal solution is approached through many iterations, each involving forward propagation and a gradient update.
As shown in fig. 3, a specific step of training the uncertainty suppression network model is shown, where the accuracy calculation in the validation set is performed after each iteration is finished, that is, after one forward propagation and gradient update of all training samples are completed, the validation set data is equally divided into a plurality of batches, forward propagation is performed without performing gradient update, the prediction result of the model in the validation set is calculated, and the average absolute error and the accumulated accuracy are used as the metrics in the validation set. The formula is defined as follows:
MAE = (1/N) · Σ_{i=1..N} |y'_i − y_i|,    CA_E = N_{E≤3} / N

wherein y is the real age label, N is the number of samples, N_{E≤3} represents the number of samples whose age prediction error is within 3 years, MAE is the mean absolute error, and CA_E is the cumulative accuracy.
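A sketch of these validation metrics in NumPy (the 3-year error threshold follows the text):

```python
import numpy as np

def validation_metrics(y_pred, y_true, threshold=3):
    """Mean absolute error and cumulative accuracy on the validation set."""
    err = np.abs(np.asarray(y_pred) - np.asarray(y_true))
    mae = err.mean()                   # MAE = (1/N) * sum |y'_i - y_i|
    ca = (err <= threshold).mean()     # CA_E = N_{E<=3} / N
    return mae, ca
```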
When the mean absolute error MAE and the cumulative accuracy CA_E of the model on the verification set no longer improve, training is finished and the final model weight file is obtained. At this point, the training process is complete.
S03, an age prediction step, namely identifying the face image to be identified by using the trained uncertainty suppression network model weight file to obtain a face age prediction result;
specifically, the age prediction step is shown in fig. 4, and specifically includes:
setting the uncertainty suppression network model as a prediction mode, wherein the L-Net branch network and the image uncertainty evaluation module do not participate in the prediction mode;
loading a weight file obtained when the uncertainty suppression network model is trained;
inputting the face image to be recognized into the loaded uncertainty suppression network model and propagating it forward to obtain the age prediction probability density distribution; the image only needs to be scaled to 112 × 112, no other transformation is required, and no face key points are needed as auxiliary information during prediction.
And obtaining a face age prediction result output by the uncertainty suppression network model by calculating an expected value of the age prediction probability density distribution.
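A sketch of the prediction step, assuming a PyTorch `model` that maps a 112 × 112 face image to K age logits (backbone plus a final fully connected layer) and a hypothetical weight file path:

```python
import torch

def predict_age(model, image, weight_file="usn_weights.pth", device="cpu"):
    """Load trained weights and return the expected age for one face image.

    image: (3, 112, 112) tensor, already resized; no key points are needed at
    inference time because L-Net and the uncertainty module are skipped.
    """
    model.load_state_dict(torch.load(weight_file, map_location=device))
    model.to(device).eval()                               # prediction mode
    with torch.no_grad():
        logits = model(image.unsqueeze(0).to(device))     # (1, K) class scores
        p = torch.softmax(logits, dim=1)                  # probability density distribution
        ages = torch.arange(1, p.size(1) + 1, dtype=p.dtype, device=p.device)
        return (p * ages).sum(dim=1).item()               # expectation = predicted age
```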
A system corresponding to the method shown in fig. 1 according to an embodiment of the present disclosure is now described with reference to fig. 5. The face age recognition system based on the uncertainty suppression network model, system 100, includes: the training set acquisition module 101, configured to preprocess face images containing age labels to obtain training images with label distributions; the model training module 102, configured to input the training images into the uncertainty suppression network model for iterative training until the prediction accuracy of the uncertainty suppression network model on the verification set no longer rises, obtaining the weight file of the trained uncertainty suppression network model; and the age prediction module 103, configured to identify the face image to be identified using the trained uncertainty suppression network model weight file to obtain a face age prediction result. The uncertainty suppression network model comprises a lightweight Resnet network for image feature extraction, an L-Net branch network for eliminating facial contour differences based on face key point information, and an image uncertainty evaluation module based on an attention mechanism. The model training module 102 specifically: calculates a smooth first-order regular loss function value by using the global pooling feature vector obtained by the lightweight Resnet network and the facial contour feature vector obtained by the L-Net branch network; calculates a KLD loss function value by using the predicted probability density distribution obtained by the uncertainty suppression network model and the label distribution; calculates a smooth first-order regular loss function value by using the prediction result obtained by the uncertainty suppression network model and the real age; and updates the weight file of the uncertainty suppression network model by back-propagating the two smooth first-order regular loss function values and the KLD loss function value. In addition to the above modules, the system 100 may include other components; however, since these components are not relevant to the content of the embodiments of the present disclosure, their illustration and description are omitted here.
Embodiments of the invention may also be implemented as a computer-readable storage medium. A computer-readable storage medium according to an embodiment has computer-readable instructions stored thereon. When the computer readable instructions are executed by a processor, the face age identification method based on the uncertainty suppression network model according to the embodiment of the invention described with reference to the above drawings can be executed.
In summary, according to the face age identification method, system and storage medium based on the uncertainty suppression network model provided by the invention, the uncertainty suppression network model is first trained with face image sample data carrying age labels, and the trained network model is then used to estimate the age of the face image to be identified. Based on the uncertainty suppression network model, label distribution is adopted as the basic mode of age estimation, converting age estimation into the prediction of a probability density distribution; a lightweight Resnet network is used as the backbone for image feature extraction, an L-Net branch network is used to eliminate the influence of the facial contour on age estimation, and at the same time the image uncertainty evaluation module calculates the uncertainty of each image, reducing the interference of uncertain data with age estimation. The invention provides an image uncertainty evaluation algorithm based on an attention mechanism and, in order to learn the corresponding probability density distribution for each image, an improved KLD loss function for supervised learning. The beneficial effects finally achieved are: using the weight file trained on MS-Celeb-1M, loading it into the lightweight Resnet backbone network and initializing L-Net and the image uncertainty evaluation module with Kaiming Normal helps the model converge quickly and alleviates the problem of insufficient training data; converting the age labels from a specific numerical value into the corresponding label distribution smooths the change process of face age while effectively learning the correlation among age labels; using a key point detection tool to acquire the key point information and pupil coordinates of the face and aligning the face according to the pupil coordinates allows data expansion through data enhancement and improves the robustness of the model; the L-Net branch network can construct the facial contour from the key point auxiliary information, so that differences in facial contour among individuals can be learned and this differential information eliminated; and the KLD loss function takes the label distribution after label coding as the learning target during training while also considering the knowledge the model has already learned, which corrects the difference between the a priori variance and the true variance of the sample and reduces the interference of age label errors with the learning direction of the model.
In this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process or method.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. A face age identification method based on an uncertain suppression network model is characterized by comprising the following steps:
a training set acquisition step, namely preprocessing a face image containing an age label to obtain a training image with label distribution;
a model training step, inputting a training image into the uncertainty suppression network model for iterative training until the prediction accuracy of the uncertainty suppression network model on the verification set does not rise any more in the iterative process, and obtaining a weight file of the trained uncertainty suppression network model;
an age prediction step, namely identifying a face image to be identified by using a trained uncertainty suppression network model weight file to obtain a face age prediction result;
the uncertainty suppression network model comprises a lightweight Resnet network for image feature extraction and an L-Net branch network for eliminating facial contour differences based on face key point information, and the model training step comprises the following steps:
calculating a smooth first-order regular loss function value L by using a global pooling feature vector obtained by a lightweight Resnet network and a facial contour feature vector obtained by an L-Net branch networksa
Calculating KLD loss function value L by utilizing predicted probability density distribution and label distribution obtained by uncertainty suppression network modelr
First order smoothing of prediction result and real age calculation obtained by using uncertainty suppression network modelRegular loss function value La
Using two smoothed first order canonical loss function values Ls、LaAnd a KLD loss function value LrAnd carrying out back propagation to update the weight file of the uncertainty suppression network model.
2. The method for identifying the age of the face based on the uncertainty suppression network model according to claim 1, wherein the uncertainty suppression network model further comprises an image uncertainty evaluation module based on a batch attention mechanism, the image uncertainty evaluation module comprises a full connection layer, a batch pooling layer and a normalization layer, the full connection layer is used for performing global feature conversion on each training image, the image uncertainty evaluation module calculates uncertainty scores of training image samples by using a Query-Key matching mechanism, the Query is obtained by summing and averaging global pooling features of all training images in the batch samples, each sample corresponds to a Key value vector and is obtained by the full connection layer with input and output dimensions of 512.
3. The method for identifying the age of the face based on the uncertain suppression network model according to claim 1, wherein the step of preprocessing the face image containing the age label comprises the following specific steps:
label coding, converting the age label into label distribution, specifically, assuming that the label distribution of the face age conforms to normal distribution in advance, setting the mean value of the normal distribution as a real age label, setting the variance as prior, and converting the age label from a specific numerical value into the corresponding label distribution;
data enhancement: and acquiring key point information and pupil coordinates of the face by using a key point detection tool, and aligning the face according to the pupil coordinates.
4. The face age identification method based on the uncertainty suppression network model according to claim 2, characterized in that before the model training step, a weight file pre-trained on MS-Celeb-1M data is used and loaded into a lightweight Resnet network, and Kaiming Normal is used to initialize the weights of the L-Net branch network and the image uncertainty evaluation module.
5. The face age identification method based on the uncertainty suppression network model according to claim 1, wherein the lightweight Resnet network is based on a Resnet18 network and is improved, and the method comprises the following steps:
resizing the input image to 112 x 112;
reducing the repetition times of all the residual blocks in the Resnet18 network to 1;
the number of channels of all convolutional layers in the Resnet18 network is reduced to 1/2.
6. The method for identifying the age of the face based on the uncertainty suppression network model according to claim 1, wherein the L-Net branch network is composed of a fully connected network and an orthogonal separation plane, the L-Net branch network uses the coordinates of the key points of the face as input features, the orthogonal separation plane is composed of feature vectors extracted from the coordinates of the key points of the face and feature vectors extracted from a lightweight Resnet network, and is a dot product operation between the two feature vectors.
7. The face age identification method based on the uncertain suppression network model according to claim 1, wherein the KLD loss function value calculating step is as follows:
calculating KLD loss function values between the prediction probability density distribution and the label distribution of the uncertain suppression network model in a first iteration;
recording the prediction result of the uncertain suppression network model in each iteration process;
and calculating a first KLD loss function value between the uncertain suppression network model prediction probability density distribution and the label distribution and a second KLD loss function value between the uncertain suppression network model prediction probability density distribution and the uncertain suppression network model prediction probability density distribution in the previous iteration from the second iteration, and balancing the first KLD loss function value and the second KLD loss function value through a hyper-parameter.
8. The face age identification method based on the uncertain suppression network model according to claim 2, wherein the age prediction step specifically comprises:
setting the uncertainty suppression network model as a prediction mode, wherein the L-Net branch network and the image uncertainty evaluation module do not participate in the prediction mode;
loading a weight file obtained when the uncertainty suppression network model is trained;
inputting the face image to be recognized into the loaded uncertainty suppression network model to obtain age prediction probability density distribution;
and obtaining a face age prediction result output by the uncertainty suppression network model by calculating an expected value of the age prediction probability density distribution.
9. A face age recognition system based on an uncertainty suppression network model, comprising:
the training set acquisition module is used for preprocessing the face image containing the age label to obtain a training image with label distribution;
the model training module is used for inputting a training image into the uncertainty suppression network model for iterative training until the prediction accuracy of the uncertainty suppression network model on the verification set does not rise any more in the iterative process, and a weight file of the trained uncertainty suppression network model is obtained;
the age prediction module is used for identifying the face image to be identified by using the trained uncertainty suppression network model weight file to obtain a face age prediction result;
the uncertainty suppression network model comprises a lightweight Resnet network for image feature extraction and an L-Net branch network for eliminating facial contour differences based on face key point information, and the model training module specifically comprises:
calculating a smooth first-order regular loss function value L by using a global pooling feature vector obtained by a lightweight Resnet network and a facial contour feature vector obtained by an L-Net branch networksa
Calculating KLD loss function value L by utilizing predicted probability density distribution and label distribution obtained by uncertainty suppression network modelr
Calculating smooth first-order regular loss function value L by using prediction result obtained by uncertainty suppression network model and real agea
Using two smoothed first order canonical loss function values Ls、LaAnd a KLD loss function value LrAnd carrying out back propagation to update the weight file of the uncertainty suppression network model.
10. A computer-readable storage medium having instructions stored thereon, which when executed by a processor, cause the processor to perform the method of any one of claims 1-8.
CN202111376110.4A 2021-11-19 2021-11-19 Face age identification method and system based on uncertain suppression network model Pending CN114267060A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111376110.4A CN114267060A (en) 2021-11-19 2021-11-19 Face age identification method and system based on uncertain suppression network model

Publications (1)

Publication Number Publication Date
CN114267060A true CN114267060A (en) 2022-04-01

Family

ID=80825272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111376110.4A Pending CN114267060A (en) 2021-11-19 2021-11-19 Face age identification method and system based on uncertain suppression network model

Country Status (1)

Country Link
CN (1) CN114267060A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115359546A (en) * 2022-10-21 2022-11-18 乐山师范学院 Human age identification method and system based on facial identification
CN117523548A (en) * 2024-01-04 2024-02-06 青岛臻图信息技术有限公司 Three-dimensional model object extraction and recognition method based on neural network
CN117523548B (en) * 2024-01-04 2024-03-26 青岛臻图信息技术有限公司 Three-dimensional model object extraction and recognition method based on neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination