CN109815864B - Facial image age identification method based on transfer learning - Google Patents

Facial image age identification method based on transfer learning

Info

Publication number
CN109815864B
CN109815864B CN201910027211.7A
Authority
CN
China
Prior art keywords: component, dcnn, follows, aver, training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910027211.7A
Other languages
Chinese (zh)
Other versions
CN109815864A (en)
Inventor
钱丽萍
俞宁宁
黄玉蘋
吴远
黄亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN201910027211.7A priority Critical patent/CN109815864B/en
Publication of CN109815864A publication Critical patent/CN109815864A/en
Application granted granted Critical
Publication of CN109815864B publication Critical patent/CN109815864B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

A facial image age identification method based on transfer learning comprises the following steps: 1) a preprocessing technique that improves the brightness balance of the picture is applied; 2) a deep convolutional neural network (DCNN) is used to extract picture features, and the DCNN is trained by transfer learning; 3) a softmax classifier maps the scalar values output by the DCNN into a probability distribution array, each probability being the likelihood of the corresponding classification label; an Adam optimizer is selected to solve for the parameter θ of the deep convolutional neural network model, the DCNN-based classifier is built by training on a face picture dataset and optimizing the objective function with Adam, and in the prediction stage the classification label corresponding to the largest component of the probability distribution array output by the softmax classifier is taken as the prediction result. The method significantly improves the accuracy of face image age identification.

Description

Facial image age identification method based on transfer learning
Technical Field
The invention relates to a face image age identification method, in particular to a face image age identification method based on transfer learning.
Background
With the rapid development of computer vision, pattern recognition and biometric identification technologies, computer-based face age estimation has become more and more important in recent years. It has broad computer vision application prospects, including security detection, forensic medicine, human-computer interaction (HCI), electronic customer relationship management (ECRM) and the like. In daily life, a surveillance camera working together with an age identification system can effectively prevent vending machines from selling cigarettes and prohibited drugs to minors. In social security, fraud and illegal activities at cash dispensers are usually committed by people of specific age groups, so early prevention can be supported by introducing age information. In the field of biometrics, facial age estimation is an important supplement to individual information and can be combined with identification information such as irises, fingerprints and DNA to improve the overall performance of a biometric identification system. In summary, computer-based face age estimation is widely applied in many fields and integrates readily with other intelligent technologies.
Although related face age estimation studies exist at home and abroad, estimation accuracy remains low due to individual aging differences, the complexity of texture information, lack of data, interference factors and the like. Fundamentally, the age estimation problem can be divided into two main branches: 1) identifying an age range (e.g., 29-38 years); 2) obtaining an exact age (e.g., 18 years). In practice, many age identification tasks require only an age range to be determined, and determining an age range is easier than obtaining an exact age.
Disclosure of Invention
In order to obviously improve the accuracy of a face image age identification system, the invention provides a face image age identification method based on transfer learning.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a facial image age identification method based on transfer learning comprises the following steps:
1) a preprocessing technology for improving the brightness balance of the picture is adopted;
2) picture feature extraction is realized with a deep convolutional neural network (DCNN), and a transfer learning method is adopted to train the DCNN, comprising the following steps:
step 2.1: dividing the face picture data into three parts: training set (60%), verification set (20%) and test set (20%), and ensuring that pictures in the training set do not appear in the verification set and the test set;
step 2.2: call a DCNN whose parameters are pre-trained on ImageNet and apply transfer learning to train on the face picture data, keeping all DCNN parameters except the fully-connected layer (FC) unchanged so that transfer learning only fine-tunes the parameters of the FC layer;
step 2.3: in the training process, when the precision and the loss of the training set are continuously improved and the precision and the loss of the verification set are not obviously changed any more, the DCNN training is considered to be finished, and meanwhile, parameters finely adjusted in the FC are saved;
3) using a softmax classifier that maps the scalar values output by the DCNN into a probability distribution array, each probability being the likelihood of the corresponding classification label; for a training dataset
$D = \{(s_i, y_i)\}_{i=1}^{N}$
with $s_i$ the picture data and $y_i \in \{1, 2, \cdots, C\}$, where C is the number of category labels and N is the number of picture data, softmax extracts the features of the dataset and maps them into
$X = \{x_1, x_2, \cdots, x_N\}$
where $x_i \in \mathbb{R}$; the model is as follows:
$$\hat{y} = \begin{bmatrix} p(y=1 \mid x;\theta) \\ \vdots \\ p(y=C \mid x;\theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{C} e^{\theta_j x}} \begin{bmatrix} e^{\theta_1 x} \\ \vdots \\ e^{\theta_C x} \end{bmatrix}$$
wherein, each parameter is defined as follows:
$\hat{y}$: a probability distribution array;
x: a data mapping set;
C: the number of category labels;
The parameter $\theta = (\theta_1, \theta_2, \cdots, \theta_C)$ is solved by establishing an optimization objective function through cross-entropy and applying an optimization algorithm (SGD, RMSprop, Adam, etc.); the optimization objective function is as follows:
$$J(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} 1\{y_i = c\} \log p(y_i = c \mid x_i; \theta) + R(\theta)$$
wherein, each parameter is defined as follows:
N: the number of picture data;
C: the number of category labels;
$1\{\cdot\}$: the indicator (Dirichlet) function;
$R(\cdot)$: a regularization constraint term;
In the deep convolutional neural network model, the Adam optimizer is selected to solve for the parameter θ. The DCNN-based classifier is built by training on the face picture dataset and obtaining θ through Adam optimization of the objective function; in the prediction stage, the classification label corresponding to the largest component of the probability distribution array output by the softmax classifier is taken as the classifier's prediction result, a process represented as follows:
$$L = \arg\max_{c \in \{1, \cdots, C\}} \hat{y}_c$$
where L is the predicted classification label for the classifier.
Further, in the step 1), the pretreatment process is as follows:
step 1.1: respectively extracting values of three components from an original RGB image;
step 1.2: calculate the mean of the three components, expressed as $R_{aver}$, $G_{aver}$, $B_{aver}$; the calculation process is expressed as follows:
$$R_{aver} = \frac{1}{M} \sum_{i=1}^{M} R_i, \quad G_{aver} = \frac{1}{M} \sum_{i=1}^{M} G_i, \quad B_{aver} = \frac{1}{M} \sum_{i=1}^{M} B_i$$
wherein, each parameter is defined as follows:
$R_{aver}$: the mean of the R component;
$G_{aver}$: the mean of the G component;
$B_{aver}$: the mean of the B component;
M: the number of picture pixels;
$R_i$: the value of the R component of the i-th pixel;
$G_i$: the value of the G component of the i-th pixel;
$B_i$: the value of the B component of the i-th pixel;
step 1.3: calculate the global mean $Q_{aver}$; the process is as follows:
$$Q_{aver} = \frac{R_{aver} + G_{aver} + B_{aver}}{3}$$
where $Q_{aver}$ is the global mean;
step 1.4: the gain factor for each component is calculated as follows:
$$N_r = \frac{Q_{aver}}{R_{aver}}, \quad N_g = \frac{Q_{aver}}{G_{aver}}, \quad N_b = \frac{Q_{aver}}{B_{aver}}$$
wherein the parameters are defined as follows:
$N_r$: the R component gain coefficient;
$N_g$: the G component gain coefficient;
$N_b$: the B component gain coefficient;
step 1.5: the new components of image RGB are reconstructed as follows:
$$R^* = N_r \cdot R, \quad G^* = N_g \cdot G, \quad B^* = N_b \cdot B$$
wherein the parameters are defined as follows:
$R^*$: the new R component;
$G^*$: the new G component;
$B^*$: the new B component;
step 1.6: correct the obtained new components $R^*$, $G^*$, $B^*$ into the range [0, 255]: component values greater than 255 are set to 255, component values less than 0 are set to 0, and component values within the range remain unchanged;
step 1.7: and constructing a picture according to the corrected new component.
In step 1), the picture brightness equalization method is an optional rather than mandatory preprocessing step, but adopting it as a preprocessing technique usually improves system performance significantly. In step 2), the DCNN is trained by transfer learning and the fully-connected layer is fine-tuned; parameters in some convolutional layers may also be fine-tuned when the system requires higher accuracy and performance.
The technical conception of the invention is as follows: face images in datasets are often too bright or too dark, which adversely affects DCNN feature extraction, so a common picture brightness equalization method is used as a preprocessing technique to correct picture brightness. Next, we apply a DCNN to extract features from the face pictures. However, training a DCNN end-to-end takes a great deal of time and effort, and insufficient data can make it difficult for the DCNN to achieve good performance. We therefore apply transfer learning to address these problems; the core operation is to fine-tune only the parameters of the DCNN fully-connected layer while keeping the other parameters unchanged during training. After the model is trained, the age of a face picture is predicted with a softmax classifier, and the classification label L corresponding to the largest component of the probability distribution array output by the softmax classifier is taken as the final prediction result.
The beneficial effects of the invention are mainly: 1. a commonly used brightness equalization method is applied as an image preprocessing technique, eliminating the adverse effect of excessively bright or dark pictures on DCNN training and prediction; 2. transfer learning is applied to overcome the heavy time and effort costs and the data shortage involved in end-to-end DCNN training.
Drawings
FIG. 1 is a schematic diagram of an age recognition model of a face image;
fig. 2 is a schematic diagram of transfer learning.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
Referring to fig. 1 and 2, the method for identifying the age of a face image based on transfer learning first requires image preprocessing (see fig. 1). Transfer learning (see fig. 2) is then applied to train on the picture data, and finally the label corresponding to the largest component of the probability distribution array output by the softmax classifier is taken as the final prediction result. The method comprises the following steps:
1) For a face image age recognition system, improving image quality through preprocessing is very important: it is the premise for the learning model to extract good features, and it directly influences the final prediction result. A common preprocessing technique for improving picture brightness is adopted, comprising the following steps:
step 1.1: respectively extracting values of three components from an original RGB image;
step 1.2: calculate the mean of the three components, expressed as $R_{aver}$, $G_{aver}$, $B_{aver}$; the calculation process is expressed as follows:
$$R_{aver} = \frac{1}{M} \sum_{i=1}^{M} R_i, \quad G_{aver} = \frac{1}{M} \sum_{i=1}^{M} G_i, \quad B_{aver} = \frac{1}{M} \sum_{i=1}^{M} B_i$$
wherein, each parameter is defined as follows:
$R_{aver}$: the mean of the R component;
$G_{aver}$: the mean of the G component;
$B_{aver}$: the mean of the B component;
M: the number of picture pixels;
$R_i$: the value of the R component of the i-th pixel;
$G_i$: the value of the G component of the i-th pixel;
$B_i$: the value of the B component of the i-th pixel;
step 1.3: calculate the global mean $Q_{aver}$; the process is as follows:
$$Q_{aver} = \frac{R_{aver} + G_{aver} + B_{aver}}{3}$$
where $Q_{aver}$ is the global mean;
step 1.4: the gain factor for each component is calculated as follows:
$$N_r = \frac{Q_{aver}}{R_{aver}}, \quad N_g = \frac{Q_{aver}}{G_{aver}}, \quad N_b = \frac{Q_{aver}}{B_{aver}}$$
wherein the parameters are defined as follows:
$N_r$: the R component gain coefficient;
$N_g$: the G component gain coefficient;
$N_b$: the B component gain coefficient;
step 1.5: the new components of image RGB are reconstructed as follows:
$$R^* = N_r \cdot R, \quad G^* = N_g \cdot G, \quad B^* = N_b \cdot B$$
wherein the parameters are defined as follows:
$R^*$: the new R component;
$G^*$: the new G component;
$B^*$: the new B component;
step 1.6: correct the obtained new components $R^*$, $G^*$, $B^*$ into the range [0, 255]: component values greater than 255 are set to 255, component values less than 0 are set to 0, and component values within the range remain unchanged;
step 1.7: constructing a picture according to the corrected new component, and finishing preprocessing;
2) We use a deep convolutional neural network (DCNN) to extract picture features. However, training a DCNN end-to-end takes a great deal of time and effort, and the network may not achieve good performance when picture data are insufficient. To solve these problems, we adopt transfer learning to train the DCNN, comprising the following steps:
step 2.1: dividing the face picture data into three parts: training set (60%), verification set (20%) and test set (20%), and ensuring that pictures in the training set do not appear in the verification set and the test set;
step 2.2: call a DCNN whose parameters are pre-trained on ImageNet and apply transfer learning to train on the face picture data, keeping all DCNN parameters except the fully-connected layer (FC) unchanged so that transfer learning only fine-tunes the parameters of the FC layer;
step 2.3: during training, when the accuracy and loss of the training set keep improving while those of the validation set no longer change significantly, the DCNN training is considered finished, and the fine-tuned FC parameters are saved;
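The freezing scheme of steps 2.2-2.3 can be sketched in PyTorch as follows. The patent does not name a backbone, so a tiny stand-in CNN replaces the ImageNet pre-trained DCNN here; with a real pre-trained backbone the same two loops apply:

```python
import torch
import torch.nn as nn

class SmallDCNN(nn.Module):
    """Illustrative stand-in for the (unspecified) pre-trained DCNN;
    in practice this would carry ImageNet pre-trained weights."""
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(8, num_classes)  # fully-connected (FC) head

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def prepare_for_transfer(net):
    """Step 2.2: freeze every parameter except the FC layer, so that
    training only fine-tunes the FC parameters."""
    for p in net.parameters():
        p.requires_grad = False        # keep pre-trained parameters fixed
    for p in net.fc.parameters():
        p.requires_grad = True         # only the FC head is fine-tuned
    # The optimizer should be given only the trainable FC parameters
    return [p for p in net.parameters() if p.requires_grad]
```

After training, saving only `net.fc.state_dict()` corresponds to step 2.3's saving of the fine-tuned FC parameters.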
3) Among multi-classification problems, the softmax classifier is the most commonly used. The softmax classifier maps the scalar values output by the DCNN into a probability distribution array, each probability being the likelihood of the corresponding classification label; for a training dataset
$D = \{(s_i, y_i)\}_{i=1}^{N}$
with $s_i$ the picture data and $y_i \in \{1, 2, \cdots, C\}$, where C is the number of category labels and N is the number of picture data, softmax extracts the features of the dataset and maps them into
$X = \{x_1, x_2, \cdots, x_N\}$
where $x_i \in \mathbb{R}$; the model is as follows:
$$\hat{y} = \begin{bmatrix} p(y=1 \mid x;\theta) \\ \vdots \\ p(y=C \mid x;\theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{C} e^{\theta_j x}} \begin{bmatrix} e^{\theta_1 x} \\ \vdots \\ e^{\theta_C x} \end{bmatrix}$$
wherein, each parameter is defined as follows:
$\hat{y}$: a probability distribution array;
x: a data mapping set;
C: the number of category labels;
The parameter $\theta = (\theta_1, \theta_2, \cdots, \theta_C)$ is solved by establishing an optimization objective function through cross-entropy and applying an optimization algorithm (SGD, RMSprop, Adam, etc.). The optimization objective function is as follows:
$$J(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} 1\{y_i = c\} \log p(y_i = c \mid x_i; \theta) + R(\theta)$$
wherein, each parameter is defined as follows:
N: the number of picture data;
C: the number of category labels;
$1\{\cdot\}$: the indicator (Dirichlet) function;
$R(\cdot)$: a regularization constraint term;
Adam is a commonly used optimizer for deep convolutional neural network models and performs better than optimizers such as SGD and RMSprop on image classification problems, so the Adam optimizer is selected to solve for the parameter θ. The DCNN-based classifier is built by training on the face picture dataset and obtaining θ through Adam optimization of the objective function; in the prediction stage, the classification label corresponding to the largest component of the probability distribution array output by the softmax classifier is taken as the classifier's prediction result, a process represented as follows:
$$L = \arg\max_{c \in \{1, \cdots, C\}} \hat{y}_c$$
where L is the predicted classification label for the classifier.
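The softmax mapping, the cross-entropy term of J(θ), and the arg-max prediction L of step 3) can be sketched in NumPy as follows (the regularization term R(θ) and the Adam parameter update are omitted; function names are illustrative):

```python
import numpy as np

def softmax(z):
    """Map the DCNN's C scalar outputs z into the probability array y-hat."""
    e = np.exp(z - z.max())      # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(probs, label):
    """One sample's contribution to the objective J(theta), without R(theta)."""
    return -float(np.log(probs[label]))

def predict(z):
    """Prediction stage: L = argmax over the probability distribution array."""
    return int(np.argmax(softmax(z)))
```

In practice the maximum of the softmax output and the maximum of the raw scores coincide, but the probabilities are still needed for the cross-entropy objective during training.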

Claims (2)

1. A facial image age identification method based on transfer learning is characterized by comprising the following steps:
1) a preprocessing technology for improving the brightness balance of the picture is adopted;
2) picture feature extraction is realized with a DCNN, and a transfer learning method is adopted to train the DCNN, comprising the following steps:
step 2.1: dividing the face picture data into three parts: the system comprises a training set, a verification set and a test set, and ensures that pictures in the training set do not appear in the verification set and the test set;
step 2.2: call a DCNN whose parameters are pre-trained on ImageNet and apply transfer learning to train on the face picture data, keeping all DCNN parameters except the fully-connected layer unchanged so that transfer learning only fine-tunes the parameters of the fully-connected layer;
step 2.3: in the training process, when the precision and the loss of the training set are continuously improved and the precision and the loss of the verification set are not obviously changed any more, the DCNN training is considered to be finished, and meanwhile, parameters finely adjusted in the FC are saved;
3) using a softmax classifier that maps the scalar values output by the DCNN into a probability distribution array, each probability being the likelihood of the corresponding classification label; for a training dataset
$D = \{(s_i, y_i)\}_{i=1}^{N}$
with $s_i$ the picture data and $y_i \in \{1, 2, \cdots, C\}$, where C is the number of category labels and N is the number of picture data, softmax extracts the features of the dataset and maps them into
$X = \{x_1, x_2, \cdots, x_N\}$
where $x_i \in \mathbb{R}$; the model is as follows:
$$\hat{y} = \begin{bmatrix} p(y=1 \mid x;\theta) \\ \vdots \\ p(y=C \mid x;\theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{C} e^{\theta_j x}} \begin{bmatrix} e^{\theta_1 x} \\ \vdots \\ e^{\theta_C x} \end{bmatrix}$$
wherein, each parameter is defined as follows:
$\hat{y}$: a probability distribution array;
x: a data mapping set;
C: the number of category labels;
The parameter $\theta = (\theta_1, \theta_2, \cdots, \theta_C)$ is solved by establishing an optimization objective function through cross-entropy and applying an optimization algorithm, the optimization objective function being as follows:
$$J(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} 1\{y_i = c\} \log p(y_i = c \mid x_i; \theta) + R(\theta)$$
wherein, each parameter is defined as follows:
N: the number of picture data;
C: the number of category labels;
$1\{\cdot\}$: the indicator (Dirichlet) function;
$R(\cdot)$: a regularization constraint term;
in a deep convolutional neural network model, an Adam optimizer is selected to solve a parameter theta, a classifier based on DCNN is established by training a human face image dataset and obtaining the parameter theta through an Adam optimization objective function, a classification label corresponding to the maximum component in a probability distribution array output by a softmax classifier is taken as a prediction result of the classifier in a prediction stage, and the process is represented as follows:
$$L = \arg\max_{c \in \{1, \cdots, C\}} \hat{y}_c$$
where L is the predicted classification label for the classifier.
2. The method for identifying the age of the human face image based on the transfer learning as claimed in claim 1, wherein in the step 1), the preprocessing process is as follows:
step 1.1: respectively extracting values of three components from an original RGB image;
step 1.2: calculate the mean of the three components, expressed as $R_{aver}$, $G_{aver}$, $B_{aver}$; the calculation process is expressed as follows:
$$R_{aver} = \frac{1}{M} \sum_{i=1}^{M} R_i, \quad G_{aver} = \frac{1}{M} \sum_{i=1}^{M} G_i, \quad B_{aver} = \frac{1}{M} \sum_{i=1}^{M} B_i$$
wherein, each parameter is defined as follows:
$R_{aver}$: the mean of the R component;
$G_{aver}$: the mean of the G component;
$B_{aver}$: the mean of the B component;
M: the number of picture pixels;
$R_i$: the value of the R component of the i-th pixel;
$G_i$: the value of the G component of the i-th pixel;
$B_i$: the value of the B component of the i-th pixel;
step 1.3: calculate the global mean $Q_{aver}$; the process is as follows:
$$Q_{aver} = \frac{R_{aver} + G_{aver} + B_{aver}}{3}$$
where $Q_{aver}$ is the global mean;
step 1.4: the gain factor for each component is calculated as follows:
$$N_r = \frac{Q_{aver}}{R_{aver}}, \quad N_g = \frac{Q_{aver}}{G_{aver}}, \quad N_b = \frac{Q_{aver}}{B_{aver}}$$
wherein the parameters are defined as follows:
$N_r$: the R component gain coefficient;
$N_g$: the G component gain coefficient;
$N_b$: the B component gain coefficient;
step 1.5: the new components of image RGB are reconstructed as follows:
$$R^* = N_r \cdot R, \quad G^* = N_g \cdot G, \quad B^* = N_b \cdot B$$
wherein the parameters are defined as follows:
$R^*$: the new R component;
$G^*$: the new G component;
$B^*$: the new B component;
step 1.6: correct the obtained new components $R^*$, $G^*$, $B^*$ into the range [0, 255]: component values greater than 255 are set to 255, component values less than 0 are set to 0, and component values within the range remain unchanged;
step 1.7: and constructing a picture according to the corrected new component.
CN201910027211.7A 2019-01-11 2019-01-11 Facial image age identification method based on transfer learning Active CN109815864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910027211.7A CN109815864B (en) 2019-01-11 2019-01-11 Facial image age identification method based on transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910027211.7A CN109815864B (en) 2019-01-11 2019-01-11 Facial image age identification method based on transfer learning

Publications (2)

Publication Number Publication Date
CN109815864A CN109815864A (en) 2019-05-28
CN109815864B true CN109815864B (en) 2021-01-01

Family

ID=66603389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910027211.7A Active CN109815864B (en) 2019-01-11 2019-01-11 Facial image age identification method based on transfer learning

Country Status (1)

Country Link
CN (1) CN109815864B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110186375A (en) * 2019-06-06 2019-08-30 西南交通大学 Intelligent high-speed rail white body assemble welding feature detection device and detection method
CN110427804B (en) * 2019-06-18 2022-12-09 中山大学 Iris identity verification method based on secondary transfer learning
CN110427846B (en) * 2019-07-19 2022-12-06 西安工业大学 Face recognition method for small unbalanced samples by using convolutional neural network
CN110503154A (en) * 2019-08-27 2019-11-26 携程计算机技术(上海)有限公司 Method, system, electronic equipment and the storage medium of image classification
CN113221920B (en) * 2021-05-20 2024-01-12 北京百度网讯科技有限公司 Image recognition method, apparatus, device, storage medium, and computer program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355248A (en) * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolution neural network training method and device
CN107545245A (en) * 2017-08-14 2018-01-05 中国科学院半导体研究所 A kind of age estimation method and equipment
CN108830326A (en) * 2018-06-21 2018-11-16 河南工业大学 A kind of automatic division method and device of MRI image
CN109002755A (en) * 2018-06-04 2018-12-14 西北大学 Age estimation model building method and estimation method based on facial image

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9607217B2 (en) * 2014-12-22 2017-03-28 Yahoo! Inc. Generating preference indices for image content
CN104866829B (en) * 2015-05-25 2019-02-19 苏州大学 A kind of across age face verification method based on feature learning
US10565433B2 (en) * 2017-03-30 2020-02-18 George Mason University Age invariant face recognition using convolutional neural networks and set distances
US11026634B2 (en) * 2017-04-05 2021-06-08 doc.ai incorporated Image-based system and method for predicting physiological parameters

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355248A (en) * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolution neural network training method and device
CN107545245A (en) * 2017-08-14 2018-01-05 中国科学院半导体研究所 A kind of age estimation method and equipment
CN109002755A (en) * 2018-06-04 2018-12-14 西北大学 Age estimation model building method and estimation method based on facial image
CN108830326A (en) * 2018-06-21 2018-11-16 河南工业大学 A kind of automatic division method and device of MRI image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AgeNet: Deeply Learned Regressor and Classifier for…; Xin Liu et al.; 2015 IEEE International Conference on Computer Vision Workshop; 2016-02-15; pp. 258-266 *
Face age analysis algorithm and implementation based on deep convolutional networks (基于深度卷积网络的人脸年龄分析算法与实现); Cao Lei et al.; Software Engineering (软件工程); 2016-08-31; vol. 19, no. 8; pp. 14-18, 8 *

Also Published As

Publication number Publication date
CN109815864A (en) 2019-05-28

Similar Documents

Publication Publication Date Title
CN109815864B (en) Facial image age identification method based on transfer learning
US11809485B2 (en) Method for retrieving footprint images
CN106228185B (en) A kind of general image classifying and identifying system neural network based and method
CN100423020C (en) Human face identifying method based on structural principal element analysis
CN110059586B (en) Iris positioning and segmenting system based on cavity residual error attention structure
CN109359608B (en) Face recognition method based on deep learning model
CN112800876B (en) Super-spherical feature embedding method and system for re-identification
CN110399821B (en) Customer satisfaction acquisition method based on facial expression recognition
CN109344856B (en) Offline signature identification method based on multilayer discriminant feature learning
CN108921019A (en) A kind of gait recognition method based on GEI and TripletLoss-DenseNet
CN108537143B (en) A kind of face identification method and system based on key area aspect ratio pair
CN111539246B (en) Cross-spectrum face recognition method and device, electronic equipment and storage medium thereof
CN108564040A (en) A kind of fingerprint activity test method based on depth convolution feature
CN106529441B (en) Depth motion figure Human bodys' response method based on smeared out boundary fragment
CN111832405A (en) Face recognition method based on HOG and depth residual error network
CN114596608B (en) Double-stream video face counterfeiting detection method and system based on multiple clues
CN111126155B (en) Pedestrian re-identification method for generating countermeasure network based on semantic constraint
CN109726703B (en) Face image age identification method based on improved ensemble learning strategy
CN114511901B (en) Age classification-assisted cross-age face recognition algorithm
CN113033345B (en) V2V video face recognition method based on public feature subspace
CN109376719A (en) A kind of camera light Photo-Response Non-Uniformity fingerprint extraction and comparison method based on assemblage characteristic expression
CN112200008A (en) Face attribute recognition method in community monitoring scene
CN114463646B (en) Remote sensing scene classification method based on multi-head self-attention convolution neural network
CN111695450A (en) Face rapid identification method based on IMobileNet
CN116385832A (en) Bimodal biological feature recognition network model training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant