CN109002755B - Age estimation model construction method and estimation method based on face image - Google Patents

Publication number: CN109002755B (application CN201810563826.7A)
Authority: CN (China)
Prior art keywords: image, face, face image, age, images
Legal status: Active
Original language: Chinese (zh)
Other versions: CN109002755A
Inventors: 彭进业, 李帆, 李展, 王珺, 章勇勤, 祝轩, 唐文华
Current assignee: Northwestern University
Original assignee: Northwestern University
Events: application CN201810563826.7A filed by Northwestern University; publication of CN109002755A; application granted; publication of CN109002755B

Classifications

    • G06F 18/285: Pattern recognition; analysing; selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
    • G06F 18/253: Pattern recognition; analysing; fusion techniques of extracted features
    • G06V 10/56: Extraction of image or video features relating to colour
    • G06V 40/162: Human faces; detection, localisation, normalisation using pixel segmentation or colour matching
    • G06V 40/168: Human faces; feature extraction; face representation
    • G06V 40/172: Human faces; classification, e.g. identification
    • G06V 10/467: Descriptors for shape, contour or point-related descriptors; encoded features or binary features, e.g. local binary patterns [LBP]
    • G06V 40/178: Human faces; estimating age from face image; using age information for improving recognition

Abstract

The invention discloses a method for constructing an age estimation model based on face images, together with an estimation method, using skin color classification and deep label distribution learning. Age estimation is performed on face images from the MORPH database, and the influence of individual skin color differences is incorporated into the estimation method; compared with existing methods, this effectively reduces the error caused by skin color differences. The final global average pooling layer of the Inception-v3 deep convolutional neural network is replaced with a global max pooling layer, which reduces the mean shift of estimates caused by convolutional-layer parameter errors and retains more texture information. A deep label distribution learning algorithm is combined with the Inception-v3 deep convolutional neural network, the network is fine-tuned on the data by transfer learning, and the feasibility and effectiveness of the method are verified by theoretical analysis and experiments.

Description

Age estimation model construction method and estimation method based on face image
Technical Field
The invention relates to the field of face image recognition, in particular to a face image-based age estimation model construction method and an age estimation method.
Background
Age information is an important human biometric attribute with many applications in human-computer interaction, and it strongly affects the performance of face recognition systems. Age estimation from a face image means using computer techniques to model how the face changes with age, so that a machine can estimate a person's approximate age or age range from a face image.
In the prior art, age estimation from a face image has used Gabor features and LBP features with an SVM classifier for age classification; face age estimation has also been performed with KPLS (Kernel Partial Least Squares) regression and BIF (Biologically Inspired Features) methods.
These prior-art methods do not consider the influence of skin color differences and the ageing process on estimation accuracy, so their age estimation accuracy is limited.
Disclosure of Invention
The invention aims to provide a construction method and an estimation method for a face-image-based age estimation model, to solve the problem that prior-art methods, which ignore the influence of skin color when estimating age from a face image, achieve low age estimation accuracy.
In order to realize the task, the invention adopts the following technical scheme:
an age estimation model construction method based on face images comprises the following steps:
step 1, performing face detection on a plurality of images containing faces, cropping only the face region of each as a face image, and storing the resulting face images as an original face image set;
step 2, dividing all the face images in the original face image set into two groups, wherein all the face images with black skin form a first image set, and all the face images with non-black skin form a second image set;
associating each face image in the first image set with its age label to obtain a first age label set, and associating each face image in the second image set with its age label to obtain a second age label set, where an age label is the true age of the face in the face image;
step 3, extracting LBP features from each face image in the first image set to obtain its first LBP feature; extracting DCNN (deep convolutional neural network) features from each face image in the first image set with a deep convolutional neural network to obtain its first DCNN feature; fusing the first LBP feature and the first DCNN feature of each face image into its first image feature, and collecting the first image features of all face images in the first image set into a first image feature set;
extracting LBP features from each face image in the second image set to obtain its second LBP feature; extracting DCNN features from each face image in the second image set with a deep convolutional neural network to obtain its second DCNN feature; fusing the second LBP feature and the second DCNN feature of each face image into its second image feature, and collecting the second image features of all face images in the second image set into a second image feature set;
step 4, training an XGBoost regression model with the first image feature set as input and the first age label set as output to obtain a first age estimation model;
and training the XGBoost regression model with the second image feature set as input and the second age label set as output to obtain a second age estimation model.
Further, dividing all the face images in the original face image set into two groups, specifically including:
step 21, taking a plurality of face images from the original face image set as a grouped image set, wherein each face image in the grouped image set corresponds to a respective skin color label, the skin color labels comprise black skin [0] and non-black skin [1], and collecting the skin color labels of all the face images in the grouped image set to obtain a skin color label set;
step 22, extracting the color feature and the LBP feature of each facial image in the grouped image set, fusing the color feature and the LBP feature of each facial image to obtain the skin color feature of each facial image, and collecting the skin color features of all the facial images in the grouped image set to obtain a skin color feature set;
step 23, training an XGBoost classification model with the skin color feature set as input and the skin color label set as output to obtain a face classification model;
and step 24, using the face classification model obtained in step 23 to group the original face image set processed as in step 22, labelling each image as a black-skin or non-black-skin face image.
Further, the loss function in the XGBoost classification model is logistic loss.
Further, the DCNN features are extracted with an Inception-v3 network.
Further, when the Inception-v3 network is used to extract the DCNN features, the top global average pooling layer in the Inception-v3 network is replaced with a global max pooling layer, several Gaussian-initialized fully connected layers are added after the global max pooling layer, the label distribution of the KL loss layer is replaced with a deep label distribution, and the output of the last of these fully connected layers is taken as the DCNN feature.
Further, two Gaussian-initialized fully connected layers are added after the global max pooling layer; the weights of the global max pooling layer in the Inception-v3 network and of the two added fully connected layers are updated by back-propagation fine-tuning; the output of the second fully connected layer is taken; and the learning rate is set by exponential decay, with an initial learning rate of 0.1, a minimum step size of 32, and 100 iterations.
Further, the Gaussian initialization of the fully connected layers uses mean 0 and variance 0.01.
Further, the loss function in the XGBoost regression model is square loss.
An age estimation method based on a face image: the image to be estimated is processed as in step 1 to obtain the face image to be estimated, and its age is then estimated as follows:
Step A, using the face classification model, the face image to be estimated, processed as in step 22, is classified; if it is a black-skin face image, step B is performed; if it is a non-black-skin face image, step C is performed;
Step B, the first age estimation model is used to estimate the age of the black-skin face image processed as in step 3;
Step C, the second age estimation model is used to estimate the age of the non-black-skin face image processed as in step 3.
Compared with the prior art, the invention has the following technical characteristics:
1. The invention incorporates the influence of individual skin color differences into the age estimation method: face images in the database are classified as black skin or non-black skin, and a separate age estimation model is built for each group, effectively reducing the error caused by skin color differences compared with existing methods.
2. The final global average pooling layer of the Inception-v3 deep convolutional neural network is changed to a global max pooling layer. Global average pooling mitigates the increase in estimate variance caused by the limited size of the convolution kernel but retains more of the image's background information, whereas face age estimation needs more texture information; global max pooling mitigates the mean shift of estimates caused by convolutional-layer parameter errors and retains more texture information.
3. The method combines a deep label distribution learning algorithm with the Inception-v3 deep convolutional neural network, fine-tunes the network on the data via transfer learning, extracts deep features, and fuses them with LBP features, compensating both for DCNN's neglect of local structural features when extracting face features directly and for the subjective, manually chosen factors involved in LBP feature extraction.
Drawings
FIG. 1 is a flow chart of a model building method provided by the present invention;
FIG. 2 is a face image in a first image set provided by the present invention;
FIG. 3 is a face image in a second image set provided by the present invention;
FIG. 4 is an image to be estimated provided in an embodiment of the present invention;
FIG. 5 is a diagram of a face image to be estimated according to an embodiment of the present invention;
FIG. 6 is a diagram of a face image to be estimated according to another embodiment of the present invention;
FIG. 7 is a graph of MAE results after training samples are added for each age estimation algorithm provided in one embodiment of the present invention;
fig. 8 is a graph of experimental results of depth selection of trees in the XGBoost algorithm provided by the present invention.
Detailed Description
The following are specific examples provided by the inventors to further explain the technical solutions of the present invention.
For example, the skin color of Africans is mostly black, that of Europeans mostly white, and that of Asians mostly yellow; here, black skin refers to African skin color, and non-black skin refers to European, Asian, and other skin colors such as white and yellow.
Example one
The invention discloses a method for constructing an age estimation model based on face images, used to build a model that estimates the age of a face appearing in an image; as shown in Fig. 1, the method comprises the following steps:
step 1, performing face detection on a plurality of images containing faces, cropping only the face region of each as a face image, and storing the resulting face images as an original face image set;
Because an acquired image contains not only the face but also many other distracting elements, this step extracts the face region of the image as the face image, which also speeds up the subsequent steps.
In this example, all images containing faces come from the MORPH database, which contains 55,134 images of more than 13,000 individuals. Each person has about 4 images, with ages from 16 to 77 and diverse ethnicities: about 77% African, 19% European, and the remaining 4% Asian and others. 10,000 black-skin images and 10,000 non-black-skin images are randomly selected from the 55,134 images; the MATLAB face detection toolbox is used to perform face detection on these 20,000 images, and each detected face region is cropped and saved as a 299 × 299-pixel face image. These 20,000 face images form the original face image set.
Step 2, dividing all original face images in the original face image set into two groups, wherein all face images with black skin form a first image set, and all face images with non-black skin form a second image set;
each image in the first image set is associated with its age label to obtain a first age label set, and each image in the second image set with its age label to obtain a second age label set, where an age label is the true age of the face in the image;
the method for dividing all original face images in the original face image set into two groups of black skin and non-black skin can be that after the face image features are extracted, a neural network classifier and a support vector machine classifier are adopted to classify the images.
In a preferred embodiment, the color features and LBP features of the image are extracted as the face image features and classified with an XGBoost classification model.
Specifically, the method for classifying face images according to skin colors provided by the invention comprises the following steps:
step 21, taking a plurality of face images from the original face image set as a grouped image set, wherein each image in the grouped image set corresponds to a respective skin color label, the skin color labels comprise black skin [0] and non-black skin [1], and collecting the skin color labels of all the images in the grouped image set to obtain a skin color label set;
in this embodiment, 1000 face images are extracted from the original face image set as a grouped image set, wherein 500 face images are black skins [0], and 500 face images are non-black skins [1], and a 1000 × 1-dimensional vector is obtained as a skin color tag set according to the sequence of the face images in the grouped image set corresponding to the skin color tags.
Step 22, extracting the color feature and LBP feature of each facial image in the grouped image set, fusing the color feature and LBP feature of each original facial image to obtain the skin color feature of each facial image, and collecting the skin color features of all the images in the grouped image set to obtain a skin color feature set;
in the step, the pixel value of the color face image is used as the color feature, before the LBP feature is extracted, in order to improve the operation speed of the algorithm, the color face image is grayed, the gray face image is obtained to extract the LBP feature, after each face image is segmented, the LBP feature is extracted from each region of the segmented image.
In this embodiment, the face image is divided into a 10 × 10 grid of square regions; the LBP feature operator is applied to each region, histogram features are computed in uniform-pattern mode, each region is given equal weight, and the per-region features are concatenated to give a 5900-dimensional LBP feature.
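The gridded uniform-pattern LBP extraction described above can be sketched in numpy. This is a minimal illustrative implementation, not the patent's code, assuming the standard 59-bin uniform mapping for 8 neighbours; with a 10 × 10 grid it reproduces the 5900-dimensional feature the embodiment describes:

```python
import numpy as np

def uniform_lbp_histograms(gray, grid=(10, 10)):
    """Basic 8-neighbour LBP with the 'uniform' mapping (59 bins per cell).

    gray: 2-D uint8 array (a grayscale face image).
    Returns the concatenated per-cell histograms: 100 cells x 59 bins
    = 5900 dims for a 10x10 grid, matching the embodiment.
    """
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]
    # 8 neighbours, clockwise from top-left
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(shifts):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.int32) << bit)

    # Uniform mapping: patterns with <= 2 circular bit transitions get their
    # own bin; all other patterns share one bin -> 58 + 1 = 59 bins.
    table = np.full(256, 58, dtype=np.int32)
    next_label = 0
    for p in range(256):
        bits = [(p >> i) & 1 for i in range(8)]
        transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
        if transitions <= 2:
            table[p] = next_label
            next_label += 1
    mapped = table[code]

    rows, cols = grid
    h, w = mapped.shape
    feats = []
    for r in range(rows):
        for k in range(cols):
            cell = mapped[r * h // rows:(r + 1) * h // rows,
                          k * w // cols:(k + 1) * w // cols]
            hist = np.bincount(cell.ravel(), minlength=59).astype(np.float64)
            feats.append(hist / max(hist.sum(), 1))  # equal weight per region
    return np.concatenate(feats)

rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(299, 299), dtype=np.uint8)  # stand-in image
lbp = uniform_lbp_histograms(face)
print(lbp.shape)  # (5900,)
```

Each per-region histogram is normalized to sum to 1, which gives every region equal weight as the text requires.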
In this application, to improve the efficiency of the algorithm, the LBP features and the color features are spliced end to end to obtain the skin color feature of each face image.
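The end-to-end splicing above is plain vector concatenation; a minimal sketch with stand-in values, where the shapes assume a 299 × 299 color image and the 5900-dimensional LBP feature:

```python
import numpy as np

# Sketch of the "end to end" splicing: the colour feature (raw pixel values
# of the 299 x 299 colour face image) and the 5900-dim LBP feature are
# concatenated into a single skin colour feature vector.
rng = np.random.default_rng(0)
color_feature = rng.random(299 * 299 * 3)  # stand-in for raw pixel values
lbp_feature = rng.random(5900)             # stand-in for the LBP histograms
skin_feature = np.concatenate([color_feature, lbp_feature])
print(skin_feature.shape)  # (274103,)
```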
Step 23, taking the skin color feature set as input and the skin color label set as output, training an XGboost classification model, and obtaining a face classification model;
in this step, the XGBoost classification model is trained, the loss function may be Square loss, logistic loss, etc., and the regularization term may be L1 regularization, L2 regularization, etc.
In this embodiment, the loss function in the XGBoost classification model is logistic loss, and the regularization term is L2 regularization.
And in step 24, using the face classification model obtained in step 23, the face images of the original face image set processed as in step 22 are grouped into black-skin and non-black-skin face images.
In the step, the original face images in the original face image set are classified by using a face classification model, and the skin color property of each image is obtained. Through the classification of the step, each face image in the original face image set obtains the corresponding skin color label, so that the existing original face image set is divided into two groups, one group is a first image set formed by face images of all black skins, and the other group is a second image set formed by face images of all non-black skins.
Finally, each face image in the first image set corresponds to a respective age label, and a first age label set is obtained; and respectively corresponding each face image in the second image set to a respective age label to obtain a second age label set, wherein the age label is a real age value of the face in the image.
In this embodiment, the true age of the black-skin face image shown in Fig. 2 is 67, corresponding to age label [67], and the true age of the non-black-skin face image shown in Fig. 3 is 50, corresponding to age label [50].
In the invention, in order to reduce the influence on age estimation caused by skin color difference, a set of age estimation models are respectively established for a black skin face image and a non-black skin face image, so that different age estimation models are selected according to different colors of the face skin when the face image is estimated.
Step 3, extracting LBP features from each face image in the first image set to obtain its first LBP feature; extracting DCNN (deep convolutional neural network) features from each face image in the first image set with a deep convolutional neural network to obtain its first DCNN feature; fusing the first LBP feature and the first DCNN feature of each face image into its first image feature, and collecting the first image features of all face images in the first image set into a first image feature set;
In this step, the LBP features of the face images in the first image set are those already extracted in step 22, giving the first LBP feature set.
When a convolutional neural network is used to extract the DCNN features, an AlexNet network, a ZF network, an Inception-v3 network, or the like can be used.
In a preferred embodiment, the Inception-v3 network is used to extract the DCNN features, giving the DCNN feature sets.
The first DCNN features are extracted with an Inception-v3 network to obtain the first DCNN feature set; to retain more facial texture information, the structure of the Inception-v3 network is improved and some of its parameters are modified.
Specifically, when the Inception-v3 network is used to extract DCNN features, to make the method less prone to overfitting, the top global average pooling layer in the Inception-v3 network is replaced with a global max pooling layer, several Gaussian-initialized fully connected layers are added after the global max pooling layer, the label distribution of the KL loss layer is replaced with a deep label distribution, and the output of the last of these fully connected layers is taken as the DCNN feature.
The final global average pooling layer of the Inception-v3 deep convolutional neural network is changed to a global max pooling layer: global average pooling mitigates the increase in estimate variance caused by the limited size of the convolution kernel but retains more of the image's background information, whereas face age estimation needs more texture information; global max pooling mitigates the mean shift of estimates caused by convolutional-layer parameter errors and retains more texture information.
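The claimed difference between the two pooling choices can be illustrated with a toy numpy sketch: two feature maps share an identical strong "texture" peak but differ in flat background activation. Average pooling mixes the background into the descriptor, while max pooling keeps only the strongest response:

```python
import numpy as np

# Two toy 8x8 feature maps with the same sharp wrinkle-like peak but
# different amounts of flat background activation.
fmap_a = np.zeros((8, 8))
fmap_a[3, 4] = 5.0            # peak only
fmap_b = np.full((8, 8), 0.6)
fmap_b[3, 4] = 5.0            # same peak plus background

gap_a, gap_b = fmap_a.mean(), fmap_b.mean()  # global average pooling
gmp_a, gmp_b = fmap_a.max(), fmap_b.max()    # global max pooling

# Average pooling yields different descriptors for the two maps even though
# the discriminative peak is identical; max pooling keeps only the peak.
print(gap_a, gap_b, gmp_a, gmp_b)
```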
To balance effectiveness and processing speed, the proposed scheme adds two Gaussian-initialized fully connected layers after the global max pooling layer (see Table 1), updates the weights of the global max pooling layer in the Inception-v3 network and of the two added fully connected layers by back-propagation fine-tuning, takes the output of the second fully connected layer, and sets the learning rate by exponential decay, with an initial learning rate of 0.1, a minimum step size of 32, and 100 iterations.
In a preferred embodiment, the Gaussian initialization of the fully connected layers uses mean 0 and variance 0.01.
Table 1: improved Inception-v3 network structure (reproduced as an image in the original publication)
In addition, to improve the accuracy of the age estimation algorithm, the invention replaces the label distribution of the KL loss layer with a deep label distribution. Deep label distribution learning converts the age label of each face image into a discrete label distribution and uses a deep convolutional neural network to minimize the Kullback-Leibler divergence between the true and predicted age label distributions; the Kullback-Leibler divergence serves as the measure of similarity between the true and predicted label distributions, and the loss function is optimized by stochastic gradient descent.
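A minimal numpy sketch of the label distribution step: each age label is converted into a discrete distribution over the MORPH age range and compared to a prediction by KL divergence. The Gaussian shape and the sigma = 2.0 width are illustrative assumptions, not values given in the patent:

```python
import numpy as np

AGES = np.arange(16, 78)  # MORPH age range, 16..77

def label_distribution(age, sigma=2.0):
    """Discrete label distribution centred on the true age (assumed Gaussian)."""
    p = np.exp(-0.5 * ((AGES - age) / sigma) ** 2)
    return p / p.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): the training criterion minimised by the network."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

true_dist = label_distribution(50)
good_pred = label_distribution(51)  # off by one year
bad_pred = label_distribution(30)   # off by twenty years

# A prediction near the true age gives a much smaller KL loss.
print(kl_divergence(true_dist, good_pred) < kl_divergence(true_dist, bad_pred))  # True
```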
Through the steps above, the LBP feature and the DCNN feature of each face image in the first image set are extracted. To ensure the quality of the extracted features and hence the accuracy of age estimation, the LBP and DCNN features are fused, compensating both for DCNN's neglect of local structural features when extracting face features directly and for the subjective factors involved in LBP feature extraction.
In this scheme, to improve the efficiency of the algorithm, the LBP feature and the DCNN feature are directly concatenated end to end to obtain the first image feature of each image in the first image set, giving the first image feature set.
LBP features are extracted from each face image in the second image set to obtain its second LBP feature; DCNN features are extracted from each face image in the second image set with a deep convolutional neural network to obtain its second DCNN feature; the second LBP and second DCNN features of each face image are fused into its second image feature, and the second image features of all face images in the second image set are collected into a second image feature set.
All face images in the second image set are processed with the same method used for the first image set: LBP and DCNN features are extracted and then fused to obtain the second image features, giving the second image feature set.
Step 4, training an XGBoost regression model with the first image feature set as input and the first age label set as output to obtain a first age estimation model;
and training the XGBoost regression model with the second image feature set as input and the second age label set as output to obtain a second age estimation model.
In the scheme, two age estimation models are respectively established for the first image feature set and the second image feature set, namely, the image feature set of black skin corresponds to the first age estimation model, and the image feature set of non-black skin corresponds to the second age estimation model.
When building the age estimation models, an XGBoost regression model is trained on these inputs and outputs; in a preferred embodiment, the loss function of the XGBoost regression model is square loss.
Example two
The image to be estimated is processed as in step 1 of Example 1 to obtain the face image to be estimated, whose age is then estimated.
In this embodiment, the age of the image to be estimated shown in fig. 4 is estimated, and after the processing in step 1, the face image to be estimated shown in fig. 5 is obtained, and the age of the face image to be estimated is estimated.
Specifically, the method comprises the following steps:
Step A, using the face classification model of Example 1, the face image to be estimated, processed as in step 22, is classified; if it is a black-skin face image, step B is performed; if it is a non-black-skin face image, step C is performed.
In this embodiment, the face image shown in Fig. 6 is classified with the face classification model; the result is a black-skin [0] face image, so step B is performed.
Step B, adopting the first age estimation model of the first embodiment to estimate the age of the black skin face image processed in the step 3;
and step C, adopting the second age estimation model in the first embodiment to estimate the age of the non-black skin face image processed in the step 3.
In steps B and C, age estimation is performed on face images of the respective skin colors.
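Steps A to C amount to a two-way model dispatch. A hypothetical sketch follows; the function and model names are placeholders, and the 0.6 threshold is the one used for skin classification in Example 3:

```python
# Hypothetical dispatch for the estimation method: classify the face image's
# skin colour first, then route it to the matching age estimation model.
def estimate_age(features, skin_classifier, model_black, model_nonblack,
                 threshold=0.6):
    """skin_classifier returns P(non-black skin); the models map a
    feature vector to an estimated age (trained XGBoost regressors)."""
    if skin_classifier(features) > threshold:   # step C: non-black skin
        return model_nonblack(features)
    return model_black(features)                # step B: black skin

# Stub models standing in for the trained classifier and regressors:
age = estimate_age([0.1], lambda f: 0.2, lambda f: 70, lambda f: 50)
print(age)  # 70: routed to the black-skin model, as in Example 2
```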
In this embodiment, a first age estimation model is selected for a face image to be estimated as shown in fig. 6 to perform age estimation, and an obtained age label is: [70] i.e. the estimated age of the face in the image is 70 years.
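The routing logic of steps A-C can be sketched as a small dispatch function. All callables below (`skin_classifier`, `model_black`, `model_nonblack`) are hypothetical stand-ins for the trained models, and the 0.6 cut-off mirrors the threshold used in the patent's classification experiments.

```python
def estimate_age(features, skin_classifier, model_black, model_nonblack,
                 threshold=0.6):
    """Route a face feature vector to one of two age regressors based on
    the skin-color classifier's score (label 0 = black, 1 = non-black).
    A score above the threshold is treated as non-black skin."""
    score = skin_classifier(features)
    if score > threshold:
        return model_nonblack(features)  # step C: second age estimation model
    return model_black(features)         # step B: first age estimation model
```

For the fig. 6 example, a score of, say, 0.2 would fall below the threshold, select the first (black-skin) model, and return its age estimate.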
Example three
Validity test of the face age estimation model
To prove the effectiveness of the face age estimation model provided by the invention, two groups of experiments were carried out. Experiment one uses the feature extraction method described herein and compares several existing age estimation algorithms: a BP neural network, SVM, KNN, and the label distribution learning algorithms IIS-LLD and CPNN. For CPNN, the number of hidden-layer units is 400; k in the KNN algorithm is set to 61; the activation function of the BP neural network is the sigmoid, and its hidden layer has 100 neurons. Experiment two compares, on the same database, the results of existing age estimation methods, including AGES, MTWGB, OHRanker, and Rkcca.
When the original image set is classified, black skin is labeled 0 and non-black skin is labeled 1, and the threshold is set to 0.6: a classification score above the threshold is classified as non-black skin. The experiments use 10-fold cross-validation; the final average classification accuracy is 0.94 with a deviation of 0.03.
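The 10-fold protocol above can be sketched as follows; `scores_fn` is a hypothetical trainer that fits the skin classifier on the training fold and returns scores for the held-out fold (it stands in for the XGBoost classification model, which is not reimplemented here).

```python
import numpy as np

def cross_validated_accuracy(scores_fn, features, labels,
                             folds=10, threshold=0.6):
    """K-fold cross-validation of a binary skin classifier; scores above
    the threshold are predicted as non-black skin (label 1)."""
    idx = np.arange(len(labels))
    rng = np.random.default_rng(0)
    rng.shuffle(idx)
    accs = []
    for chunk in np.array_split(idx, folds):
        train = np.setdiff1d(idx, chunk)           # all indices outside the fold
        score = scores_fn(features[train], labels[train], features[chunk])
        pred = (score > threshold).astype(int)     # 0.6 cut-off from the patent
        accs.append((pred == labels[chunk]).mean())
    return float(np.mean(accs)), float(np.std(accs))
```

The reported 0.94 +/- 0.03 corresponds to the mean and deviation of the ten per-fold accuracies.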
Table 2 shows the age estimation comparison results (MAE) for different input features. MAE1 refers to the mean absolute error of the model using only the LBP features of the classified images as input; MAE2 refers to the mean absolute error obtained using only the depth features of the classified images as input; MAE3 refers to the mean absolute error of the model using both the LBP features and the depth features of the classified images as input.
Table 2: inputting age estimation comparison results with different characteristics
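The MAE metric compared in Table 2 is simply the average absolute deviation between predicted and true ages; a minimal helper (name illustrative):

```python
import numpy as np

def mean_absolute_error(true_ages, predicted_ages):
    """MAE: the average absolute difference between predicted and true
    ages, i.e. the quantity reported as MAE1/MAE2/MAE3 in Table 2."""
    true_ages = np.asarray(true_ages, dtype=float)
    predicted_ages = np.asarray(predicted_ages, dtype=float)
    return float(np.abs(predicted_ages - true_ages).mean())
```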
From the results of Table 2, the age estimation model proposed herein exhibits excellent performance. The results of the first five comparison algorithms are the average of the results on the two modes, black-skin and non-black-skin face images. After feature fusion, most algorithms perform better than with a single feature alone, except for the BP neural network and the SVM; a possible reason is the effect of the increased number of samples and the higher feature dimension. The BP neural network learns very slowly: it is optimized by gradient descent, the objective function to be optimized is very complex, and the algorithm is inefficient. From a mathematical point of view, the BP algorithm is a local search method, whereas the problem here requires the global extremum of a complex nonlinear function, so the algorithm is likely to fall into a local extremum. Because the database contains many samples and the ages are divided into many classes, storing and computing the matrix consumes a large amount of memory and computation time, which may affect the performance of the SVM algorithm.
The time complexity and storage cost of the KNN algorithm grow rapidly with the size of the training set and the feature dimension, because each new sample to be classified must have its similarity computed against the entire training set. When the face images in the database are classified with KNN, the large number of age classes makes the class distribution extremely unbalanced: it is hard to guarantee that every age group has at least one image, some age groups have many images and some almost none, which affects the classification accuracy. IIS-LLD and CPNN are algorithms proposed by Geng et al. specifically for face age estimation. CPNN may perform better for two reasons. First, CPNN learns without prior assumptions, while IIS-LLD assumes a maximum entropy model, which is not necessarily suitable for age estimation. Second, all class labels share the same set of model parameters in CPNN, whereas IIS-LLD learns the parameters of each class label separately, so CPNN can better exploit the correlation between class labels; however, this also suggests that CPNN is more susceptible to overfitting.
As shown in Table 2, the age estimation model proposed by the invention performs better on black-skin face images than on non-black-skin ones; this may be due to illumination during database image acquisition, which especially affects white-skin face images.
Convolutional neural networks perform excellently in fields such as target detection and image recognition, and a large amount of training data is an important reason. For the age estimation problem, does increasing the number of training samples affect the result? To verify this idea, the 30000 black-skin face images remaining in the MORPH database were gradually added to the established black-skin face age estimation model while the test data were kept the same. As shown in fig. 7, by increasing the number of training samples, the age estimation model provided herein achieves a lower MAE of 3.39.
Table 3 lists the comparison results of different age estimation models; the method proposed by the invention achieves better results overall. The age estimation results on non-black-skin face images are less than ideal. Possible reasons are that the number of training samples is insufficient (the 0-16 age range has few sample images, and sufficient data cannot be guaranteed for every age group), and that illumination interferes with the extraction of texture features. Better results are achieved on black-skin face images: the MORPH database has about 40000 black-skin face images, and since the contrast between black skin and illumination brightness is obvious, illumination is less likely to affect those results.
Table 3: comparison results of different age estimation models
For the XGBoost algorithm, the depth of the tree is also an important parameter. If the depth is insufficient, under-fitting may occur and the estimated ages will be inaccurate; if the depth is too large, over-fitting may occur. The choice of tree depth therefore needs experimental verification; as shown in fig. 8, a tree depth of 3 produces a better result.
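The depth sweep of fig. 8 can be illustrated with a toy validation loop. The tiny 1-D regression tree below is a stand-in for XGBoost's trees, written only to show how validation error separates an under-fitting depth from a better one; `build_tree`, `tree_predict`, and `val_mae` are illustrative names, not from the patent.

```python
import numpy as np

def build_tree(x, y, depth):
    """Tiny 1-D regression tree: split on the squared-error-optimal
    threshold until the maximum depth is reached."""
    if depth == 0 or len(np.unique(x)) < 2:
        return float(y.mean())               # leaf: predict the mean
    best = None
    for t in np.unique(x)[:-1]:              # exclude max so both sides are nonempty
        l, r = y[x <= t], y[x > t]
        loss = ((l - l.mean()) ** 2).sum() + ((r - r.mean()) ** 2).sum()
        if best is None or loss < best[0]:
            best = (loss, t)
    t = best[1]
    return (t,
            build_tree(x[x <= t], y[x <= t], depth - 1),
            build_tree(x[x > t], y[x > t], depth - 1))

def tree_predict(node, x):
    """Walk the tree for a scalar input."""
    if not isinstance(node, tuple):
        return node
    t, left, right = node
    return tree_predict(left, x) if x <= t else tree_predict(right, x)

def val_mae(depth, x_tr, y_tr, x_va, y_va):
    """Validation MAE of a tree of the given depth."""
    tree = build_tree(x_tr, y_tr, depth)
    return float(np.mean([abs(tree_predict(tree, v) - t)
                          for v, t in zip(x_va, y_va)]))
```

Sweeping `depth` over a held-out set and picking the minimum-MAE value mirrors the experimental selection of depth 3 in fig. 8.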

Claims (9)

1. A method for constructing an age estimation model based on a face image is characterized by comprising the following steps:
step 1, carrying out face detection on a plurality of images containing faces, cropping only the face regions as face images, and storing the plurality of face images as an original face image set;
step 2, dividing all the face images in the original face image set into two groups, wherein all the face images with black skin form a first image set, and all the face images with non-black skin form a second image set;
taking the real age value of each facial image in the first image set as a respective age label to obtain a first age label set, and taking the real age value of each facial image in the second image set as a respective age label to obtain a second age label set;
step 3, extracting LBP features from each face image in the first image set to obtain first LBP features of each face image; extracting DCNN features from each face image in the first image set by adopting a deep convolutional neural network (DCNN) to obtain first DCNN features of each face image; fusing the first LBP features and the first DCNN features of each face image to obtain first image features of each face image, and collecting the first image features of all face images in the first image set to obtain a first image feature set;
extracting LBP features from each face image in the second image set to obtain second LBP features of each face image; extracting DCNN features from each face image in the second image set by adopting a deep convolutional neural network to obtain second DCNN features of each face image; fusing the second LBP features and the second DCNN features of each face image to obtain second image features of each face image, and collecting the second image features of all face images in the second image set to obtain a second image feature set;
step 4, taking the first image feature set as input and the first age label set as output, training an XGBoost regression model, and obtaining a first age estimation model;
and taking the second image feature set as input and the second age label set as output, training the XGBoost regression model, and obtaining a second age estimation model.
2. The method for constructing an age estimation model based on facial images according to claim 1, wherein the dividing of all facial images in the original facial image set into two groups specifically comprises:
step 21, taking a plurality of face images from the original face image set as a grouped image set, wherein each face image in the grouped image set corresponds to a respective skin color label, the skin color labels comprise black skin [0] and non-black skin [1], and collecting the skin color labels of all the face images in the grouped image set to obtain a skin color label set;
step 22, extracting the color feature and the LBP feature of each facial image in the grouped image set, fusing the color feature and the LBP feature of each facial image to obtain the skin color feature of each facial image, and collecting the skin color features of all the facial images in the grouped image set to obtain a skin color feature set;
step 23, taking the skin color feature set as input and the skin color label set as output, training an XGBoost classification model, and obtaining a face classification model;
and step 24, grouping the original face image set processed in step 22 by using the face classification model obtained in step 23, to obtain black-skin face images and non-black-skin face images.
3. The method for constructing an age estimation model based on facial images as claimed in claim 2, wherein the loss function in the XGBoost classification model is the logistic loss.
4. The method of claim 1, wherein an Inception-v3 network is adopted to extract the DCNN features.
5. The age estimation model construction method based on facial images as claimed in claim 4, wherein when the Inception-v3 network is adopted to extract the DCNN features, the top global average pooling layer in the Inception-v3 network is replaced by a global max pooling layer, a plurality of Gaussian-initialized fully-connected layers are added behind the global max pooling layer, the label distribution of the KL loss layer is replaced by the deep label distribution, and the output of the last of the Gaussian-initialized fully-connected layers is taken as the DCNN features.
6. The method for constructing an age estimation model based on facial images as claimed in claim 5, wherein two Gaussian-initialized fully-connected layers are added behind the global max pooling layer; the weights of the global max pooling layer in the Inception-v3 network and of the two added Gaussian-initialized fully-connected layers are updated by a back-propagation fine-tuning method to obtain the output of the second fully-connected layer; and the learning rate is set by an exponential decay method, wherein the initial learning rate is 0.1, the minimum step size is 32, and the number of iterations is 100.
7. The method of claim 6, wherein the Gaussian distribution used to initialize the fully-connected layers has a mean of 0 and a variance of 0.01.
8. The method for constructing an age estimation model based on facial images as claimed in claim 1, wherein the loss function in the XGBoost regression model is the squared loss.
9. An age estimation method based on a face image, characterized in that the image to be estimated is processed by adopting step 1 of claim 1 to obtain a face image to be estimated, and age estimation is performed on the face image to be estimated, the method comprising:
step A, adopting the face classification model of claim 2 or 3 to classify the face image to be estimated processed in step 22; if it is a black-skin face image, executing step B; if it is a non-black-skin face image, executing step C;
step B, adopting the first age estimation model of any one of claims 1 to 8 to perform age estimation on the black-skin face image processed in step 3;
and step C, adopting the second age estimation model of any one of claims 1 to 8 to perform age estimation on the non-black-skin face image processed in step 3.
CN201810563826.7A 2018-06-04 2018-06-04 Age estimation model construction method and estimation method based on face image Active CN109002755B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810563826.7A CN109002755B (en) 2018-06-04 2018-06-04 Age estimation model construction method and estimation method based on face image


Publications (2)

Publication Number Publication Date
CN109002755A CN109002755A (en) 2018-12-14
CN109002755B true CN109002755B (en) 2020-09-01

Family

ID=64574206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810563826.7A Active CN109002755B (en) 2018-06-04 2018-06-04 Age estimation model construction method and estimation method based on face image

Country Status (1)

Country Link
CN (1) CN109002755B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766810B (en) * 2018-12-31 2023-02-28 陕西师范大学 Face recognition classification method based on collaborative representation, pooling and fusion
CN109815864B (en) * 2019-01-11 2021-01-01 浙江工业大学 Facial image age identification method based on transfer learning
CN110069994B (en) * 2019-03-18 2021-03-23 中国科学院自动化研究所 Face attribute recognition system and method based on face multiple regions
CN110084134A (en) * 2019-04-03 2019-08-02 东华大学 A kind of face attendance checking system based on cascade neural network and Fusion Features
CN110222215B (en) * 2019-05-31 2021-05-04 浙江大学 Crop pest detection method based on F-SSD-IV3
CN111539911B (en) * 2020-03-23 2021-09-28 中国科学院自动化研究所 Mouth breathing face recognition method, device and storage medium
CN112102314B (en) * 2020-11-02 2021-03-09 成都考拉悠然科技有限公司 Computing method for judging quality of face image based on uncertainty
CN113095300A (en) * 2021-05-13 2021-07-09 华南理工大学 Age prediction method and system fusing race information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504376A (en) * 2014-12-22 2015-04-08 厦门美图之家科技有限公司 Age classification method and system for face images
CN105095833A (en) * 2014-05-08 2015-11-25 中国科学院声学研究所 Network constructing method for human face identification, identification method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5239126B2 (en) * 2006-04-11 2013-07-17 株式会社ニコン Electronic camera


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A study of convolutional sparse feature learning for human age estimate; Xiaolong Wang et al.; 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition; 20171231; pp. 566-572 *
Research on age estimation techniques for face images; Wang Xianmei et al.; Journal of Image and Graphics; 20120616; pp. 603-618 *

Also Published As

Publication number Publication date
CN109002755A (en) 2018-12-14

Similar Documents

Publication Publication Date Title
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN109685115B (en) Fine-grained conceptual model with bilinear feature fusion and learning method
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
Hridayami et al. Fish species recognition using VGG16 deep convolutional neural network
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN107564025B (en) Electric power equipment infrared image semantic segmentation method based on deep neural network
CN111079639B (en) Method, device, equipment and storage medium for constructing garbage image classification model
Kae et al. Augmenting CRFs with Boltzmann machine shape priors for image labeling
US20190228268A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN110321967B (en) Image classification improvement method based on convolutional neural network
CN112966691B (en) Multi-scale text detection method and device based on semantic segmentation and electronic equipment
US8379994B2 (en) Digital image analysis utilizing multiple human labels
US20140270489A1 (en) Learned mid-level representation for contour and object detection
CN110276248B (en) Facial expression recognition method based on sample weight distribution and deep learning
Taheri et al. Animal classification using facial images with score‐level fusion
WO2014205231A1 (en) Deep learning framework for generic object detection
CN104915972A (en) Image processing apparatus, image processing method and program
CN112364791B (en) Pedestrian re-identification method and system based on generation of confrontation network
CN106408037A (en) Image recognition method and apparatus
CN109063626A (en) Dynamic human face recognition methods and device
CN110135435B (en) Saliency detection method and device based on breadth learning system
CN111310820A (en) Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration
CN104598898A (en) Aerially photographed image quick recognizing system and aerially photographed image quick recognizing method based on multi-task topology learning
CN112836755B (en) Sample image generation method and system based on deep learning
Yeh et al. Intelligent mango fruit grade classification using alexnet-spp with mask r-cnn-based segmentation algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant