CN109034078B

CN109034078B - Training method of age identification model, age identification method and related equipment

Info

Publication number: CN109034078B
Application number: CN201810864567.1A
Authority: CN
Inventors: 葛彦昊; 李季檩; 汪铖杰; 黄飞跃; 吴永坚; 黄小明; 梁小龙; 晏轶超; 贺珂珂
Original assignee: Tencent Technology Shenzhen Co Ltd; Tencent Cloud Computing Beijing Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd; Tencent Cloud Computing Beijing Co Ltd
Priority date: 2018-08-01
Filing date: 2018-08-01
Publication date: 2023-07-14
Anticipated expiration: 2038-08-01
Also published as: CN109034078A

Abstract

The application relates to a training method of an age identification model, comprising: obtaining a training image set containing human faces; taking the training images in the training image set as input of an age identification model and obtaining the predicted age value, output by the model, for the face in each training image; calculating a statistical predicted age for each marked age value from the predicted age values of the faces in the target training images that share that marked age value; calculating an age statistical error value from the statistical predicted ages and the corresponding marked age values; and adjusting parameters in the age identification model according to the age statistical error value until a convergence condition is met, thereby obtaining the target age identification model. The training method improves the accuracy of age identification. In addition, a training device for the age identification model, an age identification method, an age identification device, computer equipment, and a storage medium are also provided.

Description

Training method of age identification model, age identification method and related equipment

Technical Field

The present disclosure relates to the field of computer processing technologies, and in particular, to a training method for an age identification model, an age identification method, and related devices.

Background

Age identification refers to identifying the age of a person from a face image. Conventional age identification models perform well only in scenes with a small age span, and identify certain age groups poorly in scenes with a large age span (such as monitoring scenes).

Disclosure of Invention

In view of the above, it is necessary to provide a training method, apparatus, computer device, storage medium, age identification method, apparatus, computer device, and storage medium for an age identification model having high accuracy in identifying each age group.

A method of training an age identification model, the method comprising:

acquiring a training image set containing human faces, wherein the human faces in each training image in the training image set have corresponding labeling age values;

taking training images in the training image set as input of an age identification model, and acquiring predicted age values corresponding to faces in all training images output by the age identification model;

according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, calculating to obtain statistical predicted ages corresponding to the marked age values;

calculating an age statistical error value according to the statistical predicted age and the corresponding marked age value;

and adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, so as to obtain a target age identification model.

A training device for an age identification model, the device comprising:

the image set acquisition module is used for acquiring a training image set containing human faces, wherein the human faces in each training image in the training image set have corresponding labeling age values;

the training input/output module is used for taking the training images in the training image set as the input of the age identification model and obtaining the predicted age value corresponding to the face in each training image output by the age identification model;

the first calculation module is used for calculating and obtaining statistical predicted ages corresponding to the marked age values according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value;

the second calculation module is used for calculating and obtaining an age statistics error value according to the statistics prediction age and the corresponding marked age value;

and the adjustment module is used for adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, so as to obtain a target age identification model.

A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:

A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:

According to the training method, the training device, the computer equipment and the storage medium of the age identification model, the statistical predicted ages corresponding to the marked age values are calculated according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, then the statistical predicted ages and the corresponding marked age values are adopted to calculate the age statistical error values, and then parameters in the age identification model are adjusted according to the age statistical error values until convergence conditions are met, so that the target age identification model is obtained. By training the age identification model by adopting the innovative age statistics error value, the obtained target age identification model can obtain identification effects with higher accuracy on faces in all age groups even in an uncontrollable scene of the age groups.

A method of age identification, the method comprising:

acquiring an image to be identified containing a human face;

taking the image to be identified as input of an age identification model, wherein the age identification model adopts an age statistics error value as an error measurement standard, and the age statistics error value is obtained by carrying out statistics calculation according to a marked age value and predicted age values corresponding to faces in a plurality of training images corresponding to the same marked age value;

and obtaining an age value corresponding to the face in the image to be identified, which is output by the age identification model.

An age identification device, the device comprising:

the image to be identified acquisition module is used for acquiring an image to be identified containing a human face;

the input module is used for taking the image to be identified as the input of an age identification model, the age identification model adopts an age statistics error value as an error measurement standard, and the age statistics error value is obtained by carrying out statistics calculation according to a marked age value and predicted age values corresponding to faces in a plurality of training images corresponding to the same marked age value;

and the output module is used for acquiring the age value corresponding to the face in the image to be recognized, which is output by the trained age recognition model.


According to the age identification method, the age identification device, the computer equipment and the storage medium, an image to be identified containing a face is obtained and used as the input of the age identification model, and the age corresponding to the face in the image to be identified, as output by the age identification model, is then obtained. The age identification model is trained using an age statistical error value as the error measurement standard, where the age statistical error value is obtained by statistical calculation from a marked age value and the predicted age values corresponding to the faces in a plurality of training images that share that marked age value. Training with this age statistical error value improves the prediction accuracy of the age identification model for each age group, so that faces in every age group can be identified with higher accuracy even in scenes where the age group cannot be controlled.

Drawings

FIG. 1 is a diagram of an application environment for an age identification method in one embodiment;

FIG. 2 is a flowchart of a training method for an age identification model in one embodiment;

FIG. 3 is a schematic diagram of an image before and after random erasing in one embodiment;

FIG. 4 is a schematic diagram of an image before and after random blurring in one embodiment;

FIG. 5 is a schematic diagram of an image before and after super-resolution processing in one embodiment;

FIG. 6 is a schematic diagram of face images at different angles in one embodiment;

FIG. 7 is a flow diagram of a method of age identification in one embodiment;

FIG. 8 is a diagram of identifying a face age in one embodiment;

FIG. 9A is a schematic diagram of face recognition in a monitoring scenario in one embodiment;

FIG. 9B is a schematic diagram of multiple face recognition effects in a monitoring scenario in one embodiment;

FIG. 10 is a schematic diagram of a structure of an age identification model in one embodiment;

FIG. 11 is a diagram of an architecture for an age identification model application in one embodiment;

FIG. 12 is a diagram of a statistically derived age distribution in one embodiment;

FIG. 13 is a flow chart of a method of age identification in another embodiment;

FIG. 14 is a block diagram of a training apparatus for an age identification model in one embodiment;

FIG. 15 is a block diagram showing a training apparatus of an age identification model in another embodiment;

FIG. 16 is a block diagram showing a structure of an age identifying device in one embodiment;

FIG. 17 is a block diagram showing the structure of an age identifying device according to another embodiment;

FIG. 18 is a block diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

FIG. 1 is a diagram of an application environment for a training method of an age identification model in one embodiment. Referring to FIG. 1, the training method of the age identification model is applied to a training system of the age identification model. The training system includes a terminal 110 and a server 120, which are connected through a network. The terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 120 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers. The terminal 110 is configured to upload a training image set containing faces to the server 120. The server 120 is configured to: obtain the training image set containing faces, where the face in each training image in the training image set has a corresponding marked age value; take the training images in the training image set as input of an age identification model and obtain the predicted age value, output by the model, for the face in each training image; calculate a statistical predicted age for each marked age value from the predicted age values of the faces in the target training images that share that marked age value; calculate an age statistical error value from the statistical predicted ages and the corresponding marked age values; and adjust parameters in the age identification model according to the age statistical error value until a convergence condition is met, thereby obtaining the target age identification model.

In another embodiment, the above training method of the age identification model may be applied directly to the terminal 110. The terminal 110 is configured to: obtain a training image set containing faces, where the face in each training image in the training image set has a corresponding marked age value; take the training images in the training image set as input of an age identification model and obtain the predicted age value, output by the model, for the face in each training image; calculate a statistical predicted age for each marked age value from the predicted age values of the faces in the target training images that share that marked age value; calculate an age statistical error value from the statistical predicted ages and the corresponding marked age values; and adjust parameters in the age identification model according to the age statistical error value until a convergence condition is met, thereby obtaining the target age identification model.

As shown in fig. 2, in one embodiment, a training method of an age identification model is provided, which can be applied to a terminal or a server, and this embodiment is exemplified by the application to the terminal. The training method of the age identification model specifically comprises the following steps:

Step S202, a training image set containing faces is obtained, and corresponding labeling age values exist in the faces in each training image in the training image set.

The training image set refers to the set of images required for training the model. To train the age identification model, training images that contain human faces are needed. Each face contained in a training image in the training image set has a corresponding age mark, that is, a training label.

Step S204, taking the training images in the training image set as the input of the age identification model, and obtaining the predicted age value corresponding to the face in each training image output by the age identification model.

The predicted age value is an age value predicted by the age recognition model. The training image is used as the input of the age identification model, and then the predicted age value which is output by the age identification model and corresponds to the face in the training image is obtained, so that the parameters in the age identification model can be conveniently adjusted according to the difference between the predicted age value and the corresponding marked age value, and the age identification model is optimized towards a more accurate direction.

Step S206, according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, calculating to obtain the statistical predicted ages corresponding to the marked age values.

The marked age value refers to the age value that the age identification model is expected to output. The face in each training image has a corresponding marked age value. Training images corresponding to the same marked age value are taken as a set, and the predicted age value corresponding to each training image in the set is then obtained. For the purpose of distinction, training images corresponding to the same marked age value are referred to as "target training images".

After the predicted age values corresponding to the faces in the target training images that share the same marked age value are obtained, the statistical predicted age corresponding to that marked age value is calculated from them. In one embodiment, the statistical predicted age is the average of the predicted age values corresponding to the target training images. In another embodiment, the statistical predicted age is the median of the predicted age values corresponding to the target training images. For example, assume that 5 target training images correspond to the same marked age value (e.g., 25 years old), and that their predicted age values are 19, 24, 26, 28, and 32 years. If the average is used, the statistical predicted age is (19+24+26+28+32)/5 = 25.8 years. If the median is used, that is, the value in the middle of the sorted list, the statistical predicted age is 26 years; if there are two middle values, their average is taken.
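The worked example above can be checked with a few lines of Python. This is only an illustrative sketch using the standard `statistics` module; the variable names are not from the original.

```python
from statistics import mean, median

# Predicted ages for five target training images that share the marked age 25
# (values taken from the example in the text).
predicted_ages = [19, 24, 26, 28, 32]

mean_age = mean(predicted_ages)      # (19 + 24 + 26 + 28 + 32) / 5
median_age = median(predicted_ages)  # middle value of the sorted list

print(mean_age, median_age)
```

With an even number of predictions, `median` returns the average of the two middle values, which matches the rule stated in the text.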

Step S208, calculating to obtain an age statistic error value according to the statistic predicted age and the corresponding marked age value.

The age statistical error value is calculated according to the statistical predicted age and the corresponding marked age value, and is a standard for measuring the error between the statistical predicted age and the corresponding marked age value.

And S210, adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, and obtaining the target age identification model.

After the age statistical error value is calculated, parameters in the age identification model are adjusted according to it. The adjusted age identification model is then trained further by repeating the above steps until the convergence condition is met, yielding the target age identification model, that is, the trained age identification model. In one embodiment, the convergence condition is a preset threshold on the age statistical error value, and training of the age identification model is complete when the calculated age statistical error value is smaller than the preset threshold.
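The adjust-and-repeat loop with a threshold-based convergence condition can be sketched as follows. Everything concrete here is an assumption made for illustration: the one-parameter toy model (predicted age = `w * feature`), the sample data, the learning rate, the threshold, the use of a finite-difference gradient in place of backpropagation, and the choice of the absolute difference as the error. The text does not specify any of these.

```python
from collections import defaultdict

def statistical_error(w, samples):
    """Mean over distinct marked ages of |mean predicted age - marked age| (assumed form)."""
    groups = defaultdict(list)
    for feature, marked_age in samples:
        groups[marked_age].append(w * feature)  # toy model: predicted age = w * feature
    errors = [abs(sum(v) / len(v) - gt) for gt, v in groups.items()]
    return sum(errors) / len(errors)

# Hypothetical (feature, marked age) pairs; a real system would use face images.
samples = [(0.08, 10), (0.12, 10), (0.28, 30), (0.30, 30), (0.32, 30), (0.50, 50)]

w, learning_rate, threshold, eps = 50.0, 10.0, 0.5, 1e-3
for step in range(1000):
    loss = statistical_error(w, samples)
    if loss < threshold:  # convergence condition: error below the preset threshold
        break
    # A central finite difference stands in for backpropagation in this toy example.
    grad = (statistical_error(w + eps, samples) - statistical_error(w - eps, samples)) / (2 * eps)
    w -= learning_rate * grad

print(step, w, loss)
```

The loop stops as soon as the error drops below the threshold rather than after a fixed number of rounds, which mirrors the convergence condition described above.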

In one embodiment, the age statistical error function may be expressed as:

Here Gt represents a marked age, Val represents a predicted age, and Mean(Val) represents the statistical predicted age corresponding to the marked age, calculated from the plurality of predicted ages that correspond to the same marked age. N represents the number of distinct marked ages entering the model in the current iteration round; for example, N = 60 if the current iteration round contains 60 different marked ages. Using the age statistical error value as the error measurement standard addresses the low prediction accuracy caused by an unbalanced age distribution in the training samples, thereby improving the accuracy of age identification.
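One way to compute an error of this kind from the quantities defined above (Gt, Val, Mean(Val), N) is sketched below. The absolute difference between Mean(Val) and Gt is an assumption, since the text names the terms but not how they are combined; the function name and sample values are likewise illustrative.

```python
from collections import defaultdict
from statistics import mean

def age_statistical_error(marked_ages, predicted_ages):
    """Average, over distinct marked ages, of |Mean(Val) - Gt| (assumed form)."""
    groups = defaultdict(list)
    for gt, val in zip(marked_ages, predicted_ages):
        groups[gt].append(val)
    # N is the number of distinct marked ages in the batch.
    errors = [abs(mean(vals) - gt) for gt, vals in groups.items()]
    return sum(errors) / len(errors)

marked = [25, 25, 25, 25, 25, 60, 60]
predicted = [19, 24, 26, 28, 32, 55, 59]
loss = age_statistical_error(marked, predicted)
print(loss)
```

Because each marked age contributes one term regardless of how many images carry it, a marked age with five images weighs the same as one with two, which is how this kind of error counteracts an unbalanced age distribution.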

According to the training method of the age identification model, the statistical predicted ages corresponding to the marked age values are obtained through calculation according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, then the statistical predicted ages and the corresponding marked age values are adopted to obtain the age statistical error values through calculation, and then parameters in the age identification model are adjusted according to the age statistical error values until convergence conditions are met, so that the target age identification model is obtained. By training the age identification model by adopting the innovative age statistics error value, the obtained target age identification model can obtain identification effects with higher accuracy on faces in all age groups even in an uncontrollable scene of the age groups.

In one embodiment, after acquiring the training image set including the face, the method further includes: performing enhancement processing on the training images in the training image set to obtain the training images after the enhancement processing, wherein the enhancement processing comprises: at least one of random erasure, random blurring, super resolution processing; taking the training images in the training image set as the input of the age identification model, comprising: and taking the training image after the enhancement processing as the input of the age identification model.

To improve the generalization of the age identification model, data enhancement is applied to the training images in the training image set. The enhancement processing includes at least one of random erasure, random blurring, and super-resolution processing. Random erasure means erasing a randomly chosen region of a training image, so that the age of the contained face can be identified accurately even when the face is partially occluded. Random blurring means blurring a training image, so that the age of the contained face can be identified accurately even when the image is blurred. Super-resolution processing means applying super-resolution to a training image to obtain a clearer image, which is then also used as a training image to help improve the generalization of the age identification model.

The diversity of the training images can be increased by the enhancement process. In the training process, the training images before the enhancement processing and the training images after the enhancement processing are used as training samples of the training model, so that the accuracy of the age identification model obtained through training is improved. By introducing the enhancement processing mode, the accuracy of face age identification can be improved under the conditions of uncontrollable face illumination conditions, uncontrollable face far-near blurring degree and uncontrollable face shielding conditions.

In one embodiment, the enhancing the training images in the training image set to obtain the enhanced training images includes: when the enhancement processing comprises random erasure, randomly selecting an erasure area from the training image, and carrying out random assignment on pixels of the erasure area to obtain the training image after the random erasure processing; and/or when the enhancement processing comprises random blurring, randomly selecting a direction to carry out convolution operation on the training image, so as to obtain a training image after the random blurring processing; and/or when the enhancement processing comprises super-resolution processing, performing super-resolution processing on the training image through the image super-resolution model to obtain a super-resolution processed training image.

The enhancement processing includes at least one of random erasure, random blurring, and super-resolution processing. Random erasure randomly selects an erasure area in the training image and assigns random values to the pixels of that area, yielding the randomly erased training image. FIG. 3 is a schematic diagram of an image before and after random erasing in one embodiment. Random blurring randomly selects a direction and performs a convolution operation on the training image, yielding the randomly blurred training image. FIG. 4 is a schematic diagram of an image before and after random blurring in one embodiment. Super-resolution processing applies a trained super-resolution model to the training image to obtain a clearer, super-resolved image. FIG. 5 is a schematic diagram of an image before and after super-resolution processing in one embodiment.
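The random erasure and random blurring operations can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the image is a grayscale grid stored as nested lists, the erased rectangle covers at most half of each dimension, the blur is a length-3 averaging kernel along one of four directions, and border pixels average only over in-bounds neighbours. None of these specifics come from the text, and a practical implementation would operate on image arrays.

```python
import random

def random_erase(image, rng, max_fraction=0.5):
    """Return a copy of `image` with a random rectangle filled with random pixel values."""
    height, width = len(image), len(image[0])
    erase_h = rng.randint(1, max(1, int(height * max_fraction)))
    erase_w = rng.randint(1, max(1, int(width * max_fraction)))
    top = rng.randint(0, height - erase_h)
    left = rng.randint(0, width - erase_w)
    erased = [row[:] for row in image]
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            erased[r][c] = rng.randint(0, 255)
    return erased

def random_directional_blur(image, rng, length=3):
    """Convolve `image` with a line-shaped averaging kernel in a random direction."""
    height, width = len(image), len(image[0])
    # Horizontal, vertical, and the two diagonal directions.
    dr, dc = rng.choice([(0, 1), (1, 0), (1, 1), (1, -1)])
    half = length // 2
    blurred = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total, count = 0.0, 0
            for k in range(-half, half + 1):
                rr, cc = r + k * dr, c + k * dc
                if 0 <= rr < height and 0 <= cc < width:
                    total += image[rr][cc]
                    count += 1
            blurred[r][c] = total / count
    return blurred

rng = random.Random(0)
image = [[r * 8 + c for c in range(8)] for r in range(8)]
erased = random_erase(image, rng)
blurred = random_directional_blur(image, rng)
```

Both functions return new images and leave the input untouched, so the original and the enhanced images can both be kept as training samples.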

In one embodiment, the training image set including faces includes a plurality of angled face images belonging to a plurality of different age groups.

In some scenes, for example a monitoring scene, the face angle cannot be controlled. To enable age identification of a face at any angle, the training image set includes face images at multiple angles; FIG. 6 is a schematic diagram of face images at different angles in the training image set in one embodiment. Likewise, to identify faces of all ages accurately, the training image set needs to include training images from every age group.

In one embodiment, obtaining the statistical predicted age corresponding to the marked age value according to the predicted age values corresponding to the faces in the target training images includes: averaging the predicted age values corresponding to the faces in the target training images to obtain an average predicted age, and taking the average predicted age as the statistical predicted age corresponding to the marked age value.

The statistical predicted age is an average predicted age, obtained by averaging the predicted ages of the target training images that correspond to the same marked age value. Using the average predicted age addresses the low prediction accuracy caused by an unbalanced age distribution in the training samples, and improves the prediction accuracy of the age identification model for each age group.

In one embodiment, calculating an age statistical error value according to the statistical predicted age and the corresponding marked age value includes: obtaining an error adjustment coefficient; and calculating the age statistical error value according to the error adjustment coefficient, the statistical predicted age and the corresponding marked age value.

A person's appearance changes noticeably with age when the person is young and less noticeably when the person is older, so the prediction error tends to be larger at older ages. The error adjustment coefficient is used to adjust the age statistical error values of different age groups, so that a more accurate prediction can be obtained for each age group. In one embodiment, the error adjustment coefficient is inversely related to the marked age value, that is, the larger the marked age value, the smaller the corresponding error adjustment coefficient. The final age statistical error value is calculated from the error adjustment coefficient, the statistical predicted age and the corresponding marked age value.

In one embodiment, obtaining the error adjustment coefficient includes: calculating an error adjustment control parameter according to the statistical predicted age and the corresponding marked age value; and calculating the error adjustment coefficient according to the error adjustment control parameter, wherein the error adjustment coefficient is inversely related to the error adjustment control parameter.

The error adjustment control parameter is used to control the magnitude of the error adjustment coefficient, that is, the amplitude by which the age statistical error value is adjusted. The error adjustment control parameter depends on the statistical predicted age and the corresponding marked age value. In one embodiment, the error adjustment control parameter is the larger of the statistical predicted age and the corresponding marked age value, that is, it may be expressed as max(Mean(Val), Gt), where Gt represents the marked age value and Mean(Val) is the statistical predicted age corresponding to that marked age value.

After the error adjustment control parameter is calculated, an error adjustment function is obtained, and the error adjustment coefficient is calculated by substituting the error adjustment control parameter into the error adjustment function. The error adjustment coefficient is inversely related to the error adjustment control parameter. In one embodiment, the relationship between the error adjustment coefficient and the error adjustment control parameter may be expressed by the following formula: f(x) = 2K/(x + K), where x is the error adjustment control parameter and K is a constant greater than x.

In one embodiment, a predicted age interval corresponding to the age identification model is obtained, the maximum value in the predicted age interval is obtained, and the error adjustment coefficient is calculated according to the maximum value of the predicted age interval and the error adjustment control parameter.

The predicted age interval is the range of ages that the age identification model can predict. For example, if the model can predict ages from 0 to 70 years, the maximum value of the predicted age interval is 70. The error adjustment coefficient is then calculated from this maximum value and the error adjustment control parameter, using the formula f(x) = 2K/(x + K), where K is the maximum value of the predicted age interval and x is the error adjustment control parameter.
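The coefficient f(x) = 2K/(x + K) can be evaluated directly. The sketch below assumes K = 70, as in the 0-70 example above, and takes x = max(Mean(Val), Gt) as described for the error adjustment control parameter; the function and argument names are illustrative.

```python
def error_adjustment_coefficient(statistical_age, marked_age, max_age=70):
    """f(x) = 2K / (x + K) with x = max(Mean(Val), Gt) and K = max_age."""
    x = max(statistical_age, marked_age)
    return 2 * max_age / (x + max_age)

for stat_age, marked_age in [(0, 0), (25.8, 25), (57, 60), (70, 70)]:
    print(marked_age, error_adjustment_coefficient(stat_age, marked_age))
```

The coefficient falls from 2 at x = 0 to 1 at x = K, so errors at younger ages are weighted up to twice as heavily as errors at the top of the age interval.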

In one embodiment, the age statistic error value is calculated using the following formula:

where f(x) = 2K/(max(Mean(Val), Gt) + K). Here f(x) represents the error adjustment coefficient, K represents the maximum value of the preset age interval, max(Mean(Val), Gt) is the error adjustment control parameter, and Gt represents the marked age value.
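For orientation, one form of the weighted error that is consistent with the symbols defined above is written out below. It is a sketch under assumptions rather than the formula of this embodiment: the use of the absolute difference and the averaging over the N distinct marked ages are both assumed, and the index i over marked ages is introduced here for illustration.

```latex
\mathrm{Loss} = \frac{1}{N}\sum_{i=1}^{N} f(x_i)\,\bigl|\mathrm{Mean}(\mathrm{Val}_i) - \mathrm{Gt}_i\bigr|,
\qquad
f(x_i) = \frac{2K}{x_i + K},
\quad
x_i = \max\bigl(\mathrm{Mean}(\mathrm{Val}_i),\,\mathrm{Gt}_i\bigr)
```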

As shown in FIG. 7, in one embodiment, an age identification method is provided. The method can be applied to a terminal or a server; this embodiment is described as applied to a terminal. The age identification method specifically comprises the following steps:

Step S702, an image to be recognized including a face is obtained.

The image to be identified is an image containing a face whose age is to be identified. It may be an image containing a face captured in real time by the terminal's camera, or a stored image containing a face. The face in the image to be identified may be at any angle, for example a side face or a front face. The image to be identified may contain one face or several faces. If it contains several faces, the age corresponding to each face in the image needs to be identified subsequently.

Step S704, the image to be identified is used as input of an age identification model, the age identification model adopts an age statistics error value as an error measurement standard, and the age statistics error value is obtained by carrying out statistics calculation according to the marked age value and the predicted age value corresponding to the face in the plurality of training images corresponding to the same marked age value.

The age identification model is used for identifying the age value corresponding to the face contained in the image to be identified. The age statistical error value is calculated according to a preset age statistical error function, which takes the labeled age value and the corresponding statistical predicted age as variables. The obtained labeled age value and the corresponding statistical predicted age are substituted into the age statistical error function to calculate the corresponding age statistical error value.

The labeled age value refers to the age value with which the face in a training image is labeled, and can be understood as the actual age corresponding to the face, namely the age value expected to be output. In one embodiment, the labeled age value is a visual age based on subjective visual judgment. When the actual age of the face in a training image cannot be obtained (for example, in a monitoring scene), a professional estimates the age by visual inspection, and the image is labeled with that value. The statistical predicted age is obtained by performing statistics on the predicted age values corresponding to a plurality of training images that share the same labeled age value, and the predicted age values are obtained by inputting the training images into the age identification model.

In one embodiment, the statistical predicted age may be obtained as a median, for example by sorting the predicted ages corresponding to the same labeled age value by size and then selecting the middle predicted age value as the statistical predicted age. In another embodiment, the statistical predicted age is calculated as a mean, that is, the plurality of predicted ages are averaged and the resulting mean value is taken as the statistical predicted age.

For example, assume that the labeled age values corresponding to the faces in a plurality of training images are all 25 years. The plurality of training images are respectively used as inputs of the age identification model to obtain the predicted age value corresponding to the face in each training image, and the statistical predicted age is then calculated from the predicted age values corresponding to the plurality of training images.
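The grouping described above can be sketched as follows. The function name, the `method` argument, and the sample values are hypothetical and serve only to illustrate the median and mean variants.

```python
from collections import defaultdict
from statistics import mean, median

def statistical_predicted_ages(labeled_ages, predicted_ages, method="mean"):
    """Group predictions by labeled age and reduce each group to one value."""
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    groups = defaultdict(list)
    for label, pred in zip(labeled_ages, predicted_ages):
        groups[label].append(pred)
    reduce_fn = mean if method == "mean" else median
    return {label: reduce_fn(preds) for label, preds in groups.items()}

# Three images labeled 25 years and two labeled 40 years (made-up predictions).
labels = [25, 25, 25, 40, 40]
preds = [23.0, 24.0, 31.0, 38.0, 44.0]
by_mean = statistical_predicted_ages(labels, preds, "mean")
by_median = statistical_predicted_ages(labels, preds, "median")
```

For the images labeled 25 years, the mean variant gives 26.0 and the median variant gives 24.0, so the two embodiments can yield different statistical predicted ages for the same predictions.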

The age statistical error value is calculated from the statistical predicted age and the labeled age value, and the parameters in the model are then adjusted and trained according to the age statistical error value, so that an age identification model with high identification accuracy for every age range can still be obtained when the age distribution of the samples is unbalanced. Because the age distribution of the training samples is uncontrollable in special scenes, an unbalanced age distribution causes the ages predicted by the model to lean toward the age ranges that contain more samples. In this embodiment, the age statistical error value is calculated from the statistical predicted age and the labeled age value, which effectively alleviates the low prediction accuracy caused by the unbalanced age distribution.

Step S706, obtaining an age value corresponding to the face in the image to be identified, which is output by the age identification model.

The image to be recognized containing the face is used as input of a trained age recognition model, and the output predicted age value corresponding to the face in the image to be recognized can be obtained. Fig. 8 is a schematic diagram of an age corresponding to the identified face in one embodiment.

According to the age identification method, the image to be identified containing the face is obtained and is used as the input of the trained age identification model, and then the age value corresponding to the face in the image to be identified, which is output by the trained age identification model, is obtained. The age identification model is obtained by training an age statistical error value serving as an error measurement standard, the age statistical error value is obtained by calculating according to a marked age value and corresponding statistical predicted ages, and the statistical predicted ages are obtained by carrying out statistical calculation according to the predicted ages output by the age identification model corresponding to a plurality of training images corresponding to the same marked age value. By training the age identification model by adopting the innovative age statistics error value as an error measurement standard, the prediction accuracy of the age identification model on each age group can be improved, and the identification effect with higher accuracy can be obtained on the face of each age group under the special scene of uncontrollable age groups.

Fig. 9A is a schematic diagram of an embodiment applied in a monitoring scenario. As shown in fig. 9A, a video image in a monitored scene is first acquired, then faces in the video image are detected, a target picture containing the faces is extracted, and for convenience in recognition, only one face is included in the extracted target picture containing the faces. If a plurality of faces exist in the video image, a plurality of target pictures are correspondingly extracted from the video image. Then taking the target picture as the input of the age identification model, and then acquiring the output age of the age identification model. Fig. 9B is a schematic diagram of a recognition result obtained by recognizing ages of a plurality of faces when the plurality of faces exist in a monitored scene in one embodiment.

In one embodiment, the age identification model is obtained by training a convolutional neural network model, and the age identification model comprises a plurality of convolution layers, with a preset number of activation layers and pooling layers between adjacent convolution layers.

The age identification model is obtained by training a convolutional neural network (CNN) model. The age identification model includes a plurality of convolution layers, with a preset number of activation layers and pooling layers between adjacent convolution layers. The convolution layer performs a convolution operation on the image to extract image features. The activation layer performs a nonlinear operation on the image to describe nonlinear features of the image, and is implemented using an activation function such as the ReLU function. The pooling layer reduces the dimension of the image features.

As shown in FIG. 10, in one embodiment, the age identification model includes 17 convolution layers and an output layer, where the first convolution layer is a 7×7 convolution layer and the 2nd to 17th convolution layers are all 3×3 convolution layers. Here 7×7 and 3×3 refer to the size of the convolution kernel of the convolution layer. A ReLU layer (i.e., an activation layer) and a pooling layer are also included between adjacent convolution layers. The pooling kernel of the last pooling layer is 5×5, and the pooling kernels of the other pooling layers are 7×7. In this embodiment, the pooling kernel of the last pooling layer in the original convolutional neural network model is modified to 5×5, which improves the expressive capability of the model to a certain extent while leaving the parameter weights of the pre-trained model essentially unchanged, so that the pre-training process does not need to be repeated. The pre-training process refers to the process of initializing the parameters in the model.
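One way to see why changing only the pooling kernel leaves the pre-trained convolution weights untouched is that a pooling layer (average or max) has no learnable parameters; the kernel size only changes the spatial size of its output. The sketch below assumes a 7×7 final feature map, a stride of 1, and no padding, none of which is stated in the text.

```python
def pooled_size(in_size, kernel, stride=1):
    """Output width/height of a pooling window applied without padding."""
    if kernel > in_size:
        raise ValueError("kernel larger than input")
    return (in_size - kernel) // stride + 1

FEATURE_MAP = 7  # assumed size of the final feature map; not given in the text

original = pooled_size(FEATURE_MAP, kernel=7)  # one value per channel
modified = pooled_size(FEATURE_MAP, kernel=5)  # a 3x3 grid per channel

# Pooling has no weights of its own, so neither setting alters any
# pre-trained convolution parameter.
```

Under these assumptions the 5×5 kernel keeps a 3×3 grid of values per channel instead of a single value, which is one plausible reading of the improved expressive capability; any layer that follows the pooling layer would see a correspondingly larger input.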

In one embodiment, before the image to be identified is taken as input of the trained age identification model, the method further comprises: recognizing the face in the image to be identified to obtain a target face area. Taking the image to be identified as input of the trained age identification model then comprises: taking the target face area corresponding to the image to be identified as the input of the trained age identification model.

The target face area refers to an area where a face is located. The method comprises the steps of carrying out face detection and recognition on an image to be recognized, determining a cutting frame corresponding to a face, cutting according to the cutting frame to obtain a target face area, and taking the target face area as input of a trained age recognition model. Because the acquired image to be recognized possibly further comprises a lot of useless information, before the model is input, the face in the image to be recognized is firstly detected, then the target face area is extracted, and the target face area is used as the input of the trained age recognition model, so that the accuracy of the face age prediction is improved.
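The cropping step can be sketched as follows. The face detector is outside this sketch; the `(top, left, height, width)` box format and the nested-list image representation are assumptions made for illustration.

```python
def crop_face(image, box):
    """Return the sub-image inside box = (top, left, height, width).

    image is a list of pixel rows; box is assumed to come from a face
    detector that is not part of this sketch.
    """
    top, left, height, width = box
    rows = len(image)
    cols = len(image[0]) if rows else 0
    # Clamp the box so a detection that spills over the border still crops.
    bottom = min(rows, top + height)
    right = min(cols, left + width)
    top, left = max(0, top), max(0, left)
    return [row[left:right] for row in image[top:bottom]]

# A 4x6 test image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
face = crop_face(image, (1, 2, 2, 3))
```

The cropped region, rather than the full frame, would then be passed to the age identification model, so background pixels outside the cropping frame do not reach the model.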

As shown in fig. 11, an embodiment for a monitoring scenario includes a front-end module (located in a terminal) and a back-end module (located in a server). The front-end module includes a video acquisition module and a face detection module. The back-end module includes a face analysis module, a statistical analysis module, and a database module. Specifically, the video acquisition module first acquires images of the real scene in real time through a camera and transmits the acquired images to the face detection module. The face detection module performs face detection on the acquired images; when an image is detected to contain a face, the area where the face is located is cropped to obtain a target face image, and the target face image is uploaded to the face analysis module of the server. The face analysis module contains the age identification model and uses it to identify the age of the face in the target face image, obtaining the age value corresponding to the face. The identified ages are then passed to the statistical analysis module, which performs statistics on the ages of the acquired faces to obtain the corresponding age distribution. Fig. 12 is a schematic diagram of the age distribution obtained by statistics in one embodiment. The database module stores the face information and the corresponding ages for subsequent lookup.
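The statistical analysis step can be sketched as a simple histogram over the identified ages. The 10-year bin width and the sample ages are assumptions; the text does not specify how the distribution in fig. 12 is binned.

```python
from collections import Counter

def age_distribution(ages, bin_width=10):
    """Count identified ages per bin, e.g. 0-9, 10-19, ... (bin width assumed)."""
    counts = Counter((int(age) // bin_width) * bin_width for age in ages)
    return {f"{start}-{start + bin_width - 1}": counts[start]
            for start in sorted(counts)}

# Ages as they might arrive from the face analysis module (made-up values).
identified = [23, 27, 31, 35, 38, 64, 8]
distribution = age_distribution(identified)
```

Only bins that contain at least one face appear in the result; a caller that needs empty bins as well could iterate over a fixed range of bin starts instead.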

As shown in fig. 13, in one embodiment, an age identifying method is provided, which includes:

Step S1301, a training image set including faces is obtained, where the faces in each training image in the training image set have corresponding labeled age values.

Step S1302, taking the training images in the training image set as input of the age recognition model, and obtaining the predicted age value corresponding to the face in each training image output by the age recognition model.

Step S1303, according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, calculating to obtain the statistical predicted ages corresponding to the marked age values.

In step S1304, the predicted ages corresponding to the obtained target training images are averaged to obtain an average predicted age, and the average predicted age is used as the statistical predicted age corresponding to the labeled age value.

Step S1305, calculating to obtain an age statistic error value according to the statistic predicted age and the corresponding marked age value.

Step S1306, adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, and obtaining a target age identification model.

Step S1307, acquire an image to be recognized including a face.

Step S1308, taking the image to be identified as input of the age identification model.

Step S1309, obtaining an age value corresponding to the face in the image to be identified, which is output by the age identification model.
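The training part of the steps above (S1302 to S1306) can be sketched with a toy one-parameter model. Everything here is an assumption made for illustration: the model `prediction = weight * feature` stands in for the convolutional network, the squared difference stands in for the age statistical error function (whose exact form is not reproduced here), and a finite-difference gradient stands in for backpropagation.

```python
from collections import defaultdict
from statistics import mean

def statistical_error(weight, samples):
    """S1302-S1305 for a toy model: prediction = weight * feature."""
    groups = defaultdict(list)
    for feature, labeled_age in samples:
        groups[labeled_age].append(weight * feature)      # S1302: predict
    errors = []
    for labeled_age, preds in groups.items():
        stat_age = mean(preds)                            # S1303/S1304: mean per label
        errors.append((stat_age - labeled_age) ** 2)      # S1305: assumed squared form
    return mean(errors)

def train(samples, weight=0.0, lr=1e-3, threshold=1e-6, max_steps=1000, eps=1e-4):
    """S1306: adjust the parameter until the error drops below a preset threshold."""
    for _ in range(max_steps):
        if statistical_error(weight, samples) < threshold:
            break
        # Central finite difference stands in for backpropagation.
        grad = (statistical_error(weight + eps, samples)
                - statistical_error(weight - eps, samples)) / (2 * eps)
        weight -= lr * grad
    return weight, statistical_error(weight, samples)

# (feature, labeled age) pairs; the features are synthetic stand-ins for images.
samples = [(9.0, 20), (11.0, 20), (19.0, 40), (21.0, 40)]
weight, final_error = train(samples)
```

In this toy setup the weight settles near 2, where the individual predictions for the label 20 are about 18 and 22 while their mean matches the label. The error is measured on the per-label statistic rather than on each image separately, which is the property the steps above rely on.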

As shown in fig. 14, in one embodiment, a training apparatus for an age identification model is provided, the apparatus including:

an image set obtaining module 1402, configured to obtain a training image set including faces, where the faces in each training image in the training image set have corresponding labeling age values;

the training input/output module 1404 is configured to take training images in the training image set as input of an age recognition model, and obtain predicted age values corresponding to faces in each training image output by the age recognition model;

a first calculation module 1406, configured to calculate, according to the predicted age values corresponding to the faces in the target training images corresponding to the same labeled age value, a statistical predicted age corresponding to each labeled age value;

a second calculation module 1408, configured to calculate an age statistics error value according to the statistics predicted age and the corresponding labeled age value;

an adjusting module 1410, configured to adjust parameters in the age identification model according to the age statistical error value until a convergence condition is met, thereby obtaining a target age identification model.

As shown in fig. 15, in one embodiment, the training apparatus for the age identification model further includes:

an enhancement processing module 1403, configured to perform enhancement processing on the training images in the training image set, obtaining a training image after enhancement processing, wherein the enhancement processing comprises: at least one of random erasure, random blurring, super resolution processing;

the training input and output module is also used for taking the training image after the enhancement processing as the input of the age identification model.

In one embodiment, the enhancement processing module is further configured to: when the enhancement processing includes random erasure, randomly select an erasure area from the training image and randomly assign values to the pixels of the erasure area to obtain a training image after random erasure processing; and/or when the enhancement processing includes random blurring, randomly select a direction in which to perform a convolution operation on the training image to obtain a training image after random blurring processing; and/or when the enhancement processing includes super-resolution processing, perform super-resolution processing on the training image through an image super-resolution model to obtain a super-resolution processed training image.
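Random erasure as described above can be sketched as follows. The grayscale nested-list image, the `max_fraction` cap on the rectangle size, and the function name are assumptions; the text only states that an area is chosen at random and its pixels are assigned random values.

```python
import random

def random_erase(image, rng, max_fraction=0.5, max_value=255):
    """Return a copy of image with one random rectangle filled with random values.

    image:        list of rows of grayscale pixel values.
    max_fraction: hypothetical cap on the rectangle side relative to the image.
    """
    rows, cols = len(image), len(image[0])
    height = rng.randint(1, max(1, int(rows * max_fraction)))
    width = rng.randint(1, max(1, int(cols * max_fraction)))
    top = rng.randint(0, rows - height)
    left = rng.randint(0, cols - width)
    erased = [row[:] for row in image]  # leave the input image untouched
    for r in range(top, top + height):
        for c in range(left, left + width):
            erased[r][c] = rng.randint(0, max_value)
    return erased, (top, left, height, width)

rng = random.Random(0)  # seeded only to make the sketch repeatable
image = [[128] * 8 for _ in range(8)]
erased, box = random_erase(image, rng)
```

Returning the erased box alongside the image is a convenience for inspection; a training pipeline would normally keep only the image.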

In one embodiment, the training image set including faces includes face images of a plurality of angles, and the face images of the plurality of angles belong to a plurality of different age groups.

In one embodiment, the first calculation module 1406 is further configured to average the predicted age values corresponding to the faces in the obtained target training images to obtain an average predicted age, and use the average predicted age as the statistical predicted age corresponding to the labeled age value.

In one embodiment, the second calculation module 1408 is further configured to obtain an error adjustment coefficient; and calculating according to the error adjustment coefficient, the statistical prediction age and the corresponding marked age value to obtain an age statistical error value.

In one embodiment, the second calculation module 1408 is further configured to calculate an error adjustment control parameter according to the statistical predicted age and the corresponding labeled age value, and to calculate the error adjustment coefficient according to the error adjustment control parameter, wherein the error adjustment coefficient and the error adjustment control parameter are inversely related.

As shown in fig. 16, in one embodiment, an age identifying device is provided, the device including:

a to-be-identified image acquiring module 1602, configured to acquire an image to be identified including a face;

the input module 1604 is configured to take the image to be identified as an input of an age identification model, where the age identification model uses an age statistics error value as an error metric, and the age statistics error value is obtained by performing statistics calculation according to a labeled age value and predicted age values corresponding to faces in a plurality of training images corresponding to the same labeled age value;

and an output module 1606, configured to obtain an age value corresponding to the face in the image to be identified, which is output by the age identification model.

In one embodiment, the age identification model is obtained by training a convolutional neural network model, and the age identification model comprises a plurality of convolution layers, with a preset number of activation layers and pooling layers between adjacent convolution layers.

As shown in fig. 17, in one embodiment, the age identifying device further includes:

the recognition module 1603 is configured to recognize the face in the image to be recognized to obtain a target face area;

the input module is also used for taking the target face area corresponding to the image to be recognized as the input of the trained age recognition model.

FIG. 18 illustrates an internal block diagram of a computer device in one embodiment. The computer device may specifically be a terminal or a server. As shown in fig. 18, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system, and may also store a computer program that, when executed by the processor, causes the processor to implement a training method for an age identification model. The internal memory may also have stored therein a computer program which, when executed by the processor, causes the processor to perform a training method of the age identification model. It will be appreciated by those skilled in the art that the structure shown in fig. 18 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application is applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, the training method of the age identification model provided in the present application may be implemented in the form of a computer program, which may be executed on a computer device as shown in fig. 18. The memory of the computer device may store the program modules that make up the training apparatus for the age identification model, such as the image set obtaining module 1402, the training input/output module 1404, the first calculation module 1406, the second calculation module 1408, and the adjusting module 1410 of fig. 14. The computer program constituted by these program modules causes the processor to execute the steps of the training method of the age identification model of the embodiments described in this specification. For example, the computer device shown in fig. 18 may obtain a training image set including faces through the image set obtaining module 1402 of the training apparatus shown in fig. 14, where the face in each training image in the training image set has a corresponding labeled age value; take the training images in the training image set as input of the age identification model through the training input/output module 1404, and obtain the predicted age value corresponding to the face in each training image output by the age identification model; calculate, through the first calculation module 1406, the statistical predicted age corresponding to each labeled age value according to the predicted age values corresponding to the faces in the target training images that share the same labeled age value; calculate, through the second calculation module 1408, an age statistical error value according to the statistical predicted age and the corresponding labeled age value; and adjust, through the adjusting module 1410, the parameters in the age identification model according to the age statistical error value until a convergence condition is met, so as to obtain a target age identification model.

In one embodiment, a computer device is presented comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of: acquiring a training image set containing human faces, wherein the human faces in each training image in the training image set have corresponding labeling age values; taking training images in the training image set as input of an age identification model, and acquiring predicted age values corresponding to faces in all training images output by the age identification model; according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, calculating to obtain statistical predicted ages corresponding to the marked age values; calculating according to the statistical predicted age and the corresponding marked age value to obtain an age statistical error value; and adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, so as to obtain a target age identification model.

In one embodiment, after the acquiring the training image set including the face, the computer program further causes the processor to perform the steps of: performing enhancement processing on the training images in the training image set to obtain the training images after the enhancement processing, wherein the enhancement processing comprises: at least one of random erasure, random blurring, super resolution processing; the step of using the training images in the training image set as the input of the age identification model comprises the following steps: and taking the training image after the enhancement processing as the input of an age identification model.

In one embodiment, the enhancing the training images in the training image set to obtain the enhanced training images includes: when the enhancement processing comprises random erasure, randomly selecting an erasure area from the training image, and carrying out random assignment on pixels of the erasure area to obtain a training image after the random erasure processing; and/or when the enhancement processing comprises random blurring, randomly selecting a direction to carry out convolution operation on the training image, so as to obtain a training image after the random blurring processing; and/or when the enhancement processing comprises super-resolution processing, performing super-resolution processing on the training image through an image super-resolution model to obtain a super-resolution processed training image.
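The random blurring variant can be sketched as a convolution with a line-shaped averaging kernel along a randomly chosen direction. The four candidate directions, the kernel length, and the border handling are assumptions; the text only states that a direction is selected at random for the convolution operation.

```python
import random

# Unit steps for the candidate blur directions (row offset, column offset).
DIRECTIONS = {"horizontal": (0, 1), "vertical": (1, 0),
              "diagonal": (1, 1), "anti_diagonal": (1, -1)}

def directional_blur(image, direction, length=3):
    """Convolve image with a line-shaped averaging kernel along one direction."""
    dr, dc = DIRECTIONS[direction]
    rows, cols = len(image), len(image[0])
    half = length // 2
    blurred = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total, count = 0.0, 0
            for k in range(-half, half + 1):
                rr, cc = r + k * dr, c + k * dc
                if 0 <= rr < rows and 0 <= cc < cols:  # skip taps outside the image
                    total += image[rr][cc]
                    count += 1
            out_row.append(total / count)
        blurred.append(out_row)
    return blurred

def random_blur(image, rng, length=3):
    """Pick one direction at random and blur along it."""
    direction = rng.choice(sorted(DIRECTIONS))
    return directional_blur(image, direction, length), direction

image = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
horizontal = directional_blur(image, "horizontal")
blurred, chosen = random_blur(image, random.Random(0))
```

Averaging only over the taps that fall inside the image keeps border pixels from darkening; zero padding would be an equally valid choice with slightly different edge values.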

In one embodiment, the obtaining the statistical predicted age corresponding to the labeling age value according to the predicted age value corresponding to the face in each target training image includes: and averaging the obtained predicted age values corresponding to the faces in the target training images to obtain average predicted ages, and taking the average predicted ages as statistical predicted ages corresponding to the marked age values.

In one embodiment, the calculating the age statistical error value according to the statistical prediction age and the corresponding marked age value includes: obtaining an error adjustment coefficient; and calculating according to the error adjustment coefficient, the statistical prediction age and the corresponding marked age value to obtain an age statistical error value.

In one embodiment, the obtaining the error adjustment coefficient includes: calculating according to the statistical predicted age and the corresponding marked age value to obtain an error adjustment control parameter; and calculating the error adjustment coefficient according to the error adjustment control parameter, wherein the error adjustment coefficient and the error adjustment control parameter are inversely related.

In one embodiment, a computer device is presented comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of: acquiring an image to be identified containing a human face; taking the image to be identified as input of an age identification model, wherein the age identification model adopts an age statistics error value as an error measurement standard, and the age statistics error value is obtained by carrying out statistics calculation according to a marked age value and predicted age values corresponding to faces in a plurality of training images corresponding to the same marked age value; and obtaining an age value corresponding to the face in the image to be identified, which is output by the age identification model.

In one embodiment, the age identification model is obtained by training a convolutional neural network model, and the age identification model comprises a plurality of convolution layers, with a preset number of activation layers and pooling layers between adjacent convolution layers.

In one embodiment, before the image to be identified is taken as input of the age identification model, the computer program further causes the processor to perform the following step: identifying the face in the image to be identified to obtain a target face area. Taking the image to be identified as input of the trained age identification model then comprises: taking the target face area corresponding to the image to be identified as the input of the trained age identification model.

In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of: acquiring a training image set containing human faces, wherein the human faces in each training image in the training image set have corresponding labeling age values; taking training images in the training image set as input of an age identification model, and acquiring predicted age values corresponding to faces in all training images output by the age identification model; according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, calculating to obtain statistical predicted ages corresponding to the marked age values; calculating according to the statistical predicted age and the corresponding marked age value to obtain an age statistical error value; and adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, so as to obtain a target age identification model.

In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of: acquiring an image to be identified containing a human face; taking the image to be identified as input of an age identification model, wherein the age identification model adopts an age statistics error value as an error measurement standard, and the age statistics error value is obtained by carrying out statistics calculation according to a marked age value and predicted age values corresponding to faces in a plurality of training images corresponding to the same marked age value; and obtaining an age value corresponding to the face in the image to be identified, which is output by the age identification model.

Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware, where the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described in relative detail but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A method of training an age identification model, the method comprising:

According to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, calculating to obtain statistical predicted ages corresponding to the marked age values; the target training image is a training image corresponding to the same labeling age value;

adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, so as to obtain a target age identification model; the convergence condition is a preset threshold value for measuring the age statistics error value.

2. The method of claim 1, wherein after the acquiring the training image set including the face, further comprising:

performing enhancement processing on the training images in the training image set to obtain the training images after the enhancement processing, wherein the enhancement processing comprises: at least one of random erasure, random blurring, super resolution processing;

the step of using the training images in the training image set as the input of the age identification model comprises the following steps:

and taking the training image after the enhancement processing as the input of an age identification model.

3. The method according to claim 2, wherein the enhancing the training images in the training image set to obtain the enhanced training images comprises:

when the enhancement processing comprises random erasure, randomly selecting an erasure area from the training image, and carrying out random assignment on pixels of the erasure area to obtain a training image after the random erasure processing; and/or

when the enhancement processing comprises random blurring, randomly selecting a direction to carry out convolution operation on the training image, so as to obtain a training image after the random blurring processing; and/or

when the enhancement processing comprises super-resolution processing, carrying out super-resolution processing on the training image through an image super-resolution model, so as to obtain a training image after the super-resolution processing.
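As an illustration only, the random erasure and random blurring operations recited above can be sketched as follows. This is a minimal sketch under assumptions that are not part of the claims: images are `uint8` NumPy arrays of shape (height, width) or (height, width, channels), the erased region is a rectangle covering at most a fraction `max_fraction` of each side, and the blur is a line-shaped averaging kernel applied along one of four randomly selected directions. The function names and parameters (`random_erase`, `random_directional_blur`, `max_fraction`, `radius`) are hypothetical.

```python
import numpy as np

def random_erase(image, rng, max_fraction=0.3):
    """Select a random rectangle and assign random values to its pixels."""
    h, w = image.shape[:2]
    # Rectangle size: at least 1 pixel, at most max_fraction of each side.
    eh = int(rng.integers(1, max(2, int(h * max_fraction) + 1)))
    ew = int(rng.integers(1, max(2, int(w * max_fraction) + 1)))
    top = int(rng.integers(0, h - eh + 1))
    left = int(rng.integers(0, w - ew + 1))
    out = image.copy()  # leave the input image untouched
    fill_shape = (eh, ew) + image.shape[2:]
    out[top:top + eh, left:left + ew] = rng.integers(
        0, 256, size=fill_shape, dtype=np.uint8)
    return out

def random_directional_blur(image, rng, radius=2):
    """Convolve with a (2 * radius + 1)-tap averaging line kernel
    oriented along a randomly selected direction."""
    # Assumed direction set: horizontal, vertical, and the two diagonals.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    dy, dx = directions[int(rng.integers(len(directions)))]
    h, w = image.shape[:2]
    pad = [(radius, radius), (radius, radius)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    acc = np.zeros(image.shape, dtype=np.float64)
    # Sum shifted copies of the image along the chosen direction.
    for step in range(-radius, radius + 1):
        y0 = radius + step * dy
        x0 = radius + step * dx
        acc += padded[y0:y0 + h, x0:x0 + w]
    return np.rint(acc / (2 * radius + 1)).astype(image.dtype)

rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
erased = random_erase(face, rng)
blurred = random_directional_blur(face, rng)
```

Super-resolution processing is omitted from the sketch because it depends on a separately trained image super-resolution model, which the claims do not specify further.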

4. The method of claim 1, wherein the training image set containing faces includes face images at a plurality of angles, the face images at the plurality of angles belonging to a plurality of different age groups.

5. The method according to claim 1, wherein the obtaining the statistical predicted age corresponding to the labeled age value according to the predicted age value corresponding to the face in the target training image includes:

averaging the obtained predicted age values corresponding to the faces in the target training images to obtain an average predicted age, and taking the average predicted age as the statistical predicted age corresponding to the marked age value.
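The averaging step can be illustrated with a short sketch. It assumes the predicted and marked ages are available as two parallel one-dimensional arrays, one entry per training image; the function name `statistical_predicted_ages` and the sample values are hypothetical.

```python
import numpy as np

def statistical_predicted_ages(predicted, marked):
    """Group predictions by marked age value and average each group."""
    predicted = np.asarray(predicted, dtype=np.float64)
    marked = np.asarray(marked)
    # inverse[i] is the index of marked[i] within the sorted unique labels.
    labels, inverse = np.unique(marked, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.bincount(inverse, weights=predicted)
    counts = np.bincount(inverse)
    return labels, sums / counts

# Two images marked as age 25 and three images marked as age 30.
labels, means = statistical_predicted_ages(
    predicted=[24.0, 26.0, 31.0, 29.0, 30.0],
    marked=[25, 25, 30, 30, 30],
)
# labels holds the distinct marked ages, means the average prediction for each.
```

In this example the individual predictions deviate from their marked ages, but the per-label averages (25.0 and 30.0) match them, which is the quantity the statistical predicted age measures.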

6. The method of claim 1, wherein calculating an age statistical error value from the statistically predicted age and the corresponding labeled age value comprises:

obtaining an error adjustment coefficient;

and calculating according to the error adjustment coefficient, the statistical prediction age and the corresponding marked age value to obtain an age statistical error value.

7. The method of claim 6, wherein the obtaining the error adjustment coefficient comprises:

calculating according to the statistical predicted age and the corresponding marked age value to obtain an error adjustment control parameter;

and calculating the error adjustment coefficient according to the error adjustment control parameter, wherein the error adjustment coefficient and the error adjustment control parameter are inversely related.
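One possible reading of claims 6 and 7 is sketched below. The claims state only that the error adjustment control parameter is computed from the statistical predicted age and the marked age value, and that the error adjustment coefficient is inversely related to it; the specific functional forms used here are assumptions chosen for illustration. In particular, the control parameter `exp(-|m - y|)`, the coefficient `(1 - control) ** gamma`, the weighted absolute error, and the name `age_statistical_error` are all hypothetical.

```python
import numpy as np

def age_statistical_error(stat_pred, marked, gamma=2.0):
    """Weighted error between statistical predicted ages and marked ages.

    Assumed forms (not taken from the claims):
      control = exp(-|m - y|)          close to 1 when the prediction is accurate
      coeff   = (1 - control) ** gamma decreases as control increases
      error   = mean(coeff * |m - y|)
    """
    stat_pred = np.asarray(stat_pred, dtype=np.float64)
    marked = np.asarray(marked, dtype=np.float64)
    abs_err = np.abs(stat_pred - marked)
    control = np.exp(-abs_err)
    coeff = (1.0 - control) ** gamma
    error = float(np.mean(coeff * abs_err))
    return error, control, coeff

# Age 25 is predicted exactly on average; age 30 is off by three years.
error, control, coeff = age_statistical_error([25.0, 33.0], [25, 30])
```

Under these assumptions, marked ages whose statistical predicted age is already accurate receive a large control parameter and a small coefficient, so they contribute little to the error, while poorly predicted ages dominate the parameter adjustment.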

8. A method of age identification, the method comprising:

acquiring an image to be identified containing a human face;

taking the image to be identified as the input of an age identification model, wherein the age identification model is obtained by training with an age statistics error value as the error measurement standard, and the age statistics error value is obtained by statistical calculation according to a marked age value and the predicted age values corresponding to faces in a plurality of target training images corresponding to the same marked age value; the target training images are training images corresponding to the same marked age value;

9. The method of claim 8, wherein the age identification model is obtained by training a convolutional neural network model, the age identification model comprising a plurality of convolution layers, with a preset number of activation layers and pooling layers between adjacent convolution layers.

10. The method of claim 8, wherein, before taking the image to be identified as the input of the age identification model, the method further comprises:

identifying the face in the image to be identified to obtain a target face area;

the taking the image to be identified as the input of the age identification model comprises:

taking the target face area corresponding to the image to be identified as the input of the trained age identification model.

11. A training device for an age identification model, the device comprising:

the first calculation module is used for calculating, according to the predicted age values corresponding to the faces in the target training images corresponding to the same marked age value, statistical predicted ages corresponding to the marked age values; the target training images are training images corresponding to the same marked age value;

the adjustment module is used for adjusting parameters in the age identification model according to the age statistics error value until convergence conditions are met, so as to obtain a target age identification model; the convergence condition is a preset threshold value for measuring the age statistics error value.

12. The apparatus of claim 11, wherein the apparatus further comprises:

the enhancement processing module is used for carrying out enhancement processing on the training images in the training image set to obtain the training images after the enhancement processing, wherein the enhancement processing comprises at least one of random erasure, random blurring, and super-resolution processing;

the input/output module is also used for taking the training image after the enhancement processing as the input of the age identification model.

13. The apparatus of claim 12, wherein the enhancement processing module is further configured to:

14. The apparatus of claim 11, wherein the training image set containing faces includes face images at a plurality of angles, the face images at the plurality of angles belonging to a plurality of different age groups.

15. The apparatus of claim 11, wherein the first computing module is further configured to:

16. The apparatus of claim 11, wherein the second computing module is further configured to:

obtaining an error adjustment coefficient;

17. The apparatus of claim 16, wherein the second computing module is further configured to:

18. An age identification device, the device comprising:

the input module is used for taking the image to be identified as the input of an age identification model, wherein the age identification model is obtained by training with an age statistics error value as the error measurement standard, and the age statistics error value is obtained by statistical calculation according to a marked age value and the predicted age values corresponding to faces in a plurality of target training images corresponding to the same marked age value; the target training images are training images corresponding to the same marked age value;

19. The apparatus of claim 18, wherein the age identification model is obtained by training a convolutional neural network model, the age identification model comprising a plurality of convolution layers, with a preset number of activation layers and pooling layers between adjacent convolution layers.

20. The apparatus of claim 18, wherein the apparatus further comprises:

the identification module is used for identifying the face in the image to be identified to obtain a target face area;

the input module is also used for:

21. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 10.

22. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 10.