CN112183326A - Face age recognition model training method and related device - Google Patents


Publication number
CN112183326A
Authority
CN
China
Prior art keywords
age
value
label
face image
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011033875.3A
Other languages
Chinese (zh)
Inventor
陈仿雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Original Assignee
Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shuliantianxia Intelligent Technology Co Ltd filed Critical Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Priority to CN202011033875.3A
Publication of CN112183326A
Legal status: Pending


Classifications

    • G06V40/161 Human faces: Detection; Localisation; Normalisation
    • G06F18/214 Pattern recognition: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N20/20 Machine learning: Ensemble learning
    • G06N3/045 Neural networks: Combinations of networks
    • G06N3/08 Neural networks: Learning methods
    • G06V40/168 Human faces: Feature extraction; Face representation
    • G06V40/178 Human faces: estimating age from face image; using age information for improving recognition

Abstract

The application discloses a face age recognition model training method and a related device. The method comprises the following steps: acquiring a sample data set comprising a plurality of sample face images, each marked with a corresponding age value label and an age difference value label, where the age value label indicates the actual age value of the face in the sample face image and the age difference value label indicates the acceptable deviation range of that actual age value; and training a network model based on the sample face images, their age value labels and their age difference value labels to obtain an age recognition model. The resulting age recognition model improves the accuracy of recognizing the age of the face in a face image.

Description

Face age recognition model training method and related device
Technical Field
The invention relates to the technical field of computer vision, in particular to a face age recognition model training method and a related device.
Background
The face image often contains a lot of face feature information, wherein age is taken as important feature information and is widely applied to the field of face recognition.
Generally, in face image age recognition, the age is used as the label information, and a correspondence between face images and ages is established during model training. However, because different people share similar facial characteristics at the same age stage, a single exact-age label is difficult to learn, and common face age recognition algorithms therefore achieve low accuracy on face ages in practical applications.
Disclosure of Invention
The application provides a face age recognition model training method and a related device.
In a first aspect, a face age recognition model training method is provided, including:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample face images, each sample face image is marked with a corresponding age value label and an age difference value label, the age value label indicates an actual age value of a face in the sample face images, and the age difference value label is used for indicating a deviation range of the actual age value;
training a network model based on the sample face image, the age value label of the sample face image and the age difference value label to obtain an age identification model.
In a second aspect, a training device for a face age recognition model is provided, which includes:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a sample data set, the sample data set comprises a plurality of sample face images, each sample face image is marked with a corresponding age value label and an age difference value label, the age value label indicates the actual age value of a face in the sample face image, and the age difference value label is used for indicating the deviation range of the actual age value;
and the training module is used for training a network model based on the sample face image, the age value label of the sample face image and the age difference value label to obtain an age identification model.
In a third aspect, a computer storage medium is provided, which stores one or more instructions adapted to be loaded by a processor and to perform the steps of the first aspect and any possible implementation thereof.
In the embodiment of the application, a sample data set is acquired, the sample data set comprising a plurality of sample face images, each marked with a corresponding age value label and an age difference value label, where the age value label indicates the actual age value of the face in the sample face image and the age difference value label indicates the acceptable deviation range of that actual age value. A network model is then trained based on the sample face images, their age value labels and their age difference value labels to obtain an age recognition model. By exploiting the mutual constraint between the age value label and the age difference value label, the constraint of a single exact age value label is weakened, so that the network model can better learn the face feature information shared by similar age values, and a multi-level loss function is constructed for model training. In application, the age recognition result can be judged comprehensively from the predicted age value and the acceptable age difference value, so that the range of the age recognition result is closer to the real age value and the accuracy of recognizing the age of the face image is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present application, the drawings required to be used in the embodiments or the background art of the present application will be described below.
Fig. 1 is a schematic flow chart of a face age recognition model training method according to an embodiment of the present application;
fig. 2 is a schematic flow chart of a face age recognition model training method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a network model according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a face age recognition model training device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Neural Networks (NN) referred to in the embodiments of the present application are complex network systems formed by widely interconnecting a large number of simple processing units (called neurons), reflect many basic features of human brain functions, and are highly complex nonlinear dynamical learning systems. The neural network has the capabilities of large-scale parallel, distributed storage and processing, self-organization, self-adaptation and self-learning, and is particularly suitable for processing inaccurate and fuzzy information processing problems which need to consider many factors and conditions simultaneously.
Convolutional Neural Networks (CNN) are a class of feedforward neural networks that contain convolution computations and have a deep structure, and are among the representative algorithms of deep learning.
The embodiments of the present application will be described below with reference to the drawings.
Referring to fig. 1, fig. 1 is a schematic flowchart of a face age recognition model training method according to an embodiment of the present application. The method can comprise the following steps:
101. the method comprises the steps of obtaining a sample data set, wherein the sample data set comprises a plurality of sample face images, each sample face image is marked with a corresponding age value label and an age difference value label, the age value label indicates an actual age value of a face in the sample face images, and the age difference value label is used for indicating a deviation range of the actual age value.
The execution subject of the embodiment of the present application may be a face age recognition model training apparatus, and may be an electronic device, and in a specific implementation, the electronic device may be a terminal, which may also be referred to as a terminal device, including but not limited to other portable devices such as a mobile phone, a laptop computer, or a tablet computer having a touch-sensitive surface (e.g., a touch screen display and/or a touch pad). It should also be understood that in some embodiments, the devices described above are not portable communication devices, but rather are desktop computers having touch-sensitive surfaces (e.g., touch screen displays and/or touch pads).
Specifically, the sample data set may be constructed first. The sample data set may include sample face images of different ages. A preset age range may be set for the selected sample face images, that is, the ages of the sample face images are distributed within the preset age range, for example 1 to 100 years old, so that the sample data set includes, as far as possible, sample face images of every age within the preset age range. In the embodiment of the application, the actual age of the face is used as a label for model training, and each sample face image is additionally labeled with an age difference value, that is, the acceptable age deviation range for age prediction. For example, when the age value label of a sample face image is 36 and the age difference value label is plus or minus 2 years, the age label range of the sample face image is [34, 38]. The age difference values corresponding to different age values can differ, and the range of the age difference value corresponding to each age value can be adjusted according to the actual situation. The setting rule of the age difference values in the embodiment of the application may assign different age difference values to different age stages; for example, for ages 10 to 19 the age difference value label corresponding to each age may be 1, while for ages 30 to 39 it may be 2. These values are set based on the degree of facial change at different age stages: because facial features develop quickly during adolescence, a relatively smaller age difference value may be set for those ages, whereas facial features change more slowly in middle age, so a larger value may be used.
Setting the same age difference value for every age may make the prediction result too large or too small. Setting different age difference values for different age stages takes into account how much a person's face changes at each stage, so that after training with these labels the face age can be predicted more accurately, and the age prediction is targeted to each age stage. The maximum age difference value can be capped at a preset threshold, such as plus or minus 4 years, to prevent an overly large age difference from covering too many characteristics of other ages for a single age, which would make the model learn inaccurately.
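The labeling scheme above can be sketched in a few lines. Only the 10-19 and 30-39 brackets and the cap of plus or minus 4 come from the text; the other per-decade entries below are illustrative assumptions:

```python
# Hypothetical per-decade deviation table; only 10-19 -> 1 and 30-39 -> 2
# are stated in the text, the remaining entries are illustrative.
AGE_DIFF_BY_DECADE = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 4, 8: 4, 9: 4}
MAX_DIFF = 4  # preset upper bound on the deviation (plus or minus 4 years)

def age_difference(age: int) -> int:
    """Return the acceptable +/- deviation for an actual age value."""
    return min(AGE_DIFF_BY_DECADE.get(age // 10, MAX_DIFF), MAX_DIFF)

def label_range(age: int, lo: int = 1, hi: int = 100):
    """Clamp [age - diff, age + diff] to the preset age range 1..100."""
    d = age_difference(age)
    return max(lo, age - d), min(hi, age + d)
```

With this sketch, an age value label of 36 yields the [34, 38] label range used as the running example in the text.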
102. And training a network model based on the sample face image, the age value label of the sample face image and the age difference value label to obtain an age identification model.
In order to enable the model to better learn similar characteristic information between similar age values, in the model training stage, the label of a single age value is weakened by using the label of the age value and the label of the age difference value, and a multi-level loss function is constructed to train the network model.
In an alternative embodiment, the network model may be a pre-constructed deep network model.
In the embodiment of the application, the sample face image can be preprocessed so that it has a preset image size. Further optionally, a deep network model may be constructed, e.g. with an input image size of 224 x 224 x 3 in RGB color mode, where 224 x 224 is the pixel size (length x width) of the image and 3 is the number of color channels.
The network model mainly adopts 3 x 3 convolution kernels with the stride set to 2 and the activation function set to ReLU. A feature map of size 7 x 7 is obtained through multiple convolution layers, and a multi-layer output structure composed of fully connected layers then forms the deep network model for age recognition.
The network structure used for feature extraction in the network model of the embodiment of the application is not limited and can be replaced according to actual needs, for example with a MobileNet structure, an Inception structure, and the like.
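As a rough illustration of the stride-2 downsampling described above, the standard convolution output-size formula shows how five 3 x 3, stride-2 convolutions reduce a 224 x 224 input to the 7 x 7 feature map; the padding of 1 is an assumption not stated in the text:

```python
def conv_out(size: int, k: int = 3, s: int = 2, p: int = 1) -> int:
    """Convolution output-size formula: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * p - k) // s + 1

def feature_sizes(size: int = 224):
    """Trace the spatial size through stride-2 3x3 convs until 7x7 is reached."""
    sizes = [size]
    while sizes[-1] > 7:
        sizes.append(conv_out(sizes[-1]))
    return sizes
```

Under these assumptions the trace is 224, 112, 56, 28, 14, 7, i.e. five convolution stages before the fully connected output structure.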
In an alternative embodiment, the step 102 may include:
21. generating an age label vector corresponding to each sample face image according to the age value label and the age difference value label of each sample face image;
22. and training a network model according to the age value label, the age difference value label and the age label vector of each sample face image to obtain the age identification model.
In the embodiment of the application, the age value label of each sample face image and the age difference value label can be integrated to express the label information into a corresponding vector form, namely the age label vector, and the label information is used in model training.
Specifically, the generating an age label vector corresponding to each sample face image according to the age value label and the age difference value label of each sample face image includes:
determining a plurality of target age values corresponding to each sample face image according to the age value label and the age difference value label of each sample face image;
and generating an age tag vector corresponding to each sample face image according to the target age values, wherein the age tag vector comprises vector values of each age value in a preset age range, and the vector values corresponding to the target age values in the preset age range are larger than vector values corresponding to other age values.
Alternatively, the age tag vector may be constructed in the manner of one-hot encoding. One-hot encoding, also known as one-bit-effective encoding, uses an N-bit state register to encode N states, each state having its own independent register bit; in strict one-hot encoding only one bit is active at any time, whereas the age label vector here extends this so that the several adjacent bits covered by the age label range are active.
Based on the age value label and the age difference label of each sample face image, a plurality of target age values corresponding to each sample face image can be determined, for example, when the age value label of a sample face image is 36 and the age difference label is within plus or minus 2 years, the age label range of the sample face image is [34, 38], that is, the corresponding target age values include 34, 35, 36, 37, 38. The target age value can be represented in a vector form, and when the age label vector is generated, the corresponding position of the target age value is distinguished and labeled from the vector values of other positions so as to embody an effective age value. For example, the target age values include 34, 35, 36, 37, and 38, and the corresponding age tag vector may be represented as [0, …,0,1,1,1,1,1,0, …,0], where if the preset age range is set to 1-100, the 34 th, 35 th, 36 th, 37 th, and 38 th bit vector values in the age tag vector are represented as 1, and the other bit vector values are represented as 0.
Other ways may also be set to generate the corresponding age tag vector representation according to the age value tag and the age difference value tag, which is not limited in this embodiment of the application.
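The vector construction described above can be sketched as follows, with the preset age range of 1 to 100 taken from the example in the text:

```python
def age_label_vector(age: int, diff: int, n: int = 100):
    """One-hot-style vector over ages 1..n; positions inside
    [age - diff, age + diff] are set to 1, all others to 0."""
    lo, hi = max(1, age - diff), min(n, age + diff)
    return [1 if lo <= a <= hi else 0 for a in range(1, n + 1)]
```

For the running example (age value label 36, age difference value label 2), the 34th through 38th positions are 1 and all other positions are 0, matching the vector [0, ..., 0, 1, 1, 1, 1, 1, 0, ..., 0] given above.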
Further, the step 22 may specifically include:
inputting the sample face image into the network model, and outputting a prediction classification result of the sample face image;
calculating the total loss value between the prediction classification result of each sample face image and the label data by adopting a loss function; and adjusting the network parameters of the network model according to the total loss value until the network model converges to obtain the age identification model, wherein the label data includes the age value label, the age difference label and the age label vector of the sample face image.
After the age label vectors of the sample face images are generated, the network model may be trained with the sample face images. During training, a loss function may be used to calculate the total loss value between the prediction classification result of each sample face image and its label data, and the network parameters of the network model may be adjusted according to the total loss value until the network model converges. The loss function may calculate loss values at three levels: the age value label, the age difference value label, and the age label vector.
Further optionally, the method further comprises:
23. acquiring a face image to be recognized;
24. and processing the face image to be recognized by adopting the age recognition model to obtain an age prediction result of the face image to be recognized.
The above steps may be executed after step 102: the trained age recognition model obtained through steps 101 and 102 is then applied in the above processing steps. Alternatively, the model obtained after training may be deployed on other devices to perform the above processing steps, which is not limited herein.
The face image to be recognized may be a captured image containing a face. The acquired image can be preprocessed, e.g. by cropping, to obtain a standardized face image to be recognized, so as to recognize the age of the face in the image. In practical application, the age recognition result of the face image to be recognized can be judged comprehensively from the predicted age value and the acceptable age difference value.
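One hypothetical way to combine the two model outputs into a final result is to report an age interval around the predicted age value, capping the predicted difference at a threshold; both the helper and the threshold value of 4 below are illustrative assumptions, not specified by the text:

```python
def age_result(pred_age: int, pred_diff: int, threshold: int = 4):
    """Hypothetical post-processing: an interval around the predicted age,
    with the predicted age difference capped at a preset threshold."""
    d = min(abs(pred_diff), threshold)
    return pred_age - d, pred_age + d
```

For example, a predicted age of 33 with a predicted difference of -1 or 1 would be reported as the range 32 to 34, matching the acceptability criterion described for the middle output layer below.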
In the embodiment of the application, a sample data set is acquired, the sample data set comprising a plurality of sample face images, each marked with a corresponding age value label and an age difference value label, where the age value label indicates the actual age value of the face in the sample face image and the age difference value label indicates the acceptable deviation range of that actual age value. A network model is then trained based on the sample face images, their age value labels and their age difference value labels to obtain an age recognition model. By exploiting the mutual constraint between the age value label and the age difference value label, the constraint of a single exact age value label is weakened, so that the network model can better learn the face feature information shared by similar age values, and a multi-level loss function is constructed for model training. In application, the age recognition result can be judged comprehensively from the predicted age value and the acceptable age difference value, so that the range of the age recognition result is closer to the real age value and the accuracy of recognizing the age of the face image is improved.
Referring to fig. 2, fig. 2 is a schematic flow chart of another training method for a face age recognition model according to an embodiment of the present application. As shown in fig. 2, the method may specifically include:
201. the method comprises the steps of obtaining a sample data set, wherein the sample data set comprises a plurality of sample face images, each sample face image is marked with a corresponding age value label and an age difference value label, the age value label indicates an actual age value of a face in the sample face images, and the age difference value label is used for indicating a deviation range of the actual age value.
202. And generating an age label vector corresponding to each sample face image according to the age value label and the age difference value label of each sample face image.
Steps 201 and 202 may refer to the specific descriptions of step 101 and step 21, respectively, in the embodiment shown in fig. 1, and are not described again here. Step 202 may be performed by the network model in the training phase, or may be performed before the model training phase.
203. And inputting the sample face image into the network model, and outputting a prediction classification result of the sample face image, wherein the prediction classification result of the sample face image comprises a prediction age value, a prediction age vector and a prediction age difference value of the sample face image.
In the training stage, the network model can process the sample face images and predict the age value, the age vector and the age difference value corresponding to each sample face image. The predicted age vector includes a probability value corresponding to each age value predicted by the model, and the predicted age value is understood as an age value in which the probability value is the largest.
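The relation between the predicted age vector and the predicted age value described above reduces to an argmax; a minimal sketch, assuming ages indexed 1..n:

```python
def predict_from_vector(Q):
    """Return the age value whose probability in the predicted age
    vector Q is largest (vector position i corresponds to age i + 1)."""
    return max(range(len(Q)), key=Q.__getitem__) + 1
```

For instance, a vector whose largest probability sits at the 4th position yields a predicted age value of 4.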
204. Calculating a first loss value between the predicted age vector and the age label vector of each sample image, calculating a second loss value between the predicted age value and the age value label of each sample image, calculating a third loss value between the predicted age difference value and the age difference value label of each sample image, and adding the first loss value, the second loss value, and the third loss value to obtain the total loss value.
Specifically, in order to enable the model to better learn similar characteristic information between similar age values, in the model training stage the label of a single age value can be weakened by using the age label vector, and a multi-level loss function can be constructed. In the embodiment of the present application, the loss function of the network model may be composed of three parts: the first loss value is the age label vector loss L1, computed with a K-L distance function; the second loss value is the age regression loss L2; and the third loss value is the age difference loss L3.
In an alternative embodiment, the loss function is calculated as follows:
L = L1 + L2 + L3 = Σ(i=1..n) P_i · log(P_i / Q_i) + (y − f)² + (y_cha − f_cha)²

wherein n represents the maximum value within the preset age range; P_i represents the vector value of the i-th age in the age label vector; Q_i represents the predicted age vector, i.e. the probability value corresponding to each age value predicted by the model; y represents the actual age value indicated by the age value label; f represents the predicted age value output by the model; y_cha represents the age difference indicated by the age difference value label; and f_cha represents the predicted age difference output by the model.
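A numerical sketch of the three-part loss using the symbol definitions above; the squared-error form of the two regression terms is an assumption, since the text only names them as regression and difference losses:

```python
import math

def total_loss(P, Q, y, f, y_cha, f_cha):
    """Three-level loss sketch: K-L distance between the age label vector P
    and predicted age vector Q, plus squared age error and squared age
    difference error (squared form assumed, not stated in the text)."""
    l1 = sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)  # K-L distance
    l2 = (y - f) ** 2            # age regression loss
    l3 = (y_cha - f_cha) ** 2    # age difference loss
    return l1 + l2 + l3
```

The loss is zero when the predicted distribution, age, and difference all match the labels, and grows with each mismatch, which is the behavior the multi-level constraint relies on.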
205. And adjusting the network parameters of the network model according to the total loss value until the network model converges to obtain an age identification model, wherein the label data comprises the age value label, the age difference label and the age label vector of the sample face image.
The steps 203 and 204 may be repeatedly performed until the network model converges, and the training is finished, so that a trained age identification model may be obtained.
In the embodiment of the present application, different algorithms may be selected as needed to optimize the model parameters, including setting the training parameters such as the number of iterations and the learning rate, which is not limited in the embodiment of the present application.
In an alternative embodiment, in order to minimize the loss value of the whole model training, Adam algorithm may be used to optimize the model parameters, for example, the number of iterations may be set to 500, the initial learning rate is set to 0.001, the weight attenuation is set to 0.0005, and the learning rate is attenuated to 1/10 every 100 iterations to perform the training of the network model. The above is merely an example, and the actual training parameters may have other settings, which is not limited in the embodiments of the present application.
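The learning-rate schedule in the example above (initial rate 0.001, decayed to 1/10 every 100 iterations) can be written as a simple step-decay function; this is only the schedule, not the full Adam update:

```python
def learning_rate(iteration: int, base_lr: float = 0.001,
                  decay_every: int = 100) -> float:
    """Step decay: multiply the base rate by 1/10 every decay_every iterations."""
    return base_lr * (0.1 ** (iteration // decay_every))
```

Over the 500 iterations mentioned in the example, the rate would step from 0.001 down to 0.001 x 0.1^4.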
In an alternative implementation, please refer to fig. 3, fig. 3 is a schematic structural diagram of a network model provided in an embodiment of the present application, and an age-predicted deep network structure is constructed as shown in fig. 3, in the structure, a convolution kernel, an activation function, and a fully-connected layer are mainly used, where a circle portion represents a node of the fully-connected layer, which is only an illustration.
In the embodiment of the application, a sample face image with a preset image size can be obtained; the sample face image is input into the deep network structure, and features are extracted in the convolutional layers of the deep network structure to obtain a feature map corresponding to the sample face image. The convolutional layers use 3 x 3 convolution kernels with a stride of 2, and the activation function is set to ReLU.
For example, the size of the input image is first set to 224 x 224 x 3; 3 x 3 convolution kernels with a stride of 2 and the ReLU activation function are used to continuously extract feature maps of different scales, so that semantic features from shallow to deep are extracted, and finally a feature map of size 7 x 7 is obtained. The feature extraction part is not limited, and an existing structure can be substituted.
After the feature map corresponding to the sample face image is obtained, it is input into a fully connected layer for processing to obtain a feature structure corresponding to the feature map; the prediction classification result of the sample face image is then calculated and output according to that feature structure. The prediction classification result may include a predicted age value, a predicted age difference value, and a predicted age vector, where the predicted age vector comprises the probability values corresponding to the predicted age values.
Specifically, after the final 7 × 7 feature map is obtained, a fully-connected layer converts it into a 1 × 1024 feature structure; a further fully-connected layer then converts this into a 1 × n feature structure. In order to construct a multi-level loss function, the model output is divided into three parts. The upper part outputs a predicted age value, which is a single value, such as 33 years old. The middle part outputs a predicted age difference value, also a single value, such as -1 year or 1 year, indicating that the age of the face in the image lies in the acceptable range 32–34; a prediction within this range is considered accurate. The lower part outputs a predicted age vector comprising the probability values corresponding to the predicted age values; the vector has size n, for example [0,0,0,…,0.3,0.4,0.3,…,0]. In fig. 3, Qi represents the predicted age vector, i.e., the probability value corresponding to each age value predicted by the model; f represents the predicted age value output by the model, and fcha represents the predicted age difference value output by the model.
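The shape of such an age vector can be illustrated with a small sketch. The function below is hypothetical: the vector length n, the peak weight 0.4, and the side weight 0.3 are taken from the example [0,0,0,…,0.3,0.4,0.3,…,0] above, with an age value of 33 and an age difference of ±1:

```python
def soft_age_vector(age, diff, n=100, peak=0.4, side=0.3):
    # Hypothetical soft age vector: the labelled age receives the peak
    # probability, every age inside the +/-diff range receives the side
    # probability, and all other entries stay zero.
    v = [0.0] * n
    v[age] = peak
    for d in range(1, abs(diff) + 1):
        v[age - d] = side
        v[age + d] = side
    return v

q = soft_age_vector(33, 1)
print(q[32:35])  # [0.3, 0.4, 0.3]
```

With diff = ±1 and the example weights, the three non-zero entries sum to 1, so the vector can be read as a probability distribution over age values.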
The above structure is only an example; different network model structures may be constructed as needed, and different network parameters may be adjusted, which is not limited in the embodiments of the present application.
Optionally, after obtaining the age identification model, the age identification model may be used to identify the age of the face image, as described in step 23 and step 24 in the embodiment shown in fig. 1, and will not be described herein again.
Further optionally, the step 24 includes:
processing the face image to be recognized by adopting the age recognition model to obtain a predicted age value and a predicted age difference value of the face image to be recognized;
judging whether the absolute value of the predicted age difference is larger than a difference threshold value;
if not, the age prediction result is the predicted age value; if so, the age prediction result is the sum of the predicted age value and the predicted age difference value.
In an age identification application scenario, the difference threshold may be preset, for example, to 4 (years), and the age identification model may be used to process the face image to be recognized to obtain its age prediction result. Specifically, the face image to be recognized is input into the trained age recognition model; feature extraction is performed in the convolutional layers to obtain a feature map, which is then processed by the fully-connected layers to obtain the predicted age vector of the face image to be recognized. The predicted age vector comprises the probabilities of the face image to be recognized corresponding to all age values, and the age value with the highest probability can be determined as the predicted age value; finally, the predicted age value and the predicted age difference value are output. When the absolute value of the predicted age difference value is larger than the difference threshold, the predicted age difference value is added to the predicted age value to serve as the finally output age value, i.e., the age prediction result; otherwise the predicted age value itself is output.
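The decision rule in the steps above can be sketched as follows; the function name is illustrative, and the threshold of 4 years is the example given in the text:

```python
def age_prediction(predicted_age, predicted_diff, diff_threshold=4):
    # If the predicted age difference is within the threshold, the predicted
    # age value is output as-is; otherwise it is corrected by adding the
    # predicted age difference.
    if abs(predicted_diff) > diff_threshold:
        return predicted_age + predicted_diff
    return predicted_age

print(age_prediction(33, 1))   # 33 (|1| <= 4, keep the predicted age)
print(age_prediction(33, -6))  # 27 (|-6| > 4, apply the correction)
```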
In the embodiment of the application, when the age of a sample face image is labelled, an acceptable age difference value label is added and the single age value label is weakened, so that the influence of sharp boundaries between similar ages is reduced, and the model training process is constrained by the relation between the predicted age value and the predicted age difference value, bringing the range of the age identification result closer to the true value. Compared with general schemes, in which the face age identification result fluctuates more and is less accurate, the accuracy of face age identification can thereby be improved.
Based on the description of the embodiment of the face age recognition model training method, the embodiment of the application also discloses a face age recognition model training device. Referring to fig. 4, the face age recognition model training apparatus 400 includes:
an obtaining module 410, configured to obtain a sample data set, where the sample data set includes a plurality of sample face images, each sample face image is labeled with a corresponding age value label and an age difference label, the age value label indicates an actual age value of a face in the sample face image, and the age difference label indicates a deviation range from the actual age value;
the training module 420 is configured to train a network model based on the sample face image, the age value label of the sample face image, and the age difference label, so as to obtain an age identification model.
Optionally, the obtaining module 410 is further configured to obtain a face image to be recognized;
the training module 420 is further configured to process the facial image to be recognized by using the age recognition model, and obtain an age prediction result of the facial image to be recognized.
According to an embodiment of the present application, the steps involved in the methods shown in fig. 1 and fig. 2 may be performed by the modules in the face age recognition model training apparatus 400 shown in fig. 4, and are not described herein again.
The face age recognition model training apparatus 400 in the embodiment of the present application may be configured to obtain a sample data set, where the sample data set includes a plurality of sample face images, each labelled with a corresponding age value label and age difference value label; the age value label indicates the actual age value of the face in the sample face image, and the age difference value label indicates a deviation range from the actual age value. A network model is then trained based on the sample face images, their age value labels, and their age difference value labels to obtain an age recognition model. Weakening the age label with an age difference enables the network model, through the mutual constraint between the age value label and the age difference value label, to better learn the face feature information of similar age values; a loss function is constructed accordingly, and model training is carried out. Moreover, the age identification result can be judged comprehensively according to the predicted age value and the acceptable age difference value, so that the range of the age identification result is closer to the real age value and the accuracy of age identification is improved.
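The multi-part loss summarised above (and detailed in claims 4 and 5) can be sketched as follows. The concrete loss forms, cross-entropy for the age vector and absolute error for the two scalar outputs, are assumptions; the claims only state that the three loss values are added:

```python
import math

def total_loss(pred_vec, label_vec, pred_age, age_label, pred_diff, diff_label):
    eps = 1e-12
    # first loss: predicted age vector vs. age label vector (cross-entropy, assumed form)
    l1 = -sum(t * math.log(p + eps) for t, p in zip(label_vec, pred_vec))
    # second loss: predicted age value vs. age value label (absolute error, assumed form)
    l2 = abs(pred_age - age_label)
    # third loss: predicted age difference value vs. age difference value label (assumed form)
    l3 = abs(pred_diff - diff_label)
    # per claim 5, the total loss is the sum of the three parts
    return l1 + l2 + l3
```

A perfect prediction drives all three terms to (near) zero, while an error in any one of the three outputs increases the total, which is what lets the three labels constrain one another during training.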
An embodiment of the present application further provides a computer storage medium (memory), which is a memory device in an electronic device and is used to store programs and data. It is understood that the computer storage medium here may include both a storage medium built into the electronic device and an extended storage medium supported by the electronic device. The computer storage medium provides storage space that stores the operating system of the electronic device. Also stored in the storage space are one or more instructions, which may be one or more computer programs (including program code), suitable for loading and execution by the processor. The computer storage medium may be a high-speed RAM, or a non-volatile memory, such as at least one disk memory; optionally, it may be at least one computer storage medium located remotely from the processor.
In one embodiment, one or more instructions stored in a computer storage medium may be loaded and executed by a processor to perform the corresponding steps in the above embodiments; in particular implementations, one or more instructions in the computer storage medium may be loaded by the processor and executed to perform any step of the method in fig. 1 and/or fig. 2, which is not described herein again.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the division of the module is only one logical division, and other divisions may be possible in actual implementation, for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not performed. The shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some interfaces, and may be in an electrical, mechanical or other form.
Modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)), or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that includes one or more of the available media. The usable medium may be a read-only memory (ROM), or a Random Access Memory (RAM), or a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape, a magnetic disk, or an optical medium, such as a Digital Versatile Disk (DVD), or a semiconductor medium, such as a Solid State Disk (SSD).

Claims (10)

1. A training method for a face age recognition model is characterized by comprising the following steps:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample face images, each sample face image is marked with a corresponding age value label and an age difference value label, the age value label indicates an actual age value of a face in the sample face images, and the age difference value label is used for indicating a deviation range of the actual age value;
training a network model based on the sample face image, the age value label of the sample face image and the age difference value label to obtain an age identification model.
2. The training method of the human face age recognition model according to claim 1, wherein the training of the network model based on the sample human face image, the age value label of the sample human face image and the age difference value label to obtain the age recognition model comprises:
generating an age label vector corresponding to each sample face image according to the age value label and the age difference value label of each sample face image;
and training a network model according to the age value label, the age difference value label and the age label vector of each sample face image to obtain the age identification model.
3. The training method of the human face age recognition model according to claim 2, wherein the generating an age label vector corresponding to each sample human face image according to the age value label and the age difference value label of each sample human face image comprises:
determining a plurality of target age values corresponding to each sample face image according to the age value label and the age difference value label of each sample face image;
generating an age tag vector corresponding to each sample face image according to the target age values, wherein the age tag vector comprises vector values of each age value in a preset age range, and vector values corresponding to the target age values in the preset age range are larger than vector values corresponding to other age values.
4. The training method of the human face age recognition model according to claim 2 or 3, wherein the training of the network model according to the age value label, the age difference value label and the age label vector of each sample human face image to obtain the age recognition model comprises:
inputting the sample face image into the network model, and outputting a prediction classification result of the sample face image;
calculating the total loss value between the prediction classification result of each sample face image and the label data by adopting a loss function; adjusting network parameters of the network model according to the total loss value until the network model converges to obtain the age identification model, wherein the label data comprises the age value label, the age difference value label and the age label vector of the sample face image.
5. The training method of the face age recognition model according to claim 4, wherein the prediction classification result of the sample face image comprises a prediction age value, a prediction age vector and a prediction age difference value of the sample face image;
the calculating the total loss value between the prediction classification result of each sample face image and the label data by adopting the loss function comprises the following steps:
calculating a first loss value of the predicted age vector and the age label vector of each sample image; calculating a predicted age value of the respective sample image and a second loss value of the age value label; calculating a predicted age difference value of each sample image and a third loss value of the age difference value label;
adding the first loss value, the second loss value, and the third loss value to obtain the total loss value.
6. The training method of the face age recognition model according to any one of claims 1 to 5, wherein the method further comprises:
acquiring a face image to be recognized;
and processing the face image to be recognized by adopting the age recognition model to obtain an age prediction result of the face image to be recognized.
7. The training method of the human face age recognition model according to claim 6, wherein the processing the human face image to be recognized by using the age recognition model to obtain the age prediction result of the human face image to be recognized comprises:
processing the face image to be recognized by adopting the age recognition model to obtain a predicted age value and a predicted age difference value of the face image to be recognized;
judging whether the absolute value of the predicted age difference is larger than a difference threshold value;
if not, the age prediction result is the predicted age value; if so, the age prediction result is the sum of the predicted age value and the predicted age difference value.
8. A training device for a face age recognition model is characterized by comprising:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a sample data set, the sample data set comprises a plurality of sample face images, each sample face image is marked with a corresponding age value label and an age difference value label, the age value label indicates the actual age value of a face in the sample face image, and the age difference value label is used for indicating the deviation range of the actual age value;
and the training module is used for training a network model based on the sample face image, the age value label of the sample face image and the age difference value label to obtain an age identification model.
9. The apparatus of claim 8,
the acquisition module is also used for acquiring a face image to be recognized;
the training module is further used for processing the face image to be recognized by adopting the age recognition model to obtain an age prediction result of the face image to be recognized.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, causes the processor to carry out the steps of the face age recognition model training method according to any one of claims 1 to 7.
CN202011033875.3A 2020-09-27 2020-09-27 Face age recognition model training method and related device Pending CN112183326A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011033875.3A CN112183326A (en) 2020-09-27 2020-09-27 Face age recognition model training method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011033875.3A CN112183326A (en) 2020-09-27 2020-09-27 Face age recognition model training method and related device

Publications (1)

Publication Number Publication Date
CN112183326A true CN112183326A (en) 2021-01-05

Family

ID=73944291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011033875.3A Pending CN112183326A (en) 2020-09-27 2020-09-27 Face age recognition model training method and related device

Country Status (1)

Country Link
CN (1) CN112183326A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906525A (en) * 2021-02-05 2021-06-04 广州市百果园信息技术有限公司 Age identification method and device and electronic equipment
CN113065525A (en) * 2021-04-27 2021-07-02 深圳数联天下智能科技有限公司 Age recognition model training method, face age recognition method and related device
CN113076833A (en) * 2021-03-25 2021-07-06 深圳数联天下智能科技有限公司 Training method of age identification model, face age identification method and related device
CN113221645A (en) * 2021-04-07 2021-08-06 深圳数联天下智能科技有限公司 Target model training method, face image generation method and related device
CN114998978A (en) * 2022-07-29 2022-09-02 杭州魔点科技有限公司 Method and system for analyzing quality of face image
CN113076833B (en) * 2021-03-25 2024-05-31 深圳数联天下智能科技有限公司 Training method of age identification model, face age identification method and related device


Similar Documents

Publication Publication Date Title
CN112183326A (en) Face age recognition model training method and related device
CN111310808B (en) Training method and device for picture recognition model, computer system and storage medium
CN112634170B (en) Method, device, computer equipment and storage medium for correcting blurred image
CN114494784A (en) Deep learning model training method, image processing method and object recognition method
CN112734873B (en) Image attribute editing method, device, equipment and medium for countermeasure generation network
CN111178537B (en) Feature extraction model training method and device
CN111581926A (en) Method, device and equipment for generating file and computer readable storage medium
CN113065525A (en) Age recognition model training method, face age recognition method and related device
CN111400126A (en) Network service abnormal data detection method, device, equipment and medium
CN112650885A (en) Video classification method, device, equipment and medium
CN116684330A (en) Traffic prediction method, device, equipment and storage medium based on artificial intelligence
CN113537192B (en) Image detection method, device, electronic equipment and storage medium
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
US20180025062A1 (en) Data searching apparatus
CN116451081A (en) Data drift detection method, device, terminal and storage medium
CN114255381B (en) Training method of image recognition model, image recognition method, device and medium
CN114241411B (en) Counting model processing method and device based on target detection and computer equipment
CN114971375A (en) Examination data processing method, device, equipment and medium based on artificial intelligence
CN112749978B (en) Detection method, apparatus, device, storage medium, and program product
CN114707638A (en) Model training method, model training device, object recognition method, object recognition device, object recognition medium and product
CN112200028A (en) Face recognition device
CN114117037A (en) Intention recognition method, device, equipment and storage medium
CN113837836A (en) Model recommendation method, device, equipment and storage medium
CN112183603A (en) Pox type recognition model training method and related device
CN111898626A (en) Model determination method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination