CN112766238A - Age prediction method and device - Google Patents

Age prediction method and device

Info

Publication number
CN112766238A
Authority
CN
China
Prior art keywords
face
features
age
network
face image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110278664.4A
Other languages
Chinese (zh)
Other versions
CN112766238B (en)
Inventor
陈晨
冯子钜
叶润源
毛永雄
董帅
邹昆
李悦乔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshan Xidao Technology Co ltd
University of Electronic Science and Technology of China Zhongshan Institute
Original Assignee
Zhongshan Xidao Technology Co ltd
University of Electronic Science and Technology of China Zhongshan Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongshan Xidao Technology Co ltd and University of Electronic Science and Technology of China Zhongshan Institute
Priority to CN202110278664.4A
Publication of CN112766238A
Application granted
Publication of CN112766238B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178 Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an age prediction method and device, relating to the field of image recognition. In the age prediction method, an age prediction result corresponding to a face image is output according to the face features, the face dynamic change features, and a pre-trained age prediction model, so that the predicted age information is more accurate.

Description

Age prediction method and device
Technical Field
The present application relates to the field of image recognition, and in particular, to an age prediction method and apparatus.
Background
Generally, in the field of image recognition, the age of a user corresponding to a face image can be predicted by extracting image features of the face image and then analyzing the extracted image features.
At present, the typical way of determining the age of a user from a face image through image recognition is as follows: partial images of four parts of the face image, namely the left eye, the right eye, the nose, and the mouth, are obtained; multi-scale local features are extracted from each of the four partial images; and the extracted multi-scale local features of the four partial images are concatenated to obtain a face fusion feature. Finally, the age of the user corresponding to the face image is predicted from the face fusion feature. However, the accuracy of the user age determined in this manner is low.
Disclosure of Invention
An object of the embodiments of the present application is to provide an age prediction method and an age prediction apparatus, so as to solve the problem that the age of a user determined from a face image has low accuracy.
In a first aspect, an embodiment of the present application provides an age prediction method, where the method includes:
acquiring face images of the same user acquired at multiple moments;
extracting the face features of each face image;
extracting face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments, wherein the previous moment is later than the later moment;
and outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
In a second aspect, an embodiment of the present application further provides an age prediction model training method, where the method includes:
determining an age prediction result based on a network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
determining whether the loss function is less than a preset threshold;
if the loss function is not less than the preset threshold, updating the historical face image samples, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
and if the loss function is less than the preset threshold, establishing an age prediction model based on the current network parameters of the network to be trained.
In a third aspect, an embodiment of the present application further provides an age prediction apparatus, where the apparatus includes:
the information acquisition unit is used for acquiring face images of the same user acquired at a plurality of moments;
the characteristic extraction unit is used for extracting the face characteristic of each face image;
the feature extraction unit is further configured to extract a dynamic face change feature according to a face feature of a face image acquired at a previous time and a face feature of a face image acquired at a subsequent time in each two adjacent times, where the previous time is later than the subsequent time;
and the age prediction unit is used for outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
In a fourth aspect, an embodiment of the present application further provides an age prediction model training apparatus, where the apparatus includes:
the information determining unit is used for determining an age prediction result based on a network to be trained according to the face characteristics and the face dynamic change characteristics of historical face image samples acquired at a plurality of historical moments;
the information determining unit is further configured to determine a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
the information determining unit is further configured to determine whether the loss function is smaller than a preset threshold;
the information updating unit is used for, if the loss function is not less than the preset threshold, updating the historical face image samples and the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
and the model establishing unit is used for establishing an age prediction model based on the current network parameters of the network to be trained if the loss function is less than the preset threshold.
In a fifth aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps in the method as provided in the first aspect are executed.
In a sixth aspect, embodiments of the present application provide a readable storage medium, on which a computer program is stored; when executed by a processor, the computer program performs the steps in the method provided in the first aspect.
Compared with the prior art, the method has the following beneficial effects: according to the age prediction method, the age prediction result corresponding to the face image is output according to the face features, the face dynamic change features and the pre-trained age prediction model. Therefore, the predicted age information is more accurate.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be regarded as limiting its scope; for those skilled in the art, other related drawings can also be obtained from these drawings without inventive effort.
Fig. 1 is a first flowchart of an age prediction method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a method for predicting age according to an embodiment of the present disclosure;
fig. 3 is a schematic interaction diagram of a server and a terminal device according to an embodiment of the present application;
fig. 4 is a flowchart of an age prediction model training method provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a cascaded multi-layer convolutional neural network provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a convolutional neural network provided in an embodiment of the present application;
fig. 7 is a schematic diagram of functional units of an age prediction apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a functional unit of an age prediction model training apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Technical term interpretation:
the Long Short-Term Memory (LSTM) network is a time-cycle neural network and is specially designed for solving the Long-Term dependence problem of the common RNN (cycle neural network). Due to the unique design structure, LSTM is suitable for handling and predicting significant events of very long intervals and delays in a time series. The LSTM network has the advantages that the input gate, the forgetting gate and the output gate are added, and the weight coefficient among the connections is designed, so that the LSTM network can accumulate long-term contact among nodes with longer distances, and long-term memory of data is realized.
A bidirectional long short-term memory (Bi-LSTM) network combines a forward LSTM with a backward LSTM. For the output at time t, the forward LSTM layer carries information from time t and earlier positions in the input sequence, while the backward LSTM layer carries information from time t and later positions, so the correlations within the contextual information can be determined.
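To make the forward/backward combination concrete, the following minimal sketch runs a Bi-LSTM over a short feature sequence (PyTorch is assumed as the framework; the sequence length and feature dimensions are illustrative, not taken from the patent):

import torch
import torch.nn as nn

seq = torch.randn(1, 5, 128)            # (batch, time steps, feature dimension)
bilstm = nn.LSTM(input_size=128, hidden_size=64,
                 batch_first=True, bidirectional=True)
out, _ = bilstm(seq)                    # out: (1, 5, 128) = forward 64 + backward 64
# At step t, out[:, t, :64] carries context from steps <= t (forward LSTM),
# and out[:, t, 64:] carries context from steps >= t (backward LSTM).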
As shown in fig. 1, the age prediction method provided by the present application may proceed as follows. The face images of the same user acquired at multiple moments are preprocessed, and the face features of each face image are then obtained from the preprocessed face images through graph convolution layers. Next, a long short-term memory network obtains the face dynamic change features from the extracted face features. A bidirectional long short-term memory network then outputs a second target vector from the fused features of the face features and the face dynamic change features, where the second target vector indicates the relationship between the fused features and age. Further, a fully connected layer maps the second target vector onto a preset age interval; finally, a softmax function determines the probability of each age in the preset age interval, and the age with the highest probability may be selected as the output age prediction result. The age prediction method provided by the present application is described in detail below with reference to fig. 2-5.
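As a hedged reconstruction of this data flow, the sketch below strings the stages together; every callable passed in is a placeholder for the corresponding module, not the patent's actual implementation (PyTorch assumed; each per-image call is assumed to return a (1, feature) tensor):

import torch

def predict_age(face_images, preprocess, to_graph, gcn, lstm, bilstm, fc):
    # Per-image face features from the graph convolutional network.
    feats = torch.stack([gcn(to_graph(preprocess(im))) for im in face_images], dim=1)
    dyn, _ = lstm(feats)                         # face dynamic change features
    fused = torch.cat([feats, dyn], dim=-1)      # splice static + dynamic features
    vec, _ = bilstm(fused)                       # second target vector per step
    logits = fc(vec[:, -1])                      # map onto the preset age interval
    probs = torch.softmax(logits, dim=-1)        # probability of each candidate age
    return int(probs.argmax())                   # age with the highest probability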
Referring to fig. 2, the present application provides an age prediction method, which can be applied to a server 100. As shown in fig. 3, the server 100 may be communicatively connected with the terminal device 200 for data exchange.
The method comprises the following steps:
s21: the method comprises the steps of acquiring face images of the same user acquired at multiple moments.
The user may trigger the terminal device 200 to acquire face images at multiple moments on a display interface of an application program of the terminal device 200 that has a photographing function, for example by using a burst function (such as three or five consecutive shots) in a camera application. Alternatively, the user may trigger the terminal device 200 to acquire the face images on a display interface of an application program that has a video shooting function, for example by using the video capture function of a camera application to record face images at multiple moments. The multiple moments may be continuous; for example, if a face image is acquired every 100 ms, the moments may be 100 ms, 200 ms, 300 ms, 400 ms, 500 ms, and so on, which is not limited herein. The multiple moments may also be discontinuous; for example, with the same 100 ms acquisition interval, the selected moments may be 100 ms, 300 ms, 500 ms, and so on, which likewise is not limited herein. The server 100 may receive the face images acquired at the multiple moments from the terminal device 200.
Optionally, the face images acquired at the multiple moments are preprocessed. The preprocessing includes at least: graying, normalization, extraction of the face region in the face image, and the like. The normalization may proceed as follows: parameters such as the size and brightness of the face images acquired at the multiple moments are normalized, so as to facilitate subsequent face recognition and determination of the face change features.
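One plausible form of this preprocessing is sketched below (OpenCV is assumed; the patent does not fix a library, crop size, or normalization constants, so all of those are illustrative):

import cv2
import numpy as np

def preprocess(bgr_face, size=(64, 64)):
    gray = cv2.cvtColor(bgr_face, cv2.COLOR_BGR2GRAY)   # graying
    gray = cv2.resize(gray, size)                       # normalize the size
    gray = cv2.equalizeHist(gray)                       # normalize the brightness
    return gray.astype(np.float32) / 255.0              # scale intensities to [0, 1]

The face region itself would be cropped beforehand, e.g. with any off-the-shelf face detector.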
S22: extracting the face features of each face image.
In the embodiment of the application, the face features may be undirected graph features. The undirected graph feature is used for representing the position relation of each pixel point in the face image relative to other pixel points. Wherein, S22 can be performed as follows:
step A: and converting each face image into a phase-free image matrix.
The undirected graph matrix is used for representing the position relationship of each pixel point in the face image relative to other pixel points. Specifically, each pixel point of the face image can be taken as a vertex pixel point of the undirected graph, and each vertex pixel point is connected with the pixel points whose Euclidean distance from it is smaller than a preset distance (for example, a Euclidean distance of √2), generating an adjacency matrix for each pixel point; the adjacency matrices of all the pixel points are then spliced into the edge set E. Assuming the face image includes w × h pixel points, the converted undirected graph matrix is G = {V, E}, where V is the set of w × h vertex pixel points and E is the edge set.
Step B: inputting the undirected graph matrix into the cascaded multi-layer graph convolutional neural network to obtain the undirected graph features.
Fig. 5 shows the structure of the cascaded multi-layer graph convolutional neural network (a 5-layer graph convolutional neural network is included in fig. 5). The parameters of each layer of the graph convolutional neural network are shown in table 1:
Number of layers                        Parameter(s)
Graph convolutional neural network 1    R=9, Q=32, ReLU, C=2
Graph convolutional neural network 2    R=9, Q=32, ReLU, C=2
Graph convolutional neural network 3    R=6, Q=64, ReLU, C=1
Graph convolutional neural network 4    R=6, Q=64, ReLU, C=1
Graph convolutional neural network 5    R=4, Q=128, ReLU, C=1
Table 1
In table 1, R is the size of the filter, Q is the number of undirected graphs output to the next layer, ReLU represents the activation function, and C represents the number of coarsening times.
As shown in fig. 6, each layer of the graph convolutional neural network includes a filtering layer, an activation layer, and a coarsening layer. Specifically, each layer processes the undirected graph matrix as follows. In the filtering layer, the undirected graph matrix is filtered to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix. (The filtering may proceed by first transforming the undirected graph matrix from the spatial domain into the frequency domain, filtering the frequency-domain undirected graph matrix to obtain the vertex vector matrix formed by all vertex pixel points, and finally transforming that vertex vector matrix back into the spatial domain.) In the activation layer, the vertex vector matrix is processed nonlinearly according to an activation function (the ReLU function). In the coarsening layer, the nonlinearly processed vertex vector matrix is coarsened. Understandably, in the cascaded multi-layer graph convolutional neural network, the first target vector input to the first-layer graph convolutional neural network is the undirected graph matrix, the output of the last-layer graph convolutional neural network is the undirected graph feature, and the output of each layer of the graph convolutional neural network is the input of the adjacent next layer.
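A minimal sketch of one such filter -> activation -> coarsen layer follows (PyTorch assumed). The patent filters in the frequency domain; the one-hop propagation adj @ x used here is a common spatial stand-in for that spectral filtering, and the pairwise max-pooling coarsening is likewise an assumption:

import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    def __init__(self, in_dim, out_dim, coarsen_times=1):
        super().__init__()
        self.filt = nn.Linear(in_dim, out_dim)   # learned filter (the R analogue)
        self.coarsen_times = coarsen_times       # C in table 1

    def forward(self, adj, x):                   # adj: (n, n), x: (n, in_dim)
        x = torch.relu(self.filt(adj @ x))       # filtering layer + activation layer
        for _ in range(self.coarsen_times):      # coarsening layer: merge vertex pairs
            n = x.shape[0] // 2 * 2              # drop an odd trailing vertex
            x = torch.max(x[:n:2], x[1:n:2])     # max-pool each pair of vertices
            adj = ((adj[:n:2, :n:2] + adj[1:n:2, 1:n:2]) > 0).float()
        return adj, x

Stacking five such layers with the R, Q, and C values of table 1 yields the cascaded network of fig. 5.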
In addition, in the embodiment of the application, the face features in the face image may also be extracted through a face feature extraction algorithm based on geometric features, a neural network, elastic graph matching, or a support vector machine, which is not limited herein. The face features may include left-eye features, right-eye features, nose features, mouth features, or a combination of at least two of these.
S23: extracting the face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments.
Here, the previous moment is later than the later moment. For example, if the face images acquired at multiple moments include a 100 ms face image, a 200 ms face image, and a 300 ms face image, the face dynamic change features of the face image at the previous moment relative to the face image at the later moment in any two adjacent moments include: the dynamic change features of the 200 ms face image relative to the 100 ms face image, and the dynamic change features of the 300 ms face image relative to the 200 ms face image. The face dynamic change features may include position change features, brightness change features, and color change features of the pixel points, among others, which are not limited herein.
Specifically, the face features of the face image acquired at the previous moment and those of the face image acquired at the later moment in every two adjacent moments are used as the input of a pre-trained dynamic change feature generation model to determine the face dynamic change features. The dynamic change feature generation model is trained by inputting into a long short-term memory network, as training samples, the face features of the face images acquired at the earlier and later of every two historically adjacent moments, with the face dynamic change features corresponding to the training samples as the target results.
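The sketch below illustrates this step: the per-image feature sequence is fed to an LSTM, and one dynamic change feature is read off for each pair of adjacent acquisition moments (PyTorch assumed; all dimensions are illustrative):

import torch
import torch.nn as nn

face_feats = torch.randn(1, 3, 128)    # features of 3 face images of the same user
lstm = nn.LSTM(input_size=128, hidden_size=128, batch_first=True)
states, _ = lstm(face_feats)           # one hidden state per acquisition moment
dyn_feats = states[:, 1:, :]           # one change feature per adjacent pair
# dyn_feats[:, k] models the change between images k and k+1 of the sequence.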
S24: outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and the pre-trained age prediction model.
The age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
After the age prediction result corresponding to the face image is obtained, the age prediction result may be transmitted back to the terminal device 200, and the terminal device 200 may display the age prediction result on a display interface of an application program having a photographing function. Further, the user can know the age prediction result based on the display interface.
According to the age prediction method, the age prediction result corresponding to the face image is output according to the face features, the face dynamic change features and the pre-trained age prediction model. Therefore, the accuracy of the predicted age information is higher.
The following describes the training process of the age prediction model in S24, and as shown in fig. 4, the training process may include:
s41: and determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments.
The age prediction model may include a bidirectional long short-term memory network, a fully connected layer, and a softmax function. The bidirectional long short-term memory network outputs a second target vector according to the fused features of the face features and the face dynamic change features (a feature matrix spliced from the face features and the face dynamic change features); the second target vector indicates the relationship between the fused features and age. The fully connected layer maps the second target vector onto a preset age interval. The softmax function determines the probability of each age in the preset age interval, and the age with the highest probability is selected as the output age prediction result.
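A hedged sketch of this prediction head is given below (PyTorch assumed; the age interval 0-100 and the layer widths are illustrative assumptions, not values fixed by the patent):

import torch
import torch.nn as nn

class AgeHead(nn.Module):
    def __init__(self, feat_dim=256, num_ages=101):       # ages 0..100 assumed
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, 128, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 128, num_ages)            # map to the preset age interval

    def forward(self, fused):                             # fused: (batch, T, feat_dim)
        vec, _ = self.bilstm(fused)                       # second target vector per step
        logits = self.fc(vec[:, -1])                      # last step summarizes the sequence
        probs = torch.softmax(logits, dim=-1)             # probability of each age
        return probs.argmax(dim=-1)                       # age with the highest probability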
S42: determining a loss function according to the age prediction result and the real age information corresponding to the historical face image samples.
S43: determining whether the loss function is smaller than the preset threshold; if so, S44 is performed; otherwise, S45 is performed.
S44: establishing an age prediction model based on the current network parameters of the network to be trained.
S45: updating the historical face image samples, updating the network parameters of the network to be trained based on the loss function, and returning to step S41.
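The S41-S45 loop can be summarized in the following sketch; the network, the sample batches, and the choice of cross-entropy against the real ages are all placeholders, since the patent does not name a specific loss:

import torch

def train(network, sample_batches, optimizer, threshold=1e-3):
    loss_fn = torch.nn.CrossEntropyLoss()
    for feats, dyn_feats, true_ages in sample_batches:    # updated samples (S45)
        pred = network(feats, dyn_feats)                  # S41: age prediction (logits)
        loss = loss_fn(pred, true_ages)                   # S42: loss function
        if loss.item() < threshold:                       # S43 -> S44: converged,
            return network.state_dict()                   # keep current parameters
        optimizer.zero_grad()
        loss.backward()                                   # S45: update the network
        optimizer.step()                                  # parameters and continue
    return network.state_dict()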
In some alternative embodiments, S23 may include:
step 1: and extracting the undirected graph characteristics of each face image.
It can be understood that the manner of extracting the undirected graph features of each face image in step 1 is the same as that described in S22, and is not repeated here.
Step 2: the undirected graph features corresponding to the face images acquired at multiple moments are used as the input of a long short-term memory network to obtain the face dynamic change features of the face image at the previous moment relative to the face image at the later moment.
Referring to fig. 7, this embodiment further provides an age prediction apparatus, which can be applied to the server 100. As shown in fig. 3, the server 100 may be communicatively connected with the terminal device 200 for data exchange. It should be noted that the specific implementation and beneficial effects of the age prediction apparatus are the same as those of the above embodiments, and reference may be made to the description above. The apparatus includes an information acquisition unit 801, a feature extraction unit 802, and an age prediction unit 803.
an information acquisition unit 801 is configured to acquire face images of the same user acquired at multiple times.
The apparatus may further include: the preprocessing unit is used for preprocessing the face images acquired at a plurality of moments, wherein the preprocessing mode at least comprises the following steps: graying processing and normalization processing.
And a feature extraction unit 802, configured to extract a face feature of each face image.
The feature extraction unit 802 is further configured to extract, based on the face images acquired at multiple times, a face dynamic change feature of a face image at a previous time relative to a face image at a later time in any two adjacent times, where the previous time is later than the later time.
Specifically, the feature extraction unit 802 may include: the first feature extraction module is used for extracting undirected graph features of each face image, wherein the undirected graph features are used for representing the position relation of each pixel point in the face image relative to other pixel points.
The first feature extraction module is specifically used for processing each face image into an undirected graph matrix; inputting the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain undirected graph characteristics;
The processing of the undirected graph matrix by each layer of the graph convolutional neural network is as follows: filtering the undirected graph matrix to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix; processing the vertex vector matrix nonlinearly according to the activation function; and coarsening the nonlinearly processed vertex vector matrix to obtain the undirected graph features.
The second feature extraction module is configured to take the undirected graph features corresponding to the face images acquired at multiple moments as the input of the long short-term memory network, so as to obtain the face dynamic change features of the face image at the previous moment relative to the face image at the later moment.
The age prediction unit 803 is configured to output an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, where the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
Referring to fig. 8, an embodiment of the present application further provides an age prediction model training apparatus, which includes an information determining unit 801, an information updating unit 802, and a model establishing unit 803.
The information determining unit 801 is configured to determine an age prediction result, based on the network to be trained, according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments.
The information determining unit 801 is further configured to determine a loss function according to the age prediction result and the real age information corresponding to the historical face image sample.
The information determining unit 801 is further configured to determine whether the loss function is smaller than a preset threshold.
The information updating unit 802 is configured to: if the loss function is not less than the preset threshold, update the historical face image samples and the network parameters of the network to be trained based on the loss function, and return to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments.
The model establishing unit 803 is configured to establish an age prediction model based on the current network parameters of the network to be trained if the loss function is less than the preset threshold.
It should be noted that the shortcomings of the above prior-art solutions were discovered by the inventor through practice and careful study. Therefore, the discovery process of the above problems, as well as the solutions that the embodiments of the present invention propose for them, should both be regarded as the inventor's contribution to the present invention.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device for executing an age prediction model training method and an age prediction method according to an embodiment of the present disclosure, where the electronic device may include: at least one processor 110, such as a CPU, at least one communication interface 120, at least one memory 130, and at least one communication bus 140. Wherein the communication bus 140 is used for realizing direct connection communication of these components. The communication interface 120 of the device in the embodiment of the present application is used for performing signaling or data communication with other node devices. The memory 130 may be a high-speed RAM memory or a non-volatile memory (e.g., at least one disk memory). Memory 130 may optionally be at least one memory device located remotely from the aforementioned processor. The memory 130 stores computer readable instructions, which when executed by the processor 110, cause the electronic device to perform the method processes described above with reference to fig. 1.
It will be appreciated that the configuration shown in fig. 9 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 9 or have a different configuration than shown in fig. 9. The components shown in fig. 9 may be implemented in hardware, software, or a combination thereof.
The apparatus may be a module, a program segment, or code on an electronic device. It should be understood that the apparatus corresponds to the above-mentioned embodiment of the method of fig. 1, and can perform various steps related to the embodiment of the method of fig. 1, and the specific functions of the apparatus can be referred to the description above, and the detailed description is appropriately omitted here to avoid redundancy.
It should be noted that, for the convenience and conciseness of description, the specific working processes of the system and the device described above may refer to the corresponding processes in the foregoing method embodiments, and the description is not repeated here.
Embodiments of the present application provide a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the method processes performed by an electronic device in the method embodiment shown in fig. 1.
The present embodiments disclose a computer program product, including a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: acquiring face images acquired at multiple moments; extracting the face features of each face image; extracting, based on the face images acquired at multiple moments, the face dynamic change features of the face image at the previous moment relative to the face image at the later moment in any two adjacent moments, where the previous moment is later than the later moment; and outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, where the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method of age prediction, the method comprising:
acquiring face images of the same user acquired at multiple moments;
extracting the face features of each face image;
extracting face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments of the plurality of moments;
and outputting an age prediction result corresponding to the face images according to the face features respectively corresponding to the face images acquired at the plurality of moments, the extracted face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
2. The method according to claim 1, wherein the extracting the facial features of each facial image comprises:
converting each face image into an undirected graph matrix, wherein the undirected graph matrix is used for representing the position relationship of each pixel point in the face image relative to other pixel points;
and extracting the face features according to the undirected graph matrix.
3. The method according to claim 2, wherein the extracting the face features according to the undirected graph matrix comprises:
inputting the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain undirected graph characteristics;
wherein the processing of the undirected graph matrix by each layer of the graph convolutional neural network is as follows:
filtering an input first target vector to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix;
carrying out nonlinear processing on the vertex vector matrix according to an activation function;
coarsening the vertex vector matrix after the nonlinear processing;
in the cascaded multi-layer graph convolutional neural network, the first target vector input to the first-layer graph convolutional neural network is the undirected graph matrix, the output of the last-layer graph convolutional neural network is the undirected graph feature, and the output of each layer of the graph convolutional neural network is the input of the adjacent next layer.
4. The method according to claim 1, wherein extracting the face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments of the plurality of moments comprises:
determining the face dynamic change features by taking, as the input of a pre-trained dynamic change feature generation model, the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments, wherein the dynamic change feature generation model is trained by inputting into a long short-term memory network, as training samples, the face features of the face images acquired at the earlier and later of every two historically adjacent moments, with the face dynamic change features corresponding to the training samples as the target results.
5. The method of claim 1, wherein the age prediction model comprises a bidirectional long-short term memory network, a full connection layer and a softmax function, wherein the bidirectional long-short term memory network is configured to output a second target vector according to the fused features of the face features and the face dynamic variation features, and wherein the second target vector is configured to indicate a relationship between the fused features and the age; the full-connection layer is used for mapping the second target vector to a preset age interval; the softmax function is used for determining the probability of the second target vector at each age in a preset age interval, and selecting the age with the highest probability as an output age prediction result.
6. A method for training an age prediction model, the method comprising:
determining an age prediction result based on a network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
determining whether the loss function is less than a preset threshold;
if the loss function is not less than the preset threshold, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
and if the loss function is less than the preset threshold, establishing an age prediction model based on the current network parameters of the network to be trained.
7. An age prediction apparatus, characterized in that the apparatus comprises:
the information acquisition unit is used for acquiring face images of the same user acquired at a plurality of moments;
the characteristic extraction unit is used for extracting the face characteristic of each face image;
the feature extraction unit is further configured to extract a dynamic face change feature according to a face feature of a face image acquired at a previous time and a face feature of a face image acquired at a subsequent time in each two adjacent times, where the previous time is later than the subsequent time;
and the age prediction unit is used for outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
8. An age prediction model training apparatus, characterized in that the apparatus comprises:
the information determining unit is used for determining an age prediction result based on a network to be trained according to the face characteristics and the face dynamic change characteristics of historical face image samples acquired at a plurality of historical moments;
the information determining unit is further configured to determine a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
the information determining unit is further configured to determine whether the loss function is smaller than a preset threshold;
the information updating unit is used for, if the loss function is not less than the preset threshold, updating the historical face image samples and the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
and the model establishing unit is used for establishing an age prediction model based on the current network parameters of the network to be trained if the loss function is less than the preset threshold.
9. An electronic device comprising a processor and a memory, the memory storing computer readable instructions that, when executed by the processor, perform the method of any of claims 1-6.
10. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202110278664.4A 2021-03-15 2021-03-15 Age prediction method and device Active CN112766238B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110278664.4A CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110278664.4A CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Publications (2)

Publication Number Publication Date
CN112766238A true CN112766238A (en) 2021-05-07
CN112766238B CN112766238B (en) 2023-09-26

Family

ID=75691291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110278664.4A Active CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Country Status (1)

Country Link
CN (1) CN112766238B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286753A1 (en) * 2016-04-01 2017-10-05 Hon Hai Precision Industry Co., Ltd. Method for identifying age based on facial feature
CN109190449A (en) * 2018-07-09 2019-01-11 北京达佳互联信息技术有限公司 Age recognition methods, device, electronic equipment and storage medium
CN109271958A (en) * 2018-09-30 2019-01-25 厦门市巨龙信息科技有限公司 The recognition methods of face age and device
WO2020232855A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Method and apparatus for adjusting screen display on the basis of subtle expression
WO2020238321A1 (en) * 2019-05-27 2020-12-03 北京字节跳动网络技术有限公司 Method and device for age identification
CN110321863A (en) * 2019-07-09 2019-10-11 北京字节跳动网络技术有限公司 Age recognition methods and device, storage medium
CN111881737A (en) * 2020-06-18 2020-11-03 深圳数联天下智能科技有限公司 Training method and device of age prediction model, and age prediction method and device
CN111967382A (en) * 2020-08-14 2020-11-20 北京金山云网络技术有限公司 Age estimation method, and training method and device of age estimation model
CN112418195A (en) * 2021-01-22 2021-02-26 电子科技大学中山学院 Face key point detection method and device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ALBERT ALI SALAH et al.: "Combining Facial Dynamics With Appearance for Age Estimation", IEEE Transactions on Image Processing *
ROHINI G. BHAISARE et al.: "Study of Age Estimation Using Fixed Rank Representation (FRR)", International Journal of Computer Science Trends and Technology (IJCST) *
ZICHANG TAN et al.: "Deeply-learned Hybrid Representations for Facial Age Estimation", Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) *
ZHANG Liangliang et al.: "Face Age Prediction Based on Deep Learning", Computer Engineering *
JI Zhipeng: "Research on Video-based Face Age Estimation Algorithms", China Masters' Theses Full-text Database: Information Science and Technology *

Also Published As

Publication number Publication date
CN112766238B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN110378264B (en) Target tracking method and device
JP7058373B2 (en) Lesion detection and positioning methods, devices, devices, and storage media for medical images
CN110135319B (en) Abnormal behavior detection method and system
CN113196289B (en) Human body action recognition method, human body action recognition system and equipment
CN111582141B (en) Face recognition model training method, face recognition method and device
CN112464807A (en) Video motion recognition method and device, electronic equipment and storage medium
CN109685037B (en) Real-time action recognition method and device and electronic equipment
CN113674421B (en) 3D target detection method, model training method, related device and electronic equipment
CN111582483A (en) Unsupervised learning optical flow estimation method based on space and channel combined attention mechanism
CN112561879B (en) Ambiguity evaluation model training method, image ambiguity evaluation method and image ambiguity evaluation device
CN112101262B (en) Multi-feature fusion sign language recognition method and network model
CN110738103A (en) Living body detection method, living body detection device, computer equipment and storage medium
CN111401322A (en) Station entering and exiting identification method and device, terminal and storage medium
CN112200056A (en) Face living body detection method and device, electronic equipment and storage medium
WO2021103474A1 (en) Image processing method and apparatus, storage medium and electronic apparatus
CN113221771A (en) Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product
CN114445663A (en) Method, apparatus and computer program product for detecting challenge samples
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN111652242B (en) Image processing method, device, electronic equipment and storage medium
US20230115765A1 (en) Method and apparatus of transferring image, and method and apparatus of training image transfer model
CN112766238B (en) Age prediction method and device
CN116129417A (en) Digital instrument reading detection method based on low-quality image
CN115861384A (en) Optical flow estimation method and system based on generation of countermeasure and attention mechanism
CN113887319A (en) Three-dimensional attitude determination method and device, electronic equipment and storage medium
CN113610856A (en) Method and device for training image segmentation model and image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant