CN112766238A - Age prediction method and device - Google Patents

Age prediction method and device

Info

Publication number
CN112766238A
Authority
CN
China
Prior art keywords
face
features
age
network
face image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110278664.4A
Other languages
Chinese (zh)
Other versions
CN112766238B (en)
Inventor
陈晨
冯子钜
叶润源
毛永雄
董帅
邹昆
李悦乔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshan Xidao Technology Co ltd
University of Electronic Science and Technology of China Zhongshan Institute
Original Assignee
Zhongshan Xidao Technology Co ltd
University of Electronic Science and Technology of China Zhongshan Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongshan Xidao Technology Co ltd and University of Electronic Science and Technology of China Zhongshan Institute
Priority to CN202110278664.4A
Publication of CN112766238A
Application granted
Publication of CN112766238B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178 Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an age prediction method and device, relating to the field of image recognition. In the age prediction method, an age prediction result corresponding to a face image is output according to the face features, the face dynamic change features, and a pre-trained age prediction model, so that the predicted age information is more accurate.

Description

Age prediction method and device
Technical Field
The present application relates to the field of image recognition, and in particular, to an age prediction method and apparatus.
Background
Generally, in the field of image recognition, the age of a user corresponding to a face image can be predicted by extracting image features of the face image and then analyzing the extracted image features.
At present, the typical way of determining the age of a user from a face image through image recognition is as follows: partial images of four parts of the face image, namely the left eye, the right eye, the nose, and the mouth, are obtained; multi-scale local features are extracted from each of the four partial images; and the extracted multi-scale local features of the four partial images are concatenated to obtain a face fusion feature. Finally, the age of the user corresponding to the face image is predicted from the face fusion feature. However, the accuracy of the user age determined in this manner is low.
Disclosure of Invention
An object of the embodiments of the present application is to provide an age prediction method and an age prediction apparatus, so as to solve the problem that the age of a user determined from a face image has low accuracy.
In a first aspect, an embodiment of the present application provides an age prediction method, where the method includes:
acquiring face images of the same user acquired at multiple moments;
extracting the face features of each face image;
extracting face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments, wherein the previous moment is later than the later moment;
and outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
In a second aspect, an embodiment of the present application further provides an age prediction model training method, where the method includes:
determining an age prediction result based on a network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
determining whether the loss function is less than a preset threshold;
if the loss function is not less than the preset threshold, updating the historical face image samples, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
and if the loss function is less than the preset threshold, establishing an age prediction model based on the current network parameters of the network to be trained.
In a third aspect, an embodiment of the present application further provides an age prediction apparatus, where the apparatus includes:
the information acquisition unit is used for acquiring face images of the same user acquired at a plurality of moments;
the characteristic extraction unit is used for extracting the face characteristic of each face image;
the feature extraction unit is further configured to extract a dynamic face change feature according to a face feature of a face image acquired at a previous time and a face feature of a face image acquired at a subsequent time in each two adjacent times, where the previous time is later than the subsequent time;
and the age prediction unit is used for outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
In a fourth aspect, an embodiment of the present application further provides an age prediction model training apparatus, where the apparatus includes:
the information determining unit is used for determining an age prediction result based on a network to be trained according to the face characteristics and the face dynamic change characteristics of historical face image samples acquired at a plurality of historical moments;
the information determining unit is further configured to determine a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
the information determining unit is further configured to determine whether the loss function is smaller than a preset threshold;
the information updating unit is used for, if the loss function is not less than the preset threshold, updating the historical face image samples and the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
and the model establishing unit is used for establishing an age prediction model based on the current network parameters of the network to be trained if the loss function is less than the preset threshold.
In a fifth aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps in the method as provided in the first aspect are executed.
In a sixth aspect, embodiments of the present application provide a readable storage medium, on which a computer program is stored; when executed by a processor, the computer program performs the steps in the method provided in the first aspect.
Compared with the prior art, the method has the following beneficial effects: according to the age prediction method, the age prediction result corresponding to the face image is output according to the face features, the face dynamic change features and the pre-trained age prediction model. Therefore, the predicted age information is more accurate.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be regarded as limiting its scope; for those skilled in the art, other related drawings can also be obtained from these drawings without inventive effort.
Fig. 1 is a first flowchart of an age prediction method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a method for predicting age according to an embodiment of the present disclosure;
fig. 3 is a schematic interaction diagram of a server and a terminal device according to an embodiment of the present application;
fig. 4 is a flowchart of an age prediction model training method provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a cascaded multi-layer convolutional neural network provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a convolutional neural network provided in an embodiment of the present application;
fig. 7 is a schematic diagram of functional units of an age prediction apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a functional unit of an age prediction model training apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Technical term interpretation:
the Long Short-Term Memory (LSTM) network is a time-cycle neural network and is specially designed for solving the Long-Term dependence problem of the common RNN (cycle neural network). Due to the unique design structure, LSTM is suitable for handling and predicting significant events of very long intervals and delays in a time series. The LSTM network has the advantages that the input gate, the forgetting gate and the output gate are added, and the weight coefficient among the connections is designed, so that the LSTM network can accumulate long-term contact among nodes with longer distances, and long-term memory of data is realized.
A bidirectional long short-term memory (Bi-LSTM) network combines a forward LSTM with a backward LSTM. For the output at time t, the forward LSTM layer carries information from time t and earlier positions in the input sequence, while the backward LSTM layer carries information from time t and later positions, so the correlations within the contextual information can be determined.
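To make the forward/backward combination concrete, the following minimal sketch runs a Bi-LSTM over a short feature sequence (PyTorch is assumed as the framework; the sequence length and feature dimensions are illustrative, not taken from the patent):

import torch
import torch.nn as nn

seq = torch.randn(1, 5, 128)            # (batch, time steps, feature dimension)
bilstm = nn.LSTM(input_size=128, hidden_size=64,
                 batch_first=True, bidirectional=True)
out, _ = bilstm(seq)                    # out: (1, 5, 128) = forward 64 + backward 64
# At step t, out[:, t, :64] carries context from steps <= t (forward LSTM),
# and out[:, t, 64:] carries context from steps >= t (backward LSTM).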
As shown in fig. 1, the age prediction method provided by the present application may proceed as follows. The face images of the same user acquired at multiple moments are preprocessed, and the face features of each face image are then obtained from the preprocessed face images through graph convolution layers. Next, a long short-term memory network obtains the face dynamic change features from the extracted face features. A bidirectional long short-term memory network then outputs a second target vector from the fused features of the face features and the face dynamic change features, where the second target vector indicates the relationship between the fused features and age. Further, a fully connected layer maps the second target vector onto a preset age interval; finally, a softmax function determines the probability of each age in the preset age interval, and the age with the highest probability may be selected as the output age prediction result. The age prediction method provided by the present application is described in detail below with reference to fig. 2-5.
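As a hedged reconstruction of this data flow, the sketch below strings the stages together; every callable passed in is a placeholder for the corresponding module, not the patent's actual implementation (PyTorch assumed; each per-image call is assumed to return a (1, feature) tensor):

import torch

def predict_age(face_images, preprocess, to_graph, gcn, lstm, bilstm, fc):
    # Per-image face features from the graph convolutional network.
    feats = torch.stack([gcn(to_graph(preprocess(im))) for im in face_images], dim=1)
    dyn, _ = lstm(feats)                         # face dynamic change features
    fused = torch.cat([feats, dyn], dim=-1)      # splice static + dynamic features
    vec, _ = bilstm(fused)                       # second target vector per step
    logits = fc(vec[:, -1])                      # map onto the preset age interval
    probs = torch.softmax(logits, dim=-1)        # probability of each candidate age
    return int(probs.argmax())                   # age with the highest probability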
Referring to fig. 2, the present application provides an age prediction method, which can be applied to a server 100. As shown in fig. 3, the server 100 may be communicatively connected with the terminal device 200 for data exchange.
The method comprises the following steps:
s21: the method comprises the steps of acquiring face images of the same user acquired at multiple moments.
The user may trigger the terminal device 200 to acquire face images at multiple moments on a display interface of an application program of the terminal device 200 that has a photographing function, for example by using a burst function (such as three or five consecutive shots) in a camera application. Alternatively, the user may trigger the terminal device 200 to acquire the face images on a display interface of an application program that has a video shooting function, for example by using the video capture function of a camera application to record face images at multiple moments. The multiple moments may be continuous; for example, if a face image is acquired every 100 ms, the moments may be 100 ms, 200 ms, 300 ms, 400 ms, 500 ms, and so on, which is not limited herein. The multiple moments may also be discontinuous; for example, with the same 100 ms acquisition interval, the selected moments may be 100 ms, 300 ms, 500 ms, and so on, which likewise is not limited herein. The server 100 may receive the face images acquired at the multiple moments from the terminal device 200.
Optionally, the face images acquired at the multiple moments are preprocessed. The preprocessing includes at least: graying, normalization, extraction of the face region in the face image, and the like. The normalization may proceed as follows: parameters such as the size and brightness of the face images acquired at the multiple moments are normalized, so as to facilitate subsequent face recognition and determination of the face change features.
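One plausible form of this preprocessing is sketched below (OpenCV is assumed; the patent does not fix a library, crop size, or normalization constants, so all of those are illustrative):

import cv2
import numpy as np

def preprocess(bgr_face, size=(64, 64)):
    gray = cv2.cvtColor(bgr_face, cv2.COLOR_BGR2GRAY)   # graying
    gray = cv2.resize(gray, size)                       # normalize the size
    gray = cv2.equalizeHist(gray)                       # normalize the brightness
    return gray.astype(np.float32) / 255.0              # scale intensities to [0, 1]

The face region itself would be cropped beforehand, e.g. with any off-the-shelf face detector.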
S22: extracting the face features of each face image.
In the embodiment of the application, the face features may be undirected graph features. The undirected graph feature is used for representing the position relation of each pixel point in the face image relative to other pixel points. Wherein, S22 can be performed as follows:
step A: and converting each face image into a phase-free image matrix.
The undirected graph matrix is used for representing the position relationship of each pixel point in the face image relative to other pixel points. Specifically, each pixel point of the face image can be taken as a vertex pixel point of the undirected graph, and each vertex pixel point is connected with the pixel points whose Euclidean distance from it is smaller than a preset distance (for example, a Euclidean distance of √2), generating an adjacency matrix for each pixel point; the adjacency matrices of all the pixel points are then spliced into the edge set E. Assuming the face image includes w × h pixel points, the converted undirected graph matrix is G = {V, E}, where V is the set of w × h vertex pixel points and E is the edge set.
Step B: inputting the undirected graph matrix into the cascaded multi-layer graph convolutional neural network to obtain the undirected graph features.
Fig. 5 shows the structure of the cascaded multi-layer graph convolutional neural network (a 5-layer graph convolutional neural network is included in fig. 5). The parameters of each layer of the graph convolutional neural network are shown in table 1:
Number of layers                        Parameter(s)
Graph convolutional neural network 1    R=9, Q=32, ReLU, C=2
Graph convolutional neural network 2    R=9, Q=32, ReLU, C=2
Graph convolutional neural network 3    R=6, Q=64, ReLU, C=1
Graph convolutional neural network 4    R=6, Q=64, ReLU, C=1
Graph convolutional neural network 5    R=4, Q=128, ReLU, C=1
Table 1
In table 1, R is the size of the filter, Q is the number of undirected graphs output to the next layer, ReLU represents the activation function, and C represents the number of coarsening times.
As shown in fig. 6, each layer of the graph convolutional neural network includes a filtering layer, an activation layer, and a coarsening layer. Specifically, each layer processes the undirected graph matrix as follows. In the filtering layer, the undirected graph matrix is filtered to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix. (The filtering may proceed by first transforming the undirected graph matrix from the spatial domain into the frequency domain, filtering the frequency-domain undirected graph matrix to obtain the vertex vector matrix formed by all vertex pixel points, and finally transforming that vertex vector matrix back into the spatial domain.) In the activation layer, the vertex vector matrix is processed nonlinearly according to an activation function (the ReLU function). In the coarsening layer, the nonlinearly processed vertex vector matrix is coarsened. Understandably, in the cascaded multi-layer graph convolutional neural network, the first target vector input to the first-layer graph convolutional neural network is the undirected graph matrix, the output of the last-layer graph convolutional neural network is the undirected graph feature, and the output of each layer of the graph convolutional neural network is the input of the adjacent next layer.
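A minimal sketch of one such filter -> activation -> coarsen layer follows (PyTorch assumed). The patent filters in the frequency domain; the one-hop propagation adj @ x used here is a common spatial stand-in for that spectral filtering, and the pairwise max-pooling coarsening is likewise an assumption:

import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    def __init__(self, in_dim, out_dim, coarsen_times=1):
        super().__init__()
        self.filt = nn.Linear(in_dim, out_dim)   # learned filter (the R analogue)
        self.coarsen_times = coarsen_times       # C in table 1

    def forward(self, adj, x):                   # adj: (n, n), x: (n, in_dim)
        x = torch.relu(self.filt(adj @ x))       # filtering layer + activation layer
        for _ in range(self.coarsen_times):      # coarsening layer: merge vertex pairs
            n = x.shape[0] // 2 * 2              # drop an odd trailing vertex
            x = torch.max(x[:n:2], x[1:n:2])     # max-pool each pair of vertices
            adj = ((adj[:n:2, :n:2] + adj[1:n:2, 1:n:2]) > 0).float()
        return adj, x

Stacking five such layers with the R, Q, and C values of table 1 yields the cascaded network of fig. 5.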
In addition, in the embodiment of the application, the face features in the face image may also be extracted through a face feature extraction algorithm based on geometric features, a neural network, elastic graph matching, or a support vector machine, which is not limited herein. The face features may include left-eye features, right-eye features, nose features, mouth features, or a combination of at least two of these.
S23: extracting the face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments.
Here, the previous moment is later than the later moment. For example, if the face images acquired at multiple moments include a 100 ms face image, a 200 ms face image, and a 300 ms face image, the face dynamic change features of the face image at the previous moment relative to the face image at the later moment in any two adjacent moments include: the dynamic change features of the 200 ms face image relative to the 100 ms face image, and the dynamic change features of the 300 ms face image relative to the 200 ms face image. The face dynamic change features may include position change features, brightness change features, and color change features of the pixel points, among others, which are not limited herein.
Specifically, the face features of the face image acquired at the previous moment and those of the face image acquired at the later moment in every two adjacent moments are used as the input of a pre-trained dynamic change feature generation model to determine the face dynamic change features. The dynamic change feature generation model is trained by inputting into a long short-term memory network, as training samples, the face features of the face images acquired at the earlier and later of every two historically adjacent moments, with the face dynamic change features corresponding to the training samples as the target results.
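The sketch below illustrates this step: the per-image feature sequence is fed to an LSTM, and one dynamic change feature is read off for each pair of adjacent acquisition moments (PyTorch assumed; all dimensions are illustrative):

import torch
import torch.nn as nn

face_feats = torch.randn(1, 3, 128)    # features of 3 face images of the same user
lstm = nn.LSTM(input_size=128, hidden_size=128, batch_first=True)
states, _ = lstm(face_feats)           # one hidden state per acquisition moment
dyn_feats = states[:, 1:, :]           # one change feature per adjacent pair
# dyn_feats[:, k] models the change between images k and k+1 of the sequence.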
S24: outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and the pre-trained age prediction model.
The age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
After the age prediction result corresponding to the face image is obtained, the age prediction result may be transmitted back to the terminal device 200, and the terminal device 200 may display the age prediction result on a display interface of an application program having a photographing function. Further, the user can know the age prediction result based on the display interface.
According to the age prediction method, the age prediction result corresponding to the face image is output according to the face features, the face dynamic change features and the pre-trained age prediction model. Therefore, the accuracy of the predicted age information is higher.
The following describes the training process of the age prediction model in S24, and as shown in fig. 4, the training process may include:
s41: and determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments.
The age prediction model may include a bidirectional long short-term memory network, a fully connected layer, and a softmax function. The bidirectional long short-term memory network outputs a second target vector according to the fused features of the face features and the face dynamic change features (a feature matrix spliced from the face features and the face dynamic change features); the second target vector indicates the relationship between the fused features and age. The fully connected layer maps the second target vector onto a preset age interval. The softmax function determines the probability of each age in the preset age interval, and the age with the highest probability is selected as the output age prediction result.
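A hedged sketch of this prediction head is given below (PyTorch assumed; the age interval 0-100 and the layer widths are illustrative assumptions, not values fixed by the patent):

import torch
import torch.nn as nn

class AgeHead(nn.Module):
    def __init__(self, feat_dim=256, num_ages=101):       # ages 0..100 assumed
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, 128, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 128, num_ages)            # map to the preset age interval

    def forward(self, fused):                             # fused: (batch, T, feat_dim)
        vec, _ = self.bilstm(fused)                       # second target vector per step
        logits = self.fc(vec[:, -1])                      # last step summarizes the sequence
        probs = torch.softmax(logits, dim=-1)             # probability of each age
        return probs.argmax(dim=-1)                       # age with the highest probability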
S42: determining a loss function according to the age prediction result and the real age information corresponding to the historical face image samples.
S43: determining whether the loss function is smaller than the preset threshold; if so, S44 is performed; otherwise, S45 is performed.
S44: establishing an age prediction model based on the current network parameters of the network to be trained.
S45: updating the historical face image samples, updating the network parameters of the network to be trained based on the loss function, and returning to step S41.
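The S41-S45 loop can be summarized in the following sketch; the network, the sample batches, and the choice of cross-entropy against the real ages are all placeholders, since the patent does not name a specific loss:

import torch

def train(network, sample_batches, optimizer, threshold=1e-3):
    loss_fn = torch.nn.CrossEntropyLoss()
    for feats, dyn_feats, true_ages in sample_batches:    # updated samples (S45)
        pred = network(feats, dyn_feats)                  # S41: age prediction (logits)
        loss = loss_fn(pred, true_ages)                   # S42: loss function
        if loss.item() < threshold:                       # S43 -> S44: converged,
            return network.state_dict()                   # keep current parameters
        optimizer.zero_grad()
        loss.backward()                                   # S45: update the network
        optimizer.step()                                  # parameters and continue
    return network.state_dict()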
In some alternative embodiments, S23 may include:
step 1: and extracting the undirected graph characteristics of each face image.
It can be understood that the manner of extracting the undirected graph features of each face image in step 1 is the same as that described in S22, and is not repeated here.
Step 2: the undirected graph features corresponding to the face images acquired at multiple moments are used as the input of a long short-term memory network to obtain the face dynamic change features of the face image at the previous moment relative to the face image at the later moment.
Referring to fig. 7, this embodiment further provides an age prediction apparatus, which can be applied to the server 100. As shown in fig. 3, the server 100 may be communicatively connected with the terminal device 200 for data exchange. It should be noted that the specific implementation and beneficial effects of the age prediction apparatus are the same as those of the above embodiments, and reference may be made to the description above. The apparatus includes an information acquisition unit 801, a feature extraction unit 802, and an age prediction unit 803.
an information acquisition unit 801 is configured to acquire face images of the same user acquired at multiple times.
The apparatus may further include: the preprocessing unit is used for preprocessing the face images acquired at a plurality of moments, wherein the preprocessing mode at least comprises the following steps: graying processing and normalization processing.
And a feature extraction unit 802, configured to extract a face feature of each face image.
The feature extraction unit 802 is further configured to extract, based on the face images acquired at multiple times, a face dynamic change feature of a face image at a previous time relative to a face image at a later time in any two adjacent times, where the previous time is later than the later time.
Specifically, the feature extraction unit 802 may include: the first feature extraction module is used for extracting undirected graph features of each face image, wherein the undirected graph features are used for representing the position relation of each pixel point in the face image relative to other pixel points.
The first feature extraction module is specifically used for processing each face image into an undirected graph matrix; inputting the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain undirected graph characteristics;
The processing of the undirected graph matrix by each layer of the graph convolutional neural network is as follows: filtering the undirected graph matrix to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix; processing the vertex vector matrix nonlinearly according to the activation function; and coarsening the nonlinearly processed vertex vector matrix to obtain the undirected graph features.
The second feature extraction module is configured to take the undirected graph features corresponding to the face images acquired at multiple moments as the input of the long short-term memory network, so as to obtain the face dynamic change features of the face image at the previous moment relative to the face image at the later moment.
The age prediction unit 803 is configured to output an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, where the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
Referring to fig. 8, an embodiment of the present application further provides an age prediction model training apparatus, which includes an information determining unit 801, an information updating unit 802, and a model establishing unit 803.
The information determining unit 801 is configured to determine an age prediction result, based on the network to be trained, according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments.
The information determining unit 801 is further configured to determine a loss function according to the age prediction result and the real age information corresponding to the historical face image sample.
The information determining unit 801 is further configured to determine whether the loss function is smaller than a preset threshold.
The information updating unit 802 is configured to: if the loss function is not less than the preset threshold, update the historical face image samples and the network parameters of the network to be trained based on the loss function, and return to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments.
The model establishing unit 803 is configured to establish an age prediction model based on the current network parameters of the network to be trained if the loss function is less than the preset threshold.
It should be noted that the shortcomings of the above prior-art solutions were discovered by the inventor through practice and careful study. Therefore, the discovery process of the above problems, as well as the solutions that the embodiments of the present invention propose for them, should both be regarded as the inventor's contribution to the present invention.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device for executing an age prediction model training method and an age prediction method according to an embodiment of the present disclosure, where the electronic device may include: at least one processor 110, such as a CPU, at least one communication interface 120, at least one memory 130, and at least one communication bus 140. Wherein the communication bus 140 is used for realizing direct connection communication of these components. The communication interface 120 of the device in the embodiment of the present application is used for performing signaling or data communication with other node devices. The memory 130 may be a high-speed RAM memory or a non-volatile memory (e.g., at least one disk memory). Memory 130 may optionally be at least one memory device located remotely from the aforementioned processor. The memory 130 stores computer readable instructions, which when executed by the processor 110, cause the electronic device to perform the method processes described above with reference to fig. 1.
It will be appreciated that the configuration shown in fig. 9 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 9 or have a different configuration than shown in fig. 9. The components shown in fig. 9 may be implemented in hardware, software, or a combination thereof.
The apparatus may be a module, a program segment, or code on an electronic device. It should be understood that the apparatus corresponds to the above-mentioned embodiment of the method of fig. 1, and can perform various steps related to the embodiment of the method of fig. 1, and the specific functions of the apparatus can be referred to the description above, and the detailed description is appropriately omitted here to avoid redundancy.
It should be noted that, for the convenience and conciseness of description, the specific working processes of the system and the device described above may refer to the corresponding processes in the foregoing method embodiments, and the description is not repeated here.
Embodiments of the present application provide a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the method processes performed by an electronic device in the method embodiment shown in fig. 1.
The present embodiments disclose a computer program product, including a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: acquiring face images acquired at multiple moments; extracting the face features of each face image; extracting, based on the face images acquired at multiple moments, the face dynamic change features of the face image at the previous moment relative to the face image at the later moment in any two adjacent moments, where the previous moment is later than the later moment; and outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, where the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method of age prediction, the method comprising:
acquiring face images of the same user acquired at multiple moments;
extracting the face features of each face image;
extracting face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments of the plurality of moments;
and outputting an age prediction result corresponding to the face images according to the face features respectively corresponding to the face images acquired at the plurality of moments, the extracted face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
2. The method according to claim 1, wherein the extracting the facial features of each facial image comprises:
converting each face image into an undirected graph matrix, wherein the undirected graph matrix is used for representing the position relationship of each pixel point in the face image relative to other pixel points;
and extracting the face features according to the undirected graph matrix.
3. The method according to claim 2, wherein the extracting the face features according to the undirected graph matrix comprises:
inputting the undirected graph matrix into a cascaded multi-layer graph convolutional neural network to obtain undirected graph characteristics;
wherein the processing of the undirected graph matrix by each layer of the graph convolutional neural network is as follows:
filtering an input first target vector to obtain a vertex vector matrix formed by all vertex pixel points in the undirected graph matrix;
carrying out nonlinear processing on the vertex vector matrix according to an activation function;
coarsening the vertex vector matrix after the nonlinear processing;
in the cascaded multi-layer graph convolutional neural network, the first target vector input to the first-layer graph convolutional neural network is the undirected graph matrix, the output of the last-layer graph convolutional neural network is the undirected graph feature, and the output of each layer of the graph convolutional neural network is the input of the adjacent next layer.
4. The method according to claim 1, wherein extracting the face dynamic change features according to the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments of the plurality of moments comprises:
determining the face dynamic change features by taking, as the input of a pre-trained dynamic change feature generation model, the face features of the face image acquired at the previous moment and the face features of the face image acquired at the later moment in every two adjacent moments, wherein the dynamic change feature generation model is trained by inputting into a long short-term memory network, as training samples, the face features of the face images acquired at the earlier and later of every two historically adjacent moments, with the face dynamic change features corresponding to the training samples as the target results.
5. The method of claim 1, wherein the age prediction model comprises a bidirectional long-short term memory network, a full connection layer and a softmax function, wherein the bidirectional long-short term memory network is configured to output a second target vector according to the fused features of the face features and the face dynamic variation features, and wherein the second target vector is configured to indicate a relationship between the fused features and the age; the full-connection layer is used for mapping the second target vector to a preset age interval; the softmax function is used for determining the probability of the second target vector at each age in a preset age interval, and selecting the age with the highest probability as an output age prediction result.
6. A method for training an age prediction model, the method comprising:
determining an age prediction result based on a network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
determining a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
determining whether the loss function is less than a preset threshold;
if the loss function is not less than the preset threshold, updating the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of historical face image samples acquired at a plurality of historical moments;
and if the loss function is less than the preset threshold, establishing an age prediction model based on the current network parameters of the network to be trained.
7. An age prediction apparatus, characterized in that the apparatus comprises:
the information acquisition unit is used for acquiring face images of the same user acquired at a plurality of moments;
the characteristic extraction unit is used for extracting the face characteristic of each face image;
the feature extraction unit is further configured to extract a dynamic face change feature according to a face feature of a face image acquired at a previous time and a face feature of a face image acquired at a subsequent time in each two adjacent times, where the previous time is later than the subsequent time;
and the age prediction unit is used for outputting an age prediction result corresponding to the face images according to the face features, the face dynamic change features, and a pre-trained age prediction model, wherein the age prediction model is trained by taking, as the input of a network to be trained, the face features, the face dynamic change features, and the real age information of historical face image samples acquired at a plurality of historical moments.
8. An age prediction model training apparatus, characterized in that the apparatus comprises:
the information determining unit is used for determining an age prediction result based on a network to be trained according to the face characteristics and the face dynamic change characteristics of historical face image samples acquired at a plurality of historical moments;
the information determining unit is further configured to determine a loss function according to the age prediction result and the real age information corresponding to the historical face image sample;
the information determining unit is further configured to determine whether the loss function is smaller than a preset threshold;
the information updating unit is used for, if the loss function is not less than the preset threshold, updating the historical face image samples and the network parameters of the network to be trained based on the loss function, and returning to the step of determining an age prediction result based on the network to be trained according to the face features and the face dynamic change features of the historical face image samples acquired at a plurality of historical moments;
and the model establishing unit is used for establishing an age prediction model based on the current network parameters of the network to be trained if the loss function is less than the preset threshold.
9. An electronic device comprising a processor and a memory, the memory storing computer readable instructions that, when executed by the processor, perform the method of any of claims 1-6.
10. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202110278664.4A 2021-03-15 2021-03-15 Age prediction method and device Active CN112766238B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110278664.4A CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110278664.4A CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Publications (2)

Publication Number Publication Date
CN112766238A true CN112766238A (en) 2021-05-07
CN112766238B CN112766238B (en) 2023-09-26

Family

ID=75691291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110278664.4A Active CN112766238B (en) 2021-03-15 2021-03-15 Age prediction method and device

Country Status (1)

Country Link
CN (1) CN112766238B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286753A1 (en) * 2016-04-01 2017-10-05 Hon Hai Precision Industry Co., Ltd. Method for identifying age based on facial feature
CN109190449A (en) * 2018-07-09 2019-01-11 北京达佳互联信息技术有限公司 Age recognition methods, device, electronic equipment and storage medium
CN109271958A (en) * 2018-09-30 2019-01-25 厦门市巨龙信息科技有限公司 The recognition methods of face age and device
WO2020232855A1 (en) * 2019-05-21 2020-11-26 平安科技(深圳)有限公司 Method and apparatus for adjusting screen display on the basis of subtle expression
WO2020238321A1 (en) * 2019-05-27 2020-12-03 北京字节跳动网络技术有限公司 Method and device for age identification
CN110321863A (en) * 2019-07-09 2019-10-11 北京字节跳动网络技术有限公司 Age recognition methods and device, storage medium
CN111881737A (en) * 2020-06-18 2020-11-03 深圳数联天下智能科技有限公司 Training method and device of age prediction model, and age prediction method and device
CN111967382A (en) * 2020-08-14 2020-11-20 北京金山云网络技术有限公司 Age estimation method, and training method and device of age estimation model
CN112418195A (en) * 2021-01-22 2021-02-26 电子科技大学中山学院 Face key point detection method and device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ALBERT ALI SALAH et al.: "Combining Facial Dynamics With Appearance for Age Estimation", IEEE Transactions on Image Processing *
ROHINI G. BHAISARE et al.: "Study of Age Estimation Using Fixed Rank Representation (FRR)", International Journal of Computer Science Trends and Technology (IJCST) *
ZICHANG TAN et al.: "Deeply-learned Hybrid Representations for Facial Age Estimation", Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) *
ZHANG Liangliang et al.: "Face Age Prediction Based on Deep Learning", Computer Engineering *
JI Zhipeng: "Research on Video-based Face Age Estimation Algorithms", China Masters' Theses Full-text Database: Information Science and Technology *

Also Published As

Publication number Publication date
CN112766238B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN110378264B (en) Target tracking method and device
JP7058373B2 (en) Lesion detection and positioning methods, devices, devices, and storage media for medical images
CN110135319B (en) Abnormal behavior detection method and system
CN113196289B (en) Human body action recognition method, human body action recognition system and equipment
CN111582141B (en) Face recognition model training method, face recognition method and device
CN112464807A (en) Video motion recognition method and device, electronic equipment and storage medium
CN109685037B (en) Real-time action recognition method and device and electronic equipment
CN113674421B (en) 3D target detection method, model training method, related device and electronic equipment
CN111582483A (en) Unsupervised learning optical flow estimation method based on space and channel combined attention mechanism
CN112561879B (en) Ambiguity evaluation model training method, image ambiguity evaluation method and image ambiguity evaluation device
CN112101262B (en) Multi-feature fusion sign language recognition method and network model
CN110738103A (en) Living body detection method, living body detection device, computer equipment and storage medium
CN111401322A (en) Station entering and exiting identification method and device, terminal and storage medium
CN112200056A (en) Face living body detection method and device, electronic equipment and storage medium
WO2021103474A1 (en) Image processing method and apparatus, storage medium and electronic apparatus
CN113221771A (en) Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product
CN114445663A (en) Method, apparatus and computer program product for detecting challenge samples
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN111652242B (en) Image processing method, device, electronic equipment and storage medium
US20230115765A1 (en) Method and apparatus of transferring image, and method and apparatus of training image transfer model
CN112766238B (en) Age prediction method and device
CN116129417A (en) Digital instrument reading detection method based on low-quality image
CN115861384A (en) Optical flow estimation method and system based on generation of countermeasure and attention mechanism
CN113887319A (en) Three-dimensional attitude determination method and device, electronic equipment and storage medium
CN113610856A (en) Method and device for training image segmentation model and image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant