CN110532970B - Age and gender attribute analysis method, system, equipment and medium for 2D images of human faces - Google Patents


Info

Publication number
CN110532970B
CN110532970B
Authority
CN
China
Prior art keywords: age, face, picture, predicted, gender
Prior art date
Legal status: Active
Application number
CN201910823680.XA
Other languages
Chinese (zh)
Other versions
CN110532970A (en)
Inventor
Zhang Shuai (张帅)
Current Assignee
Xiamen Ruiwei Information Technology Co., Ltd.
Original Assignee
Xiamen Ruiwei Information Technology Co., Ltd.
Priority date: 2019-09-02
Filing date: 2019-09-02
Publication date: 2022-06-24
Application filed by Xiamen Ruiwei Information Technology Co., Ltd.
Priority to CN201910823680.XA
Publication of CN110532970A
Application granted
Publication of CN110532970B

Classifications

    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06V 40/161: Human faces, e.g. facial parts, sketches or expressions; detection; localisation; normalisation
    • G06V 40/168: Human faces; feature extraction; face representation
    • G06V 40/172: Human faces; classification, e.g. identification
    • G06V 40/178: Human faces; estimating age from face image; using age information for improving recognition

Abstract

The invention discloses a method, a system, computer equipment and a medium for analyzing the age and gender attributes of a 2D face image. The method comprises the following steps: acquiring a 2D face picture to be detected; performing face detection on the single 2D face picture through a trained first neural network model to obtain the face frame position and the facial feature point positions; correcting and cropping the picture according to the face frame position and the facial feature point positions to obtain a corrected, standardized 2D face picture; performing age and gender attribute prediction on the corrected, standardized 2D face picture through a trained second neural network model to obtain raw predicted values; determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy, and outputting the predicted age and gender; and outputting the predicted age and gender to the background and recording the results in a database for subsequent data analysis. The method can quickly and accurately detect the age and gender attribute information of faces captured by a camera.

Description

Age and gender attribute analysis method, system, equipment and medium for 2D images of human faces
Technical Field
The invention relates to the technical field of image processing based on deep learning methods, in particular to a method, a system and computer equipment for analyzing age and gender attributes from 2D face pictures.
Background Art
Walking down a street or through shopping malls, supermarkets and shops, an attentive observer will find cameras distributed throughout daily life. Most of these cameras are used for data recording and have a storage function; the monitoring data are retrieved under certain conditions (case tracking, shop monitoring and the like) for historical backtracking analysis. The cameras generate a large amount of data every day, most of which is used only for backtracking, so the data are not fully utilized. To address the practical problems of such scenes, the invention provides an age and gender attribute analysis method based on 2D face photos.
The Chinese invention publication CN109858388A, published on 2019-06-07, discloses an intelligent tourism management system comprising: an unmanned aerial vehicle aerial-photography tourist distribution system, a scenic spot face recognition system, a scenic spot entrance people-flow prediction system, a scenic spot basic information data system, a hotel data statistics system, a cloud data management platform and a mobile terminal. The scenic spot face recognition system recognizes the age stage and gender of tourists using face recognition technology, as follows:
firstly, a face database is established, in which the face images comprise photos of different ages and different expressions, and the photo background is consistent with the background of photos taken by the scenic spot entrance camera;
then, the database is manually sorted by gender, and the training samples are divided into a male image set and a female image set; the databases are named by English acronyms: the first layer is the initial division by gender; the second layer divides each gender (male or female) into young (YM), middle-aged (MM) and old (OM); the third layer divides age ranges and the fourth layer divides databases with smaller age intervals, so that 'MM-i-13' is interpreted as the 3rd sub-database attached to the 1st database of the i-th middle-aged male; the fifth step is age estimation;
finally, an average age estimation method is adopted, where Li is the age of the database and Nij represents the total number of training samples of the sub-database; the pictures are divided into several pictures per person per year and then trained independently.
The training model of the scenic spot face recognition system is as follows:
firstly, face recognition pre-training is performed on the face database to obtain a deep learning face model; the model is then fine-tuned on a face attribute dataset for the hair, eye, nose, mouth and beard characteristics to obtain a face attribute model; all fully connected layer features of the network are concatenated as the face feature vector, and a random forest classifier is finally trained and tested on the dataset;
then, the age stages are divided into four categories: 5-15 years old, 15-25 years old, 25-50 years old and over 50 years old. The cloud data management platform classifies the tourist ages obtained by the scenic spot face recognition system into these four age stages and counts the number of tourists in each stage; when tourists query scenic spot information on the terminal APP, their age and gender are input, and the system pushes scenic spot data suitable for that age and gender. However, this invention can only predict age groups, not specific age values, so its range of application scenarios is narrow; moreover, its deep learning face model does not adopt a multi-annotator weighted-averaging labeling method, so its results are inaccurate.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method, a system, equipment and a medium for analyzing the age and gender attributes of a 2D face image, which can quickly and accurately analyze the age and gender of a face picture and statistically analyze age and gender information from cameras in various scenes.
In a first aspect, the invention is implemented as follows: a method for analyzing the age and gender attributes of a 2D face image, comprising the following steps:
step S1, acquiring a 2D face picture to be detected;
step S2, performing face detection on the single 2D face picture through the trained first neural network model to obtain the face frame position and the facial feature point positions; correcting and cropping the picture according to the face frame position and the facial feature point positions to obtain a corrected, standardized 2D face picture;
step S3, performing age and gender attribute prediction on the corrected, standardized 2D face picture through the trained second neural network model to obtain raw predicted values;
step S4, determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy, and outputting the predicted age and gender;
and step S5, outputting the predicted age and gender to the background and recording the results in a database for subsequent data analysis.
In a second aspect, the invention is implemented as follows: an age and gender attribute analysis system for 2D face images, comprising:
the data acquisition module, used for acquiring a 2D face picture to be detected;
the first neural network model, used for performing face detection on the single 2D face picture to obtain the face frame position and the facial feature point positions, and for correcting and cropping the picture according to these positions to obtain a corrected, standardized 2D face picture;
the second neural network model, used for performing age and gender attribute prediction on the corrected, standardized 2D face picture to obtain raw predicted values;
the prediction module, used for determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy and outputting the predicted age and gender;
and the result output module, used for outputting the predicted age and gender results to the background and recording them in a database for subsequent data analysis.
In a third aspect, the invention provides a computer device comprising a memory storing a computer program and a processor that implements the method of the invention described above when executing the computer program.
In a fourth aspect, the invention provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the method of the invention described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) The age and gender attribute analysis method, system and computer equipment based on 2D face photos can quickly detect the face frame and facial feature points in a picture through the face detection neural network model and output the face frame position and facial feature point positions; after the face picture is corrected and the frame enlarged, a corrected, standardized face picture is cropped out.
(2) The age and gender attribute analysis method, system and computer equipment based on 2D face photos can quickly and accurately detect the age and gender attribute information of faces captured by a camera, so that a store owner can accurately grasp the age and gender distribution of customers in the store and use the analyzed data to make effective strategies to improve turnover.
(3) The predicted age range of the invention is 0 to 90 years; after acquisition, the data can be run through the model to obtain a very accurate apparent age, and the final prediction result is a specific age value and gender rather than an age range and gender.
(4) For data originating from real use scenes, the invention adopts a multi-annotator weighted-averaging labeling method to make the results more accurate; after face detection, unified correction and cropping are performed for standardization; the base model of the age and gender prediction model was determined after studying a large number of related papers and a feature-processing branch structure was designed; and an accurate age value and gender are obtained from the output of the age and gender prediction model through a post-processing flow.
Drawings
The invention will be further described with reference to the following examples and the accompanying drawings.
FIG. 1 is a flow chart illustrating the use of the age and gender attribute analysis method based on 2D face photos in a real scene according to the present invention;
FIG. 2 is a diagram of a neural network model for face detection according to an embodiment of the present invention; wherein 2(a) is a P-Net network structure diagram of a face detection model; 2(b) is an R-Net network structure diagram of the face detection model; 2(c) is an O-Net network structure diagram of the face detection model;
fig. 3 is an architecture diagram of the system of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
In one aspect, the invention provides an age and gender attribute analysis method for 2D face images which, by using a deep learning face detection algorithm and a face age and gender analysis algorithm to analyze video, can meet the needs of scenes that require judging the age and gender of faces. The method can effectively, quickly and accurately detect the face positions and feature points in videos or/and pictures and predict the age and gender attributes of the faces, thereby helping projects or scenes with face age and gender attribute requirements to perform such analysis on face pictures and to better analyze and utilize the related data.
As shown in fig. 1, the method of the present invention comprises:
step S1, acquiring a 2D face picture to be detected;
step S2, performing face detection on the single 2D face picture through the trained first neural network model to obtain the face frame position and the facial feature point positions; correcting and cropping the picture according to the face frame position and the facial feature point positions to obtain a corrected, standardized 2D face picture;
step S3, performing age and gender attribute prediction on the corrected, standardized 2D face picture through the trained second neural network model to obtain raw predicted values;
step S4, determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy, and outputting the predicted age and gender;
and step S5, outputting the predicted age and gender to the background and recording the results in a database for subsequent data analysis.
The step S2 specifically includes:
step S21, performing face frame detection on the single picture through the trained first neural network model to obtain the face frame position and the facial feature point positions; the face frame position comprises the coordinates of the upper-left and lower-right corners of the face frame; the facial feature points comprise the left eye pupil, the right eye pupil, the nose tip, the leftmost point of the mouth and the rightmost point of the mouth; the facial feature point positions comprise the coordinates of these five facial feature points;
step S22, calculating the included angle between the line connecting the two pupils and the horizontal according to the positions of the left and right pupils; connecting the midpoint of the pupil line with the midpoint of the line between the leftmost and rightmost points of the mouth to form a longitudinal line, and taking the point at a preset fraction of this longitudinal line, from top to bottom, as the center point of the image; rotating the picture in reverse by the included angle around this center point to obtain a picture in which the two pupils are horizontal;
and step S23, enlarging the face frame by a preset ratio and cropping the picture inside the enlarged frame to obtain the corrected, standardized face picture. A sketch of this procedure is given below.
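To make the geometry of steps S21-S23 concrete, the following is a minimal sketch of the correction-and-cropping procedure, assuming OpenCV and NumPy; the enlargement ratio, the top-to-bottom fraction used for the rotation center and the 128 × 128 output size are illustrative placeholders, not values fixed by this description.

```python
import cv2
import numpy as np

def align_and_crop(img, box, landmarks, enlarge=1.2, center_frac=0.5, out_size=128):
    """Correct and crop a face (sketch). `landmarks` holds five (x, y) points
    in the order: left pupil, right pupil, nose tip, left mouth corner,
    right mouth corner. `enlarge`, `center_frac` and `out_size` are
    illustrative defaults, not values fixed by the patent."""
    (lx, ly), (rx, ry) = landmarks[0], landmarks[1]
    ml, mr = landmarks[3], landmarks[4]

    # Included angle between the pupil line and the horizontal, in degrees.
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))

    # Longitudinal line: midpoint of the pupils to the midpoint of the mouth
    # corners; a fraction of it, top to bottom, gives the rotation center.
    eye_mid = np.array([(lx + rx) / 2.0, (ly + ry) / 2.0])
    mouth_mid = np.array([(ml[0] + mr[0]) / 2.0, (ml[1] + mr[1]) / 2.0])
    center = eye_mid + center_frac * (mouth_mid - eye_mid)

    # Rotate around the center so that the two pupils become horizontal.
    rot = cv2.getRotationMatrix2D((float(center[0]), float(center[1])), float(angle), 1.0)
    rotated = cv2.warpAffine(img, rot, (img.shape[1], img.shape[0]))

    # Enlarge the face frame around its own center and crop inside it.
    x1, y1, x2, y2 = box
    w, h = (x2 - x1) * enlarge, (y2 - y1) * enlarge
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    x1, y1 = int(max(cx - w / 2, 0)), int(max(cy - h / 2, 0))
    x2 = int(min(cx + w / 2, rotated.shape[1]))
    y2 = int(min(cy + h / 2, rotated.shape[0]))
    return cv2.resize(rotated[y1:y2, x1:x2], (out_size, out_size))
```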
On the other hand, as shown in fig. 3, the present invention further provides an age and gender attribute analysis system for a 2D image of a human face, comprising:
the data acquisition module, used for acquiring a 2D face picture to be detected;
the first neural network model, used for performing face detection on the single 2D face picture to obtain the face frame position and the facial feature point positions, and for correcting and cropping the picture according to these positions to obtain a corrected, standardized 2D face picture;
the second neural network model, used for performing age and gender attribute prediction on the corrected, standardized 2D face picture to obtain raw predicted values;
the prediction module, used for determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy and outputting the predicted age and gender;
and the result output module, used for outputting the predicted age and gender results to the background and recording them in a database for subsequent data analysis.
The first neural network model is specifically configured to:
performing face frame detection on the single picture to obtain the face frame position and the facial feature point positions; the face frame position comprises the coordinates of the upper-left and lower-right corners of the face frame; the facial feature points comprise the left eye pupil, the right eye pupil, the nose tip, the leftmost point of the mouth and the rightmost point of the mouth; the facial feature point positions comprise the coordinates of these five facial feature points;
calculating the included angle between the line connecting the two pupils and the horizontal according to the positions of the left and right pupils; connecting the midpoint of the pupil line with the midpoint of the line between the leftmost and rightmost points of the mouth to form a longitudinal line, and taking the point at a preset fraction of this longitudinal line, from top to bottom, as the center point of the image; rotating the picture in reverse by the included angle around this center point to obtain a picture in which the two pupils are horizontal;
and enlarging the face frame by a preset ratio and cropping the picture inside the enlarged frame to obtain the corrected, standardized face picture.
In still another aspect, the present invention further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor executes the computer program to implement the method for analyzing the age and gender attribute of the 2D image of the human face according to the present invention.
In still another aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method for analyzing the age and gender attribute of a 2D image of a human face according to the present invention.
The specific steps for realizing the scheme of the invention are as follows:
Training of the neural network models
The training data mainly comprise two parts: the public dataset IMDB-WIKI, collected online and used for pre-training, and real-scene face data collected by real-scene cameras, used to fine-tune the model and improve prediction accuracy in real scenes. The latter is necessary because a neural network model performs well on data from the scene it was trained on but loses much of its effectiveness in different scenes.
1. Training of a first neural network
Firstly, pictures and videos of various people captured by cameras in various scenes are collected; the face region is then manually calibrated with a bounding rectangle, the five facial feature points (left eye pupil, right eye pupil, nose tip, left mouth corner and right mouth corner) are manually annotated, and the calibrated data with their corresponding labels are fed to the first neural network for training. In a specific embodiment, the first neural network model adopts an MTCNN (Multi-task Cascaded Convolutional Networks) face detection model. This face detection model is composed of three network structures, P-Net (Proposal Network), R-Net (Refine Network) and O-Net (Output Network), and obtains the face frame position and the facial feature point positions in three stages:
(1) the P-Net network obtains candidate windows of face regions and bounding box regression vectors, calibrates the candidate windows by regression with the bounding box vectors, merges highly overlapping candidates by non-maximum suppression, and outputs an initial face frame prediction and five facial feature points;
(2) the R-Net network removes false-positive regions through bounding box regression and non-maximum suppression, and outputs a more accurate face frame prediction and five facial feature points;
(3) the O-Net network further removes false-positive regions through bounding box regression and non-maximum suppression, and outputs a still more accurate face frame prediction and five facial feature points.
These three networks are described in detail below:
P-Net network: the network structure is shown in fig. 2(a); 12 pixel × 12 pixel × 3 channel patches are used as network input, and a 1 × 1 × 32 output is obtained after passing through a 3 × 3 convolutional network -> MaxPooling layer -> 3 × 3 convolutional network.
R-Net network: the network structure is shown in fig. 2(b); false-positive regions (regions the network predicts as faces but which are not) are removed mainly through bounding box regression and NMS. The structure differs from P-Net only in that the input is enlarged to 24 pixel × 24 pixel × 3 channel and a fully connected layer is added, which better suppresses false positives.
O-Net network: the network structure is shown in fig. 2(c). The input is further enlarged to 48 pixel × 48 pixel × 3 channel so that the input information is finer, and this network has one more convolutional layer than R-Net, with the same function. O-Net supervises the face region more closely and, as the last stage of the whole model, outputs five facial feature points (landmarks: the left eye pupil, right eye pupil, nose tip, leftmost mouth point and rightmost mouth point) that are more accurate than those of the first two stages. All three small networks output the coordinates of the five facial feature points, but because the inputs of P-Net and R-Net are too small and carry very little facial feature point information, the weight coefficient of the loss generated by facial feature point regression in the first two stages is set to a relatively small 0.5, while in the last-stage O-Net it is a relatively large 1.0. Since the facial feature point prediction is most accurate in the O-Net output, the last stage's prediction is selected as the facial feature point prediction in practice; O-Net's input is the largest of the three small networks, so the facial features can be extracted more accurately.
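For illustration, below is a minimal PyTorch sketch of a P-Net-style fully convolutional head with the three output branches described above; the channel widths follow the original MTCNN paper and are assumptions here, since the exact structure is fixed by fig. 2(a) of the patent rather than by this sketch.

```python
import torch
import torch.nn as nn

class PNet(nn.Module):
    """P-Net-style proposal network (sketch). A 12x12x3 patch is reduced to a
    1x1x32 feature map, from which three 1x1 conv heads predict face/non-face,
    the box regression quadruple and the 5-point landmark ten-tuple.
    Channel sizes follow the original MTCNN paper and are assumptions here."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 10, 3), nn.PReLU(),   # 12x12 -> 10x10
            nn.MaxPool2d(2, 2),                # 10x10 -> 5x5
            nn.Conv2d(10, 16, 3), nn.PReLU(),  # 5x5 -> 3x3
            nn.Conv2d(16, 32, 3), nn.PReLU(),  # 3x3 -> 1x1x32
        )
        self.cls = nn.Conv2d(32, 2, 1)         # face / non-face
        self.box = nn.Conv2d(32, 4, 1)         # box regression quadruple
        self.landmark = nn.Conv2d(32, 10, 1)   # 5 landmark (x, y) pairs

    def forward(self, x):
        f = self.backbone(x)
        return self.cls(f), self.box(f), self.landmark(f)
```

Because the network is fully convolutional, a larger input image yields a dense map of predictions, one per 12 × 12 receptive field, which is what makes the proposal stage fast.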
The loss function describing the face detection features of the MTCNN face detection model mainly comprises 3 parts: a face classification loss (face/non-face classifier), a face frame loss (bounding box regression) and a facial feature point loss (landmark localization).
(a) The face classification loss function is the cross entropy:

$$L_i^{det} = -\left(y_i^{det}\log p_i + \left(1 - y_i^{det}\right)\log\left(1 - p_i\right)\right)$$

where $i$ denotes the $i$-th sample, $p_i \in [0,1]$ is the probability predicted by the network that the $i$-th sample is a face, and $y_i^{det} \in \{0,1\}$ is the real label of the $i$-th sample;
(b) The face frame loss function is the Euclidean (L2) loss:

$$L_i^{box} = \left\lVert \hat{y}_i^{box} - y_i^{box} \right\rVert_2^2$$

where $\hat{y}_i^{box}$ is predicted by the network and $y_i^{box}$ is the ground truth, a quadruple consisting of the horizontal and vertical coordinates of the upper-left corner of the face frame, the height of the face frame and the width of the face frame;
(c) The facial feature point loss function is likewise an L2 loss:

$$L_i^{landmark} = \left\lVert \hat{y}_i^{landmark} - y_i^{landmark} \right\rVert_2^2$$

where $\hat{y}_i^{landmark}$ is predicted by the network and $y_i^{landmark}$ is a ten-tuple consisting of the coordinates of the 5 real facial feature points.
In summary, the overall loss function of the entire model training process can be expressed as:

$$\min \sum_{i=1}^{N} \sum_{j \in \{det,\ box,\ landmark\}} \alpha_j\, \beta_i^j\, L_i^j$$

P-Net, R-Net: $(\alpha_{det}=1,\ \alpha_{box}=0.5,\ \alpha_{landmark}=0.5)$;
O-Net: $(\alpha_{det}=1,\ \alpha_{box}=0.5,\ \alpha_{landmark}=1)$;

where $N$ is the number of training samples; $\alpha_{det}$, $\alpha_{box}$ and $\alpha_{landmark}$ are the weights of the face classification loss, the face frame loss and the facial feature point loss respectively; $\beta_i^j \in \{0,1\}$ indicates the sample type, i.e. whether loss $j$ is meaningful for input $i$ (for example, whether a face was input); and $L_i^{det}$, $L_i^{box}$ and $L_i^{landmark}$ are the face classification, face frame and facial feature point loss functions defined above.
As can be seen from the above, all 3 losses are calculated during training, but not every loss is meaningful for every input, so the formula above uses $\beta_i^j$ to control which losses are used for which inputs and $\alpha_j$ to assign different weights. In both the P-Net and R-Net networks the loss weight $\alpha_{landmark}$ of facial feature point regression is smaller than in the O-Net part, because the first 2 stages focus on filtering out non-face bounding boxes. The significance of $\beta$ is that, for example, for a non-face input only the face classification loss is meaningful and needs to be computed, while the meaningless bounding box and facial feature point regression losses are not computed.
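As a concrete reading of the weighted formula, the sketch below (PyTorch, illustrative variable names) combines the three losses with per-task weights α and per-sample 0/1 indicators β; the final reduction by the sample count is an assumption, not fixed by the description.

```python
import torch
import torch.nn.functional as F

def mtcnn_loss(p, y_det, box_pred, box_gt, lmk_pred, lmk_gt,
               beta_det, beta_box, beta_lmk,
               a_det=1.0, a_box=0.5, a_lmk=0.5):
    """Weighted multi-task loss (sketch). a_* are the per-task weights alpha_j
    (a_lmk=0.5 for P-Net/R-Net, 1.0 for O-Net); beta_* are 0/1 per-sample
    masks so that, e.g., a non-face sample contributes only to the
    classification term. p holds face probabilities in [0, 1]."""
    l_det = F.binary_cross_entropy(p, y_det, reduction="none")
    l_box = ((box_pred - box_gt) ** 2).sum(dim=1)   # L2 over the quadruple
    l_lmk = ((lmk_pred - lmk_gt) ** 2).sum(dim=1)   # L2 over the ten-tuple
    total = (a_det * (beta_det * l_det).sum()
             + a_box * (beta_box * l_box).sum()
             + a_lmk * (beta_lmk * l_lmk).sum())
    return total / p.numel()
```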
After training, a deep learning neural network model capable of accurately detecting the face frame and facial feature points is obtained; it is used to predict the positions of the face frame and facial feature points in videos or/and pictures, after which the face is extracted for the subsequent age and gender attribute analysis.
2. Training of a second neural network model
In a specific embodiment, the second neural network model uses LightCNN as the feature extraction layer, takes 128 pixel × 128 pixel × 3 channel as network input, outputs a 512-dimensional vector as the extracted feature, and is followed by three parallel branches:
the first branch is used for predicting gender; the prediction result lies between 0 and 1, and the closer it is to 1 the more certain the model is that the picture shows a male, while the closer it is to 0 the more certain it is that the picture shows a female;
the second branch is used for age group classification; the predicted age range is set to 0-90 years and divided evenly into 18 segments (one segment every 5 years), so the second branch outputs 18 results representing the confidence of each segment, and the segment with the highest confidence is selected as the predicted age group during training and prediction;
the third branch likewise outputs 18 results, each corresponding to a small-range adjustment value; combined with the result of the second branch, they yield the predicted age value, as in the worked example and sketch below.
For example, if the highest confidence of the second branch falls on the fifth age group, the corresponding range is [20, 25) with a center age of 22.5 years; if the fifth prediction of the third branch is 1.2, combining the two branches gives a final predicted age of 22.5 + 1.2 = 23.7 ≈ 24 years.
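The post-processing that turns the 18 confidences and the 18 adjustment values into a single age can be sketched as follows; the 5-year segment width and the in-segment center follow the description above, while the function name is illustrative.

```python
import numpy as np

def decode_age(seg_conf, seg_adjust):
    """seg_conf: 18 age-group confidences for [0,5), [5,10), ..., [85,90);
    seg_adjust: 18 in-segment adjustment values in [-2.5, 2.5].
    Returns the predicted integer age, e.g. segment [20, 25) with
    adjustment 1.2 -> 22.5 + 1.2 = 23.7 -> 24."""
    k = int(np.argmax(seg_conf))        # most confident age group
    center = 5 * k + 2.5                # center age of that group
    return int(round(center + seg_adjust[k]))
```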
1) The first branch (gender prediction branch) adopts the mean square error (MSELoss) as its loss function:

$$L_{gender} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

where $\hat{y}_i$ is the predicted probability of the male gender attribute and $y_i$ is the true value of the gender attribute, $y \in \{0, 1\}$: 0 means the picture shows a female and 1 means a male; $N$ is the number of categories of all attributes;
2) The second branch (age group classification branch) adopts the cross entropy (CELoss) as its loss function:

$$L_{group} = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$

where $\hat{y}_i$ is the predicted probability of the $i$-th age group and $y_i \in \{0, 1\}$ is the true value of the $i$-th age group: 0 means not in the age group and 1 means in it; for the same picture, exactly one age group has label 1 and all the others 0; $N$ is the number of age groups;
3) and the third branch (intra-segment age adjustment branch) adopts a Mean Square Error (MSE) as a Loss function of Loss, and the formula is as follows:
Figure GDA0003571828290000096
wherein, the first and the second end of the pipe are connected with each other,
Figure GDA0003571828290000097
a regression value representing the predicted adjustment value for the corresponding age group,
Figure GDA0003571828290000098
y denotes the true value for all age groups, y ∈ [ -2.5,2.5];
Figure GDA0003571828290000099
A predicted regression value representing the adjusted value of the ith age group; y isiTrue regression values representing the ith age group; n represents the number of all age groups.
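Putting the three branch losses together, a training step for the second model might combine them as sketched below (PyTorch); the equal weighting of the three terms and the choice to supervise the adjustment only at the ground-truth age group are assumptions, since the description does not fix how the branch losses are combined.

```python
import torch
import torch.nn.functional as F

def attribute_loss(gender_pred, gender_gt, age_logits, age_group_gt, adj_pred, adj_gt):
    """Training loss of the three attribute branches (sketch):
    MSE for gender, cross entropy over the 18 age groups, and MSE for the
    in-segment adjustment at the ground-truth group only. Equal branch
    weighting is an assumption."""
    l_gender = F.mse_loss(gender_pred, gender_gt)
    l_group = F.cross_entropy(age_logits, age_group_gt)  # age_group_gt: class index 0..17
    # Supervise the adjustment value only at the true age group.
    adj_at_gt = adj_pred.gather(1, age_group_gt.unsqueeze(1)).squeeze(1)
    l_adjust = F.mse_loss(adj_at_gt, adj_gt)
    return l_gender + l_group + l_adjust
```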
Through a large amount of training and parameter tuning, a model capable of accurately predicting the age and gender attributes of a face is obtained and used for the age and gender attribute analysis.
Use in real scenes
As shown in fig. 1, in a specific embodiment the trained first neural network model and second neural network model are used to predict the age and gender of data in a real scene. The embodiment specifically includes:
step S1, acquiring a 2D face picture to be detected from the video stream;
step S2, performing face detection on the single 2D face picture through the trained first neural network model to obtain the face frame position and the facial feature point positions; correcting and cropping the picture according to the face frame position and the facial feature point positions to obtain a corrected, standardized 2D face picture;
step S3, performing age and gender attribute prediction on the corrected, standardized 2D face picture through the trained second neural network model to obtain raw predicted values;
step S4, determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy, and outputting the predicted age and gender;
and step S5, outputting the predicted age and gender to the background and recording the results in a database for subsequent data analysis.
In an actual scene, the predicted gender (M for male, F for female), the corresponding raw predicted value (range 0-1: the closer to 0, the more female-like; the closer to 1, the more male-like) and the predicted age can be marked in the upper left corner of the picture, and the detected face frame and the five coordinate points (left eye pupil, right eye pupil, nose tip, left mouth corner and right mouth corner) are drawn in the picture.
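A minimal OpenCV sketch of this on-screen annotation is given below; the colors, font and label position are arbitrary choices, not specified by the description.

```python
import cv2

def annotate(img, box, landmarks, gender_score, age):
    """Draw the face frame, the five landmarks, and an upper-left label such
    as 'M 0.93 24' (gender letter, raw gender score in [0,1], predicted age)."""
    x1, y1, x2, y2 = [int(v) for v in box]
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
    for (px, py) in landmarks:  # pupils, nose tip, mouth corners
        cv2.circle(img, (int(px), int(py)), 2, (0, 0, 255), -1)
    label = f"{'M' if gender_score >= 0.5 else 'F'} {gender_score:.2f} {age}"
    cv2.putText(img, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                1.0, (255, 255, 255), 2)
    return img
```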

Claims (8)

1. A method for analyzing the age and gender attributes of a 2D face image, characterized by comprising the following steps:
step S1, acquiring a 2D face picture to be detected;
step S2, performing face detection on the single 2D face picture through the trained first neural network model to obtain the face frame position and the facial feature point positions; correcting and cropping the picture according to the face frame position and the facial feature point positions to obtain a corrected, standardized 2D face picture;
step S3, performing age and gender attribute prediction on the corrected, standardized 2D face picture through the trained second neural network model to obtain raw predicted values; the second neural network model uses LightCNN as the feature extraction layer, takes 128 pixel × 128 pixel × 3 channel as network input, outputs a 512-dimensional vector as the extracted feature, and is followed by three parallel branches:
the first branch is used for predicting gender; the prediction result lies between 0 and 1, and the closer it is to 1 the more certain the model is that the picture shows a male, while the closer it is to 0 the more certain it is that the picture shows a female;
the second branch is used for age group classification; the predicted age range is set to 0-90 years and divided evenly into 18 segments, so the second branch outputs 18 results representing the confidence of each segment, and the segment with the highest confidence is selected as the predicted age group during training and prediction;
the third branch likewise outputs 18 results, each corresponding to a small-range adjustment value, which combined with the result of the second branch yield the predicted age value; the third branch adopts the mean square error (MSE) as its loss function:

$$L_{adjust} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

where $\hat{y}_i$ is the predicted regression value of the adjustment of the $i$-th age group and $y_i \in [-2.5, 2.5]$ is the true regression value of the $i$-th age group; $N$ is the number of age groups;
step S4, determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy, and outputting the predicted age and gender;
and step S5, outputting the predicted age and gender to the background and recording the results in a database for subsequent data analysis.
2. The method for analyzing the age and gender attributes of a 2D face image according to claim 1, wherein step S2 specifically comprises:
step S21, performing face frame detection on the single picture through the trained first neural network model to obtain the face frame position and the facial feature point positions; the face frame position comprises the coordinates of the upper-left and lower-right corners of the face frame; the facial feature points comprise the left eye pupil, the right eye pupil, the nose tip, the leftmost point of the mouth and the rightmost point of the mouth; the facial feature point positions comprise the coordinates of these five facial feature points;
step S22, calculating the included angle between the line connecting the two pupils and the horizontal according to the positions of the left and right pupils; connecting the midpoint of the pupil line with the midpoint of the line between the leftmost and rightmost points of the mouth to form a longitudinal line, and taking the point at a preset fraction of this longitudinal line, from top to bottom, as the center point of the image; rotating the picture in reverse by the included angle around this center point to obtain a picture in which the two pupils are horizontal;
and step S23, enlarging the face frame by a preset ratio and cropping the picture inside the enlarged frame to obtain the corrected, standardized face picture.
3. The method for analyzing the age and gender attributes of a 2D face image according to claim 1, wherein:
1) the first branch adopts the mean square error (MSELoss) as its loss function:

$$L_{gender} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

where $\hat{y}_i$ is the predicted probability of the male gender attribute and $y_i$ is the true value of the gender attribute, $y \in \{0, 1\}$: 0 means the picture shows a female and 1 means a male; $N$ is the number of categories of all attributes;
2) the second branch adopts the cross entropy (CELoss) as its loss function:

$$L_{group} = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$

where $\hat{y}_i$ is the predicted probability of the $i$-th age group and $y_i \in \{0, 1\}$ is the true value of the $i$-th age group: 0 means not in the age group and 1 means in it; for the same picture, exactly one age group has label 1 and all the others 0; $N$ is the number of age groups.
4. A system for analyzing the age and gender attributes of a 2D face image, characterized by comprising:
the data acquisition module, used for acquiring a 2D face picture to be detected;
the first neural network model, used for performing face detection on the single 2D face picture to obtain the face frame position and the facial feature point positions, and for correcting and cropping the picture according to these positions to obtain a corrected, standardized 2D face picture;
the second neural network model, used for performing age and gender attribute prediction on the corrected, standardized 2D face picture to obtain raw predicted values; the second neural network model uses LightCNN as the feature extraction layer, takes 128 pixel × 128 pixel × 3 channel as network input, outputs a 512-dimensional vector as the extracted feature, and is followed by three parallel branches:
the first branch is used for predicting gender; the prediction result lies between 0 and 1, and the closer it is to 1 the more certain the model is that the picture shows a male, while the closer it is to 0 the more certain it is that the picture shows a female;
the second branch is used for age group classification; the predicted age range is set to 0-90 years and divided evenly into 18 segments, so the second branch outputs 18 results representing the confidence of each segment, and the segment with the highest confidence is selected as the predicted age group during training and prediction;
the third branch likewise outputs 18 results, each corresponding to a small-range adjustment value, which combined with the result of the second branch yield the predicted age value; the third branch adopts the mean square error (MSE) as its loss function:

$$L_{adjust} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

where $\hat{y}_i$ is the predicted regression value of the adjustment of the $i$-th age group and $y_i \in [-2.5, 2.5]$ is the true regression value of the $i$-th age group; $N$ is the number of age groups;
the prediction module, used for determining the age and gender attributes of the face according to the raw predicted values and an age and gender attribute selection strategy and outputting the predicted age and gender;
and the result output module, used for outputting the predicted age and gender results to the background and recording them in a database for subsequent data analysis.
5. The system for analyzing the age and gender attributes of a 2D face image according to claim 4, wherein the first neural network model is specifically configured to:
perform face frame detection on the single picture to obtain the face frame position and the facial feature point positions; the face frame position comprises the coordinates of the upper-left and lower-right corners of the face frame; the facial feature points comprise the left eye pupil, the right eye pupil, the nose tip, the leftmost point of the mouth and the rightmost point of the mouth; the facial feature point positions comprise the coordinates of these five facial feature points;
calculate the included angle between the line connecting the two pupils and the horizontal according to the positions of the left and right pupils; connect the midpoint of the pupil line with the midpoint of the line between the leftmost and rightmost points of the mouth to form a longitudinal line, and take the point at a preset fraction of this longitudinal line, from top to bottom, as the center point of the image; rotate the picture in reverse by the included angle around this center point to obtain a picture in which the two pupils are horizontal;
and enlarge the face frame by a preset ratio and crop the picture inside the enlarged frame to obtain the corrected, standardized face picture.
6. The system for analyzing the age and gender attributes of a 2D face image according to claim 4, wherein:
1) the first branch adopts the mean square error (MSELoss) as its loss function:

$$L_{gender} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

where $\hat{y}_i$ is the predicted probability of the male gender attribute and $y_i$ is the true value of the gender attribute, $y \in \{0, 1\}$: 0 means the picture shows a female and 1 means a male; $N$ is the number of categories of all attributes;
2) the second branch adopts the cross entropy (CELoss) as its loss function:

$$L_{group} = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$

where $\hat{y}_i$ is the predicted probability of the $i$-th age group and $y_i \in \{0, 1\}$ is the true value of the $i$-th age group: 0 means not in the age group and 1 means in it; for the same picture, exactly one age group has label 1 and all the others 0; $N$ is the number of age groups.
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any one of claims 1 to 3 when executing the computer program.
8. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, carries out the method according to any one of claims 1 to 3.
CN201910823680.XA 2019-09-02 2019-09-02 Age and gender attribute analysis method, system, equipment and medium for 2D images of human faces Active CN110532970B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910823680.XA CN110532970B (en) 2019-09-02 2019-09-02 Age and gender attribute analysis method, system, equipment and medium for 2D images of human faces


Publications (2)

Publication Number Publication Date
CN110532970A (en) 2019-12-03
CN110532970B (en) 2022-06-24

Family

ID=68666260

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910823680.XA Active CN110532970B (en) 2019-09-02 2019-09-02 Age and gender attribute analysis method, system, equipment and medium for 2D images of human faces

Country Status (1)

Country Link
CN (1) CN110532970B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091109B (en) * 2019-12-24 2023-04-07 厦门瑞为信息技术有限公司 Method, system and equipment for predicting age and gender based on face image
CN113796826A (en) * 2020-06-11 2021-12-17 懿奈(上海)生物科技有限公司 Method for detecting skin age of human face of Chinese
CN111881747A (en) * 2020-06-23 2020-11-03 北京三快在线科技有限公司 Information estimation method and device and electronic equipment
CN112036249B (en) * 2020-08-04 2023-01-03 汇纳科技股份有限公司 Method, system, medium and terminal for end-to-end pedestrian detection and attribute identification
CN112329607B (en) * 2020-11-03 2022-10-21 齐鲁工业大学 Age prediction method, system and device based on facial features and texture features
CN112528897B (en) * 2020-12-17 2023-06-13 Oppo(重庆)智能科技有限公司 Portrait age estimation method, device, computer equipment and storage medium
CN112257693A (en) * 2020-12-22 2021-01-22 湖北亿咖通科技有限公司 Identity recognition method and equipment
CN113283368B (en) * 2021-06-08 2023-10-20 电子科技大学中山学院 Model training method, face attribute analysis method, device and medium
CN114360148A (en) * 2021-12-06 2022-04-15 深圳市亚略特科技股份有限公司 Automatic selling method and device, electronic equipment and storage medium
CN114463941A (en) * 2021-12-30 2022-05-10 中国电信股份有限公司 Drowning prevention alarm method, device and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105516585A (en) * 2015-11-30 2016-04-20 努比亚技术有限公司 Apparatus and method for automatically regulating skin colors
CN108399379A (en) * 2017-08-11 2018-08-14 北京市商汤科技开发有限公司 The method, apparatus and electronic equipment at facial age for identification
CN108596011A (en) * 2017-12-29 2018-09-28 中国电子科技集团公司信息科学研究院 A kind of face character recognition methods and device based on combined depth network
CN109447053A (en) * 2019-01-09 2019-03-08 江苏星云网格信息技术有限公司 A kind of face identification method based on dual limitation attention neural network model
CN110147728A (en) * 2019-04-15 2019-08-20 深圳壹账通智能科技有限公司 Customer information analysis method, system, equipment and readable storage medium storing program for executing
CN110163114A (en) * 2019-04-25 2019-08-23 厦门瑞为信息技术有限公司 A kind of facial angle and face method for analyzing ambiguity, system and computer equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503623B (en) * 2016-09-27 2019-10-08 中国科学院自动化研究所 Facial image age estimation method based on convolutional neural networks
CN108052862B (en) * 2017-11-09 2019-12-06 北京达佳互联信息技术有限公司 Age estimation method and device
CN110110663A (en) * 2019-05-07 2019-08-09 江苏新亿迪智能科技有限公司 A kind of age recognition methods and system based on face character

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105516585A (en) * 2015-11-30 2016-04-20 努比亚技术有限公司 Apparatus and method for automatically regulating skin colors
CN108399379A (en) * 2017-08-11 2018-08-14 北京市商汤科技开发有限公司 The method, apparatus and electronic equipment at facial age for identification
WO2019029459A1 (en) * 2017-08-11 2019-02-14 北京市商汤科技开发有限公司 Method and device for recognizing facial age, and electronic device
CN108596011A (en) * 2017-12-29 2018-09-28 中国电子科技集团公司信息科学研究院 A kind of face character recognition methods and device based on combined depth network
CN109447053A (en) * 2019-01-09 2019-03-08 江苏星云网格信息技术有限公司 A kind of face identification method based on dual limitation attention neural network model
CN110147728A (en) * 2019-04-15 2019-08-20 深圳壹账通智能科技有限公司 Customer information analysis method, system, equipment and readable storage medium storing program for executing
CN110163114A (en) * 2019-04-25 2019-08-23 厦门瑞为信息技术有限公司 A kind of facial angle and face method for analyzing ambiguity, system and computer equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Sarah N. Kohail, "Using artificial neural network for human age estimation based on facial images", International Conference on Innovations in Information Technology, 2012-06-01, pp. 215-219. *
Cheng Jianfeng (程建峰), "Research and Application of Multi-task Face Attribute Recognition Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology, No. 07, 2019-07-15, I138-1002. *

Also Published As

Publication number Publication date
CN110532970A (en) 2019-12-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant