CN108399373B - Model training and detection method and device for face key points - Google Patents

Model training and detection method and device for face key points

Info

Publication number
CN108399373B
CN108399373B (application CN201810118211.3A)
Authority
CN
China
Prior art keywords
data
network
coordinate
face
human face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810118211.3A
Other languages
Chinese (zh)
Other versions
CN108399373A (en)
Inventor
李宣平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201810118211.3A
Publication of CN108399373A
Application granted
Publication of CN108399373B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

Embodiments of the present invention provide model training and detection methods and devices for face key points. The training method includes: extracting face data from a training image; inputting the face data into a first-level network for training, the first-level network being used to output predicted coordinates of face key points; when the training of the first-level network is completed, generating target data in the face data based on the predicted coordinates; inputting the target data into a second-level network for training, the second-level network being used to output coordinate offset values of the face key points; and when the training of the second-level network is completed, determining the cascade network to be the face key point detection model. By learning predicted coordinates and coordinate offset values with a two-level network, accurate coordinates of face key points can still be obtained under complex scenes.

Description

Model training and detection method and device for face key points
Technical field
The present invention relates to the technical field of computer processing, and more particularly to model training and detection methods and devices for face key points.
Background technique
Face key point detection is one of the basic technologies in facial image research. Its objective is to automatically estimate the coordinates of facial feature points on a face picture, for example, face contour coordinates and facial-feature coordinates. It is widely used in face recognition, pose estimation, face filters, makeup and beautification, three-dimensional modeling, and so on.
Among existing face key point detection technologies, traditional methods include shape-constraint-based methods and cascaded-regression-based methods; classic models include Active Shape Models (ASM) and Cascaded Pose Regression (CPR).
However, traditional methods have poor robustness, and under complex scenes the detection accuracy of face key points is low.
Summary of the invention
Embodiments of the present invention propose model training and detection methods and devices for face key points, so as to solve the problem of low detection accuracy of face key points under complex scenes.
According to one aspect of the present invention, a method for training a face key point detection model based on a cascade network is provided, the cascade network including a first-level network and a second-level network, the method comprising:
Extracting face data from a training image;
Inputting the face data into the first-level network for training, the first-level network being used to output predicted coordinates of face key points;
When the training of the first-level network is completed, generating target data in the face data based on the predicted coordinates;
Inputting the target data into the second-level network for training, the second-level network being used to output coordinate offset values of the face key points;
When the training of the second-level network is completed, determining the cascade network to be the face key point detection model.
Optionally, the inputting the face data into the first-level network for training comprises:
Inputting the face data into the first-level network for processing, and outputting the predicted coordinates of the face key points;
Calculating a first loss value using the predicted coordinates;
Judging whether the first-level network converges according to the first loss value;
If so, determining that the training of the first-level network is completed;
If not, adjusting the first-level network according to the first loss value, and returning to the step of inputting the face data into the first-level network for processing and outputting the predicted coordinates of the face key points.
Optionally, the calculating a first loss value using the predicted coordinates comprises:
Calculating first distances between the predicted coordinates and true coordinates;
Calculating the average of the first distances as the first loss value.
Optionally, the generating target data in the face data based on the predicted coordinates comprises:
Extracting partial image data from the face data based on the predicted coordinates;
Combining the partial image data corresponding to multiple face key points into a data matrix by color channel, as the target data.
Optionally, the inputting the target data into the second-level network for training comprises:
Inputting the target data into the second-level network for processing, and outputting the coordinate offset values of the face key points;
Calculating a second loss value using the coordinate offset values;
Judging whether the second-level network converges according to the second loss value;
If so, determining that the training of the second-level network is completed;
If not, adjusting the second-level network according to the second loss value, and returning to the step of inputting the target data into the second-level network for processing and outputting the coordinate offset values of the face key points.
Optionally, the calculating a second loss value using the coordinate offset values comprises:
Calculating second distances between the predicted coordinates and the offset coordinates;
Calculating the average of the second distances as the second loss value.
Optionally, the method further comprises:
Performing data enhancement processing on the face data;
Wherein the data enhancement processing includes at least one of the following:
Adding noise data; cropping and restoring; translation processing; increasing contrast.
According to another aspect of the present invention, a face key point detection method based on a face key point detection model is provided, the face key point detection model including a first-level network and a second-level network, the method comprising:
Extracting face data from a target image;
Inputting the face data into the first-level network for processing, and outputting predicted coordinates of face key points;
Generating target data in the face data based on the predicted coordinates;
Inputting the target data into the second-level network for processing, and outputting coordinate offset values of the face key points;
Adding the coordinate offset values on the basis of the predicted coordinates, to obtain target coordinates of the face key points.
According to another aspect of the present invention, a device for training a face key point detection model based on a cascade network is provided, the cascade network including a first-level network and a second-level network, the device comprising:
a face data extraction module, configured to extract face data from a training image;
a first-level network training module, configured to input the face data into the first-level network for training, the first-level network being used to output predicted coordinates of face key points;
a target data generation module, configured to, when the training of the first-level network is completed, generate target data in the face data based on the predicted coordinates;
a second-level network training module, configured to input the target data into the second-level network for training, the second-level network being used to output coordinate offset values of the face key points;
a model determination module, configured to, when the training of the second-level network is completed, determine the cascade network to be the face key point detection model.
Optionally, the first-level network training module includes:
a face data input submodule, configured to input the face data into the first-level network for processing and output the predicted coordinates of the face key points;
a first loss value calculation submodule, configured to calculate a first loss value using the predicted coordinates;
a first-level network convergence judgment submodule, configured to judge whether the first-level network converges according to the first loss value; if so, call a first-level network completion submodule, and if not, call a first-level network adjustment submodule;
the first-level network completion submodule, configured to determine that the training of the first-level network is completed;
the first-level network adjustment submodule, configured to adjust the first-level network according to the first loss value and call the face data input submodule again.
Optionally, the first loss value calculation submodule includes:
a first distance calculation unit, configured to calculate first distances between the predicted coordinates and true coordinates;
a first average calculation unit, configured to calculate the average of the first distances as the first loss value.
Optionally, the target data generation module includes:
a partial image data extraction submodule, configured to extract partial image data from the face data based on the predicted coordinates;
a matrix combination submodule, configured to combine the partial image data corresponding to multiple face key points into a data matrix by color channel, as the target data.
Optionally, the second-level network training module includes:
a target data input submodule, configured to input the target data into the second-level network for processing and output the coordinate offset values of the face key points;
a second loss value calculation submodule, configured to calculate a second loss value using the coordinate offset values;
a second-level network convergence judgment submodule, configured to judge whether the second-level network converges according to the second loss value; if so, call a second-level network completion submodule, and if not, call a second-level network adjustment submodule;
the second-level network completion submodule, configured to determine that the training of the second-level network is completed;
the second-level network adjustment submodule, configured to adjust the second-level network according to the second loss value and call the target data input submodule again.
Optionally, the second loss value calculation submodule includes:
a second distance calculation unit, configured to calculate second distances between the predicted coordinates and the offset coordinates;
a second average calculation unit, configured to calculate the average of the second distances as the second loss value.
Optionally, the device further includes:
a data enhancement processing module, configured to perform data enhancement processing on the face data;
wherein the data enhancement processing includes at least one of the following:
adding noise data; cropping and restoring; translation processing; increasing contrast.
According to another aspect of the present invention, a face key point detection device based on a face key point detection model is provided, the face key point detection model including a first-level network and a second-level network, the device comprising:
a face data extraction module, configured to extract face data from a target image;
a first-level network processing module, configured to input the face data into the first-level network for processing and output predicted coordinates of face key points;
a target data generation module, configured to generate target data in the face data based on the predicted coordinates;
a second-level network processing module, configured to input the target data into the second-level network for processing and output coordinate offset values of the face key points;
a target coordinate calculation module, configured to add the coordinate offset values on the basis of the predicted coordinates to obtain target coordinates of the face key points.
The embodiments of the present invention have the following advantages:
In the embodiments of the present invention, the cascade network includes a first-level network used to output predicted coordinates of face key points and a second-level network used to output coordinate offset values of the face key points. Face data is extracted from a training image and input into the first-level network for training; when the training of the first-level network is completed, target data is generated in the face data based on the predicted coordinates and input into the second-level network for training; when the training of the second-level network is completed, the cascade network is determined to be the face key point detection model. By learning predicted coordinates and coordinate offset values with a two-level network, accurate coordinates of face key points can still be obtained under complex scenes. Moreover, the input size of the second-level network is much smaller than that of the first-level network, which reduces the time consumption of the second-level network and makes the model suitable for face key point detection on devices with limited resources, such as mobile terminals, thereby improving practicability.
Brief description of the drawings
Fig. 1 is a step flow chart of a method for training a face key point detection model based on a cascade network according to an embodiment of the present invention;
Fig. 2 is a structural example diagram of a first-level network and a second-level network according to an embodiment of the present invention;
Fig. 3 is a step flow chart of a face key point detection method based on a face key point detection model according to an embodiment of the present invention;
Fig. 4 is a structural block diagram of a device for training a face key point detection model based on a cascade network according to an embodiment of the present invention;
Fig. 5 is a structural block diagram of a face key point detection device based on a face key point detection model according to an embodiment of the present invention.
Detailed description of the embodiments
In order to make the above objectives, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a step flow chart of a method for training a face key point detection model based on a cascade network according to an embodiment of the present invention is shown. The method may specifically include the following steps:
Step 101: extracting face data from a training image.
In the embodiments of the present invention, a cascade network can be used to train a face key point detection model for detecting face key points.
The cascade network includes a first-level network and a second-level network. Each level of network can run independently, and the input of the second-level network depends on the output of the first-level network.
Taking a CNN (Convolutional Neural Network) as an example of the first-level network and the second-level network: a CNN has multiple layers, and the output of one layer serves as the input of the next layer.
Each layer of a CNN is generally composed of multiple maps, and each map is composed of multiple neural units; all neural units of the same map share one convolution kernel (i.e., the weights). A convolution kernel often represents a feature. For example, if a certain convolution kernel represents an arc segment, then when this kernel is slid over the whole picture, the regions with larger convolution values are likely to be arc segments.
As shown in Fig. 2, in this example the first-level network (a CNN) has 5 convolutional layers followed by 2 fully connected layers, and the second-level network (a CNN) has 3 convolutional layers followed by 2 fully connected layers.
A convolutional layer is essentially a feature extraction layer. A hyperparameter F can be set to specify how many feature extractors (filters) the layer has. A filter is equivalent to a moving window of size k*d that starts from the beginning of the input matrix and keeps sliding, where k and d are the window sizes specified for the filter. For the window at a given moment, the input values in the window are converted into a feature value through the nonlinear transformation of the neural network; as the window keeps sliding, the feature values corresponding to this filter are continuously generated and form the feature vector of this filter. This is the process by which a convolutional layer extracts features. Each filter operates in this way, forming different feature extractors.
In the fully connected layers, n 1*1 convolution kernels convolve the features of the previous layer, and then mean pooling is applied to the convolved features.
Of course, the above structures of the first-level network and the second-level network are only examples. When implementing the embodiments of the present invention, other structures of the first-level network and the second-level network may be set according to the actual situation, which is not limited in the embodiments of the present invention. In addition, besides the above structures, those skilled in the art may also adopt other structures of the first-level network and the second-level network according to actual needs, which is likewise not limited in the embodiments of the present invention.
When training the face key point detection model with the cascade network, pre-prepared training images can be extracted from a picture library; each training image is image data containing a face.
Face detection is performed on the training image to identify the region where the face is located, and the region is cropped into a pixel block of a specified size (such as 10*10) as the face data.
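As a sketch of this cropping step (the face box coordinates, image size, and nearest-neighbour resize below are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def crop_face(image, box, size=10):
    """Crop the detected face region and rescale it to size*size via
    nearest-neighbour sampling (a stand-in for a proper resize)."""
    x0, y0, x1, y1 = box
    region = image[y0:y1, x0:x1]
    # nearest-neighbour index maps from the target grid to the source grid
    rows = (np.arange(size) * region.shape[0] / size).astype(int)
    cols = (np.arange(size) * region.shape[1] / size).astype(int)
    return region[rows][:, cols]

# a dummy 40*40 grayscale "training image" with a face box at (5, 5)-(35, 35)
image = np.arange(40 * 40, dtype=np.float32).reshape(40, 40)
face_data = crop_face(image, (5, 5, 35, 35), size=10)
print(face_data.shape)  # (10, 10)
```

A real implementation would use the bounding box produced by one of the face detection methods listed below.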
In a specific implementation, face detection can be performed in one or more of the following ways:
1. Reference template method
One or several templates of standard faces are designed first; the matching degree between the acquired test sample and the standard templates is then calculated, and a threshold is used to determine whether a face exists.
2. Face rule method
Since faces have certain structural distribution features, the so-called face rule method extracts these features and generates corresponding rules to judge whether a test sample contains a face.
3. Sample learning method
This method applies the artificial neural network methods of pattern recognition, i.e., a classifier is generated by learning from a face sample set and a non-face sample set.
4. Skin color model method
This method performs detection according to the rule that the skin color of faces is relatively concentrated in the color space.
5. Feature sub-face method
This method regards the set of all face images as a face image subspace, and judges whether a face image exists based on the distance between a test sample and its projection in the subspace.
In scenes such as video recording and photographing, the detection of face key points often jitters considerably. The jitter of face key points may be related to the detection jitter of the face frame, and may also be related to illumination changes.
In the embodiments of the present invention, in order to reduce jitter, data enhancement processing may be performed on the face data during the training of the face key point detection model; enhancing the face data helps to improve the robustness of the face key point detection model.
In one example, the data enhancement processing includes at least one of the following:
1. Adding noise data
Random pixel values are added to the face data as noise (e.g., Gaussian noise).
2. Cropping and restoring
The face data is randomly cropped, generally to 80%~100% of the full size, and then stretched (resized) back to the full size. At this point, the true coordinates of the face key points must also be transformed correspondingly, to ensure that the positions of the face key points do not drift.
3. Translation processing
The pixel values of the face image are shifted as a whole, and the vacated region is filled with 0 or other pixel values.
4. Increasing contrast
In the face data, the pixel values of regions with larger pixel values are increased, and the pixel values of regions with low pixel values are decreased.
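Three of these enhancements can be sketched as follows (the noise level, shift amount, crop scale, and array layout are invented for the example; they are not specified by the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(face, sigma=5.0):
    # add Gaussian noise pixel values to the face data
    return face + rng.normal(0.0, sigma, face.shape)

def translate(face, dx, dy, fill=0.0):
    # shift all pixel values as a whole; the vacated area is filled with `fill`
    out = np.full_like(face, fill)
    h, w = face.shape[:2]
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        face[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def crop_and_restore(face, keypoints, scale=0.9):
    # crop to `scale` of full size, stretch back to full size, and
    # transform the true key-point coordinates so they do not drift
    h, w = face.shape[:2]
    ch, cw = int(h * scale), int(w * scale)
    rows = (np.arange(h) * ch / h).astype(int)
    cols = (np.arange(w) * cw / w).astype(int)
    restored = face[:ch, :cw][rows][:, cols]
    return restored, keypoints * np.array([w / cw, h / ch])

face = np.full((10, 10), 100.0)
pts = np.array([[4.5, 4.5]])       # (x, y) true coordinates
noisy = add_noise(face)
shifted = translate(face, 2, 0)
restored, new_pts = crop_and_restore(face, pts, scale=0.5)
print(shifted[0, 0], new_pts[0])
```

Note how `crop_and_restore` returns the rescaled key-point coordinates along with the image, matching the requirement above that the true coordinates be transformed with the crop.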
Of course, the above data enhancement processing is only an example. When implementing the embodiments of the present invention, other data enhancement processing may be set according to the actual situation, which is not limited in the embodiments of the present invention. In addition, besides the above data enhancement processing, those skilled in the art may also adopt other data enhancement processing according to actual needs, which is likewise not limited in the embodiments of the present invention.
Step 102: inputting the face data into the first-level network for training.
In the embodiments of the present invention, the first-level network can be used to output the predicted coordinates of the face key points.
As shown in Fig. 2, the face data is input into the first-level network, and the first-level network is trained with the face data as training samples.
In an embodiment of the present invention, step 102 may include the following sub-steps:
Sub-step S11: inputting the face data into the first-level network for processing, and outputting the predicted coordinates of the face key points.
Sub-step S12: calculating a first loss value using the predicted coordinates.
Sub-step S13: judging whether the first-level network converges according to the first loss value; if so, executing sub-step S14, and if not, executing sub-step S15.
Sub-step S14: determining that the training of the first-level network is completed.
Sub-step S15: adjusting the first-level network according to the first loss value, and returning to execute sub-step S11.
In the embodiments of the present invention, the face data is provided in advance with the true coordinates of the face key points.
The face data is input into the first-level network and processed according to the logic of the first-level network. For example, as shown in Fig. 2, after convolution processing in the 5 convolutional layers in sequence, processing continues in the 2 fully connected layers.
After the first-level network finishes processing, the predicted coordinates of the face key points are output.
At this point, the predicted coordinates of the face key points are input as calculation parameters into a preset loss function, and the first loss value is calculated.
In one example, the first distances between the predicted coordinates and the true coordinates can be calculated, and the average of the first distances is taken as the first loss value.
Taking the Euclidean distance as an example, the first loss value is calculated by the following formula:

loss₁ = (1/n) Σᵢ₌₁ⁿ √((x1ᵢ − x̂ᵢ)² + (y1ᵢ − ŷᵢ)²)

where the face data has a total of n (n is a positive integer) face key points, (x1ᵢ, y1ᵢ) are the predicted coordinates of the i-th (i is a positive integer, i ≤ n) face key point in the first-level network, and (x̂ᵢ, ŷᵢ) are the true coordinates of the i-th face key point.
In each round of iteration, it is judged whether the first loss value meets a preset condition, e.g., is less than a set first threshold. If so, it is determined that the training of the first-level network is completed; if not, the parameters in the first-level network are adjusted and training continues in the next iteration, so that the first-level network gradually converges under the constraint of the loss function through back-propagation and similar means, and iteration stops once the network is stable.
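The first loss — the average Euclidean distance between predicted and true key-point coordinates — can be sketched with NumPy arrays of shape (n, 2); the sample values below are invented:

```python
import numpy as np

def first_loss(pred, truth):
    # mean Euclidean distance over the n face key points
    return float(np.mean(np.sqrt(np.sum((pred - truth) ** 2, axis=1))))

pred = np.array([[3.0, 4.0], [0.0, 0.0]])   # predicted coordinates
truth = np.array([[0.0, 0.0], [0.0, 0.0]])  # true coordinates
print(first_loss(pred, truth))  # 2.5 (distances 5.0 and 0.0, averaged)
```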
Step 103: when the training of the first-level network is completed, generating target data in the face data based on the predicted coordinates.
When training the face key point detection model, the first-level network is trained first. After the first-level network converges and stabilizes, the parameters in the first-level network are fixed, and then the second-level network is trained.
As shown in Fig. 2, partial data can be extracted from the face data based on the face key points and combined into the target data, so as to reduce the data volume.
In an embodiment of the present invention, step 103 may include the following sub-steps:
Sub-step S21: extracting partial image data from the face data based on the predicted coordinates.
Sub-step S22: converting the partial image data into a color matrix as the target data.
In the embodiments of the present invention, with the predicted coordinate of a face key point as a reference point, the data in a certain nearby range (such as 10*10) is cut out from the face data as partial image data of a specified size (such as 10*10*3).
The partial image data corresponding to the multiple face key points is combined by color channel into a data matrix (a matrix organized along the color dimension) as the target data.
For example, the size of each piece of partial image data is 10*10*3, where 3 corresponds to the three RGB color channels; the partial image data corresponding to the n face key points is combined to form a 10*10*3n data matrix.
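A minimal sketch of this target-data assembly (the patch size, clamping behavior at the border, and sample coordinates are illustrative assumptions; real code would also handle sub-pixel key-point positions):

```python
import numpy as np

def make_target_data(face, pred_coords, half=5):
    """Cut a (2*half)x(2*half) patch around each predicted key point and
    stack the patches along the color channel into one data matrix."""
    h, w, _ = face.shape
    patches = []
    for x, y in pred_coords:
        # clamp so the window stays inside the face data
        x0 = int(min(max(x - half, 0), w - 2 * half))
        y0 = int(min(max(y - half, 0), h - 2 * half))
        patches.append(face[y0:y0 + 2 * half, x0:x0 + 2 * half])
    return np.concatenate(patches, axis=2)

face = np.zeros((100, 100, 3))           # face data with 3 RGB channels
coords = [(20, 30), (50, 50), (80, 70)]  # n = 3 predicted key points
target = make_target_data(face, coords)
print(target.shape)  # (10, 10, 9), i.e. 10*10*3n with n = 3
```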
Step 104: inputting the target data into the second-level network for training.
In the embodiments of the present invention, the second-level network can be used to output the coordinate offset values of the face key points. A so-called coordinate offset value can refer to the degree to which a predicted coordinate deviates from the true coordinate.
As shown in Fig. 2, the target data is input into the second-level network, and the second-level network is trained with the target data as training samples.
In an embodiment of the present invention, step 104 may include the following sub-steps:
Sub-step S31: inputting the target data into the second-level network for processing, and outputting the coordinate offset values of the face key points.
Sub-step S32: calculating a second loss value using the coordinate offset values.
Sub-step S33: judging whether the second-level network converges according to the second loss value; if so, executing sub-step S34, and if not, executing sub-step S35.
Sub-step S34: determining that the training of the second-level network is completed.
Sub-step S35: adjusting the second-level network according to the second loss value, and returning to execute sub-step S31.
In the embodiments of the present invention, the face data is provided in advance with the true coordinates of the face key points.
The target data is input into the second-level network and processed according to the logic of the second-level network. For example, as shown in Fig. 2, after convolution processing in the 3 convolutional layers in sequence, processing continues in the 2 fully connected layers.
After the second-level network finishes processing, the residual between the predicted coordinates and the true coordinates of the face key points is output as the coordinate offset value of the face key points.
At this point, the coordinate offset values of the face key points are input as calculation parameters into a preset loss function, and the second loss value is calculated.
In one example, the second distances between the predicted coordinates and the offset coordinates can be calculated, and the average of the second distances is taken as the second loss value.
Taking the Euclidean distance as an example, the second loss value is calculated by the following formula:

loss₂ = (1/n) Σᵢ₌₁ⁿ √((x2ᵢ − x̂2ᵢ)² + (y2ᵢ − ŷ2ᵢ)²)

where the face data has a total of n (n is a positive integer) face key points, (x2ᵢ, y2ᵢ) are the coordinate offset values of the i-th (i is a positive integer, i ≤ n) face key point output by the second-level network, and (x̂2ᵢ, ŷ2ᵢ) is the residual between the predicted coordinates and the true coordinates of the i-th face key point (i.e., the ground-truth coordinate offset value).
At this point, x̂2ᵢ = x̂ᵢ − x1ᵢ and ŷ2ᵢ = ŷᵢ − y1ᵢ, where (x̂ᵢ, ŷᵢ) are the true coordinates and (x1ᵢ, y1ᵢ) are the predicted coordinates of the i-th face key point in the first-level network.
In each round of iteration, it is judged whether the second loss value meets a preset condition, e.g., is less than a set second threshold. If so, it is determined that the training of the second-level network is completed; if not, the parameters in the second-level network are adjusted and training continues in the next iteration, so that the second-level network gradually converges under the constraint of the loss function through back-propagation and similar means, and iteration stops once the network is stable.
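The second loss can be sketched analogously, with the ground-truth offset taken as the true coordinate minus the first-level prediction (the sample values are invented):

```python
import numpy as np

def second_loss(pred_offsets, first_preds, truth):
    """Mean Euclidean distance between the offsets predicted by the
    second-level network and the ground-truth residuals."""
    target_offsets = truth - first_preds
    d = np.sqrt(np.sum((pred_offsets - target_offsets) ** 2, axis=1))
    return float(np.mean(d))

first_preds = np.array([[10.0, 10.0], [20.0, 20.0]])
truth = np.array([[12.0, 10.0], [20.0, 24.0]])
pred_offsets = np.array([[2.0, 0.0], [0.0, 4.0]])  # exactly the residuals
print(second_loss(pred_offsets, first_preds, truth))  # 0.0
```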
Step 105: when the training of the second-level network is completed, determining the cascade network to be the face key point detection model.
After the second-level network converges and stabilizes, the parameters in the second-level network are fixed; at this point, the cascade network is the face key point detection model.
In the embodiments of the present invention, the cascade network includes a first-level network used to output predicted coordinates of face key points and a second-level network used to output coordinate offset values of the face key points. Face data is extracted from a training image and input into the first-level network for training; when the training of the first-level network is completed, target data is generated in the face data based on the predicted coordinates and input into the second-level network for training; when the training of the second-level network is completed, the cascade network is determined to be the face key point detection model. By learning predicted coordinates and coordinate offset values with a two-level network, accurate coordinates of face key points can still be obtained under complex scenes. Moreover, the input size of the second-level network is much smaller than that of the first-level network, which reduces the time consumption of the second-level network and makes the model suitable for face key point detection on devices with limited resources, such as mobile terminals, thereby improving practicability.
Referring to Fig. 3, there is shown a flow chart of the steps of a face keypoint detection method based on a face keypoint detection model according to one embodiment of the present invention, which may specifically include the following steps:
Step 301: extract face data from a target image.
Step 302: input the face data to the first-level network for processing, and output the prediction coordinates of face keypoints.
Step 303: in the face data, generate target data based on the prediction coordinates.
Step 304: input the target data to the second-level network for processing, and output the coordinate offset values of the face keypoints.
Step 305: add the coordinate offset values on the basis of the prediction coordinates to obtain the target coordinates of the face keypoints.
In practical applications, the face keypoint detection model may be deployed in systems such as access control, monitoring, payment, and camera applications, to perform face detection on users according to business requirements and to identify the face keypoints therein, such as face contour coordinates and facial-feature coordinates.
In the embodiments of the present invention, the face keypoint detection model includes a first-level network and a second-level network; each level of network can run independently, and the input of the second-level network depends on the output of the first-level network.
If a target image in which face keypoints are to be detected is obtained, face detection may be performed on the target image to identify the region where a face is located, and that region is cropped into a pixel block of a specified size (such as 10*10) as the face data.
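The detect-then-crop step can be sketched as follows. The nearest-neighbour resampling and the face box values are illustrative assumptions; the description only specifies cropping the detected region to a pixel block of a specified size, with 10*10 as an example.

```python
import numpy as np

def crop_face(image, box, size=10):
    """Cut the detected face region out of the target image and scale it to a
    fixed size*size pixel block (nearest-neighbour resampling) as the face data."""
    x0, y0, x1, y1 = box
    region = image[y0:y1, x0:x1]
    h, w = region.shape[:2]
    rows = (np.arange(size) * h // size).astype(int)  # nearest source row per output row
    cols = (np.arange(size) * w // size).astype(int)
    return region[rows][:, cols]

image = np.random.rand(120, 160, 3)              # stand-in target image (H, W, RGB)
face_data = crop_face(image, (40, 20, 100, 90))  # hypothetical detector output box
```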
The face data is input to the first-level network and processed according to its logic, and the prediction coordinates of the face keypoints are output.
For example, as shown in Fig. 2, after convolution processing is performed in 5 convolutional layers in sequence, processing is performed in 2 fully connected layers.
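The layer counts above (5 convolutional layers followed by 2 fully connected layers) fix only the depth of the first-level network; kernel sizes, strides, channel widths, and the input resolution are not specified. The sketch below simply walks the standard output-size formula through five stride-2 convolutions under assumed settings (a 64*64 input is assumed here, since the 10*10 example crop would be too small for five stride-2 layers):

```python
def conv_out(size, kernel=3, stride=2, pad=1):
    """Spatial output size of one convolution layer: (size + 2*pad - kernel) // stride + 1."""
    return (size + 2 * pad - kernel) // stride + 1

size = 64                  # assumed input resolution of the face data
for _ in range(5):         # 5 convolutional layers in sequence
    size = conv_out(size)  # each stride-2 layer roughly halves the feature map
K = 5                      # assumed number of face keypoints (e.g. eyes, nose, mouth corners)
fc1_width = 128            # assumed width of the first fully connected layer
fc2_width = 2 * K          # second fully connected layer regresses (x, y) per keypoint
```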
At this point, partial data can be extracted from the face data based on the face keypoints and combined into target data, in order to reduce the data volume.
In one embodiment, partial image data may be extracted in the face data based on the prediction coordinates, and the partial image data corresponding to multiple face keypoints is combined into a data matrix according to color channels, as the target data.
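A minimal sketch of this step: cut a small patch of the face data around each predicted coordinate, then stack the patches channel-wise into one data matrix. The patch size, the border clipping, and the stand-in coordinates are illustrative assumptions:

```python
import numpy as np

def build_target_data(face_data, pred_coords, patch=4):
    """For each predicted keypoint, extract a patch*patch block of partial image
    data centred on the prediction, then combine all patches along the colour
    axis into a single data matrix (the target data for the second-level network)."""
    h, w = face_data.shape[:2]
    patches = []
    for x, y in pred_coords:
        x0 = int(np.clip(x - patch // 2, 0, w - patch))  # keep the patch inside the image
        y0 = int(np.clip(y - patch // 2, 0, h - patch))
        patches.append(face_data[y0:y0 + patch, x0:x0 + patch])
    return np.concatenate(patches, axis=-1)  # combine according to colour channels

face_data = np.random.rand(10, 10, 3)   # stand-in 10*10 RGB face crop
pred_coords = [(3, 3), (7, 3), (5, 6)]  # stand-in first-level predictions
target_data = build_target_data(face_data, pred_coords)
```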
Thereafter, the target data is input to the second-level network and processed according to its logic, and the coordinate offset values of the face keypoints are output.
For example, as shown in Fig. 2, after convolution processing is performed in 3 convolutional layers in sequence, processing is performed in 2 fully connected layers.
At this point, the coordinate offset value is added on the basis of the prediction coordinate to obtain the target coordinate of the face keypoint:
(x + Δx, y + Δy)
where (x, y) is the prediction coordinate of the face keypoint and (Δx, Δy) is its coordinate offset value.
The target coordinates serve as the output of the face keypoint detection model, for use by other modules in the system.
In the embodiments of the present invention, the cascade network includes a first-level network and a second-level network. Face data is extracted from the target image and input to the first-level network for processing, which outputs the prediction coordinates of face keypoints; in the face data, target data is generated based on the prediction coordinates and input to the second-level network for processing, which outputs the coordinate offset values of the face keypoints; the coordinate offset values are added on the basis of the prediction coordinates to obtain the target coordinates of the face keypoints. By learning prediction coordinates and coordinate offset values through two levels of networks, accurate face keypoint coordinates can still be obtained under complex scenes. Moreover, the input size of the second-level network is much smaller than that of the first-level network, which reduces the time consumption of the second-level network and makes the model suitable for face keypoint detection on devices with limited resources, such as mobile terminals, thereby improving practicability.
It should be noted that for simple description, therefore, it is stated as a series of action groups for embodiment of the method It closes, but those skilled in the art should understand that, embodiment of that present invention are not limited by the describe sequence of actions, because according to According to the embodiment of the present invention, some steps may be performed in other sequences or simultaneously.Secondly, those skilled in the art also should Know, the embodiments described in the specification are all preferred embodiments, and the related movement not necessarily present invention is implemented Necessary to example.
Referring to Fig. 4, there is shown a structural block diagram of a training apparatus for a face keypoint detection model based on a cascade network according to one embodiment of the present invention. The cascade network includes a first-level network and a second-level network, and the apparatus may specifically include the following modules:
a face data extraction module 401, configured to extract face data from a training image;
a first-level network training module 402, configured to input the face data to the first-level network for training, the first-level network being used to output prediction coordinates of face keypoints;
a target data generation module 403, configured to, when training of the first-level network is complete, generate target data in the face data based on the prediction coordinates;
a second-level network training module 404, configured to input the target data to the second-level network for training, the second-level network being used to output coordinate offset values of the face keypoints;
a model determination module 405, configured to, when training of the second-level network is complete, determine the cascade network to be the face keypoint detection model.
In one embodiment of the present invention, the first-level network training module 402 includes:
a face data input submodule, configured to input the face data to the first-level network for processing and output the prediction coordinates of face keypoints;
a first loss value calculation submodule, configured to calculate a first loss value using the prediction coordinates;
a first-level network convergence judgment submodule, configured to judge, according to the first loss value, whether the first-level network has converged; if so, to call the first-level network completion submodule, and if not, to call the first-level network adjustment submodule;
a first-level network completion submodule, configured to determine that training of the first-level network is complete;
a first-level network adjustment submodule, configured to adjust the first-level network according to the first loss value and return to call the face data input submodule.
In one example of an embodiment of the present invention, the first loss value calculation submodule includes:
a first distance calculation unit, configured to calculate first distances between the prediction coordinates and the true coordinates;
a first average calculation unit, configured to calculate the average of the first distances as the first loss value.
In one embodiment of the present invention, the target data generation module 403 includes:
a partial image data extraction submodule, configured to extract partial image data in the face data based on the prediction coordinates;
a matrix combination submodule, configured to combine the partial image data corresponding to multiple face keypoints into a data matrix according to color channels, as the target data.
In one embodiment of the present invention, the second-level network training module 404 includes:
a target data input submodule, configured to input the target data to the second-level network for processing and output the coordinate offset values of the face keypoints;
a second loss value calculation submodule, configured to calculate a second loss value using the coordinate offset values;
a second-level network convergence judgment submodule, configured to judge, according to the second loss value, whether the second-level network has converged; if so, to call the second-level network completion submodule, and if not, to call the second-level network adjustment submodule;
a second-level network completion submodule, configured to determine that training of the second-level network is complete;
a second-level network adjustment submodule, configured to adjust the second-level network according to the second loss value and return to call the target data input submodule.
In one example of an embodiment of the present invention, the second loss value calculation submodule includes:
a second distance calculation unit, configured to calculate second distances between the prediction coordinates and the coordinate offset values;
a second average calculation unit, configured to calculate the average of the second distances as the second loss value.
In one embodiment of the present invention, the apparatus further includes:
a data enhancement processing module, configured to perform data enhancement processing on the face data;
wherein the data enhancement processing includes at least one of the following:
adding noise data, cropping and restoring, translation processing, increasing contrast.
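The four enhancement operations listed can each be sketched in a few lines; the noise level, crop margin, shift amount, and contrast gain below are illustrative parameters that the description does not fix:

```python
import numpy as np

rng = np.random.default_rng(0)

def enhance(face_data):
    """Return one sample of each listed enhancement of the face data."""
    # Add noise data: additive Gaussian noise, clipped back to [0, 1].
    noisy = np.clip(face_data + rng.normal(0.0, 0.05, face_data.shape), 0.0, 1.0)
    # Crop and restore: cut a 1-pixel margin, then pad back to the original size.
    h, w = face_data.shape[:2]
    restored = np.pad(face_data[1:h - 1, 1:w - 1], ((1, 1), (1, 1), (0, 0)), mode="edge")
    # Translation processing: shift the image content by a few pixels.
    translated = np.roll(face_data, shift=(2, -1), axis=(0, 1))
    # Increase contrast: scale deviations from mid-grey, clipped to [0, 1].
    contrast = np.clip(0.5 + 1.5 * (face_data - 0.5), 0.0, 1.0)
    return noisy, restored, translated, contrast

face_data = rng.random((10, 10, 3))
noisy, restored, translated, contrast = enhance(face_data)
```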
Referring to Fig. 5, there is shown a structural block diagram of a face keypoint detection apparatus based on a face keypoint detection model according to one embodiment of the present invention. The face keypoint detection model includes a first-level network and a second-level network, and the apparatus may specifically include the following modules:
a face data extraction module 501, configured to extract face data from a target image;
a first-level network processing module 502, configured to input the face data to the first-level network for processing and output the prediction coordinates of face keypoints;
a target data generation module 503, configured to generate, in the face data, target data based on the prediction coordinates;
a second-level network processing module 504, configured to input the target data to the second-level network for processing and output the coordinate offset values of the face keypoints;
a target coordinate calculation module 505, configured to add the coordinate offset values on the basis of the prediction coordinates to obtain the target coordinates of the face keypoints.
In one embodiment of the present invention, the target data generation module 503 includes:
a partial image data extraction submodule, configured to extract partial image data in the face data based on the prediction coordinates;
a matrix combination submodule, configured to combine the partial image data corresponding to multiple face keypoints into a data matrix according to color channels, as the target data.
As the apparatus embodiments are substantially similar to the method embodiments, their description is relatively simple; for relevant details, refer to the description of the method embodiments.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts between the embodiments can be referred to each other.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus which realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operation steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable terminal device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the embodiments of the present invention.
Finally, it should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or terminal device that includes the element.
The training method for a face keypoint detection model based on a cascade network, the face keypoint detection method based on a face keypoint detection model, the training apparatus for a face keypoint detection model based on a cascade network, and the face keypoint detection apparatus based on a face keypoint detection model provided by the present invention have been introduced in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention, and the above descriptions of the embodiments are only intended to help understand the method and core idea of the present invention. At the same time, for those skilled in the art, there will be changes in the specific implementation and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as a limitation on the present invention.

Claims (14)

1. A training method for a face keypoint detection model based on a cascade network, characterized in that the cascade network includes a first-level network and a second-level network, and the method comprises:
extracting face data from a training image;
inputting the face data to the first-level network for training, the first-level network being used to output prediction coordinates of face keypoints;
when training of the first-level network is complete, generating, in the face data, target data based on the prediction coordinates;
inputting the target data to the second-level network for training, the second-level network being used to output coordinate offset values of the face keypoints;
when training of the second-level network is complete, determining the cascade network to be the face keypoint detection model;
wherein the generating, in the face data, target data based on the prediction coordinates comprises: extracting partial image data in the face data based on the prediction coordinates; and combining the partial image data corresponding to multiple face keypoints into a data matrix according to color channels, as the target data;
the face data being provided in advance with true coordinates of the face keypoints.
2. The method according to claim 1, characterized in that the inputting the face data to the first-level network for training comprises:
inputting the face data to the first-level network for processing, and outputting the prediction coordinates of face keypoints;
calculating a first loss value using the prediction coordinates;
judging, according to the first loss value, whether the first-level network has converged;
if so, determining that training of the first-level network is complete;
if not, adjusting the first-level network according to the first loss value, and returning to perform the step of inputting the face data to the first-level network for processing and outputting the prediction coordinates of the face keypoints.
3. The method according to claim 2, characterized in that the calculating a first loss value using the prediction coordinates comprises:
calculating first distances between the prediction coordinates and the true coordinates;
calculating the average of the first distances as the first loss value.
4. The method according to claim 1, characterized in that the inputting the target data to the second-level network for training comprises:
inputting the target data to the second-level network for processing, and outputting the coordinate offset values of the face keypoints;
calculating a second loss value using the coordinate offset values;
judging, according to the second loss value, whether the second-level network has converged;
if so, determining that training of the second-level network is complete;
if not, adjusting the second-level network according to the second loss value, and returning to perform the step of inputting the target data to the second-level network for processing and outputting the coordinate offset values of the face keypoints.
5. The method according to claim 4, characterized in that the calculating a second loss value using the coordinate offset values comprises:
calculating second distances between the prediction coordinates and the coordinate offset values;
calculating the average of the second distances as the second loss value.
6. The method according to any one of claims 1-5, characterized by further comprising:
performing data enhancement processing on the face data;
wherein the data enhancement processing includes at least one of the following:
adding noise data, cropping and restoring, translation processing, increasing contrast.
7. A face keypoint detection method based on a face keypoint detection model, characterized in that the face keypoint detection model includes a first-level network and a second-level network, and the method comprises:
extracting face data from a target image;
inputting the face data to the first-level network for processing, and outputting prediction coordinates of face keypoints;
in the face data, generating target data based on the prediction coordinates;
inputting the target data to the second-level network for processing, and outputting coordinate offset values of the face keypoints;
adding the coordinate offset values on the basis of the prediction coordinates to obtain target coordinates of the face keypoints;
wherein the generating, in the face data, target data based on the prediction coordinates comprises: extracting partial image data based on the prediction coordinates; and combining the partial image data corresponding to multiple face keypoints into a data matrix according to color channels, as the target data;
the face data being provided in advance with true coordinates of the face keypoints.
8. A training apparatus for a face keypoint detection model based on a cascade network, characterized in that the cascade network includes a first-level network and a second-level network, and the apparatus comprises:
a face data extraction module, configured to extract face data from a training image;
a first-level network training module, configured to input the face data to the first-level network for training, the first-level network being used to output prediction coordinates of face keypoints;
a target data generation module, configured to, when training of the first-level network is complete, generate target data in the face data based on the prediction coordinates;
a second-level network training module, configured to input the target data to the second-level network for training, the second-level network being used to output coordinate offset values of the face keypoints;
a model determination module, configured to, when training of the second-level network is complete, determine the cascade network to be the face keypoint detection model;
wherein the target data generation module includes:
a partial image data extraction submodule, configured to extract partial image data in the face data based on the prediction coordinates;
a matrix combination submodule, configured to combine the partial image data corresponding to multiple face keypoints into a data matrix according to color channels, as the target data;
the face data being provided in advance with true coordinates of the face keypoints.
9. The apparatus according to claim 8, characterized in that the first-level network training module includes:
a face data input submodule, configured to input the face data to the first-level network for processing and output the prediction coordinates of face keypoints;
a first loss value calculation submodule, configured to calculate a first loss value using the prediction coordinates;
a first-level network convergence judgment submodule, configured to judge, according to the first loss value, whether the first-level network has converged; if so, to call the first-level network completion submodule, and if not, to call the first-level network adjustment submodule;
a first-level network completion submodule, configured to determine that training of the first-level network is complete;
a first-level network adjustment submodule, configured to adjust the first-level network according to the first loss value and return to call the face data input submodule.
10. The apparatus according to claim 9, characterized in that the first loss value calculation submodule includes:
a first distance calculation unit, configured to calculate first distances between the prediction coordinates and the true coordinates;
a first average calculation unit, configured to calculate the average of the first distances as the first loss value.
11. The apparatus according to claim 8, characterized in that the second-level network training module includes:
a target data input submodule, configured to input the target data to the second-level network for processing and output the coordinate offset values of the face keypoints;
a second loss value calculation submodule, configured to calculate a second loss value using the coordinate offset values;
a second-level network convergence judgment submodule, configured to judge, according to the second loss value, whether the second-level network has converged; if so, to call the second-level network completion submodule, and if not, to call the second-level network adjustment submodule;
a second-level network completion submodule, configured to determine that training of the second-level network is complete;
a second-level network adjustment submodule, configured to adjust the second-level network according to the second loss value and return to call the target data input submodule.
12. The apparatus according to claim 11, characterized in that the second loss value calculation submodule includes:
a second distance calculation unit, configured to calculate second distances between the prediction coordinates and the coordinate offset values;
a second average calculation unit, configured to calculate the average of the second distances as the second loss value.
13. The apparatus according to any one of claims 8-12, characterized by further comprising:
a data enhancement processing module, configured to perform data enhancement processing on the face data;
wherein the data enhancement processing includes at least one of the following:
adding noise data, cropping and restoring, translation processing, increasing contrast.
14. A face keypoint detection apparatus based on a face keypoint detection model, characterized in that the face keypoint detection model includes a first-level network and a second-level network, and the apparatus comprises:
a face data extraction module, configured to extract face data from a target image;
a first-level network processing module, configured to input the face data to the first-level network for processing and output prediction coordinates of face keypoints;
a target data generation module, configured to generate, in the face data, target data based on the prediction coordinates;
a second-level network processing module, configured to input the target data to the second-level network for processing and output coordinate offset values of the face keypoints;
a target coordinate calculation module, configured to add the coordinate offset values on the basis of the prediction coordinates to obtain target coordinates of the face keypoints;
wherein the target data generation module includes:
a partial image data extraction submodule, configured to extract partial image data in the face data based on the prediction coordinates;
a matrix combination submodule, configured to combine the partial image data corresponding to multiple face keypoints into a data matrix according to color channels, as the target data;
the face data being provided in advance with true coordinates of the face keypoints.
CN201810118211.3A 2018-02-06 2018-02-06 The model training and its detection method and device of face key point Active CN108399373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810118211.3A CN108399373B (en) 2018-02-06 2018-02-06 The model training and its detection method and device of face key point


Publications (2)

Publication Number Publication Date
CN108399373A CN108399373A (en) 2018-08-14
CN108399373B true CN108399373B (en) 2019-05-10

Family

ID=63095216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810118211.3A Active CN108399373B (en) 2018-02-06 2018-02-06 The model training and its detection method and device of face key point

Country Status (1)

Country Link
CN (1) CN108399373B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103824049A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascaded neural network-based face key point detection method
CN106575367A (en) * 2014-08-21 2017-04-19 北京市商汤科技开发有限公司 A method and a system for facial landmark detection based on multi-task
CN106874898A (en) * 2017-04-08 2017-06-20 复旦大学 Extensive face identification method based on depth convolutional neural networks model



Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on face alignment algorithms based on a two-stage localization model; Wang Feng; China Master's Theses Full-text Database, Information Science and Technology; 2018-01-15 (No. 01); pp. 11-53
Face keypoint detection algorithm based on cascaded convolutional neural networks; Jin Yifan; China Master's Theses Full-text Database, Information Science and Technology; 2016-02-15 (No. 02); I138-1621
Research on face feature point localization methods combined with face detection; Dong Ruixia; China Master's Theses Full-text Database, Information Science and Technology; 2018-01-15 (No. 01); I138-967

Also Published As

Publication number Publication date
CN108399373A (en) 2018-08-14

Similar Documents

Publication Publication Date Title
CN108399373B (en) Model training and detection method and device for face key points
US10719940B2 (en) Target tracking method and device oriented to airborne monitoring scenarios
KR102150776B1 (en) Face location tracking method, apparatus and electronic device
Fischer et al. FlowNet: Learning optical flow with convolutional networks
CN104408742B (en) Moving target detection method based on joint space-time-frequency spectrum analysis
CN107292247A (en) Human action recognition method and device based on residual networks
CN109598268A (en) RGB-D salient object detection method based on a single-stream deep network
CN108198201A (en) Multi-object tracking method, terminal device and storage medium
CN111881804B (en) Posture estimation model training method, system, medium and terminal based on joint training
CN109919059B (en) Salient object detection method based on deep network layering and multi-task training
CN108121931A (en) Two-dimensional code data processing method, device and mobile terminal
CN107909026A (en) Age and gender estimation based on small-scale convolutional neural networks for embedded systems
CN113095254B (en) Method and system for positioning key points of human body part
CN110956646A (en) Target tracking method, device, equipment and storage medium
CN110598715A (en) Image recognition method and device, computer equipment and readable storage medium
CN106952304A (en) Depth image computation method using inter-frame correlation of video sequences
KR20220081261A (en) Method and apparatus for object pose estimation
CN111008631A (en) Image association method and device, storage medium and electronic device
CN110969110A (en) Face tracking method and system based on deep learning
CN113177470A (en) Pedestrian trajectory prediction method, device, equipment and storage medium
CN101587590A (en) Selective visual attention computation model based on pulse cosine transform
CN111260687B (en) Aerial video target tracking method based on semantic perception network and related filtering
CN114021704B (en) AI neural network model training method and related device
CN107948586A (en) Cross-region moving target detection method and device based on video stitching
KR20190078890A (en) Method and apparatus for estimating plane based on grids

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant