CN106575367A - A method and a system for facial landmark detection based on multi-task - Google Patents

A method and a system for facial landmark detection based on multi-task

Info

Publication number
CN106575367A
Authority
CN
China
Prior art keywords
face
training
convolutional neural
neural networks
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201480081241.1A
Other languages
Chinese (zh)
Other versions
CN106575367B (en)
Inventor
汤晓鸥
张展鹏
罗平
吕健勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Publication of CN106575367A publication Critical patent/CN106575367A/en
Application granted granted Critical
Publication of CN106575367B publication Critical patent/CN106575367B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/165Detection; Localisation; Normalisation using facial parts and geometric relationships

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present application discloses a method and a system for detecting facial landmarks of a face image. The method may comprise: extracting multiple feature maps from at least one facial region of the face image and/or the whole face image; generating a shared facial feature vector from the extracted feature maps; and predicting facial landmark locations of the face image from the generated shared facial feature vector. With the present method and system, facial landmark detection can be optimized jointly with heterogeneous but subtly related tasks, so that detection robustness can be improved through multi-task learning.

Description

Method and system for multi-task-based facial landmark detection
Technical field
The present application relates to face alignment and, more particularly, to a method and a system for facial landmark detection.
Background technology
Facial landmark detection is a fundamental component of many face analysis tasks (e.g., facial attribute inference, face verification, and face recognition), but it has long been hampered by problems of severe occlusion and pose variation.
Accurate facial landmark detection can be performed using cascaded CNNs (convolutional neural networks), in which the face is pre-partitioned into different parts and each part is processed by a separate deep CNN. The resulting outputs are then averaged and passed to separate cascaded layers to process each facial landmark individually.
Moreover, facial landmark detection is not an isolated problem; its estimation can be influenced by a number of heterogeneous but subtly related factors. For example, when a child is smiling, his or her mouth opens widely. Effectively discovering and exploiting such an intrinsic correlation with facial attributes helps to detect the mouth corners more accurately. Likewise, for a face with large yaw rotation, the inter-ocular distance becomes smaller. Such pose information can serve as an additional source of information to constrain the solution space of landmark estimation. Given a rich and reasonable set of related tasks, treating facial landmark detection in isolation can therefore be counterproductive.
However, different tasks are inherently different in learning difficulty and have different convergence rates. Moreover, some tasks may over-fit earlier than others, thereby harming the convergence of the whole model when they are learned simultaneously.
Summary of the invention
In one aspect of the present application, a method for detecting facial landmarks of a face image is disclosed. The method may comprise: extracting multiple feature maps from at least one facial region of the face image; generating a shared facial feature vector from the extracted feature maps; and predicting facial landmark locations of the face image from the generated shared facial feature vector.
In another aspect of the present application, a system for detecting facial landmarks of a face image is disclosed. The system may comprise a feature extractor and a predictor. The feature extractor may extract multiple feature maps from at least one facial region of the face image and generate a shared facial feature vector from the extracted feature maps. The predictor may predict facial landmark locations of the face image from the shared facial feature vector generated by the feature extractor.
The present application also relates to a method for simultaneously training a convolutional neural network for performing facial landmark detection together with at least one associated auxiliary task. The method may comprise: 1) sampling, from a predetermined training set, a training face image, its ground-truth landmark locations, and its ground-truth target for each auxiliary task; 2) comparing the predicted facial landmark locations with the ground-truth landmark locations to generate a landmark error; 3) comparing, for each auxiliary task, the target prediction with the corresponding ground-truth target to generate at least one training-task error; 4) back-propagating the generated landmark error and all training-task errors through the convolutional neural network to adjust the weights of the connections between the neurons of the convolutional neural network; 5) sampling, from a predetermined validation set, a validation face image and its ground-truth target for each auxiliary task; 6) comparing the target prediction with the ground-truth target to generate a validation-task error; and 7) determining whether the generated training-task error is less than a first predetermined value and whether the generated validation-task error is less than a second predetermined value. If so, the training of the convolutional neural network is terminated; otherwise, steps 1) to 7) are repeated.
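As a concrete illustration of steps 1) to 7), the following sketch runs the same loop on a toy linear stand-in for the convolutional network: least squares for the landmark error and cross entropy for a single binary auxiliary task. The data, dimensions, learning rate, and stopping thresholds are all invented for the example and are not part of the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 16-D shared features, 10-D landmark targets (5 points),
# and one binary auxiliary task label derived from the first feature.
d = 16
W_true = rng.standard_normal((d, 10))

def sample(n):
    X = rng.standard_normal((n, d))
    return X, X @ W_true, (X[:, 0] > 0).astype(int)

Xtr, Ytr, Atr = sample(200)   # training set
Xv, Yv, Av = sample(50)       # validation set

W_r = np.zeros((d, 10))       # landmark regressor
W_a = np.zeros((d, 2))        # auxiliary-task classifier
eta = 0.01
for step in range(2000):
    i = rng.integers(len(Xtr))                    # 1) sample a training image
    x, y_r, y_a = Xtr[i], Ytr[i], Atr[i]
    resid = y_r - W_r.T @ x                       # 2) landmark error
    logits = W_a.T @ x
    p = np.exp(logits - logits.max()); p /= p.sum()
    # 3)-4) training-task error, "back-propagated" here as gradient steps
    W_r += eta * np.outer(x, resid)
    grad = p.copy(); grad[y_a] -= 1.0             # cross-entropy gradient
    W_a -= eta * np.outer(x, grad)
    # 5)-7) periodically check training and validation errors, stop when small
    if step % 200 == 199:
        tr_err = np.mean((Ytr - Xtr @ W_r) ** 2)
        val_acc = np.mean((Xv @ W_a).argmax(axis=1) == Av)
        if tr_err < 0.05 and val_acc > 0.9:
            break

val_mse = float(np.mean((Yv - Xv @ W_r) ** 2))
val_acc = float(np.mean((Xv @ W_a).argmax(axis=1) == Av))
print(val_mse < 0.2, val_acc > 0.8)
```

In the actual embodiments the gradient step is replaced by back-propagation through the CNN, but the control flow (sample, compare, propagate, validate, stop) is the same.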
The present invention further provides a computer-readable medium for storing instructions that are executable by one or more processors to perform the method described above.
Compared with conventional methods, facial landmark detection can be optimized jointly with heterogeneous but subtly related auxiliary tasks, so that detection robustness can be improved through multi-task learning, especially when handling faces with severe occlusion and pose variation.
According to the present application, only a single CNN is used, which reduces the complexity of the required system/device. Neither pre-partitioning of the face nor cascaded convolutional layers is needed, which greatly reduces model complexity while still achieving comparable or even better accuracy.
As training proceeds, some related tasks stop being beneficial to the main task once they reach their best performance, and their training can therefore be halted. According to the present application, the CNN is trained with an "early stopping" scheme that halts the learning of related tasks that begin to over-fit the training set and thereby harm the main task, so as to facilitate learning convergence.
Description of the drawings
Exemplary non-limiting embodiments of the present invention are described below with reference to the accompanying drawings. The drawings are illustrative and are generally not drawn to exact scale. The same or similar elements in different figures are referenced with identical reference numerals.
Fig. 1 is a schematic diagram illustrating a system for facial landmark detection according to some disclosed embodiments.
Fig. 2 is a schematic diagram illustrating the training unit shown in Fig. 1 according to some disclosed embodiments.
Fig. 3 is a schematic diagram illustrating an example of the system for facial landmark detection according to some disclosed embodiments, in which an example of the convolutional neural network is shown.
Fig. 4 is a schematic diagram illustrating the system for facial landmark detection according to some disclosed embodiments when implemented in software.
Fig. 5 is a schematic flowchart illustrating a method for facial landmark detection according to some disclosed embodiments.
Fig. 6 is an exemplary flowchart illustrating the training process of the multi-task convolutional neural network according to some disclosed embodiments.
Detailed description of the embodiments
Exemplary embodiments are described in detail in this section, and examples of these embodiments are illustrated in the accompanying drawings. Where appropriate, the same reference numerals refer to the same or similar parts throughout the drawings.
Fig. 1 is a schematic diagram illustrating an exemplary system 1000 for facial landmark detection according to some disclosed embodiments. In system 1000, facial landmark detection (hereinafter also referred to as the main task) is optimized jointly with at least one related/auxiliary task. Facial landmark detection refers to detecting the 2D positions, i.e., the 2D coordinates (x and y), of facial landmarks in a face image. Examples of facial landmarks may include, but are not limited to, the centers of the left and right eyes, the nose, and the left and right mouth corners of the face image. Examples of auxiliary tasks may include, but are not limited to, head pose estimation, demographic classification (e.g., gender classification), age estimation, facial expression recognition (e.g., smiling), or facial attribute inference (e.g., wearing glasses). It will be appreciated that the number or type of auxiliary tasks is not limited to those mentioned herein.
Referring again to Fig. 1, when implemented in hardware, system 1000 may include a feature extractor 100, a training unit 200, and a predictor 300. The feature extractor 100 may extract multiple feature maps from at least one facial region of the face image and/or from the whole face image. A shared facial feature vector may then be generated by the feature extractor 100 from the extracted feature maps.
The predictor 300 may predict the facial landmark locations of the face image from the shared facial feature vector extracted by the feature extractor 100. In addition, the predictor 300 may further predict, from the shared facial feature vector, the corresponding target of at least one auxiliary task associated with the facial landmark detection. In system 1000, facial landmark detection can thus be optimized jointly with the auxiliary tasks.
According to an embodiment, the feature extractor 100 may include a convolutional neural network. The network may include multiple convolution-pooling layers and a fully connected layer. In the network, each convolution-pooling layer may perform convolution and max-pooling operations, and the feature maps extracted by a preceding convolution-pooling layer are fed into the next convolution-pooling layer to extract feature maps different from the previously extracted ones. The fully connected layer may generate the shared facial feature vector from all of the extracted feature maps.
An example of the network is shown in Fig. 3, in which the convolutional neural network includes an input layer, multiple (for example, three) convolution-pooling layers, a convolutional layer, and a fully connected layer, where each convolution-pooling layer includes one or more convolutional layers and one or more pooling layers. It should be noted that the network shown is only an example, and the convolutional neural network in the feature extractor is not limited thereto. As shown in Fig. 3, a (for example) 40×40 gray-scale face image is fed into the input layer. The first convolution-pooling layer extracts feature maps from the input image. The second convolution-pooling layer then takes the output of the first layer as its input to generate different feature maps. This process continues through all three convolution-pooling layers. Finally, the multiple layers of feature maps are used by the fully connected layer to generate the shared facial feature vector. In other words, the shared facial feature vector is generated by performing multiple convolution and max-pooling operations. Each layer contains multiple neurons with local or global receptive fields, and the weights of the connections between the neurons of the convolutional neural network can be adjusted to train the network accordingly.
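For orientation, the forward pass of such a stack (convolution-pooling layers followed by a fully connected layer) can be sketched in plain NumPy. The kernel sizes, filter counts, tanh activation, and 100-dimensional feature size below are invented for the example and do not reflect the exact architecture of Fig. 3:

```python
import numpy as np

def conv2d(x, filters):
    """Valid 2-D convolution: x is (H, W, C_in), filters is (kh, kw, C_in, C_out)."""
    kh, kw, cin, cout = filters.shape
    H, W, _ = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1, cout))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw, :]
            out[i, j, :] = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2]))
    return out

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling."""
    H, W, C = x.shape
    H2, W2 = H // s, W // s
    return x[:H2 * s, :W2 * s, :].reshape(H2, s, W2, s, C).max(axis=(1, 3))

def forward(image, conv_filters, fc_weight):
    """Stack of convolution-pooling layers followed by a fully connected layer."""
    a = image
    for f in conv_filters:
        a = np.tanh(conv2d(a, f))     # nonlinearity after each convolution
        a = max_pool(a)               # max pooling after each convolution
    flat = a.reshape(-1)
    return np.tanh(fc_weight @ flat)  # shared facial feature vector

rng = np.random.default_rng(0)
img = rng.standard_normal((40, 40, 1))                # 40x40 gray-scale input
filters = [rng.standard_normal((5, 5, 1, 4)) * 0.1,   # sizes: 40 -> 36 -> 18
           rng.standard_normal((3, 3, 4, 8)) * 0.1,   #        18 -> 16 -> 8
           rng.standard_normal((3, 3, 8, 8)) * 0.1]   #         8 ->  6 -> 3
fc_W = rng.standard_normal((100, 3 * 3 * 8)) * 0.1
shared = forward(img, filters, fc_W)
print(shared.shape)  # (100,)
```

Every task later reads from this one `shared` vector, which is the mechanism by which the tasks constrain one another.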
According to an embodiment, system 1000 may further include a training unit 200. The training unit 200 may train the feature extractor with a predetermined training set so as to adjust the weights of the connections between the neurons of the convolutional neural network, such that the trained feature extractor can extract the shared facial feature vector. According to the embodiment of the present application shown in Fig. 2, the training unit 200 may include a sampler 201, a comparator 202, and a back-propagator 203.
As shown in Fig. 2, the sampler 201 may sample, from a predetermined training set, a training face image, its ground-truth landmark locations, and its ground-truth target for each auxiliary task. According to an embodiment, five ground-truth landmarks (i.e., the centers of the eyes, the nose, and the mouth corners) may be annotated directly on each training face image. According to another embodiment, the ground-truth target for each auxiliary task may be labeled by hand. For example, for gender classification, the ground-truth target may be labeled as female (F) or male (M). For facial attribute inference, e.g., wearing glasses, the ground-truth target may be labeled as wearing (Y) or not wearing (N). For head pose estimation, one of (0°, ±30°, ±60°) may be labeled, and for expression recognition, e.g., smiling, yes/no may be labeled accordingly.
The comparator 202 may compare the predicted facial landmark locations with the ground-truth landmark locations to generate a landmark error. The landmark error may be obtained by, for example, a least-squares method. The comparator 202 may also compare, for each auxiliary task, the target prediction with the corresponding ground-truth target, to generate at least one training-task error. According to another embodiment, the training-task error may be obtained by, for example, a cross-entropy method.
The back-propagator 203 may back-propagate the generated landmark error and all training-task errors through the convolutional neural network, so as to adjust the weights of the connections between the neurons of the convolutional neural network.
According to an embodiment, the training unit 200 may further include a determiner 204. The determiner 204 may determine whether the training process of the facial landmark detection has converged. According to another embodiment, the determiner 204 may further determine whether the training process of each task has converged, as will be discussed later.
Hereinafter, the components of the training unit 200 mentioned above will be discussed in detail. For purposes of illustration, an embodiment in which T tasks are trained jointly by the training unit 200 will be described. Among the T tasks, facial landmark detection (i.e., the main task) is denoted as r, and each of the at least one related/auxiliary task is denoted as a, where a ∈ A.
For each task, the training data is denoted as (x_i, y_i^t), with t = {1, ..., T} and i = {1, ..., N}, where N denotes the number of training samples. Specifically, for the facial landmark detection r, the training data is denoted as (x_i, y_i^r), where y_i^r is the vector of 2D coordinates of the five landmarks. For a task a, the training data is denoted as (x_i, y_i^a). In an embodiment, four auxiliary tasks p, g, w, and s are illustrated, which denote the inference of 'pose', 'gender', 'wearing glasses', and 'smiling', respectively. Accordingly, y_i^p takes one of five different poses (0°, ±30°, ±60°), while y_i^g, y_i^w, and y_i^s are binary attributes denoting female/male, not wearing/wearing glasses, and not smiling/smiling, respectively. Different weights are assigned to the main task r and to each auxiliary task a, denoted as W^r and {W^a}.
The objective function over all tasks is then formulated as follows, so as to optimize the main task r and the auxiliary tasks a jointly:
arg min_{W^r, {W^a}} Σ_{i=1}^{N} l^r(y_i^r, f(x_i; W^r)) + Σ_{a∈A} λ^a Σ_{i=1}^{N} l^a(y_i^a, f(x_i; W^a))    (1)
where f(x_i; W^t) is a linear function of x_i with weight vector W^t;
l(·) denotes a loss function;
λ^a denotes the importance coefficient of the error of the a-th task; and
x_i denotes the shared facial feature vector.
According to an embodiment, the least-squares function and the cross-entropy function are used as the loss functions l(·) of the main task r and of each auxiliary task a, respectively, to generate the corresponding landmark error and training-task errors. The objective function above can therefore be rewritten as follows:
arg min_{W^r, {W^a}} Σ_{i=1}^{N} ||y_i^r − (W^r)^T x_i||^2 − Σ_{a∈A} λ^a Σ_{i=1}^{N} y_i^a log p(y_i^a | x_i; W^a) + Σ_t ||W^t||_2^2    (2)
In equation (2), the first term ||y_i^r − (W^r)^T x_i||^2 is a least-squares term on the linear function (W^r)^T x_i. The second term is built on the posterior probability function p(y^a = j | x; W^a) = exp((w_j^a)^T x) / Σ_l exp((w_l^a)^T x), where w_j^a denotes the j-th column of the weight matrix W^a of task a. The third term penalizes large weights W = {W^r, {W^a}}.
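A minimal numeric reading of the objective in equation (2), assuming the shared feature vectors are given, might look as follows. The explicit `weight_decay` coefficient, the task names, and all shapes are illustrative choices, not part of the disclosure:

```python
import numpy as np

def multitask_loss(X, Y_r, Y_a, W_r, W_a, lambdas, weight_decay=1e-4):
    """Value of the joint objective: least squares for the landmarks,
    weighted cross entropy for each auxiliary task, plus a weight penalty.

    X: (N, d) shared feature vectors; Y_r: (N, 10) ground-truth landmarks;
    Y_a: dict task -> (N,) integer class labels; W_a: dict task -> (d, C).
    """
    ls = np.sum((Y_r - X @ W_r) ** 2)           # landmark least-squares term
    ce = 0.0
    for task, labels in Y_a.items():
        logits = X @ W_a[task]                  # (N, C)
        logits -= logits.max(axis=1, keepdims=True)
        logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        ce -= lambdas[task] * logp[np.arange(len(labels)), labels].sum()
    penalty = weight_decay * (np.sum(W_r ** 2)
                              + sum(np.sum(w ** 2) for w in W_a.values()))
    return ls + ce + penalty

rng = np.random.default_rng(2)
N, d = 8, 20
X = rng.standard_normal((N, d))
Y_r = rng.standard_normal((N, 10))
Y_a = {"gender": rng.integers(0, 2, N), "pose": rng.integers(0, 5, N)}
W_r = rng.standard_normal((d, 10)) * 0.1
W_a = {"gender": rng.standard_normal((d, 2)) * 0.1,
       "pose": rng.standard_normal((d, 5)) * 0.1}
loss = multitask_loss(X, Y_r, Y_a, W_r, W_a, {"gender": 1.0, "pose": 1.0})
print(loss > 0)
```

Note how the λ^a coefficients scale each auxiliary cross-entropy term, mirroring their role as importance coefficients in equation (2).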
According to an embodiment, the weights of all tasks are updated accordingly. Specifically, the weight matrix of the facial landmark detection is updated by gradient descent, W^r ← W^r − η ∂E^r/∂W^r, where η denotes the learning rate (e.g., η = 0.003) and E^r denotes the least-squares term of equation (2). In addition, the weight matrix of each task a can be computed in a similar fashion as W^a ← W^a − η λ^a ∂E^a/∂W^a.
The generated landmark error and training-task errors can then be back-propagated by the back-propagator 203 layer by layer through the convolutional neural network down to the lowest layer, so as to adjust the weights of the connections between the neurons of the convolutional neural network.
According to an embodiment of the present application, the errors can be back-propagated through the convolutional neural network by the following back-propagation strategy:
ε^{l−1} = σ′(x^{l−1}) ∘ ((W^l)^T ε^l)    (3)
In equation (3), ε^l denotes the errors at layer l, where l = 1, ..., L.
For example, ε^1 denotes the error at the lowest layer, and ε^2 denotes the error at the second lowest layer. The error at a lower layer is computed from the layer above it according to equation (3), where σ′(·) is the gradient of the activation function σ(·) of the network and ∘ denotes element-wise multiplication.
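For fully connected layers, the recursion of equation (3) can be sketched directly (the layer sizes and the tanh activation below are illustrative; convolutional layers would use the corresponding transposed-convolution form):

```python
import numpy as np

def backprop_errors(top_error, weights, activations, sigma_prime):
    """Propagate the top-layer error down through fully connected layers
    following eq. (3): eps^{l-1} = sigma'(x^{l-1}) * (W^l)^T eps^l.

    weights[l-1] maps layer l-1 to layer l; activations[l] = x^l.
    Returns [eps^0, ..., eps^L] with eps^L = top_error.
    """
    L = len(weights)
    eps = [None] * (L + 1)
    eps[L] = top_error
    for l in range(L, 0, -1):
        eps[l - 1] = sigma_prime(activations[l - 1]) * (weights[l - 1].T @ eps[l])
    return eps

tanh_prime = lambda x: 1.0 - np.tanh(x) ** 2   # gradient of the tanh activation

rng = np.random.default_rng(3)
sizes = [72, 100, 10]   # e.g. flattened maps -> shared features -> output
Ws = [rng.standard_normal((sizes[i + 1], sizes[i])) * 0.1 for i in range(2)]
xs = [rng.standard_normal(s) for s in sizes]   # per-layer activations
top = rng.standard_normal(10)                  # error at the top layer
errs = backprop_errors(top, Ws, xs, tanh_prime)
print([e.shape[0] for e in errs])  # [72, 100, 10]
```

Because the landmark error and all task errors enter at the top, the summed error signal reaching each lower layer is what couples the tasks during weight adjustment.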
The above training process is repeated until the determiner 204 determines that the training process of the facial landmark detection has converged. In other words, if the error is below a predetermined value, the training process is deemed to have converged. Through the above training process, the feature extractor 100 can extract a shared facial feature vector from a given face image. According to an embodiment, for any face image x^0, the trained feature extractor 100 extracts the shared feature vector x^l. The landmark locations are then predicted by (W^r)^T x^l, and the target predictions of the auxiliary tasks are obtained by p(y^a | x^l; W^a).
During the above training process, the at least one auxiliary task is trained simultaneously. However, different tasks have different loss functions and learning difficulties, and therefore different convergence rates. According to another embodiment, the determiner 204 may further determine whether the training process of an auxiliary task has converged.
Specifically, let E_val^a and E_tr^a denote the values of the loss function of task a on the validation set and on the training set, respectively. If the measure of a task exceeds a threshold ε, as follows, the task is stopped:
(k · med_{j=t−k}^{t} E_tr^a(j)) / (Σ_{j=t−k}^{t} E_tr^a(j) − k · min_{j=t−k}^{t} E_tr^a(j)) · (E_val^a(t) − min_{j=1}^{t} E_val^a(j)) / (λ^a · min_{j=1}^{t} E_val^a(j)) > ε    (4)
In equation (4), t denotes the current iteration, k denotes the training length (a window of recent iterations), and λ^a denotes the importance coefficient of the error of the a-th task. 'med' denotes the function computing the median. The first factor in equation (4) represents the trend of the training-task error of task a. If the training error drops rapidly within the window of length k, the value of the first factor is small, indicating that the training of the task can continue, because the task is still valuable; otherwise, the first factor is large and the task is more likely to be stopped. The second factor measures how far the current validation error has risen above its best value, scaled by λ^a. During training, an auxiliary task can therefore be switched off before it over-fits, i.e., "stopped early", before it starts to over-fit the training set and thereby harm the main task.
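Under the same notation (a training-loss window of length k, the coefficient λ^a, and a threshold ε), the task-wise early-stopping test of equation (4) might be sketched as follows; the loss histories are fabricated to show both outcomes:

```python
import numpy as np

def should_stop(E_tr, E_val, lam, k, eps):
    """Early-stopping measure in the spirit of eq. (4) for one auxiliary task.

    E_tr, E_val: lists of training/validation losses up to the current
    iteration t (E[j] is the loss at iteration j, 0-indexed here).
    """
    t = len(E_tr) - 1
    if t < k:
        return False                          # not enough history yet
    win = np.asarray(E_tr[t - k:t + 1])       # last k+1 iterations
    trend = k * np.median(win) / (win.sum() - k * win.min())
    gen = (E_val[-1] - min(E_val)) / (lam * min(E_val))
    return trend * gen > eps

# A task whose training loss keeps falling and whose validation loss is at
# its minimum should continue training:
falling = [1.0, 0.8, 0.6, 0.45, 0.35, 0.3]
val_ok = [1.1, 0.9, 0.7, 0.55, 0.45, 0.4]
print(should_stop(falling, val_ok, lam=1.0, k=4, eps=0.5))   # False

# A task whose training loss has plateaued while its validation loss rebounds
# (i.e., it is over-fitting) should be stopped:
flat = [1.0, 0.40, 0.39, 0.39, 0.38, 0.38]
val_bad = [1.1, 0.50, 0.45, 0.50, 0.60, 0.70]
print(should_stop(flat, val_bad, lam=1.0, k=4, eps=0.5))     # True
```

A larger λ^a makes the second factor smaller, so tasks deemed more important to the main task survive longer before being switched off.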
Through the above training process, the feature extractor 100 can extract a shared facial feature vector from any face image. For example, a face image x^0 is fed into the input layer of the convolutional neural network, for example, as shown in Fig. 3. Each convolutional layer of the CNN contains multiple sets of convolution filters and an activation function applied to the face image, and they are applied successively to project the face image to higher levels. In other words, the shared facial feature vector x^l is obtained, and the face image is projected to a higher level step by step, by learning a sequence of nonlinear mappings as follows:
x^l = σ((W^l)^T x^{l−1})    (5)
Here, σ(·) and W^l denote the nonlinear activation function and the filters to be learned in layer l of the CNN, respectively. Referring again to Fig. 3, at the estimation stage, the shared facial feature vector can be used for landmark detection and for the auxiliary/related tasks simultaneously.
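At the estimation stage, the joint prediction from the shared vector x^l — (W^r)^T x^l for the landmarks and a softmax posterior per auxiliary task — might look as follows. The feature dimension and the task names/class counts are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict(x_l, W_r, aux_weights):
    """Joint prediction from a shared feature vector x_l: landmarks via the
    linear regressor (W^r)^T x^l, each auxiliary task via its softmax
    posterior p(y^a | x^l; W^a)."""
    landmarks = W_r.T @ x_l   # 10 values: (x, y) of the 5 landmarks
    aux = {name: softmax(W_a.T @ x_l) for name, W_a in aux_weights.items()}
    return landmarks, aux

rng = np.random.default_rng(1)
x_l = rng.standard_normal(100)                   # shared facial feature vector
W_r = rng.standard_normal((100, 10))             # landmark regressor
aux_W = {"pose": rng.standard_normal((100, 5)),  # 5 pose bins (0, ±30, ±60 deg)
         "gender": rng.standard_normal((100, 2)),
         "glasses": rng.standard_normal((100, 2)),
         "smile": rng.standard_normal((100, 2))}
lm, aux = predict(x_l, W_r, aux_W)
print(lm.shape, {k: int(np.argmax(v)) for k, v in aux.items()})
```

All five outputs are produced in a single forward pass over the same features, which is what removes the need for cascaded per-landmark networks.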
It will be appreciated that system 1000 may be implemented with certain hardware, software, or a combination thereof. In addition, embodiments of the present invention may be adapted to a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical memory, and the like) containing computer program code.
In the case where system 1000 is implemented with software, system 1000 may comprise a general-purpose computer, a computer cluster, a mainframe computer, a computing device dedicated to providing online content, or a computer network comprising a group of computers operating in a centralized or distributed fashion. As shown in Fig. 4, system 1000 may include one or more processors (processors 102, 104, 106, etc.), a memory 112, a storage device 116, and a bus to facilitate the exchange of information among the various components of system 1000. Processors 102 to 106 may include a central processing unit ("CPU"), a graphics processing unit ("GPU"), or other suitable information processing devices. Depending on the type of hardware being used, processors 102 to 106 may include one or more printed circuit boards and/or one or more microprocessor chips. Processors 102 to 106 can execute sequences of computer program instructions to perform the various methods explained in greater detail below.
Memory 112 may include, among other things, a random access memory ("RAM") and a read-only memory ("ROM"). Computer program instructions can be stored in, accessed from, and read from memory 112 for execution by one or more of processors 102 to 106. For example, memory 112 may store one or more software applications. Further, memory 112 may store an entire software application or only a part of a software application that is executable by one or more of processors 102 to 106. It is noted that, although only one block is shown in Fig. 4, memory 112 may include multiple physical devices installed on a central computing device or on different computing devices.
A system for facial landmark detection has been described above. A method for facial landmark detection is described below with reference to Fig. 5 and Fig. 6.
Fig. 5 shows a schematic flowchart for facial landmark detection, and Fig. 6 shows a schematic flowchart of the training process of the multi-task convolutional neural network performed by the training unit 200.
In Fig. 5 and Fig. 6, methods 500 and 600 each comprise a series of steps that may be performed by one or more of processors 102 to 106, or by the respective modules/units of system 1000, to implement a data processing operation. For purposes of description, the following discussion is made with reference to the case in which each module/unit of system 1000 is made of hardware or of a combination of hardware and software. Those skilled in the art will appreciate that other suitable devices or systems are also applicable to carry out the following process, and that system 1000 is used merely as an illustration of the process.
As shown in Fig. 5, in step S501, the feature extractor 100 extracts multiple feature maps from at least one facial region of a face image. In another embodiment, in step S501, the feature maps may be extracted from the whole face image. Then, in step S502, a shared facial feature vector is generated from the feature maps extracted in step S501. In step S503, the facial landmark locations of the face image are predicted from the shared facial feature vector generated in step S502. According to another embodiment, the shared facial feature vector may be used to predict the corresponding target of at least one auxiliary task associated with the facial landmark detection, so that the target predictions of all the auxiliary tasks are obtained simultaneously.
According to an embodiment of the present application, the feature extractor comprises a convolutional neural network including multiple convolution-pooling layers and a fully connected layer. Each of the convolution-pooling layers is configured to perform convolution and max-pooling operations. In this embodiment, in step S501, the feature maps may be continuously extracted by the multiple convolution-pooling layers, with the feature maps extracted by a preceding convolution-pooling layer being input to the next convolution-pooling layer, so as to extract feature maps different from the previously extracted ones. In step S502, the shared facial feature vector may be generated by the fully connected layer from all of the feature maps extracted in step S501.
In this embodiment, method 500 further comprises a training step (not shown in Fig. 5), which will be discussed with reference to Fig. 6.
As shown in Fig. 6, in step S601, a training face image, its ground-truth landmark locations, and its ground-truth target for each auxiliary task are sampled from a predetermined training set. For the training face image, in step S602, its facial landmark prediction and the target predictions of all auxiliary tasks may be obtained accordingly from the predictor 300. Then, in step S603, the predicted facial landmark locations are compared with the ground-truth landmark locations to generate a landmark error. In step S604, for each auxiliary task, the target prediction is compared with the corresponding ground-truth target to generate at least one training-task error. Then, in step S605, the generated landmark error and all training-task errors are back-propagated through the convolutional neural network to adjust the weights of the connections between the neurons of the convolutional neural network. In step S606, it is determined whether one of the auxiliary tasks has converged. If "No", process 600 returns to step S601. If "Yes", the training process of that task is stopped in step S607 and the process proceeds to step S608. In step S608, it is determined whether the training process of the facial landmark detection has converged. If "Yes", process 600 ends; otherwise, process 600 returns to step S601.
Facial landmark detection can thus be optimized together with heterogeneous but subtly related tasks.
Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon understanding the basic inventive concept. The appended claims are intended to be construed to include the preferred examples and all such variations or modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make variations or modifications to the present invention without departing from the spirit and scope of the present invention. As such, if these variations or modifications belong to the scope of the claims and their equivalents, they also fall within the scope of the present invention.

Claims (20)

1. A method for detecting facial landmarks of a face image, comprising:
extracting a plurality of feature maps from at least one face region of the face image;
generating a shared facial feature vector from the extracted feature maps; and
predicting the facial landmark positions of the face image from the generated shared facial feature vector.
2. The method according to claim 1, wherein the facial landmarks comprise at least one selected from the group consisting of: eye centers, nose tip, and mouth corners of the face image.
3. The method according to claim 1, wherein, in the step of predicting, the shared facial feature vector is used to predict corresponding targets of at least one auxiliary task associated with the facial landmark detection, so as to simultaneously obtain target predictions for all of the auxiliary tasks.
4. The method according to claim 3, wherein the auxiliary tasks comprise at least one selected from the group consisting of: head pose estimation, gender classification, age estimation, facial expression recognition, and facial attribute inference.
5. The method according to claim 4, wherein the extracting and the generating are performed by a convolutional neural network comprising a plurality of convolution-pooling layers configured to perform convolution and max-pooling operations, and
wherein the step of extracting further comprises:
extracting the plurality of feature maps successively through the plurality of convolution-pooling layers, wherein the feature maps extracted by a preceding one of the convolution-pooling layers are input into a following one of the convolution-pooling layers so as to extract feature maps different from the previously extracted ones.
6. The method according to claim 5, wherein the convolutional neural network further comprises a fully connected layer, and, in the step of generating, the shared facial feature vector is generated by the fully connected layer from all of the extracted feature maps.
7. The method according to claim 6, wherein each layer of the convolutional neural network has a plurality of neurons, and wherein the method further comprises:
training the convolutional neural network on a predetermined training set so as to adjust the weights of the connections between the neurons of the convolutional neural network, such that the shared facial feature vector is generated by the convolutional neural network with the adjusted weights.
8. The method according to claim 7, wherein the step of training further comprises:
sampling, from the predetermined training set, a training face image, its ground-truth landmark positions, and its ground-truth targets for each of the auxiliary tasks;
comparing the predicted landmark positions with the ground-truth landmark positions to generate a landmark error;
comparing the target prediction for each auxiliary task with the corresponding ground-truth target to generate at least one training task error; and
back-propagating the generated landmark error and the generated training task errors through the convolutional neural network so as to adjust the weights of the connections between the neurons of the convolutional neural network;
repeating the sampling, the comparing, and the back-propagating until the generated landmark error is less than a first predetermined value and the generated training task errors are less than a second predetermined value.
9. The method according to claim 8, wherein the comparing to generate the landmark error is performed according to a least-squares procedure, and the comparing to generate the training task errors is performed according to a cross-entropy procedure.
10. The method according to claim 8, wherein, for each auxiliary task, the step of training further comprises:
sampling, from a predetermined validation set, a validation face image and its ground-truth targets for each of the auxiliary tasks;
comparing the target prediction with the ground-truth target to generate a validation task error; and
repeating the sampling and the comparing until the generated training task error is less than a third predetermined value and the generated validation task error is less than a fourth predetermined value.
11. The method according to claim 1, wherein, in the step of predicting, the predicted facial landmark positions of the face image are determined according to (W^r)^T x^l,
wherein W^r denotes the weights assigned to the facial landmark detection, x^l denotes the shared facial feature vector, and T denotes a transposition.
12. A system for detecting facial landmarks of a face image, comprising:
a feature extractor configured to:
extract a plurality of feature maps from at least one face region of the face image; and
generate a shared facial feature vector from the extracted feature maps; and
a predictor configured to predict the facial landmark positions of the face image from the shared facial feature vector generated by the feature extractor.
13. The system according to claim 12, wherein the predictor is further configured to simultaneously obtain, by using the shared facial feature vector, target predictions of at least one auxiliary task associated with the facial landmark detection.
14. The system according to claim 12, wherein the feature extractor further comprises a convolutional neural network, the convolutional neural network comprising:
a plurality of convolution-pooling layers configured to perform convolution and max-pooling operations, wherein the feature maps extracted by a preceding one of the convolution-pooling layers are input into a following one of the convolution-pooling layers so as to extract feature maps different from the previously extracted ones; and
a fully connected layer configured to generate the shared facial feature vector from all of the extracted feature maps.
15. The system according to claim 13, wherein each layer of the convolutional neural network has a plurality of neurons, and wherein the system further comprises:
a training unit configured to train the convolutional neural network on a predetermined training set so as to adjust the weights of the connections between the neurons of the convolutional neural network, such that the trained convolutional neural network is able to extract the shared facial feature vector.
16. The system according to claim 15, wherein the training unit further comprises:
a sampler configured to sample, from the predetermined training set, a training face image, its ground-truth landmark positions, and its ground-truth targets for each of the auxiliary tasks;
a comparator configured to compare the predicted landmark positions with the ground-truth landmark positions to generate a landmark error, and to compare the target prediction for each auxiliary task with the corresponding ground-truth target to generate at least one training task error; and
a back-propagator configured to back-propagate the generated landmark error and the training task errors through the convolutional neural network so as to adjust the weights of the connections between the neurons of the convolutional neural network.
17. The system according to claim 15, wherein the training unit further comprises:
a determiner configured to determine whether the training process of the facial landmark detection has converged and whether the training process of each task has converged.
18. The system according to claim 12, wherein the facial landmarks comprise at least one selected from the group consisting of: eye centers, nose tip, and mouth corners of the face image.
19. The system according to claim 13, wherein the auxiliary tasks comprise at least one selected from the group consisting of: head pose estimation, gender classification, age estimation, facial expression recognition, and facial attribute inference.
20. A method for training a convolutional neural network, the convolutional neural network performing facial landmark detection and being associated with at least one auxiliary task, the method comprising:
1) sampling, from a predetermined training set, a training face image, its ground-truth landmark positions, and its ground-truth targets for each of the auxiliary tasks;
2) comparing the predicted landmark positions with the ground-truth landmark positions to generate a landmark error;
3) comparing the target prediction for each auxiliary task with the corresponding ground-truth target to generate at least one training task error;
4) back-propagating the generated landmark error and all of the training task errors through the convolutional neural network so as to adjust the weights of the connections between the neurons of the convolutional neural network;
5) sampling, from a predetermined validation set, a validation face image and its ground-truth targets for each of the auxiliary tasks;
6) comparing the target prediction with the ground-truth target to generate a validation task error; and
7) determining whether the generated training task error is less than a first predetermined value and whether the generated validation task error is less than a second predetermined value; and, if so, terminating the training of the convolutional neural network, otherwise repeating steps 1) to 7).
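Claims 11 and 13 describe the two kinds of heads applied to the shared facial feature vector: a linear regressor (W^r)^T x^l for the landmark positions and, for each auxiliary task, a classifier reusing the same vector. A small NumPy sketch of that computation follows; the 100-dimensional feature size, the 5-landmark output, and the softmax auxiliary head are illustrative assumptions not fixed by the claims:

```python
import numpy as np

rng = np.random.default_rng(0)

x_l = rng.standard_normal(100)        # shared facial feature vector x^l
W_r = rng.standard_normal((100, 10))  # landmark-detection weights W^r

# Claim 11: predicted landmark positions are (W^r)^T x^l,
# here 5 landmarks as (x, y) coordinate pairs.
landmarks = W_r.T @ x_l

# Claim 13: an auxiliary task (e.g. gender classification) reuses the
# same shared vector with its own weights and a softmax over its classes.
W_a = rng.standard_normal((100, 2))

def softmax(z):
    e = np.exp(z - z.max())           # subtract max for numerical stability
    return e / e.sum()

posterior = softmax(W_a.T @ x_l)
```

Because every head is a linear map on the same x^l, training any one task's weights also shapes the shared trunk that produces x^l, which is the mechanism behind the joint optimization described in the specification.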
CN201480081241.1A 2014-08-21 2014-08-21 Method and system for the face critical point detection based on multitask Active CN106575367B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/000769 WO2016026063A1 (en) 2014-08-21 2014-08-21 A method and a system for facial landmark detection based on multi-task

Publications (2)

Publication Number Publication Date
CN106575367A true CN106575367A (en) 2017-04-19
CN106575367B CN106575367B (en) 2018-11-06

Family

ID=55350056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480081241.1A Active CN106575367B (en) 2014-08-21 2014-08-21 Method and system for the face critical point detection based on multitask

Country Status (2)

Country Link
CN (1) CN106575367B (en)
WO (1) WO2016026063A1 (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6750854B2 (en) 2016-05-25 2020-09-02 キヤノン株式会社 Information processing apparatus and information processing method
CN105957095B (en) * 2016-06-15 2018-06-08 电子科技大学 A kind of Spiking angular-point detection methods based on gray level image
US10289825B2 (en) * 2016-07-22 2019-05-14 Nec Corporation Login access control for secure/private data
US10467459B2 (en) 2016-09-09 2019-11-05 Microsoft Technology Licensing, Llc Object detection based on joint feature extraction
CN107871106B (en) * 2016-09-26 2021-07-06 北京眼神科技有限公司 Face detection method and device
JP6692271B2 (en) * 2016-09-28 2020-05-13 日本電信電話株式会社 Multitask processing device, multitask model learning device, and program
US10198626B2 (en) 2016-10-19 2019-02-05 Snap Inc. Neural networks for facial modeling
US10460153B2 (en) * 2016-11-15 2019-10-29 Futurewei Technologies, Inc. Automatic identity detection
CN106951840A (en) * 2017-03-09 2017-07-14 北京工业大学 A kind of facial feature points detection method
CN108073910B (en) * 2017-12-29 2021-05-07 百度在线网络技术(北京)有限公司 Method and device for generating human face features
US20210056292A1 (en) * 2018-05-17 2021-02-25 Hewlett-Packard Development Company, L.P. Image location identification
CN109145798B (en) * 2018-08-13 2021-10-22 浙江零跑科技股份有限公司 Driving scene target identification and travelable region segmentation integration method
US11954881B2 (en) 2018-08-28 2024-04-09 Apple Inc. Semi-supervised learning using clustering as an additional constraint
CN109635750A (en) * 2018-12-14 2019-04-16 广西师范大学 A kind of compound convolutional neural networks images of gestures recognition methods under complex background
CN110163080A (en) 2019-04-02 2019-08-23 腾讯科技(深圳)有限公司 Face critical point detection method and device, storage medium and electronic equipment
CN110163098A (en) * 2019-04-17 2019-08-23 西北大学 Based on the facial expression recognition model construction of depth of seam division network and recognition methods
WO2021036726A1 (en) * 2019-08-29 2021-03-04 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method, system, and computer-readable medium for using face alignment model based on multi-task convolutional neural network-obtained data
CN111191675B (en) * 2019-12-03 2023-10-24 深圳市华尊科技股份有限公司 Pedestrian attribute identification model realization method and related device
WO2022003982A1 (en) * 2020-07-03 2022-01-06 日本電気株式会社 Detection device, learning device, detection method, and storage medium
CN112820382A (en) * 2021-02-04 2021-05-18 上海小芃科技有限公司 Breast cancer postoperative intelligent rehabilitation training method, device, equipment and storage medium
US11776323B2 (en) 2022-02-15 2023-10-03 Ford Global Technologies, Llc Biometric task network


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1352436A (en) * 2000-11-15 2002-06-05 星创科技股份有限公司 Real-time face identification system
CN101673340A (en) * 2009-08-13 2010-03-17 重庆大学 Method for identifying human ear by colligating multi-direction and multi-dimension and BP neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080317354A1 (en) * 2001-05-31 2008-12-25 Fumiyuki Shiratani Image selection support system for supporting selection of well-photographed image from plural images
CN102831382A (en) * 2011-06-15 2012-12-19 北京三星通信技术研究有限公司 Face tracking apparatus and method
CN103824054A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascaded depth neural network-based face attribute recognition method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
顾佳玲 (GU, Jialing) et al.: "Incremental Convolutional Neural Network and Its Application in Face Detection" (增长式卷积神经网络及其在人脸检测中的应用), Journal of System Simulation (《系统仿真学报》) *

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145857A (en) * 2017-04-29 2017-09-08 深圳市深网视界科技有限公司 Face character recognition methods, device and method for establishing model
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN106951888A (en) * 2017-05-09 2017-07-14 安徽大学 The relative coordinate constrained procedure and localization method of human face characteristic point
CN106951888B (en) * 2017-05-09 2020-12-01 安徽大学 Relative coordinate constraint method and positioning method of human face characteristic point
CN107358149A (en) * 2017-05-27 2017-11-17 深圳市深网视界科技有限公司 A kind of human body attitude detection method and device
CN107358149B (en) * 2017-05-27 2020-09-22 深圳市深网视界科技有限公司 Human body posture detection method and device
CN107578055A (en) * 2017-06-20 2018-01-12 北京陌上花科技有限公司 A kind of image prediction method and apparatus
CN107578055B (en) * 2017-06-20 2020-04-14 北京陌上花科技有限公司 Image prediction method and device
CN108229288A (en) * 2017-06-23 2018-06-29 北京市商汤科技开发有限公司 Neural metwork training and clothes method for detecting color, device, storage medium, electronic equipment
CN108229288B (en) * 2017-06-23 2020-08-11 北京市商汤科技开发有限公司 Neural network training and clothes color detection method and device, storage medium and electronic equipment
CN107563279B (en) * 2017-07-22 2020-12-22 复旦大学 Model training method for adaptive weight adjustment aiming at human body attribute classification
CN107563279A (en) * 2017-07-22 2018-01-09 复旦大学 The model training method adjusted for the adaptive weighting of human body attributive classification
US11341631B2 (en) 2017-08-09 2022-05-24 Shenzhen Keya Medical Technology Corporation System and method for automatically detecting a physiological condition from a medical image of a patient
CN107423727A (en) * 2017-08-14 2017-12-01 河南工程学院 Face complex expression recognition methods based on neutral net
CN107704848A (en) * 2017-10-27 2018-02-16 深圳市唯特视科技有限公司 A kind of intensive face alignment method based on multi-constraint condition convolutional neural networks
CN108196535A (en) * 2017-12-12 2018-06-22 清华大学苏州汽车研究院(吴江) Automated driving system based on enhancing study and Multi-sensor Fusion
CN107992864A (en) * 2018-01-15 2018-05-04 武汉神目信息技术有限公司 A kind of vivo identification method and device based on image texture
CN110060296A (en) * 2018-01-18 2019-07-26 北京三星通信技术研究有限公司 Estimate method, electronic equipment and the method and apparatus for showing virtual objects of posture
CN108399373B (en) * 2018-02-06 2019-05-10 北京达佳互联信息技术有限公司 The model training and its detection method and device of face key point
CN108399373A (en) * 2018-02-06 2018-08-14 北京达佳互联信息技术有限公司 The model training and its detection method and device of face key point
CN110232304A (en) * 2018-03-06 2019-09-13 德韧营运有限责任公司 Isomery convolutional neural networks for more problem solvings
CN108416314B (en) * 2018-03-16 2022-03-08 中山大学 Picture important face detection method
CN108416314A (en) * 2018-03-16 2018-08-17 中山大学 The important method for detecting human face of picture
CN108615016B (en) * 2018-04-28 2020-06-19 北京华捷艾米科技有限公司 Face key point detection method and face key point detection device
CN108615016A (en) * 2018-04-28 2018-10-02 北京华捷艾米科技有限公司 Face critical point detection method and face critical point detection device
CN109147940A (en) * 2018-07-05 2019-01-04 北京昆仑医云科技有限公司 From the device and system of the medical image automatic Prediction physiological status of patient
CN109522910A (en) * 2018-12-25 2019-03-26 浙江商汤科技开发有限公司 Critical point detection method and device, electronic equipment and storage medium
CN109522910B (en) * 2018-12-25 2020-12-11 浙江商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN109829431A (en) * 2019-01-31 2019-05-31 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109829431B (en) * 2019-01-31 2021-02-12 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN111563397B (en) * 2019-02-13 2023-04-18 阿里巴巴集团控股有限公司 Detection method, detection device, intelligent equipment and computer storage medium
CN111563397A (en) * 2019-02-13 2020-08-21 阿里巴巴集团控股有限公司 Detection method, detection device, intelligent equipment and computer storage medium
CN109902641A (en) * 2019-03-06 2019-06-18 中国科学院自动化研究所 Face critical point detection method, system, device based on semanteme alignment
CN109902641B (en) * 2019-03-06 2021-03-02 中国科学院自动化研究所 Semantic alignment-based face key point detection method, system and device
CN110136828A (en) * 2019-05-16 2019-08-16 杭州健培科技有限公司 A method of medical image multitask auxiliary diagnosis is realized based on deep learning
CN110705419A (en) * 2019-09-24 2020-01-17 新华三大数据技术有限公司 Emotion recognition method, early warning method, model training method and related device
TWI753588B (en) * 2019-09-30 2022-01-21 大陸商深圳市商湯科技有限公司 Face attribute recognition method, electronic device and computer-readable storage medium
KR20220066661A (en) * 2020-11-16 2022-05-24 상명대학교산학협력단 Device and method for landmark detection using artificial intelligence
KR102538804B1 (en) * 2020-11-16 2023-06-01 상명대학교 산학협력단 Device and method for landmark detection using artificial intelligence
CN112488003A (en) * 2020-12-03 2021-03-12 深圳市捷顺科技实业股份有限公司 Face detection method, model creation method, device, equipment and medium

Also Published As

Publication number Publication date
WO2016026063A1 (en) 2016-02-25
CN106575367B (en) 2018-11-06

Similar Documents

Publication Publication Date Title
CN106575367B (en) Method and system for the face critical point detection based on multitask
US9530047B1 (en) Method and system for face image recognition
US20220392234A1 (en) Training neural networks for vehicle re-identification
JP6832504B2 (en) Object tracking methods, object tracking devices and programs
CN106358444B (en) Method and system for face verification
Anand Thoutam et al. Yoga pose estimation and feedback generation using deep learning
CN109902546A (en) Face identification method, device and computer-readable medium
CN106415594A (en) A method and a system for face verification
Khan et al. Human Gait Analysis: A Sequential Framework of Lightweight Deep Learning and Improved Moth‐Flame Optimization Algorithm
Purnapatra et al. Face liveness detection competition (livdet-face)-2021
Zhai et al. Face verification across aging based on deep convolutional networks and local binary patterns
Tran et al. Baby learning with vision transformer for face recognition
Guntor et al. Convolutional neural network (CNN) based gait recognition system using microsoft kinect skeleton features
Farag et al. Inductive Conformal Prediction for Harvest-Readiness Classification of Cauliflower Plants: A Comparative Study of Uncertainty Quantification Methods
CN111382712A (en) Palm image recognition method, system and equipment
Ma et al. Is a picture worth 1000 votes? Analyzing the sentiment of election related social photos
CN117292421B (en) GRU-based continuous vision estimation deep learning method
Pabiasz et al. SOM vs FCM vs PCA in 3D face recognition
Shashidhar et al. An Efficient method for Recognition of Occluded Faces from Images
Mudasar Azeem et al. MASK DETECTION USING DEEP LEARNING METHODS
Pham Gabor filter initialization and parameterization strategies in convolutional neural networks
Purbanugraha et al. Improvement Accuracy Identification and Learning Speed of Offline Signatures Based on SqueezeNet with ADAM Backpropagation
Hosamani et al. Data science: prediction and analysis of data using multiple classifier system
Hossain Detecting and Mitigating Adversarial Attack
Liu et al. Face Detection with Structural Coordinates for the Estimation of Patterns Using Machine Learning Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant