CN100426317C - Multiple attitude human face detection and track system and method - Google Patents
Abstract
This invention discloses a multi-pose face detection and tracking method and system for detecting multiple faces in an input video sequence and continuously tracking them across images, comprising the following steps: obtaining front and half-side face detection models respectively through sample training and determining an AAM face model; using these models to detect faces in the input video images and determine whether a face exists in a frame of image; and, if a face is detected, tracking and verifying the face in subsequent frames.
Description
Technical Field
The invention relates to a face detection and tracking system and method, and in particular to a multi-pose face detection and tracking system and method.
Background
The human face is one of the most convenient means of human-computer interaction in computer vision systems. Face detection determines the position, size, and other information of all faces in an image or image sequence, while face tracking continuously follows one or more detected faces in a video sequence. Face detection and tracking technology is not only a necessary prerequisite for technologies such as face recognition, expression recognition, and face synthesis, but also has wide application value in fields such as intelligent human-computer interaction, video conferencing, intelligent surveillance, and video retrieval.
The images targeted by the system are video sequences input from a video camera. The applicant has previously proposed a method and system for real-time detection and continuous tracking of human faces in a video sequence, Chinese patent application No. 200510135668.8, hereinafter referred to as document 1, which is incorporated herein by reference in its entirety. The method and system of that application adopt a face detection method based on an AdaBoost statistical hierarchical classifier to realize real-time detection of upright front faces, combined with a face tracking method based on Mean Shift and histogram features to realize a real-time face tracking system. Experimental results show that the system can detect faces with -20 to 20 degrees of in-depth rotation and -20 to 20 degrees of in-plane rotation, as well as faces of different skin colors, faces under different illumination conditions, faces wearing glasses, and so on. Face tracking is realized through skin color; the tracking algorithm is not affected by face pose, and side faces and rotated faces can be tracked as well.
However, the algorithm in the above patent application also has certain limitations. First, the algorithm only trains a detection model for the front face and cannot detect side faces, which means that detection and verification can only target front faces, greatly limiting the application range of the algorithm. Second, the algorithm tracks the face only through a skin color histogram, and the skin color features of the face are very easily disturbed by other skin-colored regions such as the neck and hands, or by regions of similar color such as yellow clothing; in the tracking result this appears as the tracking area occasionally jumping to the hands, the neck, or yellow clothing. Third, the size and position of the tracking area obtained by the original algorithm change drastically, and even if the face is kept still, the tracking result can shake noticeably. In addition, the algorithm cannot acquire further pose information about the face, such as its rotation angle or current approximate pose.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a multi-pose face detection and tracking system and method which can track and detect a multi-pose face, overcome interference from non-face regions close to the skin color of the face, ensure the stability of the continuous multi-pose face tracking and detection algorithm, acquire the rotation angle of the face, and output an accurate face size.
In order to solve the above technical problem, the present invention provides a multi-pose face detection and tracking method, comprising:
(1) respectively obtaining a face front side detection model and a half side detection model through face sample training, and determining an active appearance AAM face model;
(2) carrying out face detection on an input video image by using the face front and half side detection model to determine whether a face exists in a frame of image;
(3) if a face is detected in an image of a frame, tracking and verifying the face in a subsequent frame, comprising the steps of:
(31) tracking the face position in the previous frame image to obtain the initial position of the face in the current frame;
(32) calculating the translation speed of the human face by taking the obtained initial position as an initial value and utilizing the chromaticity difference between the current frame image and the previous frame image;
(33) estimating the approximate position of the face in the current frame according to the translation speed, and detecting near the position by using the face front model and the half-side detection model to verify the face;
(34) if the human face is detected near the position, the verification is passed, and the AAM human face model is adopted to calculate the affine transformation coefficient of the current human face, so as to obtain the characteristic parameters of the current human face.
Wherein the step (3) further comprises:
(35) and matching the key points of the face of the current frame and the face of the previous frame, and further correcting the calculated face translation speed and the characteristic parameters of the face of the current frame according to the matching result.
Wherein the step (3) further comprises:
(36) and updating the characteristic parameters of the face of the current frame, and using the parameters for tracking and verifying the next frame of image.
Wherein the step (34) further comprises: if no face is detected near the location, the verification fails and the tracking verification is performed in the next frame.
Wherein the step (34) further comprises: if the face verification is still not passed in the following frames, the tracking is stopped.
Wherein, further comprising the steps of:
(4) and (3) after the previous tracking target stops tracking, detecting in the subsequent images from the step (2) again until a new face is found, and continuing to track.
Wherein, the step (1) of respectively obtaining the face front and half-side detection models through face sample training comprises the following steps: firstly, training a multilayer detection model by using face samples of all postures, and then respectively training the face samples of the front side, the left side and the right side postures to obtain detection models of three postures.
Wherein, the step (2) of detecting the human face comprises the following steps: firstly, searching the image by adopting the detection models of all the postures, eliminating most of search windows, then respectively inputting the rest windows into the detection models of the three postures, and determining the approximate posture of the human face according to the detection result.
In order to solve the above technical problem, the present invention further provides a multi-pose face detection and tracking system, comprising:
the training module is used for respectively obtaining a face front side detection model and a half side detection model through face sample training and determining an AAM face model;
the detection module is used for carrying out face detection on the input video image according to the face front and half side detection model and determining whether a face exists in one frame of image;
the tracking module is used for tracking and verifying the face in the subsequent frame after the face is detected in the image of a certain frame, and comprises:
a unit for tracking the face position in the previous frame image and obtaining the preliminary position of the face in the current frame;
a unit for calculating the translation speed of the human face by using the obtained preliminary position as an initial value and using the chromaticity difference between the current frame image and the previous frame image;
a unit for estimating the approximate position of the face in the current frame according to the translation speed, and detecting near the position by using the face front model and the half-side detection model to verify the face;
and the unit is used for adopting the AAM face model to calculate the affine transformation coefficient of the current face after the face is detected near the position, and acquiring the characteristic parameters of the current face.
Wherein the tracking module further comprises:
and the unit is used for matching the key points of the face of the current frame and the face of the previous frame of image and further correcting the calculated translation speed of the face and the characteristic parameters of the face of the current frame according to the matching result.
The training module trains a multi-layer detection model by using the face samples in all postures, and trains the face samples in the front, left and right postures respectively to obtain detection models in three postures.
The detection module searches the image by adopting the detection models of all the postures, eliminates most of search windows, respectively inputs the rest windows into the detection models of the three postures, and determines the approximate posture of the human face according to the detection result.
The multi-pose face detection and tracking system and method of the invention can track and detect a multi-pose face, overcome interference from non-face regions similar to the skin color of the face, such as the neck, hands, or yellow clothing, ensure the stability of the continuous multi-pose face tracking and detection algorithm, acquire the rotation angle of the face, and output an accurate face size.
Drawings
FIG. 1 is a schematic structural diagram of a multi-pose face detection and tracking system according to an embodiment of the present invention;
FIG. 2 is a flow chart of a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a face detection and tracking result in a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of seven groups of micro-features selected by a face detection algorithm in a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 5 illustrates the calibration and collection of face samples in a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of 4 sets of multi-pose face detection results in a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 7 is a schematic flowchart of a face verification module in the multi-pose face detection and tracking method according to the embodiment of the present invention;
FIG. 8 is a schematic diagram illustrating a face verification result obtained by a first level of verification in a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram illustrating a result of face verification by a second level of verification in a multi-pose face detection and tracking method according to an embodiment of the present invention;
FIG. 10 is a schematic diagram illustrating an example of the calculation result of the affine coefficient of the AAM algorithm in the multi-pose face detection and tracking method according to the embodiment of the present invention;
FIG. 11 is a schematic diagram of the AAM-based face tracking result in the multi-pose face detection and tracking method according to the embodiment of the present invention;
FIG. 12 is a diagram illustrating a selection and tracking result of key points in a multi-pose face detection and tracking method according to an embodiment of the present invention;
fig. 13 is a schematic diagram illustrating an exemplary face detection and tracking result in a multi-pose face detection and tracking method according to an embodiment of the present invention.
Detailed Description
Referring to fig. 1, the present invention first provides a multi-pose face detection and tracking system, which includes a training module 100, a detection module 200, and a tracking module (not shown). Wherein:
the training module 100 is configured to obtain a face front detection model and half-side detection models (including right-side and left-side poses) through face sample training, and to determine an AAM (Active Appearance Models) face model;
the detection module 200 is configured to perform face detection on an input video image according to the face front and half side detection model, and determine whether a face exists in a frame of image;
the tracking module is used for tracking and verifying the face in the following frame after the face is detected in the image of a certain frame, and comprises:
a unit for tracking the face position in the previous frame image and obtaining the preliminary position of the face in the current frame;
a unit for calculating the translation speed of the human face by using the obtained preliminary position as an initial value and using the color difference between the current frame image and the previous frame image;
a unit for estimating the approximate position of the face in the current frame according to the translation speed, and detecting near the position by using the face front model and the half-side detection model to verify the face;
a unit for calculating the affine transformation coefficient of the current face by adopting the AAM face model after the face is detected near the position, and acquiring the characteristic parameters of the current frame face; and
and the unit is used for matching the key points of the face of the current frame and the face of the previous frame of image and further correcting the calculated translation speed of the face and the characteristic parameters of the face of the current frame according to the matching result.
According to the embodiment shown in fig. 1, referring to the training module 100, two sets of models need to be trained first, namely the front and half-side face detection models and the AAM face model (not shown in the figure). The training algorithm of the face detection models can adopt a multi-stage classifier based on the AdaBoost algorithm, trained with a large number of front and half-side face samples, and the size of the extracted face is 12 × 12. In addition, to ensure that the algorithm can recognize the left-side, front, and right-side poses of the face, this embodiment trains a left-side pose face detection model, a right-side pose face detection model, and a front pose face detection model, where the left-side and right-side models can be collectively referred to as the half-side face detection models, and the right-side model is obtained by mirroring the left-side model. Furthermore, to accelerate detection, this embodiment also trains a 15-layer all-pose face detection model, referred to as the first-stage detection model, using face samples of all poses; it performs a preliminary detection on the input image to roughly obtain face positions.
In the training module 100, the purpose of training the AAM face model is to calculate an affine transformation coefficient of an input face with respect to a standard face on the premise that the approximate position and approximate size of the face are known, and obtain a more accurate position, size, and rotation angle of the face.
Referring to the detection module 200: when performing face detection, this embodiment first uses the all-pose detection model to search the input image and eliminate most of the search windows, then inputs the remaining windows into the detection models of the three poses, returns the final candidate frames, and calculates a weight for each candidate frame according to the detection results. Generally, the detection model of each pose returns some candidate frames; neighboring candidate frames are merged, and the weights of the candidate frames returned for each pose are accumulated. If the front-face weight of a merged frame is the largest, the detected face is a front face; if the left-side weight is the largest, the detected face can be judged to be roughly a left-side face, and the approximate pose of the face is thereby determined.
Referring now to fig. 2, a flow chart of a multi-pose face detection and tracking method according to an embodiment of the present invention is shown.
Step 201: inputting a frame of image from the video camera; before a tracking target is obtained, each frame of the image is searched to detect whether a face exists;
the result of face detection is given at 301 in fig. 3, where the frame is the detected face frame.
Step 202: judging whether the face of the previous frame is tracked or not;
step 203: when the face is not tracked in the previous frame, performing multi-pose face detection on the current frame image, if one or more faces are found in the current frame image, performing step 204, otherwise, continuing face detection in the subsequent image;
step 204: and tracking the face detected in the previous frame in the next two frames of images, verifying the tracked face, and judging that the face really exists only after two continuous frames of a certain face pass the verification by an algorithm, and if a plurality of faces pass the verification, selecting the largest face to start tracking. The face verification is to detect the area where the tracked face is again through the reference detection module 200, and judge whether the tracked face is a real face;
step 205: starting tracking after the verification is passed;
after the face is determined to be tracked, continuously tracking the face in the subsequent frames, wherein the tracking process comprises the following steps:
step 206: tracking the previous frame of face by adopting a face tracking algorithm based on Mean Shift and a histogram to obtain the initial position of the current face;
step 207: the face position obtained by the tracking algorithm in the previous step is not accurate and is easily interfered by other areas which are closer to skin colors, such as necks, hands and the like, so that the translation speed of the face is obtained by using the chrominance information of the current frame image and the previous frame image;
step 208: estimating the approximate position of the face through the calculated translation speed, and performing face verification by using a face detection model, namely, searching near the position to judge whether the face exists in the region, wherein the face verification method is consistent with the face verification method in the step 205;
step 209: judging whether the face passes the verification;
if the face in the current area exists and the face verification is passed, the method comprises the following steps:
step 210: calculating an affine transformation coefficient of the current face by adopting an AAM algorithm, and acquiring characteristic parameters including an accurate position, a rotation angle and a size of the face;
step 211: and matching key points of the face of the current frame and the face of the previous frame to obtain more accurate translation speed, scale transformation, rotation coefficient and the like of the two faces in the two frames of images so as to obtain accurate characteristic parameters of the face of the current frame. Another purpose of this step is to keep the tracking result stable so that the tracking area does not appear to be noticeably jittery. The result of face tracking by verification is shown with reference to 302 in fig. 3;
step 212: updating the characteristic parameters of the face of the current frame, and continuously processing the next frame of image by using the characteristic parameters;
if, in step 209, no face is found in the tracking area, i.e. face verification fails, this indicates that the current tracking area does not contain a face or that the face pose has changed too much; the face is then tracked in subsequent frames and verification continues, including the following steps:
step 213: judging whether the continuous frames are not verified;
step 214: if the verification is passed, updating the characteristic parameters and continuing to track;
step 215: if the face verification still fails in the subsequent frames, the current tracking target is considered to be not the face, or the face pose changes too much, the tracking value is not high, and the tracking is stopped. An example of a face tracking result that fails to pass the verification is shown with reference to 303 in fig. 3.
And after the previous tracking target stops tracking, carrying out face detection again in the subsequent images until a new face is found, and then carrying out tracking again.
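For clarity, the per-frame control flow of steps 201-215 can be summarized in the following minimal sketch. All names in it (the state dictionary, the module bundle m and its functions detect, verify, mean_shift, translation, fit_aam, refine_keypoints, and the MAX_FAILED threshold) are hypothetical stand-ins introduced only for illustration; they are not the patented implementation, and the two-consecutive-frame start-up check of step 204 is omitted for brevity.

```python
MAX_FAILED = 5   # assumed threshold for "verification fails in several consecutive frames"

def process_frame(frame, state, m):
    if state["face"] is None:
        faces = m.detect(frame)                                          # steps 202-203
        if faces:
            largest = max(faces, key=lambda f: f["size"])                # step 204: largest face
            if m.verify(frame, largest):
                state["face"], state["failed"] = largest, 0              # step 205: start tracking
    else:
        rough = m.mean_shift(frame, state["prev"], state["face"])        # step 206: rough position
        vel = m.translation(frame, state["prev"], state["face"], rough)  # step 207
        cx, cy = state["face"]["center"]
        guess = dict(state["face"], center=(cx + vel[0], cy + vel[1]))   # step 208: predicted face
        if m.verify(frame, guess):                                       # step 209
            params = m.fit_aam(frame, guess)                             # step 210: affine fit
            state["face"] = m.refine_keypoints(frame, state["prev"], params)  # step 211
            state["failed"] = 0                                          # step 212: update parameters
        else:
            state["failed"] += 1                                         # steps 213-214
            if state["failed"] > MAX_FAILED:
                state["face"] = None                                     # step 215: stop tracking
    state["prev"] = frame
    return state
```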
The following description focuses on some key technical points in the processing procedure of the present invention.
First, the face detection algorithm in step 203 of the present invention will be described in further detail.
The face detection algorithm described in the present invention is basically consistent in principle with document 1 and adopts a face detection method based on an AdaBoost statistical hierarchical classifier, i.e. the AdaBoost-based face detection algorithm described in document 1 (P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2001, hereinafter referred to as document 2). First, a two-class "face/non-face" classifier is trained with a large number of "face" and "non-face" samples; this classifier can determine whether a rectangular window of a certain scale is a face. If the length of the rectangle is m and the width is n, the face detection flow is: continuously scale the image by a certain ratio, exhaustively search and judge all m × n pixel windows in the resulting series of images, input each window into the face/non-face classifier, keep the candidate windows identified as faces, merge candidates at neighboring positions with a post-processing algorithm, and output the positions, sizes, and other information of all detected faces.
Document 1 only considers the detection of front faces (see the standard face image shown at 501 and the cropped standard face shown at 502 in fig. 5), but the present invention also needs to realize detection of side faces so as to guarantee continuous tracking of multi-pose faces and the stability of the detection algorithm. The invention still extracts face features using the seven groups of micro-features shown in fig. 4, but images of faces in different poses differ greatly, which makes the micro-features at the same positions of faces in different poses differ greatly as well. This means that if the algorithm described in document 1 were still used to train one AdaBoost strong classifier on all positive samples, the training algorithm would have difficulty converging, and even if very many micro-features were selected by the weak classifiers at every level, the false alarm rate on negative samples would still be high. Therefore, multi-pose face detection is performed in two steps: first, a 15-layer detection model is trained with face samples of all poses, and then the samples of the three poses are trained separately, one detection model per pose.
In the present invention, about 4500 face images are collected, of which about 2500 are front faces, about 1000 are left-side faces, and about 1000 are right-side faces. The face samples are subjected to affine transformation, cropping, and segmentation following the standard face and cropping method mentioned in document 1 (see the face samples and calibration points shown at 503 in fig. 5 and the cropping result shown at 504), and all face regions are normalized to a size of 12 × 12. Let the distance between the two eyes be r and the center point of the line connecting the two eyes be (x_center, y_center). If the length and width of the rectangular region are both set to 2r, i.e. twice the distance between the two eyes, the coordinates (x_left, y_top, x_right, y_bottom) of the rectangular cropping region are:
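Formula (1), the exact cropping rule, is not reproduced in this text. The following sketch shows one plausible reading in which the 2r × 2r box is centered horizontally on the eye midpoint; the vertical placement (top_frac) is an assumed parameter introduced for illustration only, not a value taken from the patent.

```python
def crop_box(x_center, y_center, r, top_frac=0.5):
    """Hypothetical 2r x 2r cropping rule around the eye midpoint (x_center, y_center).

    The box is assumed to be centred horizontally on the eyes; `top_frac`, the share
    of the box placed above the eye line, is an assumed parameter and is NOT the
    value used in formula (1) of the patent.
    """
    x_left, x_right = x_center - r, x_center + r      # width 2r
    y_top = y_center - 2.0 * r * top_frac             # assumed vertical placement
    y_bottom = y_top + 2.0 * r                        # height 2r
    return x_left, y_top, x_right, y_bottom
```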
in order to enhance the detection robustness of the classifier on the rotation and size change of a human face at a certain angle, each sample is subjected to mirror image transformation, rotation at an angle of +/-20 degrees and size amplification by 1.1 times, so that each sample is expanded into five samples, and about 22500 positive samples are obtained in total. The anti-sample image is a large number of images without human faces, including landscape images, animals, characters and the like, and the total number of the images is 5400. The method for acquiring the characteristics of the inverse samples in the training process of each layer of AdaBoost classifiers is completely consistent with that described in the document 1, firstly, an inverse sample image is randomly selected, the size and the position of the inverse sample in the image are randomly determined, then, a corresponding area is cut out from the image, and the cut image is normalized to the size of 12 x 12 to obtain the inverse sample.
After all models are trained, the first-stage detection model has 15 layers, with a false alarm rate of 0.0022 and a classification error rate of 4.8% on the training positive samples. The relatively high positive-sample error rate and the false alarm rate still above 0.1% indicate that the feature data of samples in different poses differ considerably, so the model converges slowly in AdaBoost training; this is also the reason why separate models need to be trained for the different poses. The front-pose detection model has 18 layers, with a total false alarm rate of 2.6e-6 and a classification error rate of 4.1% on the training samples that pass the first-stage detection. The left-side pose detection model has 16 layers, with a total false alarm rate of 3.8e-7 and a classification error rate of 0.42% on the training samples that pass the first-stage detection. To save training time, the gray-level distributions of the left and right side faces are considered to be completely symmetrical, so no detection model is trained for the right-side pose; instead, the left-side pose detection model is mirrored to obtain the right-side pose face detection model. Among the training samples there are more front samples, many of which contain considerable interference, so their classification error rate is higher; the side samples are fewer and contain very little interference, so their classification error rate is very low.
When detecting faces, the invention first reduces the image at several scales. For example, for a 160 × 120 image, 9 scales are considered, with reduction factors of 1.5, 1.88, 2.34, 2.93, 3.66, 4.56, 5.72, 7.15, and 8.94; the corresponding face frame in the original image is at minimum 18 × 18 and at maximum 107 × 107. The first-stage detection model is then used to search each reduced image and eliminate most of the search windows, the remaining windows are input into the face detection models of the three poses, the final candidate frames are returned, and a weight is calculated for each candidate frame according to the detection results. Generally, the face detection model of each pose returns some candidate frames; neighboring candidate frames are merged, and the weights of the candidate frames returned for each pose are accumulated. If the front-face weight of a merged frame is the largest, the detected face is a front face; if the left-side weight is the largest, the detected face can be considered a left-side face, and the approximate pose of the face is thereby determined. Referring to fig. 6, several groups of multi-pose face detection results are shown schematically, where the detection results of different poses are labeled with boxes of different gray levels.
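The coarse-to-fine scan just described can be sketched as follows. The classifier objects (first_stage and the per-pose models), their passes()/weight() interface, the resize helper, and the merge_and_vote placeholder are hypothetical and only illustrate the control flow.

```python
# Nine reduction factors used for a 160 x 120 input, as listed above.
SCALES = [1.5, 1.88, 2.34, 2.93, 3.66, 4.56, 5.72, 7.15, 8.94]
WIN = 12  # the classifiers operate on 12 x 12 windows

def detect_multipose(image, first_stage, pose_models, resize):
    """Coarse-to-fine scan: first-stage all-pose cascade, then per-pose models."""
    candidates = []
    for s in SCALES:
        small = resize(image, 1.0 / s)
        h, w = small.shape[:2]
        for y in range(0, h - WIN + 1):
            for x in range(0, w - WIN + 1):
                window = small[y:y + WIN, x:x + WIN]
                if not first_stage.passes(window):          # rejects most windows cheaply
                    continue
                for pose, model in pose_models.items():     # 'front', 'left', 'right'
                    if model.passes(window):
                        box = (int(x * s), int(y * s), int(WIN * s))   # box in original image
                        candidates.append((box, pose, model.weight(window)))
    return merge_and_vote(candidates)

def merge_and_vote(candidates):
    # Placeholder: neighbouring candidate boxes would be merged and the weights
    # accumulated per pose; the pose with the largest accumulated weight wins.
    return candidates
```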
Secondly, the Mean Shift-based face tracking algorithm described in step 206 of the present invention is further described in detail:
in addition, the face detection algorithm is very time-consuming, and generally takes tens of milliseconds to complete the detection of all faces in a 320 × 240 image, so that the face detection cannot be performed on each frame of image of a real-time input video sequence, but the efficiency of the algorithm is greatly improved by tracking and verifying the detected faces, and the algorithm is ensured not to track other non-face targets.
The face tracking algorithm of the invention first adopts the Mean Shift and histogram-feature-based object tracking algorithm of Comaniciu et al. mentioned in document 1 (D. Comaniciu, V. Ramesh, and P. Meer, Kernel-Based Object Tracking, IEEE Trans. Pattern Analysis and Machine Intelligence, May 2003, 25(5): 564-577, hereinafter referred to as document 3) to track the detected face, and searches for the face position in the current frame image using the face position and size of the previous frame together with two groups (long-term and short-term) of local histogram features of the face, obtaining the coordinates of the center point of the face region. The advantages of this algorithm are its high efficiency and its insensitivity to face rotation and pose changes, and it can roughly acquire the position of the face center even when the face translates rapidly in the video. Its defects are also obvious: the tracking precision is not high, and although the face position can be obtained quickly, the coordinates of the obtained center point are not accurate enough and can still jitter even when the face is stationary, owing to noise and other interference. In addition, the algorithm uses skin color as the tracking feature, which means it may also track skin-colored areas such as the hands and neck.
Based on the advantages and disadvantages of the tracking algorithm, accurate estimation of human face translation, continuous verification of human face images and estimation of human face scale posture are added on the basis of a Mean Shift-based tracking result, the algorithm can be ensured to track the human face area, the tracking area precision is higher, and the accurate size, the rotation angle and the like of the human face can be obtained.
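A minimal sketch of one Mean Shift relocation step on a chrominance histogram, in the spirit of document 3, is given below; the long-term/short-term local histogram features of document 1, all boundary handling, and the bin count are omitted or assumed for illustration.

```python
import numpy as np

def mean_shift_step(chroma, center, half, target_hist, bins=16):
    """One window relocation step on a single 8-bit chrominance channel.

    `target_hist` is the (normalised) colour histogram of the face from a previous
    frame; the window is pulled toward pixels whose colours are under-represented
    in the current candidate histogram.  Bounds checking is omitted for brevity.
    """
    x0, y0 = center
    ys, xs = np.mgrid[y0 - half:y0 + half + 1, x0 - half:x0 + half + 1]
    patch = chroma[y0 - half:y0 + half + 1, x0 - half:x0 + half + 1]
    idx = patch.astype(np.int64) * bins // 256              # histogram bin of each pixel
    cand = np.bincount(idx.ravel(), minlength=bins).astype(float) + 1e-6
    cand /= cand.sum()
    weights = np.sqrt(target_hist[idx] / cand[idx])         # Bhattacharyya-style weights
    new_x = float((weights * xs).sum() / weights.sum())
    new_y = float((weights * ys).sum() / weights.sum())
    return int(round(new_x)), int(round(new_y))
```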
Third, the translation estimation described in step 207 of the present invention is described in detail as follows:
the Mean Shift-based face tracking algorithm can quickly acquire the rough position of the center point of the face in the current frame. The purpose of the translation estimation is, on the basis of this rough position, to accurately estimate the translation vector of the face between adjacent frames by combining the chrominance distribution characteristics of the face with the Lucas-Kanade inverse algorithm (I. Matthews and S. Baker, Active Appearance Models Revisited, International Journal of Computer Vision, Vol. 60, No. 2, November 2004, pp. 135-164, hereinafter referred to as document 4), and to determine the accurate position of the face center point.
The Lucas-Kanade algorithm can quickly calculate the translation speed of a point in a continuous image sequence. Given a point A with coordinate $x_A$, let $I(x_A, t_k)$ denote the luminance of that point in the k-th frame image. If the translation velocity of A between two adjacent frames is $u = (u, v)$, then:

$$I(x - u\,\delta t,\ t_k) = I(x,\ t_{k-1}), \qquad \delta t = t_k - t_{k-1} \qquad (2)$$

In many cases an initial value $u_0$ of the velocity of A is known; for example, the translation velocity of the point in the previous frame of the continuous image sequence can be used as the initial value, so that $u = u_0 + \Delta u$, where $\Delta u$ is usually relatively small. Considering the points in a neighborhood of point A, whose average translation velocity can be regarded as very close to $u$, the sum of squared pixel differences over all points of the neighborhood N between the two adjacent frames can be calculated:
The $u$ that minimizes this expression serves as the estimate of the translational velocity of A. If $\Delta u$ is small, the expression can be Taylor-expanded with terms above first order in $\delta t$ discarded, as follows:
The expansion is then differentiated with respect to $\Delta u$, the derivative is set to zero, and the equation is solved to obtain:
where H is the Hessian matrix:
the velocity estimation formula described above can only accommodate situations where $\Delta u$ is small, because an approximate first-order Taylor expansion is used. To ensure that the algorithm can estimate a relatively large translation velocity, multiple iterations need to be performed: the translation velocity estimated in the previous iteration is used as the initial value of the new iteration step, and the new translation velocity estimated in each iteration is superposed on the previous one, that is:
$$u_n = u_{n-1} + \Delta u_n \qquad (7)$$
where $u_n$ is the total velocity after the n-th iteration and $\Delta u_n$ is the velocity increment obtained in the n-th iteration. In addition, processing at multiple resolutions is required: the translation velocity is first estimated at a lower resolution, this velocity is used as the initial value of the higher-resolution estimation, and a more accurate velocity is then calculated.
According to equation (7), the initial value of each iteration is the value calculated in the previous iteration, so the H matrix and its inverse would have to be recomputed in every iteration, which is very time-consuming; the invention therefore adopts the Lucas-Kanade inverse algorithm to improve the efficiency of the algorithm.
Take the nth iteration as an example:
$$I(x - u_n\,\delta t,\ t_k) = I(x,\ t_{k-1}) = I(x - u_{n-1}\,\delta t - \Delta u_n\,\delta t,\ t_k) \qquad (8)$$
Moving $\Delta u_n$ to the other side of the equation gives:
$$I(x - u_{n-1}\,\delta t,\ t_k) = I(x + \Delta u_n\,\delta t,\ t_{k-1}) \qquad (9)$$
From this, the formula for calculating $\Delta u_n$ is obtained:
where H is the Hessian matrix:
the H matrix in this formulation is fixed throughout the iteration process, and its inverse can be calculated once before the iterations begin and need not be recomputed afterwards. Thus only

$$\frac{I(x - u_{n-1}\,\delta t,\ t_k) - I(x,\ t_{k-1})}{\delta t}$$

and $\Delta u_n$ need to be computed in each iteration, which greatly reduces the amount of computation.
The size of the face in a video sequence changes drastically. To ensure that the estimation algorithm can still calculate the translation velocity quickly when the face is very large, faces of different scales are first normalized, i.e. scaled to the same size. The current frame image is scaled according to the size of the face tracked in the previous frame so that the face region is approximately 16 × 16. Then, with the velocity estimated by the Mean Shift algorithm as the initial value of the inverse algorithm, the translation velocity is calculated between the two reduced frames. The images are first processed at a lower resolution: they are reduced by a further factor of two so that the face is approximately 8 × 8, the neighborhood N of the face center point is taken as its 8 × 8 neighborhood, and the translation velocity is estimated with the inverse algorithm; the estimated velocity is then doubled and the translation velocity is re-estimated on the 16 × 16 face region. Finally, the total velocity is converted back to the translation velocity of the face center point in the original video.
When implementing the translation estimation, not only gray-level information but also the skin color information of the face must be considered: the RGB components of the input image are converted into YUV space, and the three components are each fed into the velocity estimation formula. In addition, to reduce the influence of illumination changes on the face, all luminance values are divided by a relatively large number so as to reduce the weight of the luminance Y and emphasize the contribution of the two chrominance components U and V; in practice, this processing noticeably improves the accuracy of the velocity estimation when the face moves rapidly.
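The translation estimation of equations (2)-(10) can be sketched as follows for a single channel; in practice the three YUV channels would be accumulated with the luminance down-weighted as described above. The nearest-pixel warp, the fixed iteration count, and the stop threshold are simplifications assumed only for illustration.

```python
import numpy as np

def estimate_translation(prev, curr, center, half, u0, iters=10):
    """Inverse Lucas-Kanade translation estimate (equations (7)-(10)), one channel.

    The gradient and Hessian are computed once on the previous frame, so each
    iteration only re-samples the current frame and accumulates the residual.
    Window bounds handling is omitted; delta_t is taken as 1.
    """
    y0, x0 = center
    template = prev[y0 - half:y0 + half, x0 - half:x0 + half].astype(np.float64)
    gy, gx = np.gradient(template)                          # gradient of I(x, t_{k-1})
    H = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])      # fixed Hessian
    H_inv = np.linalg.inv(H)                                # inverted once, before iterating

    u = np.asarray(u0, dtype=np.float64)                    # initial value from Mean Shift
    for _ in range(iters):
        dx, dy = int(round(u[0])), int(round(u[1]))
        warped = curr[y0 - half - dy:y0 + half - dy,
                      x0 - half - dx:x0 + half - dx].astype(np.float64)  # I(x - u, t_k)
        err = warped - template                             # I(x - u, t_k) - I(x, t_{k-1})
        b = np.array([np.sum(gx * err), np.sum(gy * err)])
        du = H_inv @ b                                      # delta u_n of equation (10)
        u = u + du                                          # accumulate, equation (7)
        if np.hypot(du[0], du[1]) < 1e-2:
            break
    return u                                                # velocity u in the sense of eq. (2)
```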
Fourthly, the face verification described in step 205 and step 208 of the present invention is described in detail:
in the aforementioned document 1, the face detection algorithm can only detect upright front faces, and the tracking algorithm can only acquire the face region and cannot determine the rotation angle or pose of the face. Therefore, when performing face verification, only after a target has been tracked continuously for hundreds of frames without a front face being detected in the tracking region is the target judged not to be a face and tracking stopped. The disadvantage is that if a non-face target such as the neck or a hand is tracked, the system takes tens of seconds to react, which also greatly affects the performance of the system.
The face verification module of the invention remedies these defects of the original system: the new face detector can detect upright front and side faces, and the subsequent AAM-based face affine coefficient estimation algorithm can obtain the rotation angle of the face, so continuous verification of the tracked face becomes possible, i.e. in every frame it is judged whether the tracking area is a face, the result is output if it is, and tracking is stopped if verification fails in several consecutive frames. Thus, when the system tracks a non-face area, it can respond within about 1 second and stop tracking the target.
Referring to fig. 7, a detailed flowchart of the face verification module is shown. The specific process is as follows:
step 701: and combining the scale and the rotation angle of the face of the previous frame, the previously calculated translation parameters and the input image of the current frame.
Step 702: and roughly determining the position, size and rotation angle of the face of the current frame.
Step 703: and (5) cutting and normalizing the face area to obtain a 12 x 12 image.
Affine transformation is carried out on the current frame image by the parameters, and cutting and size normalization processing are carried out, so that a 12 x 12 image is obtained.
Step 704: inputting the image into a multi-pose face detection model, judging whether the image is a real face, if so, entering step 705, and if not, entering step 706. And if the weights of all the attitude detectors are zero, the input image is not a human face, and the neighborhood of the position of the human face of the current frame also needs to be searched.
Step 705: and returning the human face posture after the verification is passed.
Step 706: searching for the face again within a small neighborhood and scale range. Combining the known face size and rotation angle, the search is carried out over a small range of scales, the candidate face frames passed by all pose detectors are merged, and the pose corresponding to the largest weight is taken as the pose of the face in the current frame. If any candidate face frame is found, step 707 is entered; if not, step 708 is entered.
Step 707: and merging the candidate faces, and returning the new position, scale and posture of the face in the original image.
Step 708: the verification failed. The current search area does not contain the face or the face pose changes too much, and the face verification fails.
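A compact sketch of this two-level verification flow (steps 701-708) is given below; the detectors dictionary, warp_crop, and search_nearby are hypothetical helpers standing in for the pose detectors, the affine cropping of step 703, and the neighborhood search of step 706.

```python
def verify_face(frame, face, detectors, warp_crop, search_nearby):
    """Two-level verification (steps 701-708).

    `detectors` maps a pose name to a hypothetical classifier with passes()/weight(),
    `warp_crop` performs the affine transform, cropping and 12 x 12 normalisation of
    step 703, and `search_nearby` is the neighbourhood/scale search of step 706.
    """
    patch = warp_crop(frame, face, size=12)                            # steps 701-703
    weights = {pose: det.weight(patch)
               for pose, det in detectors.items() if det.passes(patch)}   # step 704
    if weights:
        return max(weights, key=weights.get), face                    # step 705: pose returned

    candidates = search_nearby(frame, face, detectors)                # step 706: second level
    if candidates:
        best = max(candidates, key=lambda c: c["weight"])             # step 707: merge / choose
        return best["pose"], best["box"]
    return None                                                       # step 708: verification failed
```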
Two examples of face verification are given below, with a particular image being used for illustration.
Fig. 8 is a schematic diagram of a face verification result by the first-level verification. In fig. 8, 801 indicates a previous frame image and a tracking result, 802 indicates a current frame image, and 803 indicates a clipped 12 × 12 image. The image, although not a full frontal face, passes all face detectors and the pose is identified as frontal, since such an algorithm can detect a range of angles of a face with plane rotation.
Fig. 9 is a schematic diagram of a face verification result obtained by the second-level verification. In fig. 9, 901 indicates the previous frame image and tracking result, 902 the current frame image, 903 the normalized face, and 904 the result of the second-level verification. This figure shows an example in which the first-level verification fails and the second-level verification passes: the translation velocity estimate is biased, so the normalized image lies to the left of the real face and the first-level verification fails; in the second-level verification, affine transformation and cropping are again applied to the input image, but the cropped region is larger than in the first level, the face is searched within this region, the candidate results are merged, and the detected face frame is shown at 904.
Fifth, the AAM-based human face affine coefficient estimation described in step 210 of the present invention is further described in detail.
The face frame output by the face verification algorithm can include all facial organs, but the scale and rotation angle still use the previous frame's result, so a face with an excessively large rotation angle cannot pass verification and the algorithm cannot handle in-plane rotation of the face. To ensure that the algorithm can track a face rotated by any angle, the invention also provides an affine transformation coefficient estimation algorithm based on a simplified AAM to obtain the rotation, translation, scaling coefficients, and so on of the face in the current frame.
The AAM is a parameter model based on Principal Component Analysis (PCA), target shape characteristics and color distribution characteristics, and aims to obtain the shape, affine transformation coefficients and the like of a target area through a model trained in advance. AAM is widely used in the fields of face modeling and face localization, for example, document 4 uses an AAM algorithm to obtain contour information of each organ of a face.
The purpose of the AAM-based face affine coefficient estimation in the present invention is to obtain the size and rotation angle of the tracked face, that is, to calculate four affine transformation coefficients $a = \{a_i\},\ i = 0, 1, 2, 3$, covering only the three transformations of translation, scaling, and rotation:
according to the formula, the invention does not need to know the contour information of each organ of the human face. Therefore, the AAM model in document 4 can be simplified, and only the gray PCA model needs to be trained for the gray features of the face, and the input face is searched by using the AAM model including only the gray model, and the affine transformation coefficient of the face is calculated.
In addition, the pixel distributions of faces in different poses differ, so a separate AAM is trained for each of the three poses. First, the face samples used for face detection are cropped, scale-normalized, and gray-level normalized to obtain several thousand 16 × 16 face images, cropped in the same way as for face detection; about two thousand of them are front faces, about 1000 left-side, and about 1000 right-side. The training and positioning process of the AAM is described below, taking the front face as an example.
Let a face image be $A(x)$, where $x$ represents a point in the 16 × 16 image. PCA is performed on all training samples to obtain the mean face $A_0$, the m largest eigenvalues, and the corresponding m eigenvectors $A_i,\ i = 1, 2, \ldots, m$; an arbitrary front face image is then approximately represented as a linear combination of $A_0$ and the $A_i$:
$$A(x) \approx A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x)$$
where $\lambda_i$ are the linear weighting coefficients of $A(x)$.
The face image input to the AAM positioning algorithm is $I(x)$, obtained from the face center position, face size, and rotation angle of the previous frame returned by the face verification algorithm. Appropriate $\lambda_i$ and affine transformation coefficients $a = \{a_i\},\ i = 0, 1, 2, 3$ need to be calculated so that $I(x)$ matches the trained AAM, i.e. so as to minimize the following:
where $I(x, a)$ is the image obtained by affine transformation of $I(x)$, and $a$ is obtained by iterative processing using the Lucas-Kanade inverse algorithm. In each iteration $\Delta a$ is obtained as:
as described in document 4, a technique of spatial projection is adopted to eliminate λ in the above formulaiThe computational load of the minimization iterative process is simplified. The space spanned by the vector Ai is denoted as sub (A)i),AiIs denoted as sub (A)i)⊥Then the above equation can be written as:
wherein the first term is in sub (A)i)⊥Calculated above, comprising AiCan be omitted since they are at sub (A)i)⊥The spatial projections are all zero, i.e.:
first term and λ in the above formulaiIndependently, the minimum value can be calculated for the first term to obtain a proper affine coefficient, and then the minimum value is calculated for the second term to calculate lambdai:
The minimization of the first term can be realized with the Lucas-Kanade inverse algorithm:
where
$$\frac{\partial A_0}{\partial a} = \nabla A_0\,\frac{\partial x}{\partial a},$$
and thus:
wherein the Hessian matrix H is:
where
$$\left[\nabla A_0\,\frac{\partial x}{\partial a_j}\right]_{\mathrm{sub}(A_i)^{\perp}}$$
is given by:
the partial derivatives of $x$ with respect to the affine transformation coefficients $a$ are respectively:
the inverse of H in the above formula can also be calculated in advance, since H is determined entirely by the mean image trained in the AAM and the coordinates of the points in the 16 × 16 image. Only $I(x, a)$ and $a$ need to be updated continuously during the iteration, so the efficiency of the algorithm is greatly improved.
The steps of the whole AAM positioning algorithm are (a sketch combining these steps is given after the list):
Pre-computation:
(2) for each point in the 16 × 16 image, computing:
(4) computing the Hessian matrix and its inverse;
Iterative processing:
(5) computing I(x, a) from the current a;
(6) computing the image difference A_0(x) - I(x, a) and Δa;
(7) computing the new affine transformation coefficients a + Δa;
Subsequent computation:
(8) returning the a obtained when the iterations finish and computing the linear coefficients λ_i.
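The following sketch puts the pre-computation and iteration steps together for the four-coefficient transform illustrated earlier. The warp_sample helper (which resamples the input image on the 16 × 16 grid given a), the assumption that the eigenface basis is orthonormal, and the stop threshold are illustrative assumptions rather than details taken from the patent.

```python
import numpy as np

def project_out(sd, basis):
    """Remove from each steepest-descent image its projection onto the eigenface
    basis (assumed orthonormal), so the first term no longer involves lambda_i."""
    coeffs = np.einsum('ihw,khw->ik', sd, basis)
    return sd - np.einsum('ik,khw->ihw', coeffs, basis)

def fit_aam(image, a, A0, basis, warp_sample, n_iters=20):
    """Simplified AAM fit for the 4-coefficient transform sketched earlier."""
    a = np.asarray(a, dtype=np.float64)
    h, w = A0.shape
    # Pre-computation, steps (1)-(4): gradient, Jacobian, steepest descent, Hessian.
    gy, gx = np.gradient(A0)
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    Jx = np.stack([np.ones_like(xs), np.zeros_like(xs), xs, -ys])   # d(x')/d(a0..a3)
    Jy = np.stack([np.zeros_like(xs), np.ones_like(xs), ys, xs])    # d(y')/d(a0..a3)
    sd = project_out(gx[None] * Jx + gy[None] * Jy, basis)          # in sub(A_i)-perp
    H_inv = np.linalg.inv(np.einsum('ihw,jhw->ij', sd, sd))

    # Iteration, steps (5)-(7).
    for _ in range(n_iters):
        I_warp = warp_sample(image, a, (h, w))                      # I(x, a)
        err = A0 - I_warp                                           # A0(x) - I(x, a)
        da = H_inv @ np.einsum('ihw,hw->i', sd, err)
        a = a + da                                                  # a + delta a
        if np.abs(da).max() < 1e-3:
            break

    # Step (8): linear coefficients lambda_i of the final residual.
    lam = np.einsum('khw,hw->k', basis, warp_sample(image, a, (h, w)) - A0)
    return a, lam
```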
An AAM is likewise trained for each of the left-side and right-side faces. Referring to fig. 10, an example of the affine coefficient calculation result of the AAM algorithm is shown. The figure shows the positioning result of the AAM algorithm: the black box is the face box determined by the face detection algorithm; the input shown at 1001 in fig. 10 is a front face, 1002 a left-side face, and 1003 a right-side face; the white box is the box obtained after the affine transformation, and for ease of viewing, the positions of the two eyes, calculated from formula (1) and the white box, are marked with a '+' in inverted color. Referring to fig. 11, a schematic diagram of the AAM-based face tracking result shows three images of a sequence in which the rotation angles of the faces are large, yet the tracking algorithm can still track the rotated faces and accurately reflect the angles.
Sixthly, the tracking of the key points of the face in step 211 of the present invention will be described in further detail.
The translation velocity estimation, face verification, and affine coefficient calculation are all performed at a reduced face resolution. This improves the efficiency of the algorithm but lowers the accuracy of the obtained face parameters, because the resolution of the original image is much higher. As a result, the output face position, scale, angle, and other data still deviate slightly from the true values; even if the face in a sequence is perfectly still, the face position, size, and angle obtained by this module jitter noticeably. To solve this problem, a face key-point tracking module is added at the end of the system. It adopts the same translation estimation method, based on the Lucas-Kanade inverse algorithm, that is used after the Mean Shift face tracking step, exploiting the color information of the pixels in the neighborhood of each key point; the initial translation velocity is set from the AAM positioning result, the translation velocity of each key point is then calculated between the input images of adjacent frames, and the final position and other parameters of the face are determined.
Fig. 12 is a schematic diagram of the key-point selection and tracking result. The key points are determined as shown at 1201 in fig. 12: the frame drawn in the previous frame is the face frame, the five points A, B, C, D, E are the key points, A is the center point, and B, C, D, E are the midpoints of the lines connecting A to the four vertices of the face frame. 1202 in fig. 12 shows the current frame image and the face frame determined by the AAM; the corresponding five key points are A', B', C', D', E'. The coordinates of these points are used as the initial values of the translation estimation, each point considers the pixels in its 5 × 5 neighborhood, and the translation velocity of each key point is calculated, yielding the new points A'', B'', C'', D'', E'', as shown at 1203 in fig. 12. If the face rotates noticeably between adjacent frames, determining the face position from the key-point translation velocities alone may fail to reflect the rapid rotation, because the distribution of the neighborhood pixels of the corresponding key points no longer satisfies the translation relation (2) and the accuracy of the translation estimate drops; the estimated velocity of A'' at 1203 in fig. 12 is not accurate enough. A compromise is therefore adopted: the coordinates of A'B'C'D'E' and A''B''C''D''E'' are weighted and summed to obtain the new points A'''B'''C'''D'''E''', as shown at 1204 in fig. 12, and these points finally determine the position, outer frame, rotation angle, and size of the face. The square box shown at 1204 in fig. 12 is the face box; the four line segments shown are the final output of the system, the intersection of their extensions is the center point of the box, and if the side length of the box is len, the distances from the two end points of each segment to the center point are len/2 and len respectively.
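The key-point construction and the weighted blending can be sketched as follows; track_point (a per-point translation estimator over a 5 × 5 neighborhood) and the blend weight BLEND are assumptions introduced only for illustration.

```python
BLEND = 0.5   # assumed weight between the AAM prediction and the tracked position

def keypoints_from_box(cx, cy, half):
    """A is the face-box centre; B..E are midpoints of the segments joining A
    to the four corners of the box."""
    corners = [(cx - half, cy - half), (cx + half, cy - half),
               (cx - half, cy + half), (cx + half, cy + half)]
    return [(cx, cy)] + [((cx + x) / 2.0, (cy + y) / 2.0) for x, y in corners]

def refine_keypoints(prev, curr, prev_pts, aam_pts, track_point):
    """Blend the AAM-predicted points A'..E' with the re-tracked points A''..E''."""
    final_pts = []
    for p, p_aam in zip(prev_pts, aam_pts):
        p_trk = track_point(prev, curr, p, init=p_aam)          # tracked over a 5x5 patch
        final_pts.append((BLEND * p_aam[0] + (1 - BLEND) * p_trk[0],
                          BLEND * p_aam[1] + (1 - BLEND) * p_trk[1]))
    return final_pts
```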
The whole multi-pose face detection and tracking system has been demonstrated in multiple scenes and on multiple occasions, and has been combined with face recognition, three-dimensional face synthesis and other programs to build several demonstration applications. According to tests in multiple respects, the face detection method provided by the invention can detect faces with up to 50 degrees of in-depth rotation and up to 20 degrees of in-plane rotation, head-up faces of 0-30 degrees, head-down faces of 0-30 degrees, faces of different skin colors, faces under different illumination conditions, faces wearing glasses, and the like. It can track frontal and half-side faces as well as in-plane rotated faces at any angle; the tracking algorithm is stable, is not disturbed by non-face regions with a color similar to the face skin, such as the neck and hands, can obtain the rotation angle of the face, and outputs the accurate size of the face.
The algorithm of the invention is highly efficient. According to the test results, on a P4 2.8 GHz computer the processing time per frame is 8 ms-15 ms when the algorithm tracks a face in a 320 x 240 image; the CPU occupancy is no more than 12% when processing 320 x 240 video at a frame rate of 10 fps, and no more than 18% when processing 640 x 480 video at 10 fps. Fig. 13 is a schematic diagram of an example set of face detection and tracking results, where the first image is the face detection result, and the last image is an example of failed verification, represented by four black line segments.
Aiming at the limitations of the original algorithms, the invention proposes several improvements, overcomes their defects, achieves a more stable and reliable tracking result, and maintains very high operation efficiency. The method can detect multiple frontal and half-side upright human faces in a shooting scene in real time, selects the largest face, continuously tracks it using the Mean Shift-based tracking algorithm and the Lucas-Kanade back calculation algorithm, calculates the affine transformation coefficients between the tracked face and the trained face model using the AAM-based face model, and determines the size and rotation angle of the tracked face.
Claims (12)
1. A multi-pose face detection and tracking method is characterized by comprising the following steps:
(1) respectively obtaining a face front side detection model and a half side detection model through face sample training, and determining an active appearance face model;
(2) carrying out face detection on an input video image by using the face front side and half side detection models to determine whether a face exists in a frame of image;
(3) if a face is detected in an image of a frame, tracking and verifying the face in a subsequent frame, comprising the steps of:
(31) tracking the face position in the previous frame image to obtain the initial position of the face in the current frame;
(32) calculating the translation speed of the human face by taking the obtained initial position as an initial value and utilizing the chromaticity difference between the current frame image and the previous frame image;
(33) estimating the approximate position of the face in the current frame according to the translation speed, and detecting near the position by using the face front side and half side detection models to verify the face;
(34) if the face is detected near the position, the verification is passed, and the affine transformation coefficient of the current face is calculated by adopting the active appearance face model, so as to obtain the characteristic parameters of the current frame face.
2. The method of claim 1, wherein step (3) further comprises:
(35) and matching the key points of the face of the current frame and the face of the previous frame, and further correcting the calculated face translation speed and the characteristic parameters of the face of the current frame according to the matching result.
3. The method of claim 2, wherein step (3) further comprises:
(36) and updating the characteristic parameters of the face of the current frame, and using the parameters for tracking and verifying the next frame of image.
4. The method of claim 1, wherein said step (34) further comprises: if no face is detected near the location, the verification fails and the tracking verification is performed in the next frame.
5. The method of claim 4, wherein said step (34) further comprises: if the face verification is still not passed in the following frames, the tracking is stopped.
6. The method of claim 5, further comprising the step of:
(4) and (3) after the previous tracking target stops tracking, detecting in the subsequent images from the step (2) again until a new face is found, and continuing to track.
7. The method of claim 1, wherein the step of obtaining the face front side and half side detection models respectively through face sample training in step (1) comprises: firstly training a multi-layer detection model using face samples of all poses, and then training the face samples of the front, left and right poses respectively to obtain detection models for the three poses.
8. The method of claim 1, wherein the face detection step of step (2) comprises: firstly searching the image with the detection model trained on all poses to eliminate most of the search windows, then inputting the remaining windows into the detection models of the three poses respectively, and determining the approximate pose of the face according to the detection results.
9. A multi-pose face detection and tracking system, comprising:
the training module is used for respectively obtaining a face front side detection model and a half side detection model through face sample training and determining an active appearance face model;
the detection module is used for carrying out face detection on the input video image according to the face front side and half side detection models and determining whether a face exists in one frame of image;
the tracking module is used for tracking and verifying the face in the subsequent frame after the face is detected in the image of a certain frame, and comprises:
a unit for tracking the face position in the previous frame image and obtaining the preliminary position of the face in the current frame;
a unit for calculating the translation speed of the human face by using the obtained preliminary position as an initial value and using the chromaticity difference between the current frame image and the previous frame image;
a unit for estimating the approximate position of the face in the current frame according to the translation speed, and detecting near the position by using the face front side and half side detection models to verify the face;
and the unit is used for calculating the affine transformation coefficient of the current face by adopting the active appearance face model after the face is detected near the position, and acquiring the characteristic parameters of the current frame face.
10. The system of claim 9, wherein the tracking module further comprises:
and the unit is used for matching the key points of the face of the current frame and the face of the previous frame of image and further correcting the calculated translation speed of the face and the characteristic parameters of the face of the current frame according to the matching result.
11. The system of claim 9, wherein the training module is configured to first train a multi-layer detection model using face samples of all poses, and then train the face samples of the front, left and right poses respectively to obtain detection models for the three poses.
12. The system of claim 11, wherein the detection module first searches the image with the detection model trained on all poses to eliminate most of the search windows, then inputs the remaining windows into the detection models of the three poses respectively, and determines the approximate pose of the face according to the detection results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB200610113423XA CN100426317C (en) | 2006-09-27 | 2006-09-27 | Multiple attitude human face detection and track system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1924894A CN1924894A (en) | 2007-03-07 |
CN100426317C true CN100426317C (en) | 2008-10-15 |
Family
ID=37817523
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB200610113423XA Active CN100426317C (en) | 2006-09-27 | 2006-09-27 | Multiple attitude human face detection and track system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100426317C (en) |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101325691B (en) | 2007-06-14 | 2010-08-18 | 清华大学 | Method and apparatus for tracing a plurality of observation model with fusion of differ durations |
CN101216973B (en) * | 2007-12-27 | 2011-08-17 | 北京中星微电子有限公司 | An ATM monitoring method, system, and monitoring device |
CN101499128B (en) * | 2008-01-30 | 2011-06-29 | 中国科学院自动化研究所 | Three-dimensional human face action detecting and tracing method based on video stream |
JP4577410B2 (en) * | 2008-06-18 | 2010-11-10 | ソニー株式会社 | Image processing apparatus, image processing method, and program |
CN101576953B (en) * | 2009-06-10 | 2014-04-23 | 北京中星微电子有限公司 | Classification method and device of human body posture |
CN101739676B (en) * | 2009-12-04 | 2012-02-22 | 清华大学 | Method for manufacturing face effigy with ultra-low resolution |
CN101794385B (en) * | 2010-03-23 | 2012-11-21 | 上海交通大学 | Multi-angle multi-target fast human face tracking method used in video sequence |
CN101968846B (en) * | 2010-07-27 | 2013-05-15 | 上海摩比源软件技术有限公司 | Face tracking method |
CN103544478A (en) * | 2013-10-09 | 2014-01-29 | 五邑大学 | All-dimensional face detection method and system |
EP3198558A4 (en) * | 2014-09-25 | 2018-04-18 | Intel Corporation | Facilitating efficient free in-plane rotation landmark tracking of images on computing devices |
CN104318211A (en) * | 2014-10-17 | 2015-01-28 | 中国传媒大学 | Anti-shielding face tracking method |
CN105138956B (en) * | 2015-07-22 | 2019-10-15 | 小米科技有限责任公司 | Method for detecting human face and device |
CN105405094A (en) * | 2015-11-26 | 2016-03-16 | 掌赢信息科技(上海)有限公司 | Method for processing face in instant video and electronic device |
CN106251294B (en) * | 2016-08-11 | 2019-03-26 | 西安理工大学 | A kind of single width faces the virtual multi-pose generation method of facial image |
CN106503682B (en) * | 2016-10-31 | 2020-02-04 | 北京小米移动软件有限公司 | Method and device for positioning key points in video data |
CN106650624A (en) * | 2016-11-15 | 2017-05-10 | 东软集团股份有限公司 | Face tracking method and device |
CN106650682B (en) * | 2016-12-29 | 2020-05-01 | Tcl集团股份有限公司 | Face tracking method and device |
CN106991688A (en) * | 2017-03-09 | 2017-07-28 | 广东欧珀移动通信有限公司 | Human body tracing method, human body tracking device and electronic installation |
CN108664850B (en) * | 2017-03-30 | 2021-07-13 | 展讯通信(上海)有限公司 | Human face posture classification method and device |
CN107993250A (en) * | 2017-09-12 | 2018-05-04 | 北京飞搜科技有限公司 | A kind of fast multi-target pedestrian tracking and analysis method and its intelligent apparatus |
CN108875333B (en) * | 2017-09-22 | 2023-05-16 | 北京旷视科技有限公司 | Terminal unlocking method, terminal and computer readable storage medium |
CN109754383A (en) * | 2017-11-08 | 2019-05-14 | 中移(杭州)信息技术有限公司 | A kind of generation method and equipment of special efficacy video |
CN108197613B (en) * | 2018-02-12 | 2022-02-08 | 天地伟业技术有限公司 | Face detection optimization method based on deep convolution cascade network |
CN108510061B (en) * | 2018-03-19 | 2022-03-29 | 华南理工大学 | Method for synthesizing face by multiple monitoring videos based on condition generation countermeasure network |
CN109064489A (en) * | 2018-07-17 | 2018-12-21 | 北京新唐思创教育科技有限公司 | Method, apparatus, equipment and medium for face tracking |
CN109325964B (en) * | 2018-08-17 | 2020-08-28 | 深圳市中电数通智慧安全科技股份有限公司 | Face tracking method and device and terminal |
CN110909568A (en) * | 2018-09-17 | 2020-03-24 | 北京京东尚科信息技术有限公司 | Image detection method, apparatus, electronic device, and medium for face recognition |
CN109670474B (en) * | 2018-12-28 | 2023-07-25 | 广东工业大学 | Human body posture estimation method, device and equipment based on video |
CN113228626B (en) * | 2018-12-29 | 2023-04-07 | 浙江大华技术股份有限公司 | Video monitoring system and method |
CN112101063A (en) * | 2019-06-17 | 2020-12-18 | 福建天晴数码有限公司 | Skew face detection method and computer-readable storage medium |
US11805225B2 (en) | 2020-06-10 | 2023-10-31 | Plantronics, Inc. | Tracker activation and deactivation in a videoconferencing system |
CN112084856A (en) * | 2020-08-05 | 2020-12-15 | 深圳市优必选科技股份有限公司 | Face posture detection method and device, terminal equipment and storage medium |
CN112188140A (en) * | 2020-09-29 | 2021-01-05 | 深圳康佳电子科技有限公司 | Face tracking video chat method, system and storage medium |
CN112364808A (en) * | 2020-11-24 | 2021-02-12 | 哈尔滨工业大学 | Living body identity authentication method based on FMCW radar and face tracking identification |
CN112614168B (en) * | 2020-12-21 | 2023-08-29 | 浙江大华技术股份有限公司 | Target face tracking method and device, electronic equipment and storage medium |
CN113705444A (en) * | 2021-08-27 | 2021-11-26 | 成都玻尔兹曼智贝科技有限公司 | Facial development analysis and evaluation method and system |
CN114187216B (en) * | 2021-11-17 | 2024-07-23 | 海南乾唐视联信息技术有限公司 | Image processing method, device, terminal equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5850469A (en) * | 1996-07-09 | 1998-12-15 | General Electric Company | Real time tracking of camera pose |
US6741756B1 (en) * | 1999-09-30 | 2004-05-25 | Microsoft Corp. | System and method for estimating the orientation of an object |
CN1794265A (en) * | 2005-12-31 | 2006-06-28 | 北京中星微电子有限公司 | Method and device for distinguishing face expression based on video frequency |
CN1794264A (en) * | 2005-12-31 | 2006-06-28 | 北京中星微电子有限公司 | Method and system of real time detecting and continuous tracing human face in video frequency sequence |
Also Published As
Publication number | Publication date |
---|---|
CN1924894A (en) | 2007-03-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100426317C (en) | Multiple attitude human face detection and track system and method | |
Wang et al. | Automatic laser profile recognition and fast tracking for structured light measurement using deep learning and template matching | |
Lin et al. | Hierarchical part-template matching for human detection and segmentation | |
EP2192549B1 (en) | Target tracking device and target tracking method | |
CN111914664A (en) | Vehicle multi-target detection and track tracking method based on re-identification | |
US8406470B2 (en) | Object detection in depth images | |
JP4625074B2 (en) | Sign-based human-machine interaction | |
JP5848341B2 (en) | Tracking by monocular 3D pose estimation and detection | |
US10885667B2 (en) | Normalized metadata generation device, object occlusion detection device and method | |
US9639748B2 (en) | Method for detecting persons using 1D depths and 2D texture | |
US20060204035A1 (en) | Method and apparatus for tracking a movable object | |
Jammalamadaka et al. | Has my algorithm succeeded? an evaluator for human pose estimators | |
CN102214309B (en) | Special human body recognition method based on head and shoulder model | |
Mikolajczyk et al. | Face detection in a video sequence-a temporal approach | |
Zhang et al. | Fast moving pedestrian detection based on motion segmentation and new motion features | |
CN112419317A (en) | Visual loopback detection method based on self-coding network | |
Gürel et al. | Design of a face recognition system | |
CN100426318C (en) | AAM-based object location method | |
Stenger et al. | Estimating 3D hand pose using hierarchical multi-label classification | |
Nam et al. | Pedestrian detection system based on stereo vision for mobile robot | |
CN108985216B (en) | Pedestrian head detection method based on multivariate logistic regression feature fusion | |
Polat et al. | A nonparametric adaptive tracking algorithm based on multiple feature distributions | |
CN115345902A (en) | Infrared image dim target detection tracking method and system based on machine learning | |
Dong et al. | Ellipse regression with predicted uncertainties for accurate multi-view 3d object estimation | |
Vasu et al. | Vehicle tracking using a human-vision-based model of visual similarity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
TR01 | Transfer of patent right |
Effective date of registration: 20180408 Address after: 100191 Xueyuan Road, Haidian District, Haidian District, Beijing, No. 607, No. six Patentee after: Beijing Vimicro AI Chip Technology Co Ltd Address before: 100083, Haidian District, Xueyuan Road, Beijing No. 35, Nanjing Ning building, 15 Floor Patentee before: Beijing Vimicro Corporation |