WO2015070764A1 - Method and apparatus for face localization - Google Patents
Method and apparatus for face localization
- Publication number
- WO2015070764A1 · PCT/CN2014/090943 · CN2014090943W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- window
- sub
- module
- image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/754—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries involving a deformation of the sample pattern or of the reference pattern; Elastic matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
Definitions
- the present invention relates to the field of human-computer interaction technologies, and in particular, to a method and apparatus for face location.
- the invention provides a method and device for face positioning, which improves the fitting precision.
- the invention provides a method for face positioning, comprising:
- the obtaining of the face detection area information from the rough positioning image of the face includes:
- the sub-windows output by the online learning classifier are subjected to non-maximum suppression (NMS) processing to obtain the face detection area information.
- inputting the sub-windows whose image variance value is smaller than the preset variance threshold into the online learning classifier and obtaining the sub-windows output by the classifier specifically includes:
- the local shape fitting method is specifically a supervised sequence fitting method
- the supervised sequence fitting method is specifically:
- Step a: extract shape information of each part of the face from the face detection area information, and use the extracted shape information as the initial shape of each part;
- Step b: extract current feature descriptors at the calibration points of the current shape of each facial part; the current feature descriptors together constitute a current feature description vector;
- Step c: using the current feature description vector as an index number, look up the corresponding update matrix in the update matrix library, update the shape information of each part of the current face according to that matrix to obtain the updated shape information of each part, and replace the shape information used in step b with the updated shape information;
- Step d: determine whether the iteration count exceeds the preset maximum number of iteration steps, or whether the vector norm of the difference between the last two shape errors is below the preset vector-norm error threshold; if not, return to step b, otherwise proceed to step e;
- Step e: obtain accurate shape information of each part of the face.
- preferably, the method further includes: obtaining an objective function for each part of the face by a structure learning method according to the accurate shape information of each part, and optimizing the objective functions to obtain the optimized position of each part of the face.
- preferably, the method further includes: tracking the motion position of each part of the face across two consecutive frames according to the optimized positions, and updating the online learning classifier according to the motion position of each part of the face.
- the invention also provides a device for locating a face, comprising:
- an image acquisition module configured to acquire the user's original image through a camera and send the original image to the coarse positioning module;
- the coarse positioning module is connected to the image acquisition module and is configured to coarsely position the user's original image, obtain a rough positioning image of the face, and send the rough positioning image to the detection area module;
- the detection area module is connected to the coarse positioning module, and configured to obtain face detection area information according to the rough positioning image of the face, where the face detection area information includes position information of each part of the face;
- the fitting module is connected to the detection area module, and is configured to obtain accurate shape information of each part of the face by the local shape fitting method according to the face detection area information.
- the detection area module specifically includes:
- a sliding window module configured to divide the rough positioning image of the face into a plurality of sub-windows
- a variance filtering module is connected to the sliding window module and configured to calculate the image variance value of each sub-window and compare it with a preset variance threshold; if the variance is smaller than the preset threshold, the sub-window is determined to contain the target area and is accepted, otherwise it is rejected;
- An online learning module is connected to the variance filtering module, and configured to input a sub-window whose image variance value is smaller than the preset variance threshold into an online learning classifier, to obtain a sub-window output by the online learning classifier;
- the NMS module is connected to the online learning module, and is configured to perform NMS processing on the sub-window output by the online learning classifier to obtain face detection area information.
- preferably, the device further includes:
- an optimization module, connected to the fitting module and configured to obtain an objective function for each part of the face according to the accurate shape information of each part, and to optimize the objective functions to obtain the optimized position of each part of the face.
- preferably, the device further includes:
- an online update module, connected to the optimization module and configured to track the motion position of each part of the face across two consecutive frames according to the optimized positions, and to update the online learning classifier according to the motion position of each part.
- by implementing the above embodiments, the present invention obtains a rough positioning image of the face from the user image captured by the camera, derives the face detection area information from it, and then obtains accurate shape information of each part of the face through the local shape fitting method according to that information, improving the fitting accuracy.
- FIG. 1 is a schematic flow chart of a method for face positioning according to an embodiment of the present invention
- FIG. 2 is a schematic flow chart of another embodiment of a method for locating a face according to the present invention.
- FIG. 3 is a schematic flow chart of still another embodiment of a method for locating a face according to the present invention.
- FIG. 4 is a schematic structural diagram of a device for locating a face according to an embodiment of the present invention.
- FIG. 5 is a schematic structural view of another embodiment of a device for locating a face according to the present invention.
- FIG. 6 is a schematic structural diagram of an update matrix library submodule according to an embodiment of the present invention.
- referring to FIG. 1, the schematic flow of the face localization method includes:
- Step S101 Acquire an original image of the user through the camera.
- the original image of the user is preprocessed; the preprocessing includes operations such as noise removal and illumination equalization.
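- as an illustration of this preprocessing, a minimal sketch; the text names only "noise removal and illumination equalization", so the specific filters below (Gaussian blur, histogram equalization) are assumptions.

```python
import cv2

def preprocess(frame_bgr):
    """Denoise and equalize illumination on the captured camera frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)  # noise removal
    return cv2.equalizeHist(gray)             # illumination equalization
```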
- Step S102 The user original image is roughly positioned to obtain a rough positioning image of the face.
- coarse face detection and positioning are performed on the user's original image using Haar features and the AdaBoost algorithm; then, based on the skin-color distribution of the face, a skin-color filter is applied to eliminate false detection areas, and the detected face area is cropped to obtain the rough positioning image of the face.
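- purely for illustration, a sketch of this coarse-positioning step built on OpenCV's stock Haar/AdaBoost cascade; the cascade file, the YCrCb skin-colour bounds, and the 40% skin-coverage rule are assumptions, not the patented method.

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def coarse_face_image(frame_bgr):
    """Haar/AdaBoost detection, then a skin-colour check to drop false hits."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        roi = frame_bgr[y:y + h, x:x + w]
        ycrcb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCrCb)
        skin = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
        if cv2.countNonZero(skin) > 0.4 * w * h:  # enough skin pixels
            return roi  # cropped rough positioning image of the face
    return None
```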
- Step S103 Obtain face detection area information according to the rough positioning image of the face, and the face detection area information includes position information of each part of the face.
- the position information of each part of the face includes left eye position information, right eye position information, nose position information, and mouth position information.
- step S103 includes:
- Step 1: divide the rough positioning image of the face into several sub-windows (that is, at least two sub-windows) and calculate the image variance value of each sub-window;
- Step 2: for any sub-window, compare its image variance value with the preset variance threshold; if the variance is smaller than the threshold, the sub-window is determined to contain the target area and is accepted, otherwise it is rejected (see the integral-image sketch after this list);
- Step 3: input the accepted sub-windows (that is, those whose image variance value is smaller than the preset variance threshold) into the online learning classifier and obtain the sub-windows output by the classifier;
- Step 4: perform NMS (Non-maximal Suppression) processing on the sub-windows output in step 3 to obtain the face detection area information.
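- steps 1–2 can be made cheap with integral images, which give each sub-window's variance in constant time; the sketch below assumes a window size, stride, and threshold that the text does not specify, and follows the text in accepting windows whose variance is below the threshold.

```python
import numpy as np

def subwindow_variances(gray, win=24, stride=4):
    """Yield ((x, y, w, h), variance) for every win×win sub-window,
    computed from integral images of the pixels and squared pixels."""
    g = gray.astype(np.float64)
    s1 = np.pad(np.cumsum(np.cumsum(g, 0), 1), ((1, 0), (1, 0)))
    s2 = np.pad(np.cumsum(np.cumsum(g * g, 0), 1), ((1, 0), (1, 0)))
    n = float(win * win)
    for y in range(0, gray.shape[0] - win + 1, stride):
        for x in range(0, gray.shape[1] - win + 1, stride):
            sm = s1[y + win, x + win] - s1[y, x + win] - s1[y + win, x] + s1[y, x]
            sq = s2[y + win, x + win] - s2[y, x + win] - s2[y + win, x] + s2[y, x]
            yield (x, y, win, win), sq / n - (sm / n) ** 2

gray = np.random.randint(0, 256, (120, 160)).astype(np.uint8)  # stand-in image
VAR_THRESHOLD = 500.0                                          # assumed preset
accepted = [w for w, v in subwindow_variances(gray) if v < VAR_THRESHOLD]
```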
- Step S104 According to the face detection area information, the shape information of each part of the face is obtained by the local shape fitting method.
- the local shape fitting method is specifically a supervised sequence fitting method (SSM).
- the shape information of each part of the face includes left eye shape information, right eye shape information, nose shape information, and mouth shape information.
- by implementing the above embodiment, the present invention obtains a rough positioning image of the face from the user image captured by the camera, derives the face detection area information from it, and then obtains accurate shape information of each part of the face through the local shape fitting method, improving the fitting accuracy.
- the face localization method of an embodiment of the present invention is described in further detail below with reference to FIG. 2, a schematic flowchart of another embodiment of the method.
- Step S201 Acquire an original image of the user through the camera.
- the original image of the user is preprocessed; the preprocessing includes operations such as noise removal and illumination equalization.
- Step S202 The user original image is roughly positioned to obtain a rough positioning image of the face.
- coarse face detection and positioning are performed on the user's original image using Haar features and the AdaBoost algorithm; then, based on the skin-color distribution of the face, a skin-color filter is applied to eliminate false detection areas, and the detected face area is cropped to obtain the rough positioning image of the face.
- Step S203 Obtain face detection area information according to the rough location image of the face, where the face detection area information includes location information of each part of the face.
- the position information of each part of the face includes left eye position information, right eye position information, nose position information, and mouth position information.
- this step S203 includes:
- Step 1: divide the rough positioning image of the face into several sub-windows (that is, at least two sub-windows) and calculate the image variance value of each sub-window;
- Step 2: for any sub-window, compare its image variance value with the preset variance threshold; if the variance is smaller than the threshold, the sub-window is determined to contain the target area and is accepted, otherwise it is rejected;
- Step 3: input the accepted sub-windows (that is, those whose image variance value is smaller than the preset variance threshold) into the online learning classifier and obtain the sub-windows output by the classifier;
- the online learning classifier includes a random forest classifier and an NCC (Normalized Cross Correlation) classifier.
- Step 4: perform NMS processing on the sub-windows output in step 3 (that is, the sub-windows passed by the online learning classifier) to obtain the face detection area information.
- Step S204 According to the face detection area information, the shape information of each part of the face is obtained by the local shape fitting method.
- the local shape fitting method is specifically an SSM method.
- the shape information of each part of the face includes left eye shape information, right eye shape information, nose shape information, and mouth shape information.
- Step S205 According to the accurate shape information of each part of the face, the objective function of each part of the face is obtained by the structural learning method.
- the structure learning method is specifically a SSVM (Structured Support Vector Machine) method.
- Step S206 Optimizing the objective function of each part of the face to obtain the optimized position of each part of the face.
- the objective function of each part of the face is optimized by the Stochastic Gradient Descent (SGD) algorithm to obtain the optimized position of each part of the face.
- Step S207 Tracking the motion positions of each part of the face in two consecutive frames according to the optimized position of each part of the face, and updating the online learning classifier according to the motion position of each part of the face.
- the forward–backward optical flow tracking method is used to track the motion position of each part of the face across two consecutive frames; positive and negative samples of each part are obtained from the currently tracked positions, the coverage ratio of each sub-window, and the posterior probability; from these samples, several high-confidence samples (for example, those above a set confidence threshold) are selected to compute the sample features and update the prior probability of the random forest classifier; the obtained positive and negative samples are also added to the sample library of the NCC classifier, updating that library.
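- a sketch of the forward–backward tracking idea using OpenCV's pyramidal Lucas–Kanade optical flow; the 1-pixel round-trip error bound is an assumption.

```python
import cv2
import numpy as np

def fb_track(prev_gray, cur_gray, pts):
    """Track pts forward prev→cur, re-track backward cur→prev, and keep
    only points whose round-trip (forward–backward) error is small."""
    pts = np.asarray(pts, np.float32).reshape(-1, 1, 2)
    fwd, st1, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    bwd, st2, _ = cv2.calcOpticalFlowPyrLK(cur_gray, prev_gray, fwd, None)
    fb_err = np.linalg.norm(pts - bwd, axis=2).ravel()
    ok = (st1.ravel() == 1) & (st2.ravel() == 1) & (fb_err < 1.0)
    return fwd.reshape(-1, 2)[ok], ok
```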
- by implementing the above embodiments, the present invention captures user images with a camera, adopts the sliding-window method, and realizes face localization through the online learning classifier and the NMS algorithm; the nature of the sliding window allows the method to be accelerated with parallel programming techniques, and the filters and classifiers used involve no complex operations, so the robustness of the program is ensured while the computational complexity is reduced; fitting the features of each facial part, optimizing the positions of the parts, and tracking the parts make face localization more accurate and more robust.
- a method for face localization according to an embodiment of the present invention is further described in detail below with reference to a flowchart of still another embodiment of a method for locating a face according to the present invention.
- Step S301 Acquire an original image of the user through the camera.
- the original image of the user is preprocessed; the preprocessing includes operations such as noise removal and illumination equalization.
- Step S302 The user original image is roughly positioned to obtain a rough positioning image of the face.
- coarse face detection and positioning are performed on the user's original image using Haar features and the AdaBoost algorithm; then, based on the skin-color distribution of the face, a skin-color filter is applied to eliminate false detection areas, and the detected face area is cropped to obtain the rough positioning image of the face.
- Step S303 The face rough positioning image is divided into several sub-windows.
- Step S304 Calculate the image variance value of each sub-window and compare it with the preset variance threshold; if the variance is smaller than the threshold, the sub-window is determined to contain the target area and is accepted; otherwise, the sub-window is rejected.
- Step S305 For each sub-window accepted in step S304, calculate the posterior probability given by the random forest classifier; if the posterior probability is greater than the preset probability threshold, the sub-window is accepted; otherwise, the sub-window is rejected.
- the random forest classifier consists of 13 decision trees; the feature of each decision tree is obtained by comparing the brightness values of 10 random image blocks of each sub-window with each other, and the posterior probability of the random forest classifier is the mean of the posterior probabilities of the 13 decision trees.
- the prior probability distribution of the random forest classifier is updated in real time while the face is tracked, adapting to changes in the target's shape and texture; for any decision tree, its posterior probability is obtained from the prior probability and the tree's feature.
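- the 13-tree, 10-block description resembles a TLD-style ensemble; the sketch below is one reading of it, where each tree's binary feature comes from pairwise brightness comparisons of the block means and online updates adjust per-tree counts from which the posteriors are computed (the number of comparison pairs per tree is an assumption).

```python
import numpy as np

N_TREES, N_PATCHES = 13, 10          # counts taken from the text
rng = np.random.default_rng(0)
# each tree owns a fixed set of random patch-index pairs to compare
pairs = [rng.integers(0, N_PATCHES, size=(N_PATCHES, 2)) for _ in range(N_TREES)]
# per-tree positive/negative counts per binary code (Laplace-initialised)
pos = [np.ones(2 ** N_PATCHES) for _ in range(N_TREES)]
neg = [np.ones(2 ** N_PATCHES) for _ in range(N_TREES)]

def tree_code(block_means, t):
    """Binary feature of tree t: pairwise brightness comparisons of the
    mean brightness of the sub-window's 10 random image blocks."""
    bits = block_means[pairs[t][:, 0]] > block_means[pairs[t][:, 1]]
    return int(bits.dot(1 << np.arange(N_PATCHES)))

def posterior(block_means):
    """Classifier posterior: mean of the 13 per-tree posteriors."""
    return float(np.mean([pos[t][c] / (pos[t][c] + neg[t][c])
                          for t in range(N_TREES)
                          for c in [tree_code(block_means, t)]]))

def update(block_means, is_face):
    """Online update of the per-tree counts while tracking the face."""
    for t in range(N_TREES):
        (pos if is_face else neg)[t][tree_code(block_means, t)] += 1
```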
- Step S306 For each sub-window accepted in step S305, calculate its matching coefficient against the target templates in the NCC classifier sample library; if the matching coefficient is greater than the preset coefficient threshold, the sub-window is accepted; otherwise, the sub-window is rejected.
- the NCC classifier sample library is updated in real time while the face is tracked, maintaining an accurate description of the tracked target.
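- a sketch of the NCC matching coefficient against the sample library; taking the maximum over the stored templates is an assumption about how the library is used.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a.astype(np.float64).ravel(); a -= a.mean()
    b = b.astype(np.float64).ravel(); b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a.dot(b) / denom) if denom else 0.0

def matching_coefficient(window, templates):
    """Score a sub-window against the target templates in the library."""
    return max(ncc(window, t) for t in templates)
```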
- Step S307 Perform NMS processing on the sub-window outputted in step S306 to obtain face detection area information.
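- for completeness, a generic greedy NMS sketch over (x, y, w, h) sub-windows; the IoU overlap threshold is an assumption, as the text does not specify one.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximal suppression: keep the highest-scoring window,
    drop windows that overlap it too much, and repeat."""
    boxes = np.asarray(boxes, np.float64)
    x1, y1 = boxes[:, 0], boxes[:, 1]
    x2, y2 = x1 + boxes[:, 2], y1 + boxes[:, 3]
    areas = boxes[:, 2] * boxes[:, 3]
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]
    return keep
```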
- the face detection area information includes at least position information of the left eye, the right eye, the nose, and the mouth in the face.
- the left eye in the face detection area is taken as an example to describe the process of fitting various parts of the face.
- Step S308 The left eye shape is extracted by a PCA (Principal Component Analysis) algorithm according to the position information of the left eye of the face, and the extracted left eye shape is used as the initial value.
- Step S309 Extract feature descriptors at the calibration points of the left eye shape; the feature descriptors together constitute a feature description vector.
- the feature descriptor can be extracted by using a SIFT (Scale Invariant Feature Transform) algorithm or a variant algorithm thereof.
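- a sketch of descriptor extraction at the calibration points using OpenCV's SIFT; the keypoint size is an assumption.

```python
import cv2

def descriptors_at_points(gray, landmarks, kp_size=16.0):
    """Compute one SIFT descriptor per calibration point and concatenate
    them into a single feature description vector."""
    sift = cv2.SIFT_create()
    kps = [cv2.KeyPoint(float(x), float(y), kp_size) for x, y in landmarks]
    _, desc = sift.compute(gray, kps)
    return desc.ravel()  # 128 values per point, concatenated
```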
- Step S310 Calculate a difference vector of the left eye shape and the preset real shape.
- Step S311 Obtain an update matrix according to the feature description vector in step S309 and the difference vector in step S310.
- an error function based on the 2-norm is formed, and the error function is minimized by the linear least squares method to obtain the update matrix.
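- a minimal sketch of that least-squares step: stacking per-sample feature description vectors Phi and shape-difference vectors Delta, the update matrix R minimizes ||Delta − Phi·R||²; the small ridge term is an added assumption for numerical stability.

```python
import numpy as np

def learn_update_matrix(Phi, Delta, lam=1e-6):
    """Phi: (n_samples, n_features); Delta: (n_samples, n_shape_params).
    Returns R with Phi @ R ≈ Delta (regularized linear least squares)."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Delta)

# at run time the shape is then moved by the matrix–vector product:
# shape += extract_phi(shape) @ R
```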
- Step S312 Update the left eye shape of step S309 with the update matrix obtained in step S311 (that is, take the vector product of the left eye shape and the update matrix) to obtain an updated left eye shape; extract the feature description vector of the updated left eye shape and store it as the index number corresponding to the update matrix obtained in step S311; then replace the left eye shape of step S309 with the updated left eye shape.
- Step S313 Determine whether the number of iteration steps exceeds the preset maximum, or whether the norm error between the last two update matrices is below the preset matrix-norm error threshold; if not, return to step S309; if so, proceed to step S314.
- Step S314 Obtain the update matrix library, which is composed of index numbers and update matrices in one-to-one correspondence.
- Step S315 Extract the current feature descriptor according to the calibration point of the current left eye shape, and the several current feature descriptors constitute the current feature description vector.
- the initial value of the current left eye shape is the left eye shape obtained in step S308.
- Step S316 Use the current feature description vector as an index number to look up the corresponding update matrix in the update matrix library, update the current left eye shape according to that matrix to obtain the updated current left eye shape, and replace the current left eye shape of step S315 with the updated current left eye shape.
- Step S317 Determine whether the iteration count exceeds the preset maximum number of iteration steps, or whether the vector norm of the difference between the last two shape errors is below the preset vector-norm error threshold; if not, return to step S315; if so, proceed to step S318.
- Step S318 Obtain an accurate left eye shape.
- an accurate right eye shape, an accurate nose shape, and an accurate mouth shape can be obtained in the same way.
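- steps S315–S318 amount to the loop sketched below; how the index number selects an update matrix is not spelled out, so the nearest-neighbour lookup over stored index vectors is an assumption.

```python
import numpy as np

def fit_shape(shape0, extract_phi, library, max_iters=10, tol=1e-3):
    """library: list of (index_vector, update_matrix) pairs built in S314.
    Iteratively look up the matching update matrix and apply it until the
    iteration cap or a small change in shape (the norm-error test)."""
    shape = shape0.copy()
    for _ in range(max_iters):
        phi = extract_phi(shape)
        # nearest stored index vector selects the update matrix
        _, R = min(library, key=lambda e: np.linalg.norm(e[0] - phi))
        new_shape = shape + phi @ R          # matrix–vector product update
        if np.linalg.norm(new_shape - shape) < tol:
            return new_shape
        shape = new_shape
    return shape
```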
- the above method involves only lookups and matrix–vector product operations during fitting, and the fitting of the facial parts and the extraction of the feature description vectors can be processed in parallel, satisfying real-time requirements; in addition, the richness of the NCC classifier sample library and the resistance of the feature description vectors to scale and rotation changes greatly improve the accuracy and real-time performance of the fitting.
- Step S319 Extract the left eye feature information according to the precise left eye shape to form a left eye feature vector.
- the left eye feature information can be extracted using the HOG (Histogram of Oriented Gradients) descriptor.
- the left eye feature vector is reduced in dimension by a linear dimensionality reduction method.
- Step S320 Select a certain part as an anchor point, and determine a distance feature vector between the left eye and the part.
- the pixel difference between the left eye and the nose is calculated, and the sum of the squares of the differences is used as the distance feature vector between the left eye and the part.
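- as described, the distance feature reduces to a sum of squared pixel differences; a sketch assuming equally sized patches around the two parts:

```python
import numpy as np

def distance_feature(part_patch, anchor_patch):
    """Sum of squared pixel differences between a part (e.g. left eye)
    and the anchor part (e.g. nose), used as a 1-D distance feature."""
    d = part_patch.astype(np.float64) - anchor_patch.astype(np.float64)
    return np.array([(d * d).sum()])
```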
- Step S321 The left eye feature vector obtained in step S319 and the distance feature vector determined in step S320 are used as a feature mapping function, and the left eye objective function is obtained from the feature mapping function.
- the objective function is obtained from the feature mapping function through the SSVM structured-learning algorithm.
- Step S322 Optimize the left eye objective function to obtain the optimized left eye position.
- the objective function is optimized by the SGD algorithm to obtain the optimized position of the left eye.
- the optimized position of the right eye, the optimized position of the nose, and the optimized position of the mouth can be obtained in the same way.
- the positions of the four parts are adjusted globally at the level of whole facial parts, so that the relative position constraint relationship among the parts (i.e., the shape constraint) is satisfied; with each part location treated as a unit and SGD used as the numerical optimization method, the effectiveness, robustness, and real-time performance of the algorithm are guaranteed.
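- a generic gradient-ascent sketch of the SGD position refinement of steps S206/S322; `score` and `grad` stand in for the SSVM objective and its (sub)gradient, interfaces the text does not define.

```python
import numpy as np

def sgd_refine(x0, score, grad, lr=0.5, iters=50):
    """Move a part's position x along the objective's gradient and keep
    the best-scoring position seen."""
    x = np.asarray(x0, np.float64)
    best_x, best_s = x.copy(), score(x)
    for _ in range(iters):
        x = x + lr * grad(x)      # gradient step on the part's objective
        s = score(x)
        if s > best_s:
            best_x, best_s = x.copy(), s
    return best_x
```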
- Step S323 Apply the forward and backward optical flow tracking method to track the motion position of the left eye in two consecutive frames according to the optimized left eye position.
- Step S324 Obtain positive and negative samples of the left eye in the face according to the currently tracked left eye motion position, the coverage ratio of each sub-window, and the posterior probability.
- Step S325 Based on the obtained positive and negative samples of the left eye, select a number of samples with higher confidence (for example, greater than the set confidence threshold) to calculate the characteristics of the positive and negative samples, and then update the prior probability of the random forest classifier.
- Step S326 Add the obtained positive and negative samples of the left eye to the sample library of the NCC classifier, and update the sample library of the NCC classifier.
- by implementing the above embodiments, the present invention captures user images with the camera, adopts the sliding-window method, and passes each window in turn through the variance filter, the random forest classifier, the NCC classifier, and the NMS algorithm; the nature of the sliding window allows the method to be accelerated with parallel programming techniques, and the filters and classifiers used involve no complex operations, ensuring the robustness of the program while reducing the computational complexity; fitting the features of each facial part, optimizing the positions of the parts, and tracking the parts make face localization more accurate and more robust.
- the image acquisition module 401 is configured to acquire the original image of the user through the camera and send the original image to the coarse positioning module 402.
- the original image of the user is preprocessed; the preprocessing includes operations such as noise removal and illumination equalization.
- the coarse positioning module 402 is connected to the image acquisition module 401 and is configured to coarsely position the original image of the user, obtain a rough positioning image of the face, and send the rough positioning image to the detection area module 403.
- coarse face detection and positioning are performed on the user's original image using Haar features and the AdaBoost algorithm; then, based on the skin-color distribution of the face, a skin-color filter is applied to eliminate false detection areas, and the detected face area is cropped to obtain the rough positioning image of the face.
- the detection area module 403 is connected to the coarse positioning module 402, and is configured to obtain face detection area information according to the rough location image of the face, where the face detection area information includes location information of each part of the face.
- the detection area module 403 specifically includes:
- the sliding window module 4031 is configured to divide the rough face positioning image into a plurality of sub-windows.
- the variance filtering module 4032 is connected to the sliding window module 4031 and is configured to calculate the image variance value of each sub-window and compare it with the preset variance threshold; if the variance is smaller than the preset threshold, the sub-window is determined to contain the target area and is accepted; otherwise, the sub-window is rejected.
- the online learning module 4033 is connected to the variance filtering module 4032, and is configured to input a sub-window whose image variance value is smaller than the preset variance threshold into the online learning classifier, to obtain a sub-window output by the online learning classifier.
- the NMS module 4034 is connected to the online learning module 4033, and is configured to perform NMS processing on the sub-window output by the online learning classifier to obtain face detection area information.
- the location information of each part of the face includes left eye position information, right eye position information, nose position information, and mouth position information.
- the fitting module 404 is connected to the detection area module 403 for obtaining accurate shape information of each part of the face by the local shape fitting method according to the face detection area information.
- the local shape fitting method is specifically an SSM method.
- the shape of each part of the face includes left eye shape information, right eye shape information, nose shape information, and mouth shape information.
- the device also includes:
- the optimization module 405 is connected to the fitting module 404 and is configured to obtain an objective function for each part of the face according to the accurate shape information of each part, and to optimize the objective functions to obtain the optimized position of each part of the face.
- the objective function of each part of the face is optimized by the SGD algorithm, and the optimal position of each part of the face is obtained.
- the online update module 406 is connected to the optimization module 405 and is configured to track the motion position of each part of the face across two consecutive frames according to the optimized positions, and to update the online learning classifier according to the motion position of each part.
- the forward–backward optical flow tracking method is used to track the motion position of each part of the face across two consecutive frames; positive and negative samples of each part are obtained from the currently tracked positions, the coverage ratio of each sub-window, and the posterior probability; from these samples, several high-confidence samples (for example, those above a set confidence threshold) are selected to compute the sample features and update the prior probability of the random forest classifier; the obtained positive and negative samples are also added to the sample library of the NCC classifier, updating that library.
- by implementing the above embodiment, the present invention obtains a rough positioning image of the face from the user image captured by the camera, derives the face detection area information from it, and then obtains accurate shape information of each part of the face through the local shape fitting method, improving the fitting accuracy.
- the image acquisition module 501 is configured to acquire the original image of the user through a camera and send the original image to the coarse positioning module 502.
- the original image of the user is preprocessed; the preprocessing includes operations such as noise removal and illumination equalization.
- the coarse positioning module 502 is connected to the image acquisition module 501 and is configured to coarsely position the original image of the user, obtain a rough positioning image of the face, and send the rough positioning image to the sliding window module 503.
- coarse face detection is performed on the user's original image using Haar features and the AdaBoost algorithm; then, based on the skin-color distribution of the face, a skin-color filter is applied to eliminate false detection areas, and the detected face area is cropped to obtain the rough positioning image of the face.
- the sliding window module 503 is connected to the coarse positioning module 502, and is configured to divide the rough positioning image into a plurality of sub-windows.
- the variance filtering module 504 is connected to the sliding window module 503 and is configured to calculate the image variance value of each sub-window and compare it with the preset variance threshold; if the variance is smaller than the preset threshold, the sub-window is determined to contain the target area and is accepted; otherwise, the sub-window is rejected; the accepted sub-windows are sent to the random forest classifier 505.
- the random forest classifier 505 is connected to the variance filtering module 504 and is configured to calculate the posterior probability of each sub-window passed by the variance filtering module 504; if the posterior probability is greater than the preset probability threshold, the sub-window is accepted; otherwise, the sub-window is rejected; the accepted sub-windows are sent to the NCC classifier 506.
- the random forest classifier consists of 13 decision trees; the feature of each decision tree is obtained by comparing the brightness values of 10 random image blocks of each sub-window with each other, and the posterior probability of the random forest classifier is the mean of the posterior probabilities of the 13 decision trees.
- the prior probability distribution of the random forest classifier is updated in real time while the face is tracked, adapting to changes in the target's shape and texture; for any decision tree, its posterior probability is obtained from the prior probability and the tree's feature.
- the NCC classifier 506 is connected to the random forest classifier 505 and is configured to calculate, for each sub-window passed by the random forest classifier 505, the matching coefficient against the target templates in the NCC classifier sample library; if the matching coefficient is greater than the preset coefficient threshold, the sub-window is accepted; otherwise, the sub-window is rejected.
- the NCC classifier sample library is updated in real time while the face is tracked, maintaining an accurate description of the tracked target.
- the NMS module 507 is connected to the NCC classifier 506 and is configured to perform NMS processing on the sub-windows passed by the NCC classifier 506 to obtain the face detection area information.
- the face detection area information includes at least position information of the left eye, the right eye, the nose, and the mouth in the face.
- the device also includes:
- the face part feature fitting module 508 is connected to the NMS module 507 and specifically includes:
- the first extraction sub-module 5081 is configured to extract the shape of each part of the face according to the face detection area information.
- the first feature description vector sub-module 5082 is configured to extract a current feature descriptor according to a calibration point of each part shape of the current face, and the plurality of current feature descriptors constitute a current feature description vector.
- the first update sub-module 5083 is configured to use the current feature description vector as an index number, look up the corresponding update matrix in the update matrix library, and update the shape of each part of the current face according to that matrix to obtain the updated shape of each part of the current face.
- the first determining sub-module 5084 is configured to determine whether the iteration count exceeds the preset maximum number of iteration steps, or whether the vector norm of the difference between the last two shape errors is below the preset vector-norm error threshold; if not, the updated shapes of the parts of the current face are returned to the first feature description vector sub-module 5082 as the current shapes; if so, the updated shapes of the parts of the current face are sent to the first result sub-module 5085;
- the first result sub-module 5085 is used to obtain an accurate shape of each part of the face.
- the face part feature fitting module 508 further includes an update matrix library sub-module 5086, which specifically includes:
- the second extraction sub-module 50861 is configured to extract the shape of each part of the face according to the face detection area information.
- the second feature description vector sub-module 50862 is configured to extract a feature descriptor according to a calibration point of a shape of each part of the face, and the plurality of feature descriptors constitute a feature description vector.
- the calculation sub-module 50863 is configured to calculate a difference vector of the shape of each part of the face and the preset true shape.
- the update matrix sub-module 50864 is configured to obtain an update matrix from the feature description vector produced by the second feature description vector sub-module 50862 and the difference vector produced by the calculation sub-module 50863.
- the second update sub-module 50865 is configured to update the shapes of the facial parts from the second feature description vector sub-module 50862 with the update matrix obtained by the update matrix sub-module 50864, obtaining the updated shapes of the parts; it then extracts the feature description vector of the updated shapes and stores it locally as the index number corresponding to the update matrix obtained by the update matrix sub-module 50864.
- the second determining sub-module 50866 is configured to determine whether the number of iteration steps exceeds the preset maximum, or whether the norm error between the last two update matrices is below the preset matrix-norm error threshold; if not, the updated shapes of the facial parts are returned to the second feature description vector sub-module 50862 as the current shapes; if so, the locally stored update matrices and index numbers are sent to the second result sub-module 50867.
- the second result sub-module 50867 is configured to obtain the update matrix library, which is composed of index numbers and update matrices in one-to-one correspondence.
- the device also includes:
- the face part optimization module 509 is connected to the face part feature fitting module 508 and is configured to extract the feature information of each facial part according to its accurate shape to form per-part feature vectors, select a certain part as an anchor point, and obtain the distance feature vector between each part and the anchor; the per-part feature vectors and distance feature vectors are used as the feature mapping function from which the objective function of each part is obtained, and the objective functions are optimized to obtain the optimized position of each part of the face.
- the face part tracking module 510 is connected to the face part optimization module 509 and is configured to track, by the forward–backward optical flow tracking method, the motion position of each facial part across two consecutive frames according to the optimized positions; positive and negative samples of each part are obtained from the currently tracked positions, the coverage ratio of each sub-window, and the posterior probability; several high-confidence samples (for example, those above a set confidence threshold) are selected to compute the sample features, the prior probability of the random forest classifier is updated, and the obtained positive and negative samples are added to the sample library of the NCC classifier, updating that library.
- by implementing the above embodiment, the present invention captures user images with the camera, adopts the sliding-window method, and passes each window in turn through the variance filter, the random forest classifier, the NCC classifier, and the NMS algorithm; the nature of the sliding window allows the method to be accelerated with parallel programming technology, and the filters and classifiers used involve no complex operations, ensuring the robustness of the program while reducing the computational complexity; fitting the features of each facial part, optimizing the positions of the parts, and tracking the parts make face localization more accurate and more robust.
- embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
- the computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
- these computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Geometry (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Claims (10)
- 1. A method for face localization, characterized by comprising: acquiring an original image of a user through a camera; coarsely positioning the original image of the user to obtain a rough positioning image of the face; obtaining face detection area information according to the rough positioning image of the face, the face detection area information comprising position information of each part of the face; and obtaining accurate shape information of each part of the face through a local shape fitting method according to the face detection area information.
- 2. The method according to claim 1, characterized in that obtaining the face detection area information according to the rough positioning image of the face specifically comprises: dividing the rough positioning image of the face into several sub-windows; calculating an image variance value for each sub-window and comparing it with a preset variance threshold, and if the variance is smaller than the preset variance threshold, determining that the sub-window contains the target area and accepting the sub-window, otherwise rejecting it; inputting the sub-windows whose image variance value is smaller than the preset variance threshold into an online learning classifier to obtain the sub-windows output by the online learning classifier; and performing non-maximum suppression (NMS) processing on the sub-windows output by the online learning classifier to obtain the face detection area information.
- 3. The method according to claim 2, characterized in that inputting the sub-windows whose image variance value is smaller than the preset variance threshold into the online learning classifier and obtaining the sub-windows output by the online learning classifier specifically comprises: calculating the posterior probability of a random forest classifier for each sub-window whose image variance value is smaller than the preset variance threshold, and if the posterior probability is greater than a preset probability threshold, accepting the sub-window, otherwise rejecting it; and calculating, for each sub-window whose posterior probability is greater than the preset probability threshold, the matching coefficient against the target templates in the sample library of a normalized cross-correlation (NCC) classifier, and if the matching coefficient is greater than a preset coefficient threshold, accepting the sub-window, otherwise rejecting it.
- 4. The method according to claim 1, characterized in that the local shape fitting method is specifically a supervised sequence fitting method, which comprises: step a: extracting shape information of each part of the face according to the face detection area information and using the extracted shape information as the initial shape of each part; step b: extracting current feature descriptors at the calibration points of the current shape of each facial part, the current feature descriptors constituting a current feature description vector; step c: using the current feature description vector as an index number, looking up the corresponding update matrix in the update matrix library, updating the shape information of each part of the current face according to the corresponding update matrix to obtain updated shape information of each part, and replacing the shape information of each part used in step b with the updated shape information; step d: determining whether the iteration count exceeds a preset maximum number of iteration steps, or whether the vector norm of the difference between the last two shape errors is below a preset vector-norm error threshold, and if not, returning to step b, otherwise proceeding to step e; step e: obtaining accurate shape information of each part of the face.
- 5. The method according to claim 2 or 3, characterized by further comprising: obtaining an objective function of each part of the face through a structure learning method according to the accurate shape information of each part of the face; and optimizing the objective functions of the parts of the face to obtain the optimized position of each part of the face.
- 6. The method according to claim 5, characterized by further comprising: tracking the motion position of each part of the face in two consecutive frames according to the optimized positions of the parts of the face; and updating the online learning classifier according to the motion positions of the parts of the face.
- 7. A device for face localization, characterized by comprising: an image acquisition module, configured to acquire an original image of a user through a camera and send the original image to a coarse positioning module; the coarse positioning module, connected to the image acquisition module and configured to coarsely position the original image of the user, obtain a rough positioning image of the face, and send the rough positioning image to a detection area module; the detection area module, connected to the coarse positioning module and configured to obtain face detection area information according to the rough positioning image of the face, the face detection area information comprising position information of each part of the face; and a fitting module, connected to the detection area module and configured to obtain accurate shape information of each part of the face through a local shape fitting method according to the face detection area information.
- 8. The device according to claim 7, characterized in that the detection area module specifically comprises: a sliding window module, configured to divide the rough positioning image of the face into several sub-windows; a variance filtering module, connected to the sliding window module and configured to calculate the image variance value of each sub-window and compare it with a preset variance threshold, determining that a sub-window contains the target area and accepting it if its variance is smaller than the preset variance threshold, and rejecting it otherwise; an online learning module, connected to the variance filtering module and configured to input the sub-windows whose image variance value is smaller than the preset variance threshold into an online learning classifier and obtain the sub-windows output by the online learning classifier; and a non-maximum suppression (NMS) module, connected to the online learning module and configured to perform NMS processing on the sub-windows output by the online learning classifier to obtain the face detection area information.
- 9. The device according to claim 8, characterized by further comprising: an optimization module, connected to the fitting module and configured to obtain an objective function of each part of the face through a structure learning method according to the accurate shape information of each part, and to optimize the objective functions of the parts of the face to obtain the optimized position of each part of the face.
- 10. The device according to claim 9, characterized by further comprising: an online update module, connected to the optimization module and configured to track the motion position of each part of the face in two consecutive frames according to the optimized positions of the parts, and to update the online learning classifier according to the motion positions of the parts of the face.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2014350727A AU2014350727B2 (en) | 2013-11-13 | 2014-11-12 | Face positioning method and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310560912.X | 2013-11-13 | ||
CN201310560912.XA CN103593654B (zh) | 2013-11-13 | 2013-11-13 | Method and apparatus for face localization |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015070764A1 true WO2015070764A1 (zh) | 2015-05-21 |
Family
ID=50083786
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/090943 WO2015070764A1 (zh) | 2013-11-13 | 2014-11-12 | Method and apparatus for face localization |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN103593654B (zh) |
AU (1) | AU2014350727B2 (zh) |
WO (1) | WO2015070764A1 (zh) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103593654B (zh) | 2013-11-13 | 2015-11-04 | 智慧城市系统服务(中国)有限公司 | Method and apparatus for face localization |
CN105303150B (zh) | 2014-06-26 | 2019-06-25 | 腾讯科技(深圳)有限公司 | Method and system for implementing image processing |
CN105868767B (zh) | 2015-01-19 | 2020-02-18 | 阿里巴巴集团控股有限公司 | Facial feature point localization method and device |
CN105809123B (zh) | 2016-03-04 | 2019-11-12 | 智慧眼科技股份有限公司 | Face detection method and device |
CN107481190B (zh) | 2017-07-04 | 2018-12-07 | 腾讯科技(深圳)有限公司 | Image data processing method and device |
CN107977640A (zh) | 2017-12-12 | 2018-05-01 | 成都电科海立科技有限公司 | Acquisition method based on a vehicle-mounted face recognition image acquisition device |
CN107862308A (zh) | 2017-12-12 | 2018-03-30 | 成都电科海立科技有限公司 | Face recognition method based on a vehicle-mounted face recognition device |
CN110008791B (zh) | 2018-01-05 | 2021-04-27 | 武汉斗鱼网络科技有限公司 | Face region determination method, electronic device, and readable storage medium |
CN108764034A (zh) | 2018-04-18 | 2018-11-06 | 浙江零跑科技有限公司 | Distracted-driving behavior early-warning method based on a near-infrared cab camera |
CN109086711B (zh) | 2018-07-27 | 2021-11-16 | 华南理工大学 | Facial feature analysis method and apparatus, computer device, and storage medium |
CN109613526A (zh) | 2018-12-10 | 2019-04-12 | 航天南湖电子信息技术股份有限公司 | Plot filtering method based on a support vector machine |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101593022B (zh) | 2009-06-30 | 2011-04-27 | 华南理工大学 | Fast human-computer interaction method based on fingertip tracking |
-
2013
- 2013-11-13 CN CN201310560912.XA patent/CN103593654B/zh active Active
-
2014
- 2014-11-12 AU AU2014350727A patent/AU2014350727B2/en not_active Ceased
- 2014-11-12 WO PCT/CN2014/090943 patent/WO2015070764A1/zh active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1866271A (zh) | 2006-06-13 | 2006-11-22 | 北京中星微电子有限公司 | AAM-based real-time head pose estimation method and system |
CN101561710A (zh) | 2009-05-19 | 2009-10-21 | 重庆大学 | Human-computer interaction method based on face pose estimation |
CN101916370A (zh) | 2010-08-31 | 2010-12-15 | 上海交通大学 | Method for processing non-feature region images in face detection |
CN102622589A (zh) | 2012-03-13 | 2012-08-01 | 辉路科技(北京)有限公司 | GPU-based multispectral face detection method |
CN103593654A (zh) | 2013-11-13 | 2014-02-19 | 智慧城市系统服务(中国)有限公司 | Method and apparatus for face localization |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113051961A (zh) | 2019-12-26 | 2021-06-29 | 深圳市光鉴科技有限公司 | Depth-map face detection model training method, system, device, and storage medium |
CN112132067A (zh) | 2020-09-27 | 2020-12-25 | 深圳市梦网视讯有限公司 | Face inclination analysis method, system, and device based on compressed information |
CN112132067B (zh) | 2020-09-27 | 2024-04-09 | 深圳市梦网视讯有限公司 | Face inclination analysis method, system, and device based on compressed information |
CN114863472A (zh) | 2022-03-28 | 2022-08-05 | 深圳海翼智新科技有限公司 | Multi-stage pedestrian detection method, device, and storage medium |
CN114863472B (zh) | 2022-03-28 | 2024-09-27 | 深圳海翼智新科技有限公司 | Multi-stage pedestrian detection method, device, and storage medium |
CN114708420A (zh) | 2022-04-24 | 2022-07-05 | 广州大学 | Visual positioning method and device based on local variance and a posterior-probability classifier |
Also Published As
Publication number | Publication date |
---|---|
CN103593654A (zh) | 2014-02-19 |
AU2014350727B2 (en) | 2017-06-29 |
AU2014350727A1 (en) | 2016-06-09 |
CN103593654B (zh) | 2015-11-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015070764A1 (zh) | Method and apparatus for face localization | |
US11928800B2 (en) | Image coordinate system transformation method and apparatus, device, and storage medium | |
US11237637B2 (en) | Gesture recognition systems | |
JP6570786B2 (ja) | Motion learning device, skill discrimination device, and skill discrimination system | |
WO2021031817A1 (zh) | Emotion recognition method and apparatus, computer device, and storage medium | |
CN107610177B (zh) | Method and device for determining feature points in simultaneous localization and mapping | |
TWI798815B (zh) | Target re-identification method, apparatus, and computer-readable storage medium | |
CN112200056B (zh) | Face liveness detection method and apparatus, electronic device, and storage medium | |
US11501462B2 (en) | Multi-view three-dimensional positioning | |
CN114937232B (zh) | Method, system, and device for detecting the wearing of protective equipment by medical-waste handlers | |
JP2009157767A (ja) | Face image recognition device, face image recognition method, face image recognition program, and recording medium storing the program | |
WO2023151237A1 (zh) | Face pose estimation method and apparatus, electronic device, and storage medium | |
CN109214324A (zh) | Method and system for outputting the most frontal face image based on a multi-camera array | |
KR20220004009A (ko) | Key point detection method, apparatus, electronic device, and storage medium | |
CN110717406B (zh) | Face detection method, apparatus, and terminal device | |
CN113989914A (zh) | Security monitoring method and system based on face recognition | |
CN110751034B (zh) | Pedestrian behavior recognition method and terminal device | |
Hahmann et al. | Model interpolation for eye localization using the Discriminative Generalized Hough Transform | |
US20240127631A1 (en) | Liveness detection method and apparatus, and computer device | |
CN113627476B (zh) | Face clustering method and system based on feature normalization | |
CN114639116B (zh) | Pedestrian re-identification method and apparatus, storage medium, and terminal | |
JP6768913B1 (ja) | Image processing device, image processing method, and program | |
JP7157784B2 (ja) | Image processing device, image processing method, and program | |
JP6764012B1 (ja) | Image processing device, image processing method, and program | |
Świtoński et al. | Recognition of human gestures represented by depth camera motion sequences |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14862961 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2014350727 Country of ref document: AU Date of ref document: 20141112 Kind code of ref document: A |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 27/09/2016) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14862961 Country of ref document: EP Kind code of ref document: A1 |