CN110210306B - Face tracking method and camera - Google Patents
- Publication number: CN110210306B (application CN201910361317.0A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06V40/166—Human faces: detection; localisation; normalisation using acquisition arrangements
- G06V40/168—Human faces: feature extraction; face representation
- G06V40/172—Human faces: classification, e.g. identification
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30201—Subject of image: Face
Abstract
The invention discloses a face tracking method and a camera. The method comprises the following steps: acquiring the initial positions of the facial feature points of the first frame image in a captured image frame sequence; taking those initial positions as the input of a convolutional neural network, which outputs a position probability heat map of the facial feature points, and performing iterative regression on the heat map with a point distribution model to obtain the pixel positions of the facial feature points in the first frame image; and using the pixel positions of the facial feature points in the first frame image to acquire the initial positions of the facial feature points in the second frame image of the sequence, thereby realizing face tracking. The method realizes inter-frame face alignment tracking, avoids instability and jitter of facial feature points between frames, and improves both tracking accuracy and tracking speed.
Description
Technical Field
The invention relates to the technical field of machine learning, in particular to a face tracking method and a camera.
Background
Face alignment detects a face in an image and labels each specific landmark point. Face alignment techniques are commonly used in video processing, for example in live-streaming and short-video applications.
Real-time tracking of face alignment greatly helps video processing. However, while research on face alignment itself is active, research on its real-time tracking is comparatively scarce, and the point positions produced by most tracking algorithms jitter between frames, so the tracking result shows obvious distortion.
Disclosure of Invention
The present invention provides a face tracking method and camera to at least partially solve the above problems.
In a first aspect, the present invention provides a face tracking method, including: acquiring an initial position of a facial feature point of a first frame image in an acquired image frame sequence; taking the initial positions of the facial feature points as the input of a convolutional neural network, outputting a position probability heat map of the facial feature points by the convolutional neural network, and performing iterative regression processing on the position probability heat map of the facial feature points by adopting a point distribution model to obtain the pixel positions of the facial feature points in the first frame image; wherein the location probability heatmap represents a probability that the facial feature point is at a pixel location in the first frame image; and acquiring the initial position of the facial feature point of the second frame image in the image frame sequence by utilizing the pixel position of the facial feature point in the first frame image, thereby realizing face tracking.
In some embodiments, obtaining an initial position of a facial feature point of a second frame image in the sequence of image frames using a pixel position of the facial feature point in the first frame image comprises: performing face detection on the second frame image by using a multitask cascade convolution network or by using a machine learning tool Dlib; when a face is detected in the second frame image, the initial position of the facial feature point in the second frame image is obtained according to the position information of the facial feature point in the first frame image and a preset relaxation amount, wherein the preset relaxation amount represents the position change of the same feature point in the adjacent frame image.
In some embodiments, obtaining the initial position of the facial feature point in the second frame image according to the position information of the facial feature point in the first frame image and a preset relaxation amount comprises: acquiring the initial position of the facial feature point in the second frame image according to x_i' = x_i + α·dx_mon, where x_i' is the initial position of facial feature point i in the second frame image, x_i is the position information of facial feature point i in the first frame image, i is a natural number, α (0 < α < 1) is a preset adjustment factor, and dx_mon is the preset relaxation amount.
In some embodiments, when no face is detected in the second frame image by the multi-task cascaded convolutional network or the machine learning tool Dlib, the method further comprises: extracting facial feature points from a pre-constructed average face image; and determining the point locations of the facial feature points on the average face as the initial positions of the facial feature points in the second frame image.
In some embodiments, obtaining an initial position of a facial feature point of a first frame image in the sequence of image frames comprises: extracting face characteristic points from a pre-constructed average face image; determining a point location of a facial feature point on the average face as an initial location of the facial feature point in the first frame image.
In some embodiments, extracting facial feature points from the pre-constructed average face image comprises: acquiring facial feature points of each face training sample in a face training sample set, wherein the facial feature points of each face training sample in the face training sample set are calibrated; constructing a mapping matrix from the face training samples to an average face model according to the face feature points of each face training sample; and respectively superposing the facial feature points of all the face training samples to the average face model according to the mapping matrix to obtain the average face image, and determining the superposed facial feature points as the facial feature points of the average face image.
In a second aspect, the present invention provides a camera comprising: a camera and a processor; the camera collects an image frame sequence of the face of the user and sends the image frame sequence to the processor; the processor is used for acquiring the initial position of the facial feature point of the first frame image in the image frame sequence; taking the initial positions of the facial feature points as the input of a convolutional neural network, outputting a position probability heat map of the facial feature points by the convolutional neural network, and performing iterative regression processing on the position probability heat map of the facial feature points by adopting a point distribution model to obtain position information of the facial feature points in the first frame image; wherein the location probability heat map represents a probability that the facial feature point is at each pixel location in the first frame image; and acquiring the initial position of the facial feature point of the second frame image in the image frame sequence by utilizing the position information of the facial feature point in the first frame image, thereby realizing the face tracking.
In some embodiments, the processor further performs face detection on the second frame image using a multitask cascaded convolutional network or using a machine learning tool Dlib; when a face is detected in the second frame image, the initial position of the facial feature point in the second frame image is obtained according to the position information of the facial feature point in the first frame image and a preset relaxation amount, wherein the preset relaxation amount represents the position change of the same feature point in the adjacent frame image.
In some embodiments, the processor extracts facial feature points from a pre-constructed average face image when a face in the second frame image is not detected; determining a point location of a facial feature point on the average face as an initial location of the facial feature point in the second frame image.
In some embodiments, the processor obtains facial feature points of each face training sample in a face training sample set, wherein the facial feature points of each face training sample in the face training sample set are calibrated; constructing a mapping matrix from the face training samples to an average face model according to the face feature points of each face training sample; and respectively superposing the facial feature points of all the face training samples to the average face model according to the mapping matrix to obtain the average face image, and determining the superposed facial feature points as the facial feature points of the average face image.
The method and the device predict the initial position of the facial feature point of the second frame image by using the facial feature point of the first frame image, and identify the specific pixel position of the facial feature point in each frame image by adopting a mode of combining a convolutional neural network and PDM when the initial position of the facial feature point in each frame image is obtained, thereby realizing face alignment tracking between frames, avoiding the problems of instability and jitter of the facial feature point between frames, and improving the tracking accuracy and tracking speed.
Drawings
FIG. 1 is a flow chart of a face tracking method according to an embodiment of the present invention;
fig. 2 is a block diagram of a camera according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings. It is to be understood that such description is merely illustrative and not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, the terms "comprises", "comprising" and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Thus, the techniques of the present invention may be implemented in hardware and/or in software (including firmware, microcode, etc.). Furthermore, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon for use by or in connection with an instruction execution system. In the context of the present invention, a computer-readable storage medium may be any medium that can contain, store, communicate, propagate, or transport the instructions. For example, a computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer-readable storage medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.
Face tracking means predicting the positions of the facial feature points in the current frame from the information of the previous frame. It is of great significance for per-frame face alignment: it reduces the search range in the current frame and improves tracking accuracy. Aiming at the unrealistic face-alignment results caused by inter-frame tracking jitter, this embodiment provides a jitter-free inter-frame tracking method.
Fig. 1 is a flowchart of a face tracking method according to an embodiment of the present invention, and as shown in fig. 1, the method according to the embodiment includes:
s110, acquiring the initial position of the facial feature point of the first frame image in the acquired image frame sequence.
The facial feature points include feature points for identifying eyes, nose, mouth, eyebrows, face contours and the like.
S120, taking the initial positions of the facial feature points as the input of a convolutional neural network, outputting a position probability heat map of the facial feature points by the convolutional neural network, and performing iterative regression processing on the position probability heat map of the facial feature points by adopting a point distribution model to obtain the pixel positions of the facial feature points in the first frame image; wherein the location probability heat map represents a probability that the facial feature point is at each pixel location in the first frame image.
In this embodiment, the positions of the facial feature points are first processed by a convolutional neural network; then, based on the position probability heat map output by the network, iterative regression is performed with a Point Distribution Model (PDM) to obtain the specific pixel position information of the facial feature points. Combining the convolutional neural network with the PDM improves the recognition accuracy of the facial feature points.
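The patent specifies neither the network architecture nor the PDM parameters. The following NumPy sketch illustrates the general pattern only: reading candidate landmark positions out of per-point heat maps, then iteratively regressing them onto a point distribution model subspace. Every function and variable name here is hypothetical, not taken from the patent.

```python
import numpy as np

def soft_argmax(heatmap):
    """Expected (x, y) position under a single 2-D position probability heat map."""
    h, w = heatmap.shape
    p = heatmap / heatmap.sum()          # normalise to a probability distribution
    ys, xs = np.mgrid[0:h, 0:w]
    return np.array([(p * xs).sum(), (p * ys).sum()])

def pdm_fit(points, mean_shape, basis, n_iters=5):
    """Iteratively regress landmark positions onto a Point Distribution Model:
    shape ~ mean_shape + basis @ b, with b re-solved by least squares each pass."""
    x = points.ravel()
    for _ in range(n_iters):
        b = np.linalg.lstsq(basis, x - mean_shape, rcond=None)[0]
        x = mean_shape + basis @ b       # project back onto the model subspace
    return x.reshape(-1, 2)
```

In a full pipeline, `soft_argmax` would be applied to each channel of the CNN output, and `pdm_fit` would constrain the resulting points to plausible face shapes.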
S130, acquiring the initial position of the facial feature point of the second frame image in the image frame sequence by using the pixel position of the facial feature point in the first frame image, and realizing face tracking.
Based on the pixel positions of the facial feature points in the first frame image, the initial positions of the facial feature points in the second frame image can be predicted; the pixel positions of the facial feature points in the second frame image are then obtained by the same combination of convolutional neural network and PDM. The pixel positions of the facial feature points in each subsequent frame of the image frame sequence are predicted in turn, realizing face tracking and recognition.
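Steps S110 to S130 amount to a per-frame loop in which each frame's alignment result seeds the next frame's initial guess. A minimal sketch, with the alignment and prediction functions injected as placeholders (none of these names come from the patent):

```python
def track_faces(frames, align, predict_next_init, first_init):
    """Tracking loop: align the first frame from its initial landmark guess,
    then seed every following frame from the previous frame's result."""
    init = first_init
    results = []
    for frame in frames:
        pts = align(frame, init)       # e.g. CNN heat map + PDM regression
        results.append(pts)
        init = predict_next_init(pts)  # e.g. relaxation-based prediction
    return results
```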
In this embodiment, the initial positions of the facial feature points in the second frame image are predicted from the facial feature points of the first frame image, and when the initial positions in each frame are obtained, the specific pixel positions of the facial feature points are identified by combining a convolutional neural network with a PDM. This realizes inter-frame face alignment tracking, avoids instability and jitter of facial feature points between frames, and improves both tracking accuracy and tracking speed.
The above steps S110 to S130 will be described in detail.
First, step S110 is performed, i.e., an initial position of a facial feature point of a first frame image in the captured image frame sequence is acquired.
In some embodiments, the initial position of the facial feature point in the first frame image is obtained as follows: extracting face characteristic points from a pre-constructed average face image; determining a point location of a facial feature point on the average face as an initial location of the facial feature point in the first frame image.
The method comprises the steps of obtaining facial feature points of each face training sample in a face training sample set, wherein the facial feature points of each face training sample in the face training sample set are calibrated; constructing a mapping matrix from the face training samples to an average face model according to the face feature points of each face training sample; and respectively superposing the facial feature points of all the face training samples to the average face model according to the mapping matrix to obtain the average face image, and determining the superposed facial feature points as the facial feature points of the average face image.
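The "mapping matrix" step is not spelled out in the patent; one common choice is a per-sample similarity transform (Procrustes alignment) that superimposes each training sample's landmarks onto a reference before averaging. A sketch under that assumption, with hypothetical names throughout:

```python
import numpy as np

def similarity_align(src, dst):
    """Least-squares similarity transform (scale, rotation, translation)
    mapping landmark set src onto dst -- one candidate 'mapping matrix'."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    u, s, vt = np.linalg.svd(src_c.T @ dst_c)   # Kabsch/Umeyama solution
    r = (u @ vt).T
    scale = s.sum() / (src_c ** 2).sum()
    return lambda p: scale * (p - src.mean(0)) @ r.T + dst.mean(0)

def average_face(samples):
    """Superimpose every training sample's landmarks onto the first sample
    and average the aligned points to obtain the mean-face landmarks."""
    ref = samples[0]
    mapped = [similarity_align(s, ref)(s) for s in samples]
    return np.mean(mapped, axis=0)
```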
After acquiring the initial positions of the facial feature points of the first frame image, continuing to execute step S120, that is, taking the initial positions of the facial feature points as the input of a convolutional neural network, wherein the convolutional neural network outputs a position probability heat map of the facial feature points, and performing iterative regression processing on the position probability heat map of the facial feature points by using a point distribution model to obtain position information of the facial feature points in the first frame image; wherein the location probability heat map represents a probability that the facial feature point is at each pixel location in the first frame image.
This embodiment first uses a convolutional neural network to compute the position probability heat map of the facial feature points; using a convolutional neural network improves computation speed. The PDM is a model that obtains the specific position of each point by iterative regression over its position probability heat map. After the network computes the heat map, the PDM performs iterative regression on it to determine the specific pixel positions of the facial feature points.
After the position information of the facial feature points in the first frame image is obtained, step S130 is continuously executed, that is, the initial positions of the facial feature points in the second frame image in the image frame sequence are obtained by using the position information of the facial feature points in the first frame image, so as to implement face tracking.
In some embodiments, the initial positions of the facial feature points in the second frame image are obtained by: performing face detection on the second frame image by using a multitask cascade convolution network or by using a machine learning tool Dlib; when a face is detected in the second frame image, the initial position of the facial feature point in the second frame image is obtained according to the position information of the facial feature point in the first frame image and a preset relaxation amount, wherein the preset relaxation amount represents the position change of the same feature point in the adjacent frame image.
In one example of this embodiment, the initial position of the facial feature point in the second frame image may be acquired according to x_i' = x_i + α·dx_mon, where x_i' is the initial position of facial feature point i in the second frame image, x_i is the position information of facial feature point i in the first frame image, i is a natural number, α (0 < α < 1) is a preset adjustment factor, and dx_mon is the preset relaxation amount.
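The update formula itself is rendered as an image in the source, so the exact expression is uncertain; read from the surrounding definitions (previous-frame position, preset relaxation amount dx_mon, adjustment factor 0 < α < 1), it amounts to an additive, damped offset. A sketch under that assumption, with hypothetical names:

```python
import numpy as np

def predict_initial_positions(prev_positions, dx_mon, alpha=0.5):
    """Seed the next frame's landmark search from the previous frame:
    x_i' = x_i + alpha * dx_mon, where dx_mon is the preset per-frame
    relaxation (expected inter-frame motion) and alpha damps it."""
    assert 0.0 < alpha < 1.0
    return np.asarray(prev_positions, dtype=float) + alpha * np.asarray(dx_mon, dtype=float)
```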
When face detection is performed on the second frame image using the multi-task cascaded convolutional network or the machine learning tool Dlib and no face is detected, facial feature points are extracted from a pre-constructed average face image, and the point locations of the facial feature points on the average face are determined as the initial positions of the facial feature points in the second frame image.
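The detect-then-fall-back logic can be expressed with the detector injected as a callable (MTCNN and Dlib detectors both fit this shape); all names below are illustrative, not from the patent:

```python
def initial_landmarks(frame, prev_landmarks, detect_face,
                      average_face_pts, predict_from_prev):
    """Pick the initial landmark positions for a frame: if a face is
    detected, predict from the previous frame's landmarks; otherwise
    fall back to the pre-built average-face point locations."""
    if detect_face(frame):
        return predict_from_prev(prev_landmarks)
    return average_face_pts
```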
The method comprises the steps of obtaining facial feature points of each face training sample in a face training sample set, wherein the facial feature points of each face training sample in the face training sample set are calibrated; constructing a mapping matrix from the face training samples to an average face model according to the face feature points of each face training sample; respectively superposing the facial feature points of all the face training samples to the average face model according to the mapping matrix to obtain the average face image, and determining the superposed facial feature points as the facial feature points of the average face image; determining a point location of a facial feature point on the average face as an initial location of the facial feature point in the second frame image.
The invention also provides a camera.
Fig. 2 is a block diagram of a camera according to an embodiment of the present invention. As shown in fig. 2, the camera of this embodiment includes a camera and a processor, wherein:
the camera is used for collecting an image frame sequence of the face of the user and sending the image frame sequence to the processor;
the processor is used for acquiring the initial position of the facial feature point of the first frame image in the image frame sequence; taking the initial positions of the facial feature points as the input of a convolutional neural network, outputting a position probability heat map of the facial feature points by the convolutional neural network, and performing iterative regression processing on the position probability heat map of the facial feature points by adopting a point distribution model to obtain position information of the facial feature points in the first frame image; wherein the location probability heat map represents a probability that the facial feature point is at each pixel location in the first frame image; and acquiring the initial position of the facial feature point of the second frame image in the image frame sequence by utilizing the position information of the facial feature point in the first frame image, thereby realizing the face tracking.
In this embodiment, the initial positions of the facial feature points in the second frame image are predicted from the facial feature points of the first frame image, and when the initial positions in each frame are obtained, the specific pixel positions of the facial feature points are identified by combining a convolutional neural network with a PDM. This realizes inter-frame face alignment tracking, avoids instability and jitter of facial feature points between frames, and improves both tracking accuracy and tracking speed.
In some embodiments, the processor further performs face detection on the second frame image by using a multitask cascade convolution network or by using a machine learning tool Dlib; when a face is detected in the second frame image, the initial position of the facial feature point in the second frame image is obtained according to the position information of the facial feature point in the first frame image and a preset relaxation amount, wherein the preset relaxation amount represents the position change of the same feature point in the adjacent frame image.
In one example of the embodiment, the processor acquires the initial position of the facial feature point in the second frame image according to x_i' = x_i + α·dx_mon, where x_i' is the initial position of facial feature point i in the second frame image, x_i is the position information of facial feature point i in the first frame image, i is a natural number, α (0 < α < 1) is a preset adjustment factor, and dx_mon is the preset relaxation amount.
In some embodiments, when no face is detected in the second frame image, the processor extracts facial feature points from a pre-constructed average face image and determines the point locations of the facial feature points on the average face as the initial positions of the facial feature points in the second frame image.
In some embodiments, the processor extracts facial feature points from a pre-constructed average face image, and determines point locations of the facial feature points on the average face as initial positions of the facial feature points in the first frame image.
Specifically, the processor acquires the facial feature points of each face training sample in a face training sample set, wherein the facial feature points of each face training sample in the set are calibrated; constructs a mapping matrix from each face training sample to an average face model according to its facial feature points; and superposes the facial feature points of all the face training samples onto the average face model according to the mapping matrix to obtain the average face image, determining the superposed facial feature points as the facial feature points of the average face image.
For the camera embodiment, since it basically corresponds to the method embodiment, the relevant points may be referred to the partial description of the method embodiment. The above-described camera embodiments are merely illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
For the convenience of clearly describing the technical solutions of the embodiments of the present invention, in the embodiments of the present invention, the words "first", "second", and the like are used to distinguish the same items or similar items with basically the same functions and actions, and those skilled in the art can understand that the words "first", "second", and the like do not limit the quantity and execution order.
While the foregoing is directed to embodiments of the present invention, other modifications and variations of the present invention may be devised by those skilled in the art in light of the above teachings. It should be understood by those skilled in the art that the foregoing detailed description is for the purpose of better explaining the present invention, and the scope of the present invention should be determined by the scope of the appended claims.
Claims (10)
1. A face tracking method, comprising:
acquiring an initial position of a facial feature point of a first frame image in an acquired image frame sequence;
taking the initial positions of the facial feature points as the input of a convolutional neural network, the convolutional neural network outputting a position probability heat map of the facial feature points, and performing iterative regression processing on the position probability heat map of the facial feature points by adopting a point distribution model to obtain the pixel positions of the facial feature points in the first frame image; wherein the position probability heat map represents the probability that a facial feature point is at each pixel location in the first frame image;
acquiring the initial position of the facial feature point of a second frame image in the image frame sequence by using the pixel position of the facial feature point in the first frame image, so as to realize face tracking, specifically: obtaining the initial position of the facial feature point in the second frame image according to the position information of the facial feature point in the first frame image and a preset relaxation amount, wherein the preset relaxation amount represents the position change of the same feature point between adjacent frame images.
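The heat-map read-out step of claim 1 (take the pixel with the highest probability in each feature point's heat map) can be illustrated with a minimal NumPy sketch. This is only an illustration of that one step, not the claimed point-distribution-model regression, and the function name and toy data are hypothetical:

```python
import numpy as np

def landmarks_from_heatmaps(heatmaps):
    """Pick, for each facial feature point, the pixel with the
    highest probability in its heat map (shape: K x H x W)."""
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1).argmax(axis=1)
    # convert flat indices back to (row, col) pixel positions
    return np.stack([flat // w, flat % w], axis=1)

# toy heat maps for 2 feature points on an 8x8 image
hm = np.zeros((2, 8, 8))
hm[0, 3, 5] = 1.0   # point 0 most likely at row 3, col 5
hm[1, 6, 1] = 1.0   # point 1 most likely at row 6, col 1
print(landmarks_from_heatmaps(hm))  # [[3 5] [6 1]]
```

In the patented method this read-out is refined by iterative regression with a point distribution model, which constrains the points to plausible face shapes; the argmax above is only the starting point of that refinement.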
2. The method of claim 1, further comprising, before obtaining an initial position of a facial feature point of a second frame image in the sequence of image frames using a pixel position of the facial feature point in a first frame image:
performing face detection on the second frame image by using a multitask cascaded convolutional network (MTCNN) or by using the machine learning toolkit Dlib;
when a human face is detected in the second frame image, acquiring the initial position of the facial feature point in the second frame image according to the position information of the facial feature point in the first frame image and a preset relaxation amount.
3. The method according to claim 2, wherein the initial position of the facial feature point in the second frame image is obtained according to the position information of the facial feature point in the first frame image and a preset relaxation amount as:

x_i' = x_i + α · dx_mon

wherein x_i' is the initial position of the facial feature point i in the second frame image, x_i is the position information of the facial feature point i in the first frame image, i is a natural number greater than 1, α is a preset adjustment factor with 0 < α < 1, and dx_mon is the preset relaxation amount.
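Under the natural reading of this claim (next frame's initial position = previous frame's position plus α times the relaxation amount; the exact formula appears only as an image in the original patent), a minimal NumPy sketch might look as follows. The function name is hypothetical and α = 0.5 is an arbitrary example value:

```python
import numpy as np

def next_frame_init(prev_positions, dx_mon, alpha=0.5):
    """Initial feature-point positions for the next frame, obtained
    from the previous frame's positions relaxed by alpha * dx_mon,
    with 0 < alpha < 1 as required by the claim."""
    assert 0.0 < alpha < 1.0
    return np.asarray(prev_positions) + alpha * np.asarray(dx_mon)

# two feature points at (100, 120) and (140, 118) in the first frame,
# with a relaxation amount of (2, -1) pixels
prev = np.array([[100.0, 120.0], [140.0, 118.0]])
print(next_frame_init(prev, dx_mon=[2.0, -1.0], alpha=0.5))
```

With these example values each point moves by half the relaxation amount, i.e. to (101, 119.5) and (141, 117.5), which then seed the next frame's regression.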
4. The method according to claim 2, wherein, after performing face detection on the second frame image by using a multitask cascaded convolutional network or by using the machine learning toolkit Dlib, the method further comprises:
when no human face is detected in the second frame image, extracting facial feature points from a pre-constructed average face image;
determining the point location of a facial feature point on the average face image as the initial position of the facial feature point in the second frame image.
5. The method of claim 1, wherein the obtaining of the initial position of the facial feature point of the first frame image in the sequence of image frames comprises:
extracting facial feature points from a pre-constructed average face image;
determining the point location of a facial feature point on the average face image as the initial position of the facial feature point in the first frame image.
6. The method according to claim 4 or 5, wherein the extracting facial feature points from the pre-constructed average face image comprises:
acquiring facial feature points of each face training sample in a face training sample set, wherein the facial feature points of each face training sample in the face training sample set are calibrated;
constructing a mapping matrix from each face training sample to an average face model according to the facial feature points of that face training sample;
and respectively superposing the facial feature points of all the face training samples to the average face model according to the mapping matrix to obtain the average face image, and determining the superposed facial feature points as the facial feature points of the average face image.
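The average-face construction of claim 6 resembles a generalized Procrustes superposition: map each training sample's feature points onto a common model via a similarity transform, then average the superposed points. A minimal NumPy sketch under that assumption (the patent's actual mapping matrix may differ; `align` and `average_face` are hypothetical names, and the reflection case of the SVD is ignored):

```python
import numpy as np

def align(src, dst):
    """Similarity (Procrustes-style) alignment of one sample's
    landmark set onto a reference shape."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    s, d = src - mu_s, dst - mu_d
    u, sig, vt = np.linalg.svd(s.T @ d)
    rot = u @ vt                        # optimal rotation
    scale = sig.sum() / (s ** 2).sum()  # optimal isotropic scale
    return scale * s @ rot + mu_d

def average_face(samples):
    """Superpose every sample onto the first one and average the
    superposed feature points, as described in claim 6."""
    ref = samples[0]
    return np.mean([align(s, ref) for s in samples], axis=0)

# reference shape plus a scaled, rotated, shifted copy of it
ref = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, -1.0], [-1.0, 0.0]])
rot90 = np.array([[0.0, 1.0], [-1.0, 0.0]])
samples = [ref, 2.0 * ref @ rot90 + 5.0]
print(np.allclose(average_face(samples), ref))  # True
```

Because the second sample is an exact similarity copy of the first, alignment maps it back onto the reference and the average reproduces the reference shape; with real training data the average smooths over per-sample variation instead.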
7. A camera, comprising: a camera and a processor;
the camera collects an image frame sequence of the face of the user and sends the image frame sequence to the processor;
the processor is used for acquiring the initial position of the facial feature point of the first frame image in the image frame sequence; taking the initial positions of the facial feature points as the input of a convolutional neural network, outputting a position probability heat map of the facial feature points by the convolutional neural network, and performing iterative regression processing on the position probability heat map of the facial feature points by adopting a point distribution model to obtain position information of the facial feature points in the first frame image; wherein the location probability heat map represents a probability that the facial feature point is at each pixel location in the first frame image; acquiring the initial position of the facial feature point of the second frame image in the image frame sequence by using the position information of the facial feature point in the first frame image, specifically: and obtaining the initial position of the facial feature point in the second frame image according to the position information of the facial feature point in the first frame image and a preset relaxation amount to realize face tracking, wherein the preset relaxation amount represents the position change of the same feature point in the adjacent frame images.
8. The camera according to claim 7, wherein the processor further performs face detection on the second frame image by using a multitask cascaded convolutional network or by using the machine learning toolkit Dlib; and, when a human face is detected in the second frame image, acquires the initial position of the facial feature point in the second frame image according to the position information of the facial feature point in the first frame image and a preset relaxation amount.
9. The camera according to claim 8, wherein the processor extracts facial feature points from a pre-constructed average face image when no human face is detected in the second frame image, and determines the point location of a facial feature point on the average face image as the initial position of the facial feature point in the second frame image.
10. The camera according to claim 7, wherein the processor acquires facial feature points of each face training sample in a face training sample set, the facial feature points of each face training sample in the set being pre-calibrated; constructs a mapping matrix from each face training sample to an average face model according to that sample's facial feature points; and superposes the facial feature points of all the face training samples onto the average face model according to the mapping matrix to obtain the average face image, determining the superposed facial feature points as the facial feature points of the average face image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910361317.0A CN110210306B (en) | 2019-04-30 | 2019-04-30 | Face tracking method and camera |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110210306A CN110210306A (en) | 2019-09-06 |
CN110210306B true CN110210306B (en) | 2021-09-14 |
Family
ID=67786832
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910361317.0A Active CN110210306B (en) | 2019-04-30 | 2019-04-30 | Face tracking method and camera |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110210306B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2672425A1 (en) * | 2012-06-08 | 2013-12-11 | Realeyes OÜ | Method and apparatus with deformable model fitting using high-precision approximation |
CN103714331A (en) * | 2014-01-10 | 2014-04-09 | 南通大学 | Facial expression feature extraction method based on point distribution model |
CN105512627A (en) * | 2015-12-03 | 2016-04-20 | 腾讯科技(深圳)有限公司 | Key point positioning method and terminal |
CN109241910A (en) * | 2018-09-07 | 2019-01-18 | 高新兴科技集团股份有限公司 | A kind of face key independent positioning method returned based on the cascade of depth multiple features fusion |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10896318B2 (en) * | 2017-09-09 | 2021-01-19 | Apple Inc. | Occlusion detection for facial recognition processes |
Non-Patent Citations (3)
Title |
---|
"Constrained Local Neural Fields for robust facial landmark detection in the wild"; Tadas Baltrusaitis, et al.; ICCV 2013; 2013-12-31; pp. 354-361 * |
"Deep Alignment Network: A convolutional neural network for robust face alignment"; Marek Kowalski, et al.; arXiv:1706.01789v2; 2017-08-10; pp. 1-10 * |
"A Compact Facial Landmark Detection Network with Global Constraints"; Zhang Wei, et al.; Journal of Signal Processing; 2019-03-31; Vol. 35, No. 3, pp. 507-515 * |
Also Published As
Publication number | Publication date |
---|---|
CN110210306A (en) | 2019-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7236545B2 (en) | Video target tracking method and apparatus, computer apparatus, program | |
CN109214343B (en) | Method and device for generating face key point detection model | |
KR102150776B1 (en) | Face location tracking method, apparatus and electronic device | |
CN107274433B (en) | Target tracking method and device based on deep learning and storage medium | |
WO2018188453A1 (en) | Method for determining human face area, storage medium, and computer device | |
US11238272B2 (en) | Method and apparatus for detecting face image | |
JP6694829B2 (en) | Rule-based video importance analysis | |
US20230030267A1 (en) | Method and apparatus for selecting face image, device, and storage medium | |
WO2020024484A1 (en) | Method and device for outputting data | |
CN109308469B (en) | Method and apparatus for generating information | |
US10620826B2 (en) | Object selection based on region of interest fusion | |
US8903130B1 (en) | Virtual camera operator | |
US20150269739A1 (en) | Apparatus and method for foreground object segmentation | |
CN111104925B (en) | Image processing method, image processing apparatus, storage medium, and electronic device | |
CN109271929B (en) | Detection method and device | |
CN112132847A (en) | Model training method, image segmentation method, device, electronic device and medium | |
CN113887547B (en) | Key point detection method and device and electronic equipment | |
CN109767453A (en) | Information processing unit, background image update method and non-transient computer readable storage medium | |
US20200401811A1 (en) | Systems and methods for target identification in video | |
CN112149615A (en) | Face living body detection method, device, medium and electronic equipment | |
CN112101109B (en) | Training method and device for face key point detection model, electronic equipment and medium | |
CN110856014B (en) | Moving image generation method, moving image generation device, electronic device, and storage medium | |
CN110633630B (en) | Behavior identification method and device and terminal equipment | |
CN112732553A (en) | Image testing method and device, electronic equipment and storage medium | |
CN110210306B (en) | Face tracking method and camera |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||