CN108229432A - Face calibration method and device - Google Patents
- Publication number: CN108229432A (application CN201810096476.8A)
- Authority: CN (China)
- Legal status: Pending (an assumption, not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/164—Detection; Localisation; Normalisation using holistic features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
The embodiment of the present application provides a face calibration method and device, wherein the method includes: processing a face picture according to a first neural network model to determine a batch of face regions; processing the batch of face regions according to a second neural network model to filter out non-face regions from the batch of face regions; processing the batch of face regions with non-faces filtered according to a third neural network model to determine the unique face region at time T; and tracking the unique face region at time T according to a face tracking model to determine the unique face region at time T+1. Because multiple neural network models apply multi-layer convolution operations, face feature points are extracted efficiently and accurately, greatly improving the robustness of face calibration; further, when a feedback mechanism is introduced, calibration efficiency and stability are improved.
Description
Technical Field
The embodiment of the application relates to the technical field of image processing, in particular to a face calibration method and device.
Background
Neural networks have made great breakthroughs in the field of image recognition, driving the rapid development of face calibration as an image application. They give face calibration higher stability when coping with changes in pose, illumination and expression, and have promoted its wide application in more and more fields such as entertainment and security.
Face calibration is mainly divided into two stages: face detection and face characterization. In the face detection stage, given any picture, it is judged whether one or more faces exist in the picture, and the position area of each face is returned. Research on face detection focused mainly on template matching, subspace methods and the like in the early stage, and later on data-driven methods such as statistical model methods and neural network learning methods. Most typically, Viola and Jones (VJ for short) obtained a face detector with very good real-time performance through a cascade classifier trained with Haar-like features and AdaBoost. But in real complex environments, with variable face sizes, diverse poses, harsh illumination conditions, low resolution and so on, the classic VJ face detector often performs poorly. Recently, more and more face detection algorithms based on convolutional neural networks (CNNs) have emerged, showing stronger robustness and higher detection accuracy, such as FacenessNet, DCNN, etc.
Face characterization mainly aligns the face and extracts its features, locating the positions of key areas such as the eyebrows, eyes, mouth, nose and face contour; this is also called face key point detection. Common face alignments at present are 5-point alignment and 68-point alignment. Face alignment can be applied to facial feature positioning, expression recognition, face caricature generation, augmented reality, face swapping and the like. Methods for detecting face key points fall into three types: 1. traditional methods based on ASM (Active Shape Model) and AAM (Active Appearance Model); 2. methods based on cascaded shape regression; 3. methods based on deep learning. Although the traditional models are simple, easy to understand and easy to apply, they depend strongly on the model and have poor robustness. Therefore, most people use deep-learning-based methods to detect face key points.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a method and an apparatus for face calibration, which are used to overcome or alleviate the above-mentioned drawbacks in the prior art.
The embodiment of the application provides a face region calibration method, which comprises the following steps:
processing the face pictures according to the first neural network model to determine a batch of face regions;
processing the batch of face regions according to a second neural network model to filter out non-face regions from the batch of face regions;
processing the batch of face regions with the non-faces filtered according to a third neural network model to determine a unique face region at the time T;
and tracking the unique face region at the moment T according to the face tracking model to determine the unique face region at the moment T + 1.
Optionally, in any embodiment of the present application, the method further includes:
acquiring an acquired original face picture, and carrying out scaling processing on the original face picture to obtain image pyramids with different sizes;
and taking the image pyramids with different sizes as the input of the first neural network model, so that the first neural network model processes the face pictures to determine a batch of face regions.
Optionally, in any embodiment of the present application, acquiring an acquired original face picture includes: and acquiring an original face picture acquired by an image acquisition unit arranged on the electronic terminal through a development interface of the electronic terminal.
Optionally, in any embodiment of the present application, processing the face pictures according to the first neural network model to determine a batch of face regions includes: and processing the face pictures in sequence according to different convolution layers and convolution kernels configured in the first neural network model to determine a batch of face regions.
Optionally, in any embodiment of the present application, successively processing the face pictures according to different convolution layers and convolution kernels configured in the first neural network model to determine a batch of face regions includes: processing the face picture according to different convolution layers and convolution kernels configured in the first neural network model to respectively obtain a plurality of candidate face area frames; and determining a batch of face regions according to the overlapping of the candidate face region frames and the set overlapping threshold value.
Optionally, in any embodiment of the present application, processing the batch of face regions according to a second neural network model to filter out non-face regions from the batch of face regions includes: and processing the face regions in sequence according to different convolution layers and convolution kernels configured in the second neural network model so as to filter out non-face regions from the face regions.
Optionally, in any embodiment of the present application, successively processing the batch of face regions according to different convolution layers and convolution kernels configured in the second neural network model to filter out non-face regions from the batch of face regions includes: processing the face regions in the batch in sequence according to different convolution layers and convolution kernels configured in a second neural network model to respectively obtain a plurality of candidate face region frames; and filtering out non-face regions from the batch of face regions according to the overlapping of the candidate face region frames and a set overlapping threshold value.
Optionally, in any embodiment of the present application, processing the batch of face regions with non-faces filtered according to a third neural network model to determine a unique face region at time T includes: and processing the batch of face regions with the non-faces filtered according to different convolution layers and convolution kernels configured in the third neural network model to determine the unique face region and the position of a face key point at the time T.
Optionally, in any embodiment of the present application, tracking the unique face region at time T according to a face tracking model to determine the unique face region at time T +1 includes: and tracking the unique face region at the time T according to a position filter and a scale filter in the face tracking model to determine the unique face region at the time T + 1.
Optionally, in any embodiment of the present application, the method further includes: and judging whether the unique face area tracking is successful or not according to the unique face area at the moment T and the unique face area at the moment T + 1.
Optionally, in any embodiment of the present application, determining whether the unique face region tracking is successful according to the unique face region at the time T and the unique face region at the time T +1 includes: if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is equal to or greater than a set overlap threshold, judging that the unique face region is successfully tracked; or if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is smaller than the set overlap threshold, determining that the unique face region tracking fails.
Optionally, in any embodiment of the present application, if it is determined that the unique face region tracking is successful, the unique face region at the time T +1 is used as an input of the third neural network model, so as to process the batch of face regions with non-faces filtered, so as to determine the unique face region at the time T + 2.
Optionally, in any embodiment of the present application, if it is determined that the unique face region tracking fails, the step of processing the face pictures according to the first neural network model is skipped to determine a batch of face regions again.
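A minimal sketch of this tracking feedback test, under the assumptions that face regions are (x1, y1, x2, y2) boxes and that the translated criterion is read as "the overlap meets or exceeds the threshold" (the helper name and default threshold are hypothetical, not from the application):

```python
def tracking_succeeded(box_t, box_t1, threshold=0.5):
    """Compare the unique face regions at time T and T+1 by their overlap (IOU)."""
    # Intersection rectangle of the two boxes (empty if they do not overlap).
    ix1, iy1 = max(box_t[0], box_t1[0]), max(box_t[1], box_t1[1])
    ix2, iy2 = min(box_t[2], box_t1[2]), min(box_t[3], box_t1[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((box_t[2] - box_t[0]) * (box_t[3] - box_t[1])
             + (box_t1[2] - box_t1[0]) * (box_t1[3] - box_t1[1]) - inter)
    # Success when the overlap meets the set threshold; otherwise re-detection
    # from the first neural network model would be triggered.
    return inter / union >= threshold
```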
The embodiment of the present application further provides a face region calibration apparatus, which includes:
the first program unit is used for processing the face pictures according to the first neural network model so as to determine a batch of face regions;
the second program unit is used for processing the batch of face regions according to a second neural network model so as to filter out non-face regions from the batch of face regions;
a third program unit, configured to process the batch of face regions with non-faces filtered according to a third neural network model, so as to determine a unique face region at time T;
and the fourth program unit is used for tracking the unique face area at the time T according to the face tracking model so as to determine the unique face area at the time T + 1.
Optionally, in any embodiment of the present application, the method further includes:
the conversion unit is used for acquiring an acquired original face picture and carrying out zooming processing on the original face picture to obtain image pyramids with different sizes;
and the input unit is used for taking the image pyramids with different sizes as the input of the first neural network model, so that the first neural network model processes the face pictures to determine a batch of face regions.
Optionally, in any embodiment of the present application, the first program unit is further configured to process the face pictures sequentially according to different convolution layers and convolution kernels configured in the first neural network model, so as to determine a batch of face regions.
Optionally, in any embodiment of the present application, the first program unit is further configured to process the face picture successively according to different convolution layers and convolution kernels configured in the first neural network model, so as to obtain a plurality of candidate face region frames respectively; and determining a batch of face regions according to the overlapping of the candidate face region frames and the set overlapping threshold value.
Optionally, in any embodiment of the present application, the second program unit is further configured to process the batch of face regions in sequence according to different convolution layers and convolution kernels configured in the second neural network model, so as to filter out non-face regions from the batch of face regions.
Optionally, in any embodiment of the present application, the second program unit is further configured to process the batch of face regions successively according to different convolution layers and convolution kernels configured in the second neural network model, so as to obtain a plurality of candidate face region frames respectively; and filtering out non-face regions from the batch of face regions according to the overlapping of the candidate face region frames and a set overlapping threshold value.
Optionally, in any embodiment of the present application, the third program unit is further configured to sequentially process the batch of face regions with non-faces filtered according to different convolution layers and convolution kernels configured in the third neural network model, so as to determine a unique face region and a face key point position at time T.
Optionally, in any embodiment of the present application, the fourth program unit is further configured to track the unique face region at time T according to a position filter and a scale filter in the face tracking model, so as to determine the unique face region at time T + 1.
Optionally, in any embodiment of the present application, the method further includes: and the feedback unit is used for judging whether the unique face area tracking is successful according to the unique face area at the moment T and the unique face area at the moment T + 1.
Optionally, in any embodiment of the present application, the feedback unit is further configured to determine that the unique face region is successfully tracked if the overlap between the face region frames of the unique face region at the time T and the unique face region at the time T +1 is equal to or greater than a set overlap threshold; or, if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is smaller than the set overlap threshold, to determine that the unique face region tracking fails.
Optionally, in any embodiment of the application, if it is determined that the unique face region tracking is successful, the feedback unit is further configured to use the unique face region at the time T +1 as an input of the third neural network model, so as to process the batch of face regions with non-faces filtered, so as to determine the unique face region at the time T + 2.
The embodiment of the present application further provides an electronic device, which includes the face region calibration apparatus in any one of the above embodiments.
In the embodiment of the application, the face pictures are processed according to the first neural network model to determine a batch of face regions; the batch of face regions is processed according to a second neural network model to filter out non-face regions; the batch of face regions with non-faces filtered is processed according to a third neural network model to determine the unique face region at time T; and the unique face region at time T is tracked according to the face tracking model to determine the unique face region at time T+1. Because the multiple neural network models apply multi-layer convolution operations, face feature points are extracted efficiently and accurately, which greatly improves the robustness of face calibration and makes it insensitive to complex environments. Meanwhile, compared with the most advanced neural network algorithms at present, prediction efficiency is improved by 300% without affecting accuracy. Further, when a feedback mechanism is introduced, calibration efficiency and stability are improved.
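The four-stage flow summarized above (detect a batch of regions, filter non-faces, refine to the unique region, then track, with re-detection on tracking failure) can be sketched as a driver loop. The `net1`/`net2`/`net3`/`tracker`/`make_pyramid` callables are hypothetical stand-ins for the models described in the application, not defined by it:

```python
def calibrate_stream(frames, net1, net2, net3, tracker, make_pyramid):
    """Yield the unique face region per frame, falling back to full detection
    whenever the tracker reports failure (returns None)."""
    face = None
    for frame in frames:
        if face is not None:
            face = tracker(frame, face)              # track the region from T to T+1
        if face is None:                             # first frame, or tracking failed
            candidates = net1(make_pyramid(frame))   # batch of candidate face regions
            faces = net2(candidates)                 # non-face regions filtered out
            face = net3(faces)                       # unique region (with keypoints)
        yield face
```

The feedback is in the fallback branch: a failed track on frame T+1 restarts the cascade from the first model, while a successful track feeds the T+1 region forward.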
Drawings
Some specific embodiments of the present application will be described in detail hereinafter by way of illustration and not limitation with reference to the accompanying drawings. The same reference numbers in the drawings identify the same or similar elements or components. Those skilled in the art will appreciate that the drawings are not necessarily drawn to scale. In the drawings:
fig. 1 is a schematic flow chart of a human face region calibration method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a human face region calibration device in the second embodiment of the present application;
fig. 3 is a schematic structural diagram of a face area calibration device in the third embodiment of the present application;
fig. 4 is a schematic structural diagram of a human face area calibration device in the fourth embodiment of the present application.
Detailed Description
It is not necessary for any particular embodiment of the invention to achieve all of the above advantages at the same time.
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application shall fall within the scope of the protection of the embodiments in the present application.
The following further describes specific implementations of embodiments of the present application with reference to the drawings of the embodiments of the present application.
Fig. 1 is a schematic flow chart of a human face region calibration method according to an embodiment of the present application; as shown in fig. 1, it includes:
s101, processing the face pictures according to a first neural network model to determine a batch of face regions;
optionally, in an embodiment, step S101 further includes: acquiring an acquired original face picture, and carrying out scaling processing on the original face picture to obtain image pyramids with different sizes; and taking the image pyramids with different sizes as the input of the first neural network model, so that the first neural network model processes the face pictures to determine a batch of face regions.
Taking an image acquisition device such as a camera in a mobile terminal as an example, permission to use the camera can be obtained through a corresponding development interface provided by the mobile terminal, and each frame shot by the camera is extracted. The image captured by the camera is typically an RGB three-channel image, which can be converted to a single-channel grayscale image using conventional conversion algorithms or conversion tools. Converting the camera image to grayscale greatly reduces irrelevant information in the image. Then, the grayscale image is resized (scaled) to generate a plurality of sub-image frames with different resolutions; that is, the grayscale image is resized into an image pyramid of different sizes and resolutions. For example, a 480 × 480 image is successively resized to 144 × 144, 43 × 43 and 13 × 13 using a scale factor of 0.3, yielding an image pyramid with four sizes and resolutions. An image pyramid is a multi-scale representation of an image, a structure that interprets an image at multiple resolutions: a series of images of progressively lower resolution, derived from the same original image and arranged in a pyramid shape. It is obtained by down-sampling; the higher the level, the smaller the image and the lower the resolution.
In practical use, however, the processing is not limited to this: the image may be processed in other manners to generate a plurality of sub-image frames with different resolutions, and the grayscale conversion may be omitted entirely, applying image processing such as scaling directly to generate the sub-image frames.
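A minimal sketch of the grayscale conversion and pyramid construction described above, assuming square NumPy image arrays; the helper names, the luminance weights and the nearest-neighbour resampling are illustrative choices, not specified by the application:

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an RGB frame to a single channel with standard luminance weights."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def build_pyramid(gray, scale=0.3, min_side=12):
    """Repeatedly shrink a square grayscale image, as in the
    480 -> 144 -> 43 -> 13 example (scale factor 0.3)."""
    levels = [gray]
    size = gray.shape[0]
    while round(size * scale) >= min_side:
        size = round(size * scale)
        prev = levels[-1]
        # Nearest-neighbour sampling: pick one source row/column per output pixel.
        idx = np.arange(size) * prev.shape[0] // size
        levels.append(prev[np.ix_(idx, idx)])
    return levels
```

With a 480 × 480 input this yields the four pyramid levels of the example above.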
Specifically, in the embodiment, the acquiring of the acquired original face picture in step S101 may specifically include: and acquiring an original face picture acquired by an image acquisition unit arranged on the electronic terminal through a development interface of the electronic terminal.
Specifically, in the embodiment, when the face pictures are processed according to the first neural network model in step S101 to determine a batch of face regions, the face pictures may be processed sequentially according to different convolution layers and convolution kernels configured in the first neural network model to determine a batch of face regions. In specific implementation, the face image may be processed sequentially according to different convolution layers and convolution kernels configured in the first neural network model to obtain a plurality of candidate face region frames respectively; and determining a batch of face regions according to the overlapping of the candidate face region frames and the set overlapping threshold value.
Illustratively, the first neural network model is a four-layer convolutional neural network model. The first layer is an input layer, and the obtained image pyramid can be used as its input; the second layer is a 3 × 3 convolution layer with a 5 × 10 convolution kernel, which extracts image features; the third layer is a 3 × 3 convolution layer with a 3 × 16 convolution kernel, which extracts image features again based on the extraction result of the second layer; the last layer is the output regression layer with a 1 × 12 convolution, which finally outputs a batch of face regions reflecting the following results: 1. whether each region is a face; 2. the face region frame position. It should be noted that, in practical applications, the structure is not limited to the above four-layer convolutional neural network model, and those skilled in the art may also adopt model structures with more layers according to actual needs.
If the first neural network model is a four-layer convolutional neural network model, each layer of the network outputs candidate face region frames. By calculating the overlap between the candidate face region frames and selecting those whose overlap is smaller than a set overlap threshold, the batch of face regions is obtained.
Specifically, the overlap of any two candidate face region frames can be calculated by the following formula. Let R_A denote the area of candidate face region box A and R_B the area of candidate face region box B; the overlap IOU (Intersection-over-Union) of the two candidate face region boxes is:

IOU = area(R_A ∩ R_B) / area(R_A ∪ R_B)

All region boxes with IOU < threshold (the set overlap threshold) are selected (i.e., the batch of face regions) and serve as input to the second neural network model in step S102.
S102, processing the batch of face regions according to a second neural network model to filter out non-face regions from the batch of face regions;
specifically, in this embodiment, when the batch of face regions are processed according to the second neural network model in S102 to filter out non-face regions from the batch of face regions, the batch of face regions may be processed in sequence according to different convolution layers and convolution kernels configured in the second neural network model to filter out the non-face regions from the batch of face regions.
Further, successively processing the batch of face regions according to different convolution layers and convolution kernels configured in the second neural network model in S102 to filter out non-face regions from the batch of face regions, including: processing the face regions in the batch in sequence according to different convolution layers and convolution kernels configured in a second neural network model to respectively obtain a plurality of candidate face region frames; and filtering out non-face regions from the batch of face regions according to the overlapping of the candidate face region frames and a set overlapping threshold value.
Illustratively, the second neural network model is a five-layer convolutional neural network model. The first layer is an input layer, which takes the generated batch of face regions as input; the second layer is a 3 × 3 convolution layer with an 11 × 28 convolution kernel, which extracts features of the face region images; the third layer is a 3 × 3 convolution layer with a 4 × 48 convolution kernel, which extracts face region features again based on the extraction result of the second layer; the fourth layer is a 2 × 2 convolution layer with a 3 × 64 convolution kernel, which performs feature extraction again based on the extraction result of the third layer; the fifth layer is a fully connected layer with 128 units, which finally outputs the batch of face regions with non-faces filtered, reflecting the following results: 1. whether each region is a face; 2. the face region frame position. It should be noted that, in practical applications, the structure is not limited to the above five-layer convolutional neural network model, and those skilled in the art may also adopt other lightweight model structures with different numbers of layers according to actual requirements.
Further, each layer of the network of the second neural network model outputs candidate face region frames, and non-face regions are filtered out from the batch of face regions by calculating the overlap between the candidate face region frames and screening out candidate face region frames whose overlap is smaller than the set overlap threshold.
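Overlap-based screening of candidate frames is commonly realized as intersection-over-union (IoU) non-maximum suppression; a minimal sketch, assuming boxes in (x1, y1, x2, y2) form and an illustrative threshold of 0.5:

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def filter_candidates(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring frame,
    drop candidates whose IoU with an already kept frame exceeds thresh."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep]
```

Two heavily overlapping candidates collapse to the higher-scoring one, while distant candidates survive independently.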
S103, processing the batch of face regions with the non-faces filtered according to a third neural network model to determine a unique face region at the moment T;
in this embodiment, when the batch of face regions with non-faces filtered are processed according to the third neural network model in step S103 to determine the unique face region at time T, the batch of face regions with non-faces filtered may be sequentially processed according to different convolution layers and convolution kernels configured in the third neural network model to determine the unique face region and the face key point position at time T.
Illustratively, the third neural network model is a six-layer convolutional neural network model: the first layer is an input layer, and the output of the second neural network model in step S102 is resized to 48 × 48 pictures as input. The second layer is a 3 × 3 convolution layer with a 23 × 32 convolution kernel, which performs feature extraction on the 48 × 48 pictures. The third layer is a 3 × 3 convolution layer with a 10 × 64 convolution kernel, which performs feature extraction based on the output of the second layer. The fourth layer is a 2 × 2 convolution layer with a 4 × 64 convolution kernel, which performs feature extraction based on the output of the third layer. The fifth layer is a 2 × 2 convolution layer with a 3 × 128 convolution kernel, which performs feature extraction based on the output of the fourth layer. The sixth layer is a fully connected layer that finally outputs the unique face region at time T, reflecting the following results: 1. whether the region is a human face; 2. the position of the face region frame; 3. the calibration region and key point positions of the final face frame.
S104, tracking the unique face area at the moment T according to the face tracking model to determine the unique face area at the moment T + 1;
in this embodiment, when the unique face region at the time T is tracked according to the face tracking model in step S104 to determine the unique face region at the time T +1, the unique face region at the time T is specifically tracked according to a position filter and a scale filter in the face tracking model to determine the unique face region at the time T + 1.
In this embodiment, the two filters are a position filter and a scale filter: the former locates the face in the current image frame, and the latter estimates the face scale in the current image frame. The two filters are relatively independent, so different feature types and feature calculation modes can be selected for training and testing. When tracking the target in a new image frame, the two-dimensional position filter is first used to determine the new candidate position of the target; the one-dimensional scale filter then takes the current center position of the target as the center point and evaluates candidate frames at different scales to find the best matching scale. The frame rate can reach 100+ fps with accuracy above 0.8, which fully meets the requirements of face calibration on a mobile terminal.
In step S104, according to the calibration result of the third neural network model to the face, the position and the scale of the face in the image frame to be processed are obtained; and determining the position and the scale of the human face in an image frame after the image frame to be processed according to the position and the scale and a preset position model and a scale model. Optionally, after determining the position and the scale of the human face in an image frame after the image frame to be processed, a preset position model and a preset scale model may be updated according to the determined position and scale.
The image frame after the image frame to be processed may be an image frame next to the current image frame, or may be an image frame after several frames apart.
Illustratively, the input (input) of the face tracking model includes: 1) an image i (t) at time t; 2) the face position P (t-1) and the scale S (t-1) of the previous frame; 3) the position models A _ trans (t-1), B _ trans (t-1) and the scale models A _ scale (t-1), B _ scale (t-1) of the previous frame. The output (output) comprises: 1) the face estimation position P (t) and the estimation scale S (t) of the current frame; 2) updated position models A _ trans (t), B _ trans (t) and scale models A _ scale (t), B _ scale (t).
Wherein the position model and the scale model can be determined by:
For a certain desired output image g, the relation between an input image f and a filter h can be expressed as the following formula (2):

$$g = f \star h \tag{2}$$

where $\star$ denotes cross-correlation.
According to the convolution theorem, the Fourier transform of a cross-correlation equals the pointwise product of the individual Fourier transforms; applying the Fourier transform to formula (2) gives the following formula (3):

$$\mathcal{F}(g) = \mathcal{F}(f) \odot \overline{\mathcal{F}(h)} \tag{3}$$

where $\mathcal{F}(\cdot)$ denotes the Fourier transform and $\overline{\mathcal{F}(h)}$ denotes the complex conjugate of $\mathcal{F}(h)$.
Writing $G$ for $\mathcal{F}(g)$, $F$ for $\mathcal{F}(f)$, and $\bar{H}$ for $\overline{\mathcal{F}(h)}$, formula (3) simplifies to the following formula (4):

$$G = F \odot \bar{H} \tag{4}$$
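Formula (4) can be verified numerically: the circular cross-correlation of f and h, taken to the Fourier domain, equals the product of F and the complex conjugate of H (shown here on a 1-D signal with numpy; the patent applies the same identity in 2-D):

```python
import numpy as np

# Check formula (4): the Fourier transform of the circular
# cross-correlation g[k] = sum_n f[(n + k) mod N] * h[n] equals
# F(f) times the complex conjugate of F(h).
rng = np.random.default_rng(0)
f = rng.standard_normal(8)
h = rng.standard_normal(8)

g_direct = np.array([np.dot(np.roll(f, -k), h) for k in range(8)])
g_fourier = np.fft.ifft(np.fft.fft(f) * np.conj(np.fft.fft(h))).real

print(np.allclose(g_direct, g_fourier))  # True
```

This identity is what lets the tracker train and evaluate its filters with a handful of FFTs instead of explicit sliding-window correlation.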
A linear least-squares error function is set as the following formula (5):

$$\varepsilon = \left\| \sum_{l=1}^{d} \bar{H}^{l} \odot F^{l} - G \right\|^{2} + \lambda \sum_{l=1}^{d} \left\| H^{l} \right\|^{2} \tag{5}$$

where $\varepsilon$ represents the error; $l = 1, \dots, d$, with $d$ the dimension of the feature vector of image $F$; $H^{l}$ is the filter for the $l$-th feature dimension; $F^{l}$ is the $l$-th feature channel of $F$; $\|\cdot\|$ denotes the Euclidean distance and $\|\cdot\|^{2}$ its square; the second term is a regularization term that reduces the over-fitting problem in the optimization; and $\lambda$ is the weight parameter of the regularization term.
Minimizing the error function of formula (5) yields the final solved filter as the following formula (6):

$$H^{l} = \frac{\bar{G} \odot F^{l}}{\sum_{k=1}^{d} \bar{F}^{k} \odot F^{k} + \lambda} \tag{6}$$

where $\bar{G}$ represents the complex conjugate of $G$, $\bar{F}^{k}$ represents the complex conjugate of $F^{k}$, and $k = 1, \dots, d$, with $d$ the dimension of the feature vector of image $F$.
The filter at a certain time $t$ is then computed with the running averages of the following formula (7):

$$H_{t}^{l} = \frac{A_{t}^{l}}{B_{t} + \lambda} \tag{7}$$

where

$$A_{t}^{l} = (1 - \eta)\, A_{t-1}^{l} + \eta\, \bar{G}_{t} \odot F_{t}^{l}, \qquad B_{t} = (1 - \eta)\, B_{t-1} + \eta \sum_{k=1}^{d} \bar{F}_{t}^{k} \odot F_{t}^{k}$$

and $t = 1, \dots, N$, with $N$ the number of image frames; $\eta$ is a training parameter that may be interpreted as a learning rate.
In performing position tracking, G, F, position models a _ trans () and B _ trans (), which are position-dependent, are acquired based on a position filter; in scale tracking, scale-dependent G, F, scale models A _ scale () and B _ scale () are obtained based on the scale filter.
The process of performing position estimation is as follows: a) on the current image frame I(t), sampling at twice the target size around the face position P(t-1) and scale S(t-1) of the previous image frame to obtain a sample Z_trans; b) calculating the position response from the position models A_trans(t-1) and B_trans(t-1) of the previous image frame according to:

$$y_{\mathrm{trans}} = \mathcal{F}^{-1}\left\{ \frac{\sum_{l=1}^{d} \bar{A}_{\mathrm{trans}}^{\,l} \odot Z_{\mathrm{trans}}^{\,l}}{B_{\mathrm{trans}} + \lambda} \right\}$$

c) obtaining the face position P(t) = argmax(y_trans).

Here $y_{\mathrm{trans}}$ represents the position filter response value, $\mathcal{F}^{-1}$ denotes the inverse discrete Fourier transform, $Z_{\mathrm{trans}}^{\,l}$ is the $l$-th feature channel of the sample extracted from the $t$-th frame, $l = 1, \dots, d$ with $d$ the dimension of the feature vector of the image, and $\lambda$ represents the weight parameter.
The procedure for scale estimation is as follows: a) extracting face samples Z_scale at different scales; b) calculating y_scale in the same manner as above, and obtaining the face scale S(t) = argmax(y_scale).
The process of performing the model update is as follows: a) extracting training samples f_trans and f_scale from the current image frame I(t), extracting the corresponding HOG features and grayscale features, and constructing a Gaussian response function of the corresponding scale; b) updating the position models A_trans(t-1), B_trans(t-1) and the scale models A_scale(t-1), B_scale(t-1) to obtain A_trans(t), B_trans(t), A_scale(t) and B_scale(t).
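The filter of formula (6), its running update of formula (7), and the response computation used for position and scale estimation can be sketched in 1-D numpy (a simplified single-channel version; the patent's tracker uses multi-channel HOG and gray features in 2-D, and the λ and η values here are illustrative):

```python
import numpy as np

LAMBDA, ETA = 0.01, 0.025  # regularization weight and learning rate (assumed)

def train(f, g):
    """Formula (6): per-frame numerator A and denominator B of the filter."""
    F, G = np.fft.fft(f), np.fft.fft(g)
    return np.conj(G) * F, np.conj(F) * F

def update(A, B, f, g, eta=ETA):
    """Formula (7): running average of numerator and denominator."""
    A_t, B_t = train(f, g)
    return (1 - eta) * A + eta * A_t, (1 - eta) * B + eta * B_t

def respond(A, B, z):
    """Response on a new sample z; the peak index gives the new target
    position (position filter) or the best matching scale (scale filter)."""
    y = np.fft.ifft(np.conj(A) * np.fft.fft(z) / (B + LAMBDA)).real
    return int(np.argmax(y))

# Toy example: target appearance f and desired Gaussian response g,
# both centred at index 16 of a 32-sample window.
x = np.arange(32)
f = np.exp(-0.5 * ((x - 16) / 3.0) ** 2)
g = np.exp(-0.5 * ((x - 16) / 2.0) ** 2)
A, B = train(f, g)
print(respond(A, B, np.roll(f, 5)))  # target shifted by 5 -> peak at 21
```

Because training, update, and response all run in the Fourier domain, each step costs only a few FFTs, which is what makes the 100+ fps frame rate mentioned above plausible.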
S105, judging whether the tracking is successful;
in this embodiment, in step S105, it may be specifically determined whether the unique face region tracking is successful according to the unique face region at the time T and the unique face region at the time T + 1.
In step S105, specifically, determining whether the unique face region tracking is successful according to the unique face region at the time T and the unique face region at the time T +1 includes: if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is equal to a set overlap threshold value, judging that the unique face region is successfully tracked; or if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is smaller than or larger than a set overlap threshold, determining that the unique face region tracking fails.
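A common reading of this test is that tracking succeeds when the overlap of the two face region frames reaches the set threshold; a minimal sketch under that assumption, with boxes in (x1, y1, x2, y2) form and an illustrative threshold of 0.6:

```python
def box_iou(a, b):
    """Overlap (intersection-over-union) of two face region frames
    given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def tracking_succeeded(face_t, face_t1, threshold=0.6):
    """S105: compare the unique face region at time T with the region at
    time T+1; the threshold value is an assumed example."""
    return box_iou(face_t, face_t1) >= threshold
```

A nearly identical frame at T+1 passes the test; a frame that has drifted far from the T region fails it and triggers re-detection.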
S106, if the tracking is successful, judging whether to continue to calibrate;
optionally, if it is determined that the unique face region tracking is successful, taking the unique face region at the time T +1 as an input of the third neural network model, so as to process the batch of face regions with non-faces filtered, so as to determine the unique face region at the time T + 2.
If the tracking fails, the process returns to step S101; that is, if the unique face region tracking is judged to have failed, the method jumps back to the step of processing the face pictures according to the first neural network model, so as to determine a batch of face regions again.
The specific way of judging whether to continue calibration can be determined by a set calibration flag or a set condition for continuing calibration, such as the number of continuous calibration.
S107A, if the calibration is continued, acquiring the unique face area at the moment of determining T +1, and jumping to the step S103;
If the tracking is successful, the tracked target region is cropped directly from the unique face region at time T + 1 and used as the input of the third neural network model, jumping to step S103, so that a more accurate face region position and key point calibration positions can be obtained. Because the most time-consuming step S101 is skipped here, the prediction time is improved by a factor of about 3, greatly accelerating prediction efficiency. Moreover, since the input of the third neural network model is the successfully tracked unique face region, that is, an accurately tracked target frame, the finally output face calibration position and key point positions are more stable.
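The feedback loop of steps S101 through S107 can be sketched as control flow; all model calls below are stubs, and the function names are illustrative rather than taken from the patent:

```python
def calibrate_stream(frames, detect, refine, track, tracking_ok):
    """Feedback loop of steps S101-S107: full detection (S101-S102) runs
    only when there is no tracked region; otherwise the tracked region is
    fed straight back into the third-stage network, skipping S101."""
    region, results = None, []
    for frame in frames:
        if region is None:
            region = detect(frame)          # S101-S102: cascade detection
        face = refine(frame, region)        # S103: third neural network
        results.append(face)
        tracked = track(frame, face)        # S104: correlation tracking
        # S105-S107: keep the tracked region on success, else re-detect.
        region = tracked if tracking_ok(face, tracked) else None
    return results
```

Passing a `tracking_ok` that always fails degenerates to running full detection on every frame, which is exactly the cost the feedback mechanism avoids.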
S107B, if the calibration is not continued, the process ends.
It should be noted that other embodiments may include only steps S101 to S104; steps S105 to S107 are further optimizations or further applied technical solutions.
Fig. 2 is a schematic structural diagram of a human face region calibration device in the second embodiment of the present application; as shown in fig. 2, it includes:
the first program unit is used for processing the face pictures according to the first neural network model so as to determine a batch of face regions;
the second program unit is used for processing the batch of face regions according to a second neural network model so as to filter out non-face regions from the batch of face regions;
a third program unit, configured to process the batch of face regions with non-faces filtered according to a third neural network model, so as to determine a unique face region at time T;
and the fourth program unit is used for tracking the unique face area at the time T according to the face tracking model so as to determine the unique face area at the time T + 1.
Specifically, in this embodiment, the first program unit is further configured to process the face pictures in sequence according to different convolution layers and convolution kernels configured in the first neural network model, so as to determine a batch of face regions.
Specifically, in this embodiment, the first program unit is further configured to process the face picture successively according to different convolution layers and convolution kernels configured in the first neural network model, so as to obtain a plurality of candidate face region frames respectively; and determining a batch of face regions according to the overlapping of the candidate face region frames and the set overlapping threshold value.
Specifically, in this embodiment, the second program unit is further configured to process the batch of face regions in sequence according to different convolution layers and convolution kernels configured in the second neural network model, so as to filter out non-face regions from the batch of face regions.
Specifically, in this embodiment, the second program unit is further configured to sequentially process the batch of face regions according to different convolution layers and convolution kernels configured in the second neural network model, so as to obtain a plurality of candidate face region frames respectively; and filtering out non-face regions from the batch of face regions according to the overlapping of the candidate face region frames and a set overlapping threshold value.
Specifically, in this embodiment, the third program unit is further configured to sequentially process the batch of face regions with non-faces filtered according to different convolution layers and convolution kernels configured in the third neural network model, so as to determine a unique face region and a face key point position at time T.
Specifically, in this embodiment, the fourth program unit is further configured to track the unique face region at time T according to a position filter and a scale filter in the face tracking model, so as to determine the unique face region at time T + 1.
Fig. 3 is a schematic structural diagram of a face area calibration device in the third embodiment of the present application; as shown in fig. 3, it may include, in addition to the first program unit, the second program unit, the third program unit and the fourth program unit in fig. 2:
the conversion unit is used for acquiring an acquired original face picture and carrying out zooming processing on the original face picture to obtain image pyramids with different sizes;
and the input unit is used for taking the image pyramids with different sizes as the input of the first neural network model, so that the first neural network model processes the face pictures to determine a batch of face regions.
In a specific implementation, the conversion unit and the input unit may be used as a substructure of the first program unit, or may be a structure independent of the first program unit.
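The image pyramid produced by the conversion unit can be sketched as repeated scaling by a fixed factor down to a minimum size; the factor 0.709 and the 12-pixel floor below are assumptions commonly used with cascades of this kind, not values stated in the patent:

```python
def pyramid_sizes(width, height, factor=0.709, min_size=12):
    """Sizes of the scaled copies of the original face picture that are
    fed to the first neural network model (factor and floor assumed)."""
    sizes = []
    w, h = float(width), float(height)
    while min(w, h) >= min_size:
        sizes.append((int(w), int(h)))
        w, h = w * factor, h * factor
    return sizes

print(pyramid_sizes(100, 100)[:3])  # [(100, 100), (70, 70), (50, 50)]
```

Each scale lets the fixed-size first-stage network detect faces of a different absolute size in the original picture.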
Fig. 4 is a schematic structural diagram of a face area calibration device in the fourth embodiment of the present application; as shown in fig. 4, it includes, in addition to the first program unit, the second program unit, the third program unit, the fourth program unit, the conversion unit, and the input unit in fig. 2: and the feedback unit is used for judging whether the unique face area tracking is successful according to the unique face area at the moment T and the unique face area at the moment T + 1.
In specific implementation, the feedback unit is further configured to determine that the unique face area is successfully tracked if the overlap between the face area frames of the unique face area at the time T and the face area frames of the unique face area at the time T +1 is equal to a set overlap threshold; or if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is smaller than or larger than a set overlap threshold, determining that the unique face region tracking fails.
In specific implementation, if it is determined that the unique face region tracking is successful, the feedback unit is further configured to use the unique face region at the time T +1 as an input of the third neural network model, so as to process the batch of face regions with non-faces filtered, so as to determine the unique face region at the time T + 2.
It should be noted that the terms first, second, third and fourth are merely labels and do not limit the number of units; for those skilled in the art, the program modules may be multiplexed or shared, so the number of program modules may be fewer than four.
In addition, the program modules are not necessarily located at the same physical location, but may be based on a distributed architecture, such as being partially located on a front-end mobile terminal and partially located on a back-end server.
The embodiment of the present application further provides an electronic device, which includes the face region calibration apparatus in any one of the above embodiments. The electronic equipment can be a PC or a mobile terminal. The technical scheme of the embodiment of the application can be applied to scenes such as expression recognition, generation of human face cartoon, reality enhancement, face changing and the like.
In the embodiments of the present application, the face pictures are processed according to the first neural network model to determine a batch of face regions; the batch of face regions is processed according to the second neural network model to filter out non-face regions; the batch of face regions with non-faces filtered is processed according to the third neural network model to determine the unique face region at time T; and the unique face region at time T is tracked according to the face tracking model to determine the unique face region at time T + 1. Because the multiple neural network models perform multi-layer convolution operations, face feature points are extracted efficiently and accurately, greatly improving the robustness of face calibration. In addition, the method is not affected by complex environments. Meanwhile, compared with the most advanced current neural network algorithms, prediction efficiency is improved by 300% without affecting accuracy. Further, when the feedback mechanism is introduced, calibration efficiency and stability are improved.
The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions and/or portions thereof that contribute to the prior art may be embodied in the form of a software product that can be stored on a computer-readable storage medium including any mechanism for storing or transmitting information in a form readable by a computer (e.g., a computer). For example, a machine-readable medium includes Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media, optical storage media, flash memory storage media, electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.), and others, and the computer software product includes instructions for causing a computing device (which may be a personal computer, server, or network device, etc.) to perform the methods described in the various embodiments or portions of the embodiments.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present application, and are not limited thereto; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus (device), or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Claims (25)
1. A face region calibration method is characterized by comprising the following steps:
processing the face pictures according to the first neural network model to determine a batch of face regions;
processing the batch of face regions according to a second neural network model to filter out non-face regions from the batch of face regions;
processing the batch of face regions with the non-faces filtered according to a third neural network model to determine a unique face region at the moment T, wherein T is greater than 0;
and tracking the unique face region at the moment T according to the face tracking model to determine the unique face region at the moment T + 1.
2. The method of claim 1, further comprising:
acquiring an acquired original face picture, and carrying out scaling processing on the original face picture to obtain image pyramids with different sizes;
and taking the image pyramids with different sizes as the input of the first neural network model, so that the first neural network model processes the face pictures to determine a batch of face regions.
3. The method of claim 2, wherein obtaining the captured original face picture comprises: and acquiring an original face picture acquired by an image acquisition unit arranged on the electronic terminal through a development interface of the electronic terminal.
4. The method of claim 1, wherein processing the face pictures according to the first neural network model to determine a set of face regions comprises: and processing the face pictures in sequence according to different convolution layers and convolution kernels configured in the first neural network model to determine a batch of face regions.
5. The method of claim 4, wherein successively processing the face pictures according to different convolutional layers and convolutional kernels configured in the first neural network model to determine a batch of face regions comprises: processing the face picture according to different convolution layers and convolution kernels configured in the first neural network model to respectively obtain a plurality of candidate face area frames; and determining a batch of face regions according to the overlapping of the candidate face region frames and the set overlapping threshold value.
6. The method of claim 1, wherein processing the collection of face regions according to a second neural network model to filter out non-face regions from the collection of face regions comprises: and processing the face regions in sequence according to different convolution layers and convolution kernels configured in the second neural network model so as to filter out non-face regions from the face regions.
7. The method of claim 6, wherein successively processing the plurality of face regions according to different convolutional layers and convolutional kernels configured in a second neural network model to filter out non-face regions from the plurality of face regions comprises: processing the face regions in the batch in sequence according to different convolution layers and convolution kernels configured in a second neural network model to respectively obtain a plurality of candidate face region frames; and filtering out non-face regions from the batch of face regions according to the overlapping of the candidate face region frames and a set overlapping threshold value.
8. The method of claim 1, wherein processing the batch of face regions with non-faces filtered according to a third neural network model to determine a unique face region at time T comprises: and processing the batch of face regions with the non-faces filtered according to different convolution layers and convolution kernels configured in the third neural network model to determine the unique face region and the position of a face key point at the time T.
9. The method of claim 1, wherein tracking the unique face region at time T according to a face tracking model to determine the unique face region at time T +1 comprises: and tracking the unique face region at the time T according to a position filter and a scale filter in the face tracking model to determine the unique face region at the time T + 1.
10. The method of claim 1, further comprising: and judging whether the unique face area tracking is successful or not according to the unique face area at the moment T and the unique face area at the moment T + 1.
11. The method of claim 10, wherein determining whether the unique face region tracking is successful according to the unique face region at the time T and the unique face region at the time T +1 comprises: if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is equal to a set overlap threshold value, judging that the unique face region is successfully tracked; or if the overlap of the face region frames of the unique face region at the time T and the unique face region at the time T +1 is smaller than or larger than a set overlap threshold, determining that the unique face region tracking fails.
12. The method of claim 11, wherein if the unique face region tracking is determined to be successful, taking a unique face region at time T +1 as an input of the third neural network model to process the batch of face regions with non-faces filtered to determine a unique face region at time T + 2.
13. The method as claimed in claim 11, wherein if the unique face region tracking is determined to fail, the step of processing the face pictures according to the first neural network model is skipped to determine a batch of face regions again.
14. A face region calibration device is characterized by comprising:
the first program unit is used for processing the face pictures according to the first neural network model so as to determine a batch of face regions;
the second program unit is used for processing the batch of face regions according to a second neural network model so as to filter out non-face regions from the batch of face regions;
a third program unit, configured to process the batch of face regions with non-faces filtered according to a third neural network model, so as to determine a unique face region at time T;
and the fourth program unit is used for tracking the unique face area at the time T according to the face tracking model so as to determine the unique face area at the time T + 1.
15. The apparatus of claim 14, further comprising:
the conversion unit is used for acquiring an acquired original face picture and carrying out zooming processing on the original face picture to obtain image pyramids with different sizes;
and the input unit is used for taking the image pyramids with different sizes as the input of the first neural network model, so that the first neural network model processes the face pictures to determine a batch of face regions.
16. The apparatus according to claim 14, wherein the first program unit is further configured to process the face pictures sequentially according to different convolutional layers and convolutional kernels configured in the first neural network model to determine a batch of face regions.
17. The apparatus according to claim 16, wherein the first program unit is further configured to process the face picture in sequence according to different convolution layers and convolution kernels configured in the first neural network model to obtain a plurality of candidate face region frames, respectively; and determining a batch of face regions according to the overlapping of the candidate face region frames and the set overlapping threshold value.
18. The apparatus of claim 14, wherein the second program unit is further configured to process the batch of face regions sequentially according to different convolutional layers and convolutional kernels configured in a second neural network model, so as to filter out non-face regions from the batch of face regions.
19. The apparatus according to claim 18, wherein the second program unit is further configured to process the batch of face regions in sequence according to different convolutional layers and convolutional kernels configured in a second neural network model to obtain a plurality of candidate face region frames, respectively; and filtering out non-face regions from the batch of face regions according to the overlapping of the candidate face region frames and a set overlapping threshold value.
20. The apparatus according to claim 14, wherein the third program unit is further configured to process the batch of face regions from which non-face regions have been filtered, sequentially through the different convolution layers and convolution kernels configured in a third neural network model, so as to determine the unique face region and the face key point positions at time T.
21. The apparatus according to claim 14, wherein the fourth program unit is further configured to track the unique face region at time T according to a position filter and a scale filter in the face tracking model, so as to determine the unique face region at time T+1.
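Claim 21's position filter and scale filter echo correlation-filter trackers such as DSST, where one filter's response map localizes the target and a second filter scores a handful of candidate scales. The sketch below shows only the final selection step and assumes the response maps have already been computed elsewhere; the function name, box format, and defaults are illustrative, not from the patent.

```python
import numpy as np

def track_step(position_response, scale_responses, prev_box, scales):
    """One tracking update in the spirit of claim 21.
    position_response: 2D response map of the position filter, centred
      on the previous target location; its peak offset gives the shift.
    scale_responses: one correlation score per tested scale factor.
    prev_box: (cx, cy, w, h) of the unique face region at time T.
    Returns the box at time T+1."""
    # Translation: offset of the response peak from the map centre.
    py, px = np.unravel_index(np.argmax(position_response),
                              position_response.shape)
    dy = py - position_response.shape[0] // 2
    dx = px - position_response.shape[1] // 2
    # Scale: the candidate with the strongest scale-filter response.
    s = scales[int(np.argmax(scale_responses))]
    cx, cy, w, h = prev_box
    return (cx + dx, cy + dy, w * s, h * s)
```

Separating position from scale is the design choice that makes such trackers fast: the expensive 2D search only handles translation, while scale is a cheap 1D search over a few candidates.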
22. The apparatus of claim 14, further comprising: a feedback unit configured to judge, according to the unique face region at time T and the unique face region at time T+1, whether tracking of the unique face region succeeded.
23. The apparatus according to claim 22, wherein the feedback unit is further configured to determine that tracking of the unique face region succeeds if the overlap between the face region frames of the unique face region at time T and the unique face region at time T+1 equals a set overlap threshold, or to determine that tracking of the unique face region fails if that overlap is smaller or larger than the set overlap threshold.
24. The apparatus of claim 23, wherein, if it is determined that tracking of the unique face region succeeded, the feedback unit is further configured to use the unique face region at time T+1 as an input of the third neural network model, so that the batch of face regions from which non-face regions have been filtered is processed to determine the unique face region at time T+2.
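Claims 22–24 close the loop by comparing the tracked frame against the detected one. Reading the overlap test as intersection-over-union at or above a threshold (the translated claim's literal "equal to" reads like a translation artifact, so the `>=` here is an assumption), a sketch:

```python
def tracking_succeeded(box_t, box_t1, overlap_threshold=0.5):
    """Feedback check in the spirit of claims 22-23: compare the unique
    face region at time T with the tracked region at time T+1 via their
    intersection-over-union. Boxes are (x1, y1, x2, y2); the at-or-above
    -threshold success condition is an assumption, not the claim text."""
    ix1, iy1 = max(box_t[0], box_t1[0]), max(box_t[1], box_t1[1])
    ix2, iy2 = min(box_t[2], box_t1[2]), min(box_t[3], box_t1[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((box_t[2] - box_t[0]) * (box_t[3] - box_t[1])
             + (box_t1[2] - box_t1[0]) * (box_t1[3] - box_t1[1]) - inter)
    return inter / union >= overlap_threshold
```

On success, the T+1 region is fed back into the third neural network model (claim 24); on failure, detection would restart from the full cascade.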
25. An electronic device, characterized by comprising the face region calibration apparatus of any one of claims 14 to 24.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810096476.8A CN108229432A (en) | 2018-01-31 | 2018-01-31 | Face calibration method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108229432A | 2018-06-29 |
Family
ID=62670331
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810096476.8A Pending CN108229432A (en) | 2018-01-31 | 2018-01-31 | Face calibration method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108229432A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130286218A1 (en) * | 2012-04-27 | 2013-10-31 | Canon Kabushiki Kaisha | Image recognition device that recognizes specific object area, method of controlling the device, and storage medium, as well as image pickup apparatus, and display device |
CN106650699A (en) * | 2016-12-30 | 2017-05-10 | 中国科学院深圳先进技术研究院 | CNN-based face detection method and device |
CN107578423A (en) * | 2017-09-15 | 2018-01-12 | 杭州电子科技大学 | The correlation filtering robust tracking method of multiple features hierarchical fusion |
CN107609497A (en) * | 2017-08-31 | 2018-01-19 | 武汉世纪金桥安全技术有限公司 | The real-time video face identification method and system of view-based access control model tracking technique |
CN107644430A (en) * | 2017-07-27 | 2018-01-30 | 孙战里 | Target following based on self-adaptive features fusion |
- 2018-01-31: Application CN201810096476.8A filed in China, published as CN108229432A (legal status: Pending)
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109472247A (en) * | 2018-11-16 | 2019-03-15 | 西安电子科技大学 | Face identification method based on the non-formula of deep learning |
CN109472247B (en) * | 2018-11-16 | 2021-11-30 | 西安电子科技大学 | Face recognition method based on deep learning non-fit type |
CN109635749A (en) * | 2018-12-14 | 2019-04-16 | 网易(杭州)网络有限公司 | Image processing method and device based on video flowing |
CN109858435A (en) * | 2019-01-29 | 2019-06-07 | 四川大学 | A kind of lesser panda individual discrimination method based on face image |
CN111488774A (en) * | 2019-01-29 | 2020-08-04 | 北京搜狗科技发展有限公司 | Image processing method and device for image processing |
CN109858435B (en) * | 2019-01-29 | 2020-12-01 | 四川大学 | Small panda individual identification method based on face image |
CN110046602A (en) * | 2019-04-24 | 2019-07-23 | 李守斌 | Deep learning method for detecting human face based on classification |
Similar Documents
Publication | Title |
---|---|
US11908244B2 | Human posture detection utilizing posture reference maps |
JP7490141B2 | Image detection method, model training method, image detection apparatus, training apparatus, device, and program |
JP7386545B2 | Method for identifying objects in images and mobile device for implementing the method |
CN111160533B | Neural network acceleration method based on cross-resolution knowledge distillation |
CN111160269A | Face key point detection method and device |
CN108229432A | Face calibration method and device |
CN109409222A | A multi-view facial expression recognition method based on a mobile terminal |
CN111199230B | Method, device, electronic equipment and computer readable storage medium for target detection |
Tian et al. | Ear recognition based on deep convolutional network |
CN110069985B | Image-based target point position detection method and device and electronic equipment |
CN112381061B | Facial expression recognition method and system |
CN112818764A | Low-resolution image facial expression recognition method based on feature reconstruction model |
CN110245621B | Face recognition device, image processing method, feature extraction model, and storage medium |
Zhao et al. | Applying contrast-limited adaptive histogram equalization and integral projection for facial feature enhancement and detection |
Kishore et al. | Selfie sign language recognition with convolutional neural networks |
CN112861718A | Lightweight feature fusion crowd counting method and system |
CN108876776B | Classification model generation method, fundus image classification method and device |
CN110826534B | Face key point detection method and system based on local principal component analysis |
CN109508640A | Crowd emotion analysis method and device and storage medium |
CN112329663A | Micro-expression time detection method and device based on face image sequence |
Oliveira et al. | A comparison between end-to-end approaches and feature extraction based approaches for sign language recognition |
Zhang et al. | A simple and effective static gesture recognition method based on attention mechanism |
CN118314618A | Eye movement tracking method, device, equipment and storage medium integrating iris segmentation |
Rasel et al. | An efficient framework for hand gesture recognition based on histogram of oriented gradients and support vector machine |
Marjusalinah et al. | Classification of finger spelling American sign language using convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
2020-05-22 | TA01 | Transfer of patent application right | Address after: Room 508, Floor 5, Building 4, No. 699 Wangshang Road, Changhe Street, Binjiang District, Hangzhou, Zhejiang 310051. Applicant after: Alibaba (China) Co.,Ltd. Address before: Floor 14, Tower B, Pingyun Plaza, No. 163 Pingyun Road West, Huangpu Avenue, Tianhe District, Guangzhou, Guangdong 510627. Applicant before: Guangzhou Dongjing Computer Technology Co.,Ltd. |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2018-06-29 |