CN111881841A - Face detection and recognition method based on binocular vision - Google Patents
- Publication number
- CN111881841A (application number CN202010748989.XA)
- Authority
- CN
- China
- Prior art keywords
- face
- points
- binocular vision
- detection
- dimensional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/653—Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a face detection and recognition method based on binocular vision, which comprises the following steps: (1) acquiring left and right face pictures through a binocular camera; (2) detecting faces in the two pictures via HOG features to find the two corresponding face images; (3) extracting face feature points from the two acquired face images; (4) obtaining the depth information of the face feature points by a binocular vision ranging method, thereby solving a three-dimensional face model; (5) analyzing the solved result with a statistical method to realize identification. Specifically, the three-dimensional face model is classified through a support vector machine, so that face recognition is realized. Face detection, three-dimensional reconstruction and identification are achieved through the shooting, detection, modeling and classification techniques; the method is efficient and fast in processing, safe and reliable, and provides more complete and richer detection information.
Description
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a face detection and identification method based on binocular vision.
Background
In 2001, the public security departments of China began using face detection technology to combat serious criminal offences, with national support. The technology was applied at the 2008 Beijing Olympic Games, marking the entry of Chinese face detection into a practical stage. It was then applied even more widely at the Shanghai World Expo, and as more companies entered the field in succession, the large-scale application of face detection technology in China accelerated. With continuing technical progress in this field in China, the "three-izations and two fusions" will be the inevitable trend of face detection development, where the "three-izations" refer to mainstream adoption, chip integration and standardization, and the "two fusions" refer to the fusion of multiple biometrics and the fusion of re-identification (ReID) with other biometrics.
At present, research on face detection at home and abroad is mainly based on two-dimensional images. Although the detection methods used in the literature vary, most focus on face detection in a single two-dimensional image. The mainstream face recognition approaches currently include:
face recognition based on geometric features;
face recognition based on the characteristic face;
face recognition based on template matching;
face recognition based on a neural network;
Hidden Markov Model (HMM) based face recognition;
face recognition based on an elastic matching method;
face recognition based on Bayesian decision;
face recognition based on a support vector machine;
Three-dimensional face detection systems based on binocular vision, by contrast, remain rare.
Current face recognition research mainly targets two-dimensional images or two-dimensional dynamic video sequences. Two-dimensional image recognition technology has many applications in other fields, but because the human face is a deformable (plastic) body, it poses particular difficulty for two-dimensional techniques. In addition, face recognition based on two-dimensional images is inevitably affected by ambient light, background and viewing angle, as well as by the pose, expression and occlusion of the face, so recognition accuracy is difficult to improve further.
A two-dimensional face image is only the planar projection of the three-dimensional face, so part of the information is necessarily lost in the projection. Combined with the influence of factors such as illumination conditions, background, pose and expression, these problems are difficult to solve with face detection methods based on a monocular camera.
Disclosure of Invention
In order to overcome the defects of face recognition technology based on two-dimensional images, the invention provides a face detection and recognition method based on binocular vision, which collects face features for three-dimensional detection and can greatly improve detection efficiency and accuracy. Compared with the two-dimensional face image obtained by a common monocular camera, the three-dimensional face image formed by a binocular camera carries more information, in particular the depth information of the face. The method is therefore suited to special scenes in which a traditional monocular camera has difficulty detecting faces.
In order to solve the above technical problem, the invention adopts the following technical scheme:
a face detection and identification method based on binocular vision comprises the following steps:
step (1) acquiring a left picture and a right picture containing human faces;
step (2) performing HOG feature detection on the two pictures containing human faces respectively, to obtain the face images corresponding to the two pictures;
step (3) performing feature extraction on the two obtained face images to obtain face feature points;
step (4) calculating four-dimensional coordinate point information of each face feature point through a binocular ranging algorithm based on the extracted face feature points, and fitting a three-dimensional face model based on the four-dimensional coordinate point information of the face feature points;
and (5) identifying the three-dimensional model of the face through a support vector machine algorithm to obtain an identification result.
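The HOG detection in step (2) can be illustrated with a minimal sketch. The patent does not specify an implementation, so the cell size, bin count and absence of block normalization below are assumptions; the sketch only shows how histogram-of-oriented-gradients features are computed from an image:

```python
import numpy as np

def hog_descriptor(image, cell=8, bins=9):
    """Minimal HOG sketch: per-cell histograms of gradient orientations.
    Illustrative only -- a real detector adds block normalization and a
    sliding-window classifier on top of these features."""
    gy, gx = np.gradient(image.astype(float))          # image gradients
    mag = np.hypot(gx, gy)                             # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0       # unsigned orientation
    h, w = image.shape
    ch, cw = h // cell, w // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            a = ang[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            idx = (a / (180.0 / bins)).astype(int) % bins
            for b in range(bins):
                hist[i, j, b] = m[idx == b].sum()      # magnitude-weighted bins
    return hist.ravel()

# Synthetic 32x32 vertical ramp: all gradient energy falls in one bin.
desc = hog_descriptor(np.outer(np.arange(32), np.ones(32)))
```

A real detector would slide a window over each picture and feed such descriptors to a face / non-face classifier.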
In some embodiments, in step (1), two pictures containing human faces are obtained through a binocular camera.
In some embodiments, in step (3), feature extraction is performed on the two obtained face images by using a convolutional neural network to obtain face feature points.
In some embodiments, in step (3), 9 feature points are selected as the face feature points, namely 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points.
In some embodiments, the convolutional neural network is an improved P_Net network: the input layer size is set to 1202 × 1202 × 3, the second layer network is 600 × 600 × 10 with a 2 × 2 pooling layer, the third layer network is 300 × 300 × 16 with a 2 × 2 pooling layer, the fourth layer network is 1 × 1 × 1000, and the last output layer size is 2 × 9.
In some embodiments, in step (4), four-dimensional coordinate point information of each feature point is calculated by a binocular ranging algorithm, and the calculation formula is as follows:
P is the target point whose depth is to be calculated; O_l and O_r are the corresponding points of the left and right images respectively; f_l and f_r are the distances from the corresponding image points to the lenses; (X_l, Y_l) is the position of P in the left photograph and (X_r, Y_r) its position in the right photograph; r_1 to r_9 are the coordinate positions of the 9 feature points in the two photographs. The relationship between the camera coordinate system and the world coordinate system can be represented by a rotation matrix R together with a translation matrix T, where Z is the depth of the target relative to the world coordinate system, and R converts (x_l, y_l, z_l) and (x_r, y_r, z_r) into positions relative to the world coordinate system.
In some embodiments, in step (4), the three-dimensional model of the human face is fitted by cubic spline interpolation based on the four-dimensional coordinate point information of the characteristic points of the human face.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a binocular vision-based face detection and recognition method, which mainly aims at the problem that the face recognition of a monocular camera cannot be well performed due to the influence of factors such as illumination conditions, backgrounds, postures, expressions and the like. Aiming at the defect, a binocular camera is used for acquiring images of the human face to establish a three-dimensional model of the human face, and finally the recognition of the human face is realized through a support vector machine algorithm, so that the problem that the monocular camera cannot work well in the face recognition under the condition of large external interference is effectively solved.
Drawings
FIG. 1 is a general flow diagram of a method involved in an embodiment of the invention;
FIG. 2 is a structure diagram of the improved P_Net network;
fig. 3 is a schematic view of binocular vision computed depth.
Detailed Description
The invention is described in further detail below with reference to the following figures and detailed description:
as shown in fig. 1, a face detection and recognition method of the present invention includes the following steps:
(1) Two face images are obtained through the binocular camera. During acquisition the face should directly face the binocular camera as far as possible, so that a more reliable face image can be obtained. The resolution of each camera should be no less than 5 million pixels, to facilitate obtaining more accurate face feature point positions. The distance between the face and the camera should ensure that the face occupies as much of the picture area as possible.
(2) After the left and right images are obtained, they are processed with the improved P_Net network shown in fig. 2: the input size is set to 1202 × 1202 × 3, the second layer network to 600 × 600 × 10 with a 2 × 2 pooling layer, the third layer network to 300 × 300 × 16 with a 2 × 2 pooling layer, the fourth layer network to 1 × 1 × 1000, and the last layer output to 2 × 9. The output is 2 × 9 because the network is designed to find the positions of 9 corner points in the face (2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points), and each point has coordinates in two directions in a picture, so the output is 2 × 9.
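The feature-map sizes quoted above (1202 → 600 → 300) are consistent with stride-2 operations under the standard convolution output-size formula. The kernel sizes and strides below are assumptions for illustration, since the patent states only the resulting map sizes:

```python
def conv_out(n: int, k: int, s: int, p: int = 0) -> int:
    """Spatial size after a convolution or pooling: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# Assumed 3x3 stride-2 convolution and 2x2 stride-2 pooling; the patent
# gives only the resulting sizes (1202 -> 600 -> 300).
second = conv_out(1202, k=3, s=2)    # second-layer map size
third = conv_out(second, k=2, s=2)   # size after 2x2 pooling
```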
(3) After the coordinate positions of the 9 points in the two photographs are obtained, the depth coordinate information of each corner point is obtained through the binocular ranging algorithm. The ranging algorithm is shown in fig. 3: P is the target point whose depth is to be calculated; Z_l and Z_r are the depths of the target in the left and right images respectively; O_l and O_r are the corresponding points of the left and right images respectively; f_l and f_r are the distances from the corresponding image points to the lenses; (X_l, Y_l) is the position of P in the left photograph and (X_r, Y_r) its position in the right photograph; r_1 to r_9 are the coordinate positions of the 9 feature points (2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points) in the two photographs;
the calculation formula is as follows:
the relationship between the camera coordinate system and the world coordinate system can be represented by a rotation matrix R, whereinTime matrixRepresenting the time conversion relation of two coordinate systems, wherein Z is the depth of the target to be measured relative to the world coordinate system, and x is converted into X through Rl、yl、zl、xr、yr、zrConverted to a position relative to the world coordinate system.
(4) After the depth information of 9 points is obtained, specific positions of the 9 corner points in a three-dimensional coordinate space are obtained, and then the specific shape of the human face is fitted through cubic spline interpolation to obtain model parameters of the human face.
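The cubic spline fit can be sketched with SciPy's `CubicSpline` (an assumed implementation choice; the patent does not name a library). The landmark coordinates below are made-up values standing in for recovered 3-D positions along one face profile:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical profile: x-positions and measured depths of a few landmarks.
x = np.array([0.0, 1.0, 2.5, 4.0, 5.0])
z = np.array([0.10, 0.14, 0.20, 0.13, 0.09])

profile = CubicSpline(x, z)               # piecewise-cubic, C2-continuous fit
dense = profile(np.linspace(0, 5, 101))   # resampled face-profile curve
```

The spline passes exactly through the landmark points while smoothly interpolating the face shape between them.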
(5) Finally, the obtained face model parameters are taken as input, and recognition of the face is realized through a support vector machine algorithm. The input of the support vector machine is the parametric face model and the output size is set to 1. Different faces given as input produce outputs that differ greatly, while the same face produces outputs that differ only slightly; two input faces with similar output values can therefore be considered the same person, and two with very different values considered two different persons.
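The support-vector-machine step can be sketched with scikit-learn (an assumed implementation choice). The 9-dimensional "model parameter" vectors below are synthetic stand-ins for the fitted face-shape parameters; the sketch shows training a classifier that separates two hypothetical identities:

```python
import numpy as np
from sklearn.svm import SVC

# Toy "face model parameter" vectors: two clusters of hypothetical shapes.
rng = np.random.default_rng(0)
person_a = rng.normal(loc=0.0, scale=0.1, size=(20, 9))
person_b = rng.normal(loc=1.0, scale=0.1, size=(20, 9))
X = np.vstack([person_a, person_b])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="rbf").fit(X, y)            # train the support vector machine
pred = clf.predict([[0.0] * 9, [1.0] * 9])   # classify two query "faces"
```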
The positions of 9 corner points in the face, namely 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points, are obtained through the P_Net network. After the positions of the 9 corner points are obtained (for the two images captured by the left and right cameras, 9 points are obtained per face image, 18 points in total), the three-dimensional coordinate information of the 9 corner points is obtained through the binocular vision ranging algorithm.
After the positions of the 9 points are obtained, the shape of the face is restored by cubic spline interpolation.
The information of the face shape is used as the feature, and it is classified through the support vector machine algorithm.
In some embodiments, the input of the improved P_Net network is the face images obtained by the left and right cameras, and the output is the positions of the 9 corner points in the face: 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points. The output can be expanded accordingly to add further point types, such as the positions of the ears, the chin and so on.
The input of the support vector machine algorithm is the face shape obtained through cubic spline interpolation, and only one output is set, i.e. the output dimension is 1 × 1. For the same face the output values are very close; for different faces the output values differ greatly.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (7)
1. A face detection and identification method based on binocular vision is characterized by comprising the following steps:
step (1) acquiring a left picture and a right picture containing human faces;
step (2) performing HOG feature detection on the two pictures containing human faces respectively, to obtain the face images corresponding to the two pictures;
step (3) performing feature extraction on the two obtained face images to obtain face feature points;
step (4) calculating four-dimensional coordinate point information of each face feature point through a binocular ranging algorithm based on the extracted face feature points, and fitting a three-dimensional face model based on the four-dimensional coordinate point information of the face feature points;
and (5) identifying the three-dimensional model of the face through a support vector machine algorithm to obtain an identification result.
2. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (1), two pictures containing human faces are obtained through a binocular camera.
3. The binocular vision based face detection and recognition method of claim 1, wherein: in step (3), feature extraction is performed on the two obtained face images by using a convolutional neural network to obtain face feature points.
4. The binocular vision based face detection and recognition method of claim 1, wherein: in step (3), 9 feature points are selected as the face feature points, namely 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points.
5. The binocular vision based face detection and recognition method of claim 3, wherein: the convolutional neural network adopts an improved P_Net network; the input layer size is set to 1202 × 1202 × 3, the second layer network is 600 × 600 × 10 with a 2 × 2 pooling layer, the third layer network is 300 × 300 × 16 with a 2 × 2 pooling layer, the fourth layer network is 1 × 1 × 1000, and the last output layer size is 2 × 9.
6. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (4), four-dimensional coordinate point information of each characteristic point is calculated through a binocular ranging algorithm, and a calculation formula is as follows:
P is the target point whose depth is to be calculated; Z is the depth of the target relative to the world coordinate system; O_l and O_r are the corresponding points of the left and right images respectively; f_l and f_r are the distances from the corresponding image points to the lenses; (X_l, Y_l) is the position of P in the left photograph and (X_r, Y_r) its position in the right photograph; r_1 to r_9 are the coordinate positions of the 9 feature points in the two photographs; the relationship between the camera coordinate system and the world coordinate system can be represented by a rotation matrix R together with a translation matrix T.
7. The binocular vision based face detection and recognition method of claim 1, wherein: in step (4), a three-dimensional face model is fitted through cubic spline interpolation based on the four-dimensional coordinate point information of the face feature points.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010748989.XA CN111881841B (en) | 2020-07-30 | 2020-07-30 | Face detection and recognition method based on binocular vision |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111881841A true CN111881841A (en) | 2020-11-03 |
CN111881841B CN111881841B (en) | 2022-09-13 |
Family
ID=73204262
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010748989.XA Active CN111881841B (en) | 2020-07-30 | 2020-07-30 | Face detection and recognition method based on binocular vision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111881841B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112396694A (en) * | 2020-12-08 | 2021-02-23 | 北京工商大学 | 3D face video generation method based on monocular camera |
CN112597901A (en) * | 2020-12-23 | 2021-04-02 | 艾体威尔电子技术(北京)有限公司 | Multi-face scene effective face recognition device and method based on three-dimensional distance measurement |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060081778A1 (en) * | 1998-12-11 | 2006-04-20 | Warner Charles C | Portable radiometry and imaging apparatus |
CN105913013A (en) * | 2016-04-08 | 2016-08-31 | 青岛万龙智控科技有限公司 | Binocular vision face recognition algorithm |
CN110188699A (en) * | 2019-05-31 | 2019-08-30 | 安徽柏络智能科技有限公司 | A kind of face identification method and system of binocular camera |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112396694A (en) * | 2020-12-08 | 2021-02-23 | 北京工商大学 | 3D face video generation method based on monocular camera |
CN112396694B (en) * | 2020-12-08 | 2023-05-05 | 北京工商大学 | 3D face video generation method based on monocular camera |
CN112597901A (en) * | 2020-12-23 | 2021-04-02 | 艾体威尔电子技术(北京)有限公司 | Multi-face scene effective face recognition device and method based on three-dimensional distance measurement |
CN112597901B (en) * | 2020-12-23 | 2023-12-29 | 艾体威尔电子技术(北京)有限公司 | Device and method for effectively recognizing human face in multiple human face scenes based on three-dimensional ranging |
Also Published As
Publication number | Publication date |
---|---|
CN111881841B (en) | 2022-09-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shao et al. | Real-time and accurate UAV pedestrian detection for social distancing monitoring in COVID-19 pandemic | |
CN109344701B (en) | Kinect-based dynamic gesture recognition method | |
CN108764071B (en) | Real face detection method and device based on infrared and visible light images | |
WO2019056988A1 (en) | Face recognition method and apparatus, and computer device | |
CN113139479B (en) | Micro-expression recognition method and system based on optical flow and RGB modal contrast learning | |
CN109598242B (en) | Living body detection method | |
CN111680588A (en) | Human face gate living body detection method based on visible light and infrared light | |
CN112801015B (en) | Multi-mode face recognition method based on attention mechanism | |
CN106909890B (en) | Human behavior recognition method based on part clustering characteristics | |
CN111639580B (en) | Gait recognition method combining feature separation model and visual angle conversion model | |
CN111881841B (en) | Face detection and recognition method based on binocular vision | |
CN109993103A (en) | A kind of Human bodys' response method based on point cloud data | |
CN112232204B (en) | Living body detection method based on infrared image | |
CN110458235B (en) | Motion posture similarity comparison method in video | |
CN109858433B (en) | Method and device for identifying two-dimensional face picture based on three-dimensional face model | |
CN106599806A (en) | Local curved-surface geometric feature-based human body action recognition method | |
Galiyawala et al. | Person retrieval in surveillance video using height, color and gender | |
CN112257641A (en) | Face recognition living body detection method | |
CN111914643A (en) | Human body action recognition method based on skeleton key point detection | |
Baek et al. | Multimodal camera-based gender recognition using human-body image with two-step reconstruction network | |
CN111582036B (en) | Cross-view-angle person identification method based on shape and posture under wearable device | |
CN105469042A (en) | Improved face image comparison method | |
CN112668550A (en) | Double-person interaction behavior recognition method based on joint point-depth joint attention RGB modal data | |
CN112528902A (en) | Video monitoring dynamic face recognition method and device based on 3D face model | |
CN111767879A (en) | Living body detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||