CN111881841A - Face detection and recognition method based on binocular vision - Google Patents

Face detection and recognition method based on binocular vision

Info

Publication number
CN111881841A
CN111881841A
Authority
CN
China
Prior art keywords
face
points
binocular vision
detection
dimensional
Prior art date
Legal status
Granted
Application number
CN202010748989.XA
Other languages
Chinese (zh)
Other versions
CN111881841B (en)
Inventor
李磊
宦蕴哲
蒋晨阳
王永春
金玮
李建业
Current Assignee
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University
Priority to CN202010748989.XA
Publication of CN111881841A
Application granted
Publication of CN111881841B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G06V20/653 Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face detection and recognition method based on binocular vision, comprising the following steps: (1) acquiring a left face picture and a right face picture with a binocular camera; (2) detecting faces in the two pictures using HOG features and locating the corresponding face image in each picture; (3) extracting facial feature points from the two face images; (4) obtaining the depth information of the facial feature points by binocular ranging and solving a three-dimensional face model; (5) analysing the solved model with a statistical method to achieve recognition; specifically, the three-dimensional face model is classified with a support vector machine. Through image capture, detection, modelling and classification, the method performs face detection, three-dimensional reconstruction and recognition; it is efficient, fast, safe and reliable, and provides more complete and richer detection information.

Description

Face detection and recognition method based on binocular vision
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a face detection and recognition method based on binocular vision.
Background
Since 2001, China's public security departments have used face detection technology to fight serious crime, with strong national support. Face detection technology was applied at the 2008 Beijing Olympic Games, marking the entry of Chinese face detection into a practical stage. It was then applied even more widely at the Shanghai World Expo, and many companies have since entered the field, accelerating the large-scale application of face detection technology in China. With continued technical progress in this field in China, the "three transformations and two fusions" will be the inevitable trend of face detection development, where the "three transformations" refer to mainstreaming, chip integration and standardization, and the "two fusions" refer to the fusion of multiple biometrics and the combination of face recognition with REID and other biometric features.
At present, research on face detection at home and abroad is mainly based on two-dimensional images. Although the detection methods used in the literature differ, most of them focus on face detection in a single two-dimensional image. Mainstream face recognition currently includes the following methods:
face recognition based on geometric features;
face recognition based on eigenfaces;
face recognition based on template matching;
face recognition based on neural networks;
face recognition based on hidden Markov models (HMM);
face recognition based on elastic matching;
face recognition based on Bayesian decision theory;
face recognition based on support vector machines;
while three-dimensional face detection systems based on binocular vision remain scarce.
Current face recognition research mainly targets two-dimensional images or two-dimensional dynamic video sequences. Two-dimensional image recognition technology has many applications in other fields, but because the human face is a deformable body, face recognition from two-dimensional images is inherently difficult. In addition, face recognition based on two-dimensional images is inevitably affected by ambient light, background and viewing angle, as well as by the pose, expression and occlusion of the face, so it is difficult to further improve recognition accuracy.
A two-dimensional face image is only the planar projection of the three-dimensional face, so part of the information is inevitably lost in the process; the face is further affected by illumination conditions, background, pose, expression and other factors. These problems are difficult to solve with face detection methods based on a monocular camera.
Disclosure of Invention
In order to overcome the shortcomings of face recognition technology based on two-dimensional images, the invention provides a face detection and recognition method based on binocular vision, which collects facial features for three-dimensional detection and can greatly improve detection efficiency and accuracy. By introducing binocular vision technology, the three-dimensional face image formed by a binocular camera provides more information than the two-dimensional face image obtained by an ordinary monocular camera, in particular the depth information of the face. The method addresses the difficulty a traditional monocular camera has in detecting faces in certain special scenes.
To solve the above technical problem, the invention adopts the following technical scheme:
a face detection and identification method based on binocular vision comprises the following steps:
step (1) acquiring a left picture and a right picture containing human faces;
respectively carrying out hog feature detection on two pictures containing human faces to obtain human face images corresponding to the two pictures;
performing feature extraction on the two obtained face images to obtain face feature points;
step (4) calculating four-dimensional coordinate point information of each face characteristic point through a binocular ranging algorithm based on the extracted face characteristic points, and fitting a face three-dimensional model based on the four-dimensional coordinate point information of the face characteristic points;
and (5) identifying the three-dimensional model of the face through a support vector machine algorithm to obtain an identification result.
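The patent text does not name an implementation for the HOG-based detection in step (2); as one possible realisation, the following minimal Python sketch uses dlib's HOG + linear-SVM frontal face detector (the library choice, file names and largest-face heuristic are illustrative assumptions, not part of the patent):

import cv2
import dlib

detector = dlib.get_frontal_face_detector()   # HOG + linear SVM detector

def detect_face(image_path):
    """Return the largest detected face region of the picture, or None."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)                  # upsample once to find smaller faces
    if len(faces) == 0:
        return None
    face = max(faces, key=lambda r: r.width() * r.height())
    return img[face.top():face.bottom(), face.left():face.right()]

left_face = detect_face("left.jpg")            # hypothetical left/right picture files
right_face = detect_face("right.jpg")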
In some embodiments, in step (1), two pictures containing human faces are obtained through a binocular camera.
In some embodiments, in step (3), feature extraction is performed on the two obtained face images by using a convolutional neural network to obtain face feature points.
In some embodiments, in step (3), 9 face feature points are selected, namely 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points.
In some embodiments, the convolutional neural network is a modified P_Net network; the input layer size is set to 1202 × 1202 × 3, the second layer network is 600 × 600 × 10, the pooling layer is a 2 × 2 matrix, the third layer network is 300 × 300 × 16, the pooling layer is a 2 × 2 matrix, the fourth layer network is 1 × 1 × 1000, and the last output layer size is 2 × 9.
In some embodiments, in step (4), four-dimensional coordinate point information of each feature point is calculated by a binocular ranging algorithm, and the calculation formula is as follows:
[Formula (rendered only as an image in the original publication): binocular ranging relations]
where P is the target whose depth is to be calculated, O_l and O_r are the corresponding points in the left and right images respectively, f_l and f_r are the distances from the corresponding image points to the lenses, X_l, Y_l give the position of P in the left photograph and X_r, Y_r the position of P in the right photograph, and r_1 to r_9 are the coordinate positions of the 9 feature points in the two photographs. The relationship between the camera coordinate system and the world coordinate system can be represented by a rotation matrix R, where
[Formula (rendered only as an image in the original publication): the rotation matrix R]
and the matrix
[Formula (rendered only as an image in the original publication): camera-to-world transformation]
represents the transformation relation between the camera coordinate system and the world coordinate system; Z is the depth of the target to be measured relative to the world coordinate system, and through R the coordinates x_l, y_l, z_l, x_r, y_r, z_r are converted to positions relative to the world coordinate system.
In some embodiments, in step (4), the three-dimensional model of the human face is fitted by cubic spline interpolation based on the four-dimensional coordinate point information of the characteristic points of the human face.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a binocular vision-based face detection and recognition method, which mainly aims at the problem that the face recognition of a monocular camera cannot be well performed due to the influence of factors such as illumination conditions, backgrounds, postures, expressions and the like. Aiming at the defect, a binocular camera is used for acquiring images of the human face to establish a three-dimensional model of the human face, and finally the recognition of the human face is realized through a support vector machine algorithm, so that the problem that the monocular camera cannot work well in the face recognition under the condition of large external interference is effectively solved.
Drawings
FIG. 1 is a general flow diagram of a method involved in an embodiment of the invention;
FIG. 2 shows the structure of the modified P_Net network;
FIG. 3 is a schematic view of depth calculation with binocular vision.
Detailed Description
The invention is described in further detail below with reference to the following figures and detailed description:
as shown in fig. 1, a face detection and recognition method of the present invention includes the following steps:
(1) Two face images are obtained with the binocular camera. During acquisition the face should face the binocular camera as directly as possible, so that a more reliable face image is obtained. The resolution of the cameras should be no less than 5 million pixels, which makes it easier to obtain accurate facial feature point positions. The distance between the face and the camera should be such that the face occupies as much of the picture area as possible.
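As an illustration of this acquisition step, the following Python sketch grabs a left/right image pair with OpenCV; the device indices, the roughly 5-megapixel resolution and the output file names are assumptions made for the example, not values fixed by the patent:

import cv2

left_cam = cv2.VideoCapture(0)                     # assumed device indices
right_cam = cv2.VideoCapture(1)
for cam in (left_cam, right_cam):
    cam.set(cv2.CAP_PROP_FRAME_WIDTH, 2592)        # about 5 MP (2592 x 1944)
    cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 1944)

ok_left, left_img = left_cam.read()
ok_right, right_img = right_cam.read()
if ok_left and ok_right:
    cv2.imwrite("left.jpg", left_img)              # saved for the detection step
    cv2.imwrite("right.jpg", right_img)

left_cam.release()
right_cam.release()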
(2) After the left and right images are obtained, they are processed by the modified P_Net network shown in FIG. 2. The input size is set to 1202 × 1202 × 3, the second layer to 600 × 600 × 10 followed by a 2 × 2 pooling layer, the third layer to 300 × 300 × 16 followed by a 2 × 2 pooling layer, the fourth layer to 1 × 1 × 1000, and the final output layer to 2 × 9. The output is 2 × 9 because the network is designed to locate 9 corner points in the face (2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points), and each corner point has coordinates in two directions in a picture.
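A minimal PyTorch sketch of such a network is given below. The description above fixes only the per-layer output sizes; the kernel sizes, strides, activations and the adaptive pooling used here are assumptions chosen so that the stated sizes (1202 x 1202 x 3, 600 x 600 x 10, pool, 300 x 300 x 16, pool, 1 x 1 x 1000, 2 x 9) are reproduced:

import torch
import torch.nn as nn

class ModifiedPNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 10, kernel_size=4, stride=2),    # 1202 -> 600, 10 channels
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 600 -> 300
            nn.Conv2d(10, 16, kernel_size=3, padding=1),  # 300 -> 300, 16 channels
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 300 -> 150
            nn.AdaptiveAvgPool2d(1),                      # 150 -> 1
            nn.Conv2d(16, 1000, kernel_size=1),           # 1 x 1 x 1000
            nn.ReLU(inplace=True),
        )
        self.head = nn.Linear(1000, 2 * 9)                # (x, y) for the 9 corner points

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.head(x).view(-1, 2, 9)

net = ModifiedPNet()
out = net(torch.randn(1, 3, 1202, 1202))                  # out.shape == (1, 2, 9)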
(3) After the coordinate positions of the 9 points in the two photographs are obtained, the depth information of each corner point is obtained with the binocular ranging algorithm. The ranging geometry is shown in FIG. 3: P is the target whose depth is to be calculated; Z_l and Z_r are the depths of the target to be measured in the left and right images respectively; O_l and O_r are the corresponding points in the left and right images; f_l and f_r are the distances from the corresponding image points to the lenses; X_l, Y_l give the position of P in the left photograph and X_r, Y_r the position of P in the right photograph; r_1 to r_9 are the coordinate positions of the 9 feature points (the 2 eyeball center points, 4 eye corner points, nostril center point and 2 mouth corner points) in the two photographs.
the calculation formula is as follows:
[Formula (rendered only as an image in the original publication): binocular ranging relations]
The relationship between the camera coordinate system and the world coordinate system can be represented by a rotation matrix R, where
[Formula (rendered only as an image in the original publication): the rotation matrix R]
and the matrix
[Formula (rendered only as an image in the original publication): camera-to-world transformation]
represents the transformation relation between the two coordinate systems; Z is the depth of the target to be measured relative to the world coordinate system, and through R the coordinates x_l, y_l, z_l, x_r, y_r, z_r are converted to positions relative to the world coordinate system.
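Because the patent's exact formulas are reproduced only as images, the following Python sketch shows the standard rectified-stereo triangulation to which the quantities described above correspond (depth from disparity, Z = f * B / (X_l - X_r)); the focal length, baseline, principal point and the identity rotation matrix R used here are placeholder assumptions:

import numpy as np

f = 1200.0          # focal length in pixels (assumed)
baseline = 0.06     # distance between the two cameras in metres (assumed)
R = np.eye(3)       # camera-to-world rotation matrix (identity as a placeholder)
t = np.zeros(3)     # camera-to-world translation (placeholder)

def triangulate(pt_left, pt_right, cx=601.0, cy=601.0):
    """Return a world-frame 3-D point from matched left/right pixel coordinates."""
    xl, yl = pt_left
    xr, _ = pt_right
    disparity = xl - xr
    Z = f * baseline / disparity          # depth from disparity
    X = (xl - cx) * Z / f                 # back-projection into the camera frame
    Y = (yl - cy) * Z / f
    return R @ np.array([X, Y, Z]) + t    # expressed relative to the world coordinate system

# The 9 matched feature points r_1 .. r_9 are triangulated one by one:
# points_3d = np.array([triangulate(l, r) for l, r in zip(left_pts, right_pts)])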
(4) After the depth information of the 9 points is obtained, the specific positions of the 9 corner points in three-dimensional coordinate space are known, and the specific shape of the face is then fitted by cubic spline interpolation to obtain the model parameters of the face.
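The patent names cubic spline interpolation but not a particular construction; as one possible realisation, the following sketch interpolates a dense depth surface over the 9 triangulated points with SciPy's piecewise-cubic scattered-data interpolator (the grid size and the use of griddata are assumptions):

import numpy as np
from scipy.interpolate import griddata

def fit_face_surface(points_3d, grid_size=32):
    """Fit a dense depth map z(x, y) through the 9 triangulated feature points."""
    xy = points_3d[:, :2]
    z = points_3d[:, 2]
    gx, gy = np.meshgrid(
        np.linspace(xy[:, 0].min(), xy[:, 0].max(), grid_size),
        np.linspace(xy[:, 1].min(), xy[:, 1].max(), grid_size),
    )
    gz = griddata(xy, z, (gx, gy), method="cubic")   # piecewise-cubic interpolated surface
    return gz                                        # used as the face model parameters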
(5) Finally, the obtained face model parameters are used as the input, and face recognition is achieved with a support vector machine algorithm. The input of the support vector machine is the parametric model of the face, and the output size is set to 1. When different faces are given as input, the output values differ greatly, whereas the same face produces output values that differ little; two input faces with similar output values can therefore be regarded as the same person, and two input faces with widely differing values as two different people.
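The single-scalar comparison described above is not spelled out further in the patent; the sketch below therefore shows the more conventional reading of SVM-based recognition, a multi-class SVC over enrolled identities using scikit-learn, and should be read as one possible realisation rather than the patent's exact formulation (the enrolled-gallery format and the feature flattening are assumptions):

import numpy as np
from sklearn.svm import SVC

def to_feature(depth_map):
    """Flatten the interpolated face surface into a fixed-length feature vector."""
    v = depth_map.ravel().astype(float)
    return np.nan_to_num(v)               # grid cells outside the convex hull become 0

def train_recognizer(enrolled):
    """enrolled: list of (depth_map, person_id) pairs collected beforehand (assumed format)."""
    X = np.stack([to_feature(m) for m, _ in enrolled])
    y = [pid for _, pid in enrolled]
    clf = SVC(kernel="rbf")
    clf.fit(X, y)
    return clf

def recognize(clf, depth_map):
    return clf.predict([to_feature(depth_map)])[0]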
The positions of the 9 corner points in the face, namely the 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points, are obtained with the P_Net network. Once these positions are known (9 points in each of the two face images obtained by the left and right cameras, 18 points in total), the three-dimensional coordinate information of the 9 corner points is obtained with the binocular vision ranging algorithm.
After 9 positions are obtained, the shape of the human face is restored by cubic spline interpolation.
The information of the human face shape is used as the feature, and the feature is classified through a support vector machine algorithm.
In some embodiments, the input of the improved P_Net network is the face images obtained by the left and right cameras, and the output is the positions of the 9 corner points in the face: the 2 eyeball center points, 4 eye corner points, the nostril center point and 2 mouth corner points. On this basis the output can be expanded to include additional point types, such as the positions of the ears, the chin and so on.
The input of the support vector machine algorithm is the face shape obtained by cubic spline interpolation, and a single output is used, i.e. the output dimension is 1 × 1. The output values are very close for the same face and differ greatly for different faces.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.

Claims (7)

1. A face detection and recognition method based on binocular vision, characterized by comprising the following steps:
step (1): acquiring a left picture and a right picture containing a human face;
step (2): performing HOG feature detection on each of the two pictures containing the human face to obtain the face image corresponding to each picture;
step (3): performing feature extraction on the two obtained face images to obtain facial feature points;
step (4): calculating four-dimensional coordinate point information for each facial feature point with a binocular ranging algorithm based on the extracted feature points, and fitting a three-dimensional face model from the four-dimensional coordinate point information;
step (5): recognizing the three-dimensional face model with a support vector machine algorithm to obtain a recognition result.
2. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (1), two pictures containing human faces are obtained through a binocular camera.
3. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (3), feature extraction is performed on the two obtained face images using a convolutional neural network to obtain face feature points.
4. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (3), 9 feature points are selected from the face feature points, wherein the feature points are respectively 2 eyeball center points, 4 eye corner points, the midpoint of a nostril and 2 mouth corner points.
5. The binocular vision based face detection and recognition method of claim 3, wherein: the convolutional neural network adopts an improved P_Net network; the size of the input layer is set to 1202 × 1202 × 3, the second layer network is 600 × 600 × 10, the pooling layer is a 2 × 2 matrix, the third layer network is 300 × 300 × 16, the pooling layer is a 2 × 2 matrix, the fourth layer network is 1 × 1 × 1000, and the size of the last output layer is 2 × 9.
6. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (4), four-dimensional coordinate point information of each characteristic point is calculated through a binocular ranging algorithm, and a calculation formula is as follows:
[Formula (rendered only as an image in the original publication): binocular ranging relations]
wherein P is the target whose depth is to be calculated, Z is the depth of the target relative to the world coordinate system, O_l and O_r are the corresponding points in the left and right images respectively, f_l and f_r are the distances from the corresponding image points to the lenses, X_l, Y_l give the position of P in the left photograph and X_r, Y_r the position of P in the right photograph, and r_1 to r_9 are the coordinate positions of the 9 feature points in the two photographs; the relationship between the camera coordinate system and the world coordinate system can be represented by a rotation matrix R, wherein
[Formula (rendered only as an image in the original publication): the rotation matrix R]
and the matrix
[Formula (rendered only as an image in the original publication): camera-to-world transformation]
represents the transformation relation between the camera coordinate system and the world coordinate system.
7. The binocular vision based face detection and recognition method of claim 1, wherein: in the step (4), a three-dimensional model of the human face is fitted through cubic spline interpolation based on the four-dimensional coordinate point information of the facial feature points.
CN202010748989.XA 2020-07-30 2020-07-30 Face detection and recognition method based on binocular vision Active CN111881841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010748989.XA CN111881841B (en) 2020-07-30 2020-07-30 Face detection and recognition method based on binocular vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010748989.XA CN111881841B (en) 2020-07-30 2020-07-30 Face detection and recognition method based on binocular vision

Publications (2)

Publication Number Publication Date
CN111881841A (en) 2020-11-03
CN111881841B CN111881841B (en) 2022-09-13

Family

ID=73204262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010748989.XA Active CN111881841B (en) 2020-07-30 2020-07-30 Face detection and recognition method based on binocular vision

Country Status (1)

Country Link
CN (1) CN111881841B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396694A (en) * 2020-12-08 2021-02-23 北京工商大学 3D face video generation method based on monocular camera
CN112597901A (en) * 2020-12-23 2021-04-02 艾体威尔电子技术(北京)有限公司 Multi-face scene effective face recognition device and method based on three-dimensional distance measurement

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060081778A1 (en) * 1998-12-11 2006-04-20 Warner Charles C Portable radiometry and imaging apparatus
CN105913013A (en) * 2016-04-08 2016-08-31 青岛万龙智控科技有限公司 Binocular vision face recognition algorithm
CN110188699A (en) * 2019-05-31 2019-08-30 安徽柏络智能科技有限公司 A kind of face identification method and system of binocular camera

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060081778A1 (en) * 1998-12-11 2006-04-20 Warner Charles C Portable radiometry and imaging apparatus
CN105913013A (en) * 2016-04-08 2016-08-31 青岛万龙智控科技有限公司 Binocular vision face recognition algorithm
CN110188699A (en) * 2019-05-31 2019-08-30 安徽柏络智能科技有限公司 A kind of face identification method and system of binocular camera

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396694A (en) * 2020-12-08 2021-02-23 北京工商大学 3D face video generation method based on monocular camera
CN112396694B (en) * 2020-12-08 2023-05-05 北京工商大学 3D face video generation method based on monocular camera
CN112597901A (en) * 2020-12-23 2021-04-02 艾体威尔电子技术(北京)有限公司 Multi-face scene effective face recognition device and method based on three-dimensional distance measurement
CN112597901B (en) * 2020-12-23 2023-12-29 艾体威尔电子技术(北京)有限公司 Device and method for effectively recognizing human face in multiple human face scenes based on three-dimensional ranging

Also Published As

Publication number Publication date
CN111881841B (en) 2022-09-13

Similar Documents

Publication Publication Date Title
Shao et al. Real-time and accurate UAV pedestrian detection for social distancing monitoring in COVID-19 pandemic
CN109344701B (en) Kinect-based dynamic gesture recognition method
CN108764071B (en) Real face detection method and device based on infrared and visible light images
WO2019056988A1 (en) Face recognition method and apparatus, and computer device
CN113139479B (en) Micro-expression recognition method and system based on optical flow and RGB modal contrast learning
CN109598242B (en) Living body detection method
CN111680588A (en) Human face gate living body detection method based on visible light and infrared light
CN112801015B (en) Multi-mode face recognition method based on attention mechanism
CN106909890B (en) Human behavior recognition method based on part clustering characteristics
CN111639580B (en) Gait recognition method combining feature separation model and visual angle conversion model
CN111881841B (en) Face detection and recognition method based on binocular vision
CN109993103A (en) A kind of Human bodys' response method based on point cloud data
CN112232204B (en) Living body detection method based on infrared image
CN110458235B (en) Motion posture similarity comparison method in video
CN109858433B (en) Method and device for identifying two-dimensional face picture based on three-dimensional face model
CN106599806A (en) Local curved-surface geometric feature-based human body action recognition method
Galiyawala et al. Person retrieval in surveillance video using height, color and gender
CN112257641A (en) Face recognition living body detection method
CN111914643A (en) Human body action recognition method based on skeleton key point detection
Baek et al. Multimodal camera-based gender recognition using human-body image with two-step reconstruction network
CN111582036B (en) Cross-view-angle person identification method based on shape and posture under wearable device
CN105469042A (en) Improved face image comparison method
CN112668550A (en) Double-person interaction behavior recognition method based on joint point-depth joint attention RGB modal data
CN112528902A (en) Video monitoring dynamic face recognition method and device based on 3D face model
CN111767879A (en) Living body detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant