CN109344714B - Sight estimation method based on key point matching

Sight estimation method based on key point matching

Info

Publication number
CN109344714B
Authority
CN
China
Prior art keywords
face
key points
pupil
matching
eye
Prior art date
2018-08-31
Legal status
Active
Application number
CN201811011543.8A
Other languages
Chinese (zh)
Other versions
CN109344714A (en)
Inventor
李宏亮
颜海强
尹康
袁欢
梁小娟
邓志康
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
2018-08-31
Filing date
2018-08-31
Publication date
2022-03-15
Application filed by University of Electronic Science and Technology of China
Priority to CN201811011543.8A
Publication of CN109344714A
Application granted
Publication of CN109344714B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The invention discloses a sight estimation method based on key point matching, belonging to the field of sight estimation in computer vision. After the pupil key points are initially located by a deep network, the pupil center position is further corrected with an SGBM template matching method. Compared with existing sight estimation methods, the pupil center can be located more accurately, especially when the head or eyeball offset is large. Implementing the invention can effectively improve the precision of sight estimation; compared with the pupil corneal reflection method, only a single web camera is used, which greatly reduces equipment cost. Compared with existing methods based on single-image processing, the method does not need to constrain the head posture, which greatly increases the robustness of the algorithm. By matching a 3D face model, the limitation that existing databases cannot represent all postures is avoided, improving the practicability of the method.

Description

Sight estimation method based on key point matching
Technical Field
The invention provides a sight estimation method based on key point matching, a novel sight estimation technique in the field of computer vision.
Background
With the development of computer science, human-computer interaction has gradually become a popular field. The human line of sight reflects a person's attention and is an important information input source in human-computer interaction. Human-computer interaction based on sight estimation has broad development prospects in fields such as the military, medical treatment, and entertainment.
The sight estimation technology currently in practical use is mainly based on the pupil corneal reflection technique (PCCR): a near-infrared light source produces reflections on the cornea and pupil of the user's eye, an image sensor collects images of the eye and the reflections, and the position of the eye in space and the line of sight are then calculated from a three-dimensional eyeball model. Although this method has high accuracy, its expensive sensor equipment makes it difficult to popularize.
To address this problem, sight estimation methods based on a 3D face model have appeared. Such a method needs only pictures collected by a camera as input data: it locates key points in the collected picture, estimates the head posture and eyeball center position by fitting a known model, and then obtains the sight angle by combining the detected pupil center position.
However, when existing 3D-face-model-based sight estimation methods calculate the pupil center position, the limitations of their databases prevent them from covering all real situations. Under a large head posture or a large eye offset, the pupil center carries a large error, which causes a great deviation in the final sight estimate.
Disclosure of Invention
The invention aims, in view of the existing problems, to provide a method combining a deep network with template matching that accurately locates the pupil center and increases the feasibility of the scheme.
The sight line estimation method based on key point matching comprises the following steps:
step one, detecting a target face:
inputting a video stream acquired by a camera into a trained face detection network model (a conventional face detection network model such as MobileNet-SSD may be selected) to perform face detection, and cropping the face with the largest size as the target face image for sight detection;
and performing size normalization on the target face image, the normalized image serving as the input of a face key point detection network model (a corresponding conventional detection network model, such as SE-Net, may be selected) to obtain the face key points and pupil centers of the target face image.
Step two, detecting key points of the human face and initially positioning the pupil center:
inputting the size-normalized target face image into the selected face key point detection network model to obtain the face key points and the coordinates of the 2 initial pupil centers on the current target face image, and converting these coordinates into coordinates on the video image (before normalization), wherein the face key points include 4 eye key points, two per eye (the two corner points of each eye);
thirdly, estimating the head posture and positioning the eyeballs:
matching the detected face key points with the key points of a standard three-dimensional face through a perspective-n-point algorithm (PnP algorithm) to obtain the spatial position and rotation angle of the face relative to the camera;
thereby obtaining three-dimensional coordinates of 2 initial pupil centers and three-dimensional coordinates of 4 eye key points;
in the three-dimensional coordinate system, taking the point 12mm behind the midpoint of the two eye key points, along the head posture direction, as the eyeball center position for the left eye and the right eye respectively;
step four, correcting the pupil center position:
cropping left-eye and right-eye pictures according to the 4 detected eye key points in the three-dimensional coordinate system, and relocating the pupil center point using the semi-global matching (SGBM) method; if the confidence of the currently obtained matching point (the relocated pupil center point) is greater than 0.7, the matching point is considered credible, and the midpoint of the two credible positioning results (the initial pupil center and the matching point) is taken as the final pupil center position;
step five, estimating the sight direction:
and calculating, in the three-dimensional coordinate system, the optical axis from the eyeball center to the pupil center to obtain the current sight direction.
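To make the data flow through the five steps concrete, the following minimal Python sketch outlines the pipeline; the helper functions (detect_largest_face, detect_keypoints, locate_eyeballs, refine_pupils) are hypothetical placeholders for the models and algorithms named above, not part of the patent.

```python
import cv2
import numpy as np

def estimate_sight(frame, face_net, keypoint_net, face_model_3d, camera_matrix):
    # Step 1: detect the largest face and normalize it to a fixed size.
    box = detect_largest_face(face_net, frame)                 # hypothetical helper
    face_img = cv2.resize(frame[box[1]:box[3], box[0]:box[2]], (300, 300))

    # Step 2: 68 face key points and 2 initial pupil centers, converted
    # back to coordinates on the original video image.
    keypoints_2d, pupils_2d = detect_keypoints(keypoint_net, face_img, box)

    # Step 3: head pose from the 2D-3D key point correspondences (PnP),
    # then place each eyeball center 12 mm behind the eye-corner midpoint.
    _, rvec, tvec = cv2.solvePnP(face_model_3d, keypoints_2d, camera_matrix, None)
    eye_centers_3d = locate_eyeballs(face_model_3d, rvec, tvec, offset_mm=12.0)

    # Step 4: refine each pupil center by matching, fused with the
    # initial network estimate when the confidence exceeds 0.7.
    pupils_3d = refine_pupils(frame, keypoints_2d, pupils_2d, rvec, tvec)

    # Step 5: the sight direction is the ray from eyeball center to pupil center.
    rays = pupils_3d - eye_centers_3d
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)
```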
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
the sight line estimation method based on key point matching can effectively improve the precision of sight line estimation, and compared with a pupil corneal reflection method, only a single network camera is adopted, so that the equipment cost is greatly reduced. Compared with the existing method based on single image processing, the method does not need to limit the posture of the head, and the robustness of the algorithm is greatly increased. By matching the 3D face model, the limitation that the existing database can not represent all postures at present is avoided, so that the practicability of the method is improved.
Drawings
FIG. 1 is a schematic view of the process of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
Existing sight estimation methods suffer large errors in pupil center positioning, especially under large head poses. The invention first locates the face key points and pupil centers with SE-Net (Squeeze-and-Excitation Networks), then corrects the result with the pupil center obtained by the SGBM (semi-global block matching) algorithm, further improving the pupil positioning precision.
First, face detection is performed on the picture read from the camera, the face with the largest size is cropped as the target whose sight is to be estimated, and it is normalized to a standard size. The face feature points (the common 68 key points) and the 2 pupil center positions are then detected with the SE-Net network.
Then, the detected 68 face key points are matched against the standard 3D face key points using the perspective-n-point (PnP) algorithm to obtain the position and rotation angle of the face relative to the camera in space.
Next, using the method provided by the invention, pictures of the left and right eyes are cropped according to the obtained eye key points, each eye picture is matched against a standard pupil picture with the semi-global block matching (SGBM) algorithm, and the point with the highest confidence in the matching result is taken as the pupil center. If the matching confidence is greater than 0.7, the matched position is considered credible, and the final positioning result is computed from the two pupil positioning results by the following formula:
$$P = \begin{cases} \dfrac{P_{SeNet} + P_{SGBM}}{2}, & T > 0.7 \\ P_{SeNet}, & \text{otherwise} \end{cases}$$

wherein $P_{SeNet}$ is the pupil detection result from SE-Net, $P_{SGBM}$ is the pupil detection result obtained by SGBM, and $T$ is the pupil center matching confidence obtained by SGBM; the larger $T$ is, the more accurate the detection result.
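A minimal sketch of this fusion rule in Python/NumPy, assuming each pupil position is a 2-vector of pixel coordinates; the 0.7 threshold is the one stated above, and the fallback to the SE-Net result below the threshold is an assumption consistent with step four.

```python
import numpy as np

def fuse_pupil_estimates(p_senet, p_sgbm, confidence, threshold=0.7):
    """Combine the SE-Net initial pupil center with the SGBM matching result;
    the SGBM point contributes only above the confidence threshold."""
    p_senet = np.asarray(p_senet, dtype=float)
    p_sgbm = np.asarray(p_sgbm, dtype=float)
    if confidence > threshold:
        return (p_senet + p_sgbm) / 2.0   # midpoint of the two credible results
    return p_senet                        # fall back to the network estimate

# Example: fuse_pupil_estimates((152.0, 98.0), (154.0, 97.0), confidence=0.83)
```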
Finally, the point 12mm behind the center of the eye key points along the head direction is taken as the eyeball center, and the vector from the eyeball center to the pupil center gives the final sight direction.
After the pupil key points are initially located by the deep network, the pupil center position is further corrected with the SGBM template matching method. Compared with existing sight estimation methods, the pupil center can be located more accurately, especially when the head or eyeball offset is large.
Examples
Referring to fig. 1, the present invention mainly comprises the following steps: detecting the target face, detecting the face key points and initially positioning the pupil centers, estimating the head posture and positioning the eyeballs, correcting the pupil center positions, and estimating the sight direction.
Step one, detecting a target face.
The video stream acquired by the camera is input into the trained face detection network (MobileNet-SSD) for face detection; the face with the largest size is cropped as the target face for sight detection and normalized to 300 × 300 as the input of the key point detection network.
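A sketch of step one with OpenCV's DNN module; the model file names, the 0.5 score threshold, and the SSD mean values are assumptions for illustration, since the patent only specifies MobileNet-SSD and the 300 × 300 crop.

```python
import cv2
import numpy as np

# Placeholder file names for a trained MobileNet-SSD face detector.
net = cv2.dnn.readNetFromCaffe("face_deploy.prototxt", "face_mobilenet_ssd.caffemodel")

def largest_face_300x300(frame, score_thresh=0.5):
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 117.0, 123.0))
    net.setInput(blob)
    detections = net.forward()[0, 0]   # rows: [_, _, score, x1, y1, x2, y2]
    boxes = [d[3:7] * np.array([w, h, w, h]) for d in detections
             if d[2] > score_thresh]
    if not boxes:
        return None
    # Keep the face with the largest area as the sight-detection target.
    x1, y1, x2, y2 = max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1])).astype(int)
    return cv2.resize(frame[y1:y2, x1:x2], (300, 300))
```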
And step two, detecting key points of the human face and initially positioning the pupil center.
SE-Net is used as the base network to train the face key point and pupil center detection model, with the L1 loss as the loss function during training to further improve positioning precision. The 300 × 300 face picture is fed into the trained model to obtain the coordinates of the 68 key points and the 2 pupil centers on the face picture, which are then converted into coordinates on the original image. The L1 loss is expressed as follows:
$$L_1 = \frac{1}{m}\sum_{i=1}^{m}\left|f(x_i) - y_i\right|$$

wherein $f(x_i)$ denotes the model prediction for the $i$-th input, $y_i$ denotes the corresponding label, and $m$ denotes the number of samples input to the model each time.
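For illustration, the loss above corresponds to PyTorch's built-in L1 criterion; the batch size and the 140-value output layout (68 key points plus 2 pupil centers, an (x, y) pair each) are assumptions about how the SE-Net regression head would be arranged.

```python
import torch
import torch.nn as nn

criterion = nn.L1Loss()  # mean absolute error, as in the formula above

# Stand-ins for the SE-Net output and labels: 70 points x (x, y) = 140 values.
pred = torch.randn(32, 140, requires_grad=True)
target = torch.rand(32, 140)

loss = criterion(pred, target)   # averaged |f(x_i) - y_i|
loss.backward()                  # gradients for one training step
```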
And thirdly, estimating the head posture and positioning the eyeballs.
Using the two-dimensional coordinates of the 68 key points obtained on the video picture and an existing 68-point three-dimensional face coordinate model, the spatial position and rotation angle of the face relative to the camera are estimated with the PnP algorithm. The point 12mm behind the midpoint of the two eye key points along the head posture direction is then taken as the eyeball center.
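A sketch of step three with OpenCV's solvePnP; the 3D face model, the camera intrinsics, the dlib-style eye-corner indices, and the sign of the "behind the head" axis are all assumptions.

```python
import cv2
import numpy as np

def head_pose_and_eyeball(pts_2d, model_3d, camera_matrix,
                          corner_idx=(36, 39), offset_mm=12.0):
    """Estimate head pose by PnP, then place the eyeball center 12 mm behind
    the midpoint of the two eye-corner key points along the head direction."""
    ok, rvec, tvec = cv2.solvePnP(model_3d, pts_2d, camera_matrix, None)
    R, _ = cv2.Rodrigues(rvec)                       # 3x3 head rotation
    # Eye-corner model points transformed into camera coordinates.
    corners = (R @ model_3d[list(corner_idx)].T).T + tvec.reshape(1, 3)
    midpoint = corners.mean(axis=0)
    backward = R @ np.array([0.0, 0.0, 1.0])         # assumed rearward head axis
    return rvec, tvec, midpoint + offset_mm * backward
```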
And step four, correcting the center position of the pupil.
The left-eye and right-eye pictures are cropped according to the 4 detected eye key points, and the pupil center is searched for with the SGBM method, yielding a center position and a corresponding confidence; the higher the confidence, the higher the accuracy. If the confidence is greater than 0.7, the two positioning results are combined to obtain the final pupil center position.
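The patent does the eye-patch matching with a semi-global block matching (SGBM) scheme; as a simplified stand-in, the sketch below uses normalized cross-correlation template matching, which likewise yields a best-match position and a confidence score that can be gated at 0.7. The standard pupil template is assumed to be available.

```python
import cv2

def match_pupil(eye_img, pupil_template, threshold=0.7):
    """Locate the pupil in a grayscale eye patch; returns (center, confidence),
    with center=None when the match is not credible."""
    result = cv2.matchTemplate(eye_img, pupil_template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)   # best score and its position
    th, tw = pupil_template.shape[:2]
    center = (max_loc[0] + tw // 2, max_loc[1] + th // 2)
    return (center, max_val) if max_val > threshold else (None, max_val)
```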
And fifthly, estimating the sight direction.
The three-dimensional coordinates of the eyeball center and the pupil center are taken, and the optical axis is calculated to finally obtain the sight direction.
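Step five reduces to a normalized vector difference; a minimal sketch, assuming both centers are expressed in the same camera coordinate system (millimetres):

```python
import numpy as np

def sight_direction(eyeball_center_3d, pupil_center_3d):
    """Unit vector along the optical axis, from eyeball center to pupil center."""
    axis = np.asarray(pupil_center_3d, float) - np.asarray(eyeball_center_3d, float)
    return axis / np.linalg.norm(axis)

# Example: sight_direction([31.0, -2.0, 580.0], [30.2, -1.5, 568.5])
```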
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except mutually exclusive features and/or steps.

Claims (2)

1. A sight line estimation method based on key point matching is characterized by comprising the following steps:
step one, detecting a target face:
inputting a video stream collected by a camera into a trained face detection network model for face detection, and intercepting a face with the largest size as a target face image for sight detection;
carrying out size normalization processing on the target face image;
step two, detecting key points of the human face and initially positioning the pupil center:
inputting the size-normalized target face image into a selected face key point detection network model to obtain the face key points and the coordinates of 2 initial pupil centers on the current target face image, and converting the coordinates into coordinates on the video image, wherein the face key points comprise 4 eye key points, the left eye and the right eye each having two key points;
thirdly, estimating the head posture and positioning the eyeballs:
matching the detected face key points with standard three-dimensional face key points through a perspective n-point algorithm to obtain the spatial position and the rotation angle of the face relative to the camera;
thereby obtaining three-dimensional coordinates of 2 initial pupil centers and three-dimensional coordinates of 4 eye key points;
in the three-dimensional coordinate system, taking the point 12mm behind the midpoint of the two eye key points of the left eye and the right eye, along the head posture direction, as the center position of the left eyeball and the right eyeball respectively;
step four, correcting the pupil center position:
cropping left-eye and right-eye pictures according to the 4 detected eye key points in the three-dimensional coordinate system, and relocating the pupil center point using the semi-global matching SGBM method; if the confidence of the currently obtained matching point is greater than 0.7, the matching point is considered credible; taking the midpoint of the two credible positioning results as the final pupil center position;
step five, estimating the sight direction:
and calculating the optical axis information from the center of the eyeball to the center of the pupil under the three-dimensional coordinate to obtain the current sight direction.
2. The method of claim 1, wherein the target face image has a normalized size of 300 x 300.
CN201811011543.8A 2018-08-31 2018-08-31 Sight estimation method based on key point matching Active CN109344714B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811011543.8A CN109344714B (en) 2018-08-31 2018-08-31 Sight estimation method based on key point matching

Publications (2)

Publication Number Publication Date
CN109344714A CN109344714A (en) 2019-02-15
CN109344714B 2022-03-15

Family

ID=65291973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811011543.8A Active CN109344714B (en) 2018-08-31 2018-08-31 Sight estimation method based on key point matching

Country Status (1)

Country Link
CN (1) CN109344714B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109901716B (en) * 2019-03-04 2022-08-26 厦门美图之家科技有限公司 Sight point prediction model establishing method and device and sight point prediction method
CN110051319A (en) * 2019-04-23 2019-07-26 七鑫易维(深圳)科技有限公司 Adjusting method, device, equipment and the storage medium of eyeball tracking sensor
CN110414419A (en) * 2019-07-25 2019-11-05 四川长虹电器股份有限公司 A kind of posture detecting system and method based on mobile terminal viewer
CN110503068A (en) * 2019-08-28 2019-11-26 Oppo广东移动通信有限公司 Gaze estimation method, terminal and storage medium
CN111291701B (en) * 2020-02-20 2022-12-13 哈尔滨理工大学 Sight tracking method based on image gradient and ellipse fitting algorithm
CN113780164B (en) * 2021-09-09 2023-04-28 福建天泉教育科技有限公司 Head gesture recognition method and terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102402467B1 (en) * 2016-10-05 2022-05-25 매직 립, 인코포레이티드 Periocular test for mixed reality calibration

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1787012A * 2004-12-08 2006-06-14 索尼株式会社 Method, apparatus and computer program for processing image
CN102799888A (en) * 2011-05-27 2012-11-28 株式会社理光 Eye detection method and eye detection equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a measurement method of eyeball protrusion based on binocular stereo vision; 张帅 (Zhang Shuai); China Master's Theses Full-text Database, Information Science and Technology; 2017-02-15 (No. 02); I138-2639 *

Also Published As

Publication number Publication date
CN109344714A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
CN109344714B (en) Sight estimation method based on key point matching
AU2021240222B2 (en) Eye pose identification using eye features
CN104317391B (en) A kind of three-dimensional palm gesture recognition exchange method and system based on stereoscopic vision
Valenti et al. Combining head pose and eye location information for gaze estimation
US9286694B2 (en) Apparatus and method for detecting multiple arms and hands by using three-dimensional image
KR20160138062A (en) Eye gaze tracking based upon adaptive homography mapping
CN109359514B (en) DeskVR-oriented gesture tracking and recognition combined strategy method
CN109785373B (en) Speckle-based six-degree-of-freedom pose estimation system and method
CN111768449B (en) Object grabbing method combining binocular vision with deep learning
WO2019136588A1 (en) Cloud computing-based calibration method, device, electronic device, and computer program product
CN110794963A (en) Depth camera-based eye control auxiliary input method
CN111259739A (en) Human face pose estimation method based on 3D human face key points and geometric projection
CN115830675B (en) Gaze point tracking method and device, intelligent glasses and storage medium
Liu et al. Robust 3-D gaze estimation via data optimization and saliency aggregation for mobile eye-tracking systems
CN108694348B (en) Tracking registration method and device based on natural features
Liu et al. Towards robust auto-calibration for head-mounted gaze tracking systems
JP2017227687A (en) Camera assembly, finger shape detection system using camera assembly, finger shape detection method using camera assembly, program implementing detection method, and recording medium of program
WO2024113275A1 (en) Gaze point acquisition method and apparatus, electronic device, and storage medium
JP2012212325A (en) Visual axis measuring system, method and program
WO2024116253A1 (en) Information processing method, information processing program, and information processing device
WO2024059927A1 (en) Methods and systems for gaze tracking using one corneal reflection
CN113643362A (en) Limb corrector based on human body measurement in 2D human body posture estimation system
TW201301204A (en) Method of tracking real-time motion of head
KR20170141018A (en) System and Method for Detecting Face Landmark considering Occlusion Landmark

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant