CN109344714B - Sight estimation method based on key point matching - Google Patents
- Publication number
- CN109344714B CN109344714B CN201811011543.8A CN201811011543A CN109344714B CN 109344714 B CN109344714 B CN 109344714B CN 201811011543 A CN201811011543 A CN 201811011543A CN 109344714 B CN109344714 B CN 109344714B
- Authority
- CN
- China
- Prior art keywords
- face
- key points
- pupil
- matching
- eye
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/168—Feature extraction; Face representation
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Eye Examination Apparatus (AREA)
Abstract
The invention discloses a sight estimation method based on key point matching, belonging to sight estimation in the field of computer vision. After the pupil key points are initially located by a deep network, the pupil center position is further corrected with an SGBM template-matching step. Compared with existing sight estimation methods, the pupil center can be located more accurately, especially when the head or eyeball offset is large. The invention can effectively improve the accuracy of sight estimation; compared with the pupil corneal reflection method, only a single webcam is used, greatly reducing the equipment cost. Compared with existing single-image methods, the head pose need not be restricted, which greatly increases the robustness of the algorithm. By matching against a 3D face model, the limitation that existing databases cannot represent all poses is avoided, improving the practicality of the method.
Description
Technical Field
The invention provides a sight estimation method based on key point matching, a novel sight estimation technique in the field of computer vision.
Background
With the development of computer science, human-computer interaction has gradually become a popular field. The human line of sight reflects a person's attention and is an important information input source in human-computer interaction. Human-computer interaction based on sight estimation has broad prospects in fields such as the military, medicine, and entertainment.
The sight estimation technology in practical use today is mainly based on pupil center corneal reflection (PCCR): a near-infrared light source produces reflections on the user's cornea and pupil, an image sensor collects images of the eye and the reflections, and the position of the eye in space and the sight line are then calculated from a three-dimensional eyeball model. Although this method has high accuracy, it is difficult to popularize because the sensor equipment is expensive.
To address this problem, sight estimation methods based on a 3D face model have appeared. Such a method needs only the pictures collected by a camera as input data: it locates key points in the collected picture, estimates the head pose and eyeball center position by fitting a known model, and then obtains the sight angle by combining the detected pupil center position.
However, when the existing 3D-face-model-based sight estimation methods calculate the pupil center position, the databases they rely on cannot cover all real situations, and large errors occur under large head pose or eye offset, which causes a great deviation in the final sight estimate.
Disclosure of Invention
The aim of the invention: to address the above problems, a method combining a deep network with template matching is provided, which locates the pupil center accurately and increases the feasibility of the scheme.
The sight line estimation method based on key point matching comprises the following steps:
step one, detecting a target face:
inputting a video stream acquired by a camera into a trained face detection network model (a well-known face detection network such as MobileNet-SSD may be selected) to perform face detection, and cropping the largest detected face as the target face image for sight detection;
performing size normalization on the target face image, and using the normalized image as the input of a face key point detection network model (a corresponding conventional detection network, such as SE-Net, may be selected) to obtain the face key points and pupil centers of the target face image.
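As a rough illustration of step one, the sketch below picks the largest face box, crops it, and normalizes the crop to a fixed size. The MobileNet-SSD inference itself is omitted (detections are assumed to already be available as bounding boxes), the helper names are hypothetical, and nearest-neighbor resampling stands in for a proper resize:

```python
import numpy as np

def pick_largest_face(boxes):
    """Return the bounding box (x, y, w, h) with the largest area."""
    return max(boxes, key=lambda b: b[2] * b[3])

def resize_nearest(img, size=300):
    """Nearest-neighbor resize of an HxW(xC) image to size x size."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def crop_target_face(frame, boxes, size=300):
    """Crop the largest detected face and normalize it to size x size."""
    x, y, w, h = pick_largest_face(boxes)
    face = frame[y:y + h, x:x + w]
    return resize_nearest(face, size)
```

In practice the resize would use proper interpolation (e.g. bilinear) and the boxes would come directly from the face detector's output.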
Step two, detecting key points of the human face and initially positioning the pupil center:
based on the selected face key point detection network model, inputting the size-normalized target face image to obtain the face key points and the coordinates of the 2 initial pupil centers on the current target face image, and converting these coordinates into coordinates on the (pre-normalization) video image, wherein the face key points include 4 eye key points, the left eye and the right eye each contributing two key points (the two end points of the eye);
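The coordinate conversion at the end of step two can be sketched as follows. This is a minimal illustration with hypothetical names; it assumes the face crop's origin and size in the video frame are known from step one:

```python
def to_frame_coords(points, crop_x, crop_y, crop_w, crop_h, net_size=300):
    """Map (x, y) key points predicted on the net_size x net_size
    normalized crop back to coordinates on the original video frame."""
    return [(crop_x + x * crop_w / net_size,
             crop_y + y * crop_h / net_size) for x, y in points]
```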
thirdly, estimating the head posture and positioning the eyeballs:
matching the detected face key points with the key points of a standard three-dimensional face through a perspective-n-point (PnP) algorithm to obtain the spatial position and rotation angle of the face relative to the camera;
thereby obtaining three-dimensional coordinates of 2 initial pupil centers and three-dimensional coordinates of 4 eye key points;
in three-dimensional coordinates, taking the point 12 mm behind the midpoint of the two eye key points of each eye, along the head pose direction, as the center position of the left and the right eyeball respectively;
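A minimal sketch of the eyeball-center computation in step three. It assumes the head rotation matrix R from the PnP step is available (e.g. via OpenCV's solvePnP plus a Rodrigues conversion, not shown), that coordinates are in millimetres, and that the head's forward axis maps from +z; the sign convention for "behind" is an assumption:

```python
import numpy as np

def eyeball_center(corner_a, corner_b, R, depth_mm=12.0):
    """Place the eyeball center 12 mm behind the midpoint of the two
    eye-corner key points, along the head's pose (forward) direction.
    The +z forward convention and offset sign are assumptions."""
    midpoint = (np.asarray(corner_a, float) + np.asarray(corner_b, float)) / 2.0
    forward = np.asarray(R, float) @ np.array([0.0, 0.0, 1.0])
    return midpoint + depth_mm * forward / np.linalg.norm(forward)
```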
step four, correcting the pupil center position:
in three-dimensional coordinates, cropping left-eye and right-eye pictures according to the 4 detected eye key points and relocating the pupil center point with the semi-global block matching (SGBM) method; if the confidence of the currently obtained matching point (the relocated pupil center point) is greater than 0.7, the matching point is considered credible; the median of the two credible positioning results is taken as the final pupil center position;
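Step four can be illustrated with a simple normalized cross-correlation template search over the eye crop. This is a stand-in sketch, not the patent's SGBM matcher: the function name is hypothetical, NCC emulates the matcher's confidence score, and only the 0.7 threshold is taken from the text:

```python
import numpy as np

def match_pupil(eye_img, template, thresh=0.7):
    """Slide the pupil template over the eye image; return the best
    match center, its NCC score, and whether it clears the threshold."""
    H, W = eye_img.shape
    h, w = template.shape
    t = template - template.mean()
    best_score, best_pos = -1.0, None
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            patch = eye_img[i:i + h, j:j + w]
            p = patch - patch.mean()
            denom = np.linalg.norm(p) * np.linalg.norm(t)
            score = float((p * t).sum() / denom) if denom else 0.0
            if score > best_score:
                best_score, best_pos = score, (i + h // 2, j + w // 2)
    return best_pos, best_score, best_score > thresh
```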
step five, estimating the sight direction:
in three-dimensional coordinates, calculating the optical axis information from the eyeball center to the pupil center to obtain the current sight direction.
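Step five reduces to a vector computation; a minimal sketch (optical axis only, ignoring the kappa-angle offset to the visual axis, which the method does not model):

```python
import numpy as np

def gaze_direction(eyeball_center, pupil_center):
    """Unit vector of the optical axis, from eyeball center to pupil center."""
    v = np.asarray(pupil_center, float) - np.asarray(eyeball_center, float)
    return v / np.linalg.norm(v)
```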
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
the sight line estimation method based on key point matching can effectively improve the precision of sight line estimation, and compared with a pupil corneal reflection method, only a single network camera is adopted, so that the equipment cost is greatly reduced. Compared with the existing method based on single image processing, the method does not need to limit the posture of the head, and the robustness of the algorithm is greatly increased. By matching the 3D face model, the limitation that the existing database can not represent all postures at present is avoided, so that the practicability of the method is improved.
Drawings
FIG. 1 is a schematic view of the process of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
Existing sight estimation methods show a large pupil-center positioning error, especially under large head pose. The invention first preliminarily locates the face key points and pupil centers with SE-Net (Squeeze-and-Excitation Networks), and then corrects the result with the pupil center obtained by an SGBM (semi-global block matching) algorithm, further improving pupil positioning accuracy.
First, face detection is performed on the picture read from the camera; the largest face is cropped as the target whose sight line is to be estimated and normalized to a standard size. The face feature points (the 68 common key points) and the 2 pupil center positions are then detected with the SE-Net network.
Then, the detected 68 face key points are matched with the standard 3D face key points using the perspective-n-point (PnP) algorithm to obtain the position and rotation angle of the face relative to the camera in space.
Next, using the method provided by the invention, pictures of the left and right eyes are cropped according to the obtained eye key points, each eye picture is matched against a standard pupil picture with the semi-global block matching (SGBM) algorithm, and the point with the highest confidence in the matching result is taken as the pupil center. If the matching confidence is greater than 0.7, the matched position is considered credible, and the final positioning result is then calculated by combining the two pupil positioning results according to the following formula:
where P_SeNet is the pupil detection result from SE-Net, P_SGBM is the pupil detection result from SGBM, and T is the pupil-center confidence obtained from SGBM; the larger T is, the more accurate the detection result.
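Because the combination formula itself appears only as an image in the source, the sketch below is one plausible reading of the surrounding text rather than the patent's actual formula: a confidence-gated combination in which the midpoint rule follows the "median of the two credible results" wording, with all names hypothetical:

```python
import numpy as np

def fuse_pupil(p_senet, p_sgbm, T, thresh=0.7):
    """Combine the SE-Net and SGBM pupil estimates. When the SGBM
    confidence T clears the threshold, take the midpoint of the two
    estimates; otherwise fall back to the SE-Net result alone."""
    p_senet = np.asarray(p_senet, float)
    p_sgbm = np.asarray(p_sgbm, float)
    if T > thresh:
        return (p_senet + p_sgbm) / 2.0
    return p_senet
```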
Finally, the point 12 mm from the center of the eye key points along the head offset direction is taken as the eyeball center, and the vector from the eyeball center to the pupil center gives the final sight direction.
After the pupil key points are initially located by the deep network, the pupil center position is further corrected with the SGBM template-matching step. Compared with existing sight estimation methods, the pupil center can be located more accurately, especially when the head or eyeball offset is large.
Examples
Referring to FIG. 1, the invention mainly comprises the following steps: detecting the target face, detecting the face key points and initially locating the pupil centers, estimating the head pose and locating the eyeballs, correcting the pupil center positions, and estimating the sight direction.
Step one, detecting a target face.
The video stream acquired by the camera is input into a trained face detection network (MobileNet-SSD) for face detection; the largest detected face is cropped as the target face for sight detection and normalized to 300 × 300 as the input of the key point detection network.
And step two, detecting key points of the human face and initially positioning the pupil center.
SE-Net is used as the base network to train the model for face key point and pupil center detection, with the L1 loss as the loss function during training to further improve positioning accuracy. The 300 × 300 face picture is fed into the trained model to obtain the coordinates of the 68 key points and the 2 pupil centers on the face picture, which are then converted into coordinates on the original image. The L1 loss is expressed as

L1 = (1/m) * Σ_{i=1}^{m} |f(x_i) − y_i|

where f(x_i) denotes the model prediction for the i-th input datum, y_i the corresponding label, and m the number of data input to the model each time.
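The L1 loss used in training can be written directly in code; a minimal numpy sketch matching the stated definition:

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error: (1/m) * sum_i |f(x_i) - y_i|."""
    pred = np.asarray(pred, float)
    target = np.asarray(target, float)
    return float(np.abs(pred - target).mean())
```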
And thirdly, estimating the head posture and positioning the eyeballs.
Using the obtained two-dimensional coordinates of the 68 key points on the video picture and an existing 68-point three-dimensional face coordinate model, the spatial position and rotation angle of the face relative to the camera are estimated with the PnP algorithm. Then the point 12 mm behind the midpoint of the two eye key points of each eye, along the head pose direction, is taken as the eyeball center.
And step four, correcting the center position of the pupil.
The left-eye and right-eye pictures are cropped according to the 4 detected eye key points, and the pupil center is searched for with the SGBM method, which yields a center position and a corresponding confidence; the higher the confidence, the higher the accuracy. If the confidence is greater than 0.7, the two positioning results are combined to obtain the final pupil center position.
And fifthly, estimating the sight direction.
The three-dimensional coordinates of the eyeball center and the pupil center are taken, and the optical axis information is calculated to finally obtain the sight direction.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except mutually exclusive features and/or steps.
Claims (2)
1. A sight line estimation method based on key point matching is characterized by comprising the following steps:
step one, detecting a target face:
inputting a video stream collected by a camera into a trained face detection network model for face detection, and intercepting a face with the largest size as a target face image for sight detection;
carrying out size normalization processing on the target face image;
step two, detecting key points of the human face and initially positioning the pupil center:
inputting the size-normalized target face image into a selected face key point detection network model to obtain the face key points and the coordinates of 2 initial pupil centers on the current target face image, and converting these coordinates into coordinates on the video image, wherein the face key points comprise 4 eye key points, the left eye and the right eye each comprising two key points;
thirdly, estimating the head posture and positioning the eyeballs:
matching the detected face key points with standard three-dimensional face key points through a perspective n-point algorithm to obtain the spatial position and the rotation angle of the face relative to the camera;
thereby obtaining three-dimensional coordinates of 2 initial pupil centers and three-dimensional coordinates of 4 eye key points;
in three-dimensional coordinates, taking the point 12 mm behind the midpoint of the two eye key points of each eye, along the head pose direction, as the center position of the left and the right eyeball respectively;
step four, correcting the pupil center position:
in three-dimensional coordinates, cropping left-eye and right-eye pictures according to the 4 detected eye key points and relocating the pupil center point with the semi-global block matching SGBM method; if the confidence of the currently obtained matching point is greater than 0.7, the matching point is considered credible; the median of the two credible positioning results is taken as the final pupil center position;
step five, estimating the sight direction:
and calculating the optical axis information from the center of the eyeball to the center of the pupil under the three-dimensional coordinate to obtain the current sight direction.
2. The method of claim 1, wherein the target face image has a normalized size of 300 x 300.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201811011543.8A | 2018-08-31 | 2018-08-31 | Sight estimation method based on key point matching
Publications (2)
Publication Number | Publication Date
---|---
CN109344714A (en) | 2019-02-15
CN109344714B (en) | 2022-03-15
Family
ID=65291973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201811011543.8A (granted as CN109344714B, active) | Sight estimation method based on key point matching | 2018-08-31 | 2018-08-31
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109344714B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109901716B (en) * | 2019-03-04 | 2022-08-26 | Xiamen Meitu Zhijia Technology Co., Ltd. | Sight point prediction model establishing method and device and sight point prediction method |
CN110051319A (en) * | 2019-04-23 | 2019-07-26 | 7Invensun (Shenzhen) Technology Co., Ltd. | Adjusting method, device, equipment and storage medium of eyeball tracking sensor |
CN110414419A (en) * | 2019-07-25 | 2019-11-05 | Sichuan Changhong Electric Co., Ltd. | A posture detecting system and method based on mobile terminal viewer |
CN110503068A (en) * | 2019-08-28 | 2019-11-26 | Guangdong Oppo Mobile Telecommunications Co., Ltd. | Gaze estimation method, terminal and storage medium |
CN111291701B (en) * | 2020-02-20 | 2022-12-13 | Harbin University of Science and Technology | Sight tracking method based on image gradient and ellipse fitting algorithm |
CN113780164B (en) * | 2021-09-09 | 2023-04-28 | Fujian Tianquan Education Technology Co., Ltd. | Head gesture recognition method and terminal |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1787012A (en) * | 2004-12-08 | 2006-06-14 | Sony Corporation | Method, apparatus and computer program for processing image |
CN102799888A (en) * | 2011-05-27 | 2012-11-28 | Ricoh Company, Ltd. | Eye detection method and eye detection equipment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102402467B1 (en) * | 2016-10-05 | 2022-05-25 | Magic Leap, Inc. | Periocular test for mixed reality calibration |
- 2018-08-31: application CN201811011543.8A filed in China (CN), granted as CN109344714B (active)
Non-Patent Citations (1)
Title |
---|
Research on a method for measuring the degree of eyeball protrusion based on binocular stereo vision; Zhang Shuai; China Masters' Theses Full-text Database (Information Science and Technology); 2017-02-15 (No. 02); I138-2639 *
Also Published As
Publication number | Publication date |
---|---|
CN109344714A (en) | 2019-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109344714B (en) | Sight estimation method based on key point matching | |
AU2021240222B2 (en) | Eye pose identification using eye features | |
CN104317391B (en) | A kind of three-dimensional palm gesture recognition exchange method and system based on stereoscopic vision | |
Valenti et al. | Combining head pose and eye location information for gaze estimation | |
US9286694B2 (en) | Apparatus and method for detecting multiple arms and hands by using three-dimensional image | |
KR20160138062A (en) | Eye gaze tracking based upon adaptive homography mapping | |
CN109359514B (en) | DeskVR-oriented gesture tracking and recognition combined strategy method | |
CN109785373B (en) | Speckle-based six-degree-of-freedom pose estimation system and method | |
CN111768449B (en) | Object grabbing method combining binocular vision with deep learning | |
WO2019136588A1 (en) | Cloud computing-based calibration method, device, electronic device, and computer program product | |
CN110794963A (en) | Depth camera-based eye control auxiliary input method | |
CN111259739A (en) | Human face pose estimation method based on 3D human face key points and geometric projection | |
CN115830675B (en) | Gaze point tracking method and device, intelligent glasses and storage medium | |
Liu et al. | Robust 3-D gaze estimation via data optimization and saliency aggregation for mobile eye-tracking systems | |
CN108694348B (en) | Tracking registration method and device based on natural features | |
Liu et al. | Towards robust auto-calibration for head-mounted gaze tracking systems | |
JP2017227687A (en) | Camera assembly, finger shape detection system using camera assembly, finger shape detection method using camera assembly, program implementing detection method, and recording medium of program | |
WO2024113275A1 (en) | Gaze point acquisition method and apparatus, electronic device, and storage medium | |
JP2012212325A (en) | Visual axis measuring system, method and program | |
WO2024116253A1 (en) | Information processing method, information processing program, and information processing device | |
WO2024059927A1 (en) | Methods and systems for gaze tracking using one corneal reflection | |
CN113643362A (en) | Limb corrector based on human body measurement in 2D human body posture estimation system | |
TW201301204A (en) | Method of tracking real-time motion of head | |
KR20170141018A (en) | System and Method for Detecting Face Landmark considering Occlusion Landmark |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |