WO2020042541A1

WO2020042541A1 - Eyeball tracking interactive method and device

Info

Publication number: WO2020042541A1
Application number: PCT/CN2019/073763
Authority: WO
Inventors: 蒋壮
Original assignee: 深圳市沃特沃德股份有限公司
Priority date: 2018-08-31
Filing date: 2019-01-29
Publication date: 2020-03-05
Also published as: CN109375765B; CN109375765A

Abstract

Disclosed in the preset application are an eyeball tracking interactive method and device, the method comprising: acquiring a user image; acquiring human eye position data and eyeball position data from in the user image; calculating feature data according to the human eye position data and the eyeball position data; and according to preset calibration data and the feature data, calculating coordinates of feature points which the user is looking at. The eyeball tracking interactive method and device of the present application may control a device according to eye movement habits of a user.

Description

Eye tracking interaction method and device

Technical field

The present application relates to the field of human-computer interaction technology, and in particular, to an eye tracking interaction method and device.

Background technique

The eye movement control method is a non-contact human-computer interaction method. The position of the eye's fixation point is calculated by tracking the position of the eyeball. Eye movement control is a great help for users who ca n’t use both hands. With the development of smart terminals, gaming computers with eye tracking capabilities make players more immersive in the game scene. In the prior art, eye tracking technology requires special equipment, such as an eye tracker. During the use of these special-purpose devices, users need to control the device according to the eye movement methods defined in the instruction manual. Users cannot control the device according to their eye movement habits, and the user experience is not high. The trend of human-computer interaction is human-centered, more friendly and convenient, so eye tracking is also moving towards controlling the device according to the user's eye movement habits. However, in the prior art, special equipment is not used for eye tracking, and the positioning accuracy of the line of sight is low. Frequently, the area actually viewed by the user does not match the calculated data obtained through image analysis calculation, which affects the human-machine. The interaction goes on and the user experience is not high.

technical problem

The purpose of this application is to provide an eye-tracking interaction method and device, which aims to solve the problem that in the prior art, eye movement control requires special equipment and cannot achieve accurate gaze positioning according to the user's eye movement habits.

Technical solutions

The present application proposes an eye-tracking interaction method, which includes: obtaining a user image of a user looking at a specified viewing area through a camera; searching for a human eye image and an eyeball image from the user image, and obtaining human eye position data and eyeball position data; Calculate feature data of the human eye position data and the eyeball position data; and calculate, based on preset calibration data and the feature data, the feature points corresponding to the feature data that the user looks at in the designated viewing area Coordinates; wherein the preset calibration data is calibration data of a plurality of positioning points in a designated viewing area.

The present application also proposes an eye-tracking interaction device, including: an image acquisition module for acquiring a user image of a user looking at a specified viewing area through a camera; an image analysis module for finding a human eye image from the user image and Eyeball image to obtain human eye position data and eyeball position data; a data calculation module for calculating feature data based on the human eye position data and the eyeball position data; a line of sight positioning module for obtaining preset calibration data and all The feature data is used to calculate the coordinates of the feature point that the user looks at corresponding to the feature data in the designated viewing area; wherein the preset calibration data is calibration data of a plurality of positioning points in the designated viewing area.

Beneficial effect

The eyeball tracking interaction method and device of the present application collect user images through a common camera, find human eyes and eyeballs from user images, calculate human eye positions and feature data of eyeball positions, and according to the feature data and preset calibration data The calculation is performed to obtain the coordinates of the user's line of sight corresponding to the feature data in the designated viewing area, thereby achieving the line of sight positioning. The feature data and calibration data of this application are collected according to the user's eye movement habits, and the human-computer interaction mode is friendly, easy to implement, and requires no additional equipment, and the cost is low.

BRIEF DESCRIPTION OF THE DRAWINGS

1 is a schematic flowchart of an eye tracking interaction method according to an embodiment of the present application;

Figure 2a of this application is a schematic diagram of each positioning point, Figure 2b is a schematic diagram of the division of the left region and the right region, and Figure 2c is a schematic diagram of the division of the upper region and the lower region of the present application;

3 is a schematic block diagram of a structure of an eye-tracking interactive device according to an embodiment of the present application;

4 is a schematic block diagram of a structure of a data calculation module in FIG. 3;

5 is a schematic block diagram of a structure of an eye-tracking interactive device according to another embodiment of the present application;

6 is a schematic block diagram of the structure of the line-of-sight positioning module in FIG. 5;

FIG. 7 is a schematic block diagram of a structure of a position preliminary judgment unit in FIG. 6; FIG.

FIG. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Best Mode of the Invention

Referring to FIG. 1, an embodiment of the present application provides an eye tracking interaction method, including:

S1. Obtain a user image of a user looking at a specified viewing area through a camera;

S2. Search for a human eye image and an eyeball image from the user image, and obtain human eye position data and eyeball position data;

S3. Calculate feature data according to the human eye position data and the eyeball position data;

S4. Calculate the coordinates of the feature point that the user looks at corresponding to the feature data in the designated viewing area according to the preset calibration data and the feature data; wherein the preset calibration data is designated viewing Calibration data for multiple anchor points in the area.

In this embodiment, the designated viewing area includes a terminal device interface for user-machine interaction, such as a smartphone display, a flat panel display, a smart TV display, a personal computer display, a laptop display, etc .; the camera includes The front camera and external camera of the terminal device, such as the front camera of a mobile phone. In step S1 of this embodiment, taking the eye movement control of the mobile phone display as an example, the user looks at the characteristic points of the mobile phone display at an appropriate distance from the mobile phone display according to his habit, and collects human eyes through the front camera of the mobile phone. An image of the feature point. Specifically, a camera can be used to continuously collect images in real time, and distinguish the state of the human eye through a pre-trained classifier. The preset states include gaze, eye movement, single eye blink, double eye blink, multiple blinks, etc. When any of the above states, user images are collected in real time. This application does not specifically limit the types of classifiers involved.

In step S2 of this embodiment, in order to improve the search efficiency and accuracy of the human eye image and eyeball image search, first search the human face image from the user image; then search the human eye image from the human face image, and according to The human eye image acquires human eye position data; finally, an eyeball image is searched from the human eye image, and eyeball position data is obtained according to the eyeball image.

First find the face image from the image. If no face image is found in the image, return to the step of obtaining the user image and adjust the relative position of the user and the specified viewing area until the face image can be found in the user image obtained by the camera . There are many ways to search for facial images, such as: using face rules (such as the distribution of eyes, nose, mouth, etc.) to perform face detection on the input image; by finding features that are invariant to the face (such as skin color, edges, textures) ) To perform face detection on the input image; describe the facial features of the face with a standard face template. When performing face detection, first calculate the correlation value between the input image and the standard face template, and then The obtained correlation value is compared with a preset threshold value to determine whether a face exists in the input image; the face area is regarded as a type of pattern, and a large amount of face data is used as a sample training to learn potential rules A classifier is constructed to detect faces by discriminating all possible region pattern attributes in the image. In this embodiment, the found face image is marked with a rectangular frame.

Find the human eye image from the face image. If no human eye image is found, return to the step of obtaining the user image and re-acquire the user image until the human eye image can be found in this step. Human eye search methods include template-based methods, statistics-based methods, and knowledge-based methods. Among them, the method based on template matching includes a gray projection template and a geometric feature template. The gray projection method refers to the horizontal and vertical projection of a gray image of a human face, and respectively counts the gray value and / or in two directions. The value of the gray function, find specific change points, and then combine the positions of change points in different directions according to prior knowledge to obtain the position of the human eye; the geometric feature template is implemented using the individual features and distribution features of the eyes as the basis Human eye detection. Statistics-based methods generally train and learn a large number of target samples and non-target samples to obtain a set of model parameters, and then build a classifier or filter to detect the target based on the model. The knowledge-based method is to determine the application environment of the image, summarize the knowledge (such as contour information, color information, position information) that can be used for human eye detection under specific conditions, and summarize them into rules that guide human eye detection. This embodiment uses a rectangular frame to frame the left-eye image and the right-eye image, respectively, to obtain the following human eye position data, including:

r ₁ : the distance from the upper left vertex of the rectangular frame of the left-eye image to the left-most face image;

t ₁ : the distance from the upper left vertex of the rectangular frame of the left eye image to the uppermost edge of the face image;

w ₁ : the width of the rectangular frame of the left-eye image; h ₁ : the height of the rectangular frame of the left-eye image;

r ₂ : the distance from the upper-left vertex of the rectangular frame of the right-eye image to the left-most face image;

t ₂ : the distance from the top left vertex of the rectangular frame of the right eye image to the uppermost edge of the face image;

w ₂ : the width of the rectangular frame of the right-eye image; h ₂ : the height of the rectangular frame of the right-eye image.

Finding an eyeball image from a human eye image includes finding a left eyeball image from a left eye image, and finding a right eyeball image from a right eye image. If no eyeball image is found, return to the step of obtaining a user image, and reacquire the user image until the eyeball image can be found in this step. Eyeball search methods include neural network method, extreme point position discrimination method of edge point integral projection curve, template matching method, multi-resolution mosaic map method, geometric and symmetry detection method, and Hough transform-based method. In this embodiment, the left eyeball image and the right eyeball image are found, and the left eyeball image and the right eyeball image are respectively framed by rectangular frames, and the following eyeball position data is obtained, including:

r ₃ : the distance between the top left vertex of the rectangular frame of the left eyeball image and the leftmost face of the face image;

t ₃ : the distance from the top left vertex of the rectangular frame of the left eyeball image to the uppermost edge of the face image;

w ₃ : the width of the rectangular frame of the left eyeball image; h ₃ : the height of the rectangular frame of the left eyeball image;

r ₄ : the distance between the top left vertex of the rectangular frame of the right eyeball image and the leftmost face of the face image;

t ₄ : the distance from the top left vertex of the rectangular frame of the right eyeball image to the uppermost edge of the face image;

w ₄ : the width of the rectangular frame of the right eyeball image; h ₄ : the height of the rectangular frame of the right eyeball image.

Specific parameters for obtaining eyeball position data from a human eye image are given in this embodiment. Based on the inventive concept of the present application, eyeball position data can also be obtained from a face image.

In this step S3-S4, the feature data is calculated according to the human eye position data and the eyeball position data obtained in step S2, and is compared with the calibration data of the positioning points in the designated viewing area collected in advance and calculated to obtain the user's line of sight. The coordinates of the feature point in the specified viewing area.

Further, the step S3 of calculating feature data based on the human eye position data and the eyeball position data includes:

S31. Calculate the distance feature data when the user looks at the feature point according to the human eye position data; and calculate the eyeball when the user looks at the feature point according to the human eye position data and the eyeball position data. Positional lateral feature data and eyeball position longitudinal feature data.

In this embodiment, the specific process of calculating the distance feature data in step S31 is as follows:

Calculate the coordinates (x ₁ , y ₁ ) of the center position of the left eye by formula ( ₁ ),

Pot (x ₁ , y ₁ ) = Pot (r ₁ + w _1/2 , t ₁ + h _1/2 ) (1)

Calculate the coordinates of the center position of the right eye (x ₂ , y ₂ ) by formula ( ₂ ),

_{_{Pot (x 2, y 2)}} = Pot (r 2 + w 2/2, t 2 + h 2/2) (2)

The distance d _x between the center of the left eye and the center of the right eye is calculated by formula (3), and d _{x is} the distance feature data.

The specific process of calculating horizontal feature data and vertical feature data is as follows:

Calculate the coordinates (x ₃ , y ₃ ) of the center position of the left eyeball by formula (4),

_{_{Pot (x 3, y 3)}} = Pot (r 3 + w 3/2, t 3 + h 3/2) (4)

Calculate the coordinates (x ₄ , y ₄ ) of the center position of the right eyeball by formula (5),

_{_{Pot (x 4, y 4)}} = Pot (r 4 + w 4/2, t 4 + h 4/2) (5)

The first lateral distance d ₁ between the center of the left eyeball and the leftmost side of the left eye image is calculated by formula (6): d ₁ = x _3- r ₁ (6)

The first longitudinal distance d ₃ between the center of the left eyeball and the uppermost edge of the left eye image is calculated by formula (7): d ₃ = y ₃ -t ₁ (7)

The second lateral distance d ₂ between the center of the right eyeball and the rightmost side of the right eye image is calculated by formula (8): d ₂ = r ₂ + w ₂ -x ₄ (8)

Calculate the second longitudinal distance d ₄ between the center of the right eyeball and the lowermost edge of the right eye image by formula (9): d ₄ = t ₂ + h ₂ -y ₄ (9)

Calculate the lateral characteristic data m _x by formula (10): m _x = d ₁ / d ₂ (10)

Calculate the longitudinal feature data n _x by formula (11): n _x = d ₃ / d ₄ (11)

The characteristic data _{_{(d x, m x, n}} x) by at step S31, the user looks to the feature points is obtained.

Further, before step S1 of obtaining a user image of the user looking at the designated viewing area through the camera, the method further includes: S01, retrieving a memory to determine whether the preset calibration data exists in the memory; S02, if not, then Storing the preset calibration data.

Before starting eye movement control of the mobile phone display, the user needs to determine whether the calibration has been performed. If no calibration data can be found in the memory, the eye movement control calibration is performed first. Specifically, referring to FIG. 2a, it is a schematic diagram of the positioning points of the designated viewing area, including 9 positioning points of upper left, upper middle, upper right, left middle, middle middle, right middle, lower left, middle lower, and lower right; see FIG. 2b. , Where the designated viewing area surrounded by top left, middle left, bottom left, bottom middle, middle middle, and top middle is the left area, and the designated viewing area surrounded by top right, middle right, bottom right, middle bottom, middle middle, and top middle is the right area ; Referring to FIG. 2c, the designated viewing areas surrounded by top left, middle left, center middle, right middle, top right, and top middle are top areas, and designated viewing areas surrounded by bottom left, left middle, middle center, right middle, bottom right, and bottom middle Is the lower area.

In step S02, the user looks at a positioning point of the mobile phone display at an appropriate distance from the display screen of the mobile phone according to his habit, and collects an image of the human eye looking at the positioning point through the front camera of the mobile phone. For example, the gaze time can be set in advance to remind the user to keep looking at the anchor point. When the preset gaze time is reached, the camera obtains the shooting instruction and collects the image; it can also use the camera to continuously collect the image in real time and distinguish it by the trained classifier. The state of the human eye. If it is determined that the human eye is in a fixation state, any frame image in the fixation state is acquired. Further searching for the human eye image and the eyeball image from the acquired image, obtaining the human eye position data and the eyeball position data; calculating a series of calibration data according to the human eye position data and the eyeball position data, and sequentially recording the calibration data Correspondence with the anchor point. Specifically, the user first looks at the upper left positioning point, and the camera collects the image of the human eye gazing at the upper left positioning point, searches for the human eye image and the eyeball image from the image, obtains the human eye position data and the eyeball position data, calculates the calibration data, and records the Correspondence between the calibration data and the upper left anchor point; the user starts to look at the upper left anchor point, and the rest of the steps are the same as the upper left anchor point; until the upper left, middle upper, right upper, left middle, middle middle, right middle, left lower, middle lower, and right The corresponding data of the calibration data and positioning points of the next 9 positioning points are collected. The method for obtaining the eye position data and the eye position data of the fixation point of the human eye from the image in this step S02 is the same as that of step S2, and details are not described herein. The calibration data in this step S02 includes distance calibration data, horizontal calibration data, and vertical calibration data. The calculation method of the distance calibration data is the same as the calculation method of the distance feature data in step S31. The calculation method of the horizontal calibration data is the same as the calculation method of the horizontal feature data in step S31. The calculation method of the vertical calibration data is the same as that of step S31. The calculation method of the data is the same, and will not be repeated here. The difference between the calibration data and the feature data in this embodiment is that the calibration data corresponds to the anchor points, and the feature data corresponds to the feature points pointed by the user's line of sight. Both are obtained according to the user's eye movement habits and use the same calculation method. The calculation is helpful to improve the accuracy of the calculation of the feature point coordinates.

In this step S02, the upper left (d ₁₁ , m ₁₁ , n ₁₁ ), the upper middle (d ₁₂ , m ₁₂ , n ₁₂ ), the upper right (d ₁₃ , m ₁₃ , n ₁₃ ), and the middle left (d ₂₁ , m ₂₁ , n ₂₁ ), middle (d ₂₂ , m ₂₂ , n ₂₂ ), right middle (d ₂₃ , m ₂₃ , n ₂₃ ), bottom left (d ₃₁ , m ₃₁ , n ₃₁ ), bottom middle (d ₃₂ , m ₃₂ , n ₃₂ ) and the calibration data of the 9 anchor points at the bottom right (d ₃₃ , m ₃₃ , n ₃₃ ). Where d ₁₁ , d ₁₂ , d ₁₃ , d ₂₁ , d ₂₂ , d ₂₃ , d ₃₁ , d ₃₂ and d ₃₃ are distance calibration data of each positioning point, m ₁₁ , m ₁₂ , m ₁₃ , m ₂₁ , m ₂₂ , M ₂₃ , m ₃₁ , m ₃₂ and m ₃₃ are the lateral calibration data of each positioning point, n ₁₁ , n ₁₂ , n ₁₃ , n ₂₁ , n ₂₂ , n ₂₃ , n ₃₁ , n ₃₂ and n ₃₃ are each positioning Point vertical calibration data.

Further, the preset calibration data includes distance calibration data, horizontal calibration data, and vertical calibration data; and based on the preset calibration data and the feature data, calculating the user's orientation corresponding to the feature data Step S4 of the coordinates of the feature point in the designated viewing area includes: S41. Determine whether the distance feature data is within a calibration range of the distance calibration data; S42. If yes, perform a preliminary judgment on the position of the feature point To obtain a position interval where the feature point is located in the designated viewing area; S43. Calculate the coordinates of the feature point according to a preset calculation formula corresponding to the position interval.

In this embodiment, step S41, the extracted nine anchor point distance calibration data _{_{_{d 11, d 12, d 13}}} , d 21, d 22, d 23, d 31, d 32 and d ₃₃ are the maximum value and the minimum Value to obtain the calibration range of the distance calibration data, and determine whether the distance characteristic data d _x is between the maximum value and the minimum value of the distance calibration data. Step S42, the feature if the distance between the data d _x is not above the maximum value and the minimum value, adjusting the user specified distance viewing area, wherein the data until the distance d _x falls between the maximum and minimum values. If the distance feature data d _{x is} between the above-mentioned maximum value and minimum value, a preliminary judgment is made on the position of the feature point to determine in which interval the feature point is located in the designated viewing area, such as in the left area or the right area, in the upper area or The lower area. Characterized by the required distance range data from the calibration data during calibration d _x falls, so the test specified user views the same distance from the measurement area or when very close, to improve the accuracy of gaze tracking.

In step S43, the coordinates (x _i , y _i ) of the feature point that the user is looking at within a designated viewing area are calculated according to preset calculation formulas corresponding to different position intervals, so as to implement line-of-sight tracking.

Specifically, the abscissa x _{i of the} feature point is calculated according to formula (12).

x _i = Q _x + R _x * ((m _x --m _min ) / (m _max -m _min )) (12)

Where Q _x is a constant, R _x is the total width pixel value of the specified viewing area / 2, m _x is the lateral characteristic data of the feature point, m _min is the minimum lateral calibration data of the position interval where the feature point is, and m _max is the feature point Maximum lateral calibration data for the location interval.

If the feature point is in the left area, then Q _x = 0, m _min is the minimum value among m ₁₁ , m _21, and m ₃₁ , and m _max is the maximum value among m ₁₂ , m _22, and m ₃₂ ; if the feature point is on the right Area, then Q _x = R _x , m _min is the minimum value among m ₁₂ , m _22, and m ₃₂ , and m _max is the maximum value among m ₁₃ , m _23, and m ₃₃ ; if the feature points are located in the left and right areas Where x _i = R _x .

The ordinate y _{i of the} feature point is calculated according to formula (13).

y _i = Q _y + R _y * ((n _x --n _min ) / (n _max -n _min )) (13)

Where Q _y is a constant, R _y is the total height pixel value of the specified viewing area / 2, n _x is the longitudinal characteristic data of the feature point, n _min is the minimum longitudinal calibration data of the position interval where the feature point is, and n _max is the feature point Maximum longitudinal calibration data for the location interval.

If the feature point is located in the upper area, then Q _y = 0, n _min is the minimum value among n ₁₁ , n ₁₂ and n ₁₃ , and n _max is the maximum value among n ₂₁ , n ₂₂ and n ₂₁ ; if the feature point is located below Area, then Q _y = R _y , m _min is the minimum value of n ₂₁ , n _22, and n ₂₁ , and n _max is the maximum value of n ₃₁ , n _32, and n ₃₃ ; if the feature points are located in the upper and lower regions Where y _i = R _y .

Further, the step of performing a preliminary judgment on the position of the feature point to obtain a position interval where the feature point is located in the designated viewing area includes: S431, performing the process by comparing the lateral feature data with the lateral calibration data. Compare the sizes to obtain the lateral position interval where the feature points are located in the designated viewing area; and obtain the feature points located in the longitudinal direction of the designated viewing area by comparing the longitudinal feature data with the longitudinal calibration data. Location interval.

In the determination of the lateral position interval in step S431, if min (m ₁₁ , m ₂₁ , m ₃₁ ) <m _x <m ₂₂ , the feature point is located in the left area; if m ₂₂ <m _x <max (m ₁₃ , m ₂₃ , m ₃₃ ), the feature point is located in the right area; if m _x = m ₂₂ , the feature point is located at the boundary between the left area and the right area; if m _x <min (m ₁₁ , m ₂₁ , m ₃₁ ) or m _x > max (m ₁₃ , m ₂₃ , m ₃₃ ), then the feature point is not on the designated viewing area, and it is necessary to re-enter step S1 to obtain the user image. Where min (m ₁₁ , m ₂₁ , m ₃₁ ) refers to the minimum value of m ₁₁ , m ₂₁ , m ₃₁ , and max (m ₁₃ , m ₂₃ , m ₃₃ ) refers to the maximum value of m ₁₃ , m ₂₃ , m ₃₃ . Judging the vertical position interval, if min (n ₁₁ , n ₁₂ , n ₁₃ ) <n _x <n ₂₂ , the feature point is located in the upper area; if n ₂₂ ＜ n _x ＜ max (n ₃₁ , n ₃₂ , n ₃₃ ) , Then the feature point is located in the lower area; if n _x = n ₂₂ , the feature point is located at the boundary between the upper area and the lower area; if n _x <min (n ₁₁ , n ₁₂ , n ₁₃ ) or n _x > max (n ₃₁ , n ₃₂ , n ₃₃ ), the feature point is not on the designated viewing area, and it is necessary to re-enter step S1 to obtain a user image. Where min (n ₁₁ , n ₁₂ , n ₁₃ ) refers to the minimum value of n ₁₁ , n ₁₂ , n ₁₃ , and max (n ₃₁ , n ₃₂ , n ₃₃ ) refers to the maximum value of n ₃₁ , n ₃₂ , n ₃₃ .

Referring to FIG. 3, the present application also proposes an eye-tracking interaction device, including:

An image acquisition module 10 is configured to acquire a user image of a user looking at a specified viewing area through a camera; an image analysis module 20 is configured to find a human eye image and an eyeball image from the user image, and obtain human eye position data and eyeball position data A data calculation module 30 configured to calculate feature data based on the human eye position data and the eyeball position data; a line of sight positioning module 40 configured to calculate a correspondence of the feature data according to preset calibration data and the feature data The coordinate of the feature point that the user is looking at is in the designated viewing area; wherein the preset calibration data is calibration data of a plurality of positioning points in the designated viewing area.

In this embodiment, the designated viewing area includes a terminal device interface for user-machine interaction, such as a smartphone display, a flat panel display, a smart TV display, a personal computer display, a laptop display, etc .; the camera includes The front camera and external camera of the terminal device, such as the front camera of a mobile phone. In the image acquisition module 10 of this embodiment, taking the eye movement control of the mobile phone display as an example, the user looks at the characteristic points of the mobile phone display at an appropriate distance from the mobile phone display according to his habits, and collects through the mobile phone's front camera. The human eye looks at the image of the feature point. Specifically, the real-time image acquisition unit can continuously acquire images in real time with a camera, and then distinguish the state of the human eye through a pre-trained classifier in the classification unit. The preset states include gaze, eye movement, single-eye blink, double-eye blink, When blinking multiple times, etc., when it is determined that the human eye is in any of the above states, the user image is collected in real time by the image acquisition unit.

In the image analysis module 20 of this embodiment, in order to improve the search efficiency and accuracy of the human eye image and eyeball image search, the human face image is first searched from the user image through the human face search unit; The human eye image is searched in the face image, and the human eye position data is obtained according to the human eye image; finally, the eyeball image is searched from the human eye image through the eyeball search unit, and the eyeball position data is obtained according to the eyeball image.

Find the human eye image from the face image. If no human eye image is found, return to the step of obtaining the user image and re-acquire the user image until the human eye image can be found in this step. Human eye search methods include template-based methods, statistics-based methods, and knowledge-based methods. Among them, the method based on template matching includes a gray projection template and a geometric feature template. The gray projection method refers to the horizontal and vertical projection of a gray image of a human face, and respectively counts the gray value and / or in two directions. The value of the gray function, find specific change points, and then combine the positions of change points in different directions according to prior knowledge to obtain the position of the human eye; the geometric feature template is implemented using the individual features and distribution features of the eyes as the basis Human eye detection. Statistics-based methods generally train and learn a large number of target samples and non-target samples to obtain a set of model parameters, and then build a classifier or filter to detect the target based on the model. The knowledge-based method is to determine the application environment of the image, summarize the knowledge (such as contour information, color information, position information) that can be used for human eye detection under specific conditions, and summarize them into rules that guide human eye detection. In this embodiment, the left-eye image and the right-eye image are framed by rectangular frames, and the following eye position data is obtained, including:

Finding an eyeball image from a human eye image includes finding a left eyeball image from a left eye image, and finding a right eyeball image from a right eye image. If no eyeball image is found, return to the step of obtaining a user image, and reacquire the user image until the eyeball image can be found in this step. Eyeball search methods include neural network method, extreme point position discrimination method of edge point integral projection curve, template matching method, multi-resolution mosaic map method, geometric and symmetry detection method, and Hough transform-based method. This embodiment uses a rectangular frame to frame the left eyeball image and the right eyeball image, respectively, and obtains the following eyeball position data, including:

In the data calculation module 30 and the line-of-sight positioning module 40 of this embodiment, the feature data is calculated according to the human eye position data and the eye position data obtained by the image analysis module 20, and is compared with the calibration data of the positioning points in the specified viewing area collected in advance and Calculate to obtain the coordinates of the feature point that the user looks at in the specified viewing area.

Further, referring to FIG. 4, the data calculation module 30 includes: a first data obtaining unit 301, configured to calculate distance feature data when a user looks at the feature point according to the position data of the human eye; and second data acquisition A unit 302 is configured to calculate, according to the position data of the human eye and the position data of the eyeball, the lateral feature data of the eyeball position and the longitudinal feature data of the eyeball position when the user looks at the feature point.

In this embodiment, the specific process of calculating the distance feature data by the first data acquisition unit 301 is as follows: calculating the coordinates (x ₁ , y ₁ ) of the center position of the left eye by using the first calculation subunit according to formula (14),

Pot (x ₁ , y ₁ ) = Pot (r ₁ + w _1/2 , t ₁ + h _1/2 ) (14)

Calculate the coordinates (x ₂ , y ₂ ) of the center position of the right eye according to formula (15) through the first calculation subunit,

_{_{Pot (x 2, y 2)}} = Pot (r 2 + w 2/2, t 2 + h 2/2) (15)

The distance d _x between the center of the left eye and the center of the right eye is calculated by the second calculation subunit according to formula (16), where d _x is distance feature data.

In this embodiment, the specific process of calculating the horizontal feature data and the vertical feature data by the second data obtaining unit 302 is as follows: calculate the coordinates (x ₃ , y ₃ ) of the center position of the left eyeball by using formula (17),

_{_{Pot (x 3, y 3)}} = Pot (r 3 + w 3/2, t 3 + h 3/2) (17)

Calculate the coordinates (x ₄ , y ₄ ) of the center position of the right eyeball by formula (18),

_{_{Pot (x 4, y 4)}} = Pot (r 4 + w 4/2, t 4 + h 4/2) (18)

The first lateral distance d ₁ between the center of the left eyeball and the leftmost side of the left eye image is calculated by formula (19): d ₁ = x _3- r ₁ (19)

The first longitudinal distance d ₃ between the center of the left eyeball and the uppermost edge of the left eye image is calculated by formula (20): d ₃ = y ₃ -t ₁ (20)

The second lateral distance d ₂ between the center of the right eyeball and the rightmost side of the right eye image is calculated by formula (21): d ₂ = r ₂ + w ₂ -x ₄ (21)

Calculate the second longitudinal distance d ₄ between the center of the right eyeball and the lowermost edge of the right eye image by formula (22): d ₄ = t ₂ + h ₂ -y ₄ (22)

Calculate the lateral characteristic data m _x by formula (23): m _x = d ₁ / d ₂ (23)

Calculate the longitudinal feature data n _x by formula (24): n _x = d ₃ / d ₄ (24)

Through the first data obtaining unit 301 and the second data obtaining unit 302, feature data (d _x , m _x , n _x ) when a user looks at a feature point is obtained.

Further, referring to FIG. 5, the eyeball tracking interaction device further includes:

A judgment module 01 is configured to retrieve a memory to determine whether the preset calibration data exists in the memory; a calibration module 02 is configured to store the preset if the preset calibration data does not exist in the memory Calibration data.

In this embodiment, before starting the eye movement control of the mobile phone display, the user needs to determine whether calibration has been performed. If no calibration data can be found in the memory, the eye movement control calibration is performed first. Specifically, referring to FIG. 4, it is a schematic diagram of an anchor point for a designated viewing area, including nine anchor points of upper left, upper middle, upper right, left middle, middle middle, right middle, lower left, middle lower, and lower right. The designated viewing area surrounded by left middle, bottom left, bottom middle, middle middle, and top middle is the left area, and the designated viewing area surrounded by top right, middle right, bottom right, middle bottom, middle, and middle top is the right area, top left, left The designated viewing areas surrounded by middle, middle, right, middle right, top and middle are the top areas, and the designated viewing areas surrounded by bottom left, left middle, middle, right middle, bottom right, and bottom middle are the bottom areas. In the calibration module 02, the user looks at a positioning point of the mobile phone display at an appropriate distance from the display screen of the mobile phone according to his own habits, and collects an image of the human eye looking at the positioning point through the front camera of the mobile phone. For example, the gaze time can be set in advance to remind the user to keep looking at the anchor point. When the preset gaze time is reached, the camera obtains the shooting instruction and collects the image; it can also use the camera to continuously collect the image in real time and distinguish it by the trained classifier. The state of the human eye. If it is determined that the human eye is in a fixation state, any frame image in the fixation state is acquired. Further, the human eye image and the eyeball image are searched from the obtained image to obtain human eye position data and eyeball position data; a series of calibration data is calculated according to the human eye position data and the eyeball position data, and the calibration data and The correspondence between the anchor points. Specifically, the user first looks at the upper left positioning point, and the camera collects the image of the human eye gazing at the upper left positioning point, searches for the human eye image and the eyeball image from the image, obtains the human eye position data and the eyeball position data, calculates the calibration data, and records the Correspondence between the calibration data and the upper left anchor point; the user starts to look at the upper left anchor point, and the rest of the steps are the same as the upper left anchor point; until the upper left, middle upper, right upper, left middle, middle middle, right middle, left lower, middle lower, and right The corresponding data of the calibration data and positioning points of the next 9 positioning points are collected. The method for obtaining the eye position data and the eye position data of the fixation point of the human eye from the image in the calibration module 02 is the same as that of the image analysis module 20, and details are not described herein. The preset calibration data in the calibration module 02 includes distance calibration data, horizontal calibration data, and vertical calibration data. The calculation method of the distance calibration data is the same as the calculation method of the first data acquisition unit 301, and the calculation method of the horizontal calibration data and the vertical calibration data is the same as the calculation method of the second data acquisition unit 302, and details are not described herein. The difference between the calibration data and the feature data in this embodiment is that the calibration data corresponds to the anchor points, and the feature data corresponds to the feature points pointed by the user's line of sight. Both are obtained according to the user's eye movement habits and use the same calculation method. The calculation is helpful to improve the accuracy of the calculation of the feature point coordinates. The calibration module 02 obtains upper left (d ₁₁ , m ₁₁ , n ₁₁ ), upper middle (d ₁₂ , m ₁₂ , n ₁₂ ), upper right (d ₁₃ , m ₁₃ , n ₁₃ ), left middle (d ₂₁ , m ₂₁ , n ₂₁ ), middle (d ₂₂ , m ₂₂ , n ₂₂ ), right middle (d ₂₃ , m ₂₃ , n ₂₃ ), bottom left (d ₃₁ , m ₃₁ , n ₃₁ ), bottom middle (d ₃₂ , m ₃₂ , n ₃₂ ) and the calibration data of the 9 anchor points at the bottom right (d ₃₃ , m ₃₃ , n ₃₃ ). Where d ₁₁ , d ₁₂ , d ₁₃ , d ₂₁ , d ₂₂ , d ₂₃ , d ₃₁ , d ₃₂ and d ₃₃ are distance calibration data of each positioning point, m ₁₁ , m ₁₂ , m ₁₃ , m ₂₁ , m ₂₂ , M ₂₃ , m ₃₁ , m ₃₂ and m ₃₃ are the lateral calibration data of each positioning point, n ₁₁ , n ₁₂ , n ₁₃ , n ₂₁ , n ₂₂ , n ₂₃ , n ₃₁ , n ₃₂ and n ₃₃ are each positioning Point vertical calibration data.

Further, referring to FIG. 6, the preset calibration data includes distance calibration data, horizontal calibration data, and vertical calibration data; the line-of-sight positioning module 40 includes:

The distance judging unit 401 is configured to judge whether the distance feature data is within a calibration range of the distance calibration data; the position preliminary judgment unit 402 is configured to, if the distance feature data is within a calibration range of the distance calibration data, Performing a preliminary judgment on the position of the feature point to obtain a position interval where the feature point is located in the designated viewing area; a coordinate calculation unit 403 is configured to calculate the feature point according to a preset calculation formula corresponding to the position interval; coordinate of.

In this embodiment, the distance judgment unit 401 extracts the largest of the distance calibration data d ₁₁ , d ₁₂ , d ₁₃ , d ₂₁ , d ₂₂ , d ₂₃ , d ₃₁ , d ₃₂ and d ₃₃ of the nine positioning points. Value and minimum value to obtain the calibration range of the distance calibration data, and determine whether the distance characteristic data d _x is between the maximum value and the minimum value of the distance calibration data. Initial impression of the position unit 402, if the distance between the feature data d _x is not above the maximum value and the minimum value, re-entering the image acquisition module 10 acquires user image. If the distance feature data d _{x is} between the above-mentioned maximum value and minimum value, a preliminary judgment is made on the position of the feature point to determine in which interval the feature point is located in the designated viewing area, such as in the left area or the right area, in the upper area or The lower area. Characterized by the required distance range data from the calibration data during calibration d _x falls, so the test specified user views the same distance from the measurement area or when very close, to improve the accuracy of gaze tracking. In the coordinate calculation unit 403, the coordinates (x _i , y _i ) of the feature point that the user is looking at within a specified viewing area are calculated according to preset calculation formulas corresponding to different position intervals, so as to implement line-of-sight tracking.

x _i = Q _x + R _x * ((m _x --m _min ) / (m _max -m _min )) (12)

If the feature point is in the left area, then Q _x = 0, m _min is the minimum value among m ₁₁ , m ₂₁ and m ₃₁ , and m _max is the maximum value among m ₁₂ , m ₂₂ and m ₃₂ ; if the feature point is on the right Area, then Q _x = R _x , m _min is the minimum value of m ₁₂ , m _22, and m ₃₂ , and m _max is the maximum value of m ₁₃ , m _23, and m ₃₃ ; if the feature points are located on the left and right areas Where x _i = R _x .

y _i = Q _y + R _y * ((n _x --n _min ) / (n _max -n _min )) (13)

Further, referring to FIG. 7, the position preliminary judgment unit 402 includes a first preliminary judgment sub-unit 4021 configured to obtain the feature point at the position by comparing the lateral feature data with the lateral calibration data. A horizontal position interval of the designated viewing area; a second preliminary sub-unit 4022 is configured to obtain a vertical position interval of the feature point in the specified viewing area by comparing the vertical feature data with the vertical calibration data.

In the first preliminary sub-unit 4021, if min (m ₁₁ , m ₂₁ , m ₃₁ ) <m _x <m ₂₂ , the feature point is located in the left area; if m ₂₂ <m _x <max (m ₁₃ , m ₂₃ , m ₃₃ ), the feature point is located in the right area; if m _x = m ₂₂ , the feature point is located at the boundary between the left area and the right area; if m _x <min (m ₁₁ , m ₂₁ , m ₃₁ ) or m _x > max ( m ₁₃ , m ₂₃ , m ₃₃ ), the feature point is not on the designated viewing area, and the user needs to re-enter the image acquisition module 10 to obtain the user image. Where min (m ₁₁ , m ₂₁ , m ₃₁ ) refers to the minimum value of m ₁₁ , m ₂₁ , m ₃₁ , and max (m ₁₃ , m ₂₃ , m ₃₃ ) refers to the maximum value of m ₁₃ , m ₂₃ , m ₃₃ . In the second preliminary judgment sub-unit 4022, if min (n ₁₁ , n ₁₂ , n ₁₃ ) <n _x <n ₂₂ , the feature point is located in the upper area; if n ₂₂ <n _x <max (n ₃₁ , n ₃₂ , n ₃₃ ), the feature point is located in the lower area; if n _x = n ₂₂ , the feature point is located at the boundary between the upper area and the lower area; if n _x <min (n ₁₁ , n ₁₂ , n ₁₃ ) or n _x > max ( n ₃₁ , n ₃₂ , n ₃₃ ), then the feature points are not on the designated viewing area, and the image acquisition module 10 needs to be re-entered to acquire the user image. Where min (n ₁₁ , n ₁₂ , n ₁₃ ) refers to the minimum value of n ₁₁ , n ₁₂ , n ₁₃ , and max (n ₃₁ , n ₃₂ , n ₃₃ ) refers to the maximum value of n ₃₁ , n ₃₂ , n ₃₃ .

The present application also proposes a computer device 03, which includes a processor 04, a memory 01, and a computer program 02 stored on the memory 01 and executable on the processor 04. The processor 04 is implemented when the computer program 02 is executed The above-mentioned method for acquiring eye movement control calibration data.

Claims

An eye-tracking interaction method includes:

Obtaining a user image of a user looking at a specified viewing area through a camera;

Searching for a human eye image and an eyeball image from the user image, and obtaining human eye position data and eyeball position data;

Calculating feature data according to the human eye position data and the eyeball position data;

Calculate, according to the preset calibration data and the feature data, the coordinates of the feature point that the user looks at corresponding to the feature data in the designated viewing area; wherein the preset calibration data is within the designated viewing area Calibration data for multiple anchor points.
The eyeball tracking interaction method according to claim 1, wherein the step of calculating feature data based on the human eye position data and the eyeball position data comprises:

Calculate distance feature data when the user looks at the feature point according to the human eye position data; and calculate lateral position of the eyeball when the user looks at the feature point according to the human eye position data and the eyeball position data Feature data and longitudinal feature data of eyeball position.
The eyeball tracking interaction method according to claim 2, wherein the step of calculating distance feature data when the user looks at the feature point according to the position data of the human eye comprises:

Calculate left eye center position coordinates according to left eye position data included in the human eye position data; and calculate right eye center position coordinates according to right eye center position data included in the human eye position data; according to the left The eye center position coordinates and the right eye center position coordinates are used to calculate the distance between the left eye center and the right eye center to obtain the distance feature data.
The eye tracking interaction method according to claim 1, wherein before the step of obtaining, by a camera, a user image of a user looking at a specified viewing area, further comprising:

Searching the memory to determine whether the preset calibration data exists in the memory;

If not, the preset calibration data is stored.
The interactive method for eye tracking according to claim 4, wherein the preset calibration data includes distance calibration data, horizontal calibration data and vertical calibration data; and according to the preset calibration data and the feature data, The step of calculating the coordinates of the feature point that the user looks at corresponding to the feature data in the designated viewing area includes:

Determining whether the distance characteristic data is within a calibration range of the distance calibration data;

If yes, perform a preliminary judgment on the position of the feature point to obtain a position interval where the feature point is located in the designated viewing area;

The coordinates of the feature points are calculated according to a preset calculation formula corresponding to the position interval.
The eye tracking interaction method according to claim 5, wherein the step of performing a preliminary judgment on the position of the feature point to obtain a position interval where the feature point is located in the designated viewing area, comprises:

Comparing the lateral feature data with the horizontal calibration data to obtain a lateral position interval where the feature point is located in the designated viewing area; and comparing the longitudinal feature data with the vertical calibration data. To obtain a vertical position interval where the feature point is located in the designated viewing area.
The eyeball tracking interaction method according to claim 1, wherein the step of searching for a human eye image and an eyeball image from the user image, and obtaining human eye position data and eyeball position data comprises:

Searching for a face image from the user image;

Find a human eye image from the face image, and obtain human eye position data according to the human eye image; find an eyeball image from the human eye image, and obtain eyeball position data according to the eyeball image.
The eye-tracking interaction method according to claim 1, wherein the step of acquiring, by a camera, a user image of a user looking at a specified viewing area comprises:

Obtain images captured by the camera in real time;

Classify the state of the human eye contained in the image collected in real time through a pre-trained classifier, respectively;

When the human eye is in a preset state, the user image is acquired from the image acquired in real time.
An eye-tracking interactive device includes:

An image acquisition module, configured to acquire a user image of a user looking at a specified viewing area through a camera;

An image analysis module, configured to find a human eye image and an eyeball image from the user image, and obtain human eye position data and eyeball position data;

A data calculation module, configured to calculate feature data according to the human eye position data and the eyeball position data;

The line-of-sight positioning module is configured to calculate, according to preset calibration data and the feature data, the coordinates of the feature point corresponding to the feature data that the user looks at in the designated viewing area; wherein the preset calibration The data is calibration data of a plurality of anchor points in a specified viewing area.
The interactive device for eye tracking according to claim 9, wherein the data calculation module comprises:

A first data obtaining unit, configured to calculate distance feature data when a user looks at the feature point according to the human eye position data;

A second data obtaining unit is configured to calculate, based on the position data of the human eye and the position data of the eyeball, lateral feature data of the eyeball position when the user looks at the feature point and longitudinal feature data of the eyeball position.
The eye-tracking interactive device according to claim 10, wherein the first data acquisition unit comprises:

A first calculation subunit, configured to calculate coordinates of the center position of the left eye according to the position data of the left eye included in the position data of the human eye; and calculate the right eye according to the position data of the center of the right eye included in the position data of the human eye Center position coordinates

A second calculation subunit is configured to calculate a distance between the center of the left eye and the center of the right eye according to the coordinates of the center position of the left eye and the coordinates of the center position of the right eye to obtain the distance feature data.
The interactive device for eye tracking according to claim 9, wherein before the image acquisition module further comprises:

A judging module for retrieving a memory to judge whether the preset calibration data exists in the memory; a calibration module for storing the preset calibration if the preset calibration data does not exist in the memory data.
The interactive device for eye-tracking according to claim 12, wherein the preset calibration data comprises distance calibration data, horizontal calibration data and vertical calibration data; and the line of sight positioning module comprises:

A distance judging unit, configured to judge whether the distance characteristic data is within a calibration range of the distance calibration data;

A preliminary position determining unit, configured to perform a preliminary position determination on the feature point if the distance feature data is within a calibration range of the distance calibration data, to obtain a position interval where the feature point is located in the designated viewing area;

A coordinate calculation unit is configured to calculate coordinates of the feature points according to a preset calculation formula corresponding to the position interval.
The eye-tracking interactive device according to claim 13, wherein the position preliminary judgment unit comprises:

A first preliminary sub-unit, configured to obtain a lateral position interval where the feature point is located in the designated viewing area by comparing the lateral feature data with the lateral calibration data;

A second preliminary sub-unit is configured to obtain a longitudinal position interval where the feature point is located in the designated viewing area by comparing the longitudinal feature data with the longitudinal calibration data.
The interactive device for eye tracking according to claim 9, wherein the image analysis module comprises:

A face finding unit, configured to find a face image from the user image;

A human eye searching unit, configured to search for a human eye image from the human face image, and obtain human eye position data according to the human eye image;

The eyeball search unit is configured to search an eyeball image from the human eye image, and obtain eyeball position data according to the eyeball image.
The eye-tracking interactive device according to claim 9, wherein the image acquisition module comprises:

A real-time image acquisition unit, configured to acquire a real-time image acquired by a camera;

A classification unit, configured to classify the state of the human eye included in the image collected in real time through a pre-trained classifier;

An image acquisition unit is configured to acquire the user image from the images collected in real time when the human eye is in a preset state.
A computer device, comprising a processor, a memory, and a computer program stored on the memory and executable on the processor. The processor implements the computer program according to claim 1 when executing the computer program. The eye tracking interaction method according to any one of to 8.