CN111966219B - Eye movement tracking method, device, equipment and storage medium - Google Patents

Eye movement tracking method, device, equipment and storage medium

Info

Publication number
CN111966219B
CN111966219B (application CN202010700876.2A)
Authority
CN
China
Prior art keywords
eye
glasses
image
person
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010700876.2A
Other languages
Chinese (zh)
Other versions
CN111966219A (en)
Inventor
闫野
马权智
印二威
邓宝松
刘冠军
宋明武
范晓丽
谢良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center, National Defense Technology Innovation Institute PLA Academy of Military Science filed Critical Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center
Priority to CN202010700876.2A priority Critical patent/CN111966219B/en
Publication of CN111966219A publication Critical patent/CN111966219A/en
Application granted granted Critical
Publication of CN111966219B publication Critical patent/CN111966219B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/18 Eye characteristics, e.g. of the iris
    • G06V 40/193 Preprocessing; Feature extraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Ophthalmology & Optometry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an eye movement tracking method, device, equipment and storage medium, wherein the eye movement tracking method comprises the following steps: acquiring eye images of people with glasses and eye images of people without glasses as a data set; training a neural network model according to the data set; preprocessing the eye images of people with glasses according to the neural network model; and performing eye movement tracking according to the preprocessed images. The method preprocesses the eye images of glasses wearers so that they can enjoy the same high-precision eye movement tracking as ordinary people, and it interfaces seamlessly with existing eye-tracking algorithms.

Description

Eye movement tracking method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an eye movement tracking method, an eye movement tracking device, an eye movement tracking apparatus, and a storage medium.
Background
Eye tracking refers to following the movement of the eyes by measuring either the point of gaze or the motion of the eye relative to the head.
At present, most eye-tracking methods are based on video-optical recording: a camera captures images of the human eye, and the gaze direction is analyzed from the image information. Optical recording methods can be further divided into those without and those with light-source assistance. Without light-source assistance, only the camera captures the eye image and an image-processing algorithm extracts the eye features; the hardware is simple, but the algorithm is complex. With light-source assistance, a special light source such as infrared light illuminates the eye, producing reflected glints or bright-pupil and dark-pupil effects that help compute the gaze direction of the human eye. Although adding a light source raises the hardware requirements of an eye-tracking system, those skilled in the art usually adopt light-source-assisted optical recording because extracting the gaze direction is simpler.
However, for people who wear glasses, when the light source illuminates their eyes, the lenses reflect the light and produce excessive light spots that occlude or even completely hide the eye features, which greatly reduces the accuracy of eye tracking or makes tracking impossible.
Disclosure of Invention
The embodiments of the disclosure provide an eye movement tracking method, device, equipment and storage medium. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended neither to identify key or critical elements nor to delineate the scope of the embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description presented later.
In a first aspect, an embodiment of the present disclosure provides an eye movement tracking method, including:
acquiring eye images of the person with glasses and eye images of the person without glasses as a data set;
training a neural network model according to the data set;
preprocessing an eye image of a person with glasses according to a neural network model;
and performing eye movement tracking according to the preprocessed image.
Further, acquiring an eye image of a person with glasses and an eye image of a person without glasses as a data set includes:
the data set is divided into a training set, a test set and a validation set.
Further, training the neural network model according to the data set, comprising:
and taking the eye images of persons with glasses as the input of the neural network model, taking the eye images of persons without glasses as the target of the neural network model, and training the neural network model by minimizing a loss function via gradient descent.
Further, before preprocessing the eye image of the person with glasses according to the neural network model, the method further comprises:
judging whether the eye image is an eye image of a person with glasses;
and when the eye image is the eye image of the person with the glasses, preprocessing the eye image of the person with the glasses according to the neural network model.
Further, preprocessing the eye image of the person with glasses according to the neural network model, including:
and performing, according to the neural network model, processing on the eye image of the person with glasses to eliminate redundant light spots and eliminate the glasses.
Further, performing eye movement tracking according to the preprocessed image, including:
and performing eye tracking on the preprocessed image through an eye tracking algorithm.
Further, the eye-tracking algorithm includes an appearance-based eye-tracking algorithm or a feature-based eye-tracking algorithm.
In a second aspect, embodiments of the present disclosure provide an eye-tracking device comprising:
the acquisition module is used for acquiring eye images of persons with glasses and eye images of persons without glasses as a data set;
the training module is used for training the neural network model according to the data set;
the preprocessing module is used for preprocessing the eye images of the person with the glasses according to the neural network model;
and the tracking module is used for carrying out eye movement tracking according to the preprocessed image.
In a third aspect, an embodiment of the present disclosure provides an eye tracking device, including a processor and a memory storing program instructions, the processor being configured to perform the eye tracking method provided by the above embodiment when the program instructions are executed.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having computer readable instructions stored thereon, the computer readable instructions being executable by a processor to implement an eye tracking method provided by the above embodiments.
The technical scheme provided by the embodiment of the disclosure can comprise the following beneficial effects:
According to the eye movement tracking method provided by the embodiments of the disclosure, the eye images of glasses wearers can be preprocessed to eliminate the glasses and redundant light spots in the images, so that glasses wearers can enjoy the same high-precision eye movement tracking as ordinary people, and other eye-tracking algorithms can be docked seamlessly.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart illustrating a method of eye tracking according to an exemplary embodiment;
FIG. 2 is a flow chart illustrating a method of eye tracking according to an exemplary embodiment;
FIG. 3 is a schematic diagram of a head mounted device shown according to an example embodiment;
FIG. 4 is a schematic diagram of an eye image before and after preprocessing, according to an exemplary embodiment;
FIG. 5 is a schematic diagram of an eye tracking device according to an exemplary embodiment;
FIG. 6 is a schematic diagram of an eye-tracking device according to an exemplary embodiment;
fig. 7 is a schematic diagram of a storage medium according to an exemplary embodiment.
Detailed Description
So that the manner in which the features and techniques of the disclosed embodiments can be understood in more detail, a more particular description of the embodiments, briefly summarized above, is given below with reference to the appended drawings, which are not intended to limit the embodiments of the disclosure. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments; however, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawings.
For people who wear glasses, when the light source illuminates their eyes, the lenses reflect the light and produce redundant light spots, so that the eye features are occluded or even invisible; this greatly affects the accuracy of eye tracking and may make tracking impossible. The eye movement tracking method of the present disclosure acquires eye images of people with glasses and eye images of people without glasses as a data set, trains a neural network model according to the data set, preprocesses the eye images of people with glasses according to the trained model by eliminating the glasses and redundant light spots, and performs eye tracking on the processed images through an eye-tracking algorithm. In this way, people who wear glasses can enjoy the same high-precision eye tracking as ordinary people, and other eye-tracking algorithms can be docked seamlessly.
An eye tracking method, apparatus, device and storage medium according to embodiments of the present disclosure will be described in detail with reference to fig. 1 to 7.
Referring to fig. 1, the method specifically includes the following steps:
step S101, acquiring eye images of the glasses staff and eye images of the non-glasses staff as data sets;
specifically, a large number of eye images of the person with glasses and eye images of the person without glasses are acquired as a data set, wherein paired eye images of the person with glasses and eye images of the person without glasses can be obtained, unpaired eye images of the person with glasses and eye images of the person without glasses can be obtained, in a possible implementation manner, from public data, and can also be acquired by a head-mounted device, fig. 3 is a schematic diagram of a head-mounted device according to an exemplary embodiment, and a miniature camera with a light source is installed below the head-mounted device to capture eye images of an acquired person, as shown in fig. 3.
Once a large number of eye images of people with and without glasses have been collected, the image data serve as the data set for training the neural network model. The data set is divided into a training set, a test set and a validation set: the model is trained on the training set, tested on the test set, and the trained model is verified on the validation set.
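For illustration, a minimal PyTorch sketch of such a split is given below; the tensor shapes, the 8:1:1 ratio and the random seed are assumptions for the sketch, not values fixed by the method.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-ins for real eye-image crops: inputs are eyes seen through
# glasses, targets the corresponding eyes without glasses.
with_glasses = torch.rand(1000, 1, 128, 128)
without_glasses = torch.rand(1000, 1, 128, 128)
dataset = TensorDataset(with_glasses, without_glasses)

# 8:1:1 train/test/validation split (an illustrative choice).
n = len(dataset)
n_train, n_test = int(0.8 * n), int(0.1 * n)
train_set, test_set, val_set = random_split(
    dataset,
    [n_train, n_test, n - n_train - n_test],
    generator=torch.Generator().manual_seed(0),  # reproducible split
)
```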
Step S102, training a neural network model according to the data set;
specifically, a proper neural network structure is designed, the structure of the neural network adopts a depth residual error network, a pooling layer is replaced by stride convolution or micro stride convolution to up-sample or down-sample an image, the network comprises a plurality of residual error blocks, all non-residual error convolution layers are followed by a non-linear layer with a normalization and activation function of RELU, and an output layer uses a scaled tanh function to ensure that output image pixels are between [0 and 255 ]. The first and last layers use a 9 x 9 convolution kernel and all the remaining convolution layers use a 3 x 3 convolution kernel. In order to ensure that the glasses and redundant light spots are eliminated without affecting other details of the image, the U-net structure is used for fusing the features of different scales, the jump-level connection ensures that the features recovered by up-sampling cannot be rough, and meanwhile, the colors and the sizes of the input and output images are consistent.
The eye images of people with glasses are then taken as the network input and the eye images of people without glasses as the target, and a suitable neural network model is trained by minimizing a loss function via gradient descent. The network may be pix2pix, CycleGAN, pix2pixHD or any other neural network capable of eliminating glasses and unwanted light spots.
In one possible implementation, the neural network model employs CycleGAN, which does not require paired data: it needs only a set of input data, i.e. eye images of people with glasses, and a set of output data, i.e. eye images of people without glasses. Given training samples {x_i} in domain X (with glasses) and {y_j} in domain Y (without glasses), the model contains two mappings, G: X → Y and F: Y → X, and introduces two adversarial discriminators D_X and D_Y: D_X discriminates between {x} and {F(y)}, and D_Y discriminates between {y} and {G(x)}. The optimization objective consists of two terms:
(1) Adversarial loss, whose purpose is to make the distribution of the generated images approximate the image distribution of the target domain.
For the mapping function G: X → Y and its discriminator D_Y, the objective function is:
L_GAN(G, D_Y, X, Y) = E_{y~p_data(y)}[log D_Y(y)] + E_{x~p_data(x)}[log(1 - D_Y(G(x)))]
where G generates the image G(x) with the glasses and redundant light spots removed, and D_Y distinguishes the generated G(x) from real glasses-free samples y.
For the mapping function F: Y → X and its discriminator D_X, the objective function is symmetric:
L_GAN(F, D_X, Y, X) = E_{x~p_data(x)}[log D_X(x)] + E_{y~p_data(y)}[log(1 - D_X(F(y)))]
(2) Cycle consistency loss, whose purpose is to make F(G(x)) ≈ x and G(F(y)) ≈ y, preventing the learned mappings F and G from contradicting each other. The objective function is:
L_cyc(G, F) = E_{x~p_data(x)}[||F(G(x)) - x||_1] + E_{y~p_data(y)}[||G(F(y)) - y||_1]
The overall optimization objective is:
L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ·L_cyc(G, F)
In the training phase, on the one hand the discriminators D_X and D_Y are trained, continuously strengthening their ability to distinguish generated images from real ones; on the other hand the generators F and G are trained, so that the generated images become more and more similar to the target-domain images while the cycle consistency loss keeps decreasing. This is expressed as:
G*, F* = arg min_{G,F} max_{D_X,D_Y} L(G, F, D_X, D_Y)
further, the neural network model is tested by the test set, and the trained neural network model is verified by the verification set.
Through this step, a trained neural network model capable of preprocessing the eye images of people with glasses is obtained.
Step S103, preprocessing an eye image of a person with glasses according to a neural network model;
before preprocessing the eye image of the person with the glasses, judging whether the eye image is the eye image of the person with the glasses, and when the eye image is the eye image of the person with the glasses, preprocessing the eye image of the person with the glasses according to the trained neural network model, wherein the processing comprises the steps of eliminating redundant light spots and eliminating the glasses on the eye image of the person with the glasses.
Specifically, the eye image of the person with glasses is input into the trained neural network model, which eliminates the redundant light spots and the glasses. Fig. 4 is a schematic diagram of an image before and after preprocessing. As shown in Fig. 4, the image before processing contains the glasses frame, the pupil, the corneal reflection spot and redundant light spots, while the processed image contains only the pupil and the corneal reflection spot. The glasses and redundant light spots in the eye image of a person with glasses can thus be eliminated by this processing.
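In code, the preprocessing step reduces to a single forward pass through the trained generator. This is a sketch; the function name and the tensor conventions (a float image in [0, 255] of shape (1, H, W)) are assumptions matching the generator sketch above.

```python
import torch

@torch.no_grad()
def preprocess_eye_image(generator, image):
    """Remove glasses and redundant light spots from one eye image.
    `image`: float tensor in [0, 255], shape (1, H, W)."""
    generator.eval()
    cleaned = generator(image.unsqueeze(0))  # add a batch dimension
    return cleaned.squeeze(0).clamp(0, 255)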
Through this step, the glasses and redundant light spots in the eye image of the person with glasses are eliminated.
Step S104, performing eye movement tracking according to the preprocessed image.
After the processed image is obtained, any eye-tracking algorithm can be used; these include appearance-based and feature-based algorithms. In one possible implementation a feature-based eye-tracking algorithm is adopted: the image taken through glasses is first restored to a normal image, then features are extracted from the eye image and the current gaze point is estimated through a mapping model. The mapping model adopts a polynomial mapping model; a typical second-order form is:
x = a_0 + a_1·x_eye + a_2·y_eye + a_3·x_eye·y_eye + a_4·x_eye^2 + a_5·y_eye^2
y = b_0 + b_1·x_eye + b_2·y_eye + b_3·x_eye·y_eye + b_4·x_eye^2 + b_5·y_eye^2
where (x, y) are the two-dimensional coordinates of the human eye's gaze point, (x_eye, y_eye) is the two-dimensional vector from the pupil center to the reflected light spot, and a_i and b_i are the coefficients of the mapping function. The unknown parameters of the mapping model are solved from data collected while the user fixates several calibration points; these parameters capture the user's individual differences. In subsequent use, the pupil center and reflected-spot vector are extracted from each newly captured image and the current gaze point is estimated from them, realizing eye movement tracking.
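For illustration, the calibration and gaze estimation can be sketched with NumPy least squares. The six-term second-order basis matches the form written above (a common choice for pupil/glint mappings); the function names are assumptions.

```python
import numpy as np

def poly_features(xe, ye):
    """Second-order polynomial terms of the pupil-center/glint vector."""
    xe, ye = np.asarray(xe, float), np.asarray(ye, float)
    return np.stack([np.ones_like(xe), xe, ye, xe * ye, xe**2, ye**2], axis=-1)

def calibrate(eye_xy, gaze_xy):
    """Fit the a_i / b_i coefficients from K >= 6 calibration fixations
    by linear least squares. eye_xy, gaze_xy: arrays of shape (K, 2)."""
    Phi = poly_features(eye_xy[:, 0], eye_xy[:, 1])          # (K, 6)
    a, *_ = np.linalg.lstsq(Phi, gaze_xy[:, 0], rcond=None)  # x coefficients
    b, *_ = np.linalg.lstsq(Phi, gaze_xy[:, 1], rcond=None)  # y coefficients
    return a, b

def estimate_gaze(a, b, xe, ye):
    """Map a newly extracted pupil-center/glint vector to a gaze point."""
    phi = poly_features(xe, ye)
    return float(phi @ a), float(phi @ b)
```

With at least six calibration points the linear system is determined; extra points simply make the least-squares fit more robust to measurement noise.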
Through steps S101 to S104, the eye images of glasses wearers can be preprocessed to eliminate the glasses and redundant light spots in the images, and other eye-tracking algorithms can be connected seamlessly, realizing high-precision eye movement tracking.
In order to facilitate understanding of the eye tracking method provided in the embodiments of the present application, the following description is provided with reference to fig. 2. As shown in fig. 2, the eye movement tracking method includes:
step S201, acquiring eye images of the glasses staff and eye images of the non-glasses staff as data sets; step S202, training a neural network model according to a data set; step S203, judging whether the image is an eye image of a person with glasses, executing step S204 when the image is the eye image of the person with glasses, performing processing of eliminating redundant light spots and eliminating glasses on the eye image of the person with glasses according to the neural network model, and executing step S205 when the image is not the eye image of the person with glasses, and extracting human eye characteristics; step S206, bringing human eye features into a mapping model; in step S207, the gaze point is estimated, and eye tracking is performed.
According to the eye movement tracking method provided by the embodiments of the disclosure, eye images of people with glasses and eye images of people without glasses are acquired as a data set, a neural network model is trained according to the data set, the trained model preprocesses the eye images of people with glasses by eliminating the glasses and redundant light spots, and an eye-tracking algorithm performs eye tracking on the processed images. Glasses wearers can thus enjoy the same high-precision eye movement tracking as ordinary people, and other eye-tracking algorithms can be docked seamlessly.
In a second aspect, embodiments of the present disclosure provide an eye-tracking device, as shown in fig. 5, comprising:
s501, an acquisition module, which is used for acquiring an eye image of a person with glasses and an eye image of a person without glasses as a data set;
s502, training a neural network model according to the data set;
s503, a preprocessing module, which is used for preprocessing the eye images of the eye person with the glasses according to the neural network model;
and S504, a tracking module is used for tracking eye movement according to the preprocessed image.
Further, the acquisition module further comprises a dividing unit for dividing the data set into a training set, a test set and a validation set.
Further, the training module is specifically configured to take the eye images of persons with glasses as the input of the neural network model, take the eye images of persons without glasses as the target of the neural network model, and train the neural network model by minimizing a loss function via gradient descent.
Further, the eye-tracking device further comprises a judging module for judging whether an eye image shows a person wearing glasses; when it does, the eye image is preprocessed according to the neural network model.
Further, the preprocessing module is specifically configured to process the eye images of persons with glasses according to the neural network model so as to eliminate redundant light spots and the glasses.
Further, the tracking module is specifically configured to perform eye tracking on the preprocessed image through an eye-tracking algorithm, where the eye-tracking algorithm includes an appearance-based eye-tracking algorithm or a feature-based eye-tracking algorithm.
It should be noted that when the eye-tracking apparatus provided in the above embodiment performs the eye-tracking method, the division into the above functional modules is only an example; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different modules to perform all or part of the functions described above. In addition, the eye-tracking apparatus and the eye-tracking method provided in the above embodiments belong to the same concept; the implementation process is detailed in the method embodiments and is not repeated here.
In a third aspect, an embodiment of the present disclosure further provides an electronic device corresponding to the eye tracking method provided in the foregoing embodiment, so as to perform the eye tracking method.
Referring to fig. 6, a schematic diagram of an electronic device according to some embodiments of the present application is shown. As shown in fig. 6, the electronic device includes: processor 601, memory 602, bus 603 and communication interface 604, processor 601, communication interface 604 and memory 602 being connected by bus 603; the memory 602 stores a computer program executable on the processor 601, and the processor 601 executes the eye tracking method provided in any of the foregoing embodiments of the present application when the computer program is executed.
The memory 602 may include a high-speed random access memory (RAM: Random Access Memory), and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between the system network element and at least one other network element is implemented via at least one communication interface 604 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, etc.
Bus 603 may be an ISA bus, a PCI bus, an EISA bus, or the like. Buses may be divided into address buses, data buses, control buses, etc. The memory 602 is configured to store a program; after receiving an execution instruction, the processor 601 executes the program, and the eye-tracking method disclosed in any of the foregoing embodiments of the present application may be applied to, or implemented by, the processor 601.
The processor 601 may be an integrated circuit chip with signal-processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 601 or by instructions in the form of software. The processor 601 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, capable of implementing or performing the methods, steps and logical blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly as hardware, executed by a decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well known in the art. The storage medium is located in the memory 602; the processor 601 reads the information in the memory 602 and completes the steps of the above method in combination with its hardware.
The electronic device provided by the embodiments of the present application shares the same inventive concept as the eye-tracking method provided by the embodiments of the present application, and has the same beneficial effects as the methods it adopts, runs or implements.
In a fourth aspect, an embodiment of the present application further provides a computer readable storage medium corresponding to the eye tracking method provided in the foregoing embodiment, referring to fig. 7, the computer readable storage medium is shown as an optical disc 700, on which a computer program (i.e. a program product) is stored, where the computer program, when executed by a processor, performs the eye tracking method provided in any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
The computer-readable storage medium provided by the above embodiments shares the same inventive concept as the eye-tracking method provided by the embodiments of the present application, and has the same beneficial effects as the application program stored on it.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. An eye movement tracking method, comprising:
acquiring eye images of the person with glasses and eye images of the person without glasses as a data set;
training a neural network model according to the data set, wherein the structure of the neural network adopts a deep residual network, pooling layers are replaced by strided convolutions or fractionally-strided convolutions to downsample or upsample the image, the network comprises a plurality of residual blocks, all non-residual convolutional layers are followed by a nonlinear layer with normalization and a ReLU activation function, the output layer uses a scaled tanh function to ensure that the output image pixels lie in [0, 255], the first and last layers use 9×9 convolution kernels and all other convolutional layers use 3×3 convolution kernels, and a U-net structure is used to fuse features of different scales; then taking the eye images of persons with glasses as the network input and the eye images of persons without glasses as the target, and training the neural network model by minimizing a loss function via gradient descent;
judging whether the eye image is an eye image of a person with glasses; when the eye image is an eye image of a person with glasses, preprocessing the eye image of the person with glasses according to the neural network model, wherein the preprocessing comprises eliminating redundant light spots and eliminating the glasses from the eye image according to the neural network model;
performing eye movement tracking according to the preprocessed image, wherein the eye tracking is performed on the preprocessed image through a feature-based eye-tracking algorithm: after the image with glasses is restored to a normal image, features are extracted from the eye image and the current gaze point is estimated through a mapping model.
2. The method of claim 1, wherein the acquiring the eyeglass-person eye image and the non-eyeglass-person eye image as the data sets comprises:
the data set is divided into a training set, a test set and a validation set.
3. The method of claim 1, wherein the eye tracking algorithm further comprises an appearance-based eye tracking algorithm.
4. An eye-tracking device, comprising:
the acquisition module is used for acquiring eye images of persons with glasses and eye images of persons without glasses as a data set;
the training module is used for training a neural network model according to the data set, wherein the structure of the neural network adopts a deep residual network, pooling layers are replaced by strided convolutions or fractionally-strided convolutions to downsample or upsample the image, the network comprises a plurality of residual blocks, all non-residual convolutional layers are followed by a nonlinear layer with normalization and a ReLU activation function, the output layer uses a scaled tanh function to ensure that the output image pixels lie in [0, 255], the first and last layers use 9×9 convolution kernels and all other convolutional layers use 3×3 convolution kernels, and a U-net structure is used to fuse features of different scales; then the eye images of persons with glasses are taken as the network input and the eye images of persons without glasses as the target, and the neural network model is trained by minimizing a loss function via gradient descent;
the preprocessing module is used for judging whether the eye image is an eye image of a person with glasses; when the eye image is an eye image of a person with glasses, preprocessing the eye image of the person with glasses according to the neural network model, wherein the preprocessing comprises eliminating redundant light spots and eliminating the glasses from the eye image according to the neural network model;
the tracking module is used for performing eye movement tracking according to the preprocessed image, wherein the eye tracking is performed on the preprocessed image through a feature-based eye-tracking algorithm: after the image with glasses is restored to a normal image, features are extracted from the eye image and the current gaze point is estimated through a mapping model.
5. An eye tracking device comprising a processor and a memory storing program instructions, wherein the processor is configured, when executing the program instructions, to perform the eye tracking method of any of claims 1 to 3.
6. A computer readable medium having stored thereon computer readable instructions executable by a processor to implement an eye tracking method as claimed in any one of claims 1 to 3.
CN202010700876.2A 2020-07-20 2020-07-20 Eye movement tracking method, device, equipment and storage medium Active CN111966219B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010700876.2A CN111966219B (en) 2020-07-20 2020-07-20 Eye movement tracking method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010700876.2A CN111966219B (en) 2020-07-20 2020-07-20 Eye movement tracking method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111966219A CN111966219A (en) 2020-11-20
CN111966219B true CN111966219B (en) 2024-04-16

Family

ID=73362084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010700876.2A Active CN111966219B (en) 2020-07-20 2020-07-20 Eye movement tracking method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111966219B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335695A (en) * 2015-08-19 2016-02-17 华南理工大学 Glasses detection based eye positioning method
CN109934062A (en) * 2017-12-18 2019-06-25 比亚迪股份有限公司 Training method, face identification method, device and the equipment of eyeglasses removal model

Also Published As

Publication number Publication date
CN111966219A (en) 2020-11-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant