CN109508679A - Method, apparatus, device and storage medium for three-dimensional eye gaze tracking - Google Patents

Publication number
CN109508679A
Authority
CN
China
Prior art keywords
eyeball
facial image
head pose
network
dimensional
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811375929.7A
Other languages
Chinese (zh)
Other versions
CN109508679B (en)
Inventor
张国生
李东
冯广
章云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Application filed by Guangdong University of Technology
Priority to CN201811375929.7A
Publication of CN109508679A
Application granted
Publication of CN109508679B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Abstract

The invention discloses a method, apparatus, device and computer-readable storage medium for three-dimensional eye gaze tracking. The method comprises: inputting a face image to be detected into a pre-built head pose detection network to obtain the head pose in the face image; inputting the face image into a pre-built eyeball motion detection network to obtain the eyeball motion in the face image; and inputting the head pose and the eyeball motion into a pre-built three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image. The method, apparatus, device and computer-readable storage medium provided by the invention can extract the three-dimensional gaze direction vector of the photographed subject's eyeball from a two-dimensional face image, and therefore have a wide range of application scenarios.

Description

Method, apparatus, device and storage medium for three-dimensional eye gaze tracking
Technical field
The present invention relates to the technical field of gaze tracking, and in particular to a method, apparatus, device and computer-readable storage medium for three-dimensional eye gaze tracking.
Background art
Research on gaze tracking algorithms has produced fairly mature results, which have been deployed successfully in many commercial applications such as VR/AR. Although traditional gaze tracking can achieve high accuracy, current gaze tracking algorithms are mostly based on traditional image processing, rely on expensive infrared equipment, and require a special head-mounted detection device to detect eyeball features. The detection accuracy of traditional image processing methods is affected by lighting changes, and the detection distance is severely constrained. An algorithm that achieves gaze tracking from RGB images captured by an ordinary camera is therefore urgently needed. In computer vision, deep convolutional neural networks have achieved great success in many areas, such as object detection and instance segmentation.
The prior art also includes gaze tracking techniques based on deep learning, with the following steps: obtain retinopathy image data; annotate the retinopathy image data to obtain labeled data; build an initial deep learning network; input the retinopathy image data into the initial deep learning network and output the corresponding prediction data; use a loss function to compare the labeled data with the prediction data for the retinopathy image data to obtain a comparison result; adjust the parameters of the initial deep learning network according to the comparison result until the comparison result reaches a preset threshold, yielding the final deep learning network model; and process the retinopathy image data under test with the deep learning network model to obtain the corresponding eyeball center coordinates and eyeball diameter.
Existing gaze tracking techniques therefore fall into two groups. The first is based on traditional image processing algorithms. Although such algorithms already have mature commercial applications, their detection accuracy is affected by lighting changes, they rely on expensive head-mounted infrared equipment that is inconvenient to wear, and their detection distance is constrained. The second is based on deep learning. However, the existing deep-learning-based gaze tracking algorithms can only detect the eyeball center and eyeball diameter, which carry only two-dimensional information about eyeball motion, so their application scenarios are limited.
In summary, how to obtain the three-dimensional gaze direction vector of the eyeball from a two-dimensional face image is the problem to be solved.
Summary of the invention
The object of the present invention is to provide a method, apparatus, device and computer-readable storage medium for three-dimensional eye gaze tracking, in order to solve the problem that deep-learning-based gaze tracking algorithms in the prior art can only detect two-dimensional information about the eyeball.
To solve the above technical problem, the present invention provides a method for three-dimensional eye gaze tracking, comprising: inputting a face image to be detected into a pre-built head pose detection network to obtain the head pose in the face image; inputting the face image into a pre-built eyeball motion detection network to obtain the eyeball motion in the face image; and inputting the head pose and the eyeball motion into a pre-built three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image.
Preferably, before inputting the face image to be detected into the pre-built head pose detection network to obtain the head pose in the face image, the method further comprises:
Acquiring a number of face images with three-dimensional labels for head pose and eyeball gaze, and building a face image data set, wherein the face images are RGB images;
Building an initial head pose detection network and an initial eyeball motion detection network;
Training the initial head pose detection network and the initial eyeball motion detection network on the face image data set, to obtain the trained head pose detection network and the trained eyeball motion detection network.
Preferably, acquiring the face images with three-dimensional labels for head pose and eyeball gaze and building the face image data set comprises:
Capturing face images of a data provider with each camera in a planar camera array, to obtain a first subset of face images;
The face images captured by the cameras in each row of the planar camera array represent different head poses of the data provider in the y direction;
The face images captured by the cameras in each column of the planar camera array represent different head poses of the data provider in the p direction;
Rotating the face images captured by the planar camera array clockwise and counterclockwise, to obtain a second subset of face images that represents different head poses of the data provider in the r direction;
Merging the first subset and the second subset of face images to obtain the face image data set.
Preferably, capturing face images of the data provider with each camera in the planar camera array comprises:
When each face image is captured, recording the dynamic point on the display screen on which the data provider's eyes are fixated, thereby determining the three-dimensional vector label of the data provider's gaze, and simultaneously recording the head pose in each face image.
Preferably, building the initial head pose detection network comprises:
Using the AlexNet model as the base structure, building the initial head pose detection network, whose network structure is:
C(3,1,6)-BN-PReLU-P(2,2)-C(3,1,16)-BN-PReLU-P(2,2)-C(3,1,24)-BN-PReLU-C(3,1,24)-PReLU-C(3,1,16)-BN-PReLU-P(2,2)-FC(256)-FC(128)-PReLU-FC(3);
where C(k, s, c) denotes a convolutional layer with kernel size k, stride s and c channels; P(k, s) denotes a max-pooling layer with kernel size k and stride s; BN denotes batch normalization; PReLU denotes the activation function; and FC(n) denotes a fully connected layer with n neurons.
Preferably, training the initial head pose detection network and the initial eyeball motion detection network on the face image data set comprises:
Training the initial head pose detection network and the initial eyeball motion detection network on the face image data set;
where the loss function Loss_1 = Loss_h + Loss_e is the sum of the loss function Loss_h of the initial head pose detection network and the loss function Loss_e of the initial eyeball motion detection network.
Preferably, before inputting the head pose and the eyeball motion into the pre-built three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image, the method further comprises:
Using the head pose detection network and the eyeball motion detection network to process the face images in the face image data set, obtaining the head pose and eyeball motion of each face image;
Training a pre-established initial three-dimensional gaze vector detection network on the head pose and eyeball motion of each face image, to obtain the trained three-dimensional gaze vector detection network;
where the current loss function Loss_2 = Loss_1 + Loss_g = Loss_h + Loss_e + Loss_g is the sum of the loss function Loss_1 and the loss function Loss_g of the initial three-dimensional gaze vector detection network.
The present invention also provides an apparatus for three-dimensional eye gaze tracking, comprising:
A head pose detection module, configured to input a face image to be detected into a pre-built head pose detection network to obtain the head pose in the face image;
An eyeball motion detection module, configured to input the face image into a pre-built eyeball motion detection network to obtain the eyeball motion in the face image;
A three-dimensional gaze detection module, configured to input the head pose and the eyeball motion into a pre-built three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image.
The present invention also provides a device for three-dimensional eye gaze tracking, comprising:
A memory for storing a computer program; and a processor which, when executing the computer program, implements the steps of the above method for three-dimensional eye gaze tracking.
The present invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the above method for three-dimensional eye gaze tracking.
In the method for three-dimensional eye gaze tracking provided by the present invention, the face image to be detected is input into the pre-built head pose detection network to obtain the head pose in the face image. The face image is input into the pre-built eyeball motion detection network to obtain the eyeball motion in the face image. The head pose and the eyeball motion are then input into the pre-built three-dimensional gaze vector detection network, so that the three-dimensional gaze direction vector of the eyeball in the face image is obtained through a gaze conversion network subject to geometric constraints. The gaze tracking method provided by the present invention is based on deep learning networks: it extracts the head pose and eyeball motion of the photographed subject from a two-dimensional face image and feeds them into the pre-trained three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the subject's eyeball. The method has broad applications: the three-dimensional gaze vector obtained from a face image can be used in safe-driving monitoring, human-computer interaction, psychological research and other fields. It solves the problem that prior-art gaze tracking based on deep neural networks can detect only the eyeball center position and eyeball diameter and therefore lacks broad application scenarios. Correspondingly, the apparatus, device and computer-readable storage medium provided by the present invention have the same beneficial effects.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a first specific embodiment of the method for three-dimensional eye gaze tracking provided by the present invention;
Fig. 2 is a flowchart of a second specific embodiment of the method for three-dimensional eye gaze tracking provided by the present invention;
Fig. 3 is a structural block diagram of an apparatus for three-dimensional eye gaze tracking provided by an embodiment of the present invention.
Detailed description of the embodiments
The core of the present invention is to provide a method, apparatus, device and computer-readable storage medium for three-dimensional eye gaze tracking that can obtain the three-dimensional gaze vector of the eyeball from a two-dimensional face image and that have broad application scenarios.
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, Fig. 1 is a flowchart of a first specific embodiment of the method for three-dimensional eye gaze tracking provided by the present invention; the specific steps are as follows:
Step S101: input the face image to be detected into the pre-built head pose detection network to obtain the head pose in the face image;
Before the face image to be detected is input into the pre-built head pose detection network, a number of face images with three-dimensional labels for head pose and eyeball gaze are first acquired to build a face image data set. An initial head pose detection network and an initial eyeball motion detection network are then built and trained on the face image data set, yielding the trained head pose detection network and the trained eyeball motion detection network.
To give the initial head pose detection network and the initial eyeball motion detection network better generalization ability, the face image data set acquired in this embodiment needs the following characteristics: a. it is widely distributed, covering as many head poses and eyeball motions as possible, and its images include different light intensities and even reflections from glasses; b. it carries three-dimensional labels for head pose and eyeball gaze; c. its face images are preferably ordinary RGB images that do not depend on a specific camera device.
To give the face image data set a wide distribution, this embodiment uses a 3 × 4 camera array, in which different camera viewpoints represent different head poses. The planar camera array can only represent differences in the two head pose directions (y, p), so to obtain differences in the r direction, the acquired face images are rotated clockwise and counterclockwise to represent sideways tilting of the head. For each head pose, the position of the camera in the array and the image rotation angle give the head pose label (y_GT, p_GT, r_GT).
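The labelling scheme above can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the per-camera yaw and pitch values, the roll angles, and the names `camera_pose_labels` and `augment_with_roll` are hypothetical and do not appear in the patent.

```python
from itertools import product

# Hypothetical yaw (one per column) and pitch (one per row) angles in degrees
# for a 3 x 4 camera array; the patent does not state the actual values.
YAWS = [-30.0, -10.0, 10.0, 30.0]
PITCHES = [-15.0, 0.0, 15.0]


def camera_pose_labels():
    """Return one (y_GT, p_GT, r_GT) label per camera, with roll fixed at 0."""
    return [(yaw, pitch, 0.0) for pitch, yaw in product(PITCHES, YAWS)]


def augment_with_roll(labels, roll_angles=(-10.0, 10.0)):
    """Simulate head roll: each in-plane image rotation yields a new label.

    The roll angles are assumed values; only the label bookkeeping is shown,
    the image rotation itself is omitted.
    """
    return [(y, p, r) for (y, p, _) in labels for r in roll_angles]


first_subset = camera_pose_labels()              # images straight from the array
second_subset = augment_with_roll(first_subset)  # rotated copies
dataset_labels = first_subset + second_subset
```

Each camera contributes one unrotated label plus one label per rotation angle, so the merged label list grows linearly with the number of roll angles chosen.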
To obtain a richer range of eyeball motions, while the face image data set is being acquired the data provider's gaze follows a dynamic point on the display screen. The dynamic point contains a random letter that the data provider must identify, which ensures that the data provider is actually fixating on the point and so guarantees the accuracy of the labels. Different point positions produce different eyeball motions, and for each position the gaze vector label (φ_GT, θ_GT) at that moment is recorded. The head pose and the corresponding three-dimensional gaze vector label of each face image are recorded while the face image data set is acquired.
In this embodiment, acquiring the face image data set requires only RGB face images and no other special equipment. Compared with the prior art, which relies on expensive head-mounted infrared equipment, this reduces application cost, and because the head is free and unconstrained it is also more convenient.
Before the initial head pose detection network, the initial eyeball motion detection network and the initial three-dimensional gaze vector detection network are built, the geometric analysis and coordinate systems used in this embodiment are described. The embodiment uses two coordinate systems, the head coordinate system (X_h, Y_h, Z_h) and the camera coordinate system (X_c, Y_c, Z_c), and g denotes the gaze vector. To simplify the representation of head pose, the embodiment uses three rotation angles (y, p, r), where y is the yaw angle (rotation about the Y_h axis), p is the pitch angle (rotation about the X_h axis), and r is the roll angle (rotation about the Z_h axis). Eyeball motion is represented in a two-dimensional spherical coordinate system (θ, φ), where θ and φ are the angles between the gaze vector and the head coordinate system in the horizontal and vertical directions, respectively.
In the head coordinate system, the gaze vector is expressed in terms of the eyeball motion as follows:
g_h = [-cos(φ) sin(θ), sin(φ), -cos(φ) cos(θ)]^T
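The formula for g_h can be evaluated directly. The sketch below assumes angles in radians; the function name `gaze_in_head_frame` is introduced here for illustration only.

```python
import math


def gaze_in_head_frame(theta, phi):
    """Return g_h for horizontal angle theta and vertical angle phi (radians).

    Implements g_h = [-cos(phi) sin(theta), sin(phi), -cos(phi) cos(theta)]^T.
    """
    return (
        -math.cos(phi) * math.sin(theta),
        math.sin(phi),
        -math.cos(phi) * math.cos(theta),
    )


# With both angles at zero the gaze points along the negative Z_h axis.
g_h = gaze_in_head_frame(0.0, 0.0)
```

Because cos²(φ) sin²(θ) + sin²(φ) + cos²(φ) cos²(θ) = 1, the result is a unit vector for any pair of angles, so (θ, φ) encodes direction only.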
The camera coordinate system (X_c, Y_c, Z_c) is defined with the camera center as origin, the camera depth direction as the Z_c axis, and the two directions of the plane perpendicular to the depth direction as the X_c and Y_c axes. Because the three-dimensional gaze vector finally output by the network is expressed in the camera coordinate system, the embodiment defines g_c as the three-dimensional gaze vector in the camera coordinate system. From geometry, g_c depends on g_h, and g_h is defined in the head coordinate system, which gives the global mapping relationship of this embodiment:
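The mapping formula itself does not survive in this text. One common way to relate the two frames, assumed here and not confirmed by the source, is to rotate g_h by the head rotation, g_c = R(y, p, r) g_h. The composition order R_y · R_x · R_z and the names `rotation_matrix` and `head_to_camera` are assumptions made for this sketch.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(y, p, r):
    """Head rotation from yaw y, pitch p, roll r (radians).

    ASSUMPTION: R = R_y(y) @ R_x(p) @ R_z(r); the patent text does not
    state the composition order or the exact mapping.
    """
    cy, sy = math.cos(y), math.sin(y)
    cp, sp = math.cos(p), math.sin(p)
    cr, sr = math.cos(r), math.sin(r)
    r_yaw = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    r_pitch = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    r_roll = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul(matmul(r_yaw, r_pitch), r_roll)


def head_to_camera(g_h, y, p, r):
    """Map a gaze vector from the head frame to the camera frame."""
    rot = rotation_matrix(y, p, r)
    return [sum(rot[i][k] * g_h[k] for k in range(3)) for i in range(3)]
```

Under this assumption a neutral head pose leaves the gaze vector unchanged, and any head rotation preserves its length.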
Step S102: input the face image into the pre-built eyeball motion detection network to obtain the eyeball motion in the face image;
Step S103: input the head pose and the eyeball motion into the pre-built three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image.
To reuse existing data sets, the network in this embodiment uses an end-to-end structure. The initial head pose detection network and the eyeball motion detection network are built first, and the detection results of the two sub-networks are then fed into a fully connected network to obtain the final three-dimensional gaze vector. The network thus has two branches: the upper branch detects head pose and the lower branch detects eyeball motion. Their outputs then pass through a gaze conversion layer subject to geometric constraints to give the three-dimensional gaze vector in the camera coordinate system.
Building on the above embodiment, and in order to reuse the collected face image data set, this embodiment uses an end-to-end structure. A head pose detection network and an eyeball motion detection network are built first, and the detection results of the two sub-networks are then fed into a fully connected network to obtain the final three-dimensional gaze vector. The network has two branches: the upper branch detects head pose and the lower branch detects eyeball motion, after which a gaze conversion layer subject to geometric constraints gives the three-dimensional gaze vector in the camera coordinate system. Referring to Fig. 2, Fig. 2 is a flowchart of a second specific embodiment of the method for three-dimensional eye gaze tracking provided by the present invention; the specific steps are as follows:
Step S201: acquire face images of a number of data providers with the planar camera array, and record the three-dimensional vector labels of head pose and eyeball motion in each face image, to obtain a first subset of face images;
Step S202: rotate the face images in the first subset clockwise and counterclockwise to obtain a second subset of face images;
Step S203: merge the first subset and the second subset of face images to obtain the face image data set;
Step S204: train the pre-built initial head pose detection network and initial eyeball motion detection network on the face image data set, to obtain the target head pose detection network and the target eyeball motion detection network;
The base topology of the initial head pose detection network follows the structure of AlexNet, with corresponding simplifications and modifications. The number of layers is unchanged, but the number of channels in each layer is reduced, local response normalization is replaced by batch normalization, and PReLU is used as the activation function. The network structure of the initial head pose detection network is as follows: C(3,1,6)-BN-PReLU-P(2,2)-C(3,1,16)-BN-PReLU-P(2,2)-C(3,1,24)-BN-PReLU-C(3,1,24)-PReLU-C(3,1,16)-BN-PReLU-P(2,2)-FC(256)-FC(128)-PReLU-FC(3)
where C(k, s, c) denotes a convolutional layer with kernel size k, stride s and c channels; P(k, s) denotes a max-pooling layer with kernel size k and stride s; BN denotes batch normalization; PReLU denotes the activation function; and FC(n) denotes a fully connected layer with n neurons.
The input to the eyeball motion detection network is the eye region cropped from the original face image, divided into a left-eye part and a right-eye part. Because the two sub-networks are fully symmetric, only one of them is described in detail here. The eye image patch is resized to a uniform 36×36 and then passed through a convolutional neural network and a fully connected network. The structure of the initial eyeball motion detection network is as follows: C(11,2,96)-BN-PReLU-P(2,2)-C(5,1,256)-BN-PReLU-P(2,2)-C(3,1,384)-BN-PReLU-P(2,2)-C(1,1,64)-BN-PReLU-P(2,2)-FC(128)-FC(2).
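The compact layer notation can be turned into a structured list before a network is built from it. The parser below is an illustrative sketch; `parse_spec` and `EYE_SPEC` are names introduced here, and padding, which the notation does not encode, is left out.

```python
import re

# A token is a layer name optionally followed by integer arguments, e.g.
# "C(11,2,96)", "P(2,2)", "FC(128)", "BN", "PReLU".
LAYER_RE = re.compile(r"([A-Za-z]+)(?:\(([\d,]+)\))?")


def parse_spec(spec):
    """Parse 'C(11,2,96)-BN-PReLU-...' into a list of (name, args) tuples."""
    layers = []
    for token in spec.replace(" ", "").split("-"):
        match = LAYER_RE.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised layer token: {token!r}")
        name, args = match.groups()
        parsed = tuple(int(a) for a in args.split(",")) if args else ()
        layers.append((name, parsed))
    return layers


# Structure of the eyeball motion detection network as given in the text.
EYE_SPEC = ("C(11,2,96)-BN-PReLU-P(2,2)-C(5,1,256)-BN-PReLU-P(2,2)-"
            "C(3,1,384)-BN-PReLU-P(2,2)-C(1,1,64)-BN-PReLU-P(2,2)-FC(128)-FC(2)")
eye_layers = parse_spec(EYE_SPEC)
```

Parsing first makes malformed tokens fail loudly with a `ValueError` instead of silently producing a different architecture.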
Step S205: use the target head pose detection network and the target eyeball motion detection network to process each face image in the face image data set, obtaining the head pose and eyeball motion of each face image;
Step S206: input the head pose and eyeball motion of each face image in the face image data set into the pre-built initial three-dimensional gaze vector detection network and train it, to obtain the target three-dimensional gaze vector detection network;
The initial three-dimensional gaze vector detection network takes as input the (y, p, r) produced by the target head pose detection network and the (θ, φ) produced by the target eyeball motion detection network. It is a two-layer fully connected network: the first layer has 128 neurons and the final layer has 3 neurons, corresponding to the three-dimensional gaze vector.
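The two-layer fully connected network can be sketched as a plain forward pass. This is an illustration with randomly initialised weights, not a trained model; the PReLU between the two layers, its slope of 0.25, the weight range, and all function names are assumptions, since the text gives only the layer sizes.

```python
import random


def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def prelu(x, slope=0.25):
    """PReLU with a fixed negative slope (assumed value)."""
    return [v if v > 0 else slope * v for v in x]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def gaze_net(head_pose, eye_motion, params):
    """Map (y, p, r) and (theta, phi) to a 3-component gaze vector."""
    (w1, b1), (w2, b2) = params
    x = list(head_pose) + list(eye_motion)  # 5 inputs
    hidden = prelu(linear(x, w1, b1))       # 128 units; activation is assumed
    return linear(hidden, w2, b2)           # 3 outputs


rng = random.Random(0)
params = (init_layer(5, 128, rng), init_layer(128, 3, rng))
g_c = gaze_net((0.1, -0.2, 0.0), (0.05, 0.3), params)
```

The five inputs are simply the concatenation of the two upstream outputs, which is why the first weight matrix has shape 128 × 5.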
When the initial head pose detection network and the initial eyeball motion detection network are trained, the loss function Loss_1 = Loss_h + Loss_e is the sum of the loss function Loss_h of the initial head pose detection network and the loss function Loss_e of the initial eyeball motion detection network.
When the pre-established initial three-dimensional gaze vector detection network is trained, the current loss function Loss_2 = Loss_1 + Loss_g = Loss_h + Loss_e + Loss_g is the sum of the loss function Loss_1 and the loss function Loss_g of the initial three-dimensional gaze vector detection network.
Loss_h = ||h - h_GT||_2, h = {y, p, r}
Loss_e = ||e - e_GT||_2, e = {φ, θ}
Loss_g = ||g_c - g_c_GT||_2, g_c = {x, y, z}
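The two training stages differ only in which terms are summed. The sketch below reads each term as a Euclidean (L2) norm of the difference between prediction and label; the source notation is ambiguous between the norm and its square, so that reading is an assumption, as are the function names.

```python
import math


def l2(pred, target):
    """Euclidean distance between a prediction and its label."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pred, target)))


def stage1_loss(h, h_gt, e, e_gt):
    """Loss_1 = Loss_h + Loss_e, used to train the two detection networks."""
    return l2(h, h_gt) + l2(e, e_gt)


def stage2_loss(h, h_gt, e, e_gt, g, g_gt):
    """Loss_2 = Loss_1 + Loss_g, used to train the gaze vector network."""
    return stage1_loss(h, h_gt, e, e_gt) + l2(g, g_gt)


# Example with hand-picked values: Loss_h = 3, Loss_e = 4, Loss_g = 1.
loss1 = stage1_loss((3.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 4.0), (0.0, 0.0))
loss2 = stage2_loss((3.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 4.0), (0.0, 0.0),
                    (0.0, 0.0, 1.0), (0.0, 0.0, 0.0))
```

Keeping Loss_h and Loss_e inside Loss_2 means the second stage continues to penalise errors in the intermediate head pose and eyeball motion estimates, not just the final gaze vector.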
Step S207: input the face image to be detected into the target head pose detection network to obtain the head pose in the face image to be detected;
Step S208: input the face image to be detected into the target eyeball motion detection network to obtain the eyeball motion in the face image to be detected;
Step S209: input the head pose and the eyeball motion of the face image to be detected into the target three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image to be detected.
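Steps S207 to S209 amount to composing three trained networks. The sketch below shows only that control flow; the three networks are passed in as callables, and the stub functions stand in for trained models and are purely hypothetical.

```python
def track_gaze(face_image, head_pose_net, eye_motion_net, gaze_vector_net):
    """Run the three-network inference pipeline on one face image."""
    head_pose = head_pose_net(face_image)          # S207: (y, p, r)
    eye_motion = eye_motion_net(face_image)        # S208: (theta, phi)
    return gaze_vector_net(head_pose, eye_motion)  # S209: gaze vector g_c


# Hypothetical stand-ins for the trained networks, used only to show the call.
def stub_head_pose_net(image):
    return (0.0, 0.0, 0.0)


def stub_eye_motion_net(image):
    return (0.0, 0.0)


def stub_gaze_vector_net(head_pose, eye_motion):
    return (0.0, 0.0, -1.0)


placeholder_image = object()  # any image representation the networks accept
gaze = track_gaze(placeholder_image, stub_head_pose_net,
                  stub_eye_motion_net, stub_gaze_vector_net)
```

Both detection networks receive the same face image, so the two branches are independent of each other and only meet at the gaze vector network.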
Eyeball recognition in the prior art annotates only the two-dimensional eyeball center and can ultimately obtain only two-dimensional information about the eyeball, so its use is limited. The method of this embodiment is likewise based on deep neural networks, but it processes not only the motion information of the eyeball but also predicts the head pose and the three-dimensional gaze vector of the eyeball, so it provides higher-level information and has greater application value. The network in this embodiment is trained end to end in stages. In the first training stage, existing head pose data sets and eyeball motion data sets can be fully exploited, which greatly enlarges the training data and gives the deep network in this embodiment better generalization ability.
Referring to Fig. 3, Fig. 3 is a structural block diagram of an apparatus for three-dimensional eye gaze tracking provided by an embodiment of the present invention; the apparatus may include:
A head pose detection module 100, configured to input a face image to be detected into a pre-built head pose detection network to obtain the head pose in the face image;
An eyeball motion detection module 200, configured to input the face image into a pre-built eyeball motion detection network to obtain the eyeball motion in the face image;
A three-dimensional gaze detection module 300, configured to input the head pose and the eyeball motion into a pre-built three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the face image.
The device for realizing three-dimensional eye gaze tracking of this embodiment is used to implement the aforementioned method for realizing three-dimensional eye gaze tracking, so its specific implementation can be found in the method embodiments above. For example, the head pose detection module 100, the eyeball motion detection module 200 and the three-dimensional gaze detection module 300 implement steps S101, S102 and S103 of the method, respectively. The specific implementation may therefore refer to the descriptions of the corresponding embodiments and is not repeated here.
A specific embodiment of the present invention further provides equipment for realizing three-dimensional eye gaze tracking, comprising: a memory for storing a computer program; and a processor which, when executing the computer program, implements the steps of the above method for realizing three-dimensional eye gaze tracking.
A specific embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the above method for realizing three-dimensional eye gaze tracking.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The method, device, equipment and computer-readable storage medium for realizing three-dimensional eye gaze tracking provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims (10)

1. A method for realizing three-dimensional eye gaze tracking, characterized by comprising:
inputting a facial image to be detected into a pre-constructed head pose detection network to obtain the head pose in the facial image;
inputting the facial image into a pre-constructed eyeball motion detection network to obtain the eyeball action in the facial image;
inputting the head pose and the eyeball action into a pre-constructed three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the facial image.
2. The method according to claim 1, characterized in that, before inputting the facial image to be detected into the pre-constructed head pose detection network to obtain the head pose in the facial image, the method comprises:
acquiring a plurality of facial images with three-dimensional labels of head pose and eyeball gaze to construct a face image data set, wherein the facial images are RGB images;
constructing an initial head pose detection network and an initial eyeball motion detection network;
training the initial head pose detection network and the initial eyeball motion detection network respectively with the face image data set to obtain the trained head pose detection network and the trained eyeball motion detection network.
3. The method according to claim 2, characterized in that acquiring a plurality of facial images with three-dimensional labels of head pose and eyeball gaze to construct the face image data set comprises:
acquiring facial images of a data set provider with each camera of an area-array camera array to obtain a first subset of facial images;
wherein the facial images acquired by each row of cameras in the area-array camera array represent different head poses of the data set provider in the y direction;
and the facial images acquired by each column of cameras in the area-array camera array represent different head poses of the data set provider in the p direction;
rotating the facial images acquired by the area-array camera array clockwise and counterclockwise, respectively, to obtain a second subset of facial images representing different head poses of the data set provider in the r direction;
merging the first subset of facial images and the second subset of facial images to obtain the face image data set.
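The construction of the two subsets can be sketched as an index over samples, without touching pixel data. This is a minimal illustration under stated assumptions: the array size (3 x 4), the rotation angles (10 and 20 degrees), the sign convention for clockwise versus counterclockwise, and the names `Sample` and `build_dataset_index` are all hypothetical and are not given in the claims.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Sample:
    row: int         # camera row index in the array
    col: int         # camera column index in the array
    roll_deg: float  # in-plane rotation applied after capture (0 = as captured)


def build_dataset_index(rows, cols, roll_angles_deg):
    """Return the merged index: captured images plus their rotated copies."""
    # First subset: one unrotated image per camera in the array.
    first = [Sample(r, c, 0.0) for r, c in product(range(rows), range(cols))]
    # Second subset: every captured image rotated in both directions
    # (the sign assigned to each direction is an assumed convention).
    second = [
        Sample(s.row, s.col, sign * angle)
        for s in first
        for angle in roll_angles_deg
        for sign in (1.0, -1.0)
    ]
    return first + second


index = build_dataset_index(rows=3, cols=4, roll_angles_deg=(10.0, 20.0))
```

With these assumed values the first subset has 12 entries and the second has 48, so rotation multiplies the amount of r-direction data without any additional capture.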
4. The method according to claim 3, characterized in that acquiring facial images of the data set provider with each camera of the area-array camera array comprises:
when each facial image is acquired, recording the moving point on the display screen at which the eyes of the data set provider are looking, thereby determining the three-dimensional vector label of the gaze of the data set provider, and simultaneously recording the head pose in each facial image.
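One plausible way to turn a recorded screen point into a three-dimensional gaze label is to take the unit vector from the eye position to that point. The claim does not state how the label is computed, so this is an assumption; it further assumes that both positions are known in the same three-dimensional coordinate system. The function name `gaze_label` and the coordinates are hypothetical.

```python
import math


def gaze_label(eye_pos, screen_point):
    """Unit vector pointing from the eye position to the fixated screen point."""
    direction = [s - e for s, e in zip(screen_point, eye_pos)]
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("eye position and screen point coincide")
    return tuple(c / norm for c in direction)


# Hypothetical coordinates, both expressed in the same frame and unit.
label = gaze_label(eye_pos=(0.0, 0.0, 600.0), screen_point=(300.0, 0.0, 200.0))
```

Normalizing the direction keeps the label independent of the distance between the data set provider and the display screen.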
5. The method according to claim 2, characterized in that constructing the initial head pose detection network comprises:
constructing the initial head pose detection network with the AlexNet model as the basic structure, the network structure of the initial head pose detection network being:
C(3,1,6)-BN-PReLU-P(2,2)-C(3,1,16)-BN-PReLU-P(2,2)-C(3,1,24)-BN-PReLU-C(3,1,24)-PReLU-C(3,1,16)-BN-PReLU-P(2,2)-FC(256)-FC(128)-PReLU-FC(3);
wherein C(k, s, c) denotes a convolutional layer with kernel size k, stride s and c channels; P(k, s) denotes a max pooling layer with kernel size k and stride s; BN denotes batch normalization; PReLU denotes the activation function; and FC(n) denotes a fully connected layer with n neurons.
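The layer notation can be checked mechanically with a small parser. The sketch below is illustrative only: the names `SPEC`, `ARITY` and `parse_spec` are hypothetical, and the string is transcribed with the token after the fourth convolution read as PReLU followed by C(3,1,16), which is an assumption about the source text.

```python
import re

# Number of arguments each layer type takes in the claim's notation.
ARITY = {"C": 3, "P": 2, "FC": 1, "BN": 0, "PReLU": 0}
TOKEN = re.compile(r"^([A-Za-z]+)(?:\((\d+(?:,\d+)*)\))?$")

SPEC = (
    "C(3,1,6)-BN-PReLU-P(2,2)-C(3,1,16)-BN-PReLU-P(2,2)-"
    "C(3,1,24)-BN-PReLU-C(3,1,24)-PReLU-C(3,1,16)-BN-PReLU-P(2,2)-"
    "FC(256)-FC(128)-PReLU-FC(3)"
)


def parse_spec(spec):
    """Split a layer string into (name, params) pairs and validate each token."""
    layers = []
    for token in spec.replace(" ", "").split("-"):
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        name, args = match.groups()
        params = tuple(int(a) for a in args.split(",")) if args else ()
        # Unknown layer names and wrong argument counts are both rejected.
        if ARITY.get(name) != len(params):
            raise ValueError(f"unexpected arguments for {token!r}")
        layers.append((name, params))
    return layers


layers = parse_spec(SPEC)
conv_layers = [p for n, p in layers if n == "C"]
```

Under this reading the network has five convolutional layers, three pooling layers and three fully connected layers. The final FC(3) has three neurons, which matches the three head pose components.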
6. The method according to claim 2, characterized in that training the initial head pose detection network and the initial eyeball motion detection network respectively with the face image data set comprises:
training the initial head pose detection network and the initial eyeball motion detection network with the face image data set;
wherein the loss function Loss_1 = Loss_h + Loss_e is the sum of the loss function Loss_h of the initial head pose detection network and the loss function Loss_e of the initial eyeball motion detection network.
7. The method according to claim 6, characterized in that, before inputting the head pose and the eyeball action into the pre-constructed three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the facial image, the method comprises:
detecting the facial images in the face image data set with the head pose detection network and the eyeball motion detection network, respectively, to obtain the head pose and the eyeball action of each facial image;
training a pre-established initial three-dimensional gaze vector detection network with the head pose and the eyeball action of each facial image to obtain the trained three-dimensional gaze vector detection network;
wherein the current loss function Loss_2 = Loss_1 + Loss_g = Loss_h + Loss_e + Loss_g is the sum of the loss function Loss_1 and the loss function Loss_g of the initial three-dimensional gaze vector detection network.
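The relation between the two training stages can be written out directly. The sketch below only combines per-term loss values that are assumed to be already computed; the function names and the numbers are hypothetical.

```python
def stage1_loss(loss_h, loss_e):
    """First stage: head pose and eyeball action networks, Loss_1 = Loss_h + Loss_e."""
    return loss_h + loss_e


def stage2_loss(loss_h, loss_e, loss_g):
    """Second stage: adds the gaze vector term, Loss_2 = Loss_1 + Loss_g."""
    return stage1_loss(loss_h, loss_e) + loss_g


# Made-up per-term values for a single sample.
loss_1 = stage1_loss(5.0, 0.5)
loss_2 = stage2_loss(5.0, 0.5, 0.25)
```

Because Loss_2 contains Loss_1, the second stage keeps penalizing errors in head pose and eyeball action while the gaze vector network is trained.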
8. A device for realizing three-dimensional eye gaze tracking, characterized by comprising:
a head pose detection module, configured to input a facial image to be detected into a pre-constructed head pose detection network to obtain the head pose in the facial image;
an eyeball motion detection module, configured to input the facial image into a pre-constructed eyeball motion detection network to obtain the eyeball action in the facial image;
a three-dimensional gaze detection module, configured to input the head pose and the eyeball action into a pre-constructed three-dimensional gaze vector detection network to obtain the three-dimensional gaze direction vector of the eyeball in the facial image.
9. Equipment for realizing three-dimensional eye gaze tracking, characterized by comprising:
a memory for storing a computer program;
a processor which, when executing the computer program, implements the steps of the method for realizing three-dimensional eye gaze tracking according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the method for realizing three-dimensional eye gaze tracking according to any one of claims 1 to 7.
CN201811375929.7A 2018-11-19 2018-11-19 Method, device and equipment for realizing three-dimensional eye gaze tracking and storage medium Active CN109508679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811375929.7A CN109508679B (en) 2018-11-19 2018-11-19 Method, device and equipment for realizing three-dimensional eye gaze tracking and storage medium

Publications (2)

Publication Number Publication Date
CN109508679A true CN109508679A (en) 2019-03-22
CN109508679B CN109508679B (en) 2023-02-10

Family

ID=65749029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811375929.7A Active CN109508679B (en) 2018-11-19 2018-11-19 Method, device and equipment for realizing three-dimensional eye gaze tracking and storage medium

Country Status (1)

Country Link
CN (1) CN109508679B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100026803A1 (en) * 2006-03-27 2010-02-04 Fujifilm Corporaiton Image recording apparatus, image recording method and image recording program
CN104391574A (en) * 2014-11-14 2015-03-04 京东方科技集团股份有限公司 Sight processing method, sight processing system, terminal equipment and wearable equipment
US20150109204A1 (en) * 2012-11-13 2015-04-23 Huawei Technologies Co., Ltd. Human-machine interaction method and apparatus
CN105740846A (en) * 2016-03-02 2016-07-06 河海大学常州校区 Horizontal visual angle estimation and calibration method based on depth camera
CN106598221A (en) * 2016-11-17 2017-04-26 电子科技大学 Eye key point detection-based 3D sight line direction estimation method
JP2017213191A (en) * 2016-05-31 2017-12-07 富士通株式会社 Sight line detection device, sight line detection method and sight line detection program
CN107818310A (en) * 2017-11-03 2018-03-20 电子科技大学 A kind of driver attention's detection method based on sight
US20180140187A1 (en) * 2015-07-17 2018-05-24 Sony Corporation Eyeball observation device, eyewear terminal, line-of-sight detection method, and program
CN108171218A (en) * 2018-01-29 2018-06-15 深圳市唯特视科技有限公司 A kind of gaze estimation method for watching network attentively based on appearance of depth
CN108229284A (en) * 2017-05-26 2018-06-29 北京市商汤科技开发有限公司 Eye-controlling focus and training method and device, system, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
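周小龙, 汤帆扬, 管秋, 华敏: "A survey of gaze tracking techniques based on 3D eye models" (基于3D人眼模型的视线跟踪技术综述), Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》) *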

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020216054A1 (en) * 2019-04-24 2020-10-29 腾讯科技(深圳)有限公司 Sight line tracking model training method, and sight line tracking method and device
US11797084B2 (en) 2019-04-24 2023-10-24 Tencent Technology (Shenzhen) Company Limited Method and apparatus for training gaze tracking model, and method and apparatus for gaze tracking
CN110191234A (en) * 2019-06-21 2019-08-30 中山大学 It is a kind of based on the intelligent terminal unlocking method for watching point analysis attentively
CN110555426A (en) * 2019-09-11 2019-12-10 北京儒博科技有限公司 Sight line detection method, device, equipment and storage medium
CN110909611A (en) * 2019-10-29 2020-03-24 深圳云天励飞技术有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment
CN110909611B (en) * 2019-10-29 2021-03-05 深圳云天励飞技术有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment
WO2021135827A1 (en) * 2019-12-30 2021-07-08 上海商汤临港智能科技有限公司 Line-of-sight direction determination method and apparatus, electronic device, and storage medium
CN111847147A (en) * 2020-06-18 2020-10-30 闽江学院 Non-contact eye-movement type elevator floor input method and device
CN111847147B (en) * 2020-06-18 2023-04-18 闽江学院 Non-contact eye-movement type elevator floor input method and device
CN112114671A (en) * 2020-09-22 2020-12-22 上海汽车集团股份有限公司 Human-vehicle interaction method and device based on human eye sight and storage medium

Also Published As

Publication number Publication date
CN109508679B (en) 2023-02-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant