WO2023011339A1 - Gaze direction tracking method and device - Google Patents

Gaze direction tracking method and device

Info

Publication number
WO2023011339A1
Authority
WO
WIPO (PCT)
Prior art keywords
coordinates
sight
positional relationship
light source
line
Prior art date
Application number
PCT/CN2022/108896
Other languages
English (en)
French (fr)
Inventor
凌瑞
张普
王进
Original Assignee
虹软科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 虹软科技股份有限公司
Priority to KR1020247007410A (published as KR20240074755A)
Priority to JP2024531562A (published as JP2024529785A)
Priority to EP22852052.4A (published as EP4383193A1)
Publication of WO2023011339A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 - Eye tracking input arrangements
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30004 - Biomedical image processing
    • G06T2207/30041 - Eye; Retina; Ophthalmic
    • G06T2207/30196 - Human being; Person
    • G06T2207/30201 - Face

Definitions

  • the present invention relates to the technical field of image processing, in particular, to a method and device for tracking a gaze direction.
  • the real-time detection of the driver's sight direction and the location of the sight can provide important information input for assisted driving and various product applications.
  • the angle and direction of the line of sight can be used to judge the current driver's observation area, detect the driver's distracted and dangerous driving behavior, etc.
  • the location of the line of sight can be used to judge the current user's attention target, thereby adjusting the augmented reality head-up display system display position, etc.
  • the gaze direction tracking method in the prior art estimates the gaze direction and viewpoint by analyzing the characteristics of the human eye through the acquired image including the human eye.
  • the above gaze tracking methods are usually divided into appearance-based methods and corneal reflection-based methods.
  • the appearance-based method generally relies on appearance features contained in the human eye image or face image, including the eyelid position, pupil position, iris position, inner/outer eye corners, face orientation and other features, to estimate the line-of-sight direction and viewpoint.
  • the appearance-based method does not use a geometric mapping relationship; instead, it treats the human eye image as a point in a high-dimensional feature space and learns the mapping from points in that feature space to screen coordinates.
  • the corneal reflection-based method, in addition to relying on some of the image features used by the appearance-based method, also relies on the Purkinje spot formed by the reflection of the light source on the cornea, and establishes the mapping from the pupil center of a geometric model to the gaze calibration points through eye movement characteristics.
  • the corneal reflection method achieves higher precision than the appearance-based method, so almost all mature commercial products are based on it; however, it requires complex equipment and a precise calibration process, and is not universally applicable. For the problems of high hardware cost, low algorithm optimization efficiency and low line-of-sight estimation accuracy of the corneal reflection method, no effective solution has yet been proposed.
  • Embodiments of the present invention provide a gaze direction tracking method and device to at least solve the technical problems of high gaze tracking hardware cost, low algorithm optimization efficiency, and low estimation accuracy.
  • a method for tracking the gaze direction is provided, including: using multiple light sources to provide corneal reflections to the user's eyes, and using multiple cameras to capture images containing the user's face;
  • combining the human eye feature set obtained from the image containing the face with the hardware calibration parameters to determine the light source reflection point coordinates and the pupil center coordinates in the world coordinate system;
  • determining the line-of-sight optical axis according to the light source reflection point coordinates and the pupil center coordinates, and reconstructing the line-of-sight visual axis from the optical axis through the compensation angle; and determining the viewpoint on the target object according to the visual axis and the position of the target object in the world coordinate system.
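As an illustration of the last two steps, the sketch below (Python/NumPy) rotates a given line-of-sight optical axis by horizontal and vertical compensation angles to approximate the visual axis, then intersects the resulting ray with the target-object plane to obtain the viewpoint. The angle parametrization and all numeric values are assumptions for demonstration only, not values prescribed by the patent.

    import numpy as np

    def visual_axis_from_optical_axis(optical_axis, alpha_deg, beta_deg):
        # Rotate the unit optical-axis vector by a horizontal (alpha) and a
        # vertical (beta) compensation angle; one common way to model the
        # offset between the optical axis and the visual axis.
        a, b = np.radians(alpha_deg), np.radians(beta_deg)
        Ry = np.array([[np.cos(a), 0, np.sin(a)],
                       [0,         1, 0        ],
                       [-np.sin(a), 0, np.cos(a)]])
        Rx = np.array([[1, 0,          0         ],
                       [0, np.cos(b), -np.sin(b)],
                       [0, np.sin(b),  np.cos(b)]])
        v = Rx @ Ry @ optical_axis
        return v / np.linalg.norm(v)

    def viewpoint_on_plane(ray_origin, ray_dir, plane_point, plane_normal):
        # Intersect the visual-axis ray with the target-object plane.
        denom = np.dot(plane_normal, ray_dir)
        if abs(denom) < 1e-9:
            return None  # ray is parallel to the plane
        t = np.dot(plane_normal, plane_point - ray_origin) / denom
        return ray_origin + t * ray_dir if t > 0 else None

    # Illustrative values: cornea center at the origin, screen 0.6 m in front.
    optical_axis = np.array([0.05, 0.02, -1.0]) / np.linalg.norm([0.05, 0.02, -1.0])
    visual_axis = visual_axis_from_optical_axis(optical_axis, alpha_deg=5.0, beta_deg=1.5)
    viewpoint = viewpoint_on_plane(np.zeros(3), visual_axis,
                                   plane_point=np.array([0.0, 0.0, -0.6]),
                                   plane_normal=np.array([0.0, 0.0, 1.0]))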
  • the aforementioned human eye feature set includes: light source imaging points and pupil imaging points.
  • any one of the following world coordinate systems may be selected: a light source coordinate system, a camera coordinate system, and a target object coordinate system.
  • the above-mentioned method also performs simulation verification, analysis and optimization through preset eyeball parameters, the above-mentioned compensation angle, the above-mentioned hardware calibration parameters and a preset viewpoint.
  • the plurality of cameras, the plurality of light sources, and the target object face the user, and the fields of view of the plurality of cameras do not include the plurality of light sources and the target object.
  • the above-mentioned hardware calibration parameters include: the internal and external parameters of the multiple camera coordinate systems and the geometric positional relationship, wherein the geometric positional relationship includes the following: a first positional relationship between the multiple light source coordinate systems and the multiple camera coordinate systems, a second positional relationship between the multiple light source coordinate systems and the target object coordinate system, and a third positional relationship between the multiple camera coordinate systems and the target object coordinate system.
  • the above-mentioned geometric positional relationship is obtained by combining internal and external parameters of the above-mentioned multiple cameras, and using at least one of a plane mirror and an auxiliary camera to transmit calibration information for calibration.
  • the coordinates of the light source reflection point and the pupil center coordinates in the world coordinate system are determined as follows: if the world coordinate system is the light source coordinate system, the face feature set determines the light source reflection point coordinates and the pupil center coordinates in the light source coordinate system through the internal and external parameters of the multiple cameras, combined with the first positional relationship, or combined with the second and third positional relationships; if the world coordinate system is the camera coordinate system, the face feature set determines the light source reflection point coordinates and the pupil center coordinates in the multiple camera coordinate systems through the internal and external parameters of the multiple cameras; if the world coordinate system is the target object coordinate system, the face feature set determines the light source reflection point coordinates and the pupil center coordinates in the target object coordinate system through the internal and external parameters of the multiple cameras, combined with the third positional relationship, or combined with the first and second positional relationships.
  • determining the line-of-sight optical axis according to the light source reflection point coordinates and the pupil center coordinates includes: determining the coordinates of the corneal curvature center according to the light source reflection point coordinates and the corneal radius of curvature; and determining the line-of-sight optical axis as the line connecting the pupil center and the corneal curvature center.
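A minimal NumPy sketch of this step, under the spherical-cornea assumption: the surface normal at the light source reflection point (glint) bisects the directions toward the light source and toward the camera, so the curvature center lies one corneal radius behind the glint along that normal, and the optical axis is the line from the curvature center through the pupil center. Names and values are illustrative.

    import numpy as np

    def cornea_center_from_glint(glint, light_pos, camera_pos, cornea_radius):
        # Spherical reflection: the normal at the glint bisects the incident ray
        # (light source -> glint) and the reflected ray (glint -> camera).
        to_light = light_pos - glint
        to_camera = camera_pos - glint
        n = to_light / np.linalg.norm(to_light) + to_camera / np.linalg.norm(to_camera)
        n /= np.linalg.norm(n)
        # The curvature center sits one corneal radius behind the glint.
        return glint - cornea_radius * n

    def optical_axis(pupil_center, cornea_center):
        # Line-of-sight optical axis: from the corneal curvature center through
        # the pupil center, as a unit vector.
        v = pupil_center - cornea_center
        return v / np.linalg.norm(v)

    # With several light sources, one curvature-center estimate per glint can be
    # computed this way and averaged (or solved jointly) for robustness.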
  • the geometric positional relationship is obtained by combining the internal and external parameters of the multiple cameras and using a plane mirror and an auxiliary camera to transmit calibration information, including: the first auxiliary camera acquires, for multiple different poses of the plane mirror, first marker images of the target object containing the first marker reflected by the plane mirror;
  • combining the internal and external parameters of the multiple cameras, the third positional relationship is calculated from the multiple first marker images based on orthogonal constraints; the second auxiliary camera acquires a second marker image containing the multiple light sources, and the first positional relationship is obtained from the second marker image in combination with the internal and external parameters of the multiple cameras, wherein the second auxiliary camera is a stereo vision system; the second positional relationship is then determined from the first positional relationship and the third positional relationship.
  • the geometric positional relationship is obtained by combining the internal and external parameters of the multiple cameras and using auxiliary cameras to transmit calibration information, including: the third auxiliary camera acquires a third marker image containing the multiple light sources, and, combining the internal and external parameters of the multiple cameras,
  • the first positional relationship is obtained from the third marker image, wherein the third auxiliary camera is a stereo vision system;
  • the fourth auxiliary camera is set so that its field of view includes the multiple cameras and the third auxiliary camera;
  • a calibration plate is set next to the third auxiliary camera, the multiple cameras capture a fourth marker image containing the calibration plate area, and at the same time the third auxiliary camera captures a fifth marker image of the target object containing the fifth marker;
  • the positional relationship between the fourth auxiliary camera and the multiple cameras is used as a pose conversion bridge; combined with the internal and external parameters of the third auxiliary camera and the multiple cameras, the third positional relationship is determined, and the second positional relationship is then determined from the first and third positional relationships.
  • the geometric positional relationship is obtained by combining the internal and external parameters of the multiple cameras and using a plane mirror to transmit calibration information, including: using the plane mirror with no fewer than 4 marking points as an aid, the multiple cameras acquire reflection images containing the multiple light sources, the target object and the marking points; according to the reflection images, the coordinates of each marking point, the coordinates of the mirrored light sources and the coordinates of the mirrored target object in the multiple camera coordinate systems are calculated respectively; the mirror plane is reconstructed from the coordinates of all the marking points, and the first positional relationship and the third positional relationship are confirmed according to the principle of specular reflection, from which the second positional relationship is determined.
  • the method further includes: acquiring a set of sample images when the user gazes at each preset gaze point; determining first compensation angle samples according to the sample features extracted from each set of sample images; and traversing all the first compensation angle samples, obtaining the compensation angle through screening and purification.
  • determining the first compensation angle samples according to the sample features extracted from each set of sample images includes: extracting sample features for each set of sample images and reconstructing the first line-of-sight optical axis from the sample features; inversely deducing the first line-of-sight visual axis from the real coordinates of the preset gaze point; and obtaining the first compensation angle sample from the first line-of-sight optical axis and the first line-of-sight visual axis.
  • traversing all the first compensation angle samples and obtaining the compensation angle through screening and purification includes: finding the center point of all the first compensation angle samples and filtering out samples that are not within the first threshold range; continuing to iterate the screening and purification over all remaining samples until the difference between the current center point and the previous center point is lower than the second threshold; and obtaining the compensation angle from all the purified samples.
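A small Python/NumPy sketch of the screening and purification loop described above: the sample center is recomputed after removing samples outside the first threshold, and iteration stops once the center moves by less than the second threshold. The threshold values are placeholders.

    import numpy as np

    def purify_compensation_samples(samples, dist_thresh=1.0, move_thresh=1e-3, max_iter=100):
        # samples: (N, 2) array of (horizontal, vertical) compensation-angle samples.
        samples = np.asarray(samples, dtype=float)
        center = samples.mean(axis=0)
        for _ in range(max_iter):
            # First threshold: drop samples too far from the current center.
            keep = np.linalg.norm(samples - center, axis=1) <= dist_thresh
            samples = samples[keep]
            new_center = samples.mean(axis=0)
            # Second threshold: stop once the center no longer moves.
            if np.linalg.norm(new_center - center) < move_thresh:
                center = new_center
                break
            center = new_center
        # The purified samples yield the final compensation angle (their center).
        return center, samples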
  • the above method further includes: determining the deviation between the predicted line of sight observation point and the real line of sight observation point through the dynamic compensation model for the collected data, and obtaining the above compensation angle according to the above deviation .
  • initializing the dynamic compensation model includes: obtaining a set of initial sample images when the user gazes at each preset initial point; extracting initial sample features for each set of initial sample images; and obtaining, through few-shot learning initialization based on the initial sample features, the dynamic compensation model fitted to the current user.
  • training the dynamic compensation model includes: collecting multiple sets of sample data when multiple users respectively gaze at preset calibration points; cleaning the multiple sets of sample data and extracting training sample features from the cleaned sets;
  • the initial dynamic compensation model is trained with few-shot learning according to the training sample features, obtaining the trained dynamic compensation model.
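The patent does not fix a concrete model for the dynamic compensation. Purely as an illustration, the sketch below uses a ridge-regression mapping from per-sample features to the deviation between predicted and real observation points, fitted from a handful of calibration samples in the spirit of few-shot initialization; all feature dimensions and data are hypothetical.

    import numpy as np

    class DynamicCompensationModel:
        # Illustrative stand-in: predicts the (dx, dy) deviation of the predicted
        # observation point from per-sample features (e.g. head pose, pupil position).
        def __init__(self, l2=1e-2):
            self.l2 = l2
            self.W = None

        def fit(self, features, deviations):
            X = np.hstack([features, np.ones((len(features), 1))])  # bias column
            A = X.T @ X + self.l2 * np.eye(X.shape[1])
            self.W = np.linalg.solve(A, X.T @ deviations)
            return self

        def predict(self, features):
            X = np.hstack([features, np.ones((len(features), 1))])
            return X @ self.W

    # Few-shot initialization from samples collected while the user fixates a few
    # preset initial points (hypothetical shapes and values).
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(9, 6))          # 9 samples, 6 features each
    devs = rng.normal(size=(9, 2)) * 0.5     # observed viewpoint deviations
    model = DynamicCompensationModel().fit(feats, devs)
    predicted_deviation = model.predict(feats[:1])  # used to refine the compensation angle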
  • the simulation verification, analysis and optimization carried out through the preset eyeball parameters, the compensation angle, the hardware calibration parameters and the preset viewpoint includes: for the preset viewpoint, simulating and calculating the reconstructed light source imaging point and the reconstructed pupil imaging point according to the eyeball parameters, the compensation angle and the hardware calibration parameters; determining the predicted viewpoint from the reconstructed light source imaging point and the reconstructed pupil imaging point according to the gaze direction tracking method; and statistically analyzing the comparison between the preset viewpoint and the predicted viewpoint, implementing verification and optimization based on the analysis results.
  • simulating and calculating the reconstructed light source imaging point and the reconstructed pupil imaging point according to the preset eyeball parameters, the compensation angle and the hardware calibration parameters includes: determining the light source-cornea-camera angle according to the corneal center in the preset eyeball parameters and the hardware calibration parameters; determining the reconstructed light source reflection point coordinates based on this angle and the corneal radius of curvature in the preset eyeball parameters, combined with the spherical reflection principle, and calculating the reconstructed light source imaging point from the reconstructed reflection point coordinates combined with the hardware calibration parameters; determining the first visual axis according to the coordinates of the preset viewpoint and the corneal center in the preset eyeball parameters, deducing the first optical axis from the first visual axis and the compensation angle, determining the reconstructed pupil center coordinates from the first optical axis and the pupil-cornea center distance in the preset eyeball parameters, and calculating the reconstructed pupil imaging point from the reconstructed pupil center coordinates combined with the hardware calibration parameters.
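One way to realize the spherical-reflection step of this simulation is sketched below (illustrative, not the patent's prescribed algorithm): a fixed-point iteration finds the point on the corneal sphere whose surface normal bisects the directions to the light source and to the camera; projecting that point through the calibrated camera model then yields the reconstructed light source imaging point. Geometry values are placeholders in metres.

    import numpy as np

    def simulate_glint(light_pos, camera_pos, cornea_center, cornea_radius, iters=50):
        # Find the reflection point q on the corneal sphere such that the normal
        # at q bisects the directions from q to the light source and to the camera
        # (specular reflection on a sphere). A fixed-point iteration behaves well
        # when light and camera are far from the eye relative to the corneal radius.
        mid = 0.5 * (light_pos + camera_pos)
        n = (mid - cornea_center) / np.linalg.norm(mid - cornea_center)
        q = cornea_center + cornea_radius * n
        for _ in range(iters):
            to_light = light_pos - q
            to_camera = camera_pos - q
            n = to_light / np.linalg.norm(to_light) + to_camera / np.linalg.norm(to_camera)
            n /= np.linalg.norm(n)
            q = cornea_center + cornea_radius * n
        return q

    glint = simulate_glint(light_pos=np.array([0.10, 0.00, 0.00]),
                           camera_pos=np.array([-0.10, 0.00, 0.00]),
                           cornea_center=np.array([0.00, 0.02, 0.60]),
                           cornea_radius=0.0078)
    # Projecting `glint` with the camera intrinsics/extrinsics gives the
    # reconstructed light source imaging point used by the simulation.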
  • the aforementioned preset viewpoints can be preset in advance or randomly generated at multiple different positions to simulate tracking from multiple sight angles.
  • a gaze direction tracking device is provided, including: an acquisition module, configured to use multiple light sources to provide corneal reflections to the user's eyes and use multiple cameras to capture images containing the user's face; a key point determination module, configured to determine the light source reflection point coordinates and the pupil center coordinates in the world coordinate system by combining the human eye feature set obtained from the image containing the face with the hardware calibration parameters; a line-of-sight reconstruction module, configured to determine the line-of-sight optical axis from the light source reflection point coordinates and the pupil center coordinates and to reconstruct the line-of-sight visual axis from the optical axis through the compensation angle; and a viewpoint determination module, configured to determine the viewpoint on the target object according to the line-of-sight visual axis and the position of the target object in the world coordinate system.
  • the above device further includes a simulation module for performing simulation verification, analysis and optimization through preset eyeball parameters, the aforementioned compensation angle, the aforementioned hardware calibration parameters and preset viewpoints.
  • the above-mentioned hardware calibration parameters include: the internal and external parameters of the multiple camera coordinate systems and the geometric positional relationship, wherein the geometric positional relationship includes the following: a first positional relationship between the multiple light source coordinate systems and the multiple camera coordinate systems, a second positional relationship between the multiple light source coordinate systems and the target object coordinate system, and a third positional relationship between the multiple camera coordinate systems and the target object coordinate system.
  • the above-mentioned key point determination module includes: a calibration unit, configured to obtain the above-mentioned geometric positional relationship by combining internal and external parameters of the above-mentioned multiple cameras, and using at least one of a plane mirror and an auxiliary camera to transmit calibration information for calibration.
  • the key point determination module includes: a first determination unit, configured to, if the world coordinate system is the light source coordinate system, determine the light source reflection point coordinates and the pupil center coordinates in the light source coordinate system from the face feature set through the internal and external parameters of the multiple cameras, combined with the first positional relationship, or combined with the second and third positional relationships; a second determination unit, configured to, if the world coordinate system is the camera coordinate system, determine the light source reflection point coordinates and the pupil center coordinates in the multiple camera coordinate systems from the face feature set through the internal and external parameters of the multiple cameras; and a third determination unit, configured to, if the world coordinate system is the target object coordinate system, determine the light source reflection point coordinates and the pupil center coordinates in the target object coordinate system from the face feature set through the internal and external parameters of the multiple cameras, combined with the third positional relationship, or combined with the first and second positional relationships.
  • the sight line reconstruction module includes: a first reconstruction unit, configured to determine the coordinates of the center of corneal curvature according to the coordinates of the reflection point of the light source and the radius of curvature of the cornea; a second reconstruction unit, configured to The coordinates of the center of curvature determine the optical axis of the line of sight connecting the center of the pupil and the center of curvature of the cornea.
  • the above-mentioned calibration unit includes a first calibration unit, which is used to obtain the above-mentioned geometric positional relationship by combining the internal and external parameters of the above-mentioned multiple cameras, and using a plane mirror and an auxiliary camera to transmit calibration information for calibration
  • the first calibration unit includes: a first calibration subunit, used for the first auxiliary camera to acquire, for multiple different poses of the plane mirror, first marker images of the target object containing the first marker reflected by the plane mirror;
  • a second calibration subunit, used to calculate the third positional relationship from the multiple first marker images based on orthogonal constraints, in combination with the internal and external parameters of the multiple cameras;
  • a third calibration subunit, used for the second auxiliary camera to acquire a second marker image containing the multiple light sources and to obtain the first positional relationship in combination with the internal and external parameters of the multiple cameras, wherein the second auxiliary camera is a stereo vision system; and a fourth calibration subunit, used to determine the second positional relationship according to the first positional relationship and the third positional relationship.
  • the calibration unit includes a second calibration unit, used to obtain the geometric positional relationship by combining the internal and external parameters of the multiple cameras and using auxiliary cameras to transmit calibration information, wherein the second calibration unit includes: a fifth calibration subunit, used for the third auxiliary camera to acquire a third marker image containing the multiple light sources and to obtain the first positional relationship from the third marker image in combination with the internal and external parameters of the multiple cameras, wherein the third auxiliary camera is a stereo vision system; and a sixth calibration subunit, used to set the fourth auxiliary camera so that its field of view includes the multiple cameras and the third auxiliary camera, with a calibration plate set beside the third auxiliary camera, the multiple cameras acquiring a fourth marker image containing the calibration plate area while the third auxiliary camera acquires a fifth marker image of the target object containing the fifth marker.
  • the calibration unit includes a third calibration unit, used to obtain the geometric positional relationship by combining the internal and external parameters of the multiple cameras and using a plane mirror to transmit calibration information, wherein the third calibration unit includes: a ninth calibration subunit, configured to use the plane mirror with no fewer than four marking points as an aid, the multiple cameras acquiring reflection images containing the multiple light sources, the target object and the marking points; a tenth calibration subunit, used to calculate, from the reflection images, the coordinates of each marking point, the coordinates of the mirrored light sources and the coordinates of the mirrored target object in the multiple camera coordinate systems; and an eleventh calibration subunit, used to reconstruct the mirror plane from the coordinates of all the marking points and to confirm the first positional relationship and the third positional relationship according to the principle of specular reflection, from which the second positional relationship is determined.
  • the sight line reconstruction module includes: a third reconstruction unit, used to acquire a set of sample images when the above-mentioned user gazes at each preset gaze point; a fourth reconstruction unit, used to extract samples according to each set of the above-mentioned sample images The feature determines the first compensation angle samples; the fifth reconstruction unit is configured to traverse all the above-mentioned first compensation angle samples, and obtain the above-mentioned compensation angles through screening and purification.
  • the above-mentioned line-of-sight reconstruction module further includes: a dynamic compensation unit, configured to determine the deviation between the predicted line-of-sight observation point and the real line-of-sight observation point through a dynamic compensation model for the collected data, and obtain the above-mentioned compensation angle according to the above-mentioned deviation.
  • the above-mentioned dynamic compensation unit includes an initialization subunit, configured to initialize the above-mentioned dynamic compensation model before using the above-mentioned dynamic compensation model, wherein, the above-mentioned initialization subunit includes: a first initialization subunit, used for the above-mentioned user to gaze at each Obtain a set of initial sample images when the initial point is preset; the second initialization subunit is used to extract initial sample features for each set of the above initial sample images, and obtain the current user fit through small sample learning initialization according to the above initial sample features The above dynamic compensation model.
  • the above-mentioned dynamic compensation unit includes a training subunit for training the above-mentioned dynamic compensation model
  • the training subunit includes: a first training subunit, used to collect multiple sets of sample data when multiple users respectively gaze at preset calibration points; a second training subunit, used to clean the multiple sets of sample data and extract training sample features from the cleaned sets; and a third training subunit, used to train the initial dynamic compensation model with few-shot learning according to the training sample features, obtaining the trained dynamic compensation model.
  • the above-mentioned simulation module includes: a first simulation unit, configured to simulate and calculate the reconstructed light source imaging point and the reconstructed pupil imaging point according to the above-mentioned eyeball parameters, the above-mentioned compensation angle and the above-mentioned hardware calibration parameters for the above-mentioned preset viewpoint; the second simulation A unit, configured to determine the predicted viewpoint according to the above-mentioned reconstruction light source imaging point and the above-mentioned reconstructed pupil imaging point according to the above-mentioned line-of-sight tracking method; a third simulation unit, used for statistical analysis based on the comparison value of the above-mentioned preset viewpoint and the above-mentioned predicted viewpoint, and Validation and optimization are implemented based on the analysis results.
  • the first simulation unit includes: a first simulation subunit, configured to determine the light source-cornea-camera angle according to the corneal center in the preset eyeball parameters and the hardware calibration parameters, determine the coordinates of the reconstructed light source reflection point based on this angle and the corneal radius of curvature in the preset eyeball parameters combined with the spherical reflection principle, and calculate the reconstructed light source imaging point from the reconstructed reflection point coordinates combined with the hardware calibration parameters;
  • a second simulation subunit, configured to determine the first visual axis from the coordinates of the preset viewpoint and the corneal center in the preset eyeball parameters, deduce the first optical axis from the first visual axis and the compensation angle, determine the reconstructed pupil center coordinates from the first optical axis and the pupil-cornea center distance in the preset eyeball parameters, and calculate the reconstructed pupil imaging point in combination with the hardware calibration parameters.
  • the third simulation unit includes: a third simulation subunit, used to verify whether the implementation of the gaze direction tracking method is correct, to test the influence of added disturbances on the viewpoint error, and to determine the configuration of the multiple light sources, the multiple cameras and the target object.
  • a storage medium is provided, the storage medium including a stored program, wherein, when the program runs, the device where the storage medium is located is controlled to execute the gaze direction tracking method of any one of claims 1 to 22.
  • an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the executable instructions so as to perform the gaze direction tracking method of any one of claims 1 to 22.
  • the embodiment of the present invention performs the following steps: using multiple light sources to provide corneal reflections to the user's eyes and using multiple cameras to capture images containing the user's face; combining the human eye feature set obtained from the images with the hardware calibration parameters to determine the light source reflection point coordinates and the pupil center coordinates in the world coordinate system; determining the line-of-sight optical axis from the light source reflection point coordinates and the pupil center coordinates, and reconstructing the line-of-sight visual axis from the optical axis through the compensation angle; and determining the viewpoint on the target object from the line-of-sight visual axis and the position of the target object in the world coordinate system.
  • the technical problems of high cost of eye-tracking hardware, low algorithm optimization efficiency, and low estimation accuracy are solved.
  • FIG. 1 is a flow chart of an optional line-of-sight tracking method according to an embodiment of the present invention
  • FIG. 2 is an application scenario diagram of an optional eye-tracking method according to an embodiment of the present invention.
  • Fig. 3 is a schematic diagram of setting an optional gaze tracking system according to an embodiment of the present invention.
  • FIG. 4 is an optional calibration scene diagram according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of an optional calibration method according to an embodiment of the present invention.
  • FIG. 6 is another optional calibration scene diagram according to an embodiment of the present invention.
  • FIG. 7 is a flowchart of another optional calibration method according to an embodiment of the present invention.
  • FIG. 8 is another optional calibration scene diagram according to an embodiment of the present invention.
  • FIG. 9 is a flowchart of another optional calibration method according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of an optional method for reconstructing light source reflection points according to an embodiment of the present invention.
  • Fig. 11 is a structural block diagram of an optional gaze direction tracking device according to an embodiment of the present invention.
  • Fig. 12 is a structural block diagram of another optional gaze direction tracking device according to an embodiment of the present invention.
  • an embodiment of a gaze tracking method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, such as a set of computer-executable instructions, and, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that shown or described herein.
  • the embodiment of the present invention provides a gaze direction tracking method that can be applied to various application scenarios, including products such as the transparent A-pillar, gaze-based screen wake-up, and head-up display systems; the embodiment of the present invention proposes a gaze tracking method with low hardware cost, high processing precision, fast processing speed and real-time performance that is applicable to most actual usage scenarios.
  • a gaze direction tracking method is provided.
  • FIG. 1 is a flow chart of an optional gaze direction tracking method according to an embodiment of the present invention. As shown in Figure 1, the method comprises the following steps:
  • multiple light sources are used to provide corneal reflections to the user's eyes, and multiple cameras are used to capture images containing the user's face; the human eye feature set obtained from the image containing the face is combined with the hardware calibration parameters to determine the light source reflection point coordinates and the pupil center coordinates in the world coordinate system; the line-of-sight optical axis is determined from the light source reflection point coordinates and the pupil center coordinates, and the line-of-sight visual axis is reconstructed from the optical axis through the compensation angle; and the viewpoint on the target object is determined from the line-of-sight visual axis and the position of the target object in the world coordinate system.
  • the user stares at the target object, uses multiple light sources to emit infrared light sources to the user's eyes, and uses multiple cameras to capture in real time an image including the pupil imaging point and the position of the light source reflection point obtained by reflecting the infrared light source through the cornea.
  • Multiple cameras can be used to capture images of the user's face in real time when the user is looking at the scene.
  • the image of the user's face does not refer to an image that only includes the user's face area, and the captured image only needs to include the image of the user's face area.
  • multiple cameras may focus on collecting the user's face area to obtain an image of the user's face containing clear facial features.
  • the user's gazing target can be a display device, a screen for outputting image content or text content, such as a screen, a monitor, and a head-mounted display, or a non-display device, such as a windshield.
  • Fig. 2 is an application scenario diagram of an optional gaze tracking method according to an embodiment of the present invention.
  • the application scenario of the gaze tracking method includes a gaze tracking system 10 and a terminal processor 30 communicating through a network, wherein the gaze tracking system 10 includes a camera 10a and a light source 10b.
  • when the user gazes at the target object 20, the light source 10b emits infrared light toward the user's eyes, and the camera 10a captures an image containing the user's eyes, the eye image including pupil imaging points and light source imaging points.
  • the target object 20 may be a display device, a screen for outputting image content and text content, such as a screen, a monitor, and a head-mounted display, or a non-display device, such as a windshield.
  • the dual-camera, dual-light-source setup is the minimum system for calculating the corneal center, and also the minimum system for realizing gaze tracking.
  • the present invention is applicable to different users, whose eyeball model parameters are unknown and different, so the plurality of cameras 10a includes at least two cameras, the plurality of light sources 10b includes at least two light sources, and the application scene includes target objects.
  • the multiple cameras, the multiple light sources and the target object are facing the user, and the fields of view of the multiple cameras do not include the multiple light sources and the above target object.
  • the multiple light sources, the multiple cameras and the target object are arranged on the same side; the present invention does not limit the specific numbers of cameras, light sources and target objects, nor the specific arrangement and placement of the devices.
  • the positional relationships among the multiple light sources, the multiple cameras and the target object in the gaze tracking system can be preset or obtained through calibration.
  • the terminal processor 30 acquires the image containing the human face, performs detection and feature extraction on it, and obtains the human eye feature set, including determining the coordinates of the pupil imaging points and the light source imaging points in the eye image in the three-dimensional coordinate system of the target object 20.
  • the terminal processor 30 determines the optical axis of the line of sight according to the input feature set of human eyes, reconstructs the visual axis of the line of sight by compensating the angle, and finally determines the user's viewpoint according to the visual axis of the line of sight and the geometric position relationship.
  • the terminal processor 30 may be a fixed terminal or a mobile terminal, and the mobile terminal may include at least one of the following devices, such as a notebook, a tablet computer, a mobile phone, and a vehicle-mounted device.
  • the human eye feature set includes: light source imaging points and pupil imaging points.
  • this embodiment uses existing technology to realize detection and feature extraction on images containing human faces.
  • this application does not limit the specific technology for obtaining the human eye feature set; traditional image processing methods, deep-learning-based image processing methods and the like can be used.
  • the hardware calibration parameters include: the internal and external parameters of the multiple camera coordinate systems and the geometric positional relationship, wherein the geometric positional relationship includes the following: a first positional relationship between the multiple light source coordinate systems and the multiple camera coordinate systems, a second positional relationship between the multiple light source coordinate systems and the target object coordinate system, and a third positional relationship between the multiple camera coordinate systems and the target object coordinate system. When any two of these positional relationships are obtained, the remaining one can be determined through the spatial transformation relationship and the two known relationships.
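For instance, if the first relationship (light source frame to camera frame) and the third relationship (camera frame to target frame) are known, the second follows by composing the rigid transforms. A short NumPy sketch with placeholder rotations and translations:

    import numpy as np

    def make_T(R, t):
        # 4x4 homogeneous transform from a 3x3 rotation and a translation vector.
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t
        return T

    # Placeholder calibration results:
    T_cam_from_light = make_T(np.eye(3), np.array([0.08, 0.00, 0.00]))    # first relationship
    T_target_from_cam = make_T(np.eye(3), np.array([0.00, -0.15, 0.05]))  # third relationship

    # The remaining (second) relationship follows from the spatial transformation chain.
    T_target_from_light = T_target_from_cam @ T_cam_from_light

    # Example: express a point given in the light source frame in the target frame.
    p_light = np.array([0.0, 0.0, 0.6, 1.0])
    p_target = T_target_from_light @ p_light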
  • the mapping relationship between any pixel coordinates and the world coordinate system, and the mapping relationship between the coordinate system of any hardware and the world coordinate system can be obtained.
  • the coordinates of the light source reflection point and the pupil center coordinates in the world coordinate system need to be determined based on the position of multiple cameras and multiple light sources in the same world coordinate system according to the human eye feature set.
  • the positions of multiple light sources and multiple cameras in the world coordinate system can be respectively determined according to the mapping relationship included in the hardware calibration parameters.
  • the light source reflection point is the reflection point of the light emitted from the center of the light source on the surface of the cornea.
  • the light source imaging point is the imaging of the light source reflection point in the collected image containing the user's face.
  • the pupil center is the center point of the pupil area.
  • the pupil imaging point is the imaging of the pupil refraction point after the pupil center is refracted by the cornea in the collected image including the user's face.
  • the cornea of the human eye is modeled as a sphere, the center of corneal curvature is the sphere center of the sphere, the corneal radius is the distance from the corneal curvature center to the surface of the corneal sphere, and the optical axis direction is the direction of the line connecting the pupil center and the corneal curvature center.
  • the human eyeball can also be modeled as the eyeball center, which is the spherical center of the eyeball, and the eyeball center is also located on the optical axis.
  • the radius of curvature of the cornea can be obtained by combining the position of the light source and the imaging point of the light source.
  • the light source imaging point is the image of the light source reflection point in the captured image containing the user's face. According to the camera imaging principle, combined with the position coordinates and the internal and external parameters of the multiple cameras that captured the image, the coordinates of the light source reflection point in the world coordinate system are determined. It should be noted that there may be one or more light sources; when multiple light sources are working, each light source has a corresponding reflection point, and the coordinates of each reflection point in the world coordinate system are determined in the same way as above.
  • the pupil imaging point is the image of the pupil refraction point, formed after the pupil center is refracted by the cornea, in the captured image containing the user's face. According to the camera imaging principle, combined with the position coordinates and the internal and external parameters of the cameras that captured the image, the coordinates of the pupil center in the world coordinate system are determined.
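The "camera imaging principle" step amounts to back-projecting an image point into a 3D ray using the camera's intrinsic and extrinsic parameters; with two or more calibrated cameras, the corresponding rays can then be intersected (see the triangulation sketch further below). A minimal illustration, with the extrinsics expressed as a camera-to-world rotation and translation and made-up intrinsics:

    import numpy as np

    def pixel_to_world_ray(pixel, K, R_wc, t_wc):
        # Back-project an image point (u, v) to a ray in the world coordinate
        # system. K is the 3x3 intrinsic matrix; R_wc, t_wc map camera
        # coordinates to world coordinates: X_world = R_wc @ X_cam + t_wc.
        u, v = pixel
        d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
        d_world = R_wc @ d_cam
        d_world /= np.linalg.norm(d_world)
        origin = t_wc  # camera center expressed in world coordinates
        return origin, d_world

    # Illustrative intrinsics for a 1280x720 camera.
    K = np.array([[900.0, 0.0, 640.0],
                  [0.0, 900.0, 360.0],
                  [0.0, 0.0, 1.0]])
    origin, direction = pixel_to_world_ray((700.0, 380.0), K, np.eye(3), np.zeros(3))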
  • any one of the following world coordinate systems may be selected: a light source coordinate system, a camera coordinate system, and a target object coordinate system. That is, the world coordinate system can be the stereo coordinate system of any specified light source, any camera or target object.
  • mapping relationship between any pixel coordinates and the world coordinate system and the mapping relationship between the coordinate system of any hardware and the world coordinate system can be obtained through hardware calibration parameters.
  • the face feature set is two-dimensional information in the pixel coordinate system.
  • the face feature set obtained from the image containing the face is combined with the hardware calibration parameters to determine the light source reflection point coordinates and the pupil center coordinates in the world coordinate system, as follows:
  • if the world coordinate system is the light source coordinate system, the face feature set uses the internal and external parameters of the multiple cameras, combined with the first positional relationship, or combined with the second and third positional relationships, to determine the light source reflection point coordinates and the pupil center coordinates in the light source coordinate system;
  • if the world coordinate system is the camera coordinate system, the face feature set determines the light source reflection point coordinates and the pupil center coordinates in the multiple camera coordinate systems through the internal and external parameters of the multiple cameras;
  • if the world coordinate system is the target object coordinate system, the face feature set determines the light source reflection point coordinates and the pupil center coordinates in the target object coordinate system through the internal and external parameters of the multiple cameras, combined with the third positional relationship, or combined with the first and second positional relationships.
  • this application does not limit the method of determining the internal and external parameters of the multiple cameras, which can be obtained from preset factory parameters or through calibration.
  • the application likewise does not limit the method of calibrating the internal and external parameters of the multiple cameras; for example, a calibration board can be used to calibrate them.
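A typical checkerboard-based intrinsic calibration, sketched with OpenCV (the image folder, board size and square size are placeholders; this is one common way to obtain the camera parameters, not a procedure prescribed by the patent):

    import glob
    import cv2
    import numpy as np

    pattern = (9, 6)            # inner-corner count of the checkerboard (placeholder)
    square = 0.025              # square size in metres (placeholder)
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

    obj_points, img_points, gray = [], [], None
    for path in glob.glob("calib_images/*.png"):   # hypothetical image folder
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
            obj_points.append(objp)
            img_points.append(corners)

    # Intrinsics K, distortion coefficients, and per-view extrinsics (rvecs, tvecs).
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)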
  • the geometric position relationship is calibrated and obtained by combining internal and external parameters of multiple cameras, and using at least one of a plane mirror and an auxiliary camera to transmit calibration information.
  • At least one of the plane mirror and the auxiliary camera is used to transmit calibration information, and the geometric position relationship is obtained based on the calibration information.
  • the calibration process aims to obtain the geometric positional relationship between multiple cameras, multiple light sources, and the target object.
  • since the multiple cameras, the multiple light sources and the target object are arranged on the same side, the cameras cannot directly observe the target object, or can only observe part of the light sources and the target object.
  • this application therefore combines the internal and external parameters of the multiple cameras and uses at least one of a plane mirror and an auxiliary camera to transmit calibration information, thereby obtaining the geometric positional relationship.
  • FIG. 3 is a schematic diagram of an optional gaze tracking system setup according to an embodiment of the present invention.
  • the gaze tracking system is set with two cameras on the outside and two light sources on the inside.
  • the gaze tracking system 100 includes cameras 110 and 116 , light sources 112 and 114 , and the light sources are used to emit light to the eyes of the user 105 to provide corneal reflection.
  • the camera is used to capture images containing the user's 105 face.
  • the gaze tracking system 100 adopts the arrangement of the light source on the inside and the camera on the outside.
  • the present invention does not limit the specific arrangement of the cameras and light sources; for example, an arrangement with the cameras on the inside and the light sources on the outside, or an arrangement with other spacings between the cameras and the light sources, can also be adopted.
  • the camera and the light source are respectively fixed on the pan-tilt 103, 104, 101 and 102, and the pan-tilt can adjust each component in the horizontal and vertical directions.
  • the fixed components are installed on the base 107, and the distance between the components can be realized by adjusting their fixed positions.
  • the target object is configured as a display 106, which can be used to display the calibration mark.
  • Fig. 4 is an optional calibration scene diagram according to an embodiment of the present invention.
  • a plane mirror 320 is placed on the front side of the display 106, a first auxiliary camera 318 is fixed above the display 106, and a second auxiliary camera is arranged in front of the line of sight tracking system.
  • the second auxiliary camera is a stereoscopic vision system, including cameras 322 and 324,
  • the display 106 displays the first indicia.
  • the auxiliary camera 318 acquires images of all first marks projected by the display 106 in the plane mirror 320 , and the field of view of the stereo vision system includes all cameras and light sources.
  • Fig. 5 is a flowchart of an optional calibration method according to an embodiment of the present invention.
  • combining internal and external parameters of multiple cameras, and using a plane mirror and an auxiliary camera to transmit calibration information to obtain a geometric position relationship may include:
  • the first auxiliary camera acquires multiple first marker images of the above-mentioned target object containing the first marker reflected by the plane mirror in multiple different postures;
  • the plane mirror 320 is placed in multiple different poses so that the virtual image of the first marker on the mirror surface falls within the field of view of the auxiliary camera 318, wherein the number of mirror poses is at least three and the planes of the mirror in the different poses are not parallel to each other.
  • here, the internal and external parameters of the multiple cameras include the internal and external parameters of the cameras, the first auxiliary camera and the second auxiliary camera, which are not limited in this application.
  • the relative positions of all the first marked points relative to the auxiliary camera are restored.
  • reflecting the marker points with the plane mirror in different poses is equivalent to observing the fixed display from different virtual first auxiliary camera viewpoints.
  • for all the first marker images, P3P is used to solve for multiple candidate combinations of mirrored first marker points, and the orthogonal constraints among the mirrored points are used to select the final combination.
  • the orthogonal constraint is that, in the auxiliary camera coordinate system, for any two different mirror poses, the axis vector of the two mirror planes is orthogonal to the difference vector of the coordinates of the same marker point reflected under the corresponding mirrors.
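The constraint can be checked numerically: reflecting the same 3D point about two mirror planes and taking the cross product of the two mirror normals as the axis vector, the difference of the two reflections is orthogonal to that axis. A small NumPy sketch with made-up mirror poses:

    import numpy as np

    def reflect(point, n, d):
        # Reflect a 3D point about the plane {x : n.x + d = 0}, n a unit normal.
        return point - 2.0 * (np.dot(n, point) + d) * n

    n1 = np.array([0.00, 0.20, -1.00]); n1 /= np.linalg.norm(n1); d1 = 0.9
    n2 = np.array([0.15, 0.00, -1.00]); n2 /= np.linalg.norm(n2); d2 = 1.0
    marker = np.array([0.05, -0.03, 0.40])          # a marker point on the display

    p1, p2 = reflect(marker, n1, d1), reflect(marker, n2, d2)
    axis = np.cross(n1, n2)                          # axis vector of the two mirror planes
    print(np.dot(p1 - p2, axis))                     # ~0: the orthogonal constraint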
  • the second auxiliary camera acquires a second marker image including multiple light sources, and acquires a first positional relationship based on the second marker image by combining internal and external parameters of the multiple cameras, wherein the second auxiliary camera is a stereo vision system;
  • the second auxiliary camera is a stereo vision system, and the field of view of the stereo vision system includes all cameras and light sources.
  • a second marker image containing all light sources is collected and processed by the stereo vision system. Since the data collected by the stereo vision system contains three-dimensional information, the pose of each light source in the stereo vision system is determined from the second marker image, and the positional relationship between the multiple light sources and the multiple cameras, that is, the first positional relationship, is then determined in combination with the internal and external parameters of the multiple cameras.
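The stereo vision system recovers each light source center by triangulating its detected image points in the two auxiliary views; together with the camera extrinsics this fixes the first positional relationship. A minimal midpoint triangulation of two back-projected rays (ray construction as in the earlier back-projection sketch; all values illustrative):

    import numpy as np

    def triangulate_midpoint(o1, d1, o2, d2):
        # Midpoint of the shortest segment between two rays o + s*d.
        d1 = d1 / np.linalg.norm(d1)
        d2 = d2 / np.linalg.norm(d2)
        w0 = o1 - o2
        a, b, c = np.dot(d1, d1), np.dot(d1, d2), np.dot(d2, d2)
        d, e = np.dot(d1, w0), np.dot(d2, w0)
        denom = a * c - b * b                 # ~0 only for (near-)parallel rays
        s = (b * e - c * d) / denom
        t = (a * e - b * d) / denom
        return 0.5 * ((o1 + s * d1) + (o2 + t * d2))

    # Two rays from the left and right auxiliary cameras toward one light source.
    p = triangulate_midpoint(np.array([0.0, 0.0, 0.0]), np.array([0.05, 0.01, 1.0]),
                             np.array([0.12, 0.0, 0.0]), np.array([-0.07, 0.01, 1.0]))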
  • S313 Determine a second positional relationship according to the first positional relationship and the third positional relationship.
  • in this method, the pose relationship of the system is solved from the orthogonal constraints based on the pose relationships established between the target object and its virtual images in the plane mirror. Because the position of the plane mirror must be changed many times, the camera must collect multiple mirror-reflected images, and the movement of the plane mirror must satisfy preset conditions, the operation is relatively complicated and the efficiency is low.
  • in addition, the above mirror-based calibration relies on a linear solution of the orthogonal constraints, which is usually sensitive to noise, and the accuracy of the calibrated positional relationships is related to the distance from the plane mirror to the camera and to the rotation angle of the plane mirror.
  • the second calibration method introduces multiple auxiliary cameras to avoid changing the position of the plane mirror multiple times, uses multiple auxiliary cameras as a conversion bridge, and determines the final geometric position relationship through attitude conversion.
  • Fig. 6 is another optional calibration scene diagram according to an embodiment of the present invention.
  • a plane mirror 320 is placed on the front side of the display 106
  • a third auxiliary camera is arranged in front of the line of sight tracking system
  • the third auxiliary camera is a stereo vision system, including cameras 422 and 424
  • a calibration plate 428 is arranged on the side of the third auxiliary camera
  • the field of view of the cameras 110 and 116 includes the calibration plate area
  • the display 106 displays the fourth mark image
  • the fourth auxiliary camera 426 is set so that its field of view includes the camera and the third auxiliary camera
  • the field of view of the stereo vision system includes all of the cameras and light sources, as well as the display showing the fifth marker.
  • Fig. 7 is a flowchart of another optional calibration method according to an embodiment of the present invention.
  • combining the internal and external parameters of the above-mentioned multiple cameras and using the auxiliary camera to transmit calibration information to obtain the above-mentioned geometric position relationship may include:
  • the third auxiliary camera acquires a third marker image including the plurality of light sources, and combines the internal and external parameters of the plurality of cameras to acquire the first positional relationship based on the third marker image, wherein the third auxiliary camera is a stereo vision system ;
  • the fourth auxiliary camera is set so that its field of view includes the plurality of cameras and the third auxiliary camera, and a calibration plate is placed beside the third auxiliary camera; the plurality of cameras capture a fourth marker image including the calibration plate area, while the third auxiliary camera acquires a fifth marker image of the target object containing the fifth marker;
  • The calibration image containing the fourth marker is displayed on the display 106, so that the third auxiliary camera can capture most of the calibration image area.
  • A calibration board is placed beside the third auxiliary camera so that cameras 110 and 116 can simultaneously capture most of the calibration board area, and the fourth auxiliary camera is set so that its field of view includes the cameras and the third auxiliary camera.
  • The cameras capture a fourth marker image including the calibration plate area, while the third auxiliary camera captures a fifth marker image of the target object containing the fifth marker.
  • S322 Use the positional relationship between the fourth auxiliary camera and the plurality of cameras as a pose conversion bridge and, in combination with the internal and external parameters of the third auxiliary camera and the plurality of cameras, determine the third positional relationship according to the fourth marker image and the fifth marker image;
  • Here the internal and external parameters of the multiple cameras include those of the cameras, the third auxiliary camera and the fourth auxiliary camera; how they are obtained is not specifically limited.
  • The fourth auxiliary camera is used as a bridge for pose conversion to calibrate the pose relationship between the display and the cameras, i.e. to determine the third positional relationship (a sketch of chaining such pose transforms is given after this step list).
  • S323 Determine the second positional relationship according to the first positional relationship and the third positional relationship.
  • In the second calibration method, additional auxiliary cameras and a calibration board are introduced: the calibration board is placed within the working range of the multiple cameras, and the coordinate-system conversions through the auxiliary cameras and the calibration board finally yield the calibration of the pose relationships among the multiple cameras, the multiple light sources and the target object.
  • This method is simple in principle and has high theoretical accuracy. In actual operation, however, because the positional relationship between the multiple cameras and the target object is already fixed and the multiple cameras cannot capture the target object, when the calibration board and the auxiliary cameras are arranged as required above, the angle between the optical axis of the auxiliary camera and the normal vectors of the calibration board and of the display may be too large; this can lead to poorly captured calibration patterns and large marker extraction errors, making the precision of the conversion difficult to guarantee.
  • In addition, introducing auxiliary cameras increases the cost and complicates the calibration process.
  • The third calibration method relies only on a fixed plane mirror, avoiding the complicated operations caused by moving the mirror multiple times; it does not rely on auxiliary cameras, avoids problems such as excessively large viewing angles between the auxiliary camera and the captured objects, and simplifies the calibration process while reducing cost.
  • Fig. 8 is another optional calibration scene diagram according to an embodiment of the present invention.
  • a plane mirror 320 is placed on the front side of the display 106, and a plurality of marker patterns are pasted on the plane mirror 320.
  • The marker patterns may be dots, checkerboard grids, concentric circles or other easily distinguishable patterns; there are no fewer than four of them, and their distribution is not specifically limited.
  • the flat mirror reflects the display 106, cameras 110, 116, light source 112, and light source 114.
  • the field of view of the cameras includes the images of all cameras, light sources and the display with its markers, as reflected in the plane mirror.
  • Fig. 9 is a flowchart of another optional calibration method according to an embodiment of the present invention.
  • combining the internal and external parameters of the above-mentioned multiple cameras, using a plane mirror to transmit calibration information, and obtaining the above-mentioned geometric positional relationship may include:
  • the multiple cameras acquire reflection images containing the multiple light sources, the target object and the marker points;
  • S331 According to the reflection images, respectively calculate the coordinates of each marker point in the multiple camera coordinate systems, as well as the mirrored light source coordinates and the mirrored target object coordinates;
  • S332 Reconstruct the mirror plane from the coordinates of all the marker points, and determine the first positional relationship and the third positional relationship according to the principle of specular reflection, in combination with the mirrored light source coordinates and the mirrored target object coordinates;
  • S333 Determine the second positional relationship according to the first positional relationship and the third positional relationship.
  • The basic idea of the third calibration method is to paste multiple marker patterns on the plane mirror 320.
  • The marker patterns may be dots, checkerboards, concentric circles or other easily distinguishable patterns; there are no fewer than four of them, and their distribution is not specifically limited.
  • Since the multiple cameras cannot directly capture the target object and the light sources, the principle of specular reflection is used: as shown in the figure, the cameras capture reflection images toward the mirror, and these images contain the reflections in the plane mirror of all the cameras, the light sources and the display together with the markers.
  • The mirror plane in the multiple camera coordinate systems is reconstructed from the coordinates of at least four marker points; by the specular reflection principle, the mirrored light source corresponds to the actual light source in the multiple camera coordinate systems, so the first positional relationship is determined from the mirrored light source coordinates combined with the mirror plane (a sketch of this plane fit and reflection is given below).
  • In the same way the third positional relationship is determined, and the second positional relationship is then determined from the first positional relationship and the third positional relationship.
  • The third calibration method relies only on a fixed plane mirror, avoiding the complicated operations caused by multiple movements; it does not rely on auxiliary cameras, avoids problems such as excessively large viewing angles between the auxiliary camera and the captured objects, and effectively simplifies the calibration process and reduces cost.
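  • As a minimal sketch of steps S331-S332 (assuming the 3D coordinates of the mirror marker points and of one mirrored light source have already been triangulated in the camera coordinate system; the coordinates below are illustrative), the mirror plane can be fitted to the marker points and the real light source position recovered by reflecting its virtual image across that plane:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through >= 3 points; returns unit normal n and offset d with n.x + d = 0."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]
    return n, -float(n @ centroid)

def reflect_across_plane(p, n, d):
    """Mirror a 3D point across the plane {x : n.x + d = 0}."""
    return p - 2.0 * (float(n @ p) + d) * n

# Hypothetical marker points pasted on the mirror (camera coordinates, metres).
mirror_markers = [
    [-0.20, -0.15, 0.60],
    [ 0.20, -0.15, 0.61],
    [ 0.20,  0.15, 0.62],
    [-0.20,  0.15, 0.61],
]
n, d = fit_plane(mirror_markers)

# Virtual (mirrored) light source triangulated from the reflection image; by the
# specular reflection principle the real light source is its mirror image across
# the reconstructed mirror plane.
virtual_light = np.array([0.05, -0.02, 1.10])
print(reflect_across_plane(virtual_light, n, d))
```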
  • Determining the line-of-sight optical axis from the light source reflection point coordinates and the pupil center coordinates includes:
  • determining the line-of-sight optical axis, i.e. the line connecting the pupil center and the corneal curvature center, according to the pupil center coordinates and the corneal curvature center coordinates.
  • The center of corneal curvature in the world coordinate system can be calculated and determined.
  • The optical axis direction is the direction of the line connecting the pupil center and the corneal curvature center, so a simple calculation on the two sets of coordinates determines it; for example, subtracting the corneal curvature center coordinates from the pupil center coordinates and normalizing gives the optical axis direction (see the sketch after this list).
  • For each camera, the corneal curvature center and the pupil center are coplanar with the pupil imaging point on that camera's imaging plane and with the camera's optical center.
  • With two such planes, one per camera, the unit vector of their line of intersection is the line-of-sight optical axis.
  • Equivalently, the line connecting the pupil center and the corneal curvature center is the line-of-sight optical axis.
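  • A minimal sketch of the subtract-and-normalize step (assuming the pupil center and the corneal curvature center are already expressed in the same world coordinate system; the coordinates are illustrative):

```python
import numpy as np

def optical_axis(pupil_center, cornea_center):
    """Unit direction of the line-of-sight optical axis, pointing from the
    corneal curvature center towards the pupil center (world coordinates)."""
    d = np.asarray(pupil_center, dtype=float) - np.asarray(cornea_center, dtype=float)
    return d / np.linalg.norm(d)

# Example with hypothetical coordinates (metres).
print(optical_axis([0.012, 0.031, 0.574], [0.010, 0.030, 0.578]))
```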
  • the line-of-sight estimation method based on corneal pupillary reflection can often only obtain the optical axis of the eyeball.
  • the user's line of sight is determined by the visual axis, and there is a compensation angle between the visual axis and the optical axis.
  • This application does not consider the case where eye deformation or abnormality affects the user's compensation angle.
  • In addition, the eyeball and cornea are not perfectly spherical, so the cornea-pupil-reflection-based gaze estimation method reconstructs the optical axis with different errors when the user gazes at different positions.
  • Two feasible methods are provided for reconstructing the line-of-sight visual axis from the line-of-sight optical axis based on the compensation angle.
  • the structure of the human eye determines that there is a compensation angle between the optical axis of the line of sight and the visual axis of the line of sight, which is called the Kappa angle.
  • the Kappa angle is used as a fixed compensation between the optical axis and the visual axis.
  • the compensation angle may not be calibrated, or the compensation angle may be calibrated through a few points, so that it is easier to implement eye tracking.
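  • A minimal sketch of applying a fixed Kappa compensation (assuming the compensation angle is expressed as horizontal and vertical offsets, in radians, added to the optical axis direction; this particular parametrization and the example values are illustrative assumptions, not mandated by the original):

```python
import numpy as np

def axis_to_angles(v):
    """Unit direction -> (yaw, pitch) in radians, with z forward, x right, y up (illustrative convention)."""
    x, y, z = v
    return np.arctan2(x, z), np.arcsin(np.clip(y, -1.0, 1.0))

def angles_to_axis(yaw, pitch):
    """(yaw, pitch) -> unit direction under the same convention."""
    return np.array([np.cos(pitch) * np.sin(yaw), np.sin(pitch), np.cos(pitch) * np.cos(yaw)])

def apply_kappa(optical_axis, kappa_h, kappa_v):
    """Rotate the optical axis by the fixed compensation (Kappa) angles to get the visual axis."""
    yaw, pitch = axis_to_angles(optical_axis)
    return angles_to_axis(yaw + kappa_h, pitch + kappa_v)

# Example: optical axis and a Kappa of roughly 5 deg horizontal, 1.5 deg vertical.
optical = np.array([0.05, 0.02, 0.998]); optical /= np.linalg.norm(optical)
print(apply_kappa(optical, np.deg2rad(5.0), np.deg2rad(1.5)))
```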
  • In this case, the above method further includes: acquiring a group of sample images while the user gazes at each preset gaze point; extracting sample features from each group of sample images and determining a first compensation angle sample from them; and traversing all the first compensation angle samples to obtain the compensation angle through screening and purification.
  • Determining the first compensation angle sample from the sample features extracted from each group of sample images includes:
  • reconstructing the first line-of-sight optical axis from the sample features, and inversely deducing the first line-of-sight visual axis by combining the pupil center coordinates with the preset gaze point;
  • acquiring a first compensation angle sample according to the first line-of-sight optical axis and the first line-of-sight visual axis.
  • Traversing all the first compensation angle samples and obtaining the compensation angle through screening and purification includes: calculating the center point of all the first compensation angle samples and removing samples that are not within the first threshold range; continuing to traverse, filter and purify the remaining samples until the difference between the current center point and the previous center point is below the second threshold; and obtaining the compensation angle from all the purified samples.
  • The preset gaze point positions are known, and the gaze points are uniformly distributed in space.
  • The user gazes at a gaze point at a known position; from the sample features contained in the collected sample images, the line-of-sight optical axis is reconstructed by the cornea-pupil method described above, the line-of-sight visual axis is restored by combining the pupil center coordinates with the preset gaze point, and the compensation angle sample corresponding to that gaze point is obtained. Errors, such as detection errors, arise in the process of building the compensation angle sample set from the preset gaze points.
  • The present invention therefore purifies the compensation angle samples, eliminating abnormal samples and retaining high-quality, reasonably distributed samples within the threshold range, which to a certain extent ensures both the accuracy and the applicability of the final compensation angle (a sketch of this screening loop is given below).
  • In this way, the application can skip calibration or calibrate with only a few points, making eye tracking easier to implement.
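  • A minimal sketch of the screening-and-purification loop (assuming each first compensation angle sample is a 2D vector of horizontal/vertical compensation angles; the threshold names and values are illustrative, and the original is not limited to this exact criterion):

```python
import numpy as np

def purify_compensation_samples(samples, first_threshold, second_threshold, max_iter=50):
    """Iteratively remove outlier compensation-angle samples and return the final compensation angle.

    samples: (N, 2) array of per-gaze-point compensation angles (radians).
    first_threshold: maximum allowed distance from the current center point.
    second_threshold: stop when the center moves less than this between iterations.
    """
    kept = np.asarray(samples, dtype=float)
    center = kept.mean(axis=0)
    for _ in range(max_iter):
        dist = np.linalg.norm(kept - center, axis=1)
        kept = kept[dist <= first_threshold]          # screen out abnormal samples
        new_center = kept.mean(axis=0)
        if np.linalg.norm(new_center - center) < second_threshold:
            return new_center
        center = new_center
    return center

# Example with hypothetical samples (degrees -> radians), one of them an outlier.
samples = np.deg2rad([[5.1, 1.4], [4.9, 1.6], [5.3, 1.5], [12.0, -3.0]])
kappa = purify_compensation_samples(samples, np.deg2rad(3.0), np.deg2rad(0.05))
print(np.rad2deg(kappa))
```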
  • A dynamic compensation model is also introduced; before the dynamic compensation model is used, personalized compensation calibration is performed for different users, so that the gaze direction can be tracked with higher accuracy.
  • In this case the above method further includes: determining, through the dynamic compensation model, the deviation between the predicted line-of-sight observation point and the real line-of-sight observation point for the collected data, and obtaining the compensation angle according to that deviation.
  • Each frame of the collected data is input into the trained dynamic compensation model to predict the deviation, where the deviation is the distance between the predicted line-of-sight observation point and the real line-of-sight observation point, and the compensation angle is obtained from this deviation.
  • the predicted line-of-sight observation point can be calculated by the above-mentioned method of reconstructing the line-of-sight optical axis.
  • The embodiment of the present invention does not limit the type of the dynamic compensation model, which can be a neural network, a random forest, and the like.
  • initialization needs to be performed before using the above-mentioned trained dynamic compensation model, so as to obtain a dynamic compensation model suitable for the current user.
  • Initializing the dynamic compensation model before it is used includes the following.
  • The user gazes at preset initial points whose positions are known and evenly distributed in space.
  • A group of initial sample images is acquired for each preset initial point, and the number of gaze points collected is at least 3.
  • Initial sample features are extracted from each group of initial sample images, and the dynamic compensation model is initialized with these features through few-shot learning to obtain a dynamic compensation model that fits the current user.
  • the initial sample features extracted from the sample image include, but are not limited to: the three-dimensional coordinates of the center of the corneal ball, the Euler angle of the optical axis of the eyeball, the sight observation point calculated based on the optical axis, and the deflection angle of the face.
  • Through few-shot learning, the dynamic compensation model retains generalization ability even when the learned categories change within a short period of time.
  • The dynamic compensation model initialized with the current user's prior knowledge provides a good search direction for the data subsequently collected by that user, which reduces prediction errors and improves the efficiency of deviation prediction.
  • Embodiments of the present invention do not limit the specific few-shot learning method; the MAML method, for example, may be used (a simplified per-user adaptation sketch is given below).
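  • A highly simplified sketch of the per-user initialization (assuming the deviation predictor is a small linear model adapted with a few gradient steps on the user's 3+ calibration samples, in the spirit of MAML-style inner-loop adaptation; the model form, feature layout and learning rate are illustrative assumptions, not the original design):

```python
import numpy as np

def adapt_to_user(w0, b0, feats, deviations, lr=0.1, steps=20):
    """Few-shot adaptation of a linear deviation predictor to the current user.

    w0, b0: meta-learned initial weights (D, 2) and bias (2,).
    feats: (N, D) initial sample features (e.g. corneal center, optical-axis Euler
           angles, optical-axis viewpoint, face deflection angle, flattened).
    deviations: (N, 2) observed viewpoint deviations at the preset initial points.
    """
    w, b = w0.copy(), b0.copy()
    for _ in range(steps):
        err = feats @ w + b - deviations       # prediction error, (N, 2)
        w -= lr * feats.T @ err / len(feats)   # gradient step on the mean squared error
        b -= lr * err.mean(axis=0)
    return w, b

# Example with a hypothetical meta-initialization and 3 calibration samples.
rng = np.random.default_rng(0)
D = 8
w0, b0 = rng.normal(scale=0.01, size=(D, 2)), np.zeros(2)
feats = rng.normal(size=(3, D))
deviations = rng.normal(scale=0.02, size=(3, 2))
w_user, b_user = adapt_to_user(w0, b0, feats, deviations)
print(feats @ w_user + b_user)  # per-frame predicted deviations for this user
```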
  • the above training dynamic compensation model includes:
  • the initial dynamic compensation model is trained using small sample learning according to the characteristics of the above training samples, and the above dynamic compensation model after training is obtained.
  • The trained dynamic compensation model is obtained from the initial dynamic compensation model through the above steps; first, multiple sets of sample data are collected.
  • Each tester is required to gaze at several preset calibration points evenly distributed on the target object, and an equal number of samples is collected for each point.
  • The sample data are then cleaned to remove samples that do not meet the requirements of the line-of-sight estimation algorithm, such as those with incomplete light source reflection points or severely blurred images.
  • sample features are extracted for each set of sample images after cleaning and purification, and the initial dynamic compensation model is trained using small sample learning according to the training sample features to obtain the trained dynamic compensation model.
  • the training sample features extracted for multiple sets of sample data include, but are not limited to: the three-dimensional coordinates of the center of the corneal ball, the Euler angle of the optical axis of the eyeball, the sight observation point calculated based on the optical axis, and the deflection angle of the face.
  • After the collected data are input into the trained dynamic compensation model, a highly accurate deviation between the predicted line-of-sight observation point and the real line-of-sight observation point is obtained, and thus a high-precision compensation angle.
  • The line-of-sight visual axis is then reconstructed from the line-of-sight optical axis through the compensation angle.
  • In summary, a dynamic compensation model fitting the current user is first initialized; the resulting model is then used to predict deviations for the data collected by that user, a better compensation angle is further obtained, and finally a high-precision line-of-sight visual axis is reconstructed.
  • the sight axis of the above-mentioned line of sight is located in the world coordinate system.
  • To determine the viewpoint of the visual axis on the target object, it is necessary to determine the position of the target object in the world coordinate system.
  • The geometric positional relationship provides the position of the target object in the world coordinate system, from which the plane of the target object in world coordinates is further determined.
  • The problem of determining the user's viewpoint is thus transformed into finding the intersection of the visual axis vector with the plane of the target object in the same coordinate system (a ray-plane intersection sketch is given below).
  • the present invention does not limit the method for solving the intersection problem.
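  • One common way to solve this intersection is a ray-plane test; a minimal sketch is given below, assuming the target object plane is given by a point and a unit normal in world coordinates, and the visual axis by an origin (e.g. the corneal center) and a unit direction. The names and values are illustrative.

```python
import numpy as np

def viewpoint_on_plane(ray_origin, ray_dir, plane_point, plane_normal):
    """Intersection of the visual-axis ray with the target-object plane.

    Returns the 3D viewpoint, or None when the axis is (nearly) parallel to the plane.
    """
    ray_origin = np.asarray(ray_origin, dtype=float)
    ray_dir = np.asarray(ray_dir, dtype=float)
    denom = float(np.dot(plane_normal, ray_dir))
    if abs(denom) < 1e-9:
        return None
    t = float(np.dot(plane_normal, np.asarray(plane_point) - ray_origin)) / denom
    return ray_origin + t * ray_dir

# Example: eye at the corneal center, screen plane 0.6 m away along +z.
visual_axis = np.array([0.05, -0.02, 1.0]); visual_axis /= np.linalg.norm(visual_axis)
print(viewpoint_on_plane([0.01, 0.03, 0.0], visual_axis, [0.0, 0.0, 0.6], [0.0, 0.0, 1.0]))
```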
  • the embodiment of the present invention also includes establishing a simulation system to simulate the tracking accuracy and anti-interference ability of the gaze tracking system under various hardware conditions, eyeball states, and algorithm design factors.
  • the above-mentioned method further performs simulation verification, analysis and optimization through preset eyeball parameters, the above-mentioned compensation angle, the above-mentioned hardware calibration parameters and a preset viewpoint.
  • the above-mentioned simulation verification and optimization are carried out through preset eyeball parameters, the above-mentioned compensation angle, the above-mentioned hardware calibration parameters and preset viewpoints, including:
  • determining the predicted viewpoint from the reconstructed light source imaging point and the reconstructed pupil imaging point according to the line-of-sight direction tracking method described above;
  • Eyeball parameters and geometric positional relationships are preset according to the scene for simulating gaze tracking.
  • Eyeball parameters include, but are not limited to: the three-dimensional coordinates of the eyeball center, the eyeball radius, the three-dimensional coordinates of the corneal sphere center, the corneal sphere radius, the distance from the eyeball center to the corneal sphere center, the three-dimensional coordinates of the pupil center, the three-dimensional coordinates of the iris center, the iris radius, the corneal refractive index, the refractive index of the lens, the eyeball optical axis, the eyeball visual axis and the Kappa angle; these are used to build the three-dimensional eyeball model.
  • Hardware calibration parameters include, but are not limited to: camera intrinsic parameters, camera extrinsic parameters, camera distortion parameters, camera stereo calibration parameters, camera 3D coordinates, light source 3D coordinates, and target object 3D coordinates.
  • The simulation system verifies whether the implementation of the gaze direction tracking method is correct according to the actual input parameters, tests the influence of disturbances in the input parameters on the viewpoint error, and finds the optimal arrangement of the multiple light sources, the multiple cameras and the target object.
  • the simulated calculation of the reconstructed light source imaging point and the reconstructed pupil imaging point according to the preset eyeball parameters, compensation angle and hardware calibration parameters includes:
  • the first visual axis is determined according to the coordinates of the above-mentioned preset viewpoint and the corneal center in the above-mentioned preset eyeball parameters, and the first optical axis is deduced based on the above-mentioned first visual axis and the above-mentioned compensation angle.
  • The reconstructed pupil center coordinates are determined using the pupil-to-corneal-center distance in the preset eyeball parameters, and the reconstructed pupil imaging points are calculated from the pupil center coordinates combined with the hardware calibration parameters.
  • Fig. 10 is a schematic diagram of an optional method for reconstructing light source reflection points according to an embodiment of the present invention.
  • the corneal center C, the camera optical center O and the light source center l in the world coordinate system are obtained according to preset parameters.
  • The corneal sphere center corresponding to the corneal area irradiated by the light source lies on the optical axis of the eyeball, and its radius R_c is fixed.
  • Light emitted from the center of the light source is reflected at the reflection point q on the corneal surface, and the reflected ray passes through the camera's optical center O and is projected onto the image plane (a sketch of computing q is given below).
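  • A minimal numerical sketch of locating the reflection point q (assuming a spherical corneal surface with center C and radius R_c, a point light source L and camera optical center O in the same coordinate system; the reflection point lies in the plane spanned by O-C and L-C, and is found here by bisection on the law-of-reflection condition; the solver choice and values are illustrative):

```python
import numpy as np

def glint_point(C, Rc, O, L, iters=60):
    """Reflection point q on the corneal sphere (center C, radius Rc) at which a ray
    from light source L is reflected towards the camera optical center O."""
    C, O, L = (np.asarray(v, dtype=float) for v in (C, O, L))
    u = (O - C) / np.linalg.norm(O - C)                # in-plane basis vector towards the camera
    w = (L - C) - np.dot(L - C, u) * u
    v = w / np.linalg.norm(w)                          # completes the basis of the C-O-L plane
    phi = np.arccos(np.clip(np.dot((L - C) / np.linalg.norm(L - C), u), -1.0, 1.0))

    def residual(theta):
        # Law of reflection: the surface normal at q bisects the directions to L and O,
        # so the two cosines must match.
        q = C + Rc * (np.cos(theta) * u + np.sin(theta) * v)
        n = (q - C) / Rc
        to_L = (L - q) / np.linalg.norm(L - q)
        to_O = (O - q) / np.linalg.norm(O - q)
        return np.dot(n, to_L) - np.dot(n, to_O)

    lo, hi = 0.0, phi                                  # q lies angularly between O and L as seen from C
    for _ in range(iters):                             # bisection on the sign change of the residual
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    theta = 0.5 * (lo + hi)
    return C + Rc * (np.cos(theta) * u + np.sin(theta) * v)

# Example with plausible magnitudes (metres): cornea about 0.6 m in front of the camera.
print(glint_point(C=[0.0, 0.0, 0.6], Rc=0.0078, O=[0.0, 0.0, 0.0], L=[0.1, 0.05, 0.02]))
```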
  • The simulation system supports placing multiple preset gaze points to help verify whether the implementation of the gaze direction tracking method is correct. Specifically, taking a screen as an example of the target object contained in the simulation system, multiple gaze points with known positions appear on the screen.
  • Combined with the corneal sphere center coordinates contained in the eyeball parameters, the line-of-sight visual axis can be calculated; the optical axis is then deduced from the preset compensation angle, and finally the pupil center coordinates are reconstructed from the preset distance between the pupil center and the corneal sphere center.
  • The preset viewpoints can be specified in advance or randomly generated at multiple different positions, so as to simulate tracking from various sight angles; specifically, a plurality of preset points at different positions are generated to simulate the error behaviour of gaze tracking when the eyeball looks at different positions of the target object.
  • Statistical analysis is performed on the preset viewpoints and the predicted viewpoints, and verification and optimization are carried out according to the analysis results, including: verifying whether the implementation of the line-of-sight direction tracking method is correct, testing the influence of disturbances added to the input parameters on the viewpoint error, and determining the arrangement of the multiple light sources, the multiple cameras and the target object. Specifically, this is all computed through simulation.
  • The simulation process does not involve actual hardware or the processing of real image pixels.
  • From the preset viewpoint, combined with the eyeball parameters and hardware calibration parameters contained in the simulation system, the reconstructed light source reflection point and the reconstructed pupil imaging point are calculated; the predicted viewpoint is then determined by the cornea-pupil-reflection-based gaze tracking method described above, and the correctness of the implementation of the gaze direction tracking method is verified by this forward and reverse check.
  • Multiple variables are involved in the simulation system, including the geometric positional relationships, the eyeball parameters and the viewpoints; an abnormality in any of them will cause the correctness verification to fail. If the verification fails, the abnormal variables can be screened out by controlling variables and other methods, which improves the efficiency of optimizing the gaze tracking method.
  • the above-mentioned simulation system calculates the predicted viewpoint after adding disturbance to the above-mentioned input parameters, and checks the influence of the above-mentioned disturbance on the viewpoint error by comparing the above-mentioned predicted viewpoint with the real viewpoint.
  • the simulation system can apply any type of perturbation to various parameters to simulate the performance of the designed eye-tracking system in actual use.
  • For example, by perturbing the camera calibration parameters, the influence of calibration errors on the system can be tested; by perturbing the image pixels at key points such as the light source reflection point and the pupil center, the influence of key-point detection errors on the system can be tested (a small perturbation-analysis harness is sketched below).
  • This is also used to find the optimal hardware configuration for realizing the high-precision eye-tracking method, which greatly reduces hardware cost and improves the efficiency of optimizing the eye-tracking method.
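  • A minimal sketch of such a perturbation study (assuming a user-supplied predict_viewpoint(params) callable that runs the simulated pipeline end to end and returns the predicted 2D viewpoint on the target; the function and parameter names are placeholders, not part of the original disclosure):

```python
import numpy as np

def perturbation_study(predict_viewpoint, params, key, sigma, true_viewpoint, trials=200, seed=0):
    """Monte-Carlo estimate of the viewpoint error caused by Gaussian noise on one parameter.

    predict_viewpoint: callable mapping a parameter dict to a predicted 2D viewpoint.
    params: nominal parameter dict (camera intrinsics, key-point pixels, ...).
    key: name of the parameter to perturb; sigma: standard deviation of the added noise.
    Returns the mean and maximum distance between predicted and true viewpoints.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        noisy = dict(params)
        noisy[key] = np.asarray(params[key], dtype=float) + rng.normal(0.0, sigma, np.shape(params[key]))
        errors.append(np.linalg.norm(predict_viewpoint(noisy) - true_viewpoint))
    errors = np.asarray(errors)
    return errors.mean(), errors.max()

# Usage (illustrative): perturb the detected glint pixel by 0.5 px and report the viewpoint error.
# mean_err, max_err = perturbation_study(predict_viewpoint, params, key="glint_pixel",
#                                        sigma=0.5, true_viewpoint=np.array([0.10, 0.05]))
```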
  • This embodiment also provides an embodiment of a device for tracking the direction of sight, which is used to implement the above embodiments and preferred implementation modes, and what has already been described will not be repeated.
  • the term "module” may be a combination of software and/or hardware that realizes a predetermined function.
  • The devices described in the following embodiments are preferably implemented in software, but implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
  • the gaze direction tracking device 1100 includes an acquisition module 1101 , a key point determination module 1102 , a gaze reconstruction module 1103 and a viewpoint determination module 1104 .
  • Each unit included in the gaze direction tracking device 1100 will be described in detail below.
  • the collection module 1101 is configured to use multiple light sources to provide corneal reflection to the user's eyes, and use multiple cameras to capture images containing the user's face;
  • the user stares at the target object, uses multiple light sources to emit infrared light sources to the user's eyes, and uses multiple cameras to capture in real time an image including the pupil imaging point and the position of the light source reflection point obtained by reflecting the infrared light source through the cornea.
  • Multiple cameras can be used to capture images of the user's face in real time when the user is looking at the scene.
  • The image containing the user's face does not have to include only the user's face area; the captured image merely needs to contain the user's face region.
  • multiple cameras may focus on collecting the user's face area to obtain an image of the user's face containing clear facial features.
  • The user's gaze target can be a display device for outputting image or text content, such as a screen, a monitor or a head-mounted display, or a non-display device, such as a windshield.
  • the multiple cameras, the multiple light sources and the target object are facing the user, and the fields of view of the multiple cameras do not include the multiple light sources and the above target object.
  • The multiple light sources, the multiple cameras and the target object are arranged on the same side; the present invention does not limit the specific numbers of cameras, light sources and target objects, nor the specific arrangement and placement of the devices.
  • The positional relationship between the multiple light sources, the multiple cameras and the target object in the gaze tracking system can be preset or obtained by calibration.
  • the key point determination module 1102 is used to determine the light source reflection point coordinates and pupil center coordinates in the world coordinate system by combining the human eye feature set obtained from the image containing the human face and hardware calibration parameters;
  • the human eye feature set includes: light source imaging points and pupil imaging points.
  • This embodiment uses existing technology to perform detection and feature extraction on the images containing the human face.
  • This application does not limit the specific technology for obtaining the human eye feature set; traditional image processing methods, deep-learning-based image processing methods and the like may be used.
  • The hardware calibration parameters include the internal and external parameters of the multiple camera coordinate systems and the geometric positional relationships, where the geometric positional relationships include: the first positional relationship between the multiple light source coordinate systems and the multiple camera coordinate systems, the second positional relationship between the multiple light source coordinate systems and the target object coordinate system, and the third positional relationship between the multiple camera coordinate systems and the target object coordinate system. When any two of these positional relationships are obtained, the remaining one can be determined through the spatial transformation relationship and the two known relationships.
  • the mapping relationship between any pixel coordinates and the world coordinate system, and the mapping relationship between the coordinate system of any hardware and the world coordinate system can be obtained.
  • the coordinates of the light source reflection point and the pupil center coordinates in the world coordinate system need to be determined based on the position of multiple cameras and multiple light sources in the same world coordinate system according to the human eye feature set.
  • the positions of multiple light sources and multiple cameras in the world coordinate system can be respectively determined according to the mapping relationship included in the hardware calibration parameters.
  • the light source reflection point is the reflection point of the light emitted from the center of the light source on the surface of the cornea.
  • the light source imaging point is the imaging of the light source reflection point in the collected image containing the user's face.
  • The pupil center is the center point of the pupil area.
  • the pupil imaging point is the imaging of the pupil refraction point after the pupil center is refracted by the cornea in the collected image including the user's face.
  • the cornea of the human eye is modeled as a sphere, the center of corneal curvature is the sphere center of the sphere, the corneal radius is the distance from the corneal curvature center to the surface of the corneal sphere, and the optical axis direction is the direction of the line connecting the pupil center and the corneal curvature center.
  • the human eyeball can also be modeled as the eyeball center, which is the spherical center of the eyeball, and the eyeball center is also located on the optical axis.
  • the radius of curvature of the cornea can be obtained by combining the position of the light source and the imaging point of the light source.
  • The light source imaging point is the image of the light source reflection point in the captured image containing the user's face; according to the camera imaging principle, combined with the position coordinates and the internal and external parameters of the multiple cameras that captured the image, the coordinates of the light source reflection point in the world coordinate system are determined. It should be noted that there may be one light source or multiple light sources; when multiple light sources are working, each has a corresponding light source reflection point, and the coordinates of each light source reflection point in the world coordinate system are determined in the same way as above.
  • The pupil imaging point is the image, in the captured image containing the user's face, of the pupil refraction point formed after the pupil center is refracted by the cornea; according to the camera imaging principle, combined with the position coordinates and the internal and external parameters of the cameras that captured the image, the coordinates of the pupil center in the world coordinate system are determined.
  • the world coordinate system may be any one of the following: a light source coordinate system, a camera coordinate system, and a target object coordinate system. That is, the world coordinate system can be the stereo coordinate system of any specified light source, any camera or target object.
  • The mapping relationship between any pixel coordinates and the world coordinate system, and between the coordinate system of any hardware and the world coordinate system, can be obtained through the hardware calibration parameters.
  • the face feature set is two-dimensional information in the pixel coordinate system.
  • the key point determining module 1102 includes:
  • the first determination unit 11021 is configured to, if the world coordinate system is the light source coordinate system, determine the light source reflection point coordinates and the pupil center coordinates in the light source coordinate system from the human eye feature set through the internal and external parameters of the multiple cameras, in combination with the first positional relationship, or in combination with the second positional relationship and the third positional relationship;
  • the second determination unit 11022 is configured to, if the world coordinate system is the camera coordinate system, determine the light source reflection point coordinates and the pupil center coordinates in the multiple camera coordinate systems from the human eye feature set using the internal and external parameters of the multiple cameras;
  • the third determination unit 11023 is configured to, if the world coordinate system is the target object coordinate system, determine the light source reflection point coordinates and the pupil center coordinates in the target object coordinate system from the human eye feature set through the internal and external parameters of the multiple cameras, in combination with the third positional relationship, or in combination with the first positional relationship and the second positional relationship.
  • this application does not limit the method of determining the internal and external parameters of multiple cameras, which can be obtained through preset factory parameters, or through calibration.
  • The application also does not limit the method of calibrating the internal and external parameters of the multiple cameras; for example, a calibration board may be used to calibrate them.
  • The key point determination module 1102 further includes: a calibration unit 11024, configured to obtain the above-mentioned geometric positional relationship through calibration.
  • the calibration process aims to obtain the geometric positional relationship between multiple cameras, multiple light sources, and the target object.
  • Since they are arranged on the same side, the cameras cannot directly observe the target object and the light sources, or can observe only a part of them.
  • This application therefore combines the internal and external parameters of the multiple cameras and uses at least one of a plane mirror and an auxiliary camera to transmit calibration information, so as to obtain the geometric positional relationship.
  • the above-mentioned calibration unit 11024 includes a first calibration unit 24100, which is used to obtain the above-mentioned geometric positional relationship by combining the internal and external parameters of the above-mentioned multiple cameras, and using a plane mirror and an auxiliary camera to transmit calibration information for calibration, wherein,
  • the above-mentioned first calibration unit 24100 includes:
  • the first calibration subunit 24101 is used for the first auxiliary camera to acquire multiple first marker images of the above-mentioned target object containing the first marker reflected by the above-mentioned plane mirror in multiple different postures;
  • the second calibration subunit 24102 is configured to combine the internal and external parameters of the above-mentioned multiple cameras, and calculate the above-mentioned third positional relationship based on the above-mentioned multiple first marker images based on orthogonal constraints;
  • the third calibration subunit 24103 is used for the second auxiliary camera to acquire the second marker image including the multiple light sources, and to obtain the first positional relationship based on the second marker image in combination with the internal and external parameters of the multiple cameras, wherein the second auxiliary camera is a stereo vision system;
  • the fourth calibration subunit 24104 is configured to determine the second positional relationship according to the first positional relationship and the third positional relationship.
  • In this approach, the pose relationship of the system is solved from the orthogonal constraint, based on the pose relationship established between the target object and its virtual image in the plane mirror. Because the position of the plane mirror must be changed many times, the cameras have to collect multiple mirror-reflected images and the mirror movements must satisfy preset conditions, so the operation is relatively complicated and the efficiency is low.
  • Moreover, the linear solution of this mirror-based calibration method, derived from the orthogonal constraint, is usually sensitive to noise, and the accuracy of the calibrated positional relationships depends on the distance from the plane mirror to the cameras and on the rotation angle of the plane mirror.
  • the calibration unit 11024 includes a second calibration unit 24200, configured to obtain the geometric positional relationship by combining the internal and external parameters of the multiple cameras and using auxiliary cameras to transmit calibration information for calibration, wherein the second calibration unit 24200 includes:
  • the fifth calibration subunit 24201 is used for the third auxiliary camera to acquire the third marker image including the multiple light sources, and to obtain the first positional relationship based on the third marker image in combination with the internal and external parameters of the multiple cameras, wherein the third auxiliary camera is a stereo vision system;
  • the sixth calibration subunit 24202 is used to set the fourth auxiliary camera so that its field of view includes the multiple cameras and the third auxiliary camera, with a calibration plate placed beside the third auxiliary camera; the multiple cameras capture a fourth marker image including the calibration plate area, while the third auxiliary camera acquires a fifth marker image of the target object containing the fifth marker;
  • the seventh calibration subunit 24203 is configured to use the positional relationship between the fourth auxiliary camera and the multiple cameras as a pose conversion bridge and, in combination with the internal and external parameters of the third auxiliary camera and the multiple cameras, determine the third positional relationship according to the fourth marker image and the fifth marker image;
  • the eighth calibration subunit 24204 is configured to determine the second positional relationship according to the first positional relationship and the third positional relationship.
  • In this approach, additional auxiliary cameras and a calibration board are introduced: the calibration board is placed within the working range of the multiple cameras, and the coordinate-system conversions through the auxiliary cameras and the calibration board finally yield the calibration of the pose relationships among the multiple cameras, the multiple light sources and the target object.
  • This method is simple in principle and has high theoretical accuracy. In actual operation, however, because the positional relationship between the multiple cameras and the target object is already fixed and the multiple cameras cannot capture the target object, when the calibration board and the auxiliary cameras are arranged as required above, the angle between the optical axis of the auxiliary camera and the normal vectors of the calibration board and of the display may be too large; this can lead to poorly captured calibration patterns and large marker extraction errors, making the precision of the conversion difficult to guarantee.
  • In addition, introducing auxiliary cameras increases the cost and complicates the calibration process.
  • the calibration unit 11024 includes a third calibration unit 24300, which is used to obtain the geometric positional relationship by combining the internal and external parameters of the multiple cameras and using a plane mirror to transmit calibration information for calibration, wherein the third calibration unit 24300 includes:
  • the ninth calibration subunit 24301 is configured to use the above-mentioned plane mirror with no less than four marking points attached thereto as an aid, and the above-mentioned multiple cameras acquire reflection images with the above-mentioned multiple light sources, the above-mentioned target object and including the above-mentioned mark points;
  • the tenth calibration subunit 24302 is configured to calculate the coordinates of each of the above-mentioned marking points, the above-mentioned multiple light sources and the above-mentioned target object in the above-mentioned multiple camera coordinate systems, the coordinates of the specular light source and the coordinates of the specular target object respectively according to the above-mentioned reflected image;
  • the eleventh calibration subunit 24303 is used to reconstruct the mirror plane according to the coordinates of all the above-mentioned marker points, and confirm the above-mentioned first positional relationship and the above-mentioned third positional relationship according to the principle of specular reflection in combination with the coordinates of the above-mentioned light source on the specular surface and the coordinates of the above-mentioned specular target object;
  • the twelfth calibration subunit 24304 is configured to determine the second positional relationship according to the first positional relationship and the third positional relationship.
  • The third calibration unit 24300 uses the cameras to collect reflection images toward the mirror; these images contain the reflections in the plane mirror of all the cameras, the light sources and the display with its markers. From the reflection images containing all the marker points, the coordinates of each marker point in the multiple camera coordinate systems are determined; and from the mirror-projected light sources and target object in the reflection images, the mirrored light source coordinates (the virtual images of the light sources) and the mirrored target object coordinates in the multiple camera coordinate systems are determined.
  • The mirror plane in the multiple camera coordinate systems is reconstructed from the coordinates of at least four marker points, and by the specular reflection principle the mirrored light source is mapped back to the actual light source in the multiple camera coordinate systems; according to the mirrored light source coordinates combined with the mirror plane,
  • the first positional relationship is determined.
  • the third positional relationship is determined, and the second positional relationship is determined according to the first positional relationship and the third positional relationship.
  • This third calibration approach relies only on a fixed plane mirror, avoiding the complicated operations caused by multiple movements; it does not rely on auxiliary cameras, avoids problems such as excessively large viewing angles between the auxiliary camera and the captured objects, and effectively simplifies the calibration process and reduces cost.
  • the line-of-sight reconstruction module 1103 is configured to determine the line-of-sight optical axis according to the light source reflection point coordinates and the pupil center coordinates, and to reconstruct the line-of-sight visual axis from the line-of-sight optical axis through the compensation angle;
  • the sight line reconstruction module 1103 includes:
  • the first reconstruction unit 11031 is configured to determine the coordinates of the center of corneal curvature according to the above-mentioned coordinates of the reflection point of the light source and the radius of curvature of the cornea;
  • the second reconstruction unit 11032 is configured to determine the line-of-sight optical axis of the line connecting the pupil center and the corneal curvature center according to the pupil center coordinates and the corneal curvature center coordinates.
  • the center of curvature of the cornea in the world coordinate system can be calculated and determined.
  • The optical axis direction is the direction of the line connecting the pupil center and the corneal curvature center, so a simple calculation on the two sets of coordinates determines it; for example, subtracting the corneal curvature center coordinates from the pupil center coordinates gives the optical axis direction.
  • For each camera, the corneal curvature center and the pupil center are coplanar with the pupil imaging point on that camera's imaging plane and with the camera's optical center.
  • With two such planes, one per camera, the unit vector of their line of intersection is the line-of-sight optical axis.
  • Equivalently, the line connecting the pupil center and the corneal curvature center is the line-of-sight optical axis.
  • the line-of-sight tracking device based on corneal pupil reflection can only find the optical axis of the eye.
  • the user's line of sight is determined by the visual axis, and there is a compensation angle between the visual axis and the optical axis.
  • This application does not take into account that eye deformation or abnormality may affect the user's compensation angle.
  • the eyeball and cornea are not spherical, which leads to different reconstruction errors when the gaze estimation method based on the cornea-pupil reflection reconstructs the eye's optical axis when gazing at different positions.
  • the structure of the human eye determines that there is a compensation angle between the optical axis of the line of sight and the visual axis of the line of sight, which is called the Kappa angle.
  • the Kappa angle is used as a fixed compensation between the optical axis and the visual axis.
  • the compensation angle may not be calibrated, or the compensation angle may be calibrated through a few points, so that it is easier to implement eye tracking.
  • the sight line reconstruction module 1103 includes:
  • the third reconstruction unit 11033 is used to acquire a set of sample images when the above-mentioned user gazes at each preset gaze point;
  • the fourth reconstruction unit 11034 is configured to determine the first compensation angle sample according to the sample features extracted from each group of sample images;
  • the fifth reconstruction unit 11035 is configured to traverse all the first compensation angle samples, and obtain the compensation angles through screening and purification.
  • the fourth reconstruction unit 11034 is configured to: extract sample features for each group of sample images and reconstruct the first line-of-sight optical axis according to the sample features; restore the first line-of-sight visual axis by combining the pupil center coordinates with the preset gaze point; and acquire a first compensation angle sample according to the first line-of-sight optical axis and the first line-of-sight visual axis.
  • the fifth reconstruction unit 11035 is configured to: calculate the center point of all the first compensation angle samples and remove samples that are not within the first threshold range; continue to traverse, filter and purify the remaining samples until the difference between the current center point and the previous center point is below the second threshold; and obtain the compensation angle from all the purified samples.
  • the preset positions are known, and the gaze points are uniformly distributed in space.
  • The user gazes at a gaze point at a known position; from the sample features contained in the collected sample images, the line-of-sight optical axis is reconstructed based on the cornea-pupil reflection described above, the line-of-sight visual axis is restored by combining the pupil center coordinates with the preset gaze point, and the compensation angle sample corresponding to that gaze point is further obtained.
  • Errors, such as detection errors, arise in the process of obtaining the compensation angle sample set based on the preset gaze points.
  • The present invention therefore purifies the compensation angle samples, eliminating abnormal samples and retaining high-quality, reasonably distributed samples within the threshold range, which to a certain extent ensures both the accuracy and the applicability of the final compensation angle.
  • In this way, the compensation angle may be left uncalibrated or calibrated through only a few points, making eye tracking easier to achieve.
  • the above-mentioned line-of-sight reconstruction module introduces a dynamic compensation model, and performs individualized compensation calibration for different users before using the above-mentioned dynamic compensation model, so as to track a line-of-sight direction with higher accuracy.
  • the line-of-sight reconstruction module 1103 further includes: a dynamic compensation unit 11036, configured to determine, through a dynamic compensation model, the deviation between the predicted line-of-sight observation point and the real line-of-sight observation point for the collected data, and to obtain the compensation angle according to that deviation.
  • Each frame of the collected data is input into the trained dynamic compensation model to predict the deviation, where the deviation is the difference between the predicted line-of-sight observation point and the real line-of-sight observation point, and the compensation angle is obtained from this deviation.
  • the predicted line-of-sight observation point can be calculated by the above-mentioned device for reconstructing the line-of-sight optical axis.
  • The embodiment of the present invention does not limit the type of the dynamic compensation model, which can be a neural network, a random forest, and the like.
  • the above-mentioned dynamic compensation unit 11036 includes an initialization subunit to obtain a dynamic compensation model suitable for the current user.
  • the dynamic compensation unit 11036 includes an initialization subunit 36100, configured to initialize the dynamic compensation model before it is used, wherein the initialization subunit 36100 includes: a first initialization subunit 36110, used to obtain a group of initial sample images when the user gazes at each preset initial point; and a second initialization subunit 36120, used to extract initial sample features from each group of initial sample images and to initialize the model through few-shot learning, obtaining a dynamic compensation model that fits the current user.
  • the user gazes at a preset initial point and the positions of the preset initial point are known and evenly distributed in space.
  • a set of sample images is acquired for each preset initial point, and the number of gaze points collected is at least 3.
  • initial sample features are extracted for each group of initial sample images, and the initial sample features are initialized through small-sample learning to obtain the aforementioned dynamic compensation model that fits the current aforementioned user.
  • the initial sample features extracted from the sample image include, but are not limited to: the three-dimensional coordinates of the center of the corneal ball, the Euler angle of the optical axis of the eyeball, the sight observation point calculated based on the optical axis, and the deflection angle of the face.
  • Through few-shot learning, the dynamic compensation model retains generalization ability even when the learned categories change within a short period of time.
  • The dynamic compensation model initialized with the current user's prior knowledge provides a good search direction for the data subsequently collected by that user, which reduces prediction errors and improves the efficiency of deviation prediction.
  • Embodiments of the present invention do not limit the specific few-shot learning method; the MAML method, for example, may be used.
  • the dynamic compensation unit 11036 includes a training subunit 36200 for training the dynamic compensation model, wherein the training subunit 36200 includes: a first training subunit 36210 for collecting multiple Multiple sets of sample data when the user stares at the preset calibration points; the second training subunit 36220 is used to clean the above multiple sets of sample data, and extract training sample features from the above multiple sets of samples after cleaning; the third training subunit , for using small sample learning to train an initial dynamic compensation model according to the above training sample characteristics, and obtain the above dynamic compensation model after training.
  • the initial dynamic compensation model is turned into the trained dynamic compensation model through the training subunit 36200. Multiple sets of sample data are collected: each tester is required to gaze at several preset calibration points evenly distributed on the target object, and the same number of samples is collected for each point. After the initial sample data are obtained, they are cleaned to remove samples that do not meet the requirements of the line-of-sight estimation algorithm, such as incomplete light source reflection points or severely blurred images. Sample features are then extracted for each cleaned set of sample images, and the initial dynamic compensation model is trained with few-shot learning according to the training sample features to obtain the trained dynamic compensation model.
  • the training sample features extracted from the multiple sets of sample data include, but are not limited to: the three-dimensional coordinates of the corneal sphere center, the Euler angles of the eyeball optical axis, the line-of-sight observation point computed from the optical axis, and the face deflection angle.
  • after the collected data are fed into the trained dynamic compensation model, it outputs a highly accurate deviation between the predicted line-of-sight observation point and the real line-of-sight observation point, from which a high-precision compensation angle is obtained.
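  • The cleaning and feature-extraction step described above could be expressed as in the following sketch; the field names, thresholds, and sample layout are assumptions used only for illustration.

```python
def clean_samples(samples, min_glints=2, blur_threshold=100.0):
    """Drop samples that do not satisfy the gaze-estimation algorithm,
    e.g. missing light-source reflection points or severely blurred frames.

    Each sample is assumed to be a dict with hypothetical keys:
      'glints'    : list of detected light-source reflection points
      'sharpness' : a focus metric (higher means sharper)
      'features'  : dict of extracted sample features
    """
    kept = []
    for s in samples:
        if len(s.get('glints', [])) < min_glints:
            continue                      # incomplete corneal reflections
        if s.get('sharpness', 0.0) < blur_threshold:
            continue                      # severely blurred image
        kept.append(s)
    return kept

def to_feature_vector(sample):
    """Concatenate the training sample features listed above into one vector."""
    f = sample['features']
    return (list(f['cornea_center_3d'])        # 3 values
            + list(f['optical_axis_euler'])    # yaw / pitch / roll
            + list(f['optical_axis_gaze_pt'])  # gaze point computed from the optical axis
            + [f['face_yaw']])                 # face deflection angle
```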
  • the above line-of-sight reconstruction module first initializes and generates a dynamic compensation model that fits the current user, then uses the generated model to predict deviations for the data collected from the current user, obtains a better compensation angle, and finally reconstructs a high-precision visual axis of the line of sight.
  • the viewpoint determination module 1104 is configured to determine a viewpoint on the above-mentioned target object according to the above-mentioned visual axis of the line of sight and the position of the target object in the above-mentioned world coordinate system.
  • the sight axis of the above-mentioned line of sight is located in the world coordinate system.
  • to determine the viewpoint of the visual axis on the above-mentioned target object, the position of the target object in the world coordinate system must be determined.
  • the above-mentioned geometric positional relationship provides the position of the target object in the world coordinate system, from which the plane of the target object in world coordinates is further determined.
  • the problem of determining the user's viewpoint is thus transformed into finding the intersection of the line-of-sight visual-axis vector and the plane of the target object in the same coordinate system.
  • the present invention does not limit the method used to solve this intersection problem.
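  • A minimal sketch of one such ray-plane intersection, assuming the target plane is given by a point and a normal in the world coordinate system; the variable names are illustrative and not part of this application.

```python
import numpy as np

def intersect_gaze_with_plane(origin, visual_axis, plane_point, plane_normal, eps=1e-9):
    """Intersect the line-of-sight visual axis with the target-object plane.

    origin       : 3-D start of the visual axis (e.g. corneal curvature centre)
    visual_axis  : 3-D direction vector of the visual axis
    plane_point  : any point on the target-object plane (world coordinates)
    plane_normal : normal vector of that plane
    Returns the viewpoint as a 3-D point, or None if the axis is parallel
    to the plane.
    """
    d = np.asarray(visual_axis, dtype=float)
    n = np.asarray(plane_normal, dtype=float)
    denom = float(n @ d)
    if abs(denom) < eps:
        return None                              # gaze parallel to the plane
    t = float(n @ (np.asarray(plane_point, dtype=float) - np.asarray(origin, dtype=float))) / denom
    return np.asarray(origin, dtype=float) + t * d
```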
  • the gaze direction tracking device further includes a simulation module 1105 for simulating the tracking accuracy and anti-interference ability of the gaze tracking system under various hardware conditions, eyeball states, algorithm designs, and other factors.
  • the above device further includes a simulation module 1105 for performing simulation verification, analysis and optimization through preset eyeball parameters, the aforementioned compensation angle, the aforementioned hardware calibration parameters and preset viewpoints.
  • the simulation module 1105 includes: a first simulation unit 11051, configured to simulate and compute the reconstructed light source imaging point and the reconstructed pupil imaging point; a second simulation unit 11052, configured to determine the predicted viewpoint from the reconstructed light source imaging point and the reconstructed pupil imaging point according to the above-mentioned gaze direction tracking method; and a third simulation unit 11053, configured to statistically analyze the comparison between the preset viewpoint and the predicted viewpoint, and to carry out verification and optimization based on the analysis results.
  • the first simulation unit 11051 includes:
  • the first simulation subunit 51100 is used to determine the light source-cornea-camera angle according to the corneal center in the above-mentioned preset eyeball parameters and the above-mentioned hardware calibration parameters; based on this angle and the corneal curvature radius in the preset eyeball parameters, the coordinates of the reconstructed light source reflection point are determined using the spherical reflection principle, and the reconstructed light source imaging point is computed from these coordinates combined with the hardware calibration parameters;
  • the second simulation subunit 51200 is used to determine the first visual axis according to the coordinates of the preset viewpoint and the corneal center in the preset eyeball parameters, deduce the first optical axis from the first visual axis and the compensation angle, determine the reconstructed pupil center coordinates from the first optical axis combined with the pupil-cornea center distance in the preset eyeball parameters, and compute the reconstructed pupil imaging point from the pupil center coordinates combined with the hardware calibration parameters, as sketched below.
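  • The following sketch illustrates the pupil-side simulation step only; the parameterization of the compensation angle as two rotations and the use of a plain pinhole camera are assumptions for illustration, not the specific convention of this application.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def simulate_pupil_image(viewpoint, cornea_center, kappa_h, kappa_v,
                         pupil_cornea_dist, K, R, t):
    """Reconstruct the pupil centre for a preset viewpoint and project it.

    1. first visual axis: unit vector from the cornea centre to the viewpoint
    2. first optical axis: visual axis rotated back by the compensation
       angles (kappa_h about y, kappa_v about x -- a simplified convention)
    3. pupil centre: cornea centre + pupil-cornea distance along the optical axis
    4. image point: pinhole projection with intrinsics K and extrinsics R, t
    """
    v = np.asarray(viewpoint, float) - np.asarray(cornea_center, float)
    visual_axis = v / np.linalg.norm(v)
    optical_axis = rot_y(-kappa_h) @ rot_x(-kappa_v) @ visual_axis
    pupil_center = np.asarray(cornea_center, float) + pupil_cornea_dist * optical_axis
    cam = R @ pupil_center + t                   # world -> camera coordinates
    uvw = K @ cam
    return uvw[:2] / uvw[2], pupil_center        # pixel coordinates, 3-D pupil centre
```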
  • Eyeball parameters and geometric positional relationships are preset according to the scene for simulating gaze tracking.
  • Eyeball parameters include, but are not limited to: the three-dimensional coordinates of the eyeball center, the eyeball radius, the three-dimensional coordinates of the corneal sphere center, the corneal sphere radius, the distance from the eyeball center to the corneal sphere center, the three-dimensional coordinates of the pupil center, the three-dimensional coordinates of the iris center, the iris radius, the corneal refractive index, the refractive index of the lens, the eyeball optical axis, the eyeball visual axis, and the Kappa angle; these parameters are used to build the three-dimensional eyeball model.
  • Hardware calibration parameters include, but are not limited to: camera intrinsic parameters, camera extrinsic parameters, camera distortion parameters, camera stereo calibration parameters, camera 3D coordinates, light source 3D coordinates, and target object 3D coordinates.
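  • One possible way to organize these preset parameter sets for the simulation is sketched below; the field names and groupings are assumptions and carry no special meaning in this application.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class EyeballParams:
    """Preset eyeball model used by the simulation (illustrative fields)."""
    eyeball_center: np.ndarray          # 3-D coordinates
    eyeball_radius: float
    cornea_center: np.ndarray
    cornea_radius: float
    eyeball_to_cornea: float            # eyeball centre to corneal sphere centre
    pupil_center: np.ndarray
    iris_center: np.ndarray
    iris_radius: float
    cornea_refractive_index: float
    lens_refractive_index: float
    kappa_angle: tuple                  # (horizontal, vertical) compensation angles

@dataclass
class HardwareCalib:
    """Hardware calibration parameters assumed by the simulation."""
    K: np.ndarray                                        # camera intrinsics
    dist: np.ndarray                                     # distortion coefficients
    R: np.ndarray                                        # camera extrinsic rotation
    t: np.ndarray                                        # camera extrinsic translation
    camera_pos: np.ndarray                               # camera 3-D coordinates
    light_sources: list = field(default_factory=list)   # light source 3-D positions
    target_corners: list = field(default_factory=list)  # target object 3-D corners
```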
  • based on the actual input parameters, the simulation system verifies whether the line-of-sight tracking method is implemented correctly, tests the impact of adding disturbances to the input parameters on the viewpoint error, and searches for the optimal configuration of the above-mentioned multiple light sources, the above-mentioned multiple cameras, and the target object.
  • the simulation module 1105 supports positioning multiple preset gaze points to help verify whether the implementation of the gaze direction tracking method is correct. Specifically, taking a screen as the target object of the simulation system, with multiple gaze points of known position on the screen: when the simulated eyeball in the simulation module 1105 gazes at a certain gaze point, the visual axis of the line of sight can be computed from the position of that gaze point and the corneal sphere center coordinates contained in the preset eyeball parameters; the optical axis is then deduced from the preset compensation angle, and finally the pupil center coordinates are reconstructed from the preset distance between the pupil center and the corneal sphere center.
  • the preset viewpoints can be specified in advance or randomly generated at multiple different positions so as to simulate tracking over a variety of gaze angles. Specifically, preset points at different positions are generated to simulate the gaze-tracking error when the eyeball looks at different positions on the target object.
  • the above-mentioned third simulation unit 11053 includes a third simulation subunit, used to verify whether the implementation of the gaze direction tracking method is correct, to test the influence of added disturbances on the viewpoint error, and to determine the configuration of the above-mentioned multiple light sources, the above-mentioned multiple cameras, and the target object.
  • the simulation module 1105 involves multiple variables, including the geometric positional relationships, the eyeball parameters, and the viewpoints. An abnormality in any of these variables will cause the correctness verification to fail. If verification fails, the abnormal variables can be screened out by controlled-variable experiments and similar methods, which improves the efficiency of optimizing the gaze tracking method.
  • the above-mentioned simulation module 1105 calculates the predicted viewpoint after adding disturbance to the above-mentioned input parameters, and checks the influence of the above-mentioned disturbance on the viewpoint error by comparing the above-mentioned predicted viewpoint with the real viewpoint.
  • the simulation module 1105 can apply any type of perturbation to each parameter to simulate the performance of the designed gaze tracking system in actual use. By comparing the viewpoint predictions calculated with the perturbed parameters against the true viewpoint values, including statistical analysis of the Euclidean distance, angular error, variance, and so on, the subsequent design of the gaze tracking system and the optimization of its algorithms can be guided in a targeted way.
  • for example, by perturbing the camera calibration parameters, the degree to which calibration error affects the system can be tested; by perturbing the image pixels at key points such as the light source reflection point and the pupil center, the degree to which keypoint detection error affects the system can be tested.
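  • A sketch of such a perturbation study is given below; the callable `run_pipeline` stands in for the full gaze tracking method and is hypothetical, as is the choice of Gaussian noise and the angular-error convention.

```python
import numpy as np

def perturbation_study(run_pipeline, base_params, sigma, true_viewpoints, trials=100, seed=0):
    """Monte-Carlo check of how a disturbance on one parameter set affects
    the viewpoint error.

    run_pipeline   : callable mapping perturbed parameters to (n, 3) predicted viewpoints
    base_params    : array of parameter values to perturb (e.g. calibration values,
                     or glint / pupil pixel coordinates)
    sigma          : standard deviation of the added Gaussian disturbance
    true_viewpoints: (n, 3) ground-truth viewpoints
    """
    rng = np.random.default_rng(seed)
    dists, angles = [], []
    for _ in range(trials):
        noisy = np.asarray(base_params, float) + rng.normal(scale=sigma, size=np.shape(base_params))
        pred = run_pipeline(noisy)                       # (n, 3) predicted viewpoints
        err = pred - true_viewpoints
        dists.append(np.linalg.norm(err, axis=1))        # Euclidean distance per viewpoint
        # angular error measured from the world origin (illustrative convention)
        cos = np.sum(pred * true_viewpoints, axis=1) / (
            np.linalg.norm(pred, axis=1) * np.linalg.norm(true_viewpoints, axis=1))
        angles.append(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
    dists, angles = np.concatenate(dists), np.concatenate(angles)
    return {"dist_mean": dists.mean(), "dist_var": dists.var(),
            "angle_mean": angles.mean(), "angle_var": angles.var()}
```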
  • through extensive experiments and data analysis, the simulation module 1105 is used to find the optimal hardware configuration for realizing a high-precision gaze direction tracking device, which greatly reduces hardware cost and improves the efficiency of optimizing the gaze tracking device.
  • an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute, via the executable instructions, any one of the gaze direction tracking methods.
  • a storage medium is also provided, and the storage medium includes a stored program, wherein when the program is running, the device where the storage medium is located is controlled to execute any one of the gaze direction tracking methods.
  • the disclosed technical content can be realized in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the above-mentioned units may be a logical function division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be in electrical or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the above integrated units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)
  • Position Input By Displaying (AREA)

Abstract

本发明公开了一种视线方向追踪方法和装置。其中,该视线方向追踪方法包括:使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含用户人脸的图像;通过从包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;根据光源反射点坐标和瞳孔中心坐标确定视线光轴,视线光轴通过补偿角度重建视线视轴;根据视线视轴和目标对象在世界坐标系的位置,确定在目标对象上的视点。本发明解决了视线追踪硬件成本高、算法优化效率低和估计精度低的技术问题。

Description

视线方向追踪方法和装置
本申请要求于2021年8月5日递交的中国专利申请第202110897215.8号的优先权,在此全文引用上述中国专利申请公开的内容以作为本申请的一部分。
技术领域
本发明涉及图像处理技术领域,具体而言,涉及一种视线方向追踪方法和装置。
背景技术
在相关技术中,实时检测驾驶员的视线方向和视线的落点位置,可以为辅助驾驶和各种产品应用提供重要的信息输入。比如,视线的角度方向可以用于判断当前驾驶员的观察区域,检测驾驶员分心的危险驾驶行为等,视线的落点位置可以用于判断当前用户的注意目标,从而调整增强现实抬头显示器系统的显示位置等。
现有技术中视线方向追踪方法通过获取的包含人眼的图像,通过分析中人眼的特征估计视线方向和视点,上述视线追踪方法通常分为基于外观法和基于角膜反射法。基于外观法一般依赖人眼图像或人脸图像中包含的外观特征,包括眼睑位置、瞳孔位置、虹膜位置、内/外眼角、人脸朝向等特征估计视线方向和视点,基于外观的方法不是利用几何映射关系,而是将人眼图像对应于高维空间的点,进而学习从给定特征空间的点到屏幕坐标的映射关系。基于角膜反射法,则除了依赖外观法的部分图像特征,还依赖光源在角膜上的反射形成的普尔钦斑,通过眼睛运动特征建立从几何模型的瞳孔中心到凝视校准点的映射,一般来说,角膜反射法比外观法有更高的精度,因此几乎所有的成熟商业产品都基于角膜反射法,但是角膜反射法还需要复杂的设备以及精确的校准过程,并不具有普遍适用性。针对角膜反射法硬件制作成本高、算法优化效率低和视线估计精度低的问题,目前尚未提出有效的解决方案。
发明内容
本发明实施例提供了一种视线方向追踪方法和装置,以至少解决视线追踪硬件成本高、算法优化效率低和估计精度低的技术问题。
根据本发明实施例中的一个方面,提供了一种视线方向追踪方法,包括:使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含上述用户人脸的图像;通过从上述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;根据上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,上述视线光轴通过补偿角度重建视线视轴;根据上述视线视轴和目标对象在上述世界坐标系的位置,确定在上述目标对象上的视点。
可选的,上述人眼特征集包括:光源成像点,瞳孔成像点。
可选的,上述世界坐标系可选取以下任意一种:光源坐标系、相机坐标系、目标对象坐标系。
可选的,上述方法还通过预设眼球参数、上述补偿角度、上述硬件标定参数和预设视点进行仿真验证、分析和优化。
可选的,上述多个相机、上述多个光源和上述目标对象朝向用户,且上述多个相机的视野不包括上述多个光源和上述目标对象。
可选的,上述硬件标定参数包括:上述多个相机坐标系的内外参和几何位置关系,其中上述几何位置关系包括以下:上述多个光源坐标系和上述多个相机坐标系之间的第一位置关系,上述多个光源坐标系和上述目标对象坐标系之间的第二位置关系,上述多个相机坐标系和上述目标对象坐标系之间的第三位置关系。
可选的,上述几何位置关系通过结合上述多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得。
可选的,通过从上述包含人脸的图像获取的人脸特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标,包括:若上述世界坐标系为上述光源坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第一位置关系,或结合上述第二位置关系和上述第三位置关系,确定上述光源 坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;若上述世界坐标系为上述相机坐标系,上述人脸特征集通过上述多个相机的内外参,确定上述多个相机坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;若上述世界坐标系为上述目标对象坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第三位置关系,或结合上述第一位置关系和上述第二位置关系,确定上述目标对象坐标系中的上述光源反射点坐标和上述瞳孔中心坐标。
可选的,根据上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,包括:根据上述光源反射点坐标和角膜曲率半径,确定角膜曲率中心的坐标;根据上述瞳孔中心坐标和上述角膜曲率中心的坐标,确定上述瞳孔中心与上述角膜曲率中心连线的上述视线光轴。
可选的,上述几何位置关系通过结合上述多个相机的内外参,利用平面镜和辅助相机传递标定信息标定获得,包括:第一辅助相机获取上述平面镜以多个不同姿态反射的含有第一标记的上述目标对象的多张第一标记图像;结合上述多个相机的内外参,根据多张上述第一标记图像基于正交约束计算上述第三位置关系;第二辅助相机获取包含上述多个光源的第二标记图像,结合上述多个相机的内外参并基于上述第二标记图像获取上述第一位置关系,其中,上述第二辅助相机为立体视觉系统;根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
可选的,上述几何位置关系通过结合上述多个相机的内外参,利用辅助相机传递标定信息标定获得,包括:第三辅助相机获取包含上述多个光源的第三标记图像,结合上述多个相机的内外参并基于上述第三标记图像获取上述第一位置关系,其中,上述第三辅助相机为立体视觉系统;第四辅助相机设置为其视野包括上述多个相机和上述第三辅助相机,在上述第三辅助相机旁设置标定板,上述多个相机采集包含上述标定板区域的第四标记图像,同时上述第三辅助相机获取含有第五标记的上述目标对象的第五标记图像;将上述第四辅助相机和上述多个相机的位置关系作为姿态转换桥梁,结合第三辅助相机和上述多个相机的内外参,根据上述第四标记图像和上述第五标记图像确定上述第三位置关系;根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
可选的,上述几何位置关系通过结合上述多个相机的内外参,利用平面镜传递标定信息标定获得,包括:利用粘有不少于4个标记点的上述平面镜作为辅助,上述多个相机获取带有上述多个光源、上述目标对象且包含上述标记点的反射图像;依据上述反射图像分别计算各个上述标记点、上述多个光源和上述目标对象在上述多个相机坐标系中的标记点坐标,镜面光源坐标和镜面目标对象坐标;根据所有上述标记点坐标重建镜面平面,并依据镜面反射原理,结合上述镜面光源坐标和上述镜面目标对象坐标确认上述第一位置关系和上述第三位置关系;根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
可选的,上述视线光轴通过补偿角度重建视线视轴之前,上述方法还包括:上述用户凝视每一个预设的凝视点时获取一组样本图像;根据每组上述样本图像提取的样本特征确定第一补偿角度样本;遍历所有上述第一补偿角度样本,通过筛选和提纯获取上述补偿角度。
可选的,根据每组上述样本图像提取的样本特征确定第一补偿角度样本,包括:针对每组上述样本图像提取样本特征,并根据上述样本特征重建出第一视线光轴;基于上述预设的凝视点的真实坐标反推出第一视线视轴;根据上述第一视线光轴和上述第一视线视轴,获取上述第一补偿角度样本。
可选的,遍历所有上述第一补偿角度样本,通过筛选和提纯获取上述补偿角度,包括:求取所有上述第一补偿角度样本的中心点,筛选并去除不在第一阈值范围内的样本;继续遍历筛选和提纯剩余所有样本直至当前中心点与上一次的中心点的差值低于第二阈值,从提纯后的所有样本中获取上述补偿角度。
可选的,上述视线光轴通过补偿角度重建视线视轴之前,上述方法还包括:对采集的数据通过动态补偿模型确定预测视线观测点和真实视线观测点的偏差,根据上述偏差获取上述补偿角度。
可选的,在使用上述动态补偿模型以前对上述动态补偿模型初始化,包括:上述用户凝视每一个预设初始点时获取一组初始样本图像;针对每组上述初始样本图像提取初始样本特征,并根据上述初始样本特征通过小样本学习初始化获得与当前上述用户契合的上述动态补偿模型。
可选的,训练上述动态补偿模型包括:采集多个用户分别凝视预设校准点时的多组样本数据;对上述多组样本数据清洗,并对清洗后的上述多组样本提取训练样本特征;根据上述训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的上述动态补偿模型。
可选的,上述通过预设眼球参数、上述补偿角度、上述硬件标定参数和预设视点进行仿真验证、分析和优化,包括:针对上述预设视点,根据上述眼球参数、上述补偿角度和上述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点;根据上述重建光源成像点和上述重建瞳孔成像点,依据上述视线方向追踪方法确定预测视点;根据上述预设视点和上述预测视点的比较值统计分析,并根据分析结果实施验证和优化。
可选的,针对上述预设视点,根据上述预设眼球参数、上述补偿角度和上述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点,包括:根据上述预设眼球参数中的角膜中心和上述硬件标定参数确定光源角膜相机角度,基于上述光源角膜相机角度和上述预设眼球参数中的角膜曲率半径,结合球面反射原理确定重建光源反射点坐标,根据上述重建光源反射点坐标结合上述硬件标定参数计算上述重建光源成像点;根据上述预设视点的坐标和上述预设眼球参数中的角膜中心确定第一视轴,基于上述第一视轴和上述补偿角度反推出第一光轴,依据上述第一光轴并结合上述预设眼球参数的瞳孔角膜中心距离确定重建瞳孔中心坐标,根据上述瞳孔中心坐标结合上述硬件标定参数计算上述重建瞳孔成像点。
可选的,上述预设视点可提前预设或随机生成于多个不同位置以仿真多种视线角度追踪。
可选的,根据上述预设视点和上述预测视点进行统计分析,并根据分析结果实施验证和优化,包括:验证上述视线方向追踪方法的实现是否正确,测试添加扰动对视点误差的影响和确定上述多个光源、上述多个相机和目标对象配置方法。
根据本发明实施例的另一个方面,还提供了一种视线方向追踪装置,包括:采集模块,用于使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获 包含上述用户人脸的图像;关键点确定模块,用于通过从上述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;视线重建模块,用于根据上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,上述视线光轴通过补偿角度重建视线视轴;视点确定模块,用于根据上述视线视轴和目标对象在上述世界坐标系的位置,确定在上述目标对象上的视点。
可选的,上述装置还包括仿真模块,用于通过预设眼球参数、上述补偿角度、上述硬件标定参数和预设视点进行仿真验证、分析和优化。
可选的,上述硬件标定参数包括:上述多个相机坐标系的内外参和几何位置关系,其中上述几何位置关系包括以下:上述多个光源坐标系和上述多个相机坐标系之间的第一位置关系,上述多个光源坐标系和上述目标对象坐标系之间的第二位置关系,上述多个相机坐标系和上述目标对象坐标系之间的第三位置关系。
可选的,上述关键点确定模块包括:标定单元,用于通过结合上述多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得上述几何位置关系。
可选的,上述关键点确定模块包括:第一确定单元,用于若上述世界坐标系为上述光源坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第一位置关系,或结合上述第二位置关系和上述第三位置关系,确定上述光源坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;第二确定单元,用于若上述世界坐标系为上述相机坐标系,上述人脸特征集通过上述多个相机的内外参,确定上述多个相机坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;第三确定单元,用于若上述世界坐标系为上述目标对象坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第三位置关系,或结合上述第一位置关系和上述第二位置关系,确定上述目标对象坐标系中的上述光源反射点坐标和上述瞳孔中心坐标。
可选的,上述视线重建模块包括:第一重建单元,用于根据上述光源反射点坐标和角膜曲率半径,确定角膜曲率中心的坐标;第二重建单元,用于根据 上述瞳孔中心坐标和上述角膜曲率中心的坐标,确定上述瞳孔中心与上述角膜曲率中心连线的上述视线光轴。
可选的,上述标定单元包括第一标定单元,用于通过结合上述多个相机的内外参,利用平面镜和辅助相机传递标定信息标定获得上述几何位置关系,其中,上述第一标定单元包括:第一标定子单元,用于第一辅助相机获取上述平面镜以多个不同姿态反射的含有第一标记的上述目标对象的多张第一标记图像;第二标定子单元,用于结合上述多个相机的内外参,根据多张上述第一标记图像基于正交约束计算上述第三位置关系;第三标定子单元,用于第二辅助相机获取包含上述多个光源的第二标记图像,结合上述多个相机的内外参并基于上述第二标记图像获取上述第一位置关系,其中,上述第二辅助相机为立体视觉系统;第四标定子单元,用于根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
可选的,上述标定单元包括第二标定单元,用于通过结合上述多个相机的内外参,利用辅助相机传递标定信息标定获得上述几何位置关系,其中,上述第二标定单元包括:第五标定子单元,用于第三辅助相机获取包含上述多个光源的第三标记图像,结合上述多个相机的内外参并基于上述第三标记图像获取上述第一位置关系,其中,上述第三辅助相机为立体视觉系统;第六标定子单元,用于第四辅助相机设置为其视野包括上述多个相机和上述第三辅助相机,在上述第三辅助相机旁设置标定板,上述多个相机采集包含上述标定板区域的第四标记图像,同时上述第三辅助相机获取含有第五标记的上述目标对象的第五标记图像;第七标定子单元,用于将上述第四辅助相机和上述多个相机的位置关系作为姿态转换桥梁,结合第三辅助相机和上述多个相机的内外参,根据上述第四标记图像和上述第五标记图像确定上述第三位置关系;第八标定子单元,用于根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
可选的,上述标定单元包括第三标定单元,用于通过结合上述多个相机的内外参,利用平面镜传递标定信息标定获得上述几何位置关系,其中,上述第三标定单元包括:第九标定子单元,用于利用粘有不少于4个标记点的上述平面镜作为辅助,上述多个相机获取带有上述多个光源、上述目标对象且包含上 述标记点的反射图像;第十标定子单元,用于依据上述反射图像分别计算各个上述标记点、上述多个光源和上述目标对象在上述多个相机坐标系中的标记点坐标,镜面光源坐标和镜面目标对象坐标;第十一标定子单元,用于根据所有上述标记点坐标重建镜面平面,并依据镜面反射原理,结合上述镜面光源坐标和上述镜面目标对象坐标确认上述第一位置关系和上述第三位置关系;第十二标定子单元,用于根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
可选的,上述视线重建模块包括:第三重建单元,用于上述用户凝视每一个预设的凝视点时获取一组样本图像;第四重建单元,用于根据每组上述样本图像提取的样本特征确定第一补偿角度样本;第五重建单元,用于遍历所有上述第一补偿角度样本,通过筛选和提纯获取上述补偿角度。
可选的,上述视线重建模块还包括:动态补偿单元,用于对采集的数据通过动态补偿模型确定预测视线观测点和真实视线观测点的偏差,根据上述偏差获取上述补偿角度。
可选的,上述动态补偿单元包括初始化子单元,用于在使用上述动态补偿模型以前对上述动态补偿模型初始化,其中,上述初始化子单元包括:第一初始化子单元,用于上述用户凝视每一个预设初始点时获取一组初始样本图像;第二初始化子单元,用于针对每组上述初始样本图像提取初始样本特征,并根据上述初始样本特征通过小样本学习初始化获得与当前上述用户契合的上述动态补偿模型。
可选的,上述动态补偿单元包括训练子单元,用于训练上述动态补偿模型,其中,上述训练子单元包括:第一训练子单元,用于采集多个用户分别凝视预设校准点时的多组样本数据;第二训练子单元,用于对上述多组样本数据清洗,并对清洗后的上述多组样本提取训练样本特征;第三训练子单元,用于根据上述训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的上述动态补偿模型。
可选的,上述仿真模块包括:第一仿真单元,用于针对上述预设视点,根据上述眼球参数、上述补偿角度和上述硬件标定参数仿真计算重建光源成像点 和重建瞳孔成像点;第二仿真单元,用于根据上述重建光源成像点和上述重建瞳孔成像点,依据上述视线方向追踪方法确定预测视点;第三仿真单元,用于根据上述预设视点和上述预测视点的比较值统计分析,并根据分析结果实施验证和优化。
可选的,上述第一仿真单元包括:第一仿真子单元,用于根据上述预设眼球参数中的角膜中心和上述硬件标定参数确定光源角膜相机角度,基于上述光源角膜相机角度和上述预设眼球参数中的角膜曲率半径,结合球面反射原理确定重建光源反射点坐标,根据上述重建光源反射点坐标结合上述硬件标定参数计算上述重建光源成像点;第二仿真子单元,用于根据上述预设视点的坐标和上述预设眼球参数中的角膜中心确定第一视轴,基于上述第一视轴和上述补偿角度反推出第一光轴,依据上述第一光轴并结合上述预设眼球参数的瞳孔角膜中心距离确定重建瞳孔中心坐标,根据上述瞳孔中心坐标结合上述硬件标定参数计算上述重建瞳孔成像点。
可选的,上述第三仿真单元包括:第三仿真子单元,用于验证上述视线方向追踪方法的实现是否正确,测试添加扰动对视点误差的影响和确定上述多个光源、上述多个相机和目标对象配置方法。
根据本发明实施例的另一个方面,还提供了一种存储介质,其特征在于,上述存储介质包括存储的程序,其中,在上述程序运行时控制上述存储介质所在设备执行权利要求1至22中任意一项上述的视线方向追踪方法。
根据本发明实施例的另一个方面,还提供了一种电子设备,其特征在于,包括:处理器;以及存储器,用于存储上述处理器的可执行指令;其中,上述处理器配置为经由执行上述可执行指令来执行权利要求1至22中任意一项上述的视线方向追踪方法。
在本发明实施例中,通过执行以下步骤:使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含上述用户人脸的图像;通过从上述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;根据上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,上述视线光轴通过补偿角度重建视线视轴;根据上述视线视轴和目标对 象在上述世界坐标系的位置,确定在上述目标对象上的视点。本发明实施例中,从而解决视线追踪硬件成本高、算法优化效率低和估计精度低的技术问题。
附图说明
此处所说明的附图用来提供对本发明的进一步理解,构成本申请的一部分,本发明的示意性实施例及其说明用于解释本发明,并不构成对本发明的不当限定。在附图中:
图1是根据本发明实施例的一种可选的视线方向追踪方法的流程图;
图2是根据本发明实施例的一种可选的视线追踪方法的应用场景图;
图3是根据本发明实施例的一种可选的视线跟踪系统设置示意图;
图4是根据本发明实施例的一种可选的标定场景图;
图5是根据本发明实施例的一种可选的标定方法的流程图;
图6是根据本发明实施例的另一种可选的标定场景图;
图7是根据本发明实施例的另一种可选的标定方法的流程图;
图8根据本发明实施例的另一种可选的标定场景图;
图9根据本发明实施例的另一种可选的标定方法的流程图;
图10是根据本发明实施例的一种可选的重建光源反射点方法的原理图;
图11是根据本发明实施例的一种可选的视线方向追踪装置的结构框图;
图12是根据本发明实施例的另一种可选的视线方向追踪装置的结构框图。
具体实施方式
为了使本技术领域的人员更好地理解本发明方案,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分的实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本发明保护的范围。
需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。 应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本发明的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系数、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
根据本发明实施例,提供了一种视线方向追踪方法实施例,需要说明的是,在附图的流程图示出的步骤可以在诸如一组计算机可执行指令的计算机系数中执行,并且,虽然在流程图中示出了逻辑顺序,但是在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤。
本发明实施例提供了一种视线方向追踪方法,该视线方向追踪方法可以适用于多种应用场景中,包括:透明A柱,视线亮屏,抬头显示系统等应用产品中,本发明实施例提出了一种硬件成本低、处理精度高、处理速度快、可实时、可应用于绝大部分的实际使用场景的视线方向追踪方法。
下面通过详细的实施例来说明本发明。
根据本发明的一个方面,提供了一种视线方向追踪方法。参考图1,是根据本发明实施例的一种可选的视线方向追踪方法的流程图。如图1上述,该方法包括以下步骤:
S100,使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含上述用户人脸的图像;
S200,通过从包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;
S300,根据光源反射点坐标和瞳孔中心坐标确定视线光轴,视线光轴通过补偿角度重建视线视轴;
S400,根据上述视线视轴和目标对象在上述世界坐标系的位置,确定在上述目标对象上的视点。
在本发明实施例中,使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含上述用户人脸的图像;通过从包含人脸的图像获取的人眼特征集 结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;根据光源反射点坐标和瞳孔中心坐标确定视线光轴,视线光轴通过补偿角度重建视线视轴;根据上述人眼特征集和上述几何位置关系确定视线光轴,上述视线光轴通过补偿角度重建视线视轴;根据上述视线视轴和目标对象在上述世界坐标系的位置,确定在上述目标对象上的视点目标对象。通过上述步骤,解决视线追踪硬件成本高、算法优化效率低和估计精度低的技术问题。
下面结合上述各实施步骤进行详细说明。
S100,使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含上述用户人脸的图像;
具体的,用户注视目标对象,使用多个光源向用户的眼睛发射红外光源,使用多个相机可以实时捕获包含瞳孔成像点和红外光源经角膜反射得到光源反射点位置的图像。使用多个相机可实时地捕获用户注视场景时人脸的图像,上述用户人脸的图像并非指只包括用户人脸区域的图像,捕获的图像中包含用户人脸区域的图像即可。为了提高后续特征检测和特征提取的精度,多个相机捕获用户人脸的图像时可重点采集用户人脸区域以获得包含清晰的人脸特征的人脸的图像。用户注视目标可以为显示设备,用于输出图像内容、文字内容的屏幕,例如屏幕、显示器和头戴式显示器等,也可以为非显示设备,例如挡风玻璃等。
图2是根据本发明实施例的一种可选的视线追踪方法的应用场景图。如图2上述,该视线追踪方法的应用场景,包括通过网络进行通信的视线追踪系统10和终端处理器30,其中,视线追踪系统10包括相机10a和光源10b。用户注视目标对象20时,光源10b生成射向用户眼睛的红外光源,相机10a捕获包含用户眼睛的图像,上述眼睛图像包括瞳孔成像点和光源成像点。目标对象20可以为显示设备,用于输出图像内容、文字内容的屏幕,例如屏幕、显示器和头戴式显示器等,也可以为非显示设备,例如挡风玻璃等。
根据现有技术可知,在眼球模型参数未知的情况下,双相机双光源是求取角膜中心的最小系统也是实现视线追踪的最小系统,在眼球模型参数已知的情况下,则单相机双光源是求取角膜中心的最小系统也是实现视线追踪的最小系 统。本发明适用于不同用户使用,不同用户眼球模型参数未知且各异,故多个相机10a包括至少两个相机,多个光源10b包括至少两个光源,应用场景中包括目标对象。
在一种可选的实施例中,多个相机、多个光源和目标对象朝向用户,且多个相机的视野不包括多个光源和上述目标对象。实际视线跟踪系统中将上述多个光源,上述多个相机和上述目标对象配置在同一侧,其中,本发明并不限制相机、光源和目标对象的具体数目,以及设备之间具体排布方式和位置。上述视线追踪系统中多个光源、多个相机和目标对象两两之间位置关系可以预设,亦可以基于标定获得。
终端处理器30获取包含人脸的图像,对包含人脸的图像进行检测和特征提取,获得人眼特征集,包括确定眼睛图像中瞳孔成像点以及光源成像点在目标对象20的三维坐标系中的坐标。终端处理器30根据输入的人眼特征集确定视线光轴,并通过补偿角度重建视线视轴,最后根据视线视轴和几何位置关系确定用户的视点。终端处理器30可以是固定终端或移动终端,移动终端可包括以下设备中的至少一种,笔记本、平板电脑、手机和车载设备等。
S200,通过从上述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;
对上述包含人脸的图像进行检测和特征提取,获得人眼特征集。在一种可选的实施例中,上述人眼特征集包括:光源成像点,瞳孔成像点。本实施例利用现有技术实现对中对于包含人脸的图像进行检测和特征特征提取,本申请对获得人眼特征集的具体技术不限制,可利用传统图像处理方法和基于深度学习的图像处理方法等。
在一种可选的实施例中,上述硬件标定参数包括:多个相机坐标系的内外参和几何位置关系,其中几何位置关系包括以下:多个光源坐标系和多个相机坐标系之间的第一位置关系,多个光源坐标系和目标对象坐标系之间的第二位置关系,多个相机坐标系和目标对象坐标系之间的第三位置关系。且当获取几何位置关系中的任意两种位置关系时,通过空间变换关系以及已知的两种位置关系可确定剩余一种位置关系。通过硬件标定参数,可获得任意像素坐标和世 界坐标系的映射关系,任意硬件所在坐标系和世界坐标系的映射关系。
基于角膜瞳孔反射确定视线光轴方法中,根据人眼特征集确定世界坐标系中的光源反射点坐标和瞳孔中心坐标需要结合基于多个相机和多个光源在同一个世界坐标系下的位置,根据硬件标定参数中包含的映射关系可分别确定多个光源和多个相机在世界坐标系中的位置。
此外,光源反射点,是光源中心发出的光线在角膜表面的反射点。光源成像点,是光源反射点在被采集的包含用户人脸的图像中的成像。瞳孔中心,是瞳孔区域的中心点。瞳孔成像点,是瞳孔中心经角膜折射后的瞳孔折射点在被采集的包含用户人脸的图像中的成像。将人眼的角膜建模为球体,角膜曲率中心即为该球体的球体中心,角膜半径是角膜曲率中心到角膜球体的表面的距离,光轴方向是瞳孔中心与角膜曲率中心连线的方向。同样,人的眼球亦可以建模为眼球中心,眼球中心则为该眼球的球体中心,且眼球中心也位于光轴上。根据光学原理,结合光源的位置和光源成像点可获取角膜曲率半径。
光源成像点是光源反射点在被采集的包含用户人脸的图像中的成像,按照相机成像原理,结合采集图像的多个相机的位置坐标和内外参,确定在世界坐标系中的光源反射点的坐标。需要说明的是,光源可以为一个也可以为多个。当多个光源都工作时,每个光源都含有对应的光源反射点,则确定每个光源在世界坐标系中的坐标与方法与上述一致。同理,瞳孔成像点是瞳孔中心经角膜折射后的瞳孔折射点在被采集的包含用户人脸的图像中的成像,按照相机成像原理,结合采集图像的相机的位置坐标和内外参,确定在世界坐标系中的瞳孔中心的坐标。
在一种可选的实施例中,上述世界坐标系可选取以下任意一种:光源坐标系、相机坐标系、目标对象坐标系。即该世界坐标系可以为指定的任一光源、任一相机或目标对象的立体坐标系。
由上可知,通过硬件标定参数可获得任意像素坐标和世界坐标系的映射关系,任意硬件所在坐标系和世界坐标系的映射关系。人脸特征集为像素坐标系下的二维信息,通过上述任意像素坐标和世界坐标系的映射关系,可获得世界坐标系下光源反射点坐标和瞳孔中心坐标,且根据指定的世界坐标系不同,存 在以下几种情况。
在一种可选的实施例中,从上述包含人脸的图像获取的人脸特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标,包括:
若上述世界坐标系为上述光源坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第一位置关系,或结合上述第二位置关系和上述第三位置关系,确定上述光源坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;
若上述世界坐标系为上述相机坐标系,上述人脸特征集上述多个相机的内外参,确定上述多个相机坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;
若上述世界坐标系为上述目标对象坐标系,上述人脸特征集通过上述上述多个相机的内外参,结合上述第三位置关系,或结合上述第一位置关系和上述第二位置关系,确定上述目标对象坐标系中的上述光源反射点坐标和上述瞳孔中心坐标。
进一步的,本申请不限制确定多个相机的内外参的方法,可通过预设的出厂参数获取,亦可通过标定获取多个相机的内外参,本申请同样不限制标定多个相机内外参的方法,例如,采用标定板标定多个相机的内外参。
在一种可选的实施例中,几何位置关系通过结合多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得。
由于,多个相机的视野不包括多个光源和上述目标对象,故利用平面镜和辅助相机中的至少一项传递标定信息,基于标定信息求取几何位置关系。
具体的,标定过程旨在获得多个相机、多个光源和目标对象的几何位置关系,然而在本实施例中,由于反射光路存在的限制,视线跟踪系统包含的多个光源,多个相机和目标对象配置在同一侧,故相机无法直接观察目标对象,或者只能观察到光目标对象的一部分,本申请结合多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得几何位置关系。
图3根据本发明实施例的一种可选的视线跟踪系统设置示意图,视线跟踪系统以两个相机在外侧,两个光源在内侧的设置。如图3所示,视线追踪系统100包括相机110和116,光源112和114,光源用于对用户105的眼睛发射光线提供角膜反射。相机用于捕获包含用户105人脸的图像。视线追踪系统100 采用光源在内侧,相机在外侧的排布方式,本发明并不限制相机和光源具体排布方式,例,还可采用相机在内侧光源在外侧的排布,或相机和光源间隔放置的排布方式。相机和光源分别固定于云台103,104,101和102,云台可调整各部件在水平以及垂直方向。固定后的部件安装于底座107,并且部件之间的距离可通过调整其固定位置而实现。在本实施例中将目标对象配置为显示器106,可用于显示标定记号。
在本申请的示例性实施例中,提供三种可行的标定方法,以图3所示的视线跟踪系统为例对标定方法进行说明。
标定方法一:
图4是根据本发明实施例的一种可选的标定场景图。如图4所示,在显示器106前侧放置平面镜320,显示器106上方固定第一辅助相机318,视线跟踪系统前方设置第二辅助相机,第二辅助相机为立体视觉系统,包括相机322和324,显示器106显示第一标记。辅助相机318获取平面镜320中显示器106投射的全部第一标记的像,立体视觉系统的视野包含全部相机和光源。
图5是根据本发明实施例的一种可选的标定方法的流程图。如图5所示,在一种可选的实施例中,结合多个相机的内外参,并利用平面镜和辅助相机传递标定信息,获得几何位置关系,可以包括:
S310第一辅助相机获取平面镜以多个不同姿态反射含有第一标记的上述目标对象的多张第一标记图像;
具体地,平面镜320以多个不同姿态反射,使第一标记在镜面中的虚像在辅助相机318的视野中,其中,平面镜的多个姿态的数目至少为三个,且平面镜所在的平面在不同姿态下互不平行。
S311结合多个相机的内外参,根据多张第一标记图像基于正交约束计算第三位置关系;
具体地,多个相机的内外参包括相机、第一辅助相机和第二辅助相机的内外参数,本实施例采用现有技术实现对多相机进行标定,对多相机的内外参标定的具体技术不限制。
根据上述多张第一标记图像,结合平面镜镜面反射原理,恢复全部第一标 记点相对于辅助相机的相对位置。采用不同位姿的平面镜反射标记点的像,即为针对固定的显示器提供不同第一辅助相机坐标系。针对所有第一标记图像使用P3P求解出多个候选镜像第一标记点组合,结合镜面点存在的正交约束筛选出一个最终组合,上述正交约束即在辅助相机坐标系下,任意两个不同镜面的轴向量与同一标记点在对应镜面下投射坐标的差向量正交。根据最终组合恢复全部第一标记点相对于辅助相机的相对位置,进一步地,结合上述标定获得多个相机的内外参结果,计算多个相机和显示器之间位置关系,即第三位置关系。
S312第二辅助相机获取包含多个光源的第二标记图像,结合多个相机的内外参并基于第二标记图像获取第一位置关系,其中,第二辅助相机为立体视觉系统;
第二辅助相机为立体视觉系统,且立体视觉系统的视野包含全部相机和光源,通过立体视觉系统采集并处理包含全部光源的第二标记图像,由于立体视觉系统采集的数据包含三维信息,故根据第二标记图像中确定各光源在立体视觉系统下的位姿,进一步地,结合多个相机的内外参,确定多个光源和多个相机之间的位置关系,即第一位置关系。
S313根据第一位置关系和第三位置关系确定第二位置关系。
通过上述步骤可以获得具有高准确性和高稳健性的标定结果。基于平面镜对目标对象反射成像,根据建立的目标对象和其在平面镜中虚像的位姿关系基于正交约束求解确定系统位姿关系。由于需要多次改变平面镜的位置,相机需要采集多幅镜面反射的图像,并且平面镜的移动需要获取满足预设条件的,操作较为复杂,效率较低,此外上述镜面标定法中基于正交约束线性求解通常对噪声敏感,系统位置关系标定结果的精度与平面镜到相机的距离和平面镜转动角度均有关系。
标定方法二则引入了多个辅助相机,避免多次改变平面镜的位置,将多个辅助相机作为转换桥梁,通过姿态转换确定最终几何位置关系。
标定方法二:
图6是根据本发明实施例的另一种可选的标定场景图。如图6所示,在显示器106前侧放置平面镜320,视线跟踪系统前方设置第三辅助相机,第三辅助 相机为立体视觉系统,包括相机422和424,在第三辅助相机边设置标定板428,相机110和116的视野包含标定板区域,显示器106显示第四标记图像,设置第四辅助相机426使其视野包括相机和第三辅助相机,立体视觉系统的视野包含全部相机、光源以及含有第五标记的显示器。
图7是根据本发明实施例的另一种可选的标定方法的流程图。在一种可选的实施例中,结合上述多个相机的内外参,并利用辅助相机传递标定信息,获得上述几何位置关系,可以包括:
S320第三辅助相机获取包含上述多个光源的第三标记图像,结合上述多个相机的内外参并基于上述第三标记图像获取上述第一位置关系,其中,上述第三辅助相机为立体视觉系统;
步骤与上述S312中获取第一位置关系的步骤一致,在此不再详细描述。
S321第四辅助相机设置为其视野包括多个相机和第三辅助相机,在上述第三辅助相机旁设置标定板,上述多个相机采集包含上述标定板区域的第四标记图像,同时上述第三辅助相机获取含有第五标记的上述目标对象的第五标记图像;
如图6所示,在显示器106上显示包含第四标记的标定板图像,使得第三辅助相机能拍摄到绝大部分的标定图像区域。在第三辅助相机边上放置一块标定板,使得相机110和116能够拍摄到绝大部分同时出现在两个相机的标定板区域,设置第四辅助相机使其视野包括相机和第三辅助相机。相机采集包含标定板区域的第四标记图像,同时第三辅助相机获取含有第五标记的目标对象的第五标记图像。
S322将上述第四辅助相机和上述多个相机的位置关系作为姿态转换桥梁,结合第三辅助相机和上述多个相机的内外参,根据上述第四标记图像和上述第五标记图像确定上述第三位置关系;
具体地,多个相机的内外参包括相机、第三辅助相机和第四辅助相机的内外参数,本实施例采用现有技术实现对多相机进行标定,对多相机的内外参标定的具体技术不限制。
根据获取的第四标记图像确定相机与标定板之间的位置关系,根据第五标 记图像确定第三辅助相机和显示器之间的位置关系,由于第四辅助相机同时拍摄相机和第三辅助相机,故利用第四辅助相机作为姿态转换的桥梁,实现显示器和相机的位姿关系标定,确定第三位置关系。
S323根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
上述实施例中为实现标定引入额外的辅助相机和一块标定板,将标定板置于多个相机的工作范围内,经辅助相机和标定板的坐标系转换,最终实现多个相机、多个光源和目标对象的位姿关系标定,这种方法原理简单、理论精度高,但在实际操作的过程中,由于多个相机和目标对象的位置关系已经固定,且多个相机无法拍摄目标对象,按照上述要求布置标定板和辅助相机时,辅助相机光轴与标定板法向量和显示器法向量之间夹角均过大,可能导致采集到的标定图案不理想,标记点提取误差较大,难以保证转换的精度。
辅助相机的引入增加了成本,且使得标定过程变得复杂。标定方法三则仅依赖固定的平面镜,避免多次移动引起的操作复杂,未依赖辅助相机,避免了辅助相机采集的对象之间夹角范围过大等问题,且简化了标定流程并降低了成本。
标定方法三:
图8是根据本发明实施例的另一种可选的标定场景图。如图8所示,在显示器106前侧放置平面镜320,在平面镜320上粘贴多个标记图案,标记图案可以是圆点、棋盘格、同心圆或其他易于区分的图案,且数量不少于4个,标记图案在显示器上的分布不做具体限制。平面镜反射显示器106、相机110和相机116、光源112和光源114,相机的视野包含投影在平面镜上全部相机、光源以及含有标记的显示器的像。
图9是根据本发明实施例的另一种可选的标定方法的流程图。
在一种可选的实施例中,结合上述多个相机的内外参,利用平面镜传递标定信息,获得上述几何位置关系,可以包括:
S330利用粘有不少于4个标记点的上述平面镜作为辅助,上述多个相机获取带有上述多个光源、上述目标对象且包含上述标记点的反射图像;
S331依据上述反射图像分别计算各个上述标记点、上述多个光源和上述目 标对象在上述多个相机坐标系中的标记点坐标,镜面光源坐标和镜面目标对象坐标;
S332根据所有上述标记点坐标重建镜面平面,并依据镜面反射原理,结合上述镜面光源坐标和上述镜面目标对象坐标确认上述第一位置关系和上述第三位置关系;
S333根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
在本申请的实施例中,标定方法三的基本思路可以为在平面镜320上粘贴多个标记图案,标记图案可以是圆点、棋盘格、同心圆或其他易于区分的图案,且数量不少于4个,标记图案在显示器上的分布不做具体限制。多个相机无法直接获取目标对象和光源,基于镜面反射原理,例如图8,平面镜反射显示器106、相机110和相机116、光源112和光源114。相机朝向镜面采集反射图像,上述图像包含投影在平面镜上全部相机、光源以及含有标记的显示器的像。依据包含全部标记点的反射图像确定各个上述标记点在上述多个相机坐标系中的标记点坐标,依据反射图像中包含镜面投影的光源和目标对象,确定光源虚像在多个相机坐标系中的镜面光源坐标和目标对象在多个相机坐标系中的镜面目标对象坐标。根据上述至少4个标记点坐标重建在多个相机坐标系下的镜面平面,依据镜面反射原理,镜面光源和实际光源在多个相机坐标系下对应。根据镜面光源坐标,结合镜面平面,确定第一位置关系,同理基于镜面目标对象坐标分别第三位置关系,根据第一位置关系和第三位置关系确定第二位置关系。标定方法三则仅依赖固定的平面镜,避免多次移动引起的操作复杂,未依赖辅助相机,避免了辅助相机采集的对象之间夹角范围过大等问题,且有效降低了标定流程和成本。
S300,根据上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,上述视线光轴通过补偿角度重建视线视轴;
在一种可选的实施例中,上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,包括:
根据光源反射点坐标和角膜曲率半径,确定角膜曲率中心的坐标;
根据瞳孔中心坐标和角膜曲率中心的坐标,确定上述瞳孔中心与上述角膜 曲率中心连线的上述视线光轴。
按照相机成像原理、光反射和折射原理,结合采集图像的相机的位置坐标和内外参和光源位置,可计算确定在世界坐标系中角膜曲率中心。光轴方向是瞳孔中心与角膜曲率中心的连线,将瞳孔中心的坐标和角膜曲率中心的坐标的简单计算,可以确定光轴方向,例如将瞳孔中心与角膜曲率中心的坐标相减取模后可确定光轴方向。
具体的,根据光学原理,针对每组采集相机,角膜曲率中心,瞳孔中心,与该相机成像平面上的瞳孔成像点和该相机的光心坐标共面。当存在两组共面关系,两个共面的面相交的交线的单位向量即为视线光轴,并且瞳孔中心与上述角膜曲率中心连线为视线光轴。基于上述平面几何关系和光学原理确定视线光轴。
基于角膜瞳孔反射的视线估计方法,往往只能求出眼球光轴,根据人眼的实际构造,用户视线是由视轴决定的,并且视轴与光轴之间存在补偿角度,本申请并不考虑眼睛变形或者发生异常时可能会对用户的补偿角度带来影响。但在真实情况中,眼球和角膜并非球体,从而导致当凝视不同位置时,基于角膜瞳孔反射的视线估计方法重建眼球光轴时存在不同的重建误差。
在本申请的示例性实施例中,提供两种可行的基于补偿角度从视线光轴重建视线视轴的方法。
重建方法一:
人眼构造决定了视线光轴和视线视轴有一个补偿角度,称为Kappa角。假设视线光轴和视线视轴之间的补偿角度是个不随个体变化的常量,则使用Kappa角作为光轴到视轴之间的固定补偿,在此假设下,可以无需通过校正估算个体的Kappa补偿,本实施例可以不校准补偿角度,或较简单地通过很少的点校准补偿角度,从而更简易实现视线追踪。
在一种可选的实施例中,视线光轴通过补偿角度重建视线视轴之前,上述方法还包括:用户凝视每一个预设的凝视点时获取一组样本图像;根据每组上述样本图像提取的样本特征确定第一补偿角度样本;遍历所有第一补偿角度样本,通过筛选和提纯获取上述补偿角度。
在一种可选的实施例中,针对每组样本图像提取样本特征,根据每组样本图像提取的样本特征确定第一补偿角度样本,包括:
针对每组样本图像提取样本特征,并根据样本特征重建出第一视线光轴;
基于预设的凝视点的真实坐标反推出第一视线视轴;
根据第一视线光轴和上述第一视线视轴,获取第一补偿角度样本。
在一种可选的实施例中,遍历所有上述第一补偿角度样本,遍历所有第一补偿角度样本,通过筛选和提纯获取补偿角度,包括:
求取所有上述第一补偿角度样本的中心点,筛选并去除不在第一阈值范围内的样本;
继续遍历筛选和提纯剩余所有样本直至当前中心点与上一次的中心点的差值低于第二阈值,从提纯后的所有样本中获取上述补偿角度。
具体的,预设的的位置已知,并且凝视点在空间中均匀分布。用户凝视已知位置的凝视点,依据采集的样本图像包含的样本特征,基于上述角膜瞳孔方法重建出视线光轴,结合瞳孔中心坐标和预设的凝视点恢复视线视轴,进一步获得该凝视点对应的补偿角度样本。基于预设凝视点获取补偿角度样本集过程中会存在误差,例如检测误差,本发明对补偿角度样本进行提纯,剔除异常样本,保留阈值范围中的高质量且合理分布的样本,一定程度上保证了补偿角度的准确度的同时保证最终补偿角度的适用性。本申请可以不校准,或较简单地通过很少的点校准,从而更简易实现视线追踪。
在实际的场景中,由于使用视线追踪系统的用户眼球参数各异,若采用固定的补偿角度,将影响视线追踪方法在更高精度视线追踪场景的应用。在重建方法二中通过引入了动态补偿模型,在使用上述动态补偿模型以前针对不同用户分别进行个性化补偿标定,以追踪到更高准确度的视线方向。
重建方法二
在一种可选的实施例中,视线光轴通过补偿角度重建视线视轴之前,上述方法还包括:对采集的数据通过动态补偿模型确定预测视线观测点和真实视线观测点的偏差,根据上述偏差获取上述补偿角度。
具体的,在实际视线方向的追踪方法的应用中,将采集的数据中每一帧数 据输入至已训练好的动态补偿模型以预测偏差,偏差即为预测视线观测点和真实视线观测点之间的偏差,并且根据上述偏差获取补偿角度。预测视线观测点可通过上述重建视线光轴的方法计算,此外,本发明实施例不限制动态补偿模型的类型,可以为神经网络、随机模型森林等。
进一步的,为实现针对不同用户个性化的补偿标定,在使用上述训练好的动态补偿模型以前需要进行初始化,以获得与当前用户契合的动态补偿模型。
在一种可选的实施例中,在使用上述动态补偿模型以前对上述动态补偿模型初始化,包括:
上述用户凝视每一个预设初始点时获取一组初始样本图像;
针对每组初始样本图像提取初始样本特征,并根据上述初始样本特征通过小样本学习初始化获得与当前上述用户契合的上述动态补偿模型。
具体的,当前使用用户凝视预设的初始点,预设的初始点的位置已知并且在空间中均匀分布。针对每一个预设的初始点获取一组样本图像,采集的凝视点数目至少为3个。进一步的,针对每组初始样本图像提取初始样本特征,并初始样本特征通过小样本学习初始化获得与当前上述用户契合的上述动态补偿模型。针对样本图像提取的初始样本特征包括但不限于:角膜球中心的三维坐标、眼球光轴欧拉角、基于光轴计算的视线观测点、人脸偏转角等。通过小样本学习使动态补偿模型在较短时间内即可具备学习类别变化的情况下模型的泛化能力,利用当前使用用户的先验知识初始化后的动态补偿模型,对后续当前使用用户采集的数据提供好的搜索方向,减少了预测误差的同时提高了预测偏差的效率,本发明实施例不限小样本学习的方法,例如MAML方法。
在一种可选的实施例中,上述训练动态补偿模型包括:
采集多个用户分别凝视预设校准点时的多组样本数据;
对上述多组样本数据清洗,并对清洗后的上述多组样本提取训练样本特征;
根据上述训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的上述动态补偿模型。
此外,初始动态补偿模型通过上述步骤获取已训练好的动态补偿模型,采集多组样本数据,每个测试者要求分别凝视目标对象上均匀分布的若干个点位 预设校准点,并对应地采集同样数量的样本。获得初始样本数据后清洗样本数据,去除不满足视线估计算法需求的样本,如光源反射点不全、图片严重模糊等。进一步的,针对清洗提纯后的每组样本图像提取样本特征,根据训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的动态补偿模型。针对多组样本数据提取的训练样本特征包括但不限于:角膜球中心的三维坐标、眼球光轴欧拉角、基于光轴计算的视线观测点、人脸偏转角等。训练好的动态补偿模型输入采集的数据后可获取预测视线观测点和真实视线观测点之间具有高准确的偏差,从而得到高精度的补偿角度。
上述视线光轴通过补偿角度重建视线视轴重建方法,在实际视线方向追踪场景中,先初始化生成与当前用户契合的动态补偿模型,随后对于当前用户采集的数据使用生成的动态补偿模型预测偏差,进一步获得更好的补偿角度,最终重建出高精度的视线视轴。
S400,根据上述视线视轴和目标对象在上述世界坐标系的位置,确定在上述目标对象上的视点。
上述视线视轴位于世界坐标系中,若要确定视轴在上述目标对象上的视点,需确定目标对象上在世界坐标上的位置,上述几何位置关系可提供目标对象在世界坐标上的位置,进一步确定目标对象在世界坐标上的而平面。具体地,确定用户视点的问题,即转化为同一坐标系中视线视轴向量和目标对象所在平面的交点问题,本发明不限制解决交点问题的方法。
为了降低硬件制作成本和算法优化的效率,本发明实施例还包括建立仿真系统,用以仿真在各种硬件条件、眼球状态以及算法设计等因素下,视线追踪系统的追踪精度以及抗干扰能力。
在一种可选的实施例中,上述方法还通过预设眼球参数、上述补偿角度和上述硬件标定参数和预设视点进行仿真验证、分析和优化。
在一种可选的实施例中,上述通过预设眼球参数、上述补偿角度和上述硬件标定参数和预设视点进行仿真验证和优化,包括:
针对上述预设视点,根据上述眼球参数、上述补偿角度和上述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点;
根据上述重建光源成像点和上述重建瞳孔成像点,依据上述视线方向追踪方法确定预测视点;
根据上述预设视点和上述预测视点的比较值统计分析,并根据分析结果实施验证和优化。
具体的,根据需要仿真视线追踪的场景,预设眼球参数和几何位置关系。眼球参数包括但不限于:眼球中心三维坐标、眼球半径、角膜球中心三维坐标、角膜球半径、眼球中心到角膜球中心的距离、瞳孔中心三维坐标、虹膜中心三维坐标、虹膜半径、角膜折射率、晶状体折射率、眼球光轴、眼球视轴和Kappa角等用于建模眼球三维模型的参数。硬件标定参数包括但不限于:相机内参数、相机外参数、相机畸变参数、相机立体标定参数、相机三维坐标、光源三维坐标和目标对象的三维坐标等。通过上述预设参数根据设定视点仿真基于角膜瞳孔反射确定视线方向需要的图像数据,包括重建光源在角膜上的反射点三维坐标(重建光源反射点)、光源反射点在相机图像平面的投影坐标(重建光源成像点)以及瞳孔中心在相机图像平面的投影坐标(重建瞳孔成像点)。进一步地,仿真系统根据输入的实际参数,验证上述视线方向追踪方法的实现是否正确,测试输入参数添加扰动对视点误差的影响和寻找最优的上述多个光源、上述多个相机和目标对象配置方法。
在一种可选的的实施例中,针对预设视点,根据预设眼球参数、补偿角度和硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点,包括:
根据预设眼球参数中的角膜中心和硬件标定参数确定光源角膜相机角度,基于光源角膜相机角度和预设眼球参数中的角膜曲率半径,结合球面反射原理确定重建光源反射点坐标,根据重建光源反射点坐标结合硬件标定参数计算重建光源成像点;
根据上述预设视点的坐标和上述预设眼球参数中的角膜中心确定第一视轴,基于上述第一视轴和上述补偿角度反推出第一光轴,依据上述第一光轴并结合上述预设眼球参数的瞳孔角膜中心距离确定重建瞳孔中心坐标,根据上述瞳孔中心坐标结合上述硬件标定参数计算上述重建瞳孔成像点。
图10是根据本发明实施例的一种可选的重建光源反射点方法的原理图。如 图10所示,根据预设参数获取世界坐标系下的角膜中心C,相机光心O和光源中心l。光源照射的角膜区域所对应的角膜球,其中心在眼球光轴上,且半径R c固定。根据球面反射原理,光源中心发出的光线,在角膜表面的反射点q发出全反射,反射光线经过相机中心O投影到图像平面。根据反射定理可知,
Figure PCTCN2022108896-appb-000001
以角膜中心为原点建立坐标系TCS,将
Figure PCTCN2022108896-appb-000002
作为新的X轴
Figure PCTCN2022108896-appb-000003
Figure PCTCN2022108896-appb-000004
作为新的Z轴
Figure PCTCN2022108896-appb-000005
Figure PCTCN2022108896-appb-000006
作为新的Y轴
Figure PCTCN2022108896-appb-000007
则坐标系TCS相对于世界坐标系,存在
Figure PCTCN2022108896-appb-000008
T=-R·C,有
Figure PCTCN2022108896-appb-000009
将TCS转换到世界坐标系,
Figure PCTCN2022108896-appb-000010
将世界坐标系转换到TCS。接下来,使用M WT将C、O、l转换到TCS。计算TCS下的∠lCO,进而求得θ 2,从而得出q=(R c·cosθ 2,R c·sinθ 2,0),最后使用M TW将q转换到世界坐标系,最终获得重建光源反射点的坐标。
仿真系统支持多个预设凝视点定位以辅助验证验证视线方向追踪方法的实现是否正确。具体的,以仿真系统中包含的目标对象是屏幕为例,屏幕中出现多个位置已知的凝视点,当仿真系统中仿真眼球凝视某个凝视点时候,根据凝视点的位置和预设眼球参数包含的角膜球中心坐标,可计算出视线视轴,接着根据预设的补偿角度推理出光轴,最后根据预设的瞳孔中心到角膜球中心的距离,重建瞳孔中心的坐标,
在一种可选的实施例中,预设视点可提前预设或随机生成于多个不同位置以仿真多种视线角度追踪。具体的,生成多个不同个位置预设点从而仿真眼球看向目标对象的不同位置时,视线追踪的误差情况。
在一种可选的实施例中,根据预设视点和预测视点进行统计分析,并根据分析结果实施验证和优化,包括:验证视线方向追踪方法的实现是否正确,测试输入参数添加扰动对视点误差的影响和确定多个光源、多个相机和目标对象配置方法。具体的,通过仿真计算。
在仿真过程中,并不涉及实际硬件参数和实际图像像素处理。通过预设视点,结合仿真系统包含的眼球参数和硬件标定参数计算重建光源反射点和重建瞳孔成像点,再根据上述基于角膜瞳孔反射的视线方向追踪方法确定预测视 点,根据正向和反向验证视线方向追踪方法的实现的正确性,在该过程中,仿真系统中涉及多个变量,包括几何位置关系、眼球参数和视点等,任何一个变量的异常都会导致验证正确性失败,若得到验证失败的结果,进一步通过控制变量等方法筛查出异常变量,可提高视线追踪方法优化效率。
此外,上述仿真系统针对上述输入参数添加扰动后计算预测视点,通过比较上述预测视点与真实视点检验上述扰动对于视点误差的影响。
仿真系统可以对各个参数施加任意类型的扰动,以仿真所设计的视线追踪系统在实际使用中性能。通过将使用扰动后的参数所计算的视点预测值,同视点真值做比较,包括对欧式距离、角度误差、方差等做统计学分析,从而可以有针对性地指导后面的视线追踪系统设计和算法优化。如,对于相机标定参数施加扰动,可以检验标定误差对于系统的影响程度;对于光源反射点和瞳孔中心等关键点在图像像素施加扰动,可以检验关键点检测算法误差对于系统的影响程度。此外,通过仿真系统的大量实验和数据分析,用于寻找实现高精度视线追踪方法时最优的硬件配置方法,大幅减少了硬件成本的同时提高了优化视线追踪方法的效率。
在本实施例中还提供了一种视线方向追踪装置实施例,该装置用于实现上述实施例及优选实施方式,已经进行过说明的不再赘述。如以下所使用的,术语“模块”可以实现预定功能的软件和/或硬件的组合。尽管以下实施例所描述的装置较佳地以软件来实现,但是硬件,或者软件和硬件的组合的实现也是可能并被构想的。
参考图11,是根据本发明实施例的一种可选的视线方向追踪装置的结构框图。如图11所示,视线方向追踪装置1100包括采集模块1101、关键点确定模块1102、视线重建模块1103和视点确定模块1104。
下面对视线方向追踪装置1100包含的各个单元进行具体描述。
采集模块1101,用于使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含上述用户人脸的图像;
具体的,用户注视目标对象,使用多个光源向用户的眼睛发射红外光源,使用多个相机可以实时捕获包含瞳孔成像点和红外光源经角膜反射得到光源反 射点位置的图像。使用多个相机可实时地捕获用户注视场景时人脸的图像,上述用户人脸的图像并非指只包括用户人脸区域的图像,捕获的图像中包含用户人脸区域的图像即可。为了提高后续特征检测和特征提取的精度,多个相机捕获用户人脸的图像时可重点采集用户人脸区域以获得包含清晰的人脸特征的人脸的图像。用户注视目标可以为显示设备,用于输出图像内容、文字内容的屏幕,例如屏幕、显示器和头戴式显示器等,也可以为非显示设备,例如挡风玻璃等。
在一种可选的实施例中,多个相机、多个光源和目标对象朝向用户,且多个相机的视野不包括多个光源和上述目标对象。实际视线跟踪系统中将上述多个光源,上述多个相机和上述目标对象配置在同一侧,其中,本发明并不限制相机、光源和目标对象的具体数目,以及设备之间具体排布方式和位置。上述视线追踪系统中多个光源、多个相机和目标对象两两之间位置关系可以预设,亦可以基于标定获得。
关键点确定模块1102,用于通过从上述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;
对上述包含人脸的图像进行检测和特征提取,获得人眼特征集。在一种可选的实施例中,上述人眼特征集包括:光源成像点,瞳孔成像点。本实施例利用现有技术实现对中对于包含人脸的图像进行检测和特征特征提取,本申请对获得人眼特征集的具体技术不限制,可利用传统图像处理方法和基于深度学习的图像处理方法等。
在一种可选的实施例中,上述硬件标定参数包括:多个相机坐标系的内外参和几何位置关系,其中几何位置关系包括以下:多个光源坐标系和多个相机坐标系之间的第一位置关系,多个光源坐标系和目标对象坐标系之间的第二位置关系,多个相机坐标系和目标对象坐标系之间的第三位置关系。且当获取几何位置关系中的任意两种位置关系时,通过空间变换关系以及已知的两种位置关系可确定剩余一种位置关系。通过硬件标定参数,可获得任意像素坐标和世界坐标系的映射关系,任意硬件所在坐标系和世界坐标系的映射关系。
基于角膜瞳孔反射确定视线光轴方法中,根据人眼特征集确定世界坐标系 中的光源反射点坐标和瞳孔中心坐标需要结合基于多个相机和多个光源在同一个世界坐标系下的位置,根据硬件标定参数中包含的映射关系可分别确定多个光源和多个相机在世界坐标系中的位置。
此外,光源反射点,是光源中心发出的光线在角膜表面的反射点。光源成像点,是光源反射点在被采集的包含用户人脸的图像中的成像。瞳孔中心,是瞳孔区域的中心点。瞳孔成像点,是瞳孔中心经角膜折射后的瞳孔折射点在被采集的包含用户人脸的图像中的成像。将人眼的角膜建模为球体,角膜曲率中心即为该球体的球体中心,角膜半径是角膜曲率中心到角膜球体的表面的距离,光轴方向是瞳孔中心与角膜曲率中心连线的方向。同样,人的眼球亦可以建模为眼球中心,眼球中心则为该眼球的球体中心,且眼球中心也位于光轴上。根据光学原理,结合光源的位置和光源成像点可获取角膜曲率半径。
光源成像点是光源反射点在被采集的包含用户人脸的图像中的成像,按照相机成像原理,结合采集图像的多个相机的位置坐标和内外参,确定在世界坐标系中的光源反射点的坐标。需要说明的是,光源可以为一个也可以为多个。当多个光源都工作时,每个光源都含有对应的光源反射点,则确定每个光源在世界坐标系中的坐标与方法与上述一致。同理,瞳孔成像点是瞳孔中心经角膜折射后的瞳孔折射点在被采集的包含用户人脸的图像中的成像,按照相机成像原理,结合采集图像的相机的位置坐标和内外参,确定在世界坐标系中的瞳孔中心的坐标。
在一种可选的实施例中,世界坐标系可选取以下任意一种:光源坐标系、相机坐标系、目标对象坐标系。即该世界坐标系可以为指定的任一光源、任一相机或目标对象的立体坐标系。
由上可知,通过硬件标定参数可获得任意像素坐标和世界坐标系的映射关系,任意硬件所在坐标系和世界坐标系的映射关系。人脸特征集为像素坐标系下的二维信息,通过上述任意像素坐标和世界坐标系的映射关系,可获得世界坐标系下光源反射点坐标和瞳孔中心坐标,且根据指定的世界坐标系不同,存在以下几种情况。
在一种可选的实施例中,上述关键点确定模块1102包括:
第一确定单元11021,用于若上述世界坐标系为上述光源坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第一位置关系,或结合上述第二位置关系和上述第三位置关系,确定上述光源坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;
第二确定单元11022,用于若上述世界坐标系为上述相机坐标系,上述人脸特征集通过上述多个相机的内外参,确定上述多个相机坐标系中的上述光源反射点坐标和上述瞳孔中心坐标;
第三确定单元11023,用于若上述世界坐标系为上述目标对象坐标系,上述人脸特征集通过上述多个相机的内外参,结合上述第三位置关系,或结合上述第一位置关系和上述第二位置关系,确定上述目标对象坐标系中的上述光源反射点坐标和上述瞳孔中心坐标。
进一步的,本申请不限制确定多个相机的内外参的方法,可通过预设的出厂参数获取,亦可通过标定获取多个相机的内外参,本申请同样不限制标定多个相机内外参的方法,例如,采用标定板标定多个相机的内外参。
在一种可选的实施例中,上述关键点确定模块1102包括:标定单元11024,用于通过结合上述多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得上述几何位置关系。
由于,多个相机的视野不包括多个光源和上述目标对象,故利用平面镜和辅助相机中的至少一项传递标定信息,基于标定信息求取几何位置关系。具体的,标定过程旨在获得多个相机、多个光源和目标对象的几何位置关系,然而在本实施例中,由于反射光路存在的限制,视线跟踪系统包含的多个光源,多个相机和目标对象配置在同一侧,故相机无法直接观察目标对象,或者只能观察到光目标对象的一部分,本申请结合多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得几何位置关系。
在一种可选的实施例中,上述标定单元11024包括第一标定单元24100,用于通过结合上述多个相机的内外参,利用平面镜和辅助相机传递标定信息标定获得上述几何位置关系,其中,上述第一标定单元24100包括:
第一标定子单元24101,用于第一辅助相机获取上述平面镜以多个不同姿态 反射的含有第一标记的上述目标对象的多张第一标记图像;
第二标定子单元24102,用于结合上述多个相机的内外参,根据多张上述第一标记图像基于正交约束计算上述第三位置关系;
第三标定子单元24103,用于第二辅助相机获取包含上述多个光源的第二标记图像,结合上述多个相机的内外参并基于上述第二标记图像获取上述第一位置关系,其中,上述第二辅助相机为立体视觉系统;
第四标定子单元24104,用于根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
通过上述单元可以获得具有高准确性和高稳健性的标定结果。基于平面镜对目标对象反射成像,根据建立的目标对象和其在平面镜中虚像的位姿关系基于正交约束求解确定系统位姿关系。由于需要多次改变平面镜的位置,相机需要采集多幅镜面反射的图像,并且平面镜的移动需要获取满足预设条件的,操作较为复杂,效率较低,此外上述镜面标定法中基于正交约束线性求解通常对噪声敏感,系统位置关系标定结果的精度与平面镜到相机的距离和平面镜转动角度均有关系。
在一种可选的实施例中,上述标定单元11024包括第二标定单元24200,用于通过结合上述多个相机的内外参,利用辅助相机传递标定信息标定获得上述几何位置关系,其中,上述第二标定单元包括:
第五标定子单元24201,用于第三辅助相机获取包含上述多个光源的第三标记图像,结合上述多个相机的内外参并基于上述第三标记图像获取上述第一位置关系,其中,上述第三辅助相机为立体视觉系统;
第六标定子单元24202,用于第四辅助相机设置为其视野包括上述多个相机和上述第三辅助相机,在上述第三辅助相机旁设置标定板,上述多个相机采集包含上述标定板区域的第四标记图像,同时上述第三辅助相机获取含有第五标记的上述目标对象的第五标记图像;
第七标定子单元24203,用于将上述第四辅助相机和上述多个相机的位置关系作为姿态转换桥梁,结合第三辅助相机和上述多个相机的内外参,根据上述第四标记图像和上述第五标记图像确定上述第三位置关系;
第八标定子单元24204,用于根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
上述实施例中为实现标定引入额外的辅助相机和一块标定板,将标定板置于多个相机的工作范围内,经辅助相机和标定板的坐标系转换,最终实现多个相机、多个光源和目标对象的位姿关系标定,这种方法原理简单、理论精度高,但在实际操作的过程中,由于多个相机和目标对象的位置关系已经固定,且多个相机无法拍摄目标对象,按照上述要求布置标定板和辅助相机时,辅助相机光轴与标定板法向量和显示器法向量之间夹角均过大,可能导致采集到的标定图案不理想,标记点提取误差较大,难以保证转换的精度。辅助相机的引入增加了成本,且使得标定过程变得复杂
在一种可选的实施例中,上述标定单元11024包括第三标定单元24300,用于通过结合上述多个相机的内外参,利用平面镜传递标定信息标定获得上述几何位置关系,其中,上述第三标定单元24300包括:
第九标定子单元24301,用于利用粘有不少于4个标记点的上述平面镜作为辅助,上述多个相机获取带有上述多个光源、上述目标对象且包含上述标记点的反射图像;
第十标定子单元24302,用于依据上述反射图像分别计算各个上述标记点、上述多个光源和上述目标对象在上述多个相机坐标系中的标记点坐标,镜面光源坐标和镜面目标对象坐标;
第十一标定子单元24303,用于根据所有上述标记点坐标重建镜面平面,并依据镜面反射原理,结合上述镜面光源坐标和上述镜面目标对象坐标确认上述第一位置关系和上述第三位置关系;
第十二标定子单元24304,用于根据上述第一位置关系和上述第三位置关系确定上述第二位置关系。
上述第三标定单元24300通过相机朝向镜面采集反射图像,上述图像包含投影在平面镜上全部相机、光源以及含有标记的显示器的像。依据包含全部标记点的反射图像确定各个上述标记点在上述多个相机坐标系中的标记点坐标,依据反射图像中包含镜面投影的光源和目标对象,确定光源虚像在多个相机坐 标系中的镜面光源坐标和目标对象在多个相机坐标系中的镜面目标对象坐标。根据上述至少4个标记点坐标重建在多个相机坐标系下的镜面平面,依据镜面反射原理,镜面光源和实际光源在多个相机坐标系下对应。根据镜面光源坐标,结合镜面平面,确定第一位置关系,同理基于镜面目标对象坐标分别第三位置关系,根据第一位置关系和第三位置关系确定第二位置关系。标定方法三则仅依赖固定的平面镜,避免多次移动引起的操作复杂,未依赖辅助相机,避免了辅助相机采集的对象之间夹角范围过大等问题,且有效降低了标定流程和成本。
视线重建模块1103,用于根据上述光源反射点坐标和上述瞳孔中心坐标确定视线光轴,上述视线光轴通过补偿角度重建视线视轴;
在一种可选的实施例中,上述视线重建模块1103包括:
第一重建单元11031,用于根据上述光源反射点坐标和角膜曲率半径,确定角膜曲率中心的坐标;
第二重建单元11032,用于根据上述瞳孔中心坐标和上述角膜曲率中心的坐标,确定上述瞳孔中心与上述角膜曲率中心连线的上述视线光轴。
按照相机成像原理、光反射和折射原理,结合采集图像的相机的位置坐标和内外参和光源位置,可计算确定在世界坐标系中角膜曲率中心。光轴方向是瞳孔中心与角膜曲率中心的连线,将瞳孔中心的坐标和角膜曲率中心的坐标的简单计算,可以确定光轴方向,例如将瞳孔中心与角膜曲率中心的坐标相减取模后可确定光轴方向。
具体的,根据光学原理,针对每组采集相机,角膜曲率中心,瞳孔中心,与该相机成像平面上的瞳孔成像点和该相机的光心坐标共面。当存在两组共面关系,两个共面的面相交的交线的单位向量即为视线光轴,并且瞳孔中心与上述角膜曲率中心连线为视线光轴。基于上述平面几何关系和光学原理确定视线光轴。
基于角膜瞳孔反射的视线方向追踪装置,往往只能求出眼球光轴,根据人眼的实际构造,用户视线是由视轴决定的,并且视轴与光轴之间存在补偿角度,本申请并不考虑眼睛变形或者发生异常时可能会对用户的补偿角度带来影响。但在真实情况中,眼球和角膜并非球体,从而导致当凝视不同位置时,基于角 膜瞳孔反射的视线估计方法重建眼球光轴时存在不同的重建误差。
人眼构造决定了视线光轴和视线视轴有一个补偿角度,称为Kappa角。假设视线光轴和视线视轴之间的补偿角度是个不随个体变化的常量,则使用Kappa角作为光轴到视轴之间的固定补偿,在此假设下,可以无需通过校正估算个体的Kappa补偿,本实施例可以不校准补偿角度,或较简单地通过很少的点校准补偿角度,从而更简易实现视线追踪。
在一种可选的实施例中,上述视线重建模块1103包括:
第三重建单元11033,用于上述用户凝视每一个预设的凝视点时获取一组样本图像;
第四重建单元11034,用于根据每组上述样本图像提取的样本特征确定第一补偿角度样本;
第五重建单元11035,用于遍历所有上述第一补偿角度样本,通过筛选和提纯获取上述补偿角度。
在一种可选的实施例中,第四重建单元11034包括:针对每组样本图像提取样本特征,并根据样本特征重建出第一视线光轴;基于预设的凝视点的真实坐标反推出第一视线视轴;根据第一视线光轴和上述第一视线视轴,获取第一补偿角度样本。
在一种可选的实施例中,第五重建单元11035包括:求取所有上述第一补偿角度样本的中心点,筛选并去除不在第一阈值范围内的样本;继续遍历筛选和提纯剩余所有样本直至当前中心点与上一次的中心点的差值低于第二阈值,从提纯后的所有样本中获取上述补偿角度。
具体的,预设的的位置已知,并且凝视点在空间中均匀分布。用户凝视已知位置的凝视点,依据采集的样本图像包含的样本特征,基于上述角膜瞳孔反射重建出视线光轴,结合瞳孔中心坐标和预设的凝视点恢复视线视轴,进一步获得该凝视点对应的补偿角度样本。基于预设凝视点获取补偿角度样本集过程中会存在误差,例如检测误差,本发明对补偿角度样本进行提纯,剔除异常样本,保留阈值范围中的高质量且合理分布的样本,一定程度上保证了补偿角度的准确度的同时保证最终补偿角度的适用性。本申请可以不校准,或较简单地 通过很少的点校准,从而更简易实现视线追踪。
在实际的场景中,由于使用视线追踪系统的用户眼球参数各异,若采用固定的补偿角度,将影响视线方向追踪装置在更高精度视线追踪场景的应用。上述视线重建模块通过引入了动态补偿模型,在使用上述动态补偿模型以前针对不同用户分别进行个性化补偿标定,以追踪到更高准确度的视线方向。
在一种可选的实施例中,上述视线重建模块1103还包括:动态补偿单元11036,用于对采集的数据通过动态补偿模型确定预测视线观测点和真实视线观测点的偏差,根据上述偏差获取上述补偿角度。
具体的,在实际视线方向的追踪装置的应用中,将采集的数据中每一帧数据输入至已训练好的动态补偿模型以预测偏差,偏差即为预测视线观测点和真实视线观测点之间的偏差,并且根据上述偏差获取补偿角度。预测视线观测点可通过上述重建视线光轴的装置计算,此外,本发明实施例不限制动态补偿模型的类型,可以为神经网络、随机模型森林等。
进一步的,为实现针对不同用户个性化的补偿标定,上述动态补偿单元11036包括初始化子单元,以获得与当前用户契合的动态补偿模型。
在一种可选的实施例中,上述动态补偿单元11036包括初始化子单元36100,用于在使用上述动态补偿模型以前对上述动态补偿模型初始化,其中,上述初始化子单元36100包括:第一初始化子单元36110,用于上述用户凝视每一个预设初始点时获取一组初始样本图像;第二初始化子单元36120,用于针对每组上述初始样本图像提取初始样本特征,并根据上述初始样本特征通过小样本学习初始化获得与当前上述用户契合的上述动态补偿模型。
具体的,当前使用用户凝视预设的初始点,预设的初始点的位置已知并且在空间中均匀分布。针对每一个预设的初始点获取一组样本图像,采集的凝视点数目至少为3个。进一步的,针对每组初始样本图像提取初始样本特征,并初始样本特征通过小样本学习初始化获得与当前上述用户契合的上述动态补偿模型。针对样本图像提取的初始样本特征包括但不限于:角膜球中心的三维坐标、眼球光轴欧拉角、基于光轴计算的视线观测点、人脸偏转角等。通过小样本学习使动态补偿模型在较短时间内即可具备学习类别变化的情况下模型的泛 化能力,利用当前使用用户的先验知识初始化后的动态补偿模型,对后续当前使用用户采集的数据提供好的搜索方向,减少了预测误差的同时提高了预测偏差的效率,本发明实施例不限小样本学习的方法,例如MAML方法。
在一种可选的实施例中,上述动态补偿单元11036包括训练子单元36200,用于训练上述动态补偿模型,其中,上述训练子单元36200包括:第一训练子单元36210,用于采集多个用户分别凝视预设校准点时的多组样本数据;第二训练子单元36220,用于对上述多组样本数据清洗,并对清洗后的上述多组样本提取训练样本特征;第三训练子单元,用于根据上述训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的上述动态补偿模型。
此外,初始动态补偿模型通过上述训练子单元36200获取已训练好的动态补偿模型,采集多组样本数据,每个测试者要求分别凝视目标对象上均匀分布的若干个点位预设校准点,并对应地采集同样数量的样本。获得初始样本数据后清洗样本数据,去除不满足视线估计算法需求的样本,如光源反射点不全、图片严重模糊等。进一步的,针对清洗提纯后的每组样本图像提取样本特征,根据训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的动态补偿模型。针对多组样本数据提取的训练样本特征包括但不限于:角膜球中心的三维坐标、眼球光轴欧拉角、基于光轴计算的视线观测点、人脸偏转角等。训练好的动态补偿模型输入采集的数据后可获取预测视线观测点和真实视线观测点之间具有高准确的偏差,从而得到高精度的补偿角度。
上述视线重建模块在实际视线方向追踪场景中,先初始化生成与当前用户契合的动态补偿模型,随后对于当前用户采集的数据使用生成的动态补偿模型预测偏差,进一步获得更好的补偿角度,最终重建出高精度的视线视轴。
视点确定模块1104,用于根据上述视线视轴和目标对象在上述世界坐标系的位置,确定在上述目标对象上的视点。
上述视线视轴位于世界坐标系中,若要确定视轴在上述目标对象上的视点,需确定目标对象上在世界坐标上的位置,上述几何位置关系可提供目标对象在世界坐标上的位置,进一步确定目标对象在世界坐标上的而平面。具体地,确定用户视点的问题,即转化为同一坐标系中视线视轴向量和目标对象所在平面 的交点问题,本发明不限制解决交点问题的方法。
为了降低硬件制作成本和算法优化的效率,参考图12,提供了根据本发明实施例的另一种可选的视线方向追踪装置的结构框图。如图12所示,该视线方向追踪装置还包括仿真模块1105,用以仿真在各种硬件条件、眼球状态以及算法设计等因素下,视线追踪系统的追踪精度以及抗干扰能力。
在一种可选的实施例中,上述装置还包括仿真模块1105,用于通过预设眼球参数、上述补偿角度、上述硬件标定参数和预设视点进行仿真验证、分析和优化。
在一种可选的实施例中,上述仿真模块1105包括:第一仿真单元11051,用于针对上述预设视点,根据上述眼球参数、上述补偿角度和上述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点;第二仿真单元11052,用于根据上述重建光源成像点和上述重建瞳孔成像点,依据上述视线方向追踪方法确定预测视点;第三仿真单元11053,用于根据上述预设视点和上述预测视点的比较值统计分析,并根据分析结果实施验证和优化。
在一种可选的实施例中,上述第一仿真单元11051包括:
第一仿真子单元51100,用于根据上述预设眼球参数中的角膜中心和上述硬件标定参数确定光源角膜相机角度,基于上述光源角膜相机角度和上述预设眼球参数中的角膜曲率半径,结合球面反射原理确定重建光源反射点坐标,根据上述重建光源反射点坐标结合上述硬件标定参数计算上述重建光源成像点;
第二仿真子单元51200,用于根据上述预设视点的坐标和上述预设眼球参数中的角膜中心确定第一视轴,基于上述第一视轴和上述补偿角度反推出第一光轴,依据上述第一光轴并结合上述预设眼球参数的瞳孔角膜中心距离确定重建瞳孔中心坐标,根据上述瞳孔中心坐标结合上述硬件标定参数计算上述重建瞳孔成像点。
具体的,根据需要仿真视线追踪的场景,预设眼球参数和几何位置关系。眼球参数包括但不限于:眼球中心三维坐标、眼球半径、角膜球中心三维坐标、角膜球半径、眼球中心到角膜球中心的距离、瞳孔中心三维坐标、虹膜中心三维坐标、虹膜半径、角膜折射率、晶状体折射率、眼球光轴、眼球视轴和Kappa 角等用于建模眼球三维模型的参数。硬件标定参数包括但不限于:相机内参数、相机外参数、相机畸变参数、相机立体标定参数、相机三维坐标、光源三维坐标和目标对象的三维坐标等。通过上述预设参数根据设定视点仿真基于角膜瞳孔反射确定视线方向需要的图像数据,包括重建光源在角膜上的反射点三维坐标(重建光源反射点)、光源反射点在相机图像平面的投影坐标(重建光源成像点)以及瞳孔中心在相机图像平面的投影坐标(重建瞳孔成像点)。进一步地,仿真系统根据输入的实际参数,验证视线方向追踪方法的实现是否正确,测试输入参数添加扰动对视点误差的影响和寻找最优的上述多个光源、上述多个相机和目标对象配置方法。
仿真模块1105支持多个预设凝视点定位以辅助验证验证视线方向追踪方法的实现是否正确。具体的,以仿真系统中包含的目标对象是屏幕为例,屏幕中出现多个位置已知的凝视点,当仿真模块1105中仿真眼球凝视某个凝视点时候,根据凝视点的位置和预设眼球参数包含的角膜球中心坐标,可计算出视线视轴,接着根据预设的补偿角度推理出光轴,最后根据预设的瞳孔中心到角膜球中心的距离,重建瞳孔中心的坐标,
在一种可选的实施例中,预设视点可提前预设或随机生成于多个不同位置以仿真多种视线角度追踪。具体的,生成多个不同个位置预设点从而仿真眼球看向目标对象的不同位置时,视线追踪的误差情况。
在一种可选的实施例中,上述第三仿真单元11053包括:第三仿真子单元,用于验证上述视线方向追踪方法的实现是否正确,测试添加扰动对视点误差的影响和确定上述多个光源、上述多个相机和目标对象配置方法。
在仿真过程中,并不涉及实际硬件参数和实际图像像素处理。通过预设视点,结合仿真模块1105包含的眼球参数和硬件标定参数计算重建光源反射点和重建瞳孔成像点,再根据上述基于角膜瞳孔反射的视线方向追踪方法确定预测视点,根据正向和反向验证视线方向追踪方法的实现的正确性,在该过程中,仿真模块1105中涉及多个变量,包括几何位置关系、眼球参数和视点等,任何一个变量的异常都会导致验证正确性失败,若得到验证失败的结果,进一步通过控制变量等方法筛查出异常变量,可提高视线追踪方法优化效率。
此外,上述仿真模块1105针对上述输入参数添加扰动后计算预测视点,通过比较上述预测视点与真实视点检验上述扰动对于视点误差的影响。仿真模块1105可以对各个参数施加任意类型的扰动,以仿真所设计的视线追踪系统在实际使用中性能。通过将使用扰动后的参数所计算的视点预测值,同视点真值做比较,包括对欧式距离、角度误差、方差等做统计学分析,从而可以有针对性地指导后面的视线追踪系统设计和算法优化。如,对于相机标定参数施加扰动,可以检验标定误差对于系统的影响程度;对于光源反射点和瞳孔中心等关键点在图像像素施加扰动,可以检验关键点检测算法误差对于系统的影响程度。此外,通过仿真模块1105的大量实验和数据分析,用于寻找实现高精度视线方向追踪装置时最优的硬件配置方法,大幅减少了硬件成本的同时提高了优化视线追踪装置的效率。
根据本发明实施例的另一方面,还提供了一种电子设备,包括:处理器;以及存储器,用于存储处理器的可执行指令;其中,处理器配置为经由执行可执行指令来执行任意一项的视线方向追踪方法。
根据本发明实施例的另一方面,还提供了一种存储介质,存储介质包括存储的程序,其中,在程序运行时控制存储介质所在设备执行任意一项的视线方向追踪方法。
上述本发明实施例序号仅仅为了描述,不代表实施例的优劣。
在本发明的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中,应该理解到,所揭露的技术内容,可通过其它的方式实现。其中,以上所描述的装置实施例仅仅是示意性的,例如上述单元的划分,可以为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系数,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,单元或模块的间接耦合或通信连接,可以是电性或其它的形式。
上述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为 单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
上述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本发明各个实施例上述方法的全部或部分步骤。而前述的存储介质包括:U盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述仅是本发明的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也应视为本发明的保护范围。

Claims (40)

  1. 一种视线方向追踪方法,其特征在于,包括:
    使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含所述用户人脸的图像;
    通过从所述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;
    根据所述光源反射点坐标和所述瞳孔中心坐标确定视线光轴,所述视线光轴通过补偿角度重建视线视轴;
    根据所述视线视轴和目标对象在所述世界坐标系的位置,确定在所述目标对象上的视点。
  2. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述人眼特征集包括:光源成像点,瞳孔成像点。
  3. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述世界坐标系可选取以下任意一种:光源坐标系、相机坐标系、目标对象坐标系。
  4. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述方法还通过预设眼球参数、所述补偿角度、所述硬件标定参数和预设视点进行仿真验证、分析和优化。
  5. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述多个相机、所述多个光源和所述目标对象朝向用户,且所述多个相机的视野不包括所述多个光源和所述目标对象。
  6. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述硬件标定参数包括:
    所述多个相机坐标系的内外参和几何位置关系,其中所述几何位置关系包括以下:所述多个光源坐标系和所述多个相机坐标系之间的第一位置关系,所述多个光源坐标系和所述目标对象坐标系之间的第二位置关系,所述多个相机坐标系和所述目标对象坐标系之间的第三位置关系。
  7. 根据权利要求6所述的视线方向追踪方法,其特征在于,所述几何位置关系通过结合所述多个相机的内外参,利用平面镜和辅助相机中的至少一项传 递标定信息标定获得。
  8. 根据权利要求6所述的视线方向追踪方法,其特征在于,通过从所述包含人脸的图像获取的人脸特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标,包括:
    若所述世界坐标系为所述光源坐标系,所述人脸特征集通过所述多个相机的内外参,结合所述第一位置关系,或结合所述第二位置关系和所述第三位置关系,确定所述光源坐标系中的所述光源反射点坐标和所述瞳孔中心坐标;
    若所述世界坐标系为所述相机坐标系,所述人脸特征集通过所述多个相机的内外参,确定所述多个相机坐标系中的所述光源反射点坐标和所述瞳孔中心坐标;
    若所述世界坐标系为所述目标对象坐标系,所述人脸特征集通过所述多个相机的内外参,结合所述第三位置关系,或结合所述第一位置关系和所述第二位置关系,确定所述目标对象坐标系中的所述光源反射点坐标和所述瞳孔中心坐标。
  9. 根据权利要求1所述的视线方向追踪方法,其特征在于,根据所述光源反射点坐标和所述瞳孔中心坐标确定视线光轴,包括:
    根据所述光源反射点坐标和角膜曲率半径,确定角膜曲率中心的坐标;
    根据所述瞳孔中心坐标和所述角膜曲率中心的坐标,确定所述瞳孔中心与所述角膜曲率中心连线的所述视线光轴。
  10. 根据权利要求6所述的视线方向追踪方法,其特征在于,所述几何位置关系通过结合所述多个相机的内外参,利用平面镜和辅助相机传递标定信息标定获得,包括:
    第一辅助相机获取所述平面镜以多个不同姿态反射的含有第一标记的所述目标对象的多张第一标记图像;
    结合所述多个相机的内外参,根据多张所述第一标记图像基于正交约束计算所述第三位置关系;
    第二辅助相机获取包含所述多个光源的第二标记图像,结合所述多个相机的内外参并基于所述第二标记图像获取所述第一位置关系,其中,所述第二辅 助相机为立体视觉系统;
    根据所述第一位置关系和所述第三位置关系确定所述第二位置关系。
  11. 根据权利要求6所述的视线方向追踪方法,其特征在于,所述几何位置关系通过结合所述多个相机的内外参,利用辅助相机传递标定信息标定获得,包括:
    第三辅助相机获取包含所述多个光源的第三标记图像,结合所述多个相机的内外参并基于所述第三标记图像获取所述第一位置关系,其中,所述第三辅助相机为立体视觉系统;
    第四辅助相机设置为其视野包括所述多个相机和所述第三辅助相机,在所述第三辅助相机旁设置标定板,所述多个相机采集包含所述标定板区域的第四标记图像,同时所述第三辅助相机获取含有第五标记的所述目标对象的第五标记图像;
    将所述第四辅助相机和所述多个相机的位置关系作为姿态转换桥梁,结合第三辅助相机和所述多个相机的内外参,根据所述第四标记图像和所述第五标记图像确定所述第三位置关系;
    根据所述第一位置关系和所述第三位置关系确定所述第二位置关系。
  12. 根据权利要求6所述的视线方向追踪方法,其特征在于,所述几何位置关系通过结合所述多个相机的内外参,利用平面镜传递标定信息标定获得,包括:
    利用粘有不少于4个标记点的所述平面镜作为辅助,所述多个相机获取带有所述多个光源、所述目标对象且包含所述标记点的反射图像;
    依据所述反射图像分别计算各个所述标记点、所述多个光源和所述目标对象在所述多个相机坐标系中的标记点坐标,镜面光源坐标和镜面目标对象坐标;
    根据所有所述标记点坐标重建镜面平面,并依据镜面反射原理,结合所述镜面光源坐标和所述镜面目标对象坐标确认所述第一位置关系和所述第三位置关系;
    根据所述第一位置关系和所述第三位置关系确定所述第二位置关系。
  13. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述视线光轴 通过补偿角度重建视线视轴之前,所述方法还包括:
    所述用户凝视每一个预设的凝视点时获取一组样本图像;
    根据每组所述样本图像提取的样本特征确定第一补偿角度样本;
    遍历所有所述第一补偿角度样本,通过筛选和提纯获取所述补偿角度。
  14. 根据权利要求13所述的视线方向追踪方法,其特征在于,根据每组所述样本图像提取的样本特征确定第一补偿角度样本,包括:
    针对每组所述样本图像提取样本特征,并根据所述样本特征重建出第一视线光轴;
    基于所述预设的凝视点的真实坐标反推出第一视线视轴;
    根据所述第一视线光轴和所述第一视线视轴,获取所述第一补偿角度样本。
  15. 根据权利要求13所述的视线方向追踪方法,其特征在于,遍历所有所述第一补偿角度样本,通过筛选和提纯获取所述补偿角度,包括:
    求取所有所述第一补偿角度样本的中心点,筛选并去除不在第一阈值范围内的样本;
    继续遍历筛选和提纯剩余所有样本直至当前中心点与上一次的中心点的差值低于第二阈值,从提纯后的所有样本中获取所述补偿角度。
  16. 根据权利要求1所述的视线方向追踪方法,其特征在于,所述视线光轴通过补偿角度重建视线视轴之前,所述方法还包括:
    对采集的数据通过动态补偿模型确定预测视线观测点和真实视线观测点的偏差,根据所述偏差获取所述补偿角度。
  17. 根据权利要求16所述的视线方向追踪方法,其特征在于,在使用所述动态补偿模型以前对所述动态补偿模型初始化,包括:
    所述用户凝视每一个预设初始点时获取一组初始样本图像;
    针对每组所述初始样本图像提取初始样本特征,并根据所述初始样本特征通过小样本学习初始化获得与当前所述用户契合的所述动态补偿模型。
  18. 根据权利要求16所述的视线方向追踪方法,其特征在于,训练所述动态补偿模型包括:
    采集多个用户分别凝视预设校准点时的多组样本数据;
    对所述多组样本数据清洗,并对清洗后的所述多组样本提取训练样本特征;
    根据所述训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的所述动态补偿模型。
  19. 根据权利要求4所述的视线方向追踪方法,其特征在于,所述通过预设眼球参数、所述补偿角度、所述硬件标定参数和预设视点进行仿真验证、分析和优化,包括:
    针对所述预设视点,根据所述眼球参数、所述补偿角度和所述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点;
    根据所述重建光源成像点和所述重建瞳孔成像点,依据所述视线方向追踪方法确定预测视点;
    根据所述预设视点和所述预测视点的比较值统计分析,并根据分析结果实施验证和优化。
  20. 根据权利要求19所述的视线方向追踪方法,其特征在于,针对所述预设视点,根据所述预设眼球参数、所述补偿角度和所述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点,包括:
    根据所述预设眼球参数中的角膜中心和所述硬件标定参数确定光源角膜相机角度,基于所述光源角膜相机角度和所述预设眼球参数中的角膜曲率半径,结合球面反射原理确定重建光源反射点坐标,根据所述重建光源反射点坐标结合所述硬件标定参数计算所述重建光源成像点;
    根据所述预设视点的坐标和所述预设眼球参数中的角膜中心确定第一视轴,基于所述第一视轴和所述补偿角度反推出第一光轴,依据所述第一光轴并结合所述预设眼球参数的瞳孔角膜中心距离确定重建瞳孔中心坐标,根据所述瞳孔中心坐标结合所述硬件标定参数计算所述重建瞳孔成像点。
  21. 根据权利要求19所述的视线方向追踪方法,其特征在于,所述预设视点可提前预设或随机生成于多个不同位置以仿真多种视线角度追踪。
  22. 根据权利要求19所述的视线方向追踪方法,其特征在于,根据所述预设视点和所述预测视点进行统计分析,并根据分析结果实施验证和优化,包括:
    验证所述视线方向追踪方法的实现是否正确,测试添加扰动对视点误差的 影响和确定所述多个光源、所述多个相机和目标对象配置方法。
  23. 一种视线方向追踪装置,其特征在于,包括:
    采集模块,用于使用多个光源对用户的眼睛提供角膜反射,使用多个相机捕获包含所述用户人脸的图像;
    关键点确定模块,用于通过从所述包含人脸的图像获取的人眼特征集结合硬件标定参数,确定世界坐标系中的光源反射点坐标和瞳孔中心坐标;
    视线重建模块,用于根据所述光源反射点坐标和所述瞳孔中心坐标确定视线光轴,所述视线光轴通过补偿角度重建视线视轴;
    视点确定模块,用于根据所述视线视轴和目标对象在所述世界坐标系的位置,确定在所述目标对象上的视点。
  24. 根据权利要求23所述的装置,其特征在于,所述装置还包括仿真模块,用于通过预设眼球参数、所述补偿角度、所述硬件标定参数和预设视点进行仿真验证、分析和优化。
  25. 根据权利要求23所述的装置,其特征在于,所述硬件标定参数包括:
    所述多个相机坐标系的内外参和几何位置关系,其中所述几何位置关系包括以下:所述多个光源坐标系和所述多个相机坐标系之间的第一位置关系,所述多个光源坐标系和所述目标对象坐标系之间的第二位置关系,所述多个相机坐标系和所述目标对象坐标系之间的第三位置关系。
  26. 根据权利要求25所述的装置,其特征在于,所述关键点确定模块包括:
    标定单元,用于通过结合所述多个相机的内外参,利用平面镜和辅助相机中的至少一项传递标定信息标定获得所述几何位置关系。
  27. 根据权利要求25所述的装置,其特征在于,所述关键点确定模块包括:
    第一确定单元,用于若所述世界坐标系为所述光源坐标系,所述人脸特征集通过所述多个相机的内外参,结合所述第一位置关系,或结合所述第二位置关系和所述第三位置关系,确定所述光源坐标系中的所述光源反射点坐标和所述瞳孔中心坐标;
    第二确定单元,用于若所述世界坐标系为所述相机坐标系,所述人脸特征集通过所述多个相机的内外参,确定所述多个相机坐标系中的所述光源反射点 坐标和所述瞳孔中心坐标;
    第三确定单元,用于若所述世界坐标系为所述目标对象坐标系,所述人脸特征集通过所述多个相机的内外参,结合所述第三位置关系,或结合所述第一位置关系和所述第二位置关系,确定所述目标对象坐标系中的所述光源反射点坐标和所述瞳孔中心坐标。
  28. 根据权利要求所述23的装置,其特征在于,所述视线重建模块包括:
    第一重建单元,用于根据所述光源反射点坐标和角膜曲率半径,确定角膜曲率中心的坐标;
    第二重建单元,用于根据所述瞳孔中心坐标和所述角膜曲率中心的坐标,确定所述瞳孔中心与所述角膜曲率中心连线的所述视线光轴。
  29. 根据权利要求26所述的装置,其特征在于,所述标定单元包括第一标定单元,用于通过结合所述多个相机的内外参,利用平面镜和辅助相机传递标定信息标定获得所述几何位置关系,其中,所述第一标定单元包括:
    第一标定子单元,用于第一辅助相机获取所述平面镜以多个不同姿态反射的含有第一标记的所述目标对象的多张第一标记图像;
    第二标定子单元,用于结合所述多个相机的内外参,根据多张所述第一标记图像基于正交约束计算所述第三位置关系;
    第三标定子单元,用于第二辅助相机获取包含所述多个光源的第二标记图像,结合所述多个相机的内外参并基于所述第二标记图像获取所述第一位置关系,其中,所述第二辅助相机为立体视觉系统;
    第四标定子单元,用于根据所述第一位置关系和所述第三位置关系确定所述第二位置关系。
  30. 根据权利要求26所述的装置,其特征在于,所述标定单元包括第二标定单元,用于通过结合所述多个相机的内外参,利用辅助相机传递标定信息标定获得所述几何位置关系,其中,所述第二标定单元包括:
    第五标定子单元,用于第三辅助相机获取包含所述多个光源的第三标记图像,结合所述多个相机的内外参并基于所述第三标记图像获取所述第一位置关系,其中,所述第三辅助相机为立体视觉系统;
    第六标定子单元,用于第四辅助相机设置为其视野包括所述多个相机和所述第三辅助相机,在所述第三辅助相机旁设置标定板,所述多个相机采集包含所述标定板区域的第四标记图像,同时所述第三辅助相机获取含有第五标记的所述目标对象的第五标记图像;
    第七标定子单元,用于将所述第四辅助相机和所述多个相机的位置关系作为姿态转换桥梁,结合第三辅助相机和所述多个相机的内外参,根据所述第四标记图像和所述第五标记图像确定所述第三位置关系;
    第八标定子单元,用于根据所述第一位置关系和所述第三位置关系确定所述第二位置关系。
  31. 根据权利要求26所述的装置,其特征在于,所述标定单元包括第三标定单元,用于通过结合所述多个相机的内外参,利用平面镜传递标定信息标定获得所述几何位置关系,其中,所述第三标定单元包括:
    第九标定子单元,用于利用粘有不少于4个标记点的所述平面镜作为辅助,所述多个相机获取带有所述多个光源、所述目标对象且包含所述标记点的反射图像;
    第十标定子单元,用于依据所述反射图像分别计算各个所述标记点、所述多个光源和所述目标对象在所述多个相机坐标系中的标记点坐标,镜面光源坐标和镜面目标对象坐标;
    第十一标定子单元,用于根据所有所述标记点坐标重建镜面平面,并依据镜面反射原理,结合所述镜面光源坐标和所述镜面目标对象坐标确认所述第一位置关系和所述第三位置关系;
    第十二标定子单元,用于根据所述第一位置关系和所述第三位置关系确定所述第二位置关系。
  32. 根据权利要求23所述的装置,其特征在于,所述视线重建模块包括:
    第三重建单元,用于所述用户凝视每一个预设的凝视点时获取一组样本图像;
    第四重建单元,用于根据每组所述样本图像提取的样本特征确定第一补偿角度样本;
    第五重建单元,用于遍历所有所述第一补偿角度样本,通过筛选和提纯获取所述补偿角度。
  33. 根据权利要求23所述的装置,其特征在于,所述视线重建模块还包括:
    动态补偿单元,用于对采集的数据通过动态补偿模型确定预测视线观测点和真实视线观测点的偏差,根据所述偏差获取所述补偿角度。
  34. 根据权利要求33所述的装置,其特征在于,所述动态补偿单元包括初始化子单元,用于在使用所述动态补偿模型以前对所述动态补偿模型初始化,其中,所述初始化子单元包括:
    第一初始化子单元,用于所述用户凝视每一个预设初始点时获取一组初始样本图像;
    第二初始化子单元,用于针对每组所述初始样本图像提取初始样本特征,并根据所述初始样本特征通过小样本学习初始化获得与当前所述用户契合的所述动态补偿模型。
  35. 根据权利要求33所述的装置,其特征在于,所述动态补偿单元包括训练子单元,用于训练所述动态补偿模型,其中,所述训练子单元包括:
    第一训练子单元,用于采集多个用户分别凝视预设校准点时的多组样本数据;
    第二训练子单元,用于对所述多组样本数据清洗,并对清洗后的所述多组样本提取训练样本特征;
    第三训练子单元,用于根据所述训练样本特征使用小样本学习训练初始动态补偿模型,获取训练后的所述动态补偿模型。
  36. 根据权利要求24所述的装置,其特征在于,所述仿真模块包括:
    第一仿真单元,用于针对所述预设视点,根据所述眼球参数、所述补偿角度和所述硬件标定参数仿真计算重建光源成像点和重建瞳孔成像点;
    第二仿真单元,用于根据所述重建光源成像点和所述重建瞳孔成像点,依据所述视线方向追踪方法确定预测视点;
    第三仿真单元,用于根据所述预设视点和所述预测视点的比较值统计分析,并根据分析结果实施验证和优化。
  37. 根据权利要求36所述的装置,其特征在于,所述第一仿真单元包括:
    第一仿真子单元,用于根据所述预设眼球参数中的角膜中心和所述硬件标定参数确定光源角膜相机角度,基于所述光源角膜相机角度和所述预设眼球参数中的角膜曲率半径,结合球面反射原理确定重建光源反射点坐标,根据所述重建光源反射点坐标结合所述硬件标定参数计算所述重建光源成像点;
    第二仿真子单元,用于根据所述预设视点的坐标和所述预设眼球参数中的角膜中心确定第一视轴,基于所述第一视轴和所述补偿角度反推出第一光轴,依据所述第一光轴并结合所述预设眼球参数的瞳孔角膜中心距离确定重建瞳孔中心坐标,根据所述瞳孔中心坐标结合所述硬件标定参数计算所述重建瞳孔成像点。
  38. 根据权利要求36所述的装置,其特征在于,所述第三仿真单元包括:
    第三仿真子单元,用于验证所述视线方向追踪方法的实现是否正确,测试添加扰动对视点误差的影响和确定所述多个光源、所述多个相机和目标对象配置方法。
  39. 一种存储介质,其特征在于,所述存储介质包括存储的程序,其中,在所述程序运行时控制所述存储介质所在设备执行权利要求1至22中任意一项所述的视线方向追踪方法。
  40. 一种电子设备,其特征在于,包括:
    处理器;以及
    存储器,用于存储所述处理器的可执行指令;
    其中,所述处理器配置为经由执行所述可执行指令来执行权利要求1至22中任意一项所述的视线方向追踪方法。
PCT/CN2022/108896 2021-08-05 2022-07-29 视线方向追踪方法和装置 WO2023011339A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
KR1020247007410A KR20240074755A (ko) 2021-08-05 2022-07-29 시선 방향 추적 방법 및 장치
JP2024531562A JP2024529785A (ja) 2021-08-05 2022-07-29 視線方向追跡方法及び装置
EP22852052.4A EP4383193A1 (en) 2021-08-05 2022-07-29 Line-of-sight direction tracking method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110897215.8A CN113808160B (zh) 2021-08-05 2021-08-05 视线方向追踪方法和装置
CN202110897215.8 2021-08-05

Publications (1)

Publication Number Publication Date
WO2023011339A1 true WO2023011339A1 (zh) 2023-02-09

Family

ID=78942947

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/108896 WO2023011339A1 (zh) 2021-08-05 2022-07-29 视线方向追踪方法和装置

Country Status (5)

Country Link
EP (1) EP4383193A1 (zh)
JP (1) JP2024529785A (zh)
KR (1) KR20240074755A (zh)
CN (1) CN113808160B (zh)
WO (1) WO2023011339A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115933172A (zh) * 2022-11-29 2023-04-07 大连海事大学 一种基于偏振多光谱成像的人眼视线追踪装置及方法

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113808160B (zh) 2021-08-05 2024-01-16 虹软科技股份有限公司 Gaze direction tracking method and device
CN114356482B (zh) 2021-12-30 2023-12-12 业成科技(成都)有限公司 Method for interacting with a human-machine interface by using the gaze landing point
CN114500839B (zh) 2022-01-25 2024-06-07 青岛根尖智能科技有限公司 Visual pan-tilt control method and system based on an attention tracking mechanism
CN114862990B (zh) 2022-04-22 2024-04-30 网易(杭州)网络有限公司 Mirror parameter acquisition method and apparatus, electronic device and storage medium
CN117666706A (zh) 2022-08-22 2024-03-08 北京七鑫易维信息技术有限公司 Electronic device
CN117635600B (zh) 2023-12-26 2024-05-17 北京极溯光学科技有限公司 Method, apparatus, device and storage medium for determining the position of the fovea of the retina
CN117648037B (zh) 2024-01-29 2024-04-19 北京未尔锐创科技有限公司 Target gaze tracking method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130329957A1 (en) * 2010-12-08 2013-12-12 Yoshinobu Ebisawa Method for detecting point of gaze and device for detecting point of gaze
CN106547341A (zh) * 2015-09-21 2017-03-29 现代自动车株式会社 注视跟踪器及其跟踪注视的方法
CN107358217A (zh) * 2017-07-21 2017-11-17 北京七鑫易维信息技术有限公司 一种视线估计方法及装置
CN109696954A (zh) * 2017-10-20 2019-04-30 中国科学院计算技术研究所 视线追踪方法、装置、设备和存储介质
CN113808160A (zh) * 2021-08-05 2021-12-17 虹软科技股份有限公司 视线方向追踪方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6191943B2 (ja) * 2013-03-28 2017-09-06 株式会社国際電気通信基礎技術研究所 視線方向推定装置、視線方向推定装置および視線方向推定プログラム
JP2017213191A (ja) * 2016-05-31 2017-12-07 富士通株式会社 視線検出装置、視線検出方法、及び視線検出プログラム
ES2714853T3 (es) * 2017-01-27 2019-05-30 Zeiss Carl Vision Int Gmbh Procedimiento implementado por ordenador para la detección de un vértice corneal
CN107767421B (zh) * 2017-09-01 2020-03-27 北京七鑫易维信息技术有限公司 视线追踪设备中光斑光源匹配方法和装置
CN108985210A (zh) * 2018-07-06 2018-12-11 常州大学 一种基于人眼几何特征的视线追踪方法及系统

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115933172A (zh) * 2022-11-29 2023-04-07 大连海事大学 一种基于偏振多光谱成像的人眼视线追踪装置及方法
CN115933172B (zh) * 2022-11-29 2023-09-12 大连海事大学 一种基于偏振多光谱成像的人眼视线追踪装置及方法

Also Published As

Publication number Publication date
CN113808160A (zh) 2021-12-17
CN113808160B (zh) 2024-01-16
JP2024529785A (ja) 2024-08-08
KR20240074755A (ko) 2024-05-28
EP4383193A1 (en) 2024-06-12

Similar Documents

Publication Publication Date Title
WO2023011339A1 (zh) 视线方向追踪方法和装置
JP6902075B2 (ja) 構造化光を用いた視線追跡
KR102062658B1 (ko) 안구 모델을 생성하기 위한 각막의 구체 추적
Plopski et al. Corneal-imaging calibration for optical see-through head-mounted displays
CN107004275B (zh) 确定实物至少一部分的3d重构件空间坐标的方法和系统
CN109558012B (zh) 一种眼球追踪方法及装置
US20130076884A1 (en) Method and device for measuring an interpupillary distance
CN104978548B (zh) 一种基于三维主动形状模型的视线估计方法与装置
Coutinho et al. Improving head movement tolerance of cross-ratio based eye trackers
EP3339943A1 (en) Method and system for obtaining optometric parameters for fitting eyeglasses
WO2015190204A1 (ja) 瞳孔検出システム、視線検出システム、瞳孔検出方法、および瞳孔検出プログラム
JP2016173313A (ja) 視線方向推定システム、視線方向推定方法及び視線方向推定プログラム
US11181978B2 (en) System and method for gaze estimation
JP7168953B2 (ja) 自動キャリブレーションを行う視線計測装置、視線計測方法および視線計測プログラム
CN112099622B (zh) 一种视线追踪方法及装置
CN108369744A (zh) 通过双目单应性映射的3d注视点检测
CN108537103B (zh) 基于瞳孔轴测量的活体人脸检测方法及其设备
CN110051319A (zh) 眼球追踪传感器的调节方法、装置、设备及存储介质
Plopski et al. Automated spatial calibration of HMD systems with unconstrained eye-cameras
CN112183160A (zh) 视线估计方法及装置
US10036902B2 (en) Method of determining at least one behavioural parameter
US11849999B2 (en) Computer-implemented method for determining a position of a center of rotation of an eye using a mobile device, mobile device and computer program
JP2006285531A (ja) 視線方向の検出装置、視線方向の検出方法およびコンピュータに当該視線方向の視線方法を実行させるためのプログラム
CN109917908B (zh) 一种ar眼镜的图像获取方法及系统
CN116382473A (zh) 一种基于自适应时序分析预测的视线校准、运动追踪及精度测试方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 22852052
    Country of ref document: EP
    Kind code of ref document: A1
ENP Entry into the national phase
    Ref document number: 2024531562
    Country of ref document: JP
    Kind code of ref document: A
NENP Non-entry into the national phase
    Ref country code: DE
ENP Entry into the national phase
    Ref document number: 2022852052
    Country of ref document: EP
    Effective date: 20240305