WO2023275669A1 - Method for calibrating a system comprising an eye-tracking device and a computing device comprising one or more screens

Method for calibrating a system comprising an eye-tracking device and a computing device comprising one or more screens

Info

Publication number
WO2023275669A1
WO2023275669A1
Authority
WO
WIPO (PCT)
Prior art keywords
eye
tracking device
pose
screen
ecs
Prior art date
Application number
PCT/IB2022/055739
Other languages
English (en)
Inventor
Kenneth Alberto FUNES MORA
Alojz Kovacik
Bastjan Prenaj
Original Assignee
Eyeware Tech Sa
Priority date
Filing date
Publication date
Application filed by Eyeware Tech Sa filed Critical Eyeware Tech Sa
Priority to EP22734686.3A (published as EP4374241A1)
Publication of WO2023275669A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements

Definitions

  • the present invention relates to a method for calibration of a system comprising an eye tracking device and a computing device as a function of the position of the eye-tracking device with respect to one or multiple screens. This method enables capturing the gaze of a user on one or multiple screens of the computing device.
  • Eye tracking has been solved by making use of multiple strategies.
  • An eye-tracking setup is generally composed of one or multiple cameras that capture the face and/or eyes and, in the most common applications, one or multiple screens such as a laptop or desktop screen. Most systems require knowledge of the position of the physical screens with respect to the eye-tracking device configured to track the movement of the eyes of a user.
  • US 2020174560 discloses a calibration method for a three-dimensional augmented reality and apparatus thereof.
  • the calibration method includes determining a first conversion parameter representing a relationship between a coordinate system of an eye-tracking camera and a coordinate system of a calibration camera by capturing a physical pattern using the eye-tracking camera and the calibration camera, and determining a second conversion parameter representing a relationship between a coordinate system of a virtual screen and the coordinate system of the calibration camera and a size parameter representing a size of the virtual screen by capturing a virtual pattern displayed on the virtual screen using the calibration camera.
  • the method according to US2020174560 therefore enables a virtual object to be accurately displayed on a virtual screen at a point corresponding to a target position which intersects the gaze ray of the user tracked by the eye-tracking camera.
  • US2014226131 discloses a method to facilitate eye tracking control calibration.
  • Objects are displayed on a display of a device, where the objects are associated with a function unrelated to a calculation of calibration parameters.
  • the calibration parameters relate to a calibration of a calculation of gaze information of a user of the device, where the gaze information indicates where the user is looking.
  • eye movement information associated with the user is determined, which indicates eye movement of one or more eye features associated with at least one eye of the user.
  • the eye movement information is associated with a first object location of the objects.
  • the calibration parameters are calculated based on the first object location being associated with the eye movement information.
  • US2014211995 discloses a point-of-gaze estimation method that accounts for head rotations and/or estimation device rotations.
  • An image of an eye of the user is captured.
  • the image is processed to determine coordinates in the image of defined eye features to determine the eye's optical axis.
  • At least one angle is determined, wherein the at least one angle is proportional to an angle between a line coincident with an edge of the display and an intersection of the sagittal plane of the user's head with a plane of the display.
  • An intersection of the eye's line-of-sight with the plane of the display is estimated using the eye's optical axis and using the at least one angle to account for rotation of the user's head or the display.
  • An aim of the present invention is to provide an alternative method for an easy and quick calibration of an eye-tracking device as a function of the position of one or multiple screens, irrespective of the position of the eye-tracking device relative to the screen or each screen.
  • this aim is achieved by means of a calibration method for a system comprising an eye-tracking device and a computing device for capturing the gaze of a user on at least one screen of the computing device,
  • the calibration method comprises: a. displaying on one screen of the computing device one or more specific patterns or arbitrary content; b. capturing with a camera of the eye-tracking device at least one image of said one or more specific patterns or said arbitrary content when said eye-tracking device is in an initial 3D pose ECS0; c. computing a 3D pose SCS0 of said screen with respect to the eye-tracking device at said initial 3D pose ECS0 as a function of said one or more specific patterns or said arbitrary content, initially defined in terms of pixel coordinates in the screen's display coordinate system DCS and further detected in the image coordinate system ICS of the camera of the eye-tracking device; d. moving and placing the eye-tracking device to a resting position convenient for a user, corresponding to a final 3D pose ECS of the eye-tracking device; and e. computing the final 3D pose SCS of said at least one screen with respect to the eye-tracking device coordinate system ECS when the eye-tracking device is in said final 3D pose, as a function of the 3D pose SCS0 of said at least one screen with respect to the eye-tracking device at said initial 3D pose ECS0 and of the final 3D pose ECS of the eye-tracking device.
  • Capturing the gaze of a user on the screen of the computing device comprises: i. retrieving a gaze ray of the user with the eye-tracking device defined in the coordinate system ECS of said eye-tracking device, and ii. intersecting the gaze ray of the user with said at least one screen of the computing device, as a function of the ECS and SCS parameters, to capture the gaze-on-screen.
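  • The intersection of step ii. above reduces to a ray-plane intersection followed by a metre-to-pixel conversion. The following Python sketch is illustrative only: the function name, the 4x4 pose convention (SCS origin at the top-left screen corner, x to the right, y down, z out of the screen plane) and the unit conventions are assumptions of this example, not part of the claimed method. Once the final screen pose SCS has been computed in step e., the same routine yields the gaze-on-screen at run time.

```python
import numpy as np

def gaze_on_screen(origin_ecs, direction_ecs, T_ecs_scs,
                   screen_w_m, screen_h_m, res_x, res_y):
    """Intersect a gaze ray (defined in ECS) with a screen plane of known pose.

    T_ecs_scs: 4x4 transform mapping points from the screen frame SCS to ECS.
    Returns the gaze point in screen pixel coordinates (DCS), or None if the
    ray misses the screen.
    """
    p0 = T_ecs_scs[:3, 3]                 # screen origin expressed in ECS
    n = T_ecs_scs[:3, 2]                  # screen plane normal (SCS z axis) in ECS
    d = np.asarray(direction_ecs, float)
    d /= np.linalg.norm(d)
    o = np.asarray(origin_ecs, float)

    denom = n.dot(d)
    if abs(denom) < 1e-9:                 # gaze ray parallel to the screen plane
        return None
    t = n.dot(p0 - o) / denom
    if t <= 0:                            # screen lies behind the gaze origin
        return None
    hit_ecs = o + t * d

    # Express the hit point in SCS and convert metres to pixels.
    hit_scs = np.linalg.inv(T_ecs_scs) @ np.append(hit_ecs, 1.0)
    u = hit_scs[0] / screen_w_m * res_x
    v = hit_scs[1] / screen_h_m * res_y
    if 0 <= u < res_x and 0 <= v < res_y:
        return u, v
    return None                           # gaze falls outside the screen area
```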
  • step c. of the calibration method further comprises: i) using mechanical parameters describing the resolution and/or geometry of said at least one screen of the computing device to establish a set of 3D pattern points from the corresponding pixel coordinates in the screen's display coordinate system DCS of the one or more specific patterns or said arbitrary content, and ii) computing the initial 3D pose SCS 0 by minimizing the reprojection error of the set of pattern points against their corresponding detections in the image coordinate system ICS of the camera of the eye-tracking device.
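  • A minimal sketch of steps i) and ii) using OpenCV: pattern pixel coordinates in DCS are converted to metric 3D points on the screen plane via the screen resolution and physical size, and SCS0 is then estimated with cv2.solvePnP, which minimizes the reprojection error. The pixel-pitch conversion and the assumption that the pattern lies on the z = 0 plane of the screen frame are illustrative choices of this example, not the only possible implementation.

```python
import numpy as np
import cv2

def screen_pose_from_pattern(pattern_px_dcs, detections_px_ics,
                             res_x, res_y, screen_w_m, screen_h_m,
                             K, dist_coeffs=None):
    """Estimate the initial screen pose SCS0 relative to the eye-tracker camera.

    pattern_px_dcs:    Nx2 pattern point coordinates in the display system DCS (pixels).
    detections_px_ics: Nx2 corresponding detections in the camera image ICS (pixels).
    K:                 3x3 intrinsic matrix of the eye-tracking device camera.
    Returns (R, t): rotation and translation of the screen in the camera frame.
    """
    # i) Convert DCS pixels to metric 3D points on the screen plane (z = 0),
    #    using the known resolution and physical dimensions of the screen.
    pitch_x = screen_w_m / res_x
    pitch_y = screen_h_m / res_y
    pts_dcs = np.asarray(pattern_px_dcs, dtype=np.float64)
    object_points = np.column_stack([pts_dcs[:, 0] * pitch_x,
                                     pts_dcs[:, 1] * pitch_y,
                                     np.zeros(len(pts_dcs))]).astype(np.float32)

    # ii) Minimize the reprojection error of the 2D-3D correspondences (PnP).
    ok, rvec, tvec = cv2.solvePnP(object_points,
                                  np.asarray(detections_px_ics, np.float32),
                                  K, dist_coeffs,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("PnP failed to converge")
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec    # SCS0 expressed in the camera (ECS0) frame
```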
  • the eye-tracking device comprises a depth-camera and/or multiple cameras.
  • Pattern points of said one or more specific patterns or arbitrary content on said at least one screen are measured directly by the depth-camera or by said multiple cameras using a stereo triangulation computer vision technique. Pattern points thus do not depend on knowledge of mechanical parameters describing the geometry of said at least one screen of the computing device.
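  • For the multi-camera case, a stereo-triangulation sketch is given below; it assumes the intrinsics of both cameras and their relative pose are known, for instance from a prior stereo calibration, and is an illustrative example rather than the specific implementation of the embodiment.

```python
import numpy as np
import cv2

def triangulate_pattern_points(pts_cam1, pts_cam2, K1, K2, R, t):
    """Triangulate pattern points seen by two cameras of the eye-tracking device.

    pts_cam1, pts_cam2: Nx2 pixel detections of the same pattern points in each camera.
    K1, K2:             3x3 intrinsic matrices of the two cameras.
    R, t:               rotation and translation of camera 2 relative to camera 1.
    Returns Nx3 points expressed in the coordinate frame of camera 1.
    """
    # Projection matrices (camera 1 is the reference frame).
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K2 @ np.hstack([R, np.asarray(t, float).reshape(3, 1)])

    pts1 = np.asarray(pts_cam1, np.float64).T      # 2xN, as expected by OpenCV
    pts2 = np.asarray(pts_cam2, np.float64).T
    points_4d = cv2.triangulatePoints(P1, P2, pts1, pts2)
    return (points_4d[:3] / points_4d[3]).T        # de-homogenise to Nx3
```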
  • said final 3D pose ECS of the eye-tracking device under step e. of the calibration method is retrieved by a 3D pose tracking algorithm which estimates 3D changing poses of the eye-tracking device while being moved from its initial 3D position ECS 0 to its final 3D position ECS.
  • the 3D pose tracking algorithm uses visual odometry to estimate the 3D changing poses of the eye-tracking device while being moved from its initial 3D position ECS 0 to its final 3D position ECS.
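  • A heavily simplified monocular visual-odometry sketch is shown below: frame-to-frame relative poses are estimated from matched ORB features via the essential matrix and chained together. A purely monocular approach recovers translation only up to scale; a depth sensor or the IMU described below would be needed for metric scale. This is an illustrative example, not the specific algorithm used by the invention.

```python
import numpy as np
import cv2

def track_pose_changes(frames, K):
    """Chain frame-to-frame relative poses estimated from a moving camera.

    frames: iterable of grayscale images captured while the device is moved.
    K:      3x3 camera intrinsic matrix.
    Returns the accumulated 4x4 pose of the last frame relative to the first,
    with translation known only up to scale (monocular case).
    """
    orb = cv2.ORB_create(2000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    T_total = np.eye(4)
    prev_kp, prev_des = None, None

    for frame in frames:
        kp, des = orb.detectAndCompute(frame, None)
        if prev_des is not None and des is not None:
            matches = matcher.match(prev_des, des)
            if len(matches) >= 8:
                p0 = np.float32([prev_kp[m.queryIdx].pt for m in matches])
                p1 = np.float32([kp[m.trainIdx].pt for m in matches])
                E, mask = cv2.findEssentialMat(p0, p1, K, cv2.RANSAC, 0.999, 1.0)
                if E is not None and E.shape == (3, 3):
                    _, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=mask)
                    T_step = np.eye(4)
                    T_step[:3, :3], T_step[:3, 3] = R, t.ravel()
                    # Accumulate the camera motion (inverse of the point transform).
                    T_total = T_total @ np.linalg.inv(T_step)
        prev_kp, prev_des = kp, des
    return T_total
```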
  • the eye-tracking device further comprises an Inertial Measurement Unit (IMU).
  • the 3D pose tracking algorithm uses measurements of the IMU to estimate the 3D changing poses of the eye-tracking device while being moved from its initial position ECS 0 to its final 3D position ECS.
  • the eye-tracking device integrates a self-localisation algorithm configured to recognize features of the environment captured by the eye-tracking device.
  • a model of said environment is generated based on the video stream retrieved from steps b. and d. of the calibration method.
  • the final 3D pose ECS of the eye-tracking device is retrieved by said self-localisation algorithm based on said model.
  • the calibration method further comprises the step of scanning with the eye-tracking device the environment comprising said at least one screen and preferably the place where the eye tracking device is to be positioned in its final 3D pose to create a better model of said environment.
  • a record is made of a calibration model comprising: i) 3D pose SCS 0 of said at least one screen with respect to the eye-tracking device at said initial 3D pose ECS 0 , and ii) said initial 3D pose ECS 0 of the eye-tracking device in relation to the model of the environment.
  • the calibration method further comprises i) retrieving the data from said recorded calibration model, and ii) retrieving the final 3D pose ECS of the eye-tracking device using said self-localisation algorithm based on said environment model for recalibrating the position of the final 3D pose of the eye-tracking device with respect to the 3D pose SCS 0 of said at least one screen.
  • said arbitrary content is captured and processed to detect and establish a set of interest points.
  • said final 3D pose ECS of the eye-tracking device under step e. of the calibration method is retrieved by a further step consisting of: i) capturing with a camera of the eye-tracking device at least one image of the area comprising the resting position location; ii) computing a 3D pose of the resting position location, and iii) using said 3D pose of the resting position location to compute said final 3D pose ECS of the eye-tracking device.
  • the eye tracking device is a mobile phone or a tablet.
  • a user interface displayed on the screen of said mobile phone or tablet is used to guide the user through at least steps b. and d. of the calibration method.
  • Another aspect of the invention relates to a tangible computer product containing program code for causing a processor to execute the method as described above when said code is executed on said processor.
  • FIG. 1 schematically shows a system for capturing the gaze of a user on a screen of a computing device using an eye-tracking device according to an embodiment of the invention
  • FIG. 2a shows the computing device displaying a specific pattern on screen with a user capturing an image of said pattern using a camera of the eye tracking device according to an embodiment of the invention
  • FIG. 3 schematically shows an explanatory view for establishing 2D-3D correspondences between points in two distinct coordinate systems
  • FIG. 4 schematically shows a computing device with multiple screens, each displaying several patterns at different locations according to an embodiment of the invention
  • FIG. 5 shows different examples of pixelated patterns to be displayed on a screen to calculate the 3D pose of the screen with respect to the camera of the eye-tracking device;
  • FIG. 6 schematically shows the step of moving and placing the eye-tracking device to a resting position convenient for the user
  • FIG. 7 shows a perspective view of a stand where the eye-tracking device is to be placed which defines the final 3D pose of the eye-tracking device;
  • FIG. 8 shows features of the environment of the eye-tracking device while it is being moved in order to calculate in real-time the 3D pose of the eye-tracking device until it reaches its final 3D pose;
  • FIG. 9 shows an estimated trajectory versus an actual camera trajectory of the eye-tracking device when the eye-tracking device is moved from its initial pose to its final pose
  • FIG. 10 shows an environment captured by a camera of the eye tracking device showing features used by self-localization algorithms to locate the eye-tracking device.
  • FIG. 1 schematically shows an example of a system 10 for capturing the gaze g of a user P on a screen 13a of a computing device 12 using the eye-tracking device 16 which may advantageously be a smartphone.
  • the eye-tracking device 16 outputs a gaze ray d, also known as the line-of-sight, whose intersection with the screen 13a generates the gaze g, also known as the point of regard.
  • the system therefore needs to know the 3D pose of the screen SCS, defined with respect to the eye-tracking device.
  • the world referential may be ECS itself, or another arbitrary frame. What matters when interpreting the embodiments described herein is that the screen pose SCS is defined in relation to the eye-tracking device's pose ECS.
  • the smartphone 16, as shown for example in Figure 1, comprises an RGB camera 17 as well as a depth-sensing camera 18 such as the TrueDepth® camera of the iPhone®.
  • the device comprises one or multiple infrared cameras as an alternative or complement to the RGB camera. Such infrared data may also be the amplitude data from time-of-flight sensors.
  • the mobile device 16 may comprise dual or multiple cameras, without any depth-sensing camera, that can together work as a depth sensor through stereo triangulation. Cameras of one or different types could indeed be mixed. The resolution, type and focal length of different cameras may vary.
  • the eye tracking device 16 is dedicated hardware with protocols and synchronization with the computing device to enable the display of specific patterns 20a, 20b, 20c, 20d or the retrieval of arbitrary data generated by the computing device, wherein the implementation of the calibration method utilizes processing units of said eye-tracking device.
  • the eye-tracking device functionality is distributed between the eye-tracking device and the computing device, for example, the eye-tracking device consisting only of an external webcam connected to the computing device which internally executes the eye-tracking functionality by processing the video stream from said external webcam.
  • the gaze estimate g of the user may be obtained by different methods.
  • the gaze of the user may be acquired by retrieving an input image and a reference image of an eye of the user and processing the input image and the reference image to estimate a gaze difference between the gaze of the eye within the input image and the gaze of the eye within the reference image.
  • the gaze of the user is retrieved using the estimated gaze difference and the known gaze of the reference image. This procedure is disclosed in detail in WO2020/044180, the content of which is hereby incorporated by reference.
  • the gaze of the user may be acquired by comparing an image geometric model with at least one image segmentation map generated from one input image observation corresponding to an image of the user's eye and iteratively modifying at least one parameter in the set of geometric parameters of the image geometric model to generate a new image geometric model of a user's eye until a model correspondence value reaches the optimal value.
  • This procedure is disclosed in detail in WO2020/208494, the content of which is hereby incorporated by reference.
  • Other methods for gaze estimation are disclosed in detail for example in WO2014/146199 and WO2015/192879.
  • Yet other methods, such as those based on pupil-centre-corneal-reflection (PCCR), may also be suitable to retrieve the gaze estimate g, preferably if provided as a 3D ray.
  • Figure 2 schematically shows an example of the setup and available information to compute an initial 3D pose SCS0 of a screen 13a using the camera 17 of the eye-tracking device 16 and an on-screen specific pattern 20a, according to an embodiment.
  • the computing device 12 is configured to display a specific pattern 20a on screen 13a.
  • the user P then uses a camera 17 of the eye-tracking device to capture an image of the computing device screen 13a content and therefore of the specific pattern 20a.
  • Figure 2b) shows the content of the screen referred to the display coordinate system DCS, which is defined in pixel coordinates and matching the physical pixel resolution of the display.
  • The DCS and the screen content may instead refer to logical pixels, in case the resolution of the device changes.
  • Figure 2c) schematically shows the image captured by the eye-tracking device, wherein the specific pattern 20a has been imaged, and its position can be defined in relation to the image coordinate system ICS.
  • This process thus defines a set of 2D-3D correspondences between the markers detected in the image coordinate system of the eye-tracker camera and their 3D pattern points, which can then be used to find the screen pose SCS 0 by minimizing the reprojection error.
  • the correspondences can be fed into the Perspective-n-Point (PnP) algorithm to estimate the marker's 6DoF pose, allowing the exact screen pose to be inferred.
  • To detect the pattern points in the image captured by the eye-tracking device, multiple computer vision algorithms can be used.
  • a Harris corner detection algorithm together with spatial constraints known a priori, such as the number of points to be detected, can be used.
  • a neural network can be used to jointly output per-pattern identifier-specific classifications and their 2D point locations. Other algorithms, for example another neural network, can then be used to further refine such 2D positions to sub-pixel accuracy.
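  • As an illustration of the corner-based approach with an a-priori known number of points, the following OpenCV sketch detects a fixed number of Harris corners and refines them to sub-pixel accuracy; the parameter values are illustrative assumptions, not those of the embodiment.

```python
import numpy as np
import cv2

def detect_pattern_corners(image_gray, expected_points):
    """Detect a known number of strong corners and refine them to sub-pixel accuracy."""
    corners = cv2.goodFeaturesToTrack(image_gray,
                                      maxCorners=expected_points,
                                      qualityLevel=0.01,
                                      minDistance=10,
                                      useHarrisDetector=True,
                                      k=0.04)
    if corners is None:
        return np.empty((0, 2), np.float32)
    # Sub-pixel refinement of the detected 2D locations.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01)
    corners = cv2.cornerSubPix(image_gray, corners, (5, 5), (-1, -1), criteria)
    return corners.reshape(-1, 2)
```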
  • Figure 4 schematically shows an embodiment consisting of a computing device with multiple screens 13b, each displaying different patterns 20c at different locations. Similarly, those patterns 20c can then be captured by a camera of the eye-tracking device and used to retrieve the initial pose SCS 0 of the screens, either jointly or separately, assuming the algorithm is able to distinguish which set of patterns is displayed by each of the screens and thus determine a different SCS 0 per screen.
  • the patterns on the said one or more screens are observed by the eye-tracking device, preferably a smartphone with available back and frontal cameras.
  • Figure 5 shows additional examples of patterns 20b, 20d to be displayed on the said one or multiple screens 13a; 13b ( Figures 2 and 4) and used to calculate the initial relative 3D pose of the one or multiple screens with respect to the camera of the eye-tracking device, before the user places the eye-tracking device into a resting position convenient to the user.
  • Such patterns are computer-vision-friendly 2D patterns that are unique and have enough points for 6DoF pose estimation as fiducials or markers.
  • Preferred patterns would be Chessboard, ArUco or ChArUco markers or their derivatives.
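  • A detection-and-pose sketch for screen-displayed ArUco markers is given below, assuming an OpenCV build where the legacy cv2.aruco.detectMarkers entry point is available; the marker side length in metres would be derived beforehand from the screen resolution and physical size, and the per-marker PnP step stands in for any of the pose-estimation routes described above.

```python
import numpy as np
import cv2

def detect_aruco_screen_pattern(image_gray, K, dist, marker_size_m):
    """Detect ArUco markers shown on the screen and estimate each marker's pose.

    marker_size_m: physical side length of a marker as rendered on the screen.
    Returns a dict {marker_id: (R, t)} with poses in the camera frame.
    """
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(image_gray, dictionary)
    poses = {}
    if ids is None:
        return poses
    # 3D corners of one marker in its own frame (z = 0 plane),
    # ordered top-left, top-right, bottom-right, bottom-left.
    s = marker_size_m / 2.0
    obj = np.float32([[-s,  s, 0], [ s,  s, 0], [ s, -s, 0], [-s, -s, 0]])
    for marker_corners, marker_id in zip(corners, ids.ravel()):
        ok, rvec, tvec = cv2.solvePnP(obj, marker_corners.reshape(4, 2), K, dist)
        if ok:
            R, _ = cv2.Rodrigues(rvec)
            poses[int(marker_id)] = (R, tvec)
    return poses
```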
  • Figure 6 shows the step of moving and placing the eye-tracking device 16 to a resting position convenient for the user P.
  • the system can compute the pose ECS of the eye-tracking device 16, from the initially retrieved pose ECS 0 for the eye-tracking device.
  • computing ECS from the initially retrieved pose ECS 0 assumes the eye-tracking device is able to automatically compute the 3D trajectory that the device took as the user was placing it in its rest position.
  • Figure 7 shows an example of an embodiment, being a perspective view of a stand 22 where the eye-tracking device 16 is to be placed as a resting position preferred by a user.
  • Figure 8 shows an example of an embodiment, where the eye-tracking device is a mobile phone 16 whose front and back cameras are used to detect arbitrary features f1, f2, f3, f4, f5, f6 of the environment, while the device is being moved or the environment is being explicitly scanned. Such features can then be used by computer vision algorithms to compute pose changes (visual odometry), or to directly compute the position of the device with respect to the environment (self-localization). Such functionality is common in augmented reality applications.
  • Figure 10 shows examples of features that can be retrieved and tracked to build a model of the environment according to one embodiment of the present invention.
  • Such features can be used by self-localization or SLAM algorithms to locate the eye-tracking device either while being moved from its initial pose to its final (resting) pose or directly in the resting pose which is convenient for the user.
  • the input for such algorithms can be the front or back camera of the eye-tracking device.
  • the user can be guided to ensure suitable conditions are achieved to facilitate the detection of features from the environment, or the detection of points on the screen.
  • suitable conditions help the algorithms work as optimally as possible; examples include increased light, reduced effects of direct light, or changing the location of the said one or more screens, such that better features are captured by the cameras of the eye-tracking device while it is being moved.
  • environment features are used as an input to a real-time or near real-time algorithm which tracks the changing 3D pose of the eye-tracking device relative to the said one or multiple screens until the resting or final 3D pose is reached and confirmed by the user manually or established by the system automatically.
  • Examples of such features would be corners, lines or other detections commonly used by camera localization techniques or in visual SLAM, as in the example depicted in Figure 10.
  • the process of retrieving the changes in pose of a device through visual features is typically called visual odometry.
  • the eye-tracking device 16 comprises a depth sensor.
  • a depth sensor can then be used in different steps of the invention.
  • the depth sensor is used to retrieve the depth of points of the 2D correspondences originally defined in relation to the image coordinate system ICS. Based on said depth information, and using the camera pin-hole model with its respective intrinsics, said 2D correspondences can be transformed into a set of 3D points. From the set of back-projected 3D points, either the physical dimensions of the screen can be computed from their direct measurements, removing the need to know a set of screen mechanical parameters (resolution, physical size) in advance, or the initial 3D pose of a screen SCS 0 can be computed directly by establishing 3D-to-3D correspondences.
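  • The back-projection and the 3D-to-3D alignment mentioned above can be sketched as follows; the least-squares rigid alignment uses the standard Kabsch/SVD solution, which is an illustrative choice and not necessarily the exact algorithm of the embodiment.

```python
import numpy as np

def back_project(pixels, depths, K):
    """Back-project 2D image points (ICS) with measured depths into 3D camera points.

    pixels: Nx2 pixel coordinates (u, v); depths: N depth values in metres (along z);
    K: 3x3 pin-hole intrinsic matrix of the camera.
    Returns Nx3 points in the camera coordinate frame.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    pixels = np.asarray(pixels, float)
    z = np.asarray(depths, float)
    x = (pixels[:, 0] - cx) / fx * z
    y = (pixels[:, 1] - cy) / fy * z
    return np.column_stack([x, y, z])

def rigid_align(src, dst):
    """Least-squares rigid transform (Kabsch) mapping src 3D points onto dst 3D points."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    src_c, dst_c = src.mean(0), dst.mean(0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # fix an improper rotation (reflection)
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t
```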
  • a dedicated stand 22 is enhanced with specific patterns designed to facilitate its recognition and further computation of the stand's 3D pose by minimizing reprojection errors, as in the Perspective-n-Point (PnP) algorithm, assuming the physical dimensions of the stand 22 are available.
  • This approach therefore follows the same algorithmic strategy as retrieving the initial 3D pose of a screen SCS 0 .
  • depth sensors or stereo cameras are used to facilitate computing the position of the dedicated stand.
  • the method can account for variable ways in which the user may place the eye-tracking device on the stand, such as querying the user by means of a user interface to provide more hints on the device orientation or shifts in available directions. This can also be obtained in an automated fashion using data available from IMU sensors, if present in the eye-tracking device.
  • when the eye-tracking device stand 22 is automatically discovered, several possible detections resembling eye-tracking device stands may be found. One of them can be chosen automatically by the system, based on predefined dimensions or characteristics, or the user can be queried to choose one of the detections.
  • after the eye-tracking device is placed, the user either manually confirms that the placement has been finalized, or the system automatically detects the final placement based on the rate of change of the relative pose to the said one or more screens.
  • Other signals can be used to automatically detect the final position, such as change of measurements coming from an inertial measurement unit (IMU) or change of outputs coming from a set of computer vision algorithms such as SLAM or optical flow.
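  • One possible heuristic for automatically detecting the final placement is to monitor the magnitude of optical flow between consecutive camera frames, as sketched below; the threshold value and the use of the median displacement are illustrative assumptions, and in practice such a signal could be combined with IMU measurements as described above.

```python
import numpy as np
import cv2

def device_is_at_rest(prev_gray, curr_gray, motion_threshold_px=0.5):
    """Heuristic stillness check: median optical-flow displacement between two frames.

    Returns True when tracked features barely move, which can serve as one of the
    signals that the eye-tracking device has reached its final placement.
    """
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=8)
    if pts is None:
        return False
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good = status.ravel() == 1
    if not good.any():
        return False
    displacement = np.linalg.norm((nxt - pts)[good].reshape(-1, 2), axis=1)
    return float(np.median(displacement)) < motion_threshold_px
```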
  • instead of using specific patterns and displaying them on the screen of the computing device, the system can display arbitrary content, such as video, operating system windows, different programs, etc.
  • interest features can be retrieved to establish the 3D points {a_i, b_i} by means of computer vision algorithms that either detect interest points automatically and find their correspondences in the image coordinate system ICS of the camera of the eye-tracking device 16, or directly establish sparse or dense correspondences between the screen content and the image captured by the eye-tracking device.
  • the screen content can first be retrieved using screen capture software or an API. Multiple algorithms exist to establish said correspondences, many of which were designed for stereo camera correspondence matching. Such an approach may be beneficial to avoid disrupting the content with which the user may be engaged.
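  • A sparse-correspondence sketch between the screen capture and the camera image, using ORB features and brute-force matching, is given below; in practice the matches would typically be filtered further, for example with a RANSAC homography, before feeding the PnP step described above. The function name and parameters are illustrative assumptions.

```python
import cv2

def match_screen_content(screen_capture_gray, camera_image_gray, max_matches=200):
    """Establish sparse correspondences between the captured screen content and
    the image taken by the eye-tracking device, using ORB features.

    Returns two lists of matched 2D points: points in the screen capture (DCS)
    and points in the camera image (ICS).
    """
    orb = cv2.ORB_create(4000)
    kp_s, des_s = orb.detectAndCompute(screen_capture_gray, None)
    kp_c, des_c = orb.detectAndCompute(camera_image_gray, None)
    if des_s is None or des_c is None:
        return [], []
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_s, des_c), key=lambda m: m.distance)
    matches = matches[:max_matches]
    pts_screen = [kp_s[m.queryIdx].pt for m in matches]
    pts_camera = [kp_c[m.trainIdx].pt for m in matches]
    return pts_screen, pts_camera
```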
  • the initial pose of the screen SCS0 is computed from multiple video frames from the video stream captured by the eye-tracking device, either assuming that the eye-tracking device is fully static when located at ECS 0 , or that small pose changes are detected and tracked while the user scans the screen, thus helping to compute SCS0 from the average of the pose results over multiple frames.
  • Such strategies typically deliver pose estimates with higher accuracy as they average over the time- instantaneous noise levels of features detection.
  • infrared lights can be positioned on the one or more said screens, acting as a pattern that can be detected by the eye-tracking device. This light can be emitted for the limited time required to calibrate the eye-tracking device, or emitted constantly in the case of an eye-tracking device which requires continuous estimation of its position relative to the screens.
  • the user is instructed to place the eye-tracking device 16 at a preferred position relative to one or multiple screens. This may be preferable as the system can lead the user to a position optimal for eye tracking, such as ensuring that the face and eyes of the user are centred in the frame captured by the eye-tracking camera or that the eyes are observed from a specific angle.
  • the process of instructing the user where to place the eye-tracking device may comprise audio, visual or haptic cues.
  • a graphical 3D rendering of the eye-tracking device's desired position with respect to the screen can be shown to the user in either the screen of the computing device or a screen on the eye-tracking device, in the case the eye-tracking device is a mobile phone.
  • the cues are specific sounds, speech, or vibrations which can change according to how close and in which direction the desired position is, with respect to the current position of the eye-tracking device, as the user is placing the eye tracking device in the final rest 3D position.
  • the system computes a preferred positioning of the eye-tracking device to achieve optimal eye-tracking performance on the one or multiple screens before instructing the user where to place the eye-tracking device.
  • the system decides the positioning of the eye-tracking device to achieve optimal eye-tracking performance and guides the user to such positioning, as well as refining the user-placed position using the previously mentioned methods for establishing the eye-tracker position relative to the said one or multiple screens.
  • machine learning models such as neural networks are used for the functioning of the eye-tracking device.
  • the system establishes the optimal eye-tracking device position based on the relative position of the eye-tracking device and one or more screens used during the training of these machine learning models.
  • the system can recommend such positioning to the user, for example by showing the optimal placement on the eye-tracking device's screen as an overlay projection on the camera frames retrieved by the eye-tracking device, by means of augmented reality (AR) technology.
  • the dimensions of the one or multiple screen planes can be established by the computing device connected to the screens, for example by querying such information from the monitor device drivers or other APIs meant for such queries.
  • the physical dimensions of the one or multiple screen planes can be established by aligning the eye-tracking device with at least two corners of said one or multiple screens while tracking the position of the eye-tracking device in between such placements. Based on the physical dimensions of the eye-tracking device and the relative position between such corners, established by the difference between the tracked positions of the eye-tracking device, the screen's physical size can be computed. Such dimension estimations can be further refined using knowledge about common physical dimensions of screens.
  • the specific pattern on the screen can be shown in irregular intervals, not matching the screen refresh rate, making it visible to high frame rate cameras present on the eye-tracking device but not to a human.
  • the user may use a camera of the eye-tracking device 16 to scan the environment.
  • the areas of the environment to scan may comprise the one or multiple screens, the location where the computing device is positioned, the region surrounding the final rest position of the eye-tracking device or even the surrounding areas of the room, behind or in front of the devices, that may contain objects unrelated to said devices, as depicted in the example of Figure 10.
  • a collection of features f1, f2, f3, f4, f5, f6 and their 3D positions can then be established to build a model of the environment.
  • Such a model can also or alternatively contain representations such as point clouds, mesh data and/or textures, from which a dynamic set of features can be extracted or compared as needed.
  • the user interface may guide the user into conducting the steps of scanning the environment as well as which areas to cover and in which order.
  • the calibration method may maintain a record, either in memory, in files, or in a cloud storage, that includes the 3D pose SCS 0 of the screen with respect to the eye-tracking device at said initial 3D pose ECS 0 , the initial 3D pose ECS 0 of the eye-tracking device in relation to the model of the environment, and the model of the environment itself.
  • using calibration data including a model of the environment makes it possible to compute the final rest 3D pose ECS of the eye-tracking device by computing its 3D pose with respect to the model of the environment using self-localization computer vision techniques. Then, using said computed pose at the final rest 3D pose ECS of the eye-tracking device, the pose difference can be computed with respect to the initial 3D pose ECS 0 of the eye-tracking device 16. Such a difference of pose can then be applied to the initial 3D pose SCS 0 of the screen to compute the final 3D pose SCS of the screen when the eye-tracking device is placed in its final rest 3D pose.
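  • The pose arithmetic described above reduces to a composition of 4x4 homogeneous transforms, sketched below; the naming convention T_a_b (pose of frame b expressed in frame a) is an assumption of this example, not the notation of the patent.

```python
import numpy as np

def final_screen_pose(T_env_ecs0, T_env_ecs, T_ecs0_scs0):
    """Compute the final screen pose SCS in the final eye-tracker frame ECS.

    T_env_ecs0:  4x4 pose of the eye-tracking device at calibration time (ECS0),
                 expressed in the environment-model frame.
    T_env_ecs:   4x4 pose of the eye-tracking device at its rest position (ECS),
                 expressed in the same environment-model frame (from self-localisation).
    T_ecs0_scs0: 4x4 pose of the screen (SCS0) expressed in the ECS0 frame.
    Returns the 4x4 pose of the screen expressed in the final ECS frame.
    """
    # Pose difference of the device between calibration and rest position ...
    T_ecs_ecs0 = np.linalg.inv(T_env_ecs) @ T_env_ecs0
    # ... applied to the initial screen pose, assuming the screen itself did not move.
    return T_ecs_ecs0 @ T_ecs0_scs0
```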
  • this process can be conducted immediately after the user has conducted the full calibration process, including the scanning of the screen and the environment, and the placing of the eye-tracking device in its final 3D rest pose.
  • Such an approach makes the assumption that, in normal conditions, a user is unlikely to move their screens within the room in which they are placed, such as on a desk.
  • estimating the 3D pose of the eye-tracking device in relation to the model of the environment uses edge, landmark and object detection or other machine learning techniques to improve the robustness of the self-localization algorithms, for example by identifying the most robustly detectable corners visible in the image data retrieved from the environment.
  • movements of the eye-tracking device post-calibration can be detected and estimated again by continuously monitoring and recomputing the final 3D position ECS of the eye-tracking device with respect to a model of the environment.
  • Such an approach should ideally be robust to movements of a single reference object in the model of the environment.
  • the signals of an IMU can be monitored to decide whether it should be necessary to trigger another re- computation process.
  • the user may manually place markers, patterns, specific or arbitrary images into the field-of-view of the eye-tracking device to ensure that a more robust model of the environment is achieved. This may be convenient in scenarios where the user has nothing behind the setup except a uniformly-coloured wall, which is known to be an ambiguous reference for computer vision algorithms.
  • the user interface may suggest the best possible end position of the eye-tracking device during the calibration phase, ensuring that interest points are clearly visible for later monitoring of movements of the eye-tracking device.
  • the user interface for the calibration process uses sounds, vibrations or colours to indicate if the user is moving the eye-tracking device inappropriately, for example moving it too fast or not covering enough interest points, helping to improve the overall quality of the calibration.
  • the user interface highlights a set of interest points and suggests to the user to enhance recording of such interest points during the process of retrieving a model of the environment.
  • the user positions the eye-tracking device so as to have at least a partial area of one of the screens in the field-of-view of one of the cameras of the eye-tracking device after placing it in its final position, making it possible to detect and potentially recalibrate in the scenario in which the screen position or the eye-tracking device position has changed.
  • An embodiment may include a wide field of view camera on the eye-tracking device such that it is easier to capture more content of the screen after the user has placed the eye-tracking device in a rest position. In an embodiment, detecting these movements may further trigger a recalibration alert if a movement is detected by either the eye-tracking device or the screen.
  • holding down a button of a phone or tablet is used to activate the SLAM/environment recording process and allows the recording to be paused and resumed during the calibration process.
  • a computer product running the user interface for the calibration process is an application on a phone or tablet.
  • one or multiple screens have an attached IMU sensor, from which measurements can be used to detect change of the relative position between the screens and the eye-tracking device after the final position has been established.
  • one or multiple screens have an attached camera or set of cameras, which can be used to determine a position change, which is used to indicate that recalibration of the eye-tracking device or its relative position to one or multiple screens is needed.
  • the user interface uses overlays or other visual guides and can request input from the user to adjust detected features manually, such as corners of the tracked screen or screens, if some error is detected during the scanning process of the environment. Manual input by the user can generally be used to increase the accuracy of the calibration process, for example requesting a manual confirmation throughout the scanning process that the overlays are correctly following the screen edges.
  • the eye-tracking device may be a device other than a mobile phone such as a stand-alone eye-tracking device with its own processing unit and a communication protocol allowing it to share video data or conduct partial calibration processes in its own processing units.
  • the communication protocols are configured such that the interest points displayed on screen, mechanical parameters of the screen and other needed information are sent over to the eye-tracking device, which internally makes the necessary calibration computations.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a calibration method for a system (10) comprising an eye-tracking device (16) and a computing device (12) for capturing the gaze of a user (P) on at least one screen of the computing device. The calibration method comprises: a. displaying on one screen (13a; 13b) of the computing device (12) one or more specific patterns (20a; 20b; 20c; 20d) or arbitrary content; b. capturing with a camera (18) of the eye-tracking device (16) at least one image of said one or more specific patterns (20a; 20b; 20c; 20d) or said arbitrary content when said eye-tracking device is in an initial 3D pose (ECS0); c. computing a 3D pose (SCS0) of said screen (13a; 13b) with respect to the eye-tracking device (16) at said initial 3D pose (ECS0) as a function of said one or more specific patterns (20a; 20b; 20c; 20d) or said arbitrary content, initially defined in terms of pixel coordinates in the screen's display coordinate system (DCS) and further detected in the image coordinate system (ICS) of the camera (18) of the eye-tracking device (16); d. moving and placing the eye-tracking device (16) to a resting position convenient for a user (P), corresponding to a final 3D pose (ECS) of the eye-tracking device (16); and e. computing the final 3D pose (SCS) of said one or more screens (13a; 13b) with respect to the eye-tracking device coordinate system (ECS) when the eye-tracking device (16) is in said final 3D pose, as a function of the 3D pose (SCS0) of said one or more screens (13a, 13b) with respect to the eye-tracking device (16) at said initial 3D pose (ECS0) and of the final 3D pose (ECS) of the eye-tracking device (16).
PCT/IB2022/055739 2021-06-30 2022-06-21 Procédé d'étalonnage d'un système comprenant un dispositif de suivi oculaire et un dispositif informatique comprenant un ou plusieurs écrans WO2023275669A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22734686.3A EP4374241A1 (fr) 2021-06-30 2022-06-21 Procédé d'étalonnage d'un système comprenant un dispositif de suivi oculaire et un dispositif informatique comprenant un ou plusieurs écrans

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP21182902.3 2021-06-30
EP21182902.3A EP4113251A1 (fr) Procédé d'étalonnage d'un système comprenant un dispositif de suivi de l'œil et un dispositif informatique comprenant un ou plusieurs écrans

Publications (1)

Publication Number Publication Date
WO2023275669A1 true WO2023275669A1 (fr) 2023-01-05

Family

ID=76730437

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2022/055739 WO2023275669A1 (fr) 2021-06-30 2022-06-21 Procédé d'étalonnage d'un système comprenant un dispositif de suivi oculaire et un dispositif informatique comprenant un ou plusieurs écrans

Country Status (2)

Country Link
EP (2) EP4113251A1 (fr)
WO (1) WO2023275669A1 (fr)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140211995A1 (en) 2013-01-27 2014-07-31 Dmitri Model Point-of-gaze estimation robust to head rotations and/or estimation device rotations
US20140226131A1 (en) 2013-02-14 2014-08-14 The Eye Tribe Aps Systems and methods of eye tracking calibration
WO2014146199A1 (fr) 2013-03-18 2014-09-25 Mirametrix Inc. Système et procédé de suivi du regard sur axe
WO2015192879A1 (fr) 2014-06-16 2015-12-23 Fondation De L'institut De Recherche Idiap Procédé et appareil d'estimation de regard
WO2020044180A2 (fr) 2018-08-31 2020-03-05 Eyeware Tech Sa Procédé et système d'estimation du regard
US20200174560A1 (en) 2018-12-04 2020-06-04 Samsung Electronics Co., Ltd. Calibration method for three-dimensional (3d) augmented reality and apparatus thereof
WO2020208494A1 (fr) 2019-04-10 2020-10-15 Eyeware Tech Sa Procédé et système d'estimation de paramètres géométriques liés à l'œil d'un utilisateur

Also Published As

Publication number Publication date
EP4113251A1 (fr) 2023-01-04
EP4374241A1 (fr) 2024-05-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22734686

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022734686

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022734686

Country of ref document: EP

Effective date: 20240130