CN112114659A - Method and system for determining a refined gaze point of a user - Google Patents

Method and system for determining a refined gaze point of a user

Info

Publication number
CN112114659A
CN112114659A (application CN202010500200.9A, CN202010500200A)
Authority
CN
China
Prior art keywords
user
determining
determined
spatial representation
data
Prior art date
Legal status
Pending
Application number
CN202010500200.9A
Other languages
Chinese (zh)
Inventor
杰弗里·库珀
Current Assignee
Tobii AB
Original Assignee
Tobii AB
Priority date
Filing date
Publication date
Application filed by Tobii AB filed Critical Tobii AB
Publication of CN112114659A

Classifications

    • G02B 27/0093 - Optical systems or apparatus with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking
    • G06F 3/013 - Eye tracking input arrangements for interaction between user and computer
    • A61F 4/00 - Methods or devices enabling patients or disabled persons to operate an apparatus or a device not forming part of the body
    • G02B 27/01 - Head-up displays
    • G02B 27/017 - Head-up displays, head mounted
    • G02B 27/0172 - Head-up displays, head mounted, characterised by optical features
    • G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element
    • G06T 7/50 - Image analysis: depth or shape recovery
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G06V 40/19 - Sensors for eye characteristics, e.g. of the iris
    • H04N 13/383 - Image reproducers using viewer tracking with gaze detection, i.e. detecting the lines of sight of the viewer's eyes
    • G02B 2027/0138 - Head-up displays comprising image capture systems, e.g. camera
    • G02B 2027/014 - Head-up displays comprising information/image processing systems
    • G06T 2207/10028 - Range image; depth image; 3D point clouds
    • G06T 2207/30201 - Subject of image: human face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Optics & Photonics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • Ophthalmology & Optometry (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Vascular Medicine (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

An eye tracking system, a head mounted device, a computer program, a carrier and a method in an eye tracking system for determining a refined gaze point of a user are disclosed. In the method, a gaze convergence distance of the user is determined. Further, a spatial representation of at least a portion of the user's field of view is obtained, and depth data for at least a portion of the spatial representation is obtained. Saliency data of the spatial representation is determined based on the determined gaze convergence distance and the obtained depth data, and a refined gaze point of the user is determined based on the determined saliency data.

Description

Method and system for determining a refined gaze point of a user
Technical Field
The present disclosure relates to the field of eye tracking. In particular, the present disclosure relates to a method and system for determining a refined gaze point of a user.
Background
Eye/gaze tracking functionality is introduced into an increasing number of applications such as Virtual Reality (VR) applications and Augmented Reality (AR) applications. By introducing such an eye tracking function, an estimated gaze point of the user may be determined, which in turn may be used as an input for other functions.
When an eye tracking system determines a user's estimated gaze point, the signal representing that estimate may deviate, for example due to measurement errors of the eye tracking system. Even if the user actually keeps his or her gaze fixed on the same point during a certain time period, different gaze points may be determined during different measurement periods within that time period. In US 2016/0291690 A1, saliency data of a user's field of view is used together with the gaze direction of the user's eyes in order to determine the point of interest at which the user is gazing more reliably. However, determining saliency data of the user's field of view requires processing, and even when the saliency data is used, the determined point of interest may differ from the actual point of interest.
It is desirable to provide an eye tracking technique that provides a more robust and more accurate gaze point than known methods.
Disclosure of Invention
It is an object of the present disclosure to provide a method and system that seeks to mitigate, alleviate or eliminate one or more of the above-identified deficiencies in the art.
This object is achieved by a method, an eye tracking system, a head-mounted device, a computer program and a carrier according to the appended claims.
According to one aspect, a method in an eye tracking system for determining a refined gaze point of a user is provided. In the method, a gaze convergence distance of the user is determined, a spatial representation of at least a portion of the user's field of view is obtained, and depth data for at least a portion of the spatial representation is obtained. Saliency data of the spatial representation is determined based on the determined gaze convergence distance and the obtained depth data, and a refined gaze point of the user is then determined based on the determined saliency data.
The saliency data provides a measure, for attributes in the user's field of view as represented in the spatial representation, of the likelihood that these attributes will attract a person's visual attention. Determining saliency data of the spatial representation means determining saliency data relating to at least a part of the spatial representation.
The depth data for at least a portion of the spatial representation is indicative of a distance from an eye of the user to an object or feature in the field of view of the user corresponding to the at least a portion of the spatial representation. Depending on the application (e.g., AR or VR), these distances are real or virtual.
The gaze convergence distance indicates the distance from the user's eyes to the point on which the user's gaze is focused. The convergence distance may be determined using any method for determining a convergence distance, such as a method based on the gaze directions of the user's eyes and the intersection between these directions, or a method based on the interpupillary distance.
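As an illustration of the gaze-direction-based alternative, the following sketch (hypothetical helper names, not part of the disclosure) estimates the convergence distance from the closest approach of the two gaze rays:

```python
import numpy as np

def convergence_distance(left_origin, left_dir, right_origin, right_dir):
    """Estimate the gaze convergence distance as the distance from the midpoint
    between the eyes to the point where the two gaze rays pass closest to each
    other. Inputs are 3D vectors; directions need not be unit length."""
    o1, o2 = np.asarray(left_origin, float), np.asarray(right_origin, float)
    d1 = np.asarray(left_dir, float); d1 /= np.linalg.norm(d1)
    d2 = np.asarray(right_dir, float); d2 /= np.linalg.norm(d2)
    w0 = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    p, q = d1 @ w0, d2 @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-9:               # near-parallel rays: convergence at "infinity"
        return np.inf
    t1 = (b * q - c * p) / denom        # parameter along the left gaze ray
    t2 = (a * q - b * p) / denom        # parameter along the right gaze ray
    closest_mid = 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))   # approximate convergence point
    eye_center = 0.5 * (o1 + o2)
    return float(np.linalg.norm(closest_mid - eye_center))
```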
Determining the saliency data based also on the determined gaze convergence distance and the obtained depth data for at least a portion of the spatial representation makes the determination of the saliency data faster and less processing-intensive. It further enables the determination of a refined gaze point of the user that is a more accurate estimate of the user's point of interest.
In some embodiments, determining the saliency data of the spatial representation comprises identifying a first depth region of the spatial representation corresponding to obtained depth data within a predetermined range that includes the determined gaze convergence distance. Saliency data of the first depth region of the spatial representation is then determined.
The identified first depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are within the predetermined range including the determined gaze convergence distance. The user is generally more likely to be looking at one of the objects or features within the predetermined range than at objects or features corresponding to regions of the spatial representation with depth data outside that range. It is therefore beneficial to determine saliency data for the first depth region and to determine a refined gaze point based on the determined saliency data.
In some embodiments, determining the saliency data of the spatial representation comprises identifying a second depth region of the spatial representation corresponding to obtained depth data outside the predetermined range including the gaze convergence distance, and refraining from determining saliency data for the second depth region of the spatial representation.
The identified second depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are outside the predetermined range including the determined gaze convergence distance. The user is generally less likely to be looking at one of these objects or features outside the predetermined range than at objects or features corresponding to regions of the spatial representation with depth data inside the range. It is therefore beneficial to refrain from determining saliency data for the second depth region, to avoid processing that may be unnecessary or may even give misleading results. This reduces the processing power used for determining the saliency data compared to methods that determine saliency data without taking the determined gaze convergence distance of the user and the depth data of at least part of the spatial representation into account.
In some embodiments, determining the refined gaze point comprises determining the refined gaze point of the user as the point corresponding to the highest saliency according to the determined saliency data. The determined refined gaze point will thus be the point most likely to attract visual attention in some respect. Combined with determining saliency data only for the identified first depth region, i.e. the region corresponding to obtained depth data within the predetermined range including the determined gaze convergence distance, the determined refined gaze point will be the point within the first depth region that is most likely to attract visual attention in some respect.
In some embodiments, determining the saliency data of the spatial representation comprises determining first saliency data of the spatial representation based on visual saliency, determining second saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data, and determining the saliency data based on the first saliency data and the second saliency data. The first saliency data may for example be based on high contrast, vivid color, size, motion, etc. After optional normalization and weighting, the different types of saliency data are combined.
In some embodiments, the method further comprises determining a new gaze convergence distance of the user, determining new saliency data of the spatial representation based on the new gaze convergence distance, and determining a new refined gaze point of the user based on the new saliency data. A dynamically updated refined gaze point may thus be determined based on the new gaze convergence distances determined over time. Several alternatives are envisaged, such as using only the currently determined new gaze convergence distance, or the mean of the gaze convergence distances determined over a predetermined period of time.
In some embodiments, the method further comprises determining a plurality of gaze points of the user, and identifying a cropped region of the spatial representation based on the determined plurality of gaze points of the user. Preferably, determining the saliency data then comprises determining saliency data of the identified cropped region of the spatial representation.
The user is generally more likely to be looking at points corresponding to the cropped region than at points corresponding to regions outside the cropped region. It is therefore beneficial to determine saliency data for the cropped region and to determine a refined gaze point based on the determined saliency data.
In some embodiments, the method further comprises refraining from determining saliency data for regions of the spatial representation that are outside the identified cropped region of the spatial representation.
The user is generally less likely to be looking at points corresponding to regions outside the cropped region than at points corresponding to the cropped region. It is therefore beneficial to refrain from determining saliency data for regions outside the cropped region, to avoid processing that may be unnecessary or may even give misleading results. This reduces the processing power used for determining saliency data relative to methods that determine saliency data without cropping based on the determined gaze points of the user.
In some embodiments, obtaining the depth data comprises obtaining depth data for the identified cropped region of the spatial representation. By obtaining depth data for the identified cropped region, and not necessarily for regions outside the cropped region, saliency data within the cropped region may be determined based only on the obtained depth data of the identified cropped region. The amount of processing required to determine saliency data may thus be further reduced.
In some embodiments, the method further comprises determining a respective gaze convergence distance for each of the plurality of determined gaze points of the user.
In some embodiments, the method further comprises determining a new gaze point of the user. In the case where the determined new gaze point is within the identified cropped region, the new cropped region is identified as being the same as the identified cropped region. Alternatively, in the case where the determined new gaze point is outside the identified cropped region, a new cropped region is identified that includes the determined new gaze point and that is different from the identified cropped region.
If the new gaze point determined for the user is within the identified cropped region, the user is likely looking at a point within the cropped region. By keeping the same cropped region in this case, any saliency data determined based on the identified cropped region can be reused. No further processing is then required to determine saliency based on the identified cropped region.
In some embodiments, consecutive gaze points of the user are determined in consecutive time intervals, respectively. Further, for each time interval, it is determined whether the user is fixating or saccading. If the user is fixating, a refined gaze point is determined. If the user is saccading, determination of a refined gaze point is suppressed. If the user is fixating, the user is probably looking at a certain point at that point in time, and a refined gaze point can therefore be determined correctly. If, on the other hand, the user is saccading, the user is less likely to be looking at a certain point at that point in time, and a refined gaze point is therefore less likely to be determined correctly. These embodiments reduce processing by determining a refined gaze point only when it is likely that it can be determined correctly.
In some embodiments, consecutive gaze points of the user are determined in consecutive time intervals, respectively. Further, for each time interval, it is determined whether the user is in smooth pursuit. In the case where the user is in smooth pursuit, consecutive cropped regions including the consecutive gaze points are identified such that the identified consecutive cropped regions follow the smooth pursuit. If smooth pursuit is determined, little additional processing is required to determine the consecutive cropped regions, since they simply follow the smooth pursuit.
In some embodiments, the spatial representation is an image, such as a 2D image of the real world, a 3D image of the real world, a 2D image of the virtual environment, or a 3D image of the virtual environment. The data may come from a photo sensor, a virtual 3D scene, or may come from another type of image sensor or a spatial sensor.
According to a second aspect, an eye tracking system for determining a refined gaze point of a user is provided. The eye tracking system comprises a processor and a memory containing instructions executable by the processor. The eye tracking system is operable to determine a gaze convergence distance of the user and obtain a spatial representation of at least a portion of the user's field of view. The eye tracking system is further operable to obtain depth data for at least a portion of the spatial representation, and to determine saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data. The eye tracking system is further operable to determine a refined gaze point of the user based on the determined saliency data.
Embodiments of the eye tracking system according to the second aspect may for example comprise features corresponding to features of any embodiment of the method according to the first aspect.
According to a third aspect, a head-mounted device for determining a refined gaze point of a user is provided. The head-mounted device comprises a processor and a memory containing instructions executable by the processor. The head-mounted device is operable to determine a gaze convergence distance of the user and obtain a spatial representation of at least a portion of the user's field of view. The head-mounted device is further operable to obtain depth data for at least a portion of the spatial representation, and to determine saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data. The head-mounted device is further operable to determine a refined gaze point of the user based on the determined saliency data.
In some embodiments, the head mounted device further comprises one of a transparent display and a non-transparent display.
Embodiments of the head mounted device according to the third aspect may for example comprise features corresponding to features of any embodiment of the method according to the first aspect.
According to a fourth aspect, a computer program is provided. The computer program comprises instructions that, when executed by at least one processor, cause the at least one processor to determine a gaze convergence distance of the user and obtain a spatial representation of the user's field of view. The at least one processor is further caused to obtain depth data for at least a portion of the spatial representation, and to determine saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data. The at least one processor is further caused to determine a refined gaze point of the user based on the determined saliency data.
Embodiments of the computer program according to the fourth aspect may for example comprise features corresponding to features of any embodiment of the method according to the first aspect.
According to a fifth aspect, there is provided a carrier comprising a computer program according to the fourth aspect. The carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium.
Embodiments of the carrier according to the fifth aspect may for example comprise features corresponding to features of any embodiment of the method according to the first aspect.
Drawings
These and other aspects will now be described in the following illustrative and non-limiting detailed description with reference to the drawings.
Fig. 1 is a flow chart illustrating an embodiment of a method according to the present disclosure.
Fig. 2 includes an image illustrating the results of steps of an embodiment of a method according to the present disclosure.
Fig. 3 is a flow chart illustrating steps of a method according to the present disclosure.
Fig. 4 is a flow chart illustrating further steps of a method according to the present disclosure.
Fig. 5 is a flow chart illustrating yet further steps of a method according to the present disclosure.
Fig. 6 is a block diagram illustrating an embodiment of an eye tracking system according to the present disclosure.
All the figures are schematic, not necessarily to scale, and generally only parts that are necessary for elucidating the respective examples are shown, while other parts may be omitted or only suggested.
Detailed Description
Aspects of the present disclosure are described more fully hereinafter with reference to the accompanying drawings. However, the methods, eye tracking systems, head-mounted devices, computer programs, and carriers disclosed herein may be embodied in many different forms and should not be construed as limited to the aspects set forth herein. Throughout the drawings, like reference numerals refer to like elements.
The saliency data provides a measure, for attributes in the user's field of view as represented in the spatial representation, of the likelihood that these attributes will attract a person's visual attention. Some of the attributes most likely to attract human visual attention are, for example, color, motion, orientation and scale. Such saliency data may be determined using a saliency model. Saliency models generally predict what will attract human visual attention. Many saliency models are based on biologically plausible sets of features that simulate early visual processing, and determine saliency data for a region based on, for example, how different that region is from its surroundings.
In a spatial representation of a user's field of view, a saliency model may be used to identify different visual features that contribute to different degrees to the attentional selection of stimuli, and to generate saliency data indicating the saliency of different points in the spatial representation. A refined gaze point that is more likely to correspond to the point of interest at which the user is gazing may then be determined based on the determined saliency data.
When saliency data is determined by a saliency model for a spatial representation, for example in the form of a 2D image, each pixel of the image may be analyzed with respect to a certain visual property and assigned a saliency value for that property. Once the saliency has been calculated for each pixel, the difference in saliency between pixels is known. Optionally, salient pixels may then be grouped together into salient regions to simplify the feature results.
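For illustration only, a toy per-pixel saliency channel of this kind can be computed as a center-surround contrast; the function below is an assumed example, not the saliency model used in the disclosure:

```python
import numpy as np
from scipy import ndimage

def contrast_saliency(gray, center_sigma=2.0, surround_sigma=16.0):
    """Toy bottom-up saliency channel: absolute difference between a fine
    (center) and a coarse (surround) Gaussian blur of a grayscale image,
    normalized to [0, 1]. `gray` is a 2D float array."""
    center = ndimage.gaussian_filter(gray, center_sigma)
    surround = ndimage.gaussian_filter(gray, surround_sigma)
    sal = np.abs(center - surround)
    rng = sal.max() - sal.min()
    return (sal - sal.min()) / (rng + 1e-9)
```

Real saliency models combine many such feature channels (color, orientation, motion, scale) before normalization.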
In the case of using images as input to the model, prior art saliency models typically use a bottom-up approach to compute saliency. The present inventors have recognized that additional top-down determined information about the user from the eye tracking system may be used to achieve a more accurate estimate of the point of interest at which the user is gazing and/or to make the saliency model run faster. The top-down information provided by the eye tracker may be one or more determined gaze convergence distances of the user. Further top-down information provided by the eye tracker may be one or more determined gaze points of the user. Then, saliency data for the spatial representation is determined based on the top-down information.
Fig. 1 is a flow chart illustrating an embodiment of a method 100 for determining a refined gaze point of a user in an eye tracking system. In the method, a gaze convergence distance of the user is determined 110. The gaze convergence distance indicates the distance from the user's eyes to the location on which the user's gaze is focused. Any method for determining the convergence distance may be used, such as methods based on the gaze directions of the user's eyes and the intersection between these directions, methods based on time-of-flight measurements, and methods based on the interpupillary distance. The eye tracking system in which the method 100 is performed may for example be a head-mounted system, such as Augmented Reality (AR) glasses or Virtual Reality (VR) glasses, but may also be an eye tracking system that is not head-mounted but remote from the user. Further, the method comprises the step of obtaining 120 a spatial representation of at least a portion of the user's field of view. The spatial representation may for example be a digital image of at least a portion of the user's field of view captured by one or more cameras in, or remote from, the eye tracking system. Further, depth data for at least a portion of the spatial representation is obtained 130. The depth data of the spatial representation of the user's field of view indicates real or virtual distances from the user's eyes to points or portions of objects or features in the user's field of view. The depth data is associated with points or portions in the spatial representation corresponding to points or portions of objects or features of the user's field of view, respectively. Thus, a point or region in the spatial representation that represents a point on, or a part of, an object or feature in the user's field of view will have depth data indicating the distance from the user's eyes to that point or part of the object or feature. For example, the spatial representation may be two images (a stereo image pair) taken by two outward-facing cameras in a head-mounted device, separated by a lateral distance. The distance from the user's eyes to a point or portion of an object or feature in the user's field of view can then be determined by analyzing the two images. The depth data so determined may be linked to the points or portions of the two images corresponding to the respective points or portions of objects or features in the user's field of view. Other examples of spatial representations are also possible, such as 3D meshes based on time-of-flight measurements or simultaneous localization and mapping (SLAM). Based on the determined gaze convergence distance and the obtained depth data, saliency data of the spatial representation is determined 140. Finally, a refined gaze point of the user is determined 150 based on the determined saliency data.
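For the stereo-camera example, depth for a point in the images can be recovered from the disparity between the two views. The sketch below is an assumption of how this could be done with standard block matching, given rectified grayscale images, the focal length in pixels and the camera baseline; it is not prescribed by the disclosure:

```python
import numpy as np
import cv2

def depth_map_from_stereo(left_gray, right_gray, focal_px, baseline_m):
    """Approximate per-pixel depth (in meters) from a rectified stereo pair
    using OpenCV block matching. Pixels with invalid disparity are NaN."""
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    depth = np.full(disparity.shape, np.nan, dtype=np.float32)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]  # Z = f * B / d
    return depth
```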
Depending on the application, the depth data of the spatial representation of the user's field of view indicates the real or virtual distance from the user's eyes to the point or portion of the object or feature in the field of view. Where the spatial representation comprises a representation of a real world object or feature of at least a part of the user's field of view, the distances indicated by the depth data are typically real, i.e. they indicate the real distances from the user's eyes to the real world object or feature represented in the spatial representation. Where the spatial representation comprises a representation of a virtual object or feature of at least a portion of the user's field of view, the distances indicated by the depth data are typically virtual when viewed by the user, i.e. the distances indicate virtual distances from the user's eyes to the virtual object or feature represented in the spatial representation.
The determined gaze convergence distance and the obtained depth data may be used to improve the determination of the saliency data, so that the saliency data provides refined information from which a refined gaze point can be determined. For example, one or more regions in the spatial representation corresponding to portions of objects or features in the field of view whose distances from the user's eyes coincide with the determined gaze convergence distance may be identified. The identified one or more regions may be used to refine the saliency data by adding information indicating which regions of the spatial representation are more likely to correspond to the point of interest at which the user is gazing. Furthermore, the identified one or more regions of the spatial representation may be used as a form of filter before the saliency data of the spatial representation is determined. In this way, saliency data is determined only for those regions of the spatial representation that correspond to portions of objects or features in the field of view whose distance from the user's eyes coincides with the determined gaze convergence distance.
In particular, determining 140 saliency data of the spatial representation may comprise identifying 142 a first depth region of the spatial representation corresponding to obtained depth data within a predetermined range that includes the determined gaze convergence distance. The range may be set wider or narrower depending on, for example, the accuracy of the determined gaze convergence distance, the accuracy of the obtained depth data, and other factors. Saliency data for the first depth region of the spatial representation is then determined 144.
The identified first depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are within the predetermined range including the determined gaze convergence distance. The user is generally more likely to be looking at one of the objects or features within the predetermined range than at objects or features corresponding to regions of the spatial representation with depth data outside the predetermined range. The identification of the first depth region thus provides further information that may be used to identify the point of interest at which the user is gazing.
In addition to determining the first depth region, determining the saliency data of the spatial representation preferably further comprises identifying a second depth region of the spatial representation, corresponding to obtained depth data outside the predetermined range including the gaze convergence distance. In contrast to the first depth region, saliency data is not determined for the second depth region of the spatial representation. Instead, after identifying the second depth region, the method explicitly refrains from determining saliency data for the second depth region.
The identified second depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are outside the predetermined range including the determined gaze convergence distance. The user is generally less likely to be looking at one of the objects or features outside the predetermined range than at objects or features corresponding to regions of the spatial representation with depth data within the predetermined range. It is therefore beneficial to refrain from determining saliency data for the second depth region, to avoid processing that may be unnecessary or may even give misleading results, since the user is less likely to be looking at objects and/or features corresponding to regions of the spatial representation with depth data outside the predetermined range.
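A minimal sketch of this masking step (the function names and the margin parameter are illustrative assumptions): saliency is only kept where the obtained depth data falls within the predetermined range around the determined gaze convergence distance, and the second depth region is left out:

```python
import numpy as np

def depth_region_masks(depth_map, convergence_dist, margin=0.5):
    """Split the spatial representation into a first depth region (depth within
    [convergence_dist - margin, convergence_dist + margin]) and a second depth
    region (everything else, including pixels with unknown depth)."""
    near, far = convergence_dist - margin, convergence_dist + margin
    first = np.isfinite(depth_map) & (depth_map >= near) & (depth_map <= far)
    return first, ~first

def masked_saliency(image, depth_map, convergence_dist, saliency_fn, margin=0.5):
    """Compute saliency for the first depth region only; the second depth region
    is suppressed (kept at zero). For brevity the model is evaluated on the full
    image and then masked; a real system would restrict the computation itself
    to the first depth region to save processing, as described above."""
    first, _ = depth_region_masks(depth_map, convergence_dist, margin)
    saliency = np.zeros(depth_map.shape, dtype=np.float32)
    if first.any():
        saliency[first] = saliency_fn(image)[first]
    return saliency
```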
In general, since the point of interest at which the user is gazing normally changes over time, the method 100 is performed repeatedly to generate new refined gaze points over time. The method 100 thus generally further comprises determining a new gaze convergence distance of the user, determining new saliency data of the spatial representation based on the new gaze convergence distance, and determining a new refined gaze point of the user based on the new saliency data. A dynamically updated refined gaze point is thus determined based on the new gaze convergence distances determined over time. Several alternatives are envisaged, such as using only the currently determined new gaze convergence distance, or the mean of the gaze convergence distances determined over a predetermined period of time. Furthermore, if the user's field of view also changes over time, a new spatial representation is obtained, and new depth data for at least a portion of the new spatial representation is obtained.
Additional top-down information provided by the eye tracker may be one or more determined gaze points of the user. The method 100 may further include determining 132 a plurality of gaze points of the user, and identifying 134 a cropped region of the spatial representation based on the determined plurality of gaze points of the user. Typically, the plurality of gaze points are determined over a period of time. In general, the individual gaze points of the determined plurality of gaze points may differ from each other. This may be because the user looks at different points during the time period, or because of errors in the determined respective gaze points, i.e. the user may actually look at the same point during the time period while the determined respective gaze points still differ from each other. The cropped region preferably includes all of the determined plurality of gaze points. The size of the cropped region may depend on, for example, the accuracy of the determined gaze points, such that higher accuracy results in a smaller cropped region.
The user is generally more likely to be looking at points corresponding to the cropped region than at points corresponding to regions outside the cropped region. It is therefore beneficial to determine saliency data for the cropped region and to determine a refined gaze point based on the determined saliency data. Furthermore, since the user is more likely to be looking at a point corresponding to the cropped region than at a point outside it, the determination of saliency data for regions of the spatial representation outside the identified cropped region may be suppressed. Refraining from determining saliency data for regions outside the identified cropped region reduces the amount of processing required compared to determining saliency data for all regions of the spatial representation. In general, a cropped region can be made significantly smaller than the entire spatial representation while the probability that the user is looking at a point within the cropped region remains high. Suppressing the determination of saliency data for regions of the spatial representation outside the cropped region can therefore reduce the amount of processing significantly.
In addition to, or instead of, using the identified cropped region when determining the saliency data, the cropped region may be used when obtaining the depth data. For example, since the user is more likely to be looking at points corresponding to the cropped region than at points outside it, depth data may be obtained for the identified cropped region only, and need not be obtained for regions outside the cropped region. Saliency data within the cropped region may then be determined based only on the obtained depth data of the identified cropped region. The amount of processing required for obtaining depth data and determining saliency data may thus be reduced.
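One possible realization of the cropped region (assuming gaze points are available in pixel coordinates of the spatial representation) is a padded bounding box around the recent gaze points, where the padding reflects the gaze-estimation accuracy:

```python
import numpy as np

def cropped_region(gaze_points_px, image_shape, pad_px=40):
    """Bounding box (x0, y0, x1, y1) around a set of 2D gaze points, padded by
    pad_px and clipped to the image. Higher gaze accuracy allows a smaller pad."""
    pts = np.asarray(gaze_points_px)
    h, w = image_shape[:2]
    x0 = max(int(pts[:, 0].min()) - pad_px, 0)
    y0 = max(int(pts[:, 1].min()) - pad_px, 0)
    x1 = min(int(pts[:, 0].max()) + pad_px, w)
    y1 = min(int(pts[:, 1].max()) + pad_px, h)
    return x0, y0, x1, y1

# Depth data and saliency are then only computed for image[y0:y1, x0:x1].
```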
The method 100 may further comprise determining at least a second gaze convergence distance of the user. A first depth region of the spatial representation is then identified, corresponding to obtained depth data within a range determined based on the determined gaze convergence distance and the determined at least second gaze convergence distance. Saliency data of the first depth region of the spatial representation is then determined.
The identified first depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are within the range determined based on the determined gaze convergence distance and the determined at least second gaze convergence distance. The user is generally more likely to be looking at one of these objects or features within that range than at objects or features corresponding to regions of the spatial representation with depth data outside the range. The identification of the first depth region thus provides further information that may be used to identify the point of interest at which the user is gazing.
There are several alternatives for determining the range based on the determined gaze convergence distance and the determined at least second gaze convergence distance. In a first example, the maximum and minimum of the determined gaze convergence distance and the determined at least second gaze convergence distance may be determined. The maximum and minimum gaze convergence distances may then be used to identify a first depth region of the spatial representation corresponding to obtained depth data within a range that includes the determined maximum and minimum gaze convergence distances. The range may be set wider or narrower depending on, for example, the accuracy of the determined gaze convergence distances, the accuracy of the obtained depth data, and other factors. As an example, the range may be set from the determined minimum gaze convergence distance to the determined maximum gaze convergence distance. Saliency data of the first depth region of the spatial representation is then determined.
In the first example, the identified first depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are within the range including the determined maximum and minimum gaze convergence distances. The user is generally more likely to be looking at one of these objects or features within that range than at objects or features corresponding to regions of the spatial representation with depth data outside the range. The identification of the first depth region according to the first example thus provides further information that may be used to identify the point of interest at which the user is gazing.
In a second example, the mean of the determined gaze convergence distance of the user and the determined at least second gaze convergence distance may be determined. The mean gaze convergence distance may then be used to identify a first depth region of the spatial representation corresponding to obtained depth data within a range that includes the determined mean gaze convergence distance. The range may be set wider or narrower depending on, for example, the accuracy of the determined gaze convergence distances, the accuracy of the obtained depth data, and other factors. Saliency data of the first depth region of the spatial representation may then be determined.
In the second example, the identified first depth region of the spatial representation corresponds to objects or features in at least a portion of the user's field of view that are within the range including the determined mean gaze convergence distance. The user is generally more likely to be looking at one of these objects or features within that range than at objects or features corresponding to regions of the spatial representation with depth data outside the range. The identification of the first depth region according to the second example thus provides further information that may be used to identify the point of interest at which the user is gazing.
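Both alternatives can be sketched in a few lines; the slack and half-width parameters below are illustrative assumptions standing in for the accuracy-dependent widening described above:

```python
import numpy as np

def depth_range_minmax(convergence_distances, slack=0.5):
    """First example: range spanning the minimum and maximum determined
    convergence distances, optionally widened by a slack (in meters)."""
    d = np.asarray(convergence_distances, dtype=float)
    return float(d.min()) - slack, float(d.max()) + slack

def depth_range_mean(convergence_distances, half_width=0.5):
    """Second example: range centered on the mean determined convergence distance."""
    mean = float(np.mean(convergence_distances))
    return mean - half_width, mean + half_width
```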
From the determined saliency data, a refined gaze point of the user may be determined 150 as the point corresponding to the highest saliency. The determined refined gaze point will thus be the point most likely to attract visual attention in some respect. Combined with determining 144 saliency data only for the identified first depth region, i.e. the region of the spatial representation corresponding to obtained depth data within the predetermined range including the determined gaze convergence distance, the determined refined gaze point will be the point within the first depth region that is most likely to attract visual attention in some respect. This may further be combined with determining 132 a plurality of gaze points, identifying 134 a cropped region that includes the determined plurality of gaze points, and obtaining 130 depth data only for the cropped region. Furthermore, saliency data may be determined 146 only for the identified cropped region, optionally only for the identified depth region, or for the combination of the two, such that saliency data is generated only for depth regions within the cropped region. The determined refined gaze point will then be the point within the first depth region within the cropped region that is most likely to attract visual attention in some respect.
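Putting the pieces together, the refined gaze point can be read out as the most salient point inside the first depth region within the cropped region. The sketch below assumes the hypothetical helpers from the previous snippets and is not the patent's implementation:

```python
import numpy as np

def refined_gaze_point(saliency, first_depth_mask, crop_box):
    """Return (x, y) of the highest-saliency pixel inside both the cropped region
    and the first depth region, or None if that combined region is empty."""
    x0, y0, x1, y1 = crop_box
    region = np.zeros_like(first_depth_mask, dtype=bool)
    region[y0:y1, x0:x1] = first_depth_mask[y0:y1, x0:x1]
    if not region.any():
        return None
    masked = np.where(region, saliency, -np.inf)   # suppress everything outside
    y, x = np.unravel_index(np.argmax(masked), masked.shape)
    return int(x), int(y)
```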
Determining saliency data of the spatial representation may comprise determining first saliency data of the spatial representation based on visual saliency, determining second saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data, and determining the saliency data based on the first saliency data and the second saliency data. Visual saliency is the ability of an item, or of an item in an image, to attract visual attention (a bottom-up feature, i.e. its value is not known in advance but is inferred by an algorithm). In more detail, visual saliency is the distinct subjective perceptual quality that makes certain items in the world stand out from their surroundings and immediately draw our attention. Visual saliency may be based on color, contrast, shape, orientation, motion, or any other perceptual characteristic.
Once the saliency data for the different saliency features (such as visual saliency, and depth saliency calculated based on the determined gaze convergence distance and the obtained depth data) have been calculated, they may be normalized and combined to form a primary saliency result. Depth saliency is related to the depth at which the user is looking (a top-down feature, i.e. its value is known). Distances that match the determined convergence distance are considered more salient. When combining salient features, each feature may be weighted equally or differently, depending on which features are estimated to have the greatest impact on visual attention and/or which features have the highest maximum saliency value compared to the average or expected value. The combination of salient features may be determined by a winner-take-all mechanism. Optionally, the primary saliency result may be converted into a primary saliency map: a topographic representation of overall saliency. This is a useful step for a human observer, but is not necessary when the saliency result is used as input to a computer program. In the primary saliency result, a single spatial location should stand out as the most salient.
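As a sketch of this combination step (the Gaussian depth weighting and the equal default weights are assumptions, not requirements of the disclosure), the feature maps can be normalized, weighted, summed and resolved by a winner-take-all argmax:

```python
import numpy as np

def depth_saliency(depth_map, convergence_dist, sigma=0.5):
    """Top-down depth feature: pixels whose depth matches the determined
    convergence distance are most salient (Gaussian falloff, sigma in meters)."""
    d = np.nan_to_num(depth_map, nan=1e9)          # unknown depth -> zero saliency
    return np.exp(-0.5 * ((d - convergence_dist) / sigma) ** 2)

def combine_saliency(feature_maps, weights=None):
    """Normalize each feature map (e.g. visual saliency and depth saliency) to
    [0, 1], apply per-feature weights and sum them into a primary saliency
    result; the winner-take-all location is its argmax (row, col)."""
    if weights is None:
        weights = [1.0] * len(feature_maps)
    combined = np.zeros_like(feature_maps[0], dtype=np.float32)
    for fmap, w in zip(feature_maps, weights):
        rng = fmap.max() - fmap.min()
        combined += w * (fmap - fmap.min()) / (rng + 1e-9)
    winner = np.unravel_index(np.argmax(combined), combined.shape)
    return combined, winner
```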
Fig. 2 includes images illustrating results from steps of an embodiment of a method according to the present disclosure. A spatial representation of at least a portion of the user's field of view, in the form of an image 210, is an input to the method for determining a refined gaze point. A plurality of gaze points are determined in the image 210, and a cropped region including the plurality of determined gaze points is identified, as illustrated by image 215. Further, a stereoscopic image 220 of at least a portion of the user's field of view is obtained, a cropped region is identified as illustrated by image 225, and depth data for the cropped region is obtained based on the stereoscopic image 220, as illustrated in image 230. The gaze convergence distance of the user is then received (3.5 m in this example), and the first depth region is determined as the region of the cropped region corresponding to depth data within a range around the gaze convergence distance. In this example, the range is 3 m < x < 4 m, and the resulting first depth region is shown in image 235. The visual saliency of the cropped region shown in image 240 is determined to produce saliency data, shown in the form of a saliency map 245 of the cropped region. The saliency map 245 and the first depth region shown in image 235 are combined into a saliency map 250 for the first depth region within the cropped region. The refined gaze point is identified as the point with the highest saliency in the first depth region within the cropped region, and is shown as a black dot in image 255.
Fig. 3 is a flow chart illustrating steps of a method according to the present disclosure. In general, the flow chart illustrates steps related to identifying a cropped region over time based on newly determined gaze points (e.g., in connection with an embodiment of the method as illustrated in fig. 1). The identified cropped region is a cropped region that has previously been identified based on a plurality of previously determined gaze points. A new gaze point is then determined 310. In the case where the determined new gaze point is within the identified cropped region 320, the identified cropped region is not changed but continues to be used, and a further new gaze point is determined 310. An alternative way of viewing this is that the new cropped region is determined to be the same as the identified cropped region. In the case where the determined new gaze point is not within the identified cropped region (i.e. is outside the identified cropped region) 320, a new cropped region including the determined new gaze point is determined 330. In this case, the new cropped region will be different from the identified cropped region.
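The update rule of fig. 3 can be expressed compactly; the sketch below reuses the hypothetical cropped_region() helper from the earlier snippet and is an illustration, not the disclosed implementation:

```python
def update_cropped_region(crop_box, new_gaze_point, recent_gaze_points,
                          image_shape, pad_px=40):
    """Keep the identified cropped region while the new gaze point stays inside
    it; otherwise identify a new cropped region around the recent gaze points,
    including the new one."""
    x0, y0, x1, y1 = crop_box
    x, y = new_gaze_point
    if x0 <= x < x1 and y0 <= y < y1:
        return crop_box          # unchanged: previously determined saliency can be reused
    pts = list(recent_gaze_points) + [new_gaze_point]
    return cropped_region(pts, image_shape, pad_px)
```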
Fig. 4 is a flow chart illustrating further steps of a method according to the present disclosure. In general, the flow chart illustrates steps related to determining a refined gaze point over time based on newly determined gaze points (e.g., in connection with an embodiment of the method as illustrated in fig. 1). Consecutive gaze points of the user are determined 410 in consecutive time intervals, respectively. Further, for each time interval, it is determined 420 whether the user is fixating or saccading. In the case where the user is fixating 420, a refined gaze point is determined 430. In the case where the user is saccading 420, determination of a refined gaze point is suppressed. If the user is fixating, the user is probably looking at a certain point at that point in time, and a refined gaze point can therefore be determined correctly. If, on the other hand, the user is saccading, the user is less likely to be looking at a certain point at that point in time, and a refined gaze point is therefore less likely to be determined correctly. With reference to fig. 1, this may for example mean that the method 100 is only performed if it is determined that the user is fixating.
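A simple velocity-based gate could implement this per-interval decision; the threshold and the classification rule below are illustrative assumptions rather than the method's prescribed fixation detector:

```python
import numpy as np

def is_fixating(gaze_points_px, interval_s, velocity_threshold_px_s=100.0):
    """Crude fixation/saccade classifier for one time interval: the user is
    treated as fixating if the mean gaze-point velocity stays below a threshold."""
    pts = np.asarray(gaze_points_px, dtype=float)
    if len(pts) < 2:
        return True
    step = interval_s / (len(pts) - 1)
    speeds = np.linalg.norm(np.diff(pts, axis=0), axis=1) / step
    return bool(speeds.mean() < velocity_threshold_px_s)

# Only when is_fixating(...) is True is the refined gaze point computed;
# during saccades the costly saliency computation is suppressed.
```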
Fig. 5 is a flow chart illustrating yet further steps of a method according to the present disclosure. In general, the flow chart illustrates steps related to identifying a cropped region over time based on determined gaze points (e.g., in connection with an embodiment of the method as illustrated in fig. 1). The identified cropped region is a cropped region that has previously been identified based on a plurality of previously determined gaze points. Consecutive gaze points of the user are determined 510 in consecutive time intervals, respectively. Further, for each time interval, it is determined 520 whether the user is in smooth pursuit. In the case where the user is in smooth pursuit 520, a new cropped region is determined 530 based on the smooth pursuit. If smooth pursuit is determined, little additional processing is required to determine the consecutive cropped regions, since the cropped regions simply follow the smooth pursuit. For example, consecutive cropped regions may have the same shape and simply be translated relative to each other in the same direction and at the same speed as the user's smooth pursuit. In the case where the user is not in smooth pursuit 520, a new cropped region is determined that includes a plurality of gaze points including the determined new gaze point.
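During smooth pursuit the existing cropped region can simply be translated along with the gaze displacement, so that consecutive cropped regions follow the pursuit without being re-identified from scratch. A sketch under the same pixel-coordinate assumptions as the earlier snippets:

```python
import numpy as np

def follow_smooth_pursuit(crop_box, prev_gaze_point, new_gaze_point, image_shape):
    """Shift the existing cropped region by the gaze displacement so that
    consecutive cropped regions follow the smooth pursuit; the shape is unchanged."""
    dx = int(round(new_gaze_point[0] - prev_gaze_point[0]))
    dy = int(round(new_gaze_point[1] - prev_gaze_point[1]))
    h, w = image_shape[:2]
    x0, y0, x1, y1 = crop_box
    bw, bh = x1 - x0, y1 - y0
    x0 = int(np.clip(x0 + dx, 0, max(w - bw, 0)))
    y0 = int(np.clip(y0 + dy, 0, max(h - bh, 0)))
    return x0, y0, x0 + bw, y0 + bh
```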
In some embodiments, the spatial representation is an image, such as a 2D image of the real world, a 3D image of the real world, a 2D image of the virtual environment, or a 3D image of the virtual environment. The data may come from a photo sensor, a virtual 3D scene, or may come from another type of image sensor or a spatial sensor.
Fig. 1 includes some steps shown in boxes with solid borders and some steps shown in boxes with dashed borders. The steps included in the boxes with solid borders are the operations included in the broadest example embodiment. The steps included in the boxes with dashed borders are further operations that may be included in, may be part of, or may be taken in addition to, the operations of the broadest example embodiment. Not all steps need to be performed in sequence, and not all operations need to be performed. Furthermore, at least some of the steps may be performed in parallel.
The method for determining a refined gaze point of a user as disclosed herein, e.g. in relation to figs. 1 to 5, and the steps thereof, may be implemented in an eye tracking system 600 as illustrated in fig. 6, e.g. in a head-mounted device. The eye tracking system 600 comprises a processor 610 and a carrier 620 comprising computer-executable instructions 630, e.g. in the form of a computer program, which, when executed by the processor 610, cause the eye tracking system 600 to perform the method. The carrier 620 may for example be an electronic signal, an optical signal, a radio signal, a transitory computer-readable storage medium, or a non-transitory computer-readable storage medium.
The person skilled in the art realizes that the present invention by no means is limited to the embodiments described above. On the contrary, many modifications and variations are possible within the scope of the appended claims.
Additionally, variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The terminology used herein is for the purpose of describing particular aspects of the disclosure only and is not intended to be limiting of the invention. The division of tasks between functional units referred to in this disclosure does not necessarily correspond to a division into a plurality of physical units; rather, one physical component may have multiple functions, and one task may be performed in a distributed manner by several physical components that cooperate. A computer program may be stored/distributed on a suitable non-transitory medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the internet or other wired or wireless telecommunication systems. The mere fact that certain measures/features are recited in mutually different dependent claims does not indicate that a combination of these measures/features cannot be used to advantage. The method steps do not necessarily have to be performed in the order in which they appear in the claims or in the embodiments described herein, unless a certain order is explicitly described as required. Any reference signs in the claims shall not be construed as limiting the scope.

Claims (20)

1. A method in an eye tracking system for determining a refined gaze point of a user, the method comprising:
determining a gaze convergence distance of the user;
obtaining a spatial representation of at least a portion of the user's field of view;
obtaining depth data for at least a portion of the spatial representation;
determining saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data; and
determining a refined gaze point of the user based on the determined saliency data.
2. The method of claim 1, wherein determining the saliency data of the spatial representation comprises:
identifying a first depth region of the spatial representation corresponding to obtained depth data within a predetermined range including the determined gaze convergence distance; and
determining saliency data of the first depth region of the spatial representation.
3. The method of any of claims 1 and 2, wherein determining saliency data of the spatial representation comprises:
identifying a second depth region of the spatial representation corresponding to obtained depth data outside the predetermined range including the determined gaze convergence distance; and
suppressing determination of saliency data of the second depth region of the spatial representation.
4. The method of any of claims 1 to 3, wherein determining a refined gaze point comprises:
determining the refined gaze point of the user as the point corresponding to the highest saliency according to the determined saliency data.
5. The method of any of claims 1 to 4, wherein determining saliency data comprises:
determining first saliency data of the spatial representation based on visual saliency;
determining second saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data; and
determining saliency data based on the first saliency data and the second saliency data.
6. The method of any of claims 1 to 5, further comprising:
determining a new gaze convergence distance for the user;
determining new saliency data for the spatial representation based on the new gaze convergence distance; and
determining a new refined gaze point of the user based on the new saliency data.
7. The method of any of claims 1 to 6, further comprising:
determining a plurality of gaze points of the user; and
identifying a cropped region of the spatial representation based on the determined plurality of gaze points of the user.
8. The method of claim 7, wherein determining saliency data comprises:
determining saliency data of the identified cropped region of the spatial representation.
9. The method of any of claims 7 and 8, further comprising:
suppressing determination of saliency data of regions of the spatial representation that are outside of the identified cropped region of the spatial representation.
10. The method of any of claims 7 to 9, wherein obtaining depth data comprises:
obtaining depth data of the identified cropped region of the spatial representation.
11. The method of claim 2, further comprising:
determining at least a second gaze convergence distance for the user,
wherein the first depth region of the spatial representation is identified as corresponding to obtained depth data within a range based on the determined gaze convergence distance of the user and the determined at least second gaze convergence distance.
12. The method of any of claims 7 to 11, further comprising:
determining a new point of regard for the user;
identifying a new cropped region identical to the identified cropped region in a case where the determined new gaze point is within the identified cropped region; or
identifying a new cropped region that includes the determined new gaze point and that is different from the identified cropped region, in a case where the determined new gaze point is outside the identified cropped region.
13. The method of any of claims 7 to 12, wherein successive gaze points of the user are determined in successive time intervals, respectively, the method further comprising, for each time interval:
determining whether the user is fixating or saccading;
determining a refined gaze point in a case where the user is fixating; and
refraining from determining a refined gaze point in a case where the user is saccading.
14. The method of any of claims 7 to 12, wherein successive gaze points of the user are determined in successive time intervals, respectively, the method further comprising, for each time interval:
determining whether the user is performing smooth pursuit; and
in a case where the user is performing smooth pursuit, identifying successive cropped regions, each including the respective successive gaze point, such that the identified successive cropped regions follow the smooth pursuit.
15. The method of any of claims 1 to 14, wherein the spatial representation is an image.
16. An eye tracking system for determining a refined gaze point of a user, the eye tracking system comprising a processor and a memory, the memory containing instructions executable by the processor, the eye tracking system being operable, by executing the instructions, to perform operations comprising:
determining a gaze convergence distance of the user;
obtaining a spatial representation of at least a portion of the user's field of view;
obtaining depth data for at least a portion of the spatial representation;
determining saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data; and
determining a refined gaze point of the user based on the determined saliency data.
17. A head-mounted device for determining a refined gaze point of a user, the head-mounted device comprising a processor and a memory, the memory containing instructions executable by the processor, the head-mounted device being operable, by executing the instructions, to perform operations comprising:
determining a gaze convergence distance of the user;
obtaining a spatial representation of at least a portion of the user's field of view;
obtaining depth data for at least a portion of the spatial representation;
determining saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data; and
determining a refined gaze point of the user based on the determined saliency data.
18. The head-mounted device of claim 17, further comprising one of a transparent display and a non-transparent display.
19. A computer program comprising instructions that, when executed by at least one processor, cause the at least one processor to perform operations comprising:
determining a gaze convergence distance of the user;
obtaining a spatial representation of the user's field of view;
obtaining depth data for at least a portion of the spatial representation;
determining saliency data of the spatial representation based on the determined gaze convergence distance and the obtained depth data; and
determining a refined gaze point of the user based on the determined saliency data.
20. A carrier comprising the computer program of claim 19, wherein the carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium.
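By way of a non-authoritative illustration of one way the combination recited in claim 5 might be realised, the sketch below blends first saliency data (visual saliency) with second saliency data derived from how close each pixel's depth is to the determined gaze convergence distance. The Gaussian weighting, the `sigma` parameter, and the multiplicative combination are assumptions made for the example, not taken from the claims.

```python
import numpy as np

def combined_saliency(visual_saliency, depth, convergence_distance, sigma=0.3):
    """Sketch: combine visual saliency (first saliency data) with a depth-based
    term (second saliency data) that peaks at the gaze convergence distance."""
    depth_saliency = np.exp(-((depth - convergence_distance) ** 2) / (2.0 * sigma ** 2))
    return visual_saliency * depth_saliency  # element-wise combination
```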
CN202010500200.9A 2019-06-19 2020-06-04 Method and system for determining a fine point of regard for a user Pending CN112114659A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE1950758A SE543229C2 (en) 2019-06-19 2019-06-19 Method and system for determining a refined gaze point of a user
SE1950758-1 2019-06-19

Publications (1)

Publication Number Publication Date
CN112114659A (en) 2020-12-22

Family

ID=72916461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010500200.9A Pending CN112114659A (en) 2019-06-19 2020-06-04 Method and system for determining a fine point of regard for a user

Country Status (2)

Country Link
CN (1) CN112114659A (en)
SE (1) SE543229C2 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104937519A (en) * 2013-01-13 2015-09-23 高通股份有限公司 Apparatus and method for controlling an augmented reality device
CN103761519A (en) * 2013-12-20 2014-04-30 哈尔滨工业大学深圳研究生院 Non-contact sight-line tracking method based on self-adaptive calibration
US20160224106A1 (en) * 2015-02-03 2016-08-04 Kobo Incorporated Method and system for transitioning to private e-reading mode
CN109491508A (en) * 2018-11-27 2019-03-19 北京七鑫易维信息技术有限公司 The method and apparatus that object is watched in a kind of determination attentively

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967299A (en) * 2021-05-18 2021-06-15 北京每日优鲜电子商务有限公司 Image cropping method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
SE1950758A1 (en) 2020-10-27
SE543229C2 (en) 2020-10-27

Similar Documents

Publication Publication Date Title
US20210041945A1 (en) Machine learning based gaze estimation with confidence
CN111602140B (en) Method of analyzing objects in images recorded by a camera of a head-mounted device
EP3488382B1 (en) Method and system for monitoring the status of the driver of a vehicle
US9024972B1 (en) Augmented reality computing with inertial sensors
US8831337B2 (en) Method, system and computer program product for identifying locations of detected objects
WO2020076396A1 (en) Real-world anchor in a virtual-reality environment
US11244496B2 (en) Information processing device and information processing method
WO2017169273A1 (en) Information processing device, information processing method, and program
CN110456904B (en) Augmented reality glasses eye movement interaction method and system without calibration
US20200118349A1 (en) Information processing apparatus, information processing method, and program
JP5016959B2 (en) Visibility determination device
JP6221292B2 (en) Concentration determination program, concentration determination device, and concentration determination method
JPWO2015198592A1 (en) Information processing apparatus, information processing method, and information processing program
CN112114659A (en) Method and system for determining a fine point of regard for a user
US11726320B2 (en) Information processing apparatus, information processing method, and program
JP5448952B2 (en) Same person determination device, same person determination method, and same person determination program
US10650601B2 (en) Information processing device and information processing method
KR101817952B1 (en) See-through type head mounted display apparatus and method of controlling display depth thereof
KR102327578B1 (en) Apparatus and method for providing object and environment information using wearable device
KR20150040194A (en) Apparatus and method for displaying hologram using pupil track based on hybrid camera
CN115410242A (en) Sight estimation method and device
CN106657976A (en) Visual range extending method, visual range extending device and virtual reality glasses
NL2004878C2 (en) System and method for detecting a person's direction of interest, such as a person's gaze direction.
US11361511B2 (en) Method, mixed reality system and recording medium for detecting real-world light source in mixed reality
KR101649188B1 (en) Method of measuring 3d effect perception and apparatus for measuring 3d effect perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination