EP1032872A1 - User interface - Google Patents

User interface

Info

Publication number
EP1032872A1
Authority
EP
European Patent Office
Prior art keywords
user
gaze
training
neural network
eye
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP98954602A
Other languages
German (de)
English (en)
Inventor
Behnam Azvine
David Djian
Kwok Ching Tsui
Li-Qun Xu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
British Telecommunications PLC
Original Assignee
British Telecommunications PLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GBGB9724277.0A external-priority patent/GB9724277D0/en
Application filed by British Telecommunications PLC filed Critical British Telecommunications PLC
Priority to EP98954602A priority Critical patent/EP1032872A1/fr
Publication of EP1032872A1 publication Critical patent/EP1032872A1/fr
Withdrawn legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B3/00Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B3/10Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
    • A61B3/113Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for determining or recording eye movement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds
    • G06F18/2414Smoothing the distance, e.g. radial basis function networks [RBFN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction

Definitions

  • IPC International Patent Classification
  • WOLFE B ET AL: "A neural network approach to tracking eye position" (cited against claims 1, 13, 15)
  • POMPLUN M ET AL: "An artificial neural network for high precision eye movement tracking" (cited against claims 1, 13, 15, 16)
  • KI-94: Advances in Artificial Intelligence. Proceedings of the 18th German Annual Conference on Artificial Intelligence, Saarbrücken, Germany
  • the present invention relates to a user interface for a data or other software system, which monitors an eye of the user, such as a gaze tracker.
  • the interface finds particular but not exclusive application in a multimodal system.
  • Gaze tracking is a challenging and interesting task traversing several disciplines including machine vision, cognitive science and human computer interactions (Velichkovsky, B.M. and J.P. Hansen (1996): "New technological windows into mind: There is more in eyes and brains for human-computer interaction". Technical Report: Unit of Applied Cognitive Research, Dresden University of Technology, Germany).
  • the idea that a human subject's attention and interest in a certain object, reflected implicitly by eye movements, can be captured and learned by a machine, which can then act automatically on the subject's behalf, lends itself to many applications, including for instance video conferencing (Yang, J., L. Wu and A. Waibel (1996): "Focus of attention in video conferencing": Technical Report CMU-CS-96-150, School of Computer Science, Carnegie Mellon University, June 1996).
  • This idea can be used for instance for: • focusing on interesting objects and transmitting selected images of them through the communication networks,
  • gaze tracking uses the so-called pupil-center/corneal-reflection method (Cleveland, D. and N. Cleveland (1992): "Eyegaze eyetracking system", Proc. of 11th Monte-Carlo International Forum on New Images, Monte-Carlo, January 1992).
  • This uses controlled infra-red lighting to illuminate the eye, computing the distance between the pupil centre (the bright-eye effect) and the small very bright reflection off the surface of the eye's cornea to find the line of sight on the display screen, through geometric projections.
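The patent replaces this geometric computation with a learned mapping, but the classical idea can be sketched for contrast. The sketch below is a hedged illustration, not the patent's method: it assumes an affine model from the glint-to-pupil-centre vector to screen coordinates, calibrated from three known fixation targets (real systems typically use more calibration points and a higher-order polynomial).

```python
def det3(m):
    # Determinant of a 3x3 matrix (expansion along the first row).
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    # Cramer's rule for a 3x3 linear system a.x = b.
    d = det3(a)
    out = []
    for i in range(3):
        ai = [row[:] for row in a]
        for r in range(3):
            ai[r][i] = b[r]
        out.append(det3(ai) / d)
    return out

def calibrate_affine(vectors, targets):
    """Fit sx = a*dx + b*dy + c (and likewise sy) from three calibration pairs.

    vectors: (dx, dy) pupil-centre minus corneal-glint offsets
    targets: (sx, sy) known screen positions gazed at during calibration
    """
    a = [[dx, dy, 1.0] for (dx, dy) in vectors]
    px = solve3(a, [t[0] for t in targets])
    py = solve3(a, [t[1] for t in targets])
    return px, py

def gaze_point(params, vector):
    # Apply the fitted affine map to a new glint-pupil vector.
    (ax, bx, cx), (ay, by, cy) = params
    dx, dy = vector
    return (ax * dx + bx * dy + cx, ay * dx + by * dy + cy)
```

Given three calibration fixations, the affine parameters are recovered exactly; with more points a least-squares fit would be used instead.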
  • This kind of method normally involves a specialised high-speed/high-resolution camera, a controlled lighting source and electronic hardware, and is sometimes intrusive (Stampe, D.
  • a gaze tracker described in US-A-5 481 622 comprises a helmet worn by a user, with a camera mounted on the helmet to acquire a video image of the pupil.
  • a frame grabber is coupled to the camera to accept and convert analog data from the camera into digital pixel data.
  • a computer coupled to the frame grabber processes the digital pixel data to determine the position of the pupil.
  • a display screen is coupled to the computer and is mounted on the helmet. The system is calibrated by the user following a cursor on the display screen while the system measures the pupil position for known locations of the cursor.
  • the arrangement is complicated, requires special hardware (namely the helmet), and is not suited to everyday commercial use.
  • US-A-5 471 542 discloses a gaze tracker in which a video camera is provided on a personal computer in order to detect eye movement to perform functions similar to those achieved with a conventional hand-held mouse.
  • the present invention provides an improved arrangement which can be trained in order to take account of characteristics of the user and the user's preferences.
  • a user interface for use in making inputs to a data or communications system, responsive to the user's eye, comprising: i) a scanning device for capturing a quantised image of an eye; ii) a pupil image detector to detect a representation of the pupil of the eye in the quantised image; iii) a display for a plurality of visual targets; iv) a first learning device to relate at least one variable characteristic of said image of the eye to a selected one of said visual targets; and v) a second learning device, for relating external parameters apparent to a user of the system to parameters internal to the system.
  • the invention also provides in another aspect a method of training the user interface, which involves displaying training data on the display and training the first learning device to relate the variable characteristic of the image of the eye to the training data when the user gazes at the displayed training data.
  • the invention may also include training the second learning device to relate external parameters apparent to the user of the system to the internal parameters.
  • the internal parameters may be a function of fixation of the gaze of the user at a particular region on the display, and the external parameters may include the time taken to determine that a fixation has occurred and the positional accuracy thereof.
  • Embodiments of the present invention provide a real-time non-intrusive gaze tracking system; that is, a system which can tell where a user is looking, for instance on a computer screen.
  • the gaze tracking system can provide a vision component of a multimodal intelligent interface, particularly suited for resolving ambiguities and tracking contextual dialogue information.
  • it is also an effective piece of technology in its own right, leading to many potential applications in human-computer interactions where the ability to find human attention is of significant interest.
  • Embodiments of the present invention can provide a flexible, cheap, and adequately fast gaze tracker, using a standard videoconferencing camera sitting on a workstation and without resorting to any additional hardware and special lighting. These embodiments provide a neural network based, real-time, non- intrusive gaze tracker.
  • Data preprocessing means may be provided to enhance the output of the scanning device for use by the learning device.
  • the quantised image of the eye comprises an array of pixels with associated contrast information
  • such data preprocessing means may comprise means to normalise said array and to allocate to each individual pixel thereof a contrast value selected from a set of discrete contrast values.
  • the second learning device may be provided to relate parameters apparent to a user of the system to parameters internal to the system. This can be used to provide an adjustment capability such that the user can adjust parameters apparent in use of the system by inputting parameter information to the system, the system responding thereto by adjusting parameters internal to the system, in accordance with one or more learned relationships therebetween.
  • the invention provides a gaze tracker including means for determining when a user achieves a gaze fixation on a target, comprising learning means for learning a relationship between response time and accuracy for achieving a fixation, and means responsive to a user's preference concerning the relationship for controlling signification of the fixation.
  • the learning means may comprise a Bayesian net.
  • the invention also includes a user interface for a computer workstation usable for videoconferencing, the interface being configured for use in making inputs to a data or communications system in response to movements of the user's eye, comprising: i) a tv videoconferencing camera to be mounted on the workstation for capturing a quantised image of an eye; ii) a pupil image detector to detect a representation of the pupil of the eye in the quantised image; iii) a workstation display for a plurality of visual targets; and a neural net to relate at least one variable characteristic of said image of the eye to a selected one of said visual targets.
  • Figure 1 shows in schematic outline a neural network based gaze modelling/tracking system as an embodiment of the present invention, wherein Figure 1A illustrates the physical configuration and Figure 1B illustrates the system in terms of functional blocks;
  • Figure 2 shows a snapshot of a captured image of a user's head image for use in the system shown in Figure 1 ;
  • Figure 3 shows an example of a fully segmented eye image for use in the system shown in Figure 1 ;
  • Figure 4 shows a histogram of a segmented grayscale eye image
  • Figure 5 shows a transfer function for the normalisation of segmented eye image data
  • Figure 6 shows a normalised histogram version of the eye image of Figure 3;
  • Figure 7 shows a neural network architecture for use in the system of Figure 1 ;
  • Figure 8 shows a matrix of grids laid over a display screen for the collection of training data for use in a system according to Figure 1;
  • Figure 9 shows a Gaussian shaped output activation pattern corresponding to the vertical position of a gaze point
  • Figure 10 shows training errors versus number of training epochs in a typical training trial of the network shown in Figure 7;
  • Figure 11 shows learning and validation errors versus number of training epochs in a trial of the learning process for the network shown in Figure 7;
  • Figure 12 shows a histogram of the neural network's connection weights after 100 training epochs.
  • a goal of the gaze tracker is to determine where the user is looking, within the boundary of a computer display, by the appearance of eye images detected by a monitoring camera.
  • Figure 1A shows an example of the physical configuration of the gaze tracker.
  • a video camera 100 of the kind used for video conferencing is mounted on the display screen 101 of a computer workstation W in order to detect an eye of a user 102.
  • the workstation includes a conventional processor 103 and keyboard 104.
  • the task performed by the gaze tracker can be considered as a simulated forward-pass mapping process from a segmented eye image space to a predefined coordinate space, such as the grid matrix shown in Figure 8.
  • the mapping function between an eye appearance and its corresponding gaze point is still highly nonlinear and very complicated. This complexity arises from uncertainties and noise encountered at every processing/modelling stage. In particular, for instance, it can arise from errors in eye segmentation, the user's head movement, changes of the eye image depth relative to the camera, decorations around the eye, such as glasses or pencilled eyebrows, and changes in ambient lighting conditions.
  • embodiments of the present invention work in an office environment with the simple video camera 100 mounted on the right side of the display screen 101 of the workstation W, to monitor the user's face continuously. There is no specialised hardware, such as a lighting source, involved. The user sits comfortably at a distance of about 22 to 25 inches away from the screen. He is allowed to move his head freely while looking at the screen, but needs to keep it within the field of view of the camera, and to keep his face within a search window overlaid on the image.
  • the neural network based gaze tracker takes the output of an ordinary video camera 100, as might be used for video conferencing, and feeds it to the following functional processing blocks:
  • the switch 120 takes the output of the histogram normalisation unit 115 and feeds it to a learning node 130 when the modeller 125 is in training mode, or to a real time running node 135 when the modeller 125 has been trained and is to be used for detecting gaze co-ordinates.
  • the analogue video signal from a low cost video camera 100 is captured and digitised by the image acquisition and display unit 105 using the SunVideo Card, a video capture and compression card for Sun SPARCstations, and the XIL imaging foundation library developed by SunSoft. (The library is described in Pratt, W.K. (1997): "Developing Visual Applications: XIL - An Imaging Foundation Library", published by Sun Microsystems Press.)
  • XIL is a cross-platform C functions library that supports a range of video and imaging requirements. For the purpose of simplicity, only grayscale images are used in embodiments of the present invention described herein.
  • Colour images may however be used in enhancements of the system, as colours contain some unique features that are otherwise not available from grayscale images, as is shown in the recent work of Oliver and Pentland (1997), published in "LAFTER: Lips and face real time tracker", Proc. of Computer Vision and Pattern Recognition Conference, CVPR'97, June 1997, Puerto Rico.
  • the device image of the SunVideo Card, which is a 3-banded 8-bit image in YUV colorspace sized 768 x 576 pixels for PAL, is converted into an 8-bit grayscale image and scaled to a size of 192 pixels in width and 144 pixels in height.
  • the maximum capture rate of the SunVideo Card is 25 fps for PAL.
  • Figure 2 shows, as an example, a snapshot of the captured user's head image in an open-plan office environment under normal illumination. It shows a head image 200 of 192 by 144 pixels and a search window 205 within the head image of 100 by 60 pixels.
  • the objective of this processing is first to detect the small darkest region in the pupil of the eye, and then go on to segment the proper eye image.
  • the fixed search window 205 shown in Figure 2 is started in the centre part of the grabbed image 200. Inside this search window 205, the image 200 is iteratively thresholded, initially with a lower threshold T0.
  • the initial threshold T0 follows a threshold used for a gaze tracking task of a different purpose, published by Stiefelhagen, R., J. Yang, and A. Waibel (1996): "Gaze tracking for multimodal human-computer interaction", Proc. of IEEE Joint Symposia on Intelligence and Systems.
  • Morphological filters are used to remove noise or fill "gaps" of the generated binary image which is then searched pixel by pixel from top left to bottom right.
  • Individual objects comprising pixel clusters, are found and labelled using the 4-connectivity algorithm described in Jain, R., R. Kasturi, and B.G. Schunck (1995): “Machine Vision", published by McGraw-Hill and MIT Press.
  • a rectangular blob is used to represent each found object. Unless a reasonable number of objects of appropriate size are found, the threshold T0 is increased by a margin to T1, and the search process above is repeated.
  • the blobs thus obtained are first merged when appropriate, based on adjacency requirements. Heuristics are then used to filter the remaining blobs and identify the one most likely to be part of the pupil of the eye.
  • the heuristics which have been found useful include: 1) the number of detected pixels in each blob, roughly in the range (15, 100)
  • the found pupil is then expanded proportionally, based on local information, to the size of 40 by 15 pixels to contain the cornea and the whole eye socket.
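The iterative threshold-and-label search described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the starting threshold, step size and upper limit are assumed values, morphological filtering and blob merging are omitted, and only the blob-size heuristic (roughly 15 to 100 pixels) comes from the text.

```python
def label_blobs(binary):
    """Label connected pixel clusters using 4-connectivity (flood fill)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                stack, pixels = [(y, x)], []
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    # Visit the four edge-adjacent neighbours.
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                blobs.append(pixels)
    return blobs

def find_pupil(gray, t0=40, step=10, t_max=120, size_range=(15, 100)):
    """Raise the threshold until a dark blob of plausible pupil size appears."""
    t = t0
    while t <= t_max:
        binary = [[1 if p < t else 0 for p in row] for row in gray]
        candidates = [b for b in label_blobs(binary)
                      if size_range[0] <= len(b) <= size_range[1]]
        if candidates:
            return max(candidates, key=len), t  # largest plausible dark blob
        t += step
    return None, t
```

On a synthetic image with a single dark square on a bright background, the first threshold already isolates the pupil candidate.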
  • Figure 3 shows an example of the segmented right eye image 300. (The right eye only is used in the embodiment of the present invention described herein but either eye could of course be used.)
  • the eye image segmentation approach described above is not very sensitive to changes in lighting conditions as long as the face is well lit (sometimes assisted by an ordinary desk lamp). It is not generally affected by glasses worn by the user either although, occasionally, strong reflections off the glasses, and the appearance of the frames in the segmented eye images when the head moves away from the camera, are problematic. They contribute a burst of noise which disrupts features in the activation patterns (discussed below) sent to the purpose-built neural network modelling system 125.
  • Histogram normalisation. The segmented grayscale eye image, having a value between 0 and 255 for each pixel, is preprocessed by algorithms.
  • the preprocessing algorithms should be simple, reliable and computationally inexpensive. For instance, the algorithms might produce a value between -1.0 and 1.0 for each pixel.
  • a neural network can then effectively discover the features inherent in the data and learn to associate these features and their distributions with the correct gaze points on the screen. Through adequate training, the network can then be endowed with the power to generalise to data that was not previously present. That is, it can use data learned in respect of similar scenarios and generate its own gaze point data from input data not previously encountered.
  • the histogram normalisation block 115 takes as input the individual 40 x 15 segmented eye images.
  • the vertical axis gives the number of pixels and the horizontal axis shows the grey levels over the range between 0 and 255, partitioned into 64 bins.
  • the lower and upper bounds, t_l and t_u respectively, have grey scale values at 36 and 144.
  • the region between the bounds is linearised (see below).
  • Figure 5 shows the transfer function used for the normalisation procedure.
  • t_l and t_u are the lower and upper bounds of Figure 4.
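The normalisation can be sketched as a piecewise-linear transfer consistent with the bounds above: grey levels at or below t_l map to -1.0, those at or above t_u map to +1.0, and the region between is linearised. The exact shape of the transfer function in Figure 5 may differ in detail; this is an assumed reading.

```python
def normalise_pixel(g, t_l=36, t_u=144):
    """Map a grey level in [0, 255] to an activation in [-1.0, 1.0]."""
    if g <= t_l:
        return -1.0
    if g >= t_u:
        return 1.0
    # Linear stretch of the region between the bounds.
    return 2.0 * (g - t_l) / (t_u - t_l) - 1.0

def normalise_image(gray):
    """Apply the transfer function to every pixel of a segmented eye image."""
    return [[normalise_pixel(g) for g in row] for row in gray]
```

Mid-grey values land near zero, so the contrast between the eye socket, pupil and reflection spot is stretched across the full activation range.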
  • the activation patterns thus generated, with associated properly coded output gaze points (discussed below), are ready for use in training a neural network. In real time operation mode, these patterns are inputs to the system for gaze prediction.
  • Figure 6 shows the same eye image as in Figure 3 after histogram normalisation. It illustrates that the contrast between important features (the eye socket, pupil, the reflection spot) has been significantly enhanced.
  • the central part of the gaze tracking system is the neural network based modeller/tracker 125.
  • the neural network is implemented in software and runs on the workstation W, although hardware net implementations can be used, e.g. optical neural nets, as known in the art.
  • a suitable neural network is shown in Figure 7. This is a three-layer feedforward neural network with 600 input retina units 700, each receiving a normalised activation value from the segmented 40 x 15 eye image.
  • Figure 8 shows a matrix of 50 x 40 grids laid over a display screen 800 displayed on the display 101 of Figure 1A, for guiding the movements of a moving cursor and indicating the gaze position, in order to collect the training data of eye image/gaze co-ordinate pairs for the neural network described.
  • this can correspond to dividing the display screen 800 uniformly into a rectilinear matrix of 50 by 40 grids, each sized about 23 by 22 pixels on the display.
  • the resolution of the grid matrix is 50 by 40.
  • if the viewing objects in an application are to appear in only part of the display screen 800, it suffices to collect the data (discussed below in the "Training Data Collection" Section) from this part of the screen and use it for training the model.
  • the co-ordinates of an arbitrary gaze point 805 in this grid matrix can be a value between 0 and 49 along the "x" direction and between 0 and 39 along the "y” direction, with the origin being in the top left corner (0,0) of the screen.
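The grid coordinate system can be illustrated with a small helper. The screen resolution of 1152 x 900 is an assumption (it is consistent with the stated cell size of about 23 by 22 pixels, but the text does not give it); only the 50 x 40 grid and the top-left origin (0,0) come from the text.

```python
def pixel_to_grid(px, py, screen_w=1152, screen_h=900, nx=50, ny=40):
    """Map a display pixel to (gx, gy) grid coordinates, origin top-left.

    gx ranges over 0..nx-1 along the x direction, gy over 0..ny-1 along y,
    matching the 50 x 40 grid matrix laid over the screen.
    """
    gx = min(px * nx // screen_w, nx - 1)
    gy = min(py * ny // screen_h, ny - 1)
    return gx, gy
```

With these assumed dimensions each grid cell covers about 23 by 22 pixels, so a gaze point in the centre of the screen falls in cell (25, 20).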
  • the mapping function simulation task of embodiments of the present invention demands a gradual change in output representations when the data examples (eye appearances) in input data space exhibit slight differences. This preservation of topological relationships after data transformation (mapping) is the main concern in selecting an output coding mechanism.
  • the Gaussian function used is of the form G(n - n0) = exp(-(n - n0)^2 / (2 sigma^2)).
  • the Gaussian shaped activation pattern G(n - n0) is moved across the output units for the x-coordinate by changing n0 from 0 to 49.
  • a least-square fitting procedure is performed at each unit position to try to match the actual output activation pattern.
  • the peak of the Gaussian shaped pattern that achieves the smallest error determines the horizontal position of the gaze point.
  • the vertical position of the gaze point across the 40 output units for the y-co-ordinates can be found.
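The Gaussian output coding and its sliding least-squares decoding can be sketched as below. The width sigma is an assumed value; the sliding-template idea follows the text, but this particular implementation is illustrative.

```python
import math

def gaussian_pattern(n_units, n0, sigma=1.5):
    """Target activation pattern: a Gaussian bump centred on coordinate n0."""
    return [math.exp(-((n - n0) ** 2) / (2.0 * sigma ** 2))
            for n in range(n_units)]

def decode_coordinate(activations, sigma=1.5):
    """Slide the Gaussian template across every position; the centre giving
    the smallest squared error against the actual output activations is the
    decoded gaze coordinate."""
    n_units = len(activations)
    best_n0, best_err = 0, float("inf")
    for n0 in range(n_units):
        template = gaussian_pattern(n_units, n0, sigma)
        err = sum((a - t) ** 2 for a, t in zip(activations, template))
        if err < best_err:
            best_n0, best_err = n0, err
    return best_n0
```

The same routine serves both axes: 50 output units for the x coordinate and 40 for the y coordinate, decoded independently.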
  • This section describes a means of collecting correct training data, the process of training a large neural network, the analysis of the significance of the learned connection weights, and the features of the real-time gaze tracking system.
  • the user is asked to visually track a blob cursor which travels along the grid matrix on the computer screen in one of two predefined paths, producing horizontal or vertical zig-zag movements.
  • the travelling speed of the cursor can be adjusted to accommodate the acuity of the user's eye reaction time so that s(he) can faithfully and comfortably follow the moving cursor.
  • the size of the blob or the resolution of the grid matrix (for indicating the position of the cursor) on the screen depends on the requirements of an envisaged application and the trade-off between running speed, system complexity and prediction accuracy. In the training phase, the smaller the blob is, the more images need to be collected in one session for the cursor to sweep through the entire screen grid matrix.
  • the neural network (described above) would have to make provisions for more output units to encode all the possible cursor positions.
  • One session of training image collection takes between 2 and 3 minutes.
  • the cursor movement can be paused and resumed at a click of a mouse button.
  • the user needs to satisfy some constraints for the current system to function properly.
  • the user can selectively download certain parts, or all of, the valid paired eye images/co-ordinates.
  • the algorithm can detect automatically those unwanted images when eye blinks have occurred, and report a failure in capturing the eye image at that particular time and its associated gaze point. That is, the algorithm comprises a set of heuristics such that it can for instance learn a normal range of values and report failure when values fall outside the range.
  • the user can play back the downloaded image sequence at a selected speed and, if desired, visually examine and identify those noisy examples.
  • the AGD measures the average difference between the current gaze predictions and the desired gaze positions for the training set, excluding a few wild cards due to the user's unexpected eye movements.
  • this first strategy consists of a fast search phase followed by a fine tuning phase.
  • the network is updated in terms of its weighting functions once for every few tens of training examples (typically between 10 and 30), which are drawn at random from the entire training data set. (It was found repeatedly that a training process taking examples in their original collection order would always fail to reach a satisfactory convergence state of the neural network, perhaps due to the network's catastrophic 'forgetting'.)
  • a small offset of about 0.05 is added to the derivative of each unit's transfer function to speed up the learning process. This is especially useful when a unit's output approaches the saturation limits, -1 or 1, of the hyperbolic tangent function. In addition, random Gaussian noise corresponding to 5% of the size of each retina input is added to each input training pattern. This is particularly effective for overcoming the over-fitting problem in training a neural network and achieving better generalisation performance. In so doing, the neural network, despite having over ten thousand weights, would always approach a quite satisfactory solution after between 50 and 80 training epochs.
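The two training tricks in this paragraph can be sketched as follows. Reading "5% of the size of each retina input" as a noise standard deviation of 5% of the [-1, 1] input range is an interpretation on our part, not a statement from the text.

```python
import random

def tanh_deriv_with_offset(y, offset=0.05):
    """Derivative of tanh written in terms of its output y = tanh(x),
    plus a small constant offset so learning does not stall when a unit
    saturates near -1 or +1 (where the true derivative vanishes)."""
    return (1.0 - y * y) + offset

def noisy_input(pattern, noise_frac=0.05, input_range=2.0, rng=None):
    """Add zero-mean Gaussian noise to each retina input to reduce
    over-fitting; sigma is noise_frac of the assumed input range."""
    rng = rng or random.Random(0)
    sigma = noise_frac * input_range
    return [x + rng.gauss(0.0, sigma) for x in pattern]
```

At saturation the gradient contribution becomes the offset itself rather than zero, which is what keeps weight updates moving.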
  • the network weights are updated once after presenting the whole training set.
  • the nominal learning rate to use is proportionally much smaller than in the first phase, and a slightly smaller magnitude of Gaussian noise, around 3% of each retina input, is used. After about 30 epochs, the system can settle down to a very robust solution.
  • Figure 10 shows a trial learning result for user BA.
  • the original data were collected in two horizontal and two vertical cursor running sessions, respectively.
  • the cursor is confined to travel only within the top-right 40 x 30 area of the entire 50 x 40 grid matrix, the part of the screen of interest to the application. So each running session can provide at most 1200 data examples.
  • the total number of examples successfully collected for the four sessions is 3906, and the number of training examples used in obtaining the learning result of Figure 10 is 3000.
  • the remaining 906 examples were used to examine the learning performance and to find the most appropriate stopping point.
  • the weights saved at the 60th epoch of training phase 1 are loaded for further refinement in phase 2. It can be seen that this overall strategy leads to a rapid reduction in training error, which then settles down to a stable status allowing no further overfitting of the neural network.
  • An independent validation set is set apart by randomly choosing from the original examples collected.
  • the validation data set is used to monitor the progress of the learning process in order to prevent the network from overfitting the training data.
  • the learning process stops when the validation error starts to pick up or saturate after a previous general downwards trend.
  • the weight set obtained at this point will be used to drive the real-time gaze tracking system. In practice, however, several trials with different initial weights are needed to find out the weight set with the smallest validation error, which is expected to provide better generalisation performance.
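The stopping rule described here is early stopping on a held-out validation set. A generic sketch follows; the patience parameter and the exact test for the error "starting to pick up or saturate" are our assumptions.

```python
def train_with_early_stopping(train_epoch, validation_error,
                              max_epochs=100, patience=5):
    """Run training epochs, stop once the validation error has failed to
    improve for `patience` consecutive epochs, and report the best epoch.

    train_epoch(e):      performs one pass over the training data
    validation_error(e): returns the error on the held-out set after epoch e
    """
    best_err, best_epoch, since_best = float("inf"), -1, 0
    history = []
    for epoch in range(max_epochs):
        train_epoch(epoch)
        err = validation_error(epoch)
        history.append(err)
        if err < best_err:
            best_err, best_epoch, since_best = err, epoch, 0
        else:
            since_best += 1
            if since_best >= patience:  # validation error picking up
                break
    return best_epoch, best_err, history
```

In practice the loop would also save the weight set at each new best epoch, since that set drives the real-time tracker.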
  • Figure 11 plots the curves for learning and its corresponding validation errors versus the number of training epochs in one trial of simulations of the neural network. (The ripples on the curves are due to the training scheme of using randomly chosen examples within each epoch, and to updating the weights once for every ten examples instead of after the whole training set of 2,800 examples.) Following the Section on data collection above, 1606 and 1635 valid paired examples were collected, respectively, from observing the horizontal and vertical zig-zag cursor movement paths, from which 206 and 235 randomly chosen examples, respectively, were set apart for validation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Ophthalmology & Optometry (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Image Analysis (AREA)

Abstract

The present invention concerns a gaze tracker for a multimodal user interface. The tracker uses a standard videoconferencing camera fitted to a workstation to determine where on a screen the user is looking. The gaze tracker uses a video camera (100) giving a quantised image of the user's eye. Once the pupil has been detected, a neural network (125) is used to train the gaze tracker to detect the direction of gaze. A preprocessor (115) can be used to enhance the input to the neural network. A Bayesian net (140) learns the relationships between response time and accuracy, so that the output of the neural network can take account of a user's externally defined preferences.
EP98954602A 1997-11-17 1998-11-16 Interface utilisateur Withdrawn EP1032872A1 (fr)
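As a rough illustration of the pipeline the abstract describes (quantised eye image → pupil detection → preprocessing → neural-network gaze estimate), here is a schematic sketch. Every function, window size, and heuristic below is an illustrative assumption, not the patented implementation; the Bayesian-network stage that trades response time against accuracy is omitted.

```python
import numpy as np

def detect_pupil(frame):
    """Locate the darkest pixel of the quantised eye image (toy heuristic)."""
    iy, ix = np.unravel_index(np.argmin(frame), frame.shape)
    return iy, ix

def preprocess(frame, centre, size=8):
    """Crop a window around the pupil to enhance the neural-network input."""
    y, x = centre
    h, w = frame.shape
    y0 = min(max(y - size // 2, 0), h - size)
    x0 = min(max(x - size // 2, 0), w - size)
    return frame[y0:y0 + size, x0:x0 + size].ravel() / 255.0

def gaze_from_features(features, weights):
    """Stand-in for the trained neural network: map features to (sx, sy)."""
    return features @ weights

# Toy usage: a 32x32 synthetic eye image with a dark "pupil" at (12, 20).
frame = np.full((32, 32), 200.0)
frame[12, 20] = 10.0
centre = detect_pupil(frame)
features = preprocess(frame, centre)
weights = np.zeros((features.size, 2))  # untrained placeholder network
gaze = gaze_from_features(features, weights)
```

In the patent's arrangement the network weights would be learned from examples of eye images paired with known screen positions, rather than set by hand as here.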

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP98954602A EP1032872A1 (fr) 1997-11-17 1998-11-16 User interface

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
GB9724277 1997-11-17
GBGB9724277.0A GB9724277D0 (en) 1997-11-17 1997-11-17 User interface
EP98306261 1998-08-05
EP98306261 1998-08-05
PCT/GB1998/003441 WO1999026126A1 (fr) 1997-11-17 1998-11-16 User interface
EP98954602A EP1032872A1 (fr) 1997-11-17 1998-11-16 User interface

Publications (1)

Publication Number Publication Date
EP1032872A1 true EP1032872A1 (fr) 2000-09-06

Family

ID=26151382

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98954602A Withdrawn EP1032872A1 (fr) 1997-11-17 1998-11-16 Interface utilisateur

Country Status (3)

Country Link
EP (1) EP1032872A1 (fr)
AU (1) AU1165799A (fr)
WO (1) WO1999026126A1 (fr)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4469476B2 (ja) * 2000-08-09 2010-05-26 Panasonic Corporation Eye position detection method and eye position detection device
US6659611B2 (en) 2001-12-28 2003-12-09 International Business Machines Corporation System and method for eye gaze tracking using corneal image mapping
US7362885B2 (en) * 2004-04-20 2008-04-22 Delphi Technologies, Inc. Object tracking and eye state identification method
US10460346B2 (en) 2005-08-04 2019-10-29 Signify Holding B.V. Apparatus for monitoring a person having an interest to an object, and method thereof
WO2009154484A2 (fr) * 2008-06-20 2009-12-23 Business Intelligence Solutions Safe B.V. Methods, devices and systems for data visualization and related applications
US10423830B2 (en) * 2016-04-22 2019-09-24 Intel Corporation Eye contact correction in real time using neural network based machine learning
US10664949B2 (en) 2016-04-22 2020-05-26 Intel Corporation Eye contact correction in real time using machine learning
WO2017208227A1 (fr) 2016-05-29 2017-12-07 Nova-Sight Ltd. Display system and method
US10846877B2 (en) 2016-06-28 2020-11-24 Google Llc Eye gaze tracking using neural networks
US10127680B2 (en) 2016-06-28 2018-11-13 Google Llc Eye gaze tracking using neural networks
US11132543B2 (en) 2016-12-28 2021-09-28 Nvidia Corporation Unconstrained appearance-based gaze estimation
IT201800002114A1 (it) * 2018-01-29 2019-07-29 Univ Degli Studi Roma La Sapienza Method, aimed at patients with motor disabilities, for selecting a command by means of a graphical interface, and related system and computer program product
US11393251B2 (en) 2018-02-09 2022-07-19 Pupil Labs Gmbh Devices, systems and methods for predicting gaze-related parameters
US11556741B2 (en) 2018-02-09 2023-01-17 Pupil Labs Gmbh Devices, systems and methods for predicting gaze-related parameters using a neural network
US11194161B2 (en) 2018-02-09 2021-12-07 Pupil Labs Gmbh Devices, systems and methods for predicting gaze-related parameters
US10866635B2 (en) 2018-09-13 2020-12-15 Toyota Research Institute, Inc. Systems and methods for capturing training data for a gaze estimation model
US11537202B2 (en) 2019-01-16 2022-12-27 Pupil Labs Gmbh Methods for generating calibration data for head-wearable devices and eye tracking system
EP3979896A1 (fr) 2019-06-05 2022-04-13 Pupil Labs GmbH Devices, systems and methods for predicting gaze-related parameters
CN110516282B (zh) * 2019-07-03 2022-11-15 Hangzhou Dianzi University An indium phosphide transistor modelling method based on Bayesian statistics
US11340701B2 (en) 2019-12-16 2022-05-24 Nvidia Corporation Gaze determination using glare as input
CN112559099B (zh) * 2020-12-04 2024-02-27 Beijing National New Energy Vehicle Technology Innovation Center Co., Ltd. Remote image display method, apparatus, system and storage medium based on user behaviour

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5471542A (en) * 1993-09-27 1995-11-28 Ragland; Richard R. Point-of-gaze tracker
US5481622A (en) * 1994-03-01 1996-01-02 Rensselaer Polytechnic Institute Eye tracking apparatus and method employing grayscale threshold values
US5649061A (en) * 1995-05-11 1997-07-15 The United States Of America As Represented By The Secretary Of The Army Device and method for estimating a mental decision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9926126A1 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10515474B2 (en) 2017-01-19 2019-12-24 Mindmaze Holding Sa System, method and apparatus for detecting facial expression in a virtual reality system
US10521014B2 (en) 2017-01-19 2019-12-31 Mindmaze Holding Sa Systems, methods, apparatuses and devices for detecting facial expression and for tracking movement and location in at least one of a virtual and augmented reality system
US10943100B2 (en) 2017-01-19 2021-03-09 Mindmaze Holding Sa Systems, methods, devices and apparatuses for detecting facial expression
US11195316B2 (en) 2017-01-19 2021-12-07 Mindmaze Holding Sa System, method and apparatus for detecting facial expression in a virtual reality system
US11495053B2 (en) 2017-01-19 2022-11-08 Mindmaze Group Sa Systems, methods, devices and apparatuses for detecting facial expression
US11709548B2 (en) 2017-01-19 2023-07-25 Mindmaze Group Sa Systems, methods, devices and apparatuses for detecting facial expression
US11989340B2 (en) 2017-01-19 2024-05-21 Mindmaze Group Sa Systems, methods, apparatuses and devices for detecting facial expression and for tracking movement and location in at least one of a virtual and augmented reality system
US11991344B2 (en) 2017-02-07 2024-05-21 Mindmaze Group Sa Systems, methods and apparatuses for stereo vision and tracking
US11328533B1 (en) 2018-01-09 2022-05-10 Mindmaze Holding Sa System, method and apparatus for detecting facial expression for motion capture

Also Published As

Publication number Publication date
WO1999026126A1 (fr) 1999-05-27
AU1165799A (en) 1999-06-07

Similar Documents

Publication Publication Date Title
WO1999026126A1 (fr) User interface
CN112970056B (zh) Human-computer interface using high-speed and accurate tracking of user interactions
Xu et al. A Novel Approach to Real-time Non-intrusive Gaze Finding.
Grauman et al. Communication via eye blinks and eyebrow raises: Video-based human-computer interfaces
US5360971A (en) Apparatus and method for eye tracking interface
US5517021A (en) Apparatus and method for eye tracking interface
JP3361980B2 (ja) Gaze detection apparatus and method
US20160210503A1 (en) Real time eye tracking for human computer interaction
US20020039111A1 (en) Automated visual tracking for computer access
Dias et al. Gaze estimation for assisted living environments
JP5225870B2 (ja) Emotion analysis apparatus
CN116909408B (zh) A content interaction method based on MR smart glasses
CN115482574A (zh) Deep-learning-based screen gaze point estimation method, apparatus, medium and device
Sivasangari et al. Eyeball based cursor movement control
Epstein et al. Using kernels for a video-based mouse-replacement interface
Akashi et al. Using genetic algorithm for eye detection and tracking in video sequence
Mimica et al. A computer vision framework for eye gaze tracking
Bature et al. Boosted gaze gesture recognition using underlying head orientation sequence
Dhingra et al. Eye gaze tracking for detecting non-verbal communication in meeting environments
Murauer et al. Natural pursuits for eye tracker calibration
Malladi et al. EG-SNIK: a free viewing egocentric gaze dataset and its applications
Pomerleau et al. Non-intrusive gaze tracking using artificial neural networks
Sukawati et al. A survey of signal processing filters, calibration, and interactive applications based on smooth pursuit eye movement
Ejnestrand et al. Object Tracking based on Eye Tracking Data: A comparison with a state-of-the-art video tracker
Yu et al. Attentional object spotting by integrating multimodal input

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20000508

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB

17Q First examination report despatched

Effective date: 20010605

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

RTI1 Title (correction)

Free format text: USER INTERFACE FOR A SOFTWARE SYSTEM MONITORING THE EYES OF A USER

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20020806