WO2019167053A1 - A system and method for optimizing gaze controlled gui for human computer interface - Google Patents


Info

Publication number
WO2019167053A1
PCT/IN2018/000035
Authority: WIPO (PCT)
Prior art keywords: input, gaze, command, selection, eye
Application number
PCT/IN2018/000035
Other languages: French (fr)
Inventor
Yogesh Kumar Meena
Girijesh PRASAD
Original Assignee
Yogesh Kumar Meena
Priority to IN201811007266
Application filed by Yogesh Kumar Meena filed Critical Yogesh Kumar Meena
Publication of WO2019167053A1 publication Critical patent/WO2019167053A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233Character input methods
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/038Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0381Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer

Abstract

Virtual keyboard applications and alternative communication devices provide new means of communication to assist disabled people. To date, virtual keyboard optimization schemes based on script-specific information along with a multimodal input access facility are limited. In this work, we propose a novel method for optimizing the position of the displayed items for gaze-controlled tree-based menu selection systems by considering a combination of letter frequency and command selection time. The optimized graphical user interface (GUI) layout has been designed for a Hindi language virtual keyboard based on a menu wherein 10 commands provide access to type 88 different characters along with additional text editing commands. The system can be controlled in two different modes: eye-tracking alone and eye-tracking with an access soft-switch. Five different keyboard layouts have been presented and evaluated with ten healthy participants. Further, the two best performing keyboard layouts have been evaluated with ten stroke patients. The overall performance analysis demonstrated significantly superior typing performance, high usability (87% SUS score), and low workload (NASA-TLX score of 17) for the letter frequency and time-based organization with script-specific arrangement design. This work represents the first optimized multimodal Hindi virtual keyboard, which can be extended to other languages.

Description

Title of Invention
A SYSTEM AND METHOD FOR OPTIMIZING GAZE CONTROLLED GUI FOR HUMAN COMPUTER INTERFACE
Field of Invention
The present invention generally relates to a system and method which pertains to an electronic interface between humans and computers, the said system employing an eye-tracking mechanism along with the use of a multimodality of input devices. This invention intends to propose a novel system for optimizing the position of the displayed items for gaze-controlled tree-based menu selection systems by considering a combination of letter frequency and command selection time. The optimized graphical user interface (GUI) layout has been designed for a Hindi language virtual keyboard based on a menu wherein 10 commands provide access to type 88 different characters along with additional text editing commands, and can further be extended to other languages. The system can be controlled in two different modes: eye-tracking alone and eye-tracking with an access soft-switch, which can further include a computer mouse, touch screen, eye-tracking device, surface electromyography devices, etc.
Background of the invention
Over 20 million people suffer from stroke annually and, due to loss of motor and/or speech skills, they face great difficulty in communicating effectively. Augmentative and alternative communication (AAC) based methods and technologies have been designed to provide new effective means of communication to assist these people. In general, AAC systems include a large set of gestural, alphabetical, and iconic communications. This area of research is growing rapidly to meet the needs of severely speech and motor impaired individuals, such as aphasia and tetraplegia patients. An AAC system can be divided into two distinct components: input device(s) and communication software. The input devices are used to capture any type of voluntary intent of patients (e.g., electrodes for electromyography (EMG) and/or electrooculography (EOG) signals). One of the most popular input devices is the eye-tracker, which captures the eye gaze of the user in real-time. Eye-gaze based AAC systems have previously been developed for patients suffering from locked-in syndrome and amyotrophic lateral sclerosis. In addition to eye-gaze, brain-computer interfaces (BCIs) have been implemented for communication by using non-invasive electroencephalography (EEG) recordings. Moreover, dedicated communication software, e.g. virtual keyboards and complex communication spreadsheets, has been proposed to generate explicit information regarding the user's intent by analyzing the data captured by input devices. A conventional virtual keyboard includes a specific language-based keyboard layout. This type of keyboard is one of the most primitive mechanisms for alternatively entering text. Some studies indicate that virtual keyboards show lower typing performance than physical keyboards, even when users have no disability. The main reasons behind this drop in performance can be the small size of the virtual keys, the absence of tactile feedback, and the occlusion of virtual keys by fingers.
In addition, the typing rate is affected by word completion and word prediction methods. The usability and efficiency of these systems are therefore highly dependent on the design of their graphical user interface.
The design of an efficient virtual keyboard is mainly driven by its corresponding language script. Currently, the majority of available virtual keyboards are targeted at the English language, and likewise the optimization schemes are developed in accordance with this language only. However, changes in the nature and size of the alphabet that accompany a change in language may render the techniques developed for English ineffective or even inapplicable to other languages. There is a lack of virtual keyboard applications specific to the Hindi language (490 million speakers), which is the official language of India. The disability ratio is very high in developing countries, such as India, where more than 30 million people suffer from speech and motor disabilities (reported by Census, 2001). Thus, there is a pressing need for the development of efficient Hindi language based virtual keyboards.
The Hindi language script contains 11 vowels, 33 consonants, 17 matras (i.e. diacritics) including the halant (i.e. killer stroke), and 3 other complex characters (i.e., combinations of two consonants). Hence, typing in Hindi using a regular QWERTY keyboard is not an easy task, as significant training is required to compose text. In previous studies, the QWERTY keyboard has been optimized to design virtual keyboard applications in the Bengali and Meitei languages. An optimal approach has been presented for designing virtual keyboards for text composition in Indian languages. However, there can be various inherent complications involved in the development of such a system with different types of input devices, such as an eye-tracker and an access soft-switch. A switch-controlled virtual keyboard for the Bengali language was recently presented along with a performance model to evaluate its performance. However, only one study has shown a head-mounted gaze-controlled text entry interface for the Hindi language, and the usability and efficiency of that system were not satisfactory.
Eye-tracker based virtual keyboard interfaces face major challenges for item selection, wherein gaze and dwell-time durations are used for pointing and selecting items simultaneously. On the one hand, the typing speed can be increased by choosing a short dwell-time duration, although this may lead to more false item selections due to involuntary eye movements. On the other hand, if the dwell time is too long, it may increase the user's discomfort. This issue is known as the Midas Touch problem in detection theory. Moreover, prior works have mainly focused on developing graphical user interfaces (GUIs) of virtual keyboards and evaluation methods based on the text entry rate and error aspects. In terms of optimizing the GUI, a frequency-based arrangement of the letters has been utilized previously. However, GUI optimization has been largely ignored when the keyboard is used with access methods such as eye-gaze and soft-switches. Furthermore, no previous work is known to have developed a Hindi virtual keyboard with a multimodal input access facility.
This invention intends to address the issue of designing an optimized gaze-controlled virtual keyboard that takes into account both the gaze characteristics and the language script used. The major elements are: 1) the estimation of the letter frequencies for the Hindi language script using a large corpus of characters; 2) a method to design an efficient layout for a gaze-controlled virtual keyboard based on the frequency of the letters and the selection time duration of each letter; and 3) an experimental evaluation of the performance of different possible layouts of the proposed Hindi virtual keyboard (HVK).
Hindi is the official language of India. It is written in Devanagari script and contains more than 60 basic letters (see Table I), which can be divided into three distinct categories: vowels (11), consonants (33), and matras (17). The matras are a unique set of the Hindi alphabet commonly used with the consonants, also called consonant modifiers. Most of these matras are represented as dependent forms of the vowels. There are three further consonants apart from the 33 regular consonants, known as conjuncts or irregular consonants (combinations of two consonants).
Brief description of the accompanying drawings:
Fig. 1 represents the functional layout of the system.
Fig. 2 represents the matrix used to define the average time to produce commands.
Fig. 3 represents the tree-structure based letter selection using various layouts.
Fig. 4 represents the layout of the complete system.
Table I represents the Hindi alphabet system.
Table II represents the Healthy (H) and Patient (P) group demographic characteristics.
Table III represents the typing performance.
Table IV represents the typing performance.
Table V represents the typing performance.
Detailed description of the invention with reference to the accompanying drawings:
The GUI design for the virtual keyboard for the Hindi language involves the inclusion of a total of 61 characters, i.e. vowels, consonants, and matras (provided in Table I), which is significantly larger than the 26 characters of the English language (510 million speakers). A typical virtual keyboard consists of two sections: the key display section and the text display section; the characters can only be placed in the key display section. Thus, it is a challenging task to place all 61 characters in a manner that is clear and convenient to the user and allows typing with minimum search time. In order to design a robust and efficient GUI, this challenge can be overcome by implementing a two-step process. First, we investigate how a convenient and efficient GUI can include a large set of symbols on its visual display unit. Second, we develop a strategy that locates individual symbols at particular locations of the GUI. These issues can be resolved collectively by designing a multi-layer virtual keyboard in which the characters are placed on the layout based on their probability of occurrence and the constraints of the input device. However, the probability of characters over a large corpus is not readily available for the Hindi language, and it must be determined.
The performance of a virtual keyboard can be affected by the letter frequencies in the considered script; thus, we initiated our study with the estimation of the occurrence of all the Hindi language characters along with other characters required for sentence completion. We denote by Pi the probability of occurrence of the ith letter. The relative letter frequencies of the Hindi script and additional symbols (i.e. 87 letters) are shown in Table I. Apart from the probability of characters, the performance of a gaze-based virtual keyboard can also be affected by visual search and learning time. An optimization approach is therefore required to place these characters on the GUI based on their selection time.
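As a minimal sketch of this frequency-estimation step (the function name and the toy corpus below are illustrative, not taken from the patent), the per-letter probabilities Pi can be obtained by a simple count over the corpus; the same routine applies unchanged to a Devanagari corpus, since Python strings are Unicode:

```python
from collections import Counter

def letter_probabilities(corpus: str) -> dict:
    """Estimate Pi, the probability of occurrence of each letter,
    from a text corpus (whitespace is ignored)."""
    letters = [ch for ch in corpus if not ch.isspace()]
    counts = Counter(letters)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

# Toy example on a Latin-script string.
probs = letter_probabilities("abracadabra")
assert abs(sum(probs.values()) - 1.0) < 1e-9
```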
For a tree-based menu structure with L levels and Mcom available commands at each level, the maximum number of possible symbols is Msymb = (Mcom)^L. Thus, the sequence of commands for each symbol can be defined as Sm = (cs1, ..., csl, ..., csL), where m ∈ {1, ..., Msymb}, l ∈ {1, ..., L}, and csl is the command number at the l-th level. For instance, with L = 2 and Mcom = 4, a sequence of commands for a particular symbol, e.g. S1, can be (1, 3), wherein the user has to select command 1 at the first level and then command 3 at the second level. Let SC(1, ..., N) and SD(1, ..., N) be the sequence of commands and the sequence of their time durations, respectively, for an experiment involving N produced commands. SD(n) represents the duration that was needed to select the command SC(n) immediately following the selection of the command SC(n-1). To estimate the average time to produce a command i at n given the command j at n-1, we considered conditional command durations, which are defined as:

A(i, j) = AD(i, j) / LN(i, j)    (1)

where AD(i, j) is the sum of the duration times to produce a command i at n given the command j at n-1 over all N commands, and LN(i, j) is the number of times this transition occurs. They can be computed as follows:

AD(i, j) = Σ (n = 2 to N) SD(n), summed over the samples where SC(n) = i and SC(n-1) = j;
LN(i, j) = Σ (n = 2 to N) 1, counted over the samples where SC(n) = i and SC(n-1) = j.    (2)

Finally, the estimated duration corresponding to the selection of a symbol m is given as:

D(Sm) = A(cs1) + Σ (l = 2 to L) A(csl, cs(l-1))    (3)

where the first term, A(cs1), is the average duration to select the first command, while the next terms are the conditional command durations. If we consider the prior probability Pm of selecting a symbol m, then the average time to select a symbol is defined as:

E = Σ (m = 1 to Msymb) Pm · D(Sm)    (4)

The aim of the GUI optimization is to minimize the value of E. This can be achieved by measuring the values of D(Sm) using Equation 3 and the probabilities of occurrence of the letters Pm. The values of D(Sm) are coupled in inverse order to the values of Pm: in brief, the letters with the highest probabilities are assigned to the positions, i.e. the conditional events A(i, j), with the lowest values of D(Sm).
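This inverse-order coupling can be sketched as follows. Since E is a sum of products Pm·D(Sm), pairing the largest probabilities with the smallest durations minimizes the sum (a consequence of the rearrangement inequality). The code below is an illustrative sketch with hypothetical names, not the patent's implementation:

```python
def optimize_layout(letter_probs, seq_durations):
    """Pair high-probability letters with fast command sequences.

    letter_probs:  {letter: Pm}
    seq_durations: {command_sequence: D(Sm)}
    Returns {letter: command_sequence} minimizing
    E = sum over m of Pm * D(Sm)  (rearrangement inequality).
    """
    letters = sorted(letter_probs, key=letter_probs.get, reverse=True)
    seqs = sorted(seq_durations, key=seq_durations.get)
    return dict(zip(letters, seqs))

def expected_time(assignment, letter_probs, seq_durations):
    """Compute E for a given letter-to-sequence assignment (Eq. 4)."""
    return sum(letter_probs[ch] * seq_durations[assignment[ch]]
               for ch in assignment)

probs = {"a": 0.5, "b": 0.3, "c": 0.2}
durs = {(1, 1): 2.0, (1, 2): 2.5, (2, 1): 3.0}
layout = optimize_layout(probs, durs)
# The most frequent letter gets the fastest command sequence.
assert layout["a"] == (1, 1)
```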
The two main components of the multimodal system are two different input devices, i.e. an eye-tracker and a single-input switch, and a text entry application (Fig 1). Further, the GUI comprises two display components: a multi-level menu that comprises ten commands at each level, and an output text display, i.e. a message box where the user can see the typed text in real-time. The positions of the ten commands (i.e. c1 to c10) are depicted in Fig 3(a). The system is based on an Alphabetic Organization with Script Specific Arrangement (AOSSA). The tree-based structure of the ten commands is presented in Fig 3(b). The resolution of each rectangular command box is approximately 14% of the optimum screen resolution. All the command boxes are placed on the periphery of the screen while the output text box is placed at the center of the screen (see Fig 3(a)).
The tree-based structure of the GUI provides the ability to type 45 Hindi language characters, 17 different matras and halants, 14 punctuation marks, and the 10 digits. Moreover, other functionalities such as delete, delete-all, new-line, space, and go-back commands for corrections are also included. The selection of a particular character requires the user to follow a two-step task. In the first step, the user selects the command box where the desired character is located, either through the gaze-based tracking device or via an input from the soft-switch. The successful selection of a command box shifts the GUI to the second level, where the ten commands on the screen are assigned to the ten characters belonging to the command box selected at the previous level. In the second step, the user can see the desired character and finally select it for typing in the textbox, again either by gazing at the character through the gaze-based tracking device or by providing input from the input means. After the selection of a particular character at the second level, the GUI returns to the first level automatically. The placement and size of the command boxes are identical at both levels. In addition, this system is designed to optimize the GUI and add extra command features to write all the Hindi language letters, including half-letter scripts and seven punctuation marks.
With a virtual keyboard using gaze-based eye-tracking, it is necessary to provide efficient feedback to the user that the intended command box/character has been selected, in order to avoid mistakes and increase efficiency. Visual feedback is provided to the user as a change in the color of the border of the observed command box. If the user gazes at a particular box, the color of the border changes linearly in relation to the dwell time (Δt). This visual feedback allows the user to continuously adjust and adapt his/her gaze to the intended region on the screen. Audio feedback is also provided to the user through an acoustic beep after the successful execution of each command; this beep makes users proactive so that they can easily prepare for the next character. Moreover, to improve system performance by reducing the need for excessive eye movements, the last five typed characters are also displayed in the GUI at the bottom of each command box (Fig 1). This helps the user to see the previously written characters without shifting their gaze to the output display box.
An item can be selected based on the directed gaze of the user at the corresponding command box for Δt. The selection of a particular box is achieved by selecting the closest box using the Euclidean distance between the center of the box and the gaze coordinates. Thus, if the gaze coordinates remain in the same region (i.e. nearest to the target item) for the duration Δt, that particular item is selected. When the gaze coordinates change from one region to another in less than Δt, the timer for the selection is reset to zero. The addition of the soft-switch to eye-tracking helps to overcome the Midas Touch problem, as the user can search for the target item with the eye-tracker, and the selection can be made directly via the soft-switch.
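The dwell-based selection logic just described can be sketched as below; the class name, parameters, and sample coordinates are illustrative, not from the patent:

```python
import math

class DwellSelector:
    """Select the command box nearest the gaze point once the gaze
    has stayed in that box's region for dwell_time seconds.
    The timer resets whenever the nearest box changes."""

    def __init__(self, box_centers, dwell_time=1.5):
        self.box_centers = box_centers   # {box_id: (x, y) center}
        self.dwell_time = dwell_time     # Δt in seconds
        self.current = None              # box currently gazed at
        self.elapsed = 0.0               # time spent in that region

    def update(self, gaze_xy, dt):
        """Feed one gaze sample; return a box id on selection, else None."""
        nearest = min(self.box_centers,
                      key=lambda b: math.dist(gaze_xy, self.box_centers[b]))
        if nearest != self.current:      # gaze moved to another region
            self.current, self.elapsed = nearest, 0.0
        self.elapsed += dt
        if self.elapsed >= self.dwell_time:
            self.elapsed = 0.0
            return self.current          # dwell complete: item selected
        return None

sel = DwellSelector({"c1": (0, 0), "c2": (100, 0)}, dwell_time=1.5)
# 30 Hz gaze samples fixating near c1: selection fires after 1.5 s.
hits = [sel.update((5, 2), 1 / 30) for _ in range(45)]
assert hits[-1] == "c1" and all(h is None for h in hits[:-1])
```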
To estimate the duration corresponding to the selection of a symbol, D(Sm), an AOSSA layout was used. The tree structure depicting the command tags used for letter selection is shown in Fig 3(b). Twelve healthy volunteers (2 females) in the age range of 21-32 years (27.05±2.96) participated in this experiment. Each participant was asked to type a predefined sentence, given as:
[predefined task sentence rendered in Devanagari script, followed by the digits 44 - 4455 - 771]
The transliteration of the task sentence in English is "Kabtak Jabtak Abhyaasa Karate Raho. 44 - 4455 - 771" and the direct English translation is "Till When Until Keep Practicing. 44 - 4455 - 771". This predefined sentence consists of 29 characters from the Hindi language and 9 numbers. The complete task involved 76 commands in one repetition if performed without committing any error. This predefined sentence was formed with a particular combination of characters in order to obtain an equiprobable distribution of the commands over each of the ten items in the GUI. Thus, the adopted arrangement provides an unbiased involvement of the different command boxes and eye-gaze distribution over the GUI of the virtual keyboard. The eye-tracker was used for both pointing and the selection of items (see Fig 1). During the experiment, participants directed their gaze at the target item for 1.5 s (dwell time Δt). From these recordings, the time taken during the selection of each command was obtained for all participants. As a total of ten commands (Mcom = 10) were considered at each level (here, L = 2), the maximum number of possible symbols that could be placed in the GUI design was 100 (Msymb = 100). Thus, all the sequences of commands (Sm) for each symbol were estimated. Furthermore, the average time to produce a command i at time n given the command j at time n-1 was measured using Eq. 1, creating an Mcom × Mcom matrix. The grand average time to select a command in layers one and two was estimated and assigned to the command positions which were not selected during the experiments. Finally, the estimated duration corresponding to the selection of each symbol was calculated using Eq. 3. The average time to select each command is presented in Fig 2. As the goal of the GUI optimization was to minimize E according to Eq. 4, the average time to select a symbol was combined with the letter frequencies given in Table I.
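The construction of the Mcom × Mcom conditional-duration matrix from the recorded sequences SC and SD (Eq. 1) can be sketched as follows; the function name and the sample data are illustrative, not from the patent's recordings:

```python
def conditional_durations(SC, SD, Mcom):
    """Build the Mcom x Mcom matrix A(i, j): the average time to
    produce command i when the previous command was j (Eq. 1).
    SC: recorded command sequence (1-based ids); SD: matching durations."""
    AD = [[0.0] * Mcom for _ in range(Mcom)]  # summed transition durations
    LN = [[0] * Mcom for _ in range(Mcom)]    # transition counts
    for n in range(1, len(SC)):
        i, j = SC[n] - 1, SC[n - 1] - 1
        AD[i][j] += SD[n]
        LN[i][j] += 1
    # None marks transitions that never occurred in the recordings
    # (the patent fills such positions with the grand average instead).
    return [[AD[i][j] / LN[i][j] if LN[i][j] else None
             for j in range(Mcom)] for i in range(Mcom)]

SC = [1, 3, 1, 3, 2]
SD = [0.9, 1.2, 1.0, 1.4, 1.1]
A = conditional_durations(SC, SD, Mcom=3)
assert abs(A[2][0] - 1.3) < 1e-9   # command 3 after command 1: (1.2 + 1.4) / 2
```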
This combination was performed in different ways in order to compare various possible GUI layouts for gaze-controlled Hindi virtual keyboard.
Proposed GUI of virtual keyboard
Based on the above findings, five different GUI layouts have been proposed in order to find an optimal GUI for the gaze-controlled Hindi virtual keyboard. The objective of these layouts was to observe the change in performance of the virtual keyboard in terms of text entry rate. All the layouts are based on a two-layer selection method with ten commands at each level, and the selection of each character requires the execution of two commands. The positions of the command boxes are depicted in Fig 3(a). The tree structure of the menus for selecting the different characters and symbols is given in Fig 3(b-d). These layouts have been developed based on five distinct strategies that arrange the different items into particular subsets using the proposed optimization approach and a script-specific arrangement of the characters. In all five GUI layouts, the 10th position of the second level was assigned to a go-back function, i.e. to shift the GUI back to the first level in case of a wrong selection. As each layout consists of 100 symbols (Msymb = 100), the total number of letters (Mletter) can be less than or equal to 100, i.e. Mletter ≤ Msymb. Here, we considered 88 symbols for assigning Hindi characters, matras, halants, basic punctuation, and the space command (i.e. Mletter = 88). The remaining 12 symbols were assigned to delete, clear-all, and go-back commands.
1) Alphabetic organization with script-specific arrangement (AOSSA): An alphabetically arranged layout is typically easier to learn and remember, which improves its usability. However, it is challenging for such layouts to accommodate the different symbols in alphabetical order. Fig 3(b) depicts the command tags used for letter selection. The vowels and consonants are placed on the keys in the order in which the alphabet is usually depicted, in the first five command boxes (c1 to c5) together with the om symbol. The command boxes (c6, c7) comprise all the matras, the halant, and the nukta. The command boxes (c8, c9) include the most frequently used punctuation marks, the digit 0, space, and other functionalities such as delete, new line, and delete all to correct errors. The last command box (c10) is used to access the digits (1 to 9).
2) Frequency and time-based organization (FTO): A letter frequency-based approach is used to improve the text entry rate of the virtual keyboard application. The system uses a combination of letter frequency and a time-based approach for the eye-tracking based virtual keyboard. This layout mainly considers two factors: the relative letter frequencies in the Hindi language script and the average time to produce a command at each location of the GUI. The new-line command is assigned zero frequency. The average time to produce a command (i.e. a sequence of commands for each symbol) is presented in Fig 2. Based on the prior probability of the symbols and the average time to produce a command, a matrix is created using Eq. 4. The symbols are assigned accordingly to the locations of the GUI from command c1 to c9. The tree-based structure is shown in Fig 3(c), depicting the command tags used for letter selection.
3) Frequency and time-based organization with script-specific arrangement (FTOSSA):
The Hindi alphabet system is composed of several types of characters which can be grouped according to their usage in the language script. Thus, placing these characters on a single GUI using only FTO-based optimization may not provide the best performance. In order to resolve this issue, a script-specific arrangement of all the symbols was considered for designing the layout, in addition to the constraints from the FTO layout. The tree-based structure of the GUI is shown in Fig 3(d), depicting the command tags used for letter selection. As the GUI contains 100 symbols, the complete set of symbols has been divided into five groups. First, the go-back function is assigned to the c10 position within each command box. Second, a group of 45 characters including vowels, consonants, and the space button was formed, and these characters were placed in the command boxes (c1 to c5) based on letter frequencies and the time taken to select a command from a particular location. The third, fourth, and fifth groups were created similarly, based on letter frequencies and selection times. The third group includes all the matras and the om symbol, and these characters were placed in the command boxes (c6, c7). Fourth, the punctuation marks, the digit 0, and extra commands for error correction were assigned to the command boxes (c8, c9). Fifth, the digits were placed in the command box (c10).
4) Random order of frequency and time-based organization (RaOFTO): In this layout, all the characters and numbers, except go-back, are randomly placed in each of the ten commands.
5) Reverse order of frequency and time-based organization with script-specific arrangement (ReOFTOSSA): This layout comprises a reverse arrangement of the FTOSSA. The main purpose of this layout is to measure the virtual keyboard performance in the worst-case scenario.
Several performance indices, such as the number of letters spelled out per minute, the information transfer rate (ITR) at the letter level (ITRletter) and the command level (ITRcom), and the mean and standard deviation (mean±SD) of the time to produce each command, were used to evaluate the performance of all the proposed GUI layouts. The ITR is calculated based on the total number of actions (i.e. basic commands and letters) and the total time duration required to perform these commands. To estimate the ITR, all the different commands and letters were assumed to be equally probable, with no typing errors. The ITR is given as follows: ITRcom = log2(Mcom)·Ncom/T and ITRletter = log2(Mletter)·Nletter/T, where Ncom is the total number of commands produced by the user to type Nletter characters, and T is the total time to produce all Ncom commands or to type all Nletter characters.
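These ITR formulas can be sketched as below. The document does not state the time unit of T; the sketch assumes T is measured in seconds and reports the rate in bits/min, and the sample numbers are illustrative only:

```python
import math

def itr_per_min(M, N, T_sec):
    """Information transfer rate in bits/min for M equiprobable items,
    N selections, and total duration T_sec in seconds (errors ignored):
    ITR = log2(M) * N / T, converted from per-second to per-minute."""
    return math.log2(M) * N * 60.0 / T_sec

# Hypothetical example: 208 commands (Mcom = 10) producing 104 letters
# (Mletter = 88) over 10 minutes of typing.
itr_com = itr_per_min(10, 208, 600)
itr_letter = itr_per_min(88, 104, 600)
assert itr_letter < itr_com
```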
In order to find the most effective keyboard layout, the five proposed layouts were evaluated through an empirical study based on the performance indices mentioned in the previous section. A total of twenty adult volunteers participated in this study (see Table II). These participants were divided into two groups (i.e. Group A and Group B). Group A was formed of ten healthy volunteers (one female) in the age range of 21-30 years (25.02±3.08), namely H1 to H10. Group B was formed of ten stroke patients (2 females) in the age range of 24-72 years (50.2±14.32), namely P1 to P10, having variable time durations since stroke. No participant had prior experience of using an eye-tracker or a soft-switch with any application. Two input devices were used: a portable eye-tracker and a soft-switch as a single-input device.
Data acquisition:
The eye-tracker data was recorded at a 30 Hz sampling rate. It involves binocular infrared illumination with a spatial resolution of 0.1 mm root mean square (RMS), and records the x and y gaze coordinates and the pupil diameter of both eyes in mm. The soft-switch was used as a single-input device to select a command.
The typing task involved a predefined sentence of 104 characters that includes Hindi alphabets, numbers, punctuation marks, and spaces. The complete task involved the execution of 208 commands in one repetition if performed without committing any error. The predefined sentence was taken from an Indian newspaper and was chosen for its practical flow, meaning, and coverage of all ten command boxes in the GUI. Thus, the adopted typing task provides an unbiased involvement of the command boxes and of the eye-gaze distribution over the virtual keyboard GUI.
Two input modalities were used to type the text, creating two different experimental conditions, shown in Fig 1. First, the eye-tracker alone was used for both pointing and selection of items (see Fig 1): during the experiment, participants directed their gaze at the target item for a specific period of time. Second, the eye-tracker was used along with the soft-switch in a hybrid mode, wherein the user directed the eye-gaze to the target item and the selection occurred via the soft-switch. This multimodal facility was incorporated to overcome the Midas Touch problem of HCI systems. Prior to each experiment, participants were advised to avoid moving their body and head during the tests as far as possible. One session was performed for each condition.
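The dwell-based selection just described can be sketched as follows. The dwell threshold value and the box geometry are assumed for illustration only (the sampling rate matches the 30 Hz recording rate mentioned above); in the hybrid ETSS mode, a soft-switch press would simply replace the dwell criterion as the selection trigger.

```python
from math import hypot

SAMPLE_RATE_HZ = 30   # matches the eye-tracker's recording rate
DWELL_TIME_S = 1.0    # hypothetical dwell threshold, not specified in this text

def closest_box(gaze_xy, box_centers):
    """Map a gaze sample to the command box whose center is nearest
    in Euclidean distance (as in the closest-box selection rule)."""
    x, y = gaze_xy
    return min(box_centers,
               key=lambda n: hypot(x - box_centers[n][0], y - box_centers[n][1]))

def dwell_select(gaze_samples, box_centers,
                 dwell_s=DWELL_TIME_S, rate_hz=SAMPLE_RATE_HZ):
    """Return the first box fixated for dwell_s consecutive seconds, else None."""
    needed = int(dwell_s * rate_hz)
    current, count = None, 0
    for sample in gaze_samples:
        box = closest_box(sample, box_centers)
        count = count + 1 if box == current else 1  # reset on box change
        current = box
        if count >= needed:
            return current
    return None
```

A gaze that switches boxes before the threshold resets the counter, which is what makes the dwell time act as a deliberate-selection filter.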
The typing performances of healthy participants for all five GUI layouts under the eye-tracker with soft-switch (ETSS) and eye-tracker only (ET) conditions are presented in Table III. The average typing speed with the FTOSSA layout (18.99±5.06 letters/min) was the highest among all five designs during the ETSS condition. The FTOSSA design also led to faster performance in terms of speed and ITR than the other four designs (p<0.05). In particular, a high speed of 28.06 letters/min was achieved by participant H1. During the ET condition, the average typing speed with the FTOSSA (11.39±0.84 letters/min) was again greater than that of the other four designs, and the FTOSSA again showed better performance in terms of speed and ITR (p<0.05).
Furthermore, a statistical significance test was used to compare the typing speed across the conditions. The average typing speed was measured with ETSS and ET for all GUI layouts, and ETSS was found to give a faster speed than ET (p<0.05). Overall, the two best-performing layouts were the FTOSSA and the AOSSA.
The typing performances of stroke patients for the FTOSSA and AOSSA layouts under the eye-tracker (ET) condition are presented in Table IV and Table V, respectively. The average typing speed with the FTOSSA design (9.07±2.33 letters/min) was superior to that with the AOSSA design (7.76±2.60 letters/min) (p<0.05). A similar pattern of performance was observed in terms of ITRcom and ITRletter for the FTOSSA layout: the ITRcom and ITRletter with the FTOSSA design (69.11±11.20 bits/min and 58.05±15.32 bits/min) were greater than with the AOSSA design (63.72±11.75 bits/min and 49.95±16.76 bits/min) (p<0.05). Hence, the ITRs of the stroke patients decreased only slightly, and their speeds remained comparable to those of the healthy participants (compare Table III). In particular, stroke patient P1, with loss of both motor and speech skills, achieved a typing speed of 12.04 letters/min with the FTOSSA design.
It is observed that performance can be affected significantly by the patient group, in particular for stroke patients. Keeping these issues in mind, an optimized gaze-controlled Hindi virtual keyboard has been designed, wherein inputs can be provided using a portable non-invasive eye-tracker with or without a single input switch.
The main focus of this invention is to design and optimize a multimodal HCI system for stroke patients (mainly those with speech and motor impairments). In terms of performance, virtual keyboards based on brain response detection, such as event-related potentials (e.g. P300) and steady-state visual evoked potentials (SSVEP), offer significantly lower performance than the proposed system. Studies have reported an average ITR of 25 bits/min with a P300 speller and 37.62 bits/min (average speed of 5.51 letters/min) with an SSVEP speller.
The proposed system outperforms all these systems, with an average ITR and average typing speed of 80.06±4.97 bits/min and 11.38±0.84 letters/min, respectively, with healthy participants. Furthermore, it provides an average ITR and average typing speed of 69.8±11.20 bits/min and 9.07±2.33 letters/min, respectively, with stroke patients. An individual stroke patient with loss of both motor and speech skills achieved a high typing speed of 12.04 letters/min, demonstrating the high usability of the system.
A recent study provided a viable usage of eye-gaze indices (i.e., eye fixation, smooth pursuit, and blinking) for the assessment of pathological states in stroke patients. Therefore, the proposed system may be extended for assessment purposes, helping patients determine their current pathological state. The system may also help patients recover lost motor skills through the addition of other input modalities (e.g. motor imagery), wherein search and target selection of items are achieved by gaze and motor imagery, respectively.
The present invention therefore has potential applications for a large user population. The flexibility and usability of this system can be further enhanced by incorporating adaptive techniques based on the dwell time for selection of the characters.
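One simple way to realize the adaptive dwell-time idea mentioned above is to shorten the dwell after successful selections and lengthen it after corrections. The scheme below, including its step size and bounds, is an assumed illustration and is not specified in this text.

```python
class AdaptiveDwell:
    """Sketch of adaptive dwell-time control (assumed scheme).

    Shortens the dwell after each successful selection and lengthens it
    after an error/correction, clamped to fixed bounds.
    """

    def __init__(self, initial_s=1.0, min_s=0.4, max_s=2.0, step_s=0.1):
        self.dwell_s = initial_s
        self.min_s, self.max_s, self.step_s = min_s, max_s, step_s

    def on_success(self):
        # User is keeping up: speed the interface up slightly
        self.dwell_s = max(self.min_s, self.dwell_s - self.step_s)

    def on_error(self):
        # A correction (e.g. delete) suggests the dwell is too short
        self.dwell_s = min(self.max_s, self.dwell_s + self.step_s)
```

The bounds keep the interface from becoming either twitch-sensitive or frustratingly slow, while the per-event steps let each user converge toward a comfortable pace.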

Claims

I Claim:
1. A system for optimizing the position of the displayed items for gaze-controlled tree-based menu selection systems, the system comprising:
a. input means for receiving the input communication from the user;
b. processing means for processing the input received;
c. storage means to store the input and processed signals;
d. display means to display the processed input from the processing means; and
e. auditory means to confirm the selection based on the input means;
wherein the position of the displayed items for gaze-controlled tree-based menu selection systems is optimized by a combination of letter frequency and command selection time, and wherein different commands provide access to multiple characters along with additional text editing commands.
2. A system as claimed in claim 1, wherein the menu provides at least 10 commands to provide access to type 80 different characters along with additional text editing commands.
3. A system as claimed in claim 1, wherein the input means is a gaze-based eye-tracking device, which can track the movement of the eyes and is adapted to differentiate between the focused and involuntary movement of the eye during the process of gazing.
4. A system as claimed in any of the preceding claims, wherein the device is a non-invasive remote camera based eye-tracking device.
5. A system as claimed in claim 1, wherein the input means is a keyboard or a touch screen device, which can provide the input to the system.
6. A system as claimed in claim 1, wherein the input device can be a soft-switch which, when pressed in combination with the eye gaze input, selects the gazed input.
7. A system as claimed in any of the above claims, wherein the input device can be an armband/sEMG-based gesture recognition device which can sense the gestures of the hand as an input to the device.
8. A system as claimed in any of the preceding claims, wherein the processor gets the input from the multimodal input devices and generates a response on the visual display units.
9. A system as claimed in any of the preceding claims, wherein the visual display unit/virtual keyboard is a menu-driven device.
10. A system as claimed in any of the preceding claims, wherein the characters on the visual display unit/virtual keyboard are based on a tree selection method.
11. A system as claimed in the above claim, wherein the selection of a command requires at least two consecutive steps, wherein the characters are first locked with the help of the gaze-based tracking device and then the command is selected by means of the input devices.
12. A system as claimed in any of the preceding claims, wherein the selection of a particular box is achieved by selecting the closest box using the Euclidean distance between the center of the box and the gaze coordinates.
13. A system as claimed in any of the preceding claims, wherein, when a particular command box is gazed at for a predefined dwell time (Δt), there is a change in the color of the border section of the observed command.
14. A system as claimed in any of the preceding claims, wherein an audio feedback is also provided to the user through an acoustic beep after successful execution of each command.
15. A method for optimizing the position of the displayed items for gaze-controlled tree-based menu selection systems, the method comprising input means for receiving the input communication from the user, processing means for processing the input received, storage means to store the input and processed signals, display means to display the processed input from the processing means, and auditory means to confirm the selection based on the input means, wherein the position of the displayed items for gaze-controlled tree-based menu selection systems is optimized by a combination of letter frequency and command selection time, and wherein different commands provide access to multiple characters along with additional text editing commands.
16. A method for optimizing the position of the displayed items for gaze-controlled tree-based menu selection systems as claimed in claim 15, wherein the menu provides at least 10 commands to provide access to type 80 different characters along with additional text editing commands.
17. A method as claimed in any of the preceding claims, wherein the selection of a particular command box is achieved by selecting the closest box using the Euclidean distance between the center of the box and the gaze coordinates.
18. A method as claimed in any of the preceding claims, wherein, to improve the system performance by reducing the need for excessive eye movements, the last five typed characters are also displayed in the GUI at the bottom of each command box.
19. A system and method for optimizing the position of the displayed items for gaze-controlled tree-based menu selection systems as claimed in any of the preceding claims and as hereinabove described with reference to the accompanying drawings.
PCT/IN2018/000035 2018-02-27 2018-06-29 A system and method for optimizing gaze controlled gui for human computer interface WO2019167053A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
IN201811007266 2018-02-27
IN201811007266 2018-02-27

Publications (1)

Publication Number Publication Date
WO2019167053A1 true WO2019167053A1 (en) 2019-09-06

Family

ID=67806012

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2018/000035 WO2019167053A1 (en) 2018-02-27 2018-06-29 A system and method for optimizing gaze controlled gui for human computer interface

Country Status (1)

Country Link
WO (1) WO2019167053A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170108938A1 (en) * 1995-03-27 2017-04-20 Donald K. Forest Apparatus for Selecting from a Touch Screen

Similar Documents

Publication Publication Date Title
Mott et al. Improving dwell-based gaze typing with dynamic, cascading dwell times
Wobbrock et al. Longitudinal evaluation of discrete consecutive gaze gestures for text entry
Jin et al. Optimized stimulus presentation patterns for an event-related potential EEG-based brain–computer interface
Majaranta et al. Text entry by gaze: Utilizing eye-tracking
Meena et al. Toward optimization of gaze-controlled human–computer interaction: Application to hindi virtual keyboard for stroke patients
Scott MacKenzie et al. BlinkWrite: efficient text entry using eye blinks
Majaranta Text entry by eye gaze
Jain et al. User learning and performance with bezel menus
Morimoto et al. Context switching for fast key selection in text entry applications
Sarcar et al. EyeK: an efficient dwell-free eye gaze-based text entry system
Rozado et al. Gliding and saccadic gaze gesture recognition in real time
Diaz-Tula et al. Augkey: Increasing foveal throughput in eye typing with augmented keys
Panwar et al. EyeBoard: A fast and accurate eye gaze-based text entry system
Rosenberg Computing without mice and keyboards: text and graphic input devices for mobile computing
Kumar et al. Tagswipe: Touch assisted gaze swipe for text entry
Meena et al. A novel multimodal gaze-controlled hindi virtual keyboard for disabled users
Singh et al. Enhancing an eye-tracker based human-computer interface with multi-modal accessibility applied for text entry
Liu et al. Gazetry: Swipe text typing using gaze
Meena et al. Design and evaluation of a time adaptive multimodal virtual keyboard
Gupta et al. Investigating remote tactile feedback for mid-air text-entry in virtual reality
Francis et al. Speed–accuracy tradeoffs in specialized keyboards
Miniotas et al. Symbol creator: An alternative eye-based text entry technique with low demand for screen space
Kotani et al. Design of eye-typing interface using saccadic latency of eye movement
WO2019167053A1 (en) A system and method for optimizing gaze controlled gui for human computer interface
WO2019167052A1 (en) A system for augmentative and alternative communication for people with severe speech and motor disabilities

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18907929

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18907929

Country of ref document: EP

Kind code of ref document: A1
