WO2015133889A1 - Method and apparatus to combine ocular control with motion control for human computer interaction - Google Patents

Method and apparatus to combine ocular control with motion control for human computer interaction

Info

Publication number
WO2015133889A1
WO2015133889A1 PCT/MY2015/000017
Authority
WO
WIPO (PCT)
Prior art keywords
ocular
pointer
controller
aoi
motion
Prior art date
Application number
PCT/MY2015/000017
Other languages
French (fr)
Inventor
Ngip Khean Chuan
A/L Sivaji Ashok
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad filed Critical Mimos Berhad
Publication of WO2015133889A1 publication Critical patent/WO2015133889A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance

Abstract

The present invention provides an ocular and motion controller system (200) for controlling interaction between a GUI and a user. The GUI is displayed on one or more display screens (102). The system (200) comprises an ocular tracking module (203) for capturing gaze data of the user's gaze to position a pointer at a corresponding location on the screen, and a motion tracking module (204) for capturing gesture data of the user's gesture, wherein the gesture data is processed to determine its validity, and a valid gesture is used to turn on motion tracking to take over control from the gaze. The system (200) processes the ocular and motion data that fall within the AOI to determine where the pointer should be positioned. A method thereof is also provided.

Description

Method And Apparatus To Combine Ocular Control With Motion Control For Human Computer Interaction
Field of the Invention
[0001] The present invention relates to human-computer interaction. In particular, the present invention relates to a system and method for combining ocular control with motion control for human-computer interaction.
Background
[0002] Keyboards and mice have been widely used as input interfaces to interact with computers. As computers and their applications become more advanced, there is increasing demand for new ways of interacting with computers effectively and naturally.
[0003] Human-Computer Interaction (HCI) solely utilizing ocular control is relatively inaccurate and requires a large ocular workload when compared to keyboard and mouse controls. Meanwhile, HCI solely utilizing motion control is relatively inaccurate and requires a large ergonomic workload when compared to keyboard and mouse controls. With recent technological advancement, motion control is becoming more common, and devices specifically adapted to detect motion for interacting with computers have become readily available in the market.
[0004] It is therefore desired that a combined ocular and motion control be provided as an interface for a more natural interaction between human and computer through the ocular movement and gestures of the user.
Summary
[0005] In accordance with one aspect of the present invention, there is provided a method for interacting with a graphical user interface (GUI) displaying on one or more display screens of a computer, the computer having an ocular tracking module and a motion tracking module. The method comprises receiving gaze data through the ocular tracking module to compute a computed-ocular-point (COP); rendering an Area-of-Interest (AOI) based on the COP; positioning a pointer of the GUI to a position on the screen corresponding to the COP; monitoring gestures captured by the motion tracking module that fall within a Pointer-Tracking-Area (PTA) of the motion tracking module, wherein the PTA is determined based on the AOI; detecting a tracked object from the gestures that fall within the PTA to extract command gestures therefrom; determining validity of the detected command gestures based on the pointer position and the computer state detected by the motion tracking module; and re-positioning the pointer according to a location of the tracked object determined within a Field-of-View (FOV) of the motion tracking module.
[0006] In one embodiment, computing the COP further comprises receiving gaze data from the ocular tracking module taken within a predefined period; storing all the gaze data taken within the predefined period; filtering the gaze data based on the validity of the detected command gesture; converting the respective gaze data into a plurality of coordinate points in association with the screen; and averaging the coordinate points and outputting the averaged result as the COP.
[0007] In another embodiment, computing the COP may further comprise determining an AOI size based on the COP value, an equipment profile and program settings stored on the ocular tracking module; offsetting the AOI according to valid screen border values of the screen; and displaying the AOI on the screen.
[0008] In a further embodiment, positioning the pointer according to the location of the tracked object may further comprise receiving an enabler signal to enable a pointer controller, and processing a type of gesture; matching the type of gesture against a shape point database to obtain an intended command gesture and a shape point; calculating a new pointer position based on the current AOI and the shape point; repositioning the pointer at the shape point; and interacting with the GUI according to the gesture command.
[0009] In yet a further embodiment, when the pointer controller is enabled by the gesture controller, gaze data from the ocular tracking controller is ignored and the pointer movement and interaction with the GUI is controlled by the gesture detected by the motion tracking controller.
[0010] In another aspect of the present invention, there is also provided an ocular and motion controller system for controlling interaction between a GUI and a user, wherein the GUI is displayed on one or more display screens. The system comprises an ocular tracking module for capturing gaze data of the user's gaze; an ocular tracking controller operationally receiving the gaze data from the ocular tracking module to compute a COP; an AOI controller operationally receiving the COP from the ocular tracking controller to render an AOI; a pointer controller operationally positioning a pointer of the GUI on the screen; a motion tracking module for capturing gestures of the user; a motion tracking controller for receiving gestures that fall within a PTA of the motion tracking module to detect a tracked object, wherein the PTA is determined based on the AOI; and a gesture controller for receiving the tracked object to extract command gestures therefrom, wherein the detected command gestures are validated based on the pointer position and a computer state, and wherein the gesture controller enables a motion control on the pointer controller to reposition the pointer according to a location of the tracked object determined within a FOV of the motion tracking module.
[0011] In one embodiment, the ocular tracking controller computes the COP based on the gaze data detected by the ocular tracking module within a predefined period; all the gaze data taken within the predefined period are stored on a gaze database, the gaze data are filtered based on the validity of the detected command gestures, and the filtered gaze data are converted into a plurality of coordinate points in association with the screen and averaged accordingly to obtain the COP.
[0012] In another embodiment, the ocular tracking controller further determines an AOI size based on the COP value, an equipment profile and program settings stored on the ocular tracking module in order to offset the AOI according to valid screen border values of the screen, thereby displaying the AOI on the screen.
[0013] In another embodiment, the gesture controller further generates an enabler signal to enable the pointer controller and processes a type of gesture to match an intended command gesture against a shape point database to obtain a shape point, wherein the pointer is repositioned based on a current AOI and the shape point for interacting with the UI according to the gesture command.
[0014] In a further embodiment, when the pointer controller is enabled by the gesture controller, gaze data from the ocular tracking controller is ignored and the pointer movement and interaction with the UI is controlled by the motion tracking controller.
Brief Description of the Drawings
[0015] Preferred embodiments according to the present invention will now be described with reference to the accompanying figures, in which like reference numerals denote like elements:
[0016] FIG. 1 illustrates a schematic diagram of an ocular and motion controlling system for human-computer interaction according to one embodiment of the present invention;
[0017] FIG. 2 illustrates a block diagram of an ocular and motion control system 200 according to one embodiment of the present invention;
[0018] FIG. 3 illustrates a process for combining ocular and motion control in accordance with one embodiment of the present invention;
[0019] FIG. 4 illustrates a flow diagram of a process for generating a computed-ocular-point from gaze data received from the ocular tracking module in accordance with one embodiment of the present invention;
[0020] FIG. 5 illustrates a flow diagram of a process of computing and rendering the AOI based on the COP in accordance with one embodiment of the present invention;
[0021] FIG. 6 illustrates the generation of the AOI in relation to the screen and the COP of one embodiment;
[0022] FIG. 7 is a flow diagram illustrating a process of translating the position of a tracked object in the PTA onto the AOI in accordance with one embodiment of the present invention;
[0023] FIGs. 8A and 8B illustrate the screen with reference to the AOI and the FOV with reference to the PTA; and
[0024] FIGs. 9A-9K depict a sequence of human-computer interaction utilizing the system and method of the above embodiments in accordance with the present invention.
Detailed Description
[0025] Embodiments of the present invention shall now be described in detail with reference to the attached drawings. It is to be understood that no limitation of the scope of the invention is thereby intended; such alterations and further modifications in the illustrated device, and such further applications of the principles of the invention as illustrated therein, are contemplated as would normally occur to one skilled in the art to which the invention relates.
[0026] FIG. 1 illustrates a schematic diagram of an ocular and motion controlling system for human-computer interaction according to one embodiment of the present invention. As shown in the upper half of the figure, the user 101 is interacting with a computer 100 having a display screen 102. Eye gaze 103 of the user 101 is tracked by an ocular and motion-tracking device 106 that is placed in an appropriate location suitable for tracking the eye movements. The ocular and motion-tracking device 106 can be a dedicated or general-purpose imaging device for capturing images or video of the person to detect the eye movements. The ocular and motion-tracking device 106 captures the images of the user 101 and processes them accordingly to extract eye movement information.
[0027] In this invention, ocular control means using an eye (or eyes) as a control interface to control a desired means. Ocular control may utilize gaze detection to detect the point (i.e. scalar based) at which the subject's eye is looking. Where necessary, saccade detection (i.e. vector based) can also be adapted for ocular control, in accordance with another embodiment of the present invention. It is also possible that ocular data, such as an oculogram or the like, can be detected through a wearable device that directly detects the eye movement, or through a remote device for detecting the eye movement.
[0028] The ocular and motion-tracking device 106 collects ocular data from the user 101 and sends it to the computer 100. In one embodiment, the ocular and motion-tracking device 106 is an integrated device for detecting both ocular data and motion data. In an alternative embodiment, the ocular and motion-tracking device 106 can be stand-alone devices adapted respectively for ocular control and motion control. In yet another alternative embodiment, the ocular and motion-tracking device 106 can be adapted as an integrated device with detachable or remote peripheral sensors or detectors for capturing ocular data and motion data. Based on the ocular data, the computer generates a computed-ocular-point (COP) representing a pointer 105 on the display screen 102 at the location the user is looking at. The COP point as shown is presented as an arrow cursor; in other embodiments, the point can be presented in other shapes, sizes and colors according to the settings of the computer 100. Based on the detected eye gaze position, the pointer 105 is positioned on the screen to indicate the location of the COP, in the absence of motion tracking input.
[0029] For the purpose of the present application, the term "pointer" can be used interchangeably with "cursor", which most commonly refers to an indicator shown on one or an array of computer screens at the position that will respond to the user's input to interact with the computer.
[0030] An area-of-interaction (AOI) 104 is defined based on the COP, i.e. the pointer 105. The AOI 104 limits the pointer 105 within a restricted area. With sufficient space available, the COP is by default set as the center of the AOI 104, until the AOI 104 reaches the boundary of the visible area on the computer screen(s). Mackenzie's Fitts' Law equation, for example, can be adapted herewith for determining the location of the AOI based on the COP.
[0031] As shown in the bottom half of FIG. 1, once the AOI 104 is defined, the user 101 may enable pointer positioning via motion tracking by performing a tracking gesture with the user's hand 107. In one embodiment, the AOI 104 may be a boundary visible on the computer screen(s) for user reference. In another embodiment, it can also be a boundary invisible to the user. Similarly, the motion gestures are also detected through the device 106. The data (such as images or videos or any detected ocular or motion data) captured through the device 106 is input to the computer 100 for processing, in which shapes (e.g., hands, fingers, and head), shape points (e.g., fingertips, palm center, and nose tip) and gestures (e.g., position and movement of hands, fingers, and head) are recognized. Of all the identified features, one of the intended features is selected for controlling the pointer. As shown in FIG. 1, the tip of the index finger is used to guide the pointer 105. The motion-tracking module of the tracking device 106 may contain sensors that can capture depth and image data in the sensor field-of-view (FOV). When motion tracking is enabled, the pointer position is mapped to the relative position of a shape point (e.g., index fingertip) in a pointer-tracking-area (PTA) 108 of the motion tracking hardware's field-of-view. The shape point is selected from one point of the detected shape for carrying out the control. In the present illustration, a user motion gesture is initiated as the user's hand 107 is raised within the FOV.
[0032] The PTA 108 is an imaginary area adapted to limit the interaction (i.e. between the user and the computer) within it for more efficient control and processing. The PTA 108 corresponds to the AOI 104 that is provided once the gesture control is detected. As the user's gesture (i.e. fingertip) moves to a position within the PTA 108, the pointer 105 maneuvers to a corresponding position within the AOI 104 on the screen 102, as indicated by the line 109. As mentioned earlier, the AOI 104 is provided to limit the area of interaction within that area, even when the user's gesture moves out of the PTA 108. Such an implementation is found to be able to reduce the jitter effect of pure gaze (or ocular) control. The AOI size may depend on gaze data accuracy. It is appreciated that different users and physical environment settings may influence the accuracy. By way of example, not limitation, the AOI size may be relatively bigger towards the edge of the screen than when it is nearer to the center.
[0033] According to the embodiment of the present invention, the PTA 108 can be a 2D or 3D area depending on whether the user interface being used is 2D or 3D. Accordingly, the user 101 may position the pointer 105 on top of user interface (UI) objects in the display 102 and perform command gestures to interact with these UI objects.
[0034] In this invention, motion data refers to data pertaining to the state or movement of the object of interest. In one embodiment, the data can be generated by analysing and determining the position of the object of interest. This is usually achieved by detecting an object shape over time. The object shape can be detected in two dimensions (2D) (i.e. image data only) or in three dimensions (3D) (i.e. image data plus depth data). In one embodiment, image(s) captured by a Red-Green-Blue (RGB) and depth camera are processed through an object recognition module for identifying the object of interest. Such object recognition may be done through matching against a database. In a further embodiment, the detected object of interest may further be combined with gesture motion for more precise control. Gestures are derived from data pertaining to the targeted object's shape and motion over time. In another embodiment, a time-slide method that determines the differences between pictures to detect movements of the object of interest is also possible, saving the effort of recognising the object shape. Preferably, the motion control referred to herein is a control interface that uses shape, motion, and gesture detections to render the control command. As opposed to raw motion data, the motion control of the present invention offers a more precise and more diversified option allowing the user to interact with the computer 100 more economically. Accordingly, the motion control can also hereinafter be referred to as shape and gesture control, or the like. In one example, only part of the FOV, also known as the Pointer Tracking Area, is utilized to move the pointer, whilst all of the FOV is used for gesture recognition.
[0035] FIG. 1 illustrates only one display screen 102. However, it is well understood by a skilled person that the present invention is implementable, without any limitation, on a computer with multiple display screens that are connected in an array as extended screens to the main display screen.
[0036] In other embodiments, the ocular and motion tracking capacities may be enabled by multiple devices. Yet in another embodiment, the ocular and motion tracking hardware may be integrated with the computer 100 or display 102.
[0037] Through the above, operationally, the user uses ocular motion to initiate the ocular and motion control of the present invention. The ocular motion places the pointer at a proximate area to be interacted with, where the gesture motion of the user further allows the user to precisely place the pointer at an intended location and to send commands to the operating system's application programming interface (API).
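The hand-over between the two modalities can be summarised in the following minimal sketch of the overall control loop. It is not the patented implementation: the controller objects and their method names are duck-typed stand-ins assumed for illustration only.

```python
def run_interaction(ocular, aoi_ctrl, pointer, motion, gesture_ctrl, os_api):
    """Illustrative combined ocular/motion control loop (not the patented code).

    The six arguments are assumed duck-typed stand-ins for the controllers of
    FIG. 2, e.g. ocular.compute_cop() -> (x, y), aoi_ctrl.render(cop) -> AOI,
    motion.poll(aoi) -> gesture dict or None, and so on.
    """
    motion_enabled = False
    aoi = None
    while True:
        if not motion_enabled:
            cop = ocular.compute_cop()          # averaged gaze point (COP)
            aoi = aoi_ctrl.render(cop)          # AOI rendered around the COP
            pointer.move(cop)                   # pointer follows gaze by default
        gesture = motion.poll(aoi)              # gestures observed within the PTA/FOV
        if gesture is None:
            continue
        if gesture_ctrl.is_tracking(gesture):
            motion_enabled = True               # gaze data is ignored from here on
            pointer.move(pointer.map_pta_to_aoi(gesture["shape_point"], aoi))
        elif gesture_ctrl.is_valid(gesture, os_api.state(), pointer.position()):
            os_api.execute(gesture)             # valid command gesture forwarded to the OS API
        # invalid command gestures are simply ignored
```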
[0038] FIG. 2 illustrates a block diagram of an ocular and motion control system 200 according to one embodiment of the present invention. The ocular and motion control system 200 comprises hardware modules and application modules. The hardware modules are provided to acquire the ocular and motion tracking parameters, whereas the application modules process the acquired parameters. The hardware modules comprise an ocular tracking module 203 and a motion tracking module 204. It is well understood by a skilled person that the hardware modules may be available in many forms. In one embodiment, they can be dedicated or proprietary hardware modules adapted to function with the application modules provided herewith to provide the ocular and motion controls.
[0039] The ocular tracking hardware 203 operationally tracks the user's eye gaze and generates gaze data for feeding into the ocular tracking controller 205. The ocular tracking controller 205 converts the gaze data and produces a computed-ocular-point (COP) that represents the point on the screen that the user is looking at for the gaze data collection duration. The AOI controller 207 receives the COP and generates the AOI based on the COP and the screen information 210 obtained from the operating system 201. The AOI controller 207 may send the AOI information to the OS's API 211 to indicate on the screen where the AOI is rendered.
[0040] The pointer controller 208 is adapted to process the acquired ocular data and motion data to control the pointer. Operationally, it also receives the COP from the ocular tracking controller 205 for processing the pointer data. The pointer controller 208 is triggerable by the gesture controller 209, which will be explained later. By default, the pointer controller 208 is triggered enabled, or triggered on, such that the pointer maneuvers according to the gaze data.
[0041] Returning to the motion-tracking module 204, it captures images or depth images for use in shape and motion tracking. Depending on the integrated capabilities of the motion-tracking module 204, the data sent to the motion tracking controller 206 may be raw images, shapes (e.g. identification of objects from images), shape points (e.g. fingertips, palm center, head center location, skeletal joints), motion (e.g. movement history, velocity, and acceleration of shape points), or gestures (e.g. specific motion and shape history) data. The motion-tracking controller 206 receives the data generated by the motion tracking module 204 and collects the data for a specific amount of time. The motion-tracking controller 206 is capable of processing its own shape points data and gestures data, or it may use the shape points and gestures computed by the module's software capabilities.
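The default-on gaze control and the enable/trigger signalling between the gesture controller 209 and the pointer controller 208 can be modelled as in the sketch below; the class, callback and method names are illustrative assumptions rather than the actual module interfaces.

```python
class PointerController:
    """Toy model of the pointer controller 208: it follows gaze by default and
    hands control to motion tracking when the gesture controller enables it.
    (Illustrative only; not the actual module interface.)"""

    def __init__(self, move_pointer):
        self.move_pointer = move_pointer       # callback that actually moves the GUI pointer
        self.motion_enabled = False            # default: pointer maneuvers according to gaze

    def on_cop(self, cop):
        """Gaze-driven positioning; ignored once motion control is enabled."""
        if not self.motion_enabled:
            self.move_pointer(cop)

    def on_enable_signal(self, enabled):
        """Trigger signal from the gesture controller 209."""
        self.motion_enabled = enabled

    def on_shape_point(self, screen_point):
        """Motion-driven positioning with a point already mapped from the PTA to the AOI."""
        if self.motion_enabled:
            self.move_pointer(screen_point)


if __name__ == "__main__":
    pc = PointerController(move_pointer=lambda p: print("pointer at", p))
    pc.on_cop((640, 360))          # gaze moves the pointer
    pc.on_enable_signal(True)      # tracking gesture detected
    pc.on_cop((100, 100))          # ignored: motion control has taken over
    pc.on_shape_point((655, 348))  # fine positioning from the tracked hand
```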
[0042] The gesture controller 209 receives gesture data from the motion-tracking controller 206. The gesture controller 209 identifies whether the gesture indicates a command for the operating system 201 (i.e. a command gesture) or a signal to enable pointer control by motion tracking (i.e. a tracking gesture).
[0043] When a command gesture is detected, the gesture controller 209 evaluates the validity of the command gesture based on computer state, and the current position of the pointer. The valid command gesture is then translated into commands that are sent to the OS API 211.
[0044] TABLE 1 below exemplifies relationships between command gesture and the computer state that may be adapted:
TABLE 1: Examples of Command Gestures vs. Computer State Needed
[0045] It is understood that the examples in Table 1 are for illustration only; the required computer state may change for different command gestures according to the actual needs and applicable situations.
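As a concrete illustration of the validity check described in paragraph [0043], the sketch below pairs each command gesture with a required computer state. The gesture names and states are hypothetical stand-ins, since the contents of Table 1 are published only as an image; only the shape of the check mirrors the description.

```python
# Hypothetical gesture -> required computer state table (NOT the patent's Table 1,
# whose contents are only available as an image in the published document).
REQUIRED_STATE = {
    "grab": "pointer_over_movable_object",
    "push": "pointer_over_button",
    "swipe": "scrollable_content_focused",
}

def is_valid_command(gesture: str, computer_state: str, pointer_on_target: bool) -> bool:
    """Return True when the command gesture may be forwarded to the OS API."""
    required = REQUIRED_STATE.get(gesture)
    if required is None:
        return False                      # unknown gesture: ignore it
    return pointer_on_target and computer_state == required

# e.g. is_valid_command("grab", "pointer_over_movable_object", True) -> True
```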
[0046] When a tracking gesture is detected, the gesture controller 209 provides the tracking gesture details to the OS API 211 and an enable or trigger signal to the pointer controller 208 to enable pointer control by motion tracking. The pointer controller 208 then finds the shape point corresponding to the tracking gesture in the shape point database received from the motion-tracking controller 206. The pointer controller 208 then maps the location of the shape point within the pre-defined PTA in the motion tracking module 204 FOV to the AOI on the screen and positions the pointer at a corresponding location on the screen accordingly.
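The selection of the shape point from the tracking gesture can be sketched as follows; the gesture-to-shape-point table and the dictionary format of the shape points list are assumptions made for illustration. The selected point is subsequently translated from the PTA to the AOI using Formulae (5) and (6) below.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical mapping from tracking gesture type to the shape point that guides the pointer.
SHAPE_POINT_FOR_GESTURE = {
    "point": "index_fingertip",
    "open_palm": "palm_center",
}

def select_shape_point(gesture_type: str,
                       shape_points: Dict[str, Point]) -> Optional[Point]:
    """Pick the shape point corresponding to the detected tracking gesture, if any."""
    name = SHAPE_POINT_FOR_GESTURE.get(gesture_type)
    return shape_points.get(name) if name else None
```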
[0047] FIG. 3 illustrates a process for combining ocular and motion control in accordance with one embodiment of the present invention. This process is carried out by the system 200 of FIG. 2. At step 302, a COP is computed from the gaze data. At step 304, an AOI is rendered based on the COP. At step 306, the pointer is moved to the COP. At step 308, the motion tracking controller 206 monitors for any motion tracking gesture. At step 310, when a motion tracking gesture is detected, the pointer is moved, at step 312, to the location with which the tracked object is associated. When no motion tracking gesture is detected at step 310, a command gesture from the user will be received at step 314. At step 316, the command gesture is checked for validity. When the command gesture is applicable (i.e. valid), the corresponding command gesture is executed on the operating system at step 318. Otherwise, the command gesture will be ignored at step 320.
[0048] FIG. 4 illustrates a flow diagram of a process for generating a computed-ocular-point from gaze data received from the ocular tracking module in accordance with one embodiment of the present invention. In particular, the process includes a COP generation. In step 402, gaze data is received from a hardware API, such as the ocular module API. The gaze data includes, but is not limited to, the eye gaze location on screen, the eye gaze data validity level, the distance of the eyes from the screen (or from the tracking module), and the eyes' ocular axis in the sensor's FOV. At step 404, the gaze data is stored in a gaze data collection database 405. The data is stored for a pre-defined duration that approaches the user's response time (e.g. 200 ms) so that the overall ocular control interaction is not perceived as slow by the user. At step 406, the system checks whether gaze data has been received for the pre-defined duration. If no data is received, the process returns to step 402. If data is received, the gaze data is retrieved from the gaze data collection database 405 at step 408. At step 410, the gaze data is filtered based on its validity level and a statistical analysis for outliers. At step 412, the filtered gaze data is then converted into coordinate points. The conversion determines the user's left and right eye positions relative to the screen. Offsets and a correction factor may be added during the conversion. At step 414, the coordinate points are averaged out to obtain a COP. If multiple display units are utilised, the screen identification is included in the COP.
[0049] FIG. 5 illustrates a flow diagram of a process of computing and rendering the AOI based on the COP in accordance with one embodiment of the present invention. At step 502, a COP is received or acquired from the ocular tracking module 203. At step 504, an AOI size is determined based on the COP value, equipment profile and settings, which can be acquired from the equipment profile and setting database 505. In one embodiment, it is known in the art that the AOI size can be adjusted to a bigger size to compensate for a decrease of accuracy in certain parts of the screen (e.g. the screen edge). In another embodiment, the AOI size can also be customised by the user as desired. The AOI size is directly proportional to the system workload when processing motion control. It is important that an optimum AOI should take into account the accuracy of ocular control and motion control to provide interaction with less human effort.
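Referring back to the COP generation of FIG. 4, a minimal sketch of steps 404 to 414 is given below. It assumes gaze samples arrive as (x, y, validity) tuples collected over roughly one response-time window (about 200 ms); the median-distance outlier cut is an assumed stand-in for the statistical analysis of step 410, which the description does not specify.

```python
from statistics import median
from typing import List, Optional, Tuple

GazeSample = Tuple[float, float, float]   # (x, y, validity in [0, 1])

def compute_cop(samples: List[GazeSample],
                min_validity: float = 0.5,
                max_px_from_median: float = 150.0) -> Optional[Tuple[float, float]]:
    """Average the gaze samples of one collection window into a computed-ocular-point."""
    valid = [(x, y) for x, y, v in samples if v >= min_validity]
    if not valid:
        return None
    mx, my = median(x for x, _ in valid), median(y for _, y in valid)
    # crude outlier rejection: drop samples far from the median fixation
    kept = [(x, y) for x, y in valid
            if abs(x - mx) <= max_px_from_median and abs(y - my) <= max_px_from_median]
    if not kept:
        return None
    return (sum(x for x, _ in kept) / len(kept),
            sum(y for _, y in kept) / len(kept))
```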
[0050] While rendering the AOI size at the step 504, the equipment profile and settings may be taken from the ocular tracking module 203 and the motion tracking module 204.
[0051] In another embodiment, the user may manually configure the parameters used to calculate the AOI height and width. The modified parameters can be stored as part of the user profile. It is known that the ocular tracking module 203 or device may comprise one or more eye trackers or tracking sensors. The accuracy level of the ocular tracking module 203 can be based on the uncertainty of the ocular gaze angle: for example, with an ocular gaze angle uncertainty of less than or equal to about 0.5 degree, the accuracy level can be relatively high; an ocular gaze angle uncertainty of less than or equal to 1 degree may obtain only a medium level of accuracy; whilst anything beyond 1 degree of ocular gaze angle uncertainty generally gets a low accuracy level.
[0052] Similarly, the motion-tracking module 204 may also include one or more motion trackers or tracking sensors. The accuracy level of the motion tracking module 204 depends on the sensitivity of motion tracking. The sensitivity of the sensor is defined as the minimum motion distance that the sensor can detect. In one given example, the sensitivity level of the motion tracking module 204 may be set as follows:
[Table: sensitivity levels of the motion tracking module 204]
[0053] The effectiveness of motion tracking is, on the other hand, dependent on the percentage of instances the module is able to detect motion. The effectiveness level of the motion tracking module 204 can be set as follows:
[Table: effectiveness levels of the motion tracking module 204]
Following the above, the accuracy levels of the ocular tracking module 203 and the motion tracking module 204 may be rendered into an AOI size level through the lookup table as provided below:

                               Eye Tracking Accuracy
                               High        Medium      Low
  Motion Tracking    High      Medium      Medium      Small
  Accuracy           Medium    Big         Medium      Small
                     Low       Big         Big         Medium
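The two classification steps above can be sketched as follows. The gaze-angle thresholds follow the values given in paragraph [0051]; the motion tracking accuracy is taken as an input, because the sensitivity and effectiveness tables are published only as images; the lookup dictionary reproduces the table above.

```python
def eye_tracking_accuracy(gaze_angle_uncertainty_deg: float) -> str:
    """Classify ocular tracking accuracy from the gaze-angle uncertainty in degrees."""
    if gaze_angle_uncertainty_deg <= 0.5:
        return "high"
    if gaze_angle_uncertainty_deg <= 1.0:
        return "medium"
    return "low"

# (motion tracking accuracy, eye tracking accuracy) -> AOI size level
AOI_SIZE_LEVEL = {
    ("high", "high"): "medium",   ("high", "medium"): "medium",   ("high", "low"): "small",
    ("medium", "high"): "big",    ("medium", "medium"): "medium", ("medium", "low"): "small",
    ("low", "high"): "big",       ("low", "medium"): "big",       ("low", "low"): "medium",
}

def aoi_size_level(motion_accuracy: str, gaze_angle_uncertainty_deg: float) -> str:
    """Combine both accuracy levels into an AOI size level via the lookup table."""
    return AOI_SIZE_LEVEL[(motion_accuracy, eye_tracking_accuracy(gaze_angle_uncertainty_deg))]
```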
[0056] Following that, the AOI size level can be converted to a user ocular view angle range based on the lookup table below:
[Table: AOI size level to user ocular view angle range]
[0057] The AOI height in pixels can be calculated with the following formula:
AOI height = user distance × tan(user ocular view angle range) × screen pixel density
[0058] The AOI width by default is 4/3 of the height for a rectangular-shaped AOI. It can be of another ratio, such as one based on the screen aspect ratio.
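For example, the height formula above can be evaluated as in the following sketch; the units (millimetres for the viewing distance, pixels per millimetre for the pixel density) are assumptions made for illustration, since the description does not fix them.

```python
import math

def aoi_size_px(user_distance_mm: float,
                ocular_view_angle_deg: float,
                screen_px_per_mm: float,
                aspect: float = 4 / 3) -> tuple:
    """AOI height = user distance x tan(ocular view angle range) x pixel density;
    the width defaults to 4/3 of the height, as in paragraphs [0057]-[0058]."""
    height = user_distance_mm * math.tan(math.radians(ocular_view_angle_deg)) * screen_px_per_mm
    return aspect * height, height            # (width_px, height_px)

# e.g. 600 mm viewing distance, a 2 degree angle range and 3.8 px/mm:
# aoi_size_px(600, 2.0, 3.8) -> roughly (106, 80) pixels
```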
[0059] Referring now to step 506 of FIG. 5, the AOI position is calculated based on the COP. FIG. 6 illustrates the generation of the AOI in relation to the screen and the COP. In the present embodiment, the AOI position can be rendered through Formulae (1) and (2) as provided below:
[0060] X_AOI = X_COP − (W_AOI / 2)     (1)
[0061] Y_AOI = Y_COP − (h_AOI / 2)     (2)
[0062] where,
[0063] X_AOI = x-axis coordinate of the AOI upper left corner;
[0064] X_COP = x-axis coordinate of the COP;
[0065] h_AOI = height of the AOI;
[0066] Y_AOI = y-axis coordinate of the AOI upper left corner;
[0067] Y_COP = y-axis coordinate of the COP; and
[0068] W_AOI = width of the AOI.
[0069] At step 508, the AOI is offset according to the valid borders based on the screen settings 509. The AOI offset is calculated through Formulae (3) and (4) as provided below:
[0070] X'_AOI = min( max(X_AOI, X_BorderLeft), X_BorderRight − W_AOI )     (3)
[0071] Y'_AOI = min( max(Y_AOI, Y_BorderTop), Y_BorderBottom − h_AOI )     (4)
where,
[0072] X'_AOI = x-axis coordinate of the actual AOI upper left corner;
[0073] X_BorderLeft = x-axis coordinate of the left (as relative to the user) border of the computer screen, which is usually value 0;
[0074] X_BorderRight = x-axis coordinate of the right (as relative to the user) border of the computer screen;
[0075] Y'_AOI = y-axis coordinate of the actual AOI upper left corner;
[0076] Y_BorderTop = y-axis coordinate of the upper (as relative to the user) border of the computer screen, which is usually value 0; and
[0077] Y_BorderBottom = y-axis coordinate of the bottom (as relative to the user) border of the computer screen.
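Formulae (1) to (4) can be combined into a single placement routine, as in the sketch below; the rectangular screen and the coordinate conventions are assumptions of this illustration, and the min/max form restates the border offset described above.

```python
def place_aoi(cop_x: float, cop_y: float,
              aoi_w: float, aoi_h: float,
              border_left: float, border_top: float,
              border_right: float, border_bottom: float) -> tuple:
    """Upper-left corner of the AOI: centred on the COP (Formulae (1)-(2)),
    then offset so the AOI stays inside the valid screen borders (Formulae (3)-(4))."""
    x_aoi = cop_x - aoi_w / 2
    y_aoi = cop_y - aoi_h / 2
    x_aoi = min(max(x_aoi, border_left), border_right - aoi_w)
    y_aoi = min(max(y_aoi, border_top), border_bottom - aoi_h)
    return x_aoi, y_aoi

# e.g. a 400x300 AOI around a COP near the right edge of a 1920x1080 screen:
# place_aoi(1900, 500, 400, 300, 0, 0, 1920, 1080) -> (1520, 350)
```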
[0078] As noted above in conjunction with the illustrations in FIG. 6, it will be appreciated that the present invention is applicable to an irregular, non-rectangular screen with different top, right, bottom, and left borders. In the embodiments of the present invention, it is desired that the screen settings 509 be monitored, computed and stored accordingly to cater for different screen configurations.
[0079] Once the AOI is computed, at step 510, the AOI is drawn on the display screen for the user's reference. The AOI may be presented in different appearances of UI overlays based on data accuracy and user preference.
[0080] FIG. 7 is a flow diagram illustrating a process of translating the position of a tracked object in the PTA onto the AOI in accordance with one embodiment of the present invention. At step 702, an enable signal is generated along with a tracking gesture type by the gesture controller 209. The enable signal is sent to the pointer controller 208 to enable motion tracking for pointer positioning. The tracking gesture type allows the pointer controller to select a corresponding shape point. At step 704, the latest stored shape points list and PTA details are obtained from a database 705, which can be stored on the motion tracking controller. At step 706, the detected tracking gesture is used to find and match a desired shape point. At step 708, if no desired shape point is found, the process returns to step 704 to wait for a new list of shape points. If a desired shape point is found at step 708, then at step 710 the latest stored AOI details are obtained through the AOI controller 207 from the AOI database 711. The AOI details are pre-stored on the AOI database 711. At step 712, the pointer position is calculated. For better illustration, FIGs. 8A and 8B, which illustrate the screen with reference to the AOI and the FOV with reference to the PTA, are provided. The pointer position is calculated as shown in Formulae (5) and (6) as illustrated below:
[0081] X_P = ((X_Obj − X_PTA) / W_PTA) × W_AOI + X_AOI     (5)
[0082] Y_P = ((Y_Obj − Y_PTA) / h_PTA) × h_AOI + Y_AOI     (6)
[0083] where,
[0084] X_P = x-coordinate of the pointer on the computer screen;
[0085] X_Obj = x-coordinate of a specific point on the tracked object, depending on the shape of the object in the PTA;
[0086] X_PTA = x-coordinate of the upper left corner of the PTA;
[0087] h_PTA = height of the PTA;
[0088] h_AOI = height of the AOI;
[0089] X_AOI = x-coordinate of the AOI on the screen;
[0090] Y_P = y-coordinate of the pointer on the computer screen;
[0091] Y_Obj = y-coordinate of a specific point on the tracked object, depending on the shape of the object in the PTA;
[0092] Y_PTA = y-coordinate of the upper left corner of the PTA;
[0093] W_PTA = width of the PTA;
[0094] W_AOI = width of the AOI; and
[0095] Y_AOI = y-coordinate of the AOI on the screen.
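Formulae (5) and (6) translate directly into code as in the sketch below; passing the PTA and AOI as (x, y, width, height) tuples is a convention assumed only for this illustration.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]   # (x, y, width, height)

def map_pta_to_aoi(obj_x: float, obj_y: float, pta: Rect, aoi: Rect) -> Tuple[float, float]:
    """Translate a tracked shape point inside the PTA to the pointer position in the AOI,
    per Formulae (5) and (6)."""
    pta_x, pta_y, pta_w, pta_h = pta
    aoi_x, aoi_y, aoi_w, aoi_h = aoi
    x_p = (obj_x - pta_x) / pta_w * aoi_w + aoi_x
    y_p = (obj_y - pta_y) / pta_h * aoi_h + aoi_y
    return x_p, y_p

# e.g. a fingertip at the centre of the PTA lands at the centre of the AOI:
# map_pta_to_aoi(0.5, 0.5, (0, 0, 1, 1), (1520, 350, 400, 300)) -> (1720.0, 500.0)
```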
[0096] Once the pointer position is calculated based on the tracked gesture, the pointer is positioned at its corresponding location within the AOI at step 714. Depending on the gesture type, the pointer appearance may be changed as the pointer is located on the computer interface, at step 716.
[0097] FIGs. 9A-9K depict a sequence of human-computer interaction utilizing the system and method of the above embodiments in accordance with the present invention. As shown in FIG. 9A, as the user is gazing on the screen, an AOI 902 is rendered. The pointer 901 is also moved from its original position to its new position within the AOI 902 according to the detected user's gaze. Preferably, the AOI 902 is a boundary visible to the user. Once the AOI 902 is presented, as shown in FIG. 9B, as the user raises one of his hands, a PTA 904 is formed within a FOV 906. The FOV 906 is limited by the hardware configurations of the ocular and motion-tracking device. The PTA 904 boundary and position are defined based on the AOI 902. As shown in FIG. 9C, as the user's hand moves within the PTA 904, the pointer 901 moves accordingly within the AOI 902. It is to be noted that the AOI 902 remains static in position as the pointer moves within the AOI 902.
[0098] As shown in FIG. 9D, the user raises the other hand as a gesture command to fix the AOI 902 in place. As shown in FIG. 9E, the other hand that fixes the position of the AOI 902 may further grasp its fingers as a grab gesture command to further fix the AOI 902. This accordingly disables the gaze control and passes over the control of the AOI 902 to the other hand with the grab gesture, as shown in FIG. 9F. The user may then interact with any GUI element, such as the window 908, that falls within the AOI 902 with his right hand. For example, the user may move the window 908 to its new location 908' as the right hand is moving. The window 908 can be moved as far as the right hand stays within the PTA 904. If the right hand moves beyond the boundary of the PTA 904, the window 908 stops moving. As shown in FIG. 9G, the user may disable the pointer tracking gesture so that the window can be moved according to a gaze point 912. To move the window 908 to the user's gaze point 912, the user simply needs to move his hand out of the FOV 906.
[0099] As shown in FIG. 9I, as the window 908 is moved to its new position 908', the AOI 902 also moves to its new location. Similarly, the PTA 904 will be re-computed to correspond to the new AOI 902 location. As shown in FIG. 9J, the user's right hand may be raised again within the PTA 904 to disable the gaze control, and the pointer will again change its position to correspond to the right hand's movements. As shown in FIG. 9K, the user releases the window from the "click-and-drag" mode by opening his hand so that he may again interact with the UI within the AOI 902 as described above.
[00100] While specific embodiments have been described and illustrated, it is understood that many changes, modifications, variations, and combinations thereof could be made to the present invention without departing from the scope of the invention.

Claims

1. An ocular and motion controller system (200) for controlling interaction between a Graphical User Interface (GUI) and a user, wherein the GUI is displayed on one or more display screens (102), the system (200) comprising:
an ocular tracking module (203) for capturing gaze data of the user's gaze; an ocular tracking controller (205) operationally receiving the gaze data from the ocular tracking module (203) to compute a Computed Ocular Point (COP);
an Area of Interest (AOI) controller (207) operationally receiving the COP from the ocular tracking controller (205) to render an AOI (104);
a pointer controller (208) operationally positioning a pointer of the GUI on the screen (102);
a motion tracking module (204) for capturing gestures of the user;
a motion tracking controller (206) for receiving gestures that fall within a Pointer-Tracking-Area (PTA) of the motion tracking module (204) to detect a tracked object, wherein the PTA is determined based on the AOI (104); and
a gesture controller (209) for receiving the tracked object to extract command gestures therefrom, wherein, operationally, the command gesture is extracted based on the pointer position and a computer state, wherein the gesture controller (209) enables a motion control on the pointer controller (208) to reposition the pointer according to a location of the tracked object determined within a Field-of-View (FOV) of the motion tracking module (204), and wherein the user fixes a position of the AOI with one command gesture in order to disable the gaze control, which allows another command gesture to interact with the GUI within the AOI.
2. An ocular and motion controller system (200) according to claim 1, wherein the ocular tracking controller (205) is adapted for computing the COP based on the gaze data detected by the ocular tracking module (203) within a predefined period, all the gaze data taken within the predefined period are stored on a gaze database, the gaze data are filtered based on the validity of the detected command gestures, and the filtered gaze data are converted into a plurality of coordinate points in association with the screen information (210) and are averaged accordingly to obtain the COP.
3. An ocular and motion controller system (200) according to claim 1, wherein the ocular tracking controller (205) further determines an AOI size based on the COP value, an equipment profile and program setting stored on the ocular tracking module (203) in order to offset the AOI according to valid screen border values of the screen, thereby displaying the AOI on the screen.
4. An ocular and motion controller system (200) according to claim 1, wherein the gesture controller (209) is further adapted for generating an enabler signal to enable the pointer controller (208) and for processing a type of gesture to match an intended command gesture against a shape point database to obtain a shape point.
5. An ocular and motion controller system (200) according to claim 4, wherein the pointer is repositioned based on a current AOI and the shape point for interacting with the UI according to the gesture command.
6. An ocular and motion controller system (200) according to claim 1, wherein when the pointer controller (208) is enabled by the gesture controller (209), gaze data from the ocular tracking controller (205) is ignored and the pointer movement and interaction with the GUI is controlled by the motion tracking controller (206).
7. A method for interacting with a graphical user interface (GUI) displaying on one or more display screens (102) of a computer (100), the computer has an ocular tracking module (203) and a motion tracking module (204), the method comprising:
receiving gaze data through the ocular tracking module (203) to compute a
Computed-Ocular-Point (COP);
rendering an Area-of-Interest (AOI) (104) based on the COP;
positioning a pointer of the GUI to a position on the screen (102) corresponding to the COP;
monitoring gestures captured by the motion tracking module (204) that fall within a Pointer-Tracking-Area (PTA) of the motion tracking module (204), wherein the PTA is determined based on the AOI (104);
detecting a tracked object from the gestures that fall within the PTA to extract command gestures therefrom;
determining validity of the detected command gestures based on the pointer position and the computer state detected by the motion tracking module (204); and re-positioning the pointer according to a location of the tracked object determined within a Field-of-View (FOV) of the motion tracking module (204).
8. A method according to claim 7, wherein computing the COP further comprising: receiving gaze data from the ocular tracking module (203) taken within a predefined period;
storing all the gaze data taken within the predefined period; filtering the gaze data based on the validity of the detected command gesture; converting the respective gaze data into a plurality of coordinate points in association with the screen; and
averaging the coordinate points and outputting the averaged result as the COP.
9. A method according to claim 7, wherein computing the COP further comprising: determining AOI size based on COP value, an equipment profile and program setting stored on the ocular tracking module;
offsetting AOI according to a valid screen border values of the screen; and displaying the AOI on the screen.
10. A method according to claim 7, wherein positioning the pointer according to location of tracked object, further comprising:
receiving an enabler signal to enable a pointer controller, and processing a type of gesture;
matching the type of gesture with a shape point database to obtain an intended command gesture and a shape point;
calculating a new pointer position based on current AOI and the shape point; re-positioning the pointer at the shape point; and
interacting with the UI according to the gesture command.
11. A method according to claim 10, wherein when the pointer controller (208) is enabled by the gesture controller (209), gaze data from the ocular tracking controller (205) is ignored and the pointer movement and interaction with the UI is controlled by the gesture detected by the motion tracking controller (206).
PCT/MY2015/000017 2014-03-07 2015-02-27 Method and apparatus to combine ocular control with motion control for human computer interaction WO2015133889A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2014000655 2014-03-07
MYPI2014000655A MY175525A (en) 2014-03-07 2014-03-07 Method and apparatus to combine ocular control with motion control for human computer interaction

Publications (1)

Publication Number Publication Date
WO2015133889A1 true WO2015133889A1 (en) 2015-09-11

Family

ID=54055609

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2015/000017 WO2015133889A1 (en) 2014-03-07 2015-02-27 Method and apparatus to combine ocular control with motion control for human computer interaction

Country Status (2)

Country Link
MY (1) MY175525A (en)
WO (1) WO2015133889A1 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018057450A1 (en) * 2016-09-20 2018-03-29 Tobii Ab Gaze and saccade based graphical manipulation
US10055191B2 (en) 2013-08-23 2018-08-21 Tobii Ab Systems and methods for providing audio to a user based on gaze input
US10082870B2 (en) 2013-03-04 2018-09-25 Tobii Ab Gaze and saccade based graphical manipulation
US10346128B2 (en) 2013-08-23 2019-07-09 Tobii Ab Systems and methods for providing audio to a user based on gaze input
US10353464B2 (en) 2013-03-04 2019-07-16 Tobii Ab Gaze and saccade based graphical manipulation
CN111290575A (en) * 2020-01-21 2020-06-16 中国人民解放军空军工程大学 Multichannel interactive control system of air defense anti-pilot weapon
US20200192485A1 (en) * 2018-12-12 2020-06-18 Lenovo (Singapore) Pte. Ltd. Gaze-based gesture recognition
US10768699B2 (en) * 2018-09-10 2020-09-08 Lenovo (Singapore) Pte. Ltd. Presentation to user of indication of object at which another person is looking
US10895909B2 (en) 2013-03-04 2021-01-19 Tobii Ab Gaze and saccade based graphical manipulation
US10895908B2 (en) 2013-03-04 2021-01-19 Tobii Ab Targeting saccade landing prediction using visual history
US11144759B1 (en) 2020-05-12 2021-10-12 Lenovo (Singapore) Pte. Ltd. Presentation of graphical objects on display based on input from rear-facing camera
US11568640B2 (en) 2019-09-30 2023-01-31 Lenovo (Singapore) Pte. Ltd. Techniques for providing vibrations at headset
US11619989B2 (en) 2013-03-04 2023-04-04 Tobil AB Gaze and saccade based graphical manipulation
WO2023082952A1 (en) * 2021-11-10 2023-05-19 华为技术有限公司 Method for interacting with electronic device, and electronic device
US11714487B2 (en) 2013-03-04 2023-08-01 Tobii Ab Gaze and smooth pursuit based continuous foveal adjustment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012082971A1 (en) * 2010-12-16 2012-06-21 Siemens Corporation Systems and methods for a gaze and gesture interface
WO2014015521A1 (en) * 2012-07-27 2014-01-30 Nokia Corporation Multimodal interaction with near-to-eye display

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012082971A1 (en) * 2010-12-16 2012-06-21 Siemens Corporation Systems and methods for a gaze and gesture interface
WO2014015521A1 (en) * 2012-07-27 2014-01-30 Nokia Corporation Multimodal interaction with near-to-eye display

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10895909B2 (en) 2013-03-04 2021-01-19 Tobii Ab Gaze and saccade based graphical manipulation
US10895908B2 (en) 2013-03-04 2021-01-19 Tobii Ab Targeting saccade landing prediction using visual history
US11619989B2 (en) 2013-03-04 2023-04-04 Tobil AB Gaze and saccade based graphical manipulation
US11714487B2 (en) 2013-03-04 2023-08-01 Tobii Ab Gaze and smooth pursuit based continuous foveal adjustment
US20190138091A1 (en) * 2013-03-04 2019-05-09 Tobii Ab Gaze and saccade based graphical manipulation
US10082870B2 (en) 2013-03-04 2018-09-25 Tobii Ab Gaze and saccade based graphical manipulation
US10353464B2 (en) 2013-03-04 2019-07-16 Tobii Ab Gaze and saccade based graphical manipulation
US10430150B2 (en) 2013-08-23 2019-10-01 Tobii Ab Systems and methods for changing behavior of computer program elements based on gaze input
US10346128B2 (en) 2013-08-23 2019-07-09 Tobii Ab Systems and methods for providing audio to a user based on gaze input
US10055191B2 (en) 2013-08-23 2018-08-21 Tobii Ab Systems and methods for providing audio to a user based on gaze input
US10635386B2 (en) 2013-08-23 2020-04-28 Tobii Ab Systems and methods for providing audio to a user based on gaze input
CN109716265A (en) * 2016-09-20 2019-05-03 托比股份公司 Based on the graphical manipulation watched attentively and swept
WO2018057450A1 (en) * 2016-09-20 2018-03-29 Tobii Ab Gaze and saccade based graphical manipulation
US10768699B2 (en) * 2018-09-10 2020-09-08 Lenovo (Singapore) Pte. Ltd. Presentation to user of indication of object at which another person is looking
US20200192485A1 (en) * 2018-12-12 2020-06-18 Lenovo (Singapore) Pte. Ltd. Gaze-based gesture recognition
US11568640B2 (en) 2019-09-30 2023-01-31 Lenovo (Singapore) Pte. Ltd. Techniques for providing vibrations at headset
CN111290575A (en) * 2020-01-21 2020-06-16 中国人民解放军空军工程大学 Multichannel interactive control system of air defense anti-pilot weapon
US11144759B1 (en) 2020-05-12 2021-10-12 Lenovo (Singapore) Pte. Ltd. Presentation of graphical objects on display based on input from rear-facing camera
WO2023082952A1 (en) * 2021-11-10 2023-05-19 华为技术有限公司 Method for interacting with electronic device, and electronic device

Also Published As

Publication number Publication date
MY175525A (en) 2020-07-01

Similar Documents

Publication Publication Date Title
WO2015133889A1 (en) Method and apparatus to combine ocular control with motion control for human computer interaction
US9921663B2 (en) Moving object detecting apparatus, moving object detecting method, pointing device, and storage medium
US8933882B2 (en) User centric interface for interaction with visual display that recognizes user intentions
US9829973B2 (en) Eye gaze determination
US20180307396A1 (en) Operation control device and operation control method
EP3527121B1 (en) Gesture detection in a 3d mapping environment
US20170024017A1 (en) Gesture processing
US10671156B2 (en) Electronic apparatus operated by head movement and operation method thereof
US9229534B2 (en) Asymmetric mapping for tactile and non-tactile user interfaces
JP6390799B2 (en) Input device, input method, and program
KR20140035358A (en) Gaze-assisted computer interface
US20130343607A1 (en) Method for touchless control of a device
IL234665A (en) Enhanced virtual touchpad and touchscreen
US20150234467A1 (en) Method and apparatus for gesture detection and display control
US20150277570A1 (en) Providing Onscreen Visualizations of Gesture Movements
CN108027656A (en) Input equipment, input method and program
JP2012238293A (en) Input device
US8462110B2 (en) User input by pointing
KR102326489B1 (en) Electronic device and method for controlling dispaying
CN109389082B (en) Sight line acquisition method, device, system and computer readable storage medium
US11604517B2 (en) Information processing device, information processing method for a gesture control user interface
CN110858095A (en) Electronic device capable of being controlled by head and operation method thereof
EP2390761A1 (en) A method and system for selecting an item in a three dimensional space
AU2015252151B2 (en) Enhanced virtual touchpad and touchscreen
EP3059664A1 (en) A method for controlling a device by gestures and a system for controlling a device by gestures

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15758654

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15758654

Country of ref document: EP

Kind code of ref document: A1