US20090268945A1 - Architecture for controlling a computer using hand gestures - Google Patents

Architecture for controlling a computer using hand gestures

Info

Publication number
US20090268945A1
Authority
US
United States
Prior art keywords
user
hand
image
computer
command
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/494,303
Inventor
Andrew D. Wilson
Nuria M. Oliver
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US12/494,303
Publication of US20090268945A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC (assignment of assignors interest; see document for details). Assignors: MICROSOFT CORPORATION
Legal status: Abandoned

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10Program control for peripheral devices
    • G06F13/105Program control for peripheral devices where the programme performs an input/output emulation function
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/038Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry

Definitions

  • the present invention relates generally to controlling a computer system, and more particularly to a system and method to implement alternative modalities for controlling computer application programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and verbal commands.
  • a user interface facilitates the interaction between a computer and computer user by enhancing the user's ability to utilize application programs.
  • the traditional interface between a human user and a typical personal computer is implemented with graphical displays and is generally referred to as a graphical user interface (GUI).
  • Input to the computer or particular application program is accomplished through the presentation of graphical information on the computer screen and through the use of a keyboard and/or mouse, trackball or other similar implements.
  • Many systems employed for use in public areas utilize touch screen implementations whereby the user touches a designated area of a screen to effect the desired input.
  • Airport electronic ticket check-in kiosks and rental car direction systems are examples of such systems. There are, however, many applications where the traditional user interface is less practical or efficient.
  • the traditional computer interface is not ideal for a number of applications.
  • Providing stand-up presentations or other type of visual presentations to large audiences is but one example.
  • a presenter generally stands in front of the audience and provides a verbal dialog in conjunction with the visual presentation that is projected on a large display or screen.
  • Manipulation of the presentation by the presenter is generally controlled through use of awkward remote controls, which frequently suffer from inconsistent and less precise operation, or require the cooperation of another individual.
  • Traditional user interfaces require the user either to provide input via the keyboard or to exhibit a degree of skill and precision more difficult to implement with a remote control than a traditional mouse and keyboard.
  • Other examples include control of video, audio, and display components of a media room.
  • Switching between sources, fast-forwarding, rewinding, changing chapters, changing volume, etc., can be very cumbersome in a professional studio as well as in the home.
  • traditional interfaces are not well suited for smaller, specialized electronic gadgets.
  • WIMP (Window, Icon, Menu, Pointing device, or Pull-down menu) interfaces allow fairly non-trivial operations to be performed with a few mouse motions and clicks.
  • this shift in the user interaction from a primarily text-oriented experience to a point-and-click experience has erected new barriers between people with disabilities and the computer.
  • using the mouse can be quite challenging.
  • Perceptual user interfaces utilize alternate sensing modalities, such as the capability of sensing physical gestures of the user, to replace or complement traditional input devices such as the mouse and keyboard.
  • Perceptual user interfaces promise modes of fluid computer-human interaction that complement and/or replace the mouse and keyboard, particularly in non-desktop applications such as control for a media room.
  • gestures play a symbolic communication role similar to speech, suggesting that for simple tasks gesture may enhance or replace speech recognition.
  • Small gestures near the keyboard or mouse do not induce fatigue as quickly as sustained whole arm postures.
  • Previous studies indicate that users find gesture-based systems highly desirable, but that users are also dissatisfied with the recognition accuracy of gesture recognizers.
  • experimental results indicate that a user's difficulty with gestures is in part due to a lack of understanding of how gesture recognition works. The studies highlight the ability of users to learn and remember gestures as an important design consideration.
  • Gestures may compensate for the limitations of the mouse when the display is several times larger than a typical display. In such a scenario, gestures can provide mechanisms to restore the ability to quickly reach any part of the display, where once a mouse was adequate with a small display. Similarly, in a multiple display scenario it is desirable to have a fast comfortable way to indicate a particular display. For example, the foreground object may be “bumped” to another display by gesturing in the direction of the target display.
  • perceptual user interfaces to date are dependent on significant limiting assumptions.
  • One type of perceptual user interface utilizes color models that make certain assumptions about the color of an object. Proper operation of the system is dependent on proper lighting conditions and can be negatively impacted when the system is moved from one location to another as a result of changes in lighting conditions, or simply when the lighting conditions change in the room. Factors that impact performance include sunlight versus artificial light, fluorescent light versus incandescent light, direct illumination versus indirect illumination, and the like.
  • most attempts to develop perceptual user interfaces require the user to wear specialized devices such as gloves, headsets, or close-talk microphones. The use of such devices is generally found to be distracting and intrusive for the user.
  • perceptual user interfaces have been slow to emerge. The reasons include heavy computational burdens, unreasonable calibration demands, required use of intrusive and distracting devices, and a general lack of robustness outside of specific laboratory conditions. For these and similar reasons, there has been little advancement in systems and methods for exploiting perceptual user interfaces. However, as the trend towards smaller, specialized electronic gadgets continues to grow, so does the need for alternate methods for interaction between the user and the electronic device. Many of these specialized devices are too small, and their applications too unsophisticated, to utilize traditional keyboard and mouse input devices. Examples of such devices include TabletPCs, Media center PCs, kiosks, hand held computers, home appliances, video games, and wall sized displays, along with many others. In these, and other applications, the perceptual user interface provides a significant advancement in computer control over traditional computer interaction modalities.
  • gestures may offer significant bits of functionality where they are needed most. For example, dismissing a notification window may be accomplished by a quick gesture to the one side or the other, as in shooing a fly.
  • gestures for “next” and “back” functionality found in web browsers, presentation programs (e.g., PowerPoint™) and other applications. Note that in many cases the surface forms of these various gestures may remain the same throughout these examples, while the semantics of the gestures depends on the application at hand. Providing a small set of standard gestures eases problems users have in recalling how gestures are performed, and also allows for simpler and more robust signal processing and recognition processes.
  • the present invention relates to a system and methodology to implement a perceptual user interface comprising alternative modalities for controlling computer application programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and verbal commands.
  • a perceptual user interface system is provided that detects and tracks hand and/or object movements, and provides for the control of application programs and manipulation of on-screen objects in response to hand or object movements performed by the user.
  • the system operates in real time, is robust, responsive, and introduces a reduced computational load due to “lightweight” sparse stereo image processing by not imaging every pixel, but only a reduced representation of image pixels. That is, the depth at every pixel in the image is not computed, which is the typical approach in conventional correlation-based stereo systems.
  • the present invention utilizes the depth information at specific locations in the image that correspond to object hypotheses.
  • the system provides a relatively inexpensive capability for the recognition of hand gestures.
  • Mice are particularly suited to fine cursor control, and most users have much experience with them.
  • the disclosed invention can provide a secondary, coarse control that may complement mice in some applications. For example, in a map application, the user might cause the viewpoint to change with a gesture, while using the mouse to select and manipulate particular objects in the view.
  • the present invention may also provide a natural “push-to-talk” or “stop-listening” signal to speech recognition processes. Users were shown to prefer using a perceptual user interface for push-to-talk.
  • the invention combines area cursors with gesture-based manipulation of on-screen objects, and may be configured to be driven by gross or fine movements, and may be helpful to people with limited manual dexterity.
  • a multiple hypothesis tracking framework allows for the detection and tracking of multiple objects.
  • tracking of both hands may be considered for a two-handed interface.
  • Two-handed interfaces are often used to specify spatial relationships that are otherwise more difficult to describe in speech. For example, it is natural to describe the relative sizes of objects by holding up two hands, or to specify how an object (dominant hand) is to be moved with respect to its environment (non-dominant hand).
  • a system that facilitates the processing of computer-human interaction in response to multiple input modalities.
  • the system processes commands in response to hand gestures or a combination of hand gestures and verbal commands, or in addition to traditional computer-human interaction modalities such as a keyboard and mouse.
  • the user interacts with the computer and controls the application through a series of hand gestures, or a combination of hand gestures and verbal commands, but is also free to operate the system with traditional interaction devices when more appropriate.
  • the system and method provide for certain actions to be performed in response to particular verbal commands. For example, a verbal command “Close” may be used to close a selected window and a verbal command “Raise” may be used to bring the window to the forefront of the display.
  • the present invention facilitates adapting the system to the particular preferences of an individual user.
  • the system and method allow the user to tailor the system to recognize specific hand gestures and verbal commands and to associate these hand gestures and verbal commands with particular actions to be taken. This capability allows different users, who may prefer to make different motions for a given command, to tailor the system in a way most efficient for their personal use. Similarly, different users can choose to use different verbal commands to perform the same function. For example, one user may choose to say “Release” to stop moving a window while another may wish to say “Quit”.
  • dwell time is used as an alternative modality to complement gestures or verbal commands.
  • Dwell time is the length of time an input device pointer remains in a particular position (or location of the GUI), and is controlled by the user holding one hand stationary while the system is tracking that hand.
  • the pointer may be caused to be moved by the system to a location of the GUI.
  • the disclosed invention provides for a modality such that if the pointer dwell time equals or exceeds predetermined dwell criteria, the system reacts accordingly. For example, where the dwell time exceeds a first criteria, the GUI window is selected. Dwelling of the pointer for a longer period of time in a portion of a window invokes a corresponding command to bring the window to the foreground of the GUI display, while dwelling still longer invokes a command to cause the window to be grabbed and moved.
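As a rough illustration of the dwell-time modality described above, the sketch below maps the length of a stationary dwell to escalating window commands. The threshold values and command names are assumptions for illustration only; the disclosure does not fix specific durations.

```python
# Illustrative dwell thresholds (seconds); the disclosure does not fix these values.
SELECT_AFTER = 1.0   # dwell long enough to select the window under the pointer
RAISE_AFTER = 2.0    # dwell longer to bring the window to the foreground
GRAB_AFTER = 3.0     # dwell still longer to grab the window for moving

def dwell_action(dwell_seconds):
    """Map a measured dwell time to the corresponding window command."""
    if dwell_seconds >= GRAB_AFTER:
        return "GRAB_AND_MOVE"
    if dwell_seconds >= RAISE_AFTER:
        return "RAISE"
    if dwell_seconds >= SELECT_AFTER:
        return "SELECT"
    return None

# Example: a pointer held stationary over a window for 2.4 seconds raises it.
assert dwell_action(2.4) == "RAISE"
```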
  • video cameras are used to view a volume of area.
  • This volume of area is generally in front of the video display (on which the video cameras may be located) and is designated as an engagement volume wherein gesture commands may be performed by the user and recognized by the system.
  • Objects in motion are detected by comparing corresponding patches (subsets of video of the entire video image) of video from successive video images. By analyzing and comparing the corresponding video patches from successive images, objects in motion are detected and tracked.
  • two video cameras are mounted substantially parallel to each other to generate video images that are used to determine the depth (distance from the camera, display, or other point of reference) of a moving object using a lightweight sparse stereo technique.
  • the lightweight sparse stereo technique reduces the computational requirements of the system and the depth component is used as an element in determining whether that particular object is the nearest object within the engagement volume.
  • FIG. 1 illustrates a system block diagram of components of the present invention for controlling a computer and/or other hardware/software peripherals interfaced thereto.
  • FIG. 2 illustrates a schematic block diagram of a perceptual user interface system, in accordance with an aspect of the present invention.
  • FIG. 3 illustrates a flow diagram of a methodology for implementing a perceptual user interface system, in accordance with an aspect of the present invention.
  • FIG. 4 illustrates a flow diagram of a methodology for determining the presence of moving objects within images, in accordance with an aspect of the present invention.
  • FIG. 5 illustrates a flow diagram of a methodology for tracking a moving object within an image, in accordance with an aspect of the present invention.
  • FIG. 6 illustrates a disparity between two video images captured by two video cameras mounted substantially parallel to each other for the purpose of determining the depth of objects, in accordance with an aspect of the present invention.
  • FIG. 7 illustrates an example of the hand gestures that the system may recognize and the visual feedback provided through the display, in accordance with an aspect of the present invention.
  • FIG. 8 illustrates an alternative embodiment wherein a unique icon is displayed in association with a name of a specific recognized command, in accordance with an aspect of the present invention.
  • FIG. 9 illustrates an engagement plane and engagement volume of both single and multiple monitor implementations, in accordance with an aspect of the present invention.
  • FIG. 10 illustrates a briefing room environment where gestures are utilized to control a screen projector via a computer system configured in accordance with an aspect of the present invention.
  • FIG. 11 illustrates a block diagram of a computer system operable to execute the present invention.
  • FIG. 12 illustrates a network implementation of the present invention.
  • a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
  • an application running on a server and the server can be a component.
  • One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
  • the present invention relates to a system and methodology for implementing a perceptual user interface comprising alternative modalities for controlling computer programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and/or verbal commands.
  • a perceptual user interface system is provided that tracks hand movements and provides for the control of computer programs and manipulation of on-screen objects in response to hand gestures performed by the user.
  • the system provides for the control of computer programs and manipulation of on-screen objects in response to verbal commands spoken by the user.
  • the gestures and/or verbal commands may be tailored by a particular user to suit that user's personal preferences.
  • the system operates in real time and is robust, light in weight and responsive.
  • the system provides a relatively inexpensive capability for the recognition of hand gestures and verbal commands.
  • the system 100 includes a tracking component 102 for detecting and tracking one or more objects 104 through image capture utilizing cameras (not shown) or other suitable conventional image-capture devices.
  • the cameras operate to capture images of the object(s) 104 in a scene within the image capture capabilities of the cameras so that the images may be further processed to not only detect the presence of the object(s) 104 , but also to detect and track object(s) movements. It is appreciated that in more robust implementations, object characteristics such as object features and object orientation may also be detected, tracked, and processed.
  • the object(s) 104 of the present invention include basic hand movements created by one or more hands of a system user and/or other person selected for use with the disclosed system.
  • objects may include many different types of objects with object characteristics, including hand gestures each of which have gesture characteristics including but not limited to, hand movement, finger count, finger orientation, hand rotation, hand orientation, and hand pose (e.g., opened, closed, and partially closed).
  • the tracking component 102 interfaces to a control component 106 of the system 100 that controls all onboard component processes.
  • the control component 106 interfaces to a seeding component 108 that seeds object hypotheses to the tracking component based upon the object characteristics.
  • the object(s) 104 are detected and tracked in the scene such that object characteristic data is processed according to predetermined criteria to associate the object characteristic data with commands for interacting with a user interface component 110 .
  • the user interface component 110 interfaces to the control component 106 to receive control instructions that affect presentation of text, graphics, and other output (e.g., audio) provided to the user via the interface component 110 .
  • the control instructions are communicated to the user interface component 110 in response to the object characteristic data processed from detection and tracking of the object(s) within a predefined engagement volume space 112 of the scene.
  • a filtering component 114 interfaces to the control component 106 to receive filtering criteria in accordance with user filter configuration data, and to process the filtering criteria such that tracked object(s) of respective object hypotheses are selectively removed from the object hypotheses and/or at least one hypothesis from a set of hypotheses within the volume space 112 and the scene.
  • Objects are detected and tracked either within the volume space 112 or outside the volume space 112 . Those objects outside of the volume space 112 are detected, tracked, and ignored, until entering the volume space 112 .
  • the system 100 also receives user input via input port(s) 116 such as input from pointing devices, keyboards, interactive input mechanisms such as touch screens, and audio input devices.
  • the subject invention can employ various artificial intelligence based schemes for carrying out various aspects of the subject invention.
  • a process for determining which object is to be selected for tracking can be facilitated via an automatic classification system and process.
  • Such classification can employ a probabilistic and/or statistical-based analysis (e.g., factoring into the analysis utilities and costs) to prognose or infer an action that a user desires to be automatically performed.
  • a support vector machine (SVM) classifier can be employed.
  • Other classification approaches that can be employed include Bayesian networks, decision trees, and probabilistic classification models providing different patterns of independence. Classification as used herein also is inclusive of statistical regression that is utilized to develop models of priority.
  • the subject invention can employ classifiers that are explicitly trained (e.g., via generic training data) as well as implicitly trained (e.g., via observing user behavior, receiving extrinsic information) so that the classifier(s) is used to automatically determine, according to predetermined criteria, which object(s) should be selected for tracking and which objects that were being tracked should now be removed from tracking.
  • the criteria can include, but is not limited to, object characteristics such as object size, object speed, direction of movement, distance from one or both cameras, object orientation, object features, and object rotation.
  • In the case of SVMs, which are well understood (it is to be appreciated that other classifier models may also be utilized, such as Naive Bayes, Bayes nets, decision trees, and other learning models), SVMs are configured via a learning or training phase within a classifier constructor and feature selection module.
  • attributes are words or phrases or other data-specific attributes derived from the words (e.g., parts of speech, presence of key terms), and the classes are categories or areas of interest (e.g., levels of priorities).
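As an illustration of how an explicitly trained classifier could decide which tracked objects to keep, the sketch below trains a support vector machine on feature vectors built from the kinds of object characteristics listed above (size, speed, distance, orientation). The feature encoding, the training data, and the use of scikit-learn's SVC are assumptions for illustration, not part of the disclosure.

```python
import numpy as np
from sklearn.svm import SVC

# Each tracked object is described by a feature vector. The particular features
# follow the criteria named above (size, speed, distance, orientation), but the
# encoding and the example values are illustrative assumptions.
# Columns: [object_size_px, speed_px_per_frame, distance_in, orientation_deg]
X_train = np.array([
    [1200, 14.0, 18.0,  10.0],   # hand moving inside the engagement volume -> track
    [1100, 11.0, 16.0, 350.0],   # another positive example
    [4000,  2.0, 60.0,  90.0],   # large, slow, far object (e.g., torso) -> ignore
    [ 300, 25.0, 70.0, 180.0],   # fast but far outside the volume -> ignore
])
y_train = np.array([1, 1, 0, 0])  # 1 = select for tracking, 0 = do not track

clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

# At run time, each new object hypothesis could be classified before it is seeded.
candidate = np.array([[1000, 12.0, 17.0, 20.0]])
print("track candidate" if clf.predict(candidate)[0] == 1 else "ignore candidate")
```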
  • FIG. 2 there is illustrated a schematic block diagram of a perceptual user interface system, in accordance with an aspect of the present invention.
  • the system comprises a computer 200 with a traditional keyboard 202 , input pointing device (e.g., a mouse) 204 , microphone 206 , and display 208 .
  • the system further comprises at least one video camera 210 , at least one user 212 , and software 214 .
  • the exemplary system of FIG. 2 is comprised of two video cameras 210 mounted substantially parallel to each other (that is, the rasters are parallel) and the user 212 .
  • the first camera is used to detect and track the object, and the second camera is used for determining the depth (or distance) of the object from the camera(s).
  • the computer 200 is operably connected to the keyboard 202 , mouse 204 and display 208 .
  • Video cameras 210 and microphone 206 are also operably connected to computer 200 .
  • the video cameras 210 “look” towards the user 212 and may point downward to capture objects within the volume defined above the keyboard and in front of the user.
  • User 212 is typically an individual that is capable of providing hand gestures, holding objects in a hand, verbal commands, and mouse and/or keyboard input.
  • the hand gestures and/or object(s) appear in video images created by the video cameras 210 and are interpreted by the software 214 as commands to be executed by computer 200 .
  • microphone 206 receives verbal commands provided by user 212 , which are in turn, interpreted by software 214 and executed by computer 200 .
  • User 212 can control and operate various application programs on the computer 200 by providing a series of hand gestures or a combination of hand gestures, verbal commands, and mouse/keyboard input.
  • The methodologies in accordance with various aspects of the present invention will be better appreciated with reference to FIGS. 3-5. While, for purposes of simplicity of explanation, the methodologies of FIGS. 3-5 are shown and described as executing serially, it is to be understood and appreciated that the present invention is not limited by the illustrated order, as some aspects could, in accordance with the present invention, occur in different orders and/or concurrently with other aspects from that shown and described herein. Moreover, not all illustrated features may be required to implement a methodology in accordance with an aspect of the present invention.
  • FIG. 3 is a flow diagram that illustrates a high level methodology for detecting the user's hand, tracking movement of the hand and interpreting commands in accordance with an aspect of the invention. While, for purposes of simplicity of explanation, the methodologies shown here and below are described as a series of acts, it is to be understood and appreciated that the present invention is not limited by the order of acts, as some acts may, in accordance with the present invention, occur in different orders and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all illustrated acts may be required to implement a methodology in accordance with the present invention.
  • the methodology begins at 300 where video images are scanned to determine whether any moving objects exist within the field of view (or scene) of the cameras.
  • the system is capable of running one or more object hypothesis models to detect and track objects, whether moving or not moving.
  • the system runs up to and including six object hypotheses. If more than one object is detected as a result of the multiple hypotheses, the system drops one of the objects if the distance from any other object falls below a threshold distance, for example, five inches. It is assumed that the two hypotheses are redundantly tracking the same object, and one of the hypotheses is removed from consideration.
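A minimal sketch of the hypothesis-pruning step described above: when two hypotheses track positions closer together than the threshold distance (five inches in the example), one is assumed to be redundant and is dropped. The data layout is an assumption for illustration.

```python
import math

DROP_THRESHOLD_IN = 5.0  # threshold distance from the example above (inches)

def prune_hypotheses(hypotheses):
    """Remove hypotheses that redundantly track the same object.

    Each hypothesis is a dict with a 3-D position in inches, e.g.
    {"id": 0, "pos": (x, y, z)}. When two hypotheses are closer than the
    threshold they are assumed to cover the same object and one is dropped.
    """
    kept = []
    for h in hypotheses:
        too_close = any(
            math.dist(h["pos"], k["pos"]) < DROP_THRESHOLD_IN for k in kept
        )
        if not too_close:
            kept.append(h)
    return kept

hyps = [
    {"id": 0, "pos": (10.0, 5.0, 18.0)},
    {"id": 1, "pos": (12.0, 6.0, 19.0)},  # within 5 in of id 0 -> dropped
    {"id": 2, "pos": (30.0, 5.0, 20.0)},
]
print([h["id"] for h in prune_hypotheses(hyps)])  # -> [0, 2]
```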
  • no moving object(s) have been detected, and flow returns to 300 where the system continues to scan the current image for moving objects.
  • the engagement volume is defined as a volume of space in front of the video cameras and above the keyboard wherein the user is required to introduce the hand gestures (or object(s)) in order to utilize the system.
  • a purpose of the engagement volume is to provide a means for ignoring all objects and/or gestures in motion except for those intended by the user to effect control of the computer. If a moving object is detected at 302 , but is determined not to be within the engagement volume, then the system dismisses the moving object as not being a desired object to track for providing commands. Flow then loops back to the input of 300 to scan for more objects. However, if the moving object is determined to be within the engagement volume, then the methodology proceeds to 306 .
  • new objects are seeded only when it is determined that the new object is a sufficient distance away from an existing object that is being tracked (in 3-D).
  • the system determines the distance of each moving object from the video cameras.
  • the object closest to the video camera(s) is selected as the desired command object. If by the given application context the user is predisposed to use hand gestures towards the display, the nearest object hypotheses will apply to the hands. In other scenarios, more elaborate criteria for object selection may be used. For example, an application may select a particular object based upon its quality of movement over time.
  • a two-handed interaction application may select an object to the left of the dominant hand (for right handed users) as the non-dominant hand.
  • the command object is the object that has been selected for tracking, the movements of which will be analyzed and interpreted for gesture commands.
  • the command object is generally the user's dominant hand. Once the command object is selected, its movement is tracked, as indicated at 310 .
  • the system determines whether the command object is still within the engagement volume. If NO, the object has moved outside the engagement volume, and the system dismisses the object hypothesis and returns to 300 where the current image is processed for moving objects. If YES, the object is still within the engagement volume, and flow proceeds to 314.
  • the system determines whether the object is still moving. If no movement is detected, flow is along the NO path returning to 300 to process the current camera images for moving objects. If however, movement is detected, then flow proceeds from 314 to 316 .
  • the system analyzes the movements of the command object to interpret the gestures for specific commands. At 318 , it is determined whether the interpreted gesture is a recognized command.
  • algorithms used to interpret gestures are kept to simple algorithms and are performed on sparse (“lightweight”) images to limit the computational overhead required to properly interpret and execute desired commands in real time.
  • the system is able to exploit the presence of motion and depth to minimize computational requirements involved in determining objects that provide gesture commands.
  • FIG. 4 there is illustrated a flow diagram of a methodology for determining the presence of moving objects within video images created by one or more video sources, in accordance with an aspect of the present invention.
  • the methodology exploits the notion that attention is often drawn to objects that move.
  • video data is acquired from one or more video sources. Successive video images are selected from the same video source, and motion is detected by comparing a patch of a current video image, centered on a given location, to a patch from the previous video image centered on the same location.
  • a video patch centered about a point located at (u 1 ,v 1 ), and (u 2 ,v 2 ) is selected from successive video images I 1 and I 2 , respectively.
  • a simple comparison function is utilized wherein the sum of the absolute differences (SAD) over square patches in two images is obtained:
  • SAD(I_1, u_1, v_1, I_2, u_2, v_2) = Σ_{−D/2 ≤ i, j ≤ D/2} |I_1(u_1 + i, v_1 + j) − I_2(u_2 + i, v_2 + j)|, where
  • I(u,v) refers to the pixel at (u,v),
  • D is the patch width, and
  • the absolute difference between two pixels is the sum of the absolute differences taken over all available color channels. Regions in the image that have movement can be found by determining points (u,v) such that SAD(I_{t−1}, u_{t−1}, v_{t−1}, I_t, u_t, v_t) > τ, where the subscript t refers to the image at time t, and τ is a threshold level for motion.
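A minimal numpy sketch of the SAD comparison defined above; the patch width D and the motion threshold are illustrative values, not taken from the disclosure.

```python
import numpy as np

def sad(img1, u1, v1, img2, u2, v2, D=16):
    """Sum of absolute differences between D x D patches centered at (u1, v1)
    in img1 and (u2, v2) in img2, summed over all available color channels."""
    r = D // 2
    p1 = img1[v1 - r:v1 + r, u1 - r:u1 + r].astype(np.int32)
    p2 = img2[v2 - r:v2 + r, u2 - r:u2 + r].astype(np.int32)
    return int(np.abs(p1 - p2).sum())

# Motion is declared at a point (u, v) when the patch centered there changes
# more than a threshold between the images at time t-1 and time t.
TAU = 4000  # illustrative motion threshold, tuned per camera and patch size

def moved(prev_img, cur_img, u, v, D=16):
    return sad(prev_img, u, v, cur_img, u, v, D) > TAU
```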
  • a comparison is made between patches from image I 1 and I 2 using the sum of the absolute difference algorithm.
  • the result of the sum of the absolute difference algorithm is compared to a threshold value to determine whether a threshold level of motion exists within the image patch. If SAD ≤ τ, sufficient motion does not exist, and flow proceeds to 410. If, at 406, SAD > τ, then sufficient motion exists within the patch, and flow is to 408 where the object is designated for continued tracking.
  • the system determines whether the current image patch is the last patch to be examined within the current image. If NO, the methodology returns to 402 where a new patch is selected. If YES, then the system returns to 400 to acquire a new video image from the video source.
  • the SAD algorithm is computed on a sparse regular grid within the image.
  • the sparse regular grid is based on sixteen pixel centers.
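Building on the `sad` helper sketched above, the following shows how the motion test might be evaluated only on a sparse regular grid of sixteen-pixel centers, which is what keeps the computation "lightweight". The threshold value is an assumption.

```python
GRID_STEP = 16  # sparse regular grid based on sixteen-pixel centers

def detect_motion(prev_img, cur_img, D=16, tau=4000):
    """Return grid points (u, v) whose patches changed by more than tau between
    successive frames, i.e. candidate locations of moving objects. Uses the
    `sad` helper sketched above."""
    h, w = cur_img.shape[:2]
    r = D // 2
    candidates = []
    for v in range(r, h - r, GRID_STEP):
        for u in range(r, w - r, GRID_STEP):
            if sad(prev_img, u, v, cur_img, u, v, D) > tau:
                candidates.append((u, v))
    return candidates
```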
  • the system tracks the motion of the object.
  • a position prediction algorithm is used to predict the next position of the moving object.
  • the prediction algorithm is a Kalman filter. However, it is to be appreciated that any position prediction algorithm can be used.
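A minimal constant-velocity Kalman filter of the kind referenced above, used only to predict where the tracked object should appear in the next frame. The state layout and noise parameters are illustrative assumptions.

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal 2-D constant-velocity Kalman filter; state is (u, v, du, dv)."""

    def __init__(self, u, v, dt=1.0, q=1.0, r=4.0):
        self.x = np.array([u, v, 0.0, 0.0], dtype=float)   # state estimate
        self.P = np.eye(4) * 10.0                           # state covariance
        self.F = np.array([[1, 0, dt, 0],                   # constant-velocity model
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)      # we observe (u, v) only
        self.Q = np.eye(4) * q                               # process noise
        self.R = np.eye(2) * r                               # measurement noise

    def predict(self):
        """Predict the next state; returns the predicted (u, v)."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, u, v):
        """Correct the state with the measured object location (u, v)."""
        z = np.array([u, v], dtype=float)
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```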
  • image operations may use the same SAD function on image patches, which allows for easy SIMD (Single-Instruction stream, Multiple-Data stream) optimization of the algorithm's implementation, which in turn allows it to run with sufficiently many trackers while still leaving CPU time to the user.
  • the process of seeding object hypotheses based upon motion may place more than one hypothesis on a given moving object.
  • One advantage of this multiple hypothesis approach is that a simple, fast, and imperfect tracking algorithm may be used. Thus if one tracker fails, another may be following the object of interest. Once a given tracker has been seeded, the algorithm updates the position of the object being followed using the same function over successive frames.
  • the methodology begins at 500 where, after the motion detection methodology has identified the location of a moving object to be tracked, the next position of the object is predicted. Once identified, the methodology utilizes a prediction algorithm to predict the position of the object in successive frames. The prediction algorithm limits the computational burden on the system. In the successive frames, the moving object should be at the predicted location, or within a narrow range centered on the predicted location.
  • the methodology selects a small pixel window (e.g., ten pixels) centered on the predicted location. Within this small window, an algorithm executes to determine the actual location of the moving object.
  • the new position is determined by examining the sum of absolute differences over successive video frames acquired at time t and time t−1. The actual location is determined by finding the location (u_t, v_t) that minimizes SAD(I_{t−1}, u_{t−1}, v_{t−1}, I_t, u_t, v_t), where
  • I_t refers to the image at time t,
  • I_{t−1} refers to the image at time t−1, and
  • (u_t, v_t) refers to the location at time t.
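A sketch of the tracking update described above: the search is restricted to a small window around the predicted location, and the position minimizing the SAD against the previous frame's patch is taken as the new location. It reuses the `sad` helper sketched earlier, and the window size defaults to the ten-pixel example.

```python
def update_position(prev_img, cur_img, prev_u, prev_v, pred_u, pred_v,
                    window=10, D=16):
    """Search a small window around the predicted location for the point
    (u_t, v_t) that minimizes SAD against the patch at (u_{t-1}, v_{t-1}).
    Uses the `sad` helper sketched earlier."""
    best_score, best_uv = None, (pred_u, pred_v)
    for v in range(pred_v - window, pred_v + window + 1):
        for u in range(pred_u - window, pred_u + window + 1):
            score = sad(prev_img, prev_u, prev_v, cur_img, u, v, D)
            if best_score is None or score < best_score:
                best_score, best_uv = score, (u, v)
    return best_uv
```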
  • a lightweight sparse stereo approach is utilized in accordance with an aspect of the invention.
  • the sparse stereo approach is a region-based approach utilized to find the disparity at only locations in the image corresponding to the object hypothesis. Note that in the stereo matching process, it is assumed that both cameras are parallel (in rasters). Object hypotheses are supported by frame-to-frame tracking through time in one view and stereo matching across both views. A second calibration issue is the distance between the two cameras (i.e., the baseline), which must be considered to recover depth in real world coordinates.
  • both calibration issues may be dealt with automatically by fixing the cameras on a prefabricated mounting bracket or semi-automatically by the user presenting objects at a known depth in a calibration routine that requires a short period of time to complete.
  • the accuracy of the transform to world coordinates is improved by accounting for lens distortion effects with a static, pre-computed calibration procedure for a given camera.
  • Binocular disparity is the primary means for recovering depth information from two or more images taken from different viewpoints. Given the two-dimensional position of an object in two views, it is possible to compute the depth of the object. Given that the two cameras are mounted parallel to each other in the same horizontal plane, and given that the two cameras have a focal length f, the three-dimensional position (x,y,z) of an object is computed from the positions of the object in both images, (u_l,v_l) and (u_r,v_r), by the following perspective projection equations:
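The equations themselves did not survive extraction. For two parallel cameras with focal length f and baseline b (taking the origin midway between the cameras), a standard form of the perspective projection relations consistent with the surrounding definitions is the following reconstruction; it is not a verbatim quote of the patent text.

```latex
% Reconstructed parallel-stereo perspective projection relations,
% with d the disparity and b the baseline between the two cameras.
d = u_l - u_r, \qquad
x = \frac{b\,(u_l + u_r)}{2d}, \qquad
y = \frac{b\,(v_l + v_r)}{2d}, \qquad
z = \frac{b\,f}{d}
```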
  • disparity is the shift in location of the object in one view with respect to the other, and is related to the baseline b, the distance between the two cameras.
  • the vision algorithm performs 3-dimensional (3-D) tracking and 3-D depth computations.
  • each object hypothesis is supported only by consistency of the object movement in 3-D.
  • the present invention does not rely on fragile appearance models such as skin color models or hand image templates, which are likely invalidated when environmental conditions change or the system is confronted with a different user.
  • FIG. 6 there is illustrated a disparity between two video images captured by two video cameras mounted substantially parallel to each other for the purpose of determining the depth of objects, in accordance with an aspect of the present invention.
  • a first camera 600 and a second camera 602 are mounted substantially parallel to each other in the same horizontal plane and laterally aligned.
  • the two cameras ( 600 and 602 ) are separated by a distance 604 defined between the longitudinal focal axis of each camera lens, also known as the baseline, b.
  • a first video image 606 is the video image from the first camera 600 and a second video image 608 is the video image from the second camera 602 .
  • the disparity d (also item number 610 ), or shift in the two video images ( 606 and 608 ), can be seen by looking to an object 612 in the center of the first image 606 , and comparing the location of that object 612 in the first image 606 to the location of that same object 612 in the second image 608 .
  • the disparity 610 is illustrated as the difference between a first vertical centerline 614 of the first image 606 that intersects the center of the object 612 , and a second vertical centerline 616 of the second image 608 .
  • the object 612 is centered about the vertical centerline 614 with the top of the object 612 located at point (u,v).
  • the same point (u,v) of the object 612 is located at point (u−d,v) in the second image 608, where d is the disparity 610, or shift in the object from the first image 606 with respect to the second image 608.
  • a depth z can be determined.
  • the depth component z is used in part to determine if an object is within the engagement volume, where the engagement volume is the volume within which objects will be selected by the system.
  • a sparse stereo approach is utilized in order to limit computational requirements.
  • the sparse stereo approach is that which determines disparity d only at the locations in the image that correspond to a moving object. For a given point (u,v) in the image, the value of disparity d is found such that the sum of the absolute differences over a patch in the first image 606 (i.e., a left image I_L) centered on (u,v) and a corresponding patch in the second image 608 (i.e., a right image I_R) centered on (u−d,v) is minimized, i.e., the disparity value d that minimizes SAD(I_L, u−d, v, I_R, u, v). If an estimate of depth z is available from a previous time, then in order to limit computational requirements, the search for the minimal disparity d is limited to a range of depth z around the last known depth.
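A sketch of the depth-bounded disparity search described above, reusing the `sad` helper from earlier. The left patch is kept at (u, v) and the right patch at (u − d, v), following the prose description; the focal length, baseline, search bounds, and depth slack are illustrative assumptions, and depth is recovered as z = f·b/d.

```python
def sparse_stereo_disparity(left_img, right_img, u, v, f_px, baseline_in,
                            last_z_in=None, z_slack_in=6.0,
                            d_min=1, d_max=64, D=16):
    """Find the disparity d that minimizes SAD between the left patch at (u, v)
    and the right patch at (u - d, v), searching only near the last known depth
    when one is available. Returns (disparity, depth)."""
    if last_z_in is not None:
        # Convert a depth band around the last known depth into disparity bounds.
        d_min = max(1, int(f_px * baseline_in / (last_z_in + z_slack_in)))
        d_max = max(d_min + 1,
                    int(f_px * baseline_in / max(last_z_in - z_slack_in, 1e-3)))
    best_d, best_score = None, None
    for d in range(d_min, d_max + 1):
        score = sad(left_img, u, v, right_img, u - d, v, D)
        if best_score is None or score < best_score:
            best_d, best_score = d, score
    z = f_px * baseline_in / best_d
    return best_d, z
```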
  • the search range may be further narrowed by use of an algorithm to predict the objects new location.
  • the prediction is accomplished by utilization of a Kalman filter.
  • the depth z can also be computed using traditional triangulation techniques.
  • the sparse stereo technique is used when the system operation involves detecting moving objects within a narrow range in front of the display, e.g., within twenty inches.
  • the two video cameras are mounted in parallel and separated by a distance equal to the approximate width of the display.
  • the distance between the two video cameras may be much greater.
  • traditional triangulation algorithms are used to determine the depth.
  • a user 700 gives commands by virtue of different hand gestures 702 and/or verbal commands 704 .
  • the gestures 702 are transmitted to a system computer (not shown) as part of the video images created by a pair of video cameras ( 706 and 708 ).
  • Verbal and/or generally, audio commands are input to the system computer through a microphone 710 .
  • Typical GUI windows 712, 714, and 716 are displayed in a layered presentation in an upper portion of display 718, while a lower portion of display 718 provides visual graphic feedback in the form of icons 720, 722, 724, and 726 of some of the gestures 702 recognized by the system.
  • the hand icon 720 is displayed when a corresponding gesture 728 is recognized.
  • the name of the recognized command (Move) is also then displayed below the icon 720 to provide additional textual feedback to the user 700 .
  • Move and Raise commands may be recognized by dwelling on the window for a period of time.
  • the pose of the hand may be mapped to any functionality, as described in greater detail below.
  • the shape of the hand icon may be changed in association with the captured hand pose to provide visual feedback to the user that the correct hand pose is being processed.
  • the hand icon is positioned for selecting the window for interaction, or to move the window, or effect scrolling.
  • a Scroll command may be initiated first by voicing a corresponding command that is processed by speech recognition, and then using the hand (or object) to commence scrolling of the window by moving the hand (or object) up and down for the desired scroll direction.
  • the single displayed hand icon 720 is presented for all recognized hand gestures 702; however, the corresponding specific command name is displayed below the icon 720.
  • the same hand icon 720 is displayed in accordance with four different hand gestures utilized to indicate four different commands: Move, Close, Raise, and Scroll.
  • a different hand shaped icon is used for each specific command and the name of the command is optionally displayed below the command.
  • audio confirmation is provided by the computer, in addition to the displayed icon and optional command name displayed below the icon.
  • FIG. 7 illustrates the embodiment where a single hand shaped icon 720 is used, and the corresponding command recognized by the system is displayed below the icon 720 .
  • the icon 720 and corresponding command word “MOVE” are displayed by the display 718 .
  • the icon 720 and corresponding command word “CLOSE” may be displayed by the display 718 .
  • Additional examples include, but are not limited to, displaying the icon 720 and corresponding command word “RAISE” when the system recognizes a hand gesture to bring a GUI window forward.
  • the icon 720 and command word “SCROLL” are displayed by the display 718 .
  • the disclosed system may be configured to display any number and type of graphical icons in response to one or more hand gestures presented by the system user.
  • audio feedback may be used such that a beep or tone may be presented in addition to or in lieu of the graphical feedback.
  • the graphical icon may be used to provide feedback in the form of a color, combination of colors, and/or flashing color or colors. Feedback may also be provided by flashing a border of the selected window, the border in the direction of movement. For example, if the window is to be moved to the right, the right window border could be flashed to indicate the selected direction of window movement.
  • a corresponding tone frequency may be emitted to indicate direction of movement, e.g., an upward movement would have an associated high pitch and a downward movement would have a low pitch.
  • rotational aspects may be provided such that movement to the left effects a counterclockwise rotation of a move icon, or perhaps a leftward tilt in the GUI window in the direction of movement.
  • FIG. 8 there is illustrated an alternative embodiment wherein a unique icon is displayed in association with a name of a specific recognized command, in accordance with an aspect of the present invention.
  • each icon-word pair is unique for each recognized command.
  • Icon-word pairs 800 , 802 , 804 , and 806 for the respective commands “MOVE”, “CLOSE”, “RAISE”, and “SCROLL”, are examples of visual feedback capabilities that can be provided.
  • the system is capable of interpreting commands based on interpreting hand gestures, verbal commands, or both in combination.
  • a hand is identified as a moving object by the motion detection algorithms and the hand movement is tracked and interpreted.
  • hand gestures and verbal commands are used cooperatively.
  • Speech recognition is performed using suitable voice recognition applications, for example, Microsoft SAPI 5.1, with a simple command and control grammar. However, it is understood that any similar speech recognition system can be used.
  • An inexpensive microphone is placed near the display to receive audio input. However, the microphone can be placed at any location insofar as audio signals can be received thereinto and processed by the system.
  • Interaction with the system can be initiated by a user moving a hand across an engagement plane and into an engagement volume.
  • a user 900 is located generally in front of a display 902 , which is also within the imaging capabilities of a pair of cameras ( 906 and 908 ).
  • a microphone 904 (similar to microphones 206 and 710 ) is suitably located such that user voice signals are input for processing, e.g., in front of the display 902 .
  • the cameras ( 906 and 908 , similar to cameras 200 and, 706 and 708 ) are mounted substantially parallel to each other and on a horizontal plane above the display 902 .
  • the two video cameras ( 906 and 908 ) are separated by a distance that provides optimum detection and tracking for the given cameras and the engagement volume.
  • cameras with wider fields of view and higher resolution may be placed further apart, on a plane different from the top of the display 902, for example, lower and along the sides of the display facing upwards, to capture gesture images for processing in accordance with novel aspects of the present invention.
  • more robust image processing capabilities and hypothesis engines can be employed in the system to process greater amounts of data.
  • the engagement volume 910 is typically defined to be located where the hands and/or objects in the hands of the user 900 are most typically situated, i.e., above a keyboard of the computer system and in front of the cameras ( 906 and 908 ) between the user 900 and the display 902 (provided the user 900 is seated in front of the display on which the cameras ( 906 and 908 ) are located).
  • the user 900 may be standing while controlling the computer, which requires that the volume 910 be located accordingly to facilitate interface interaction.
  • the objects may include not only the hand(s) of the user, or objects in the hand(s), but other parts of the body, such as head, torso movement, arms, or any other detectable objects. This is described in greater detail hereinbelow.
  • a plane 912 defines a face of the volume 910 that is closest to the user 900 , and is called the engagement plane.
  • the user 900 may effect control of the system by moving a hand (or object) through the engagement plane 912 and into the engagement volume 910 .
  • the hand of the user 900 is detected and tracked even when outside the engagement volume 910 .
  • When the object is moved across the engagement plane 912, feedback is provided to the user in the form of an alpha-blended icon displayed on the display (e.g., the operating system desktop).
  • the icon is designed to be perceived as distinct from other desktop icons and may be viewed as an area cursor.
  • the engagement plane 912 is positioned such that the user's hands do not enter it during normal use of the keyboard and mouse.
  • the corresponding hand icon displayed on the desktop is moved to reflect the position of the tracked object (or hand).
  • the engagement and acquisition of the moving hand (or object) is implemented in the lightweight sparse stereo system by looking for the object with a depth that is less than a predetermined distance value. Any such object will be considered the command object until it is moved out of the engagement volume 910 , for example, behind the engagement plane 912 , or until the hand (or object) is otherwise removed from being a tracked object.
  • the specified distance is twenty inches.
  • the user 900 moves a hand through the engagement plane 912 and into the engagement volume 910 established for the system.
  • the system detects the hand, tracks the hand as the hand moves from outside of the volume 910 to the inside, and provides feedback by displaying a corresponding hand shaped icon on the display 902 .
  • the open microphone 904 placed near the display 902 provides means for the user 900 to invoke one or more verbal commands in order to act upon the selected window under the icon.
  • the window directly underneath the hand shaped icon is the selected window.
  • the interpreted command is displayed along with the hand shaped icon.
  • the user may initiate the continuous (or stepped) movement of the window under the hand shaped icon to follow the movement of the user's hand.
  • the user 900 causes the selected window to move up or down within the display 902 by moving the hand up or down. Lateral motion is also similarly achieved. Movement of the window is terminated when the user hand is moved across the engagement plane 912 and out of the engagement volume 910 .
  • Other methods of termination include stopping movement of the hand (or object) for an extended period of time, which is processed by the system as a command to drop the associated hypothesis.
  • the Move command may be invoked by dwelling the hand on the window for a period of time, followed by hand motion to initiate the direction of window movement.
  • the user may speak the word “Release” and the system will stop moving the selected window in response to the user's hand motion. Release may also be accomplished by dwelling a bit longer in time while in Move, and/or Scroll modes.
  • the user 900 may also act upon a selected window with other actions. By speaking the words “Close”, “Minimize”, or “Maximize” the selected window is respectively closed, minimized or maximized. By speaking the word “Raise”, the selected window is brought to the foreground, and by speaking “Send to Back”, the selected window is sent behind (to the background) all other open windows. By speaking “Scroll”, the user initiates a scrolling mode on the selected window. The user may control the rate of the scroll by the position of the hand.
  • the hand shaped icon tracks the user's hand position, and the rate of the scrolling of the selected window is proportional to the distance between the current hand icon position and the position of the hand icon at the time the scrolling is initiated. Scrolling can be terminated by the user speaking “Release” or by the user moving their hand behind the engagement plane and out of the engagement volume.
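A one-function sketch of the proportional scroll-rate rule described above; the gain constant is an assumption for illustration.

```python
SCROLL_GAIN = 0.5  # assumed scroll lines per frame per pixel of displacement

def scroll_rate(current_icon_y, anchor_icon_y):
    """Scroll rate proportional to the distance between the current hand-icon
    position and its position when scrolling was initiated; the sign of the
    result gives the scroll direction."""
    return SCROLL_GAIN * (current_icon_y - anchor_icon_y)
```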
  • dwell time can be used as a modality to control windows in lieu of, or in addition to, verbal commands and other disclosed modalities.
  • Dwell time is defined as the time, after having engaged the system, that the user holds their hand position stationary such that the system hand shaped icon remains over a particular window. For example, by dwelling on a selected window for a short period of time (e.g., two seconds), the system can bring the window to the foreground of all other open windows (i.e., a RAISE command).
  • GUI windows can be accomplished in a similar fashion by controlling the dwell time of the hand shaped icon over the open window.
  • hand gestures are interpreted by hand motion or by pattern recognition.
  • the user can bring the window to the front (or foreground), on top of all other open windows, by moving a hand from a position closer to the display to a position farther from the display, the hand remaining in the engagement volume 910.
  • the user can cause the selected window to be grabbed and moved by bringing fingers together with their thumb, and subsequently moving the hand.
  • the selected window will move in relation to the user hand movement until the hand is opened up to release the selected window. Additional control over the selected window can be defined in response to particular hand movements or hand gestures.
  • the selected window will move in response to the user pointing their hand, thumb, or finger in a particular direction. For example, if the user points their index finger to the right, the window will move to the right within the display. Similarly, if the user points to the left, up, or down, the selected window will move to the left, up or down within the display, respectively. Additional window controls can be achieved through the use of similar hand gestures or motions.
  • the system is configurable such that an individual user selects the particular hand gestures that they wish to associate with particular commands.
  • the system provides default settings that map a given set of gestures to a given set of commands.
  • This mapping is configurable such that the specific command executed in response to each particular hand gesture is definable by each user. For example, one user may wish to point directly at the screen with their index finger to grab the selected window for movement while another user may wish to bring their fingers together with their thumb to grab the selected window.
  • one user may wish to point a group of fingers up or down in order to move a selected window up or down, while another user may wish to open the palm of their hand toward the cameras and then move their opened hand up or down to move a selected window up or down. All given gestures and commands are configurable by individual users to best suit that particular user's personal preferences.
  • the system may include a “Record and Define Gesture” mode.
  • the system records hand gestures performed by the user.
  • the recorded gestures are then stored in the system memory to be recognized during normal operation.
  • the given hand gestures are then associated with a particular command to be performed by the system in response to that particular hand gesture.
  • a user may further tailor the system to their personal preference or, similarly, may tailor system operation to respond to specific commands most appropriate for particular applications.
  • the user can choose the particular words, from a given set, they wish to use for a particular command. For example, one user may choose to say “Release” to stop moving a window while another may wish to say “Quit”.
  • This capability allows different users, who may prefer to use different words for a given command, the ability to tailor the system in a way most efficient for their personal use.
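One way to picture the per-user mapping of gestures and spoken words to commands is a simple lookup table with user overrides; the gesture, speech, and command names below are assumptions chosen only for illustration, not the patent's vocabulary.

```python
# Default bindings shipped with the system; each user may override entries.
# All gesture/command names here are illustrative, not taken from the patent.
DEFAULT_BINDINGS = {
    "gesture:point_index_right": "MOVE_WINDOW_RIGHT",
    "gesture:pinch_fingers": "GRAB_WINDOW",
    "speech:release": "RELEASE",
    "speech:close": "CLOSE_WINDOW",
}

def resolve_command(event, user_overrides):
    """Return the command bound to a gesture or speech event, preferring the
    user's personal overrides over the system defaults."""
    return user_overrides.get(event, DEFAULT_BINDINGS.get(event))

# A user who prefers saying "Quit" instead of "Release":
overrides = {"speech:quit": "RELEASE"}
print(resolve_command("speech:quit", overrides))   # RELEASE
print(resolve_command("speech:close", overrides))  # CLOSE_WINDOW
```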
  • the present invention can be utilized in an expansive list of applications. The following discussion is exemplary of only a few applications with which the present invention may be utilized.
  • One such application is associated with user control of a presentation, or similar type of briefing application, wherein the user makes a presentation on a projection type screen to a group of listeners.
  • the system includes three monitors (or displays) through which the user 900 exercises control of GUI features: a first display 912 , a second display 914 , and a third display 916 .
  • the cameras ( 906 and 908 ) are similarly situated as in FIG. 9A , to define the engagement volume 910 .
  • the user 900 can move a window 920 from the first display 912 to the second display 914 , and further from the second display 914 to the third display 916 .
  • the flick motion of the user hand 918 can effect movement of the window 920 from the first display 912 to the third display 916 in a single window movement, or in multiple steps through the displays ( 914 and 916 ) using corresponding multiple hand motions.
  • control by the user 900 occurs only when the user hand 918 breaks the engagement plane 912 , and is determined to be a control object (i.e., an object meeting parameters sufficient to effect control of the computer).
  • the user 900 is located generally in front of the displays ( 912 , 914 , and 916 ), which is also within the imaging capabilities of the pair of cameras ( 906 and 908 ).
  • the microphone 904 is suitably located to receive user voice signals.
  • the cameras ( 906 and 908 ) are mounted substantially parallel to each other and on a horizontal plane above the displays ( 912 , 914 , and 916 ), and separated by a distance that provides optimum detection and tracking for the given cameras and the engagement volume 910 .
  • the user 900 moves the hand 918 through the engagement plane 912 and into the engagement volume 910 established for the system.
  • the system, which had detected and tracked the hand 918 before it entered the volume 910 , begins providing feedback to the user 900 by displaying the hand shaped icon 922 on one of the displays ( 912 , 914 , and 916 ).
  • the microphone 904 provides additional means for the user 900 to invoke one or more verbal commands in order to act upon the selected window 920 under the corresponding icon 922 .
  • the window 920 directly underneath the hand shaped icon is the selected window.
  • the corresponding icon 922 is presented by the system on the computer display 912 .
  • the associated window is assigned for control.
  • the user 900 causes the selected window to move up or down within the display by invoking the ‘Move’ command as explained above and then moving the hand up or down, or to move across one or more of the monitors ( 914 and 916 ) by invoking the ‘Flick’ command and then using the flick hand motion.
  • the user 900 can cause the window 920 to be moved left to the first display 912 , or right to the third display 916 . Movement of the window is terminated (or “released”) when the user hand dwells for a time longer than a predetermined dwell time, or when the hand is moved out of the engagement volume 910 .
  • the user may speak the word “Release” and the system will stop moving the selected window in response to the user's hand motion. Release may also be accomplished by dwelling slightly longer while in the Move and/or Scroll modes. The user may also act upon a selected window with other actions. By speaking the words “Close”, “Minimize”, or “Maximize” the selected window is respectively closed, minimized or maximized. By speaking the word “Raise”, the selected window is brought to the foreground, and by speaking “Send to Back”, the selected window is sent behind (to the background) all other open windows. By speaking “Scroll”, the user initiates a scrolling mode on the selected window. The user may control the rate of the scroll by the position of the hand.
  • the hand shaped icon tracks the user's hand position, and the rate of the scrolling of the selected window is proportional to the distance between the current hand icon position and the position of the hand icon at the time the scrolling is initiated. Scrolling can be terminated by the user speaking “Release” or by the user moving their hand behind the engagement plane and out of the engagement volume.
  • the briefing room 1000 comprises a large briefing table 1002 surrounded on three sides by numerous chairs 1004 , a computer 1006 , a video projector 1008 , and a projector screen 1010 .
  • Utilization of the present invention adds additional elements comprising the disclosed perceptual software 1012 , two video cameras ( 1014 and 1016 ) and a microphone 1018 .
  • a user 1020 is positioned between the projector screen 1010 and briefing table 1002 at which the audience is seated.
  • a top face 1022 of an engagement volume 1024 is defined by rectangular area 1026 .
  • a front surface indicated at 1028 represents an engagement plane.
  • the user controls the content displayed on the projection screen 1010 and advancement of the slides (or presentation images) by moving their hand(s) through the engagement plane 1028 into the engagement volume 1024 , and/or speaking commands recognizable by the system.
  • a simple gesture is made to advance to the next slide, back up to a previous slide, initiate an embedded video, or to effect one of many other presentation capabilities.
  • a similar capability can be implemented for a home media center wherein the user can change selected video sources, change channels, control volume, advance chapter and other similar functions by moving their hand across an engagement plane into an engagement volume and subsequently performing the appropriate hand gesture.
  • Additional applications include perceptual interfaces for TabletPCs, Media center PCs, kiosks, hand held computers, home appliances, video games, and wall sized displays, along with many others.
  • the system can be configured such that the engagement volume travels with the user (in a “roaming” mode) as the user moves about the room.
  • the cameras would be mounted on a platform that rotates such that the rotation maintains the cameras substantially equidistant from the presenter.
  • the presenter may carry a sensor that allows the system to sense or track the general location of the presenter. The system would then effect rotation of the camera mount to “point” the cameras at the presenter.
  • the engagement volume may be extended to the presenter allowing control of the computer system as the presenter moves about.
  • the process of “extending” the engagement volume can include increasing the depth of the volume such that the engagement plane surface moves to the presenter, or maintaining the volume dimensions but moving the fixed volume to the presenter. This would require on-the-fly focal adjustment of the cameras to track not only quick movements in the depth of objects in the volume, but also the movement of the presenter.
  • Another method of triggering system attention in this mode would be to execute a predefined gesture that is not likely to be made unintentionally, e.g., raising a hand.
  • the system is configurable for individual preferences such that the engagement volume of a first user may be different than the volume of a second user.
  • the user preferences may be retrieved and implemented automatically by the system. This can include automatically elevating the mounted cameras for a taller person by using a telescoping camera stand so that the cameras are at the appropriate height of the particular user, whether sitting or standing. This also includes, but is not limited to, setting the system for “roaming” mode.
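The per-user preferences mentioned above might be represented as a small profile record that the system loads when it recognizes a user; the field names and units below are assumptions made for illustration, not details from the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """Illustrative per-user settings the system could restore automatically;
    field names and units are assumptions, not the patent's."""
    name: str
    camera_height_cm: float        # telescoping camera-stand height
    engagement_depth_cm: float     # depth of this user's engagement volume
    roaming_mode: bool = False     # engagement volume follows the user
    bindings: dict = field(default_factory=dict)  # personal gesture/speech map

# Profile loaded when a tall presenter who roams the room is recognized.
presenter = UserProfile(name="presenter", camera_height_cm=175.0,
                        engagement_depth_cm=60.0, roaming_mode=True)
print(presenter)
```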
  • Referring now to FIG. 11, there is illustrated a block diagram of a computer operable to execute the present invention.
  • FIG. 11 and the following discussion are intended to provide a brief, general description of a suitable computing environment 1100 in which the various aspects of the present invention may be implemented. While the invention has been described above in the general context of computer-executable instructions that may run on one or more computers, those skilled in the art will recognize that the invention also may be implemented in combination with other program modules and/or as a combination of hardware and software.
  • program modules include routines, programs, components, data structures, etc., that perform particular tasks or implement particular abstract data types.
  • inventive methods may be practiced with other computer system configurations, including single-processor or multiprocessor computer systems, minicomputers, mainframe computers, as well as personal computers, hand-held computing devices, microprocessor-based or programmable consumer electronics, and the like, each of which may be operatively coupled to one or more associated devices.
  • inventive methods may also be practiced in distributed computing environments where certain tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote memory storage devices.
  • the exemplary environment 1100 for implementing various aspects of the invention includes a computer 1102 , the computer 1102 including a processing unit 1104 , a system memory 1106 , and a system bus 1108 .
  • the system bus 1108 couples system components including, but not limited to, the system memory 1106 to the processing unit 1104 .
  • the processing unit 1104 may be any of various commercially available processors. Dual microprocessors and other multi-processor architectures also can be employed as the processing unit 1104 .
  • the system bus 1108 can be any of several types of bus structure including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of commercially available bus architectures.
  • the system memory 1106 includes read only memory (ROM) 1110 and random access memory (RAM) 1112 .
  • the computer 1102 further includes a hard disk drive 1114 , a magnetic disk drive 1116 (e.g., to read from or write to a removable disk 1118 ), and an optical disk drive 1120 (e.g., to read a CD-ROM disk 1122 or to read from or write to other optical media).
  • the hard disk drive 1114 , magnetic disk drive 1116 and optical disk drive 1120 can be connected to the system bus 1108 by a hard disk drive interface 1124 , a magnetic disk drive interface 1126 and an optical drive interface 1128 , respectively.
  • the drives and their associated computer-readable media provide nonvolatile storage of data, data structures, computer-executable instructions, and so forth.
  • the drives and media accommodate the storage of broadcast programming in a suitable digital format.
  • Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, other types of media which are readable by a computer, such as zip drives, magnetic cassettes, flash memory cards, digital video disks, cartridges, and the like, may also be used in the exemplary operating environment, and further, any such media may contain computer-executable instructions for performing the methods of the present invention.
  • a number of program modules can be stored in the drives and RAM 1112 , including an operating system 1130 , one or more application programs 1132 , other program modules 1134 and program data 1136 . It is appreciated that the present invention can be implemented with various commercially available operating systems or combinations of operating systems.
  • a user can enter commands and information into the computer 1102 through a keyboard 1138 and a pointing device, such as a mouse 1140 .
  • Other input devices may include one or more video cameras, one or more microphones, an IR remote control, a joystick, a game pad, a satellite dish, a scanner, or the like.
  • These and other input devices are often connected to the processing unit 1104 through a serial port interface 1142 that is coupled to the system bus 1108 , but may be connected by other interfaces, such as a parallel port, a game port, a firewire port, a universal serial bus (“USB”), an IR interface, etc.
  • a monitor 1144 or other type of display device is also connected to the system bus 1108 via an interface, such as a video adapter 1146 .
  • a computer typically includes other peripheral output devices (not shown), such as speakers, printers etc.
  • the computer 1102 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer(s) 1148 .
  • the remote computer(s) 1148 may be a workstation, a server computer, a router, a personal computer, portable computer, microprocessor-based entertainment appliance, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer 1102 , although, for purposes of brevity, only a memory storage device 1150 is illustrated.
  • the logical connections depicted include a LAN 1152 and a WAN 1154 .
  • Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
  • When used in a LAN networking environment, the computer 1102 is connected to the local network 1152 through a network interface or adapter 1156 .
  • When used in a WAN networking environment, the computer 1102 typically includes a modem 1158 , or is connected to a communications server on the LAN, or has other means for establishing communications over the WAN 1154 , such as the Internet.
  • the modem 1158 which may be internal or external, is connected to the system bus 1108 via the serial port interface 1142 .
  • program modules depicted relative to the computer 1102 may be stored in the remote memory storage device 1150 . It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
  • the implementation 1200 includes a first perceptual system 1202 and a second perceptual system 1204 , both operational according to the disclosed invention.
  • the first system 1202 includes cameras 1206 (also denoted C 1 and C 2 ) mounted on a rotational and telescoping camera mount 1208 .
  • a first user 1210 located generally in front of the first system 1202 effects control of a GUI content A of the first system 1202 in accordance with the novel aspects of the present invention by introducing hand gestures into an engagement volume 1211 and/or voice signals.
  • the first user 1210 may rove about in front of the cameras 1206 in accordance with the “roaming” operational mode described previously, or may be seated in front of the cameras 1206 .
  • the second system 1204 includes cameras 1212 (also denoted C 3 and C 4 ) mounted on a rotational and telescoping camera mount 1214 .
  • a second user 1216 located generally in front of the second system 1204 effects control of a GUI content B of the second system 1204 in accordance with the novel aspects of the present invention by introducing hand gestures into an engagement volume 1217 and/or voice signals.
  • the second user 1216 may rove about in front of the cameras 1212 in accordance with the “roaming” operational mode described previously, or may be seated in front of the cameras 1212 .
  • the first and second systems ( 1202 and 1204 ) may be networked in a conventional wired or wireless network 1207 peer configuration (or bus configuration by using a hub 1215 ).
  • This particular system 1200 is employed to present both content A and content B via a single large monitor or display 1218 .
  • the monitor 1218 can be driven by either of the systems ( 1202 and 1204 ), as can be provided by conventional dual-output video graphics cards, or the separate video information may be transmitted to a third monitor control system 1220 to present the content.
  • Such an implementation finds application where a side-by-side comparison of product features is being presented, or in other similar applications where two or more users may desire to interact.
  • content A and content B may be presented on a split screen layout of the monitor 1218 .
  • Either or both of the users 1210 and 1216 may provide keyboard and/or mouse input to facilitate control according to the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Architecture for implementing a perceptual user interface. The architecture comprises alternative modalities for controlling computer application programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and verbal commands. The perceptual user interface system includes a tracking component that detects object characteristics of at least one of a plurality of objects within a scene, and tracks the respective object. Detection of object characteristics is based at least in part upon image comparison of a plurality of images relative to a coarse mapping of the images. A seeding component iteratively seeds the tracking component with object hypotheses based upon the presence of the object characteristics and the image comparison. A filtering component selectively removes the tracked object from the object hypotheses and/or at least one object hypothesis from the set of object hypotheses based upon predetermined removal criteria.

Description

    RELATED APPLICATIONS
  • This is a continuation of U.S. patent application Ser. No. 10/396,653, filed on Mar. 25, 2003 and entitled, “ARCHITECTURE FOR CONTROLLING A COMPUTER USING HAND GESTURES,” the entire contents of which is hereby incorporated by reference.
  • TECHNICAL FIELD
  • The present invention relates generally to controlling a computer system, and more particularly to a system and method to implement alternative modalities for controlling computer application programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and verbal commands.
  • BACKGROUND OF THE INVENTION
  • A user interface facilitates the interaction between a computer and computer user by enhancing the user's ability to utilize application programs. The traditional interface between a human user and a typical personal computer is implemented with graphical displays and is generally referred to as a graphical user interface (GUI). Input to the computer or particular application program is accomplished through the presentation of graphical information on the computer screen and through the use of a keyboard and/or mouse, trackball or other similar implements. Many systems employed for use in public areas utilize touch screen implementations whereby the user touches a designated area of a screen to effect the desired input. Airport electronic ticket check-in kiosks and rental car direction systems are examples of such systems. There are, however, many applications where the traditional user interface is less practical or efficient.
  • The traditional computer interface is not ideal for a number of applications. Providing stand-up presentations or other types of visual presentations to large audiences is but one example. In this example, a presenter generally stands in front of the audience and provides a verbal dialog in conjunction with the visual presentation that is projected on a large display or screen. Manipulation of the presentation by the presenter is generally controlled through use of awkward remote controls, which frequently suffer from inconsistent and less precise operation, or require the cooperation of another individual. Traditional user interfaces require the user either to provide input via the keyboard or to exhibit a degree of skill and precision more difficult to implement with a remote control than a traditional mouse and keyboard. Other examples include control of video, audio, and display components of a media room. Switching between sources, fast-forwarding, rewinding, changing chapters, changing volume, etc., can be very cumbersome in a professional studio as well as in the home. Similarly, traditional interfaces are not well suited for smaller, specialized electronic gadgets.
  • Additionally, people with motion impairment conditions find it very challenging to cope with traditional user interfaces and computer access systems. Such conditions include Cerebral Palsy, Muscular Dystrophy, Friedreich's Ataxia, and spinal injuries or disorders. These conditions and disorders are often accompanied by tremors, spasms, loss of coordination, restricted range of movement, reduced muscle strength, and other motion impairing symptoms.
  • Similar symptoms exist in the growing elderly segment of the population. It is known that as people age, their cognitive, perceptual and motor skills decline, with negative effects on their ability to perform many tasks. The requirement to position a cursor, particularly with smaller graphical presentations, can often be a significant barrier for elderly or afflicted computer users. Computers can play an increasingly important role in helping older adults function well in society.
  • Graphical interfaces contribute to the ease of use of computers. WIMP (Window, Icon, Menu, Pointing device (or Pull-down menu)) interfaces allow fairly non-trivial operations to be performed with a few mouse motions and clicks. However, at the same time, this shift in the user interaction from a primarily text-oriented experience to a point-and-click experience has erected new barriers between people with disabilities and the computer. For example, for older adults, there is evidence that using the mouse can be quite challenging. There is extensive literature demonstrating that the ability to make small movements decreases with age. This decreased ability can have a major effect on the ability of older adults to use a pointing device on a computer. It has been shown that even experienced older computer users move a cursor much more slowly and less accurately than their younger counterparts. In addition, older adults seem to have increased difficulty (as compared to younger users) when targets become smaller. For older computer users, positioning a cursor can be a severe limitation.
  • One solution to the problem of decreased ability to position the cursor with a mouse is to simply increase the size of the targets in computer displays, which can often be counter-productive since less information is being displayed, requiring more navigation. Another approach is to constrain the movement of the mouse to follow on-screen objects, as with sticky icons or solid borders that do not allow cursors to overshoot the target. There is evidence that performance with area cursors (possibly translucent) is better than performance with regular cursors for some target acquisition tasks.
  • One method to facilitate computer access for users with motion impairment conditions and for applications, in which the traditional user interfaces are cumbersome, is through use of perceptual user interfaces. Perceptual user interfaces utilize alternate sensing modalities, such as the capability of sensing physical gestures of the user, to replace or complement traditional input devices such as the mouse and keyboard. Perceptual user interfaces promise modes of fluid computer-human interaction that complement and/or replace the mouse and keyboard, particularly in non-desktop applications such as control for a media room.
  • One study indicates that adding a simple gesture-based navigation facility to web browsers can significantly reduce the time taken to carry out one of the most common actions in computer use, i.e., using the “back” button (or function) to return to previously visited pages. Subjective ratings by users in experiments showed a strong preference for a “flick” system, where the users would flick the mouse left or right to go back or forward in the web browser.
  • In the simplest view, gestures play a symbolic communication role similar to speech, suggesting that for simple tasks gesture may enhance or replace speech recognition. Small gestures near the keyboard or mouse do not induce fatigue as quickly as sustained whole arm postures. Previous studies indicate that users find gesture-based systems highly desirable, but that users are also dissatisfied with the recognition accuracy of gesture recognizers. Furthermore, experimental results indicate that a user's difficulty with gestures is in part due to a lack of understanding of how gesture recognition works. The studies highlight the ability of users to learn and remember gestures as an important design consideration.
  • Even when a mouse and keyboard are available, users may find it attractive to manipulate often-used applications while away from the keyboard, in what can be called a “casual interface” or “lean-back” posture. Browsing e-mail over morning coffee might be accomplished by mapping simple gestures to “next message” and “delete message”.
  • Gestures may compensate for the limitations of the mouse when the display is several times larger than a typical display. In such a scenario, gestures can provide mechanisms to restore the ability to quickly reach any part of the display, where once a mouse was adequate with a small display. Similarly, in a multiple display scenario it is desirable to have a fast comfortable way to indicate a particular display. For example, the foreground object may be “bumped” to another display by gesturing in the direction of the target display.
  • However, examples of perceptual user interfaces to date are dependent on significant limiting assumptions. One type of perceptual user interface utilizes color models that make certain assumptions about the color of an object. Proper operation of the system is dependent on proper lighting conditions and can be negatively impacted when the system is moved from one location to another as a result of changes in lighting conditions, or simply when the lighting conditions change in the room. Factors that impact performance include sun light versus artificial light, florescent light versus incandescent light, direct illumination versus indirect illumination, and the like. Additionally, most attempts to develop perceptual user interfaces require the user to wear specialized devices such as gloves, headsets, or close-talk microphones. The use of such devices is generally found to be distracting and intrusive for the user.
  • Thus perceptual user interfaces have been slow to emerge. The reasons include heavy computational burdens, unreasonable calibration demands, required use of intrusive and distracting devices, and a general lack of robustness outside of specific laboratory conditions. For these and similar reasons, there has been little advancement in systems and methods for exploiting perceptual user interfaces. However, as the trend towards smaller, specialized electronic gadgets continues to grow, so does the need for alternate methods for interaction between the user and the electronic device. Many of these specialized devices are too small and the applications unsophisticated to utilize the traditional input keyboard and mouse devices. Examples of such devices include TabletPCs, Media center PCs, kiosks, hand held computers, home appliances, video games, and wall sized displays, along with many others. In these, and other applications, the perceptual user interface provides a significant advancement in computer control over traditional computer interaction modalities.
  • In light of these findings, what is needed is to standardize a small set of easily learned gestures, the semantics of which are determined by application context. A small set of very simple gestures may offer significant bits of functionality where they are needed most. For example, dismissing a notification window may be accomplished by a quick gesture to the one side or the other, as in shooing a fly. Another example is gestures for “next” and “back” functionality found in web browsers, presentation programs (e.g., PowerPoint™) and other applications. Note that in many cases the surface forms of these various gestures may remain the same throughout these examples, while the semantics of the gestures depends on the application at hand. Providing a small set of standard gestures eases problems users have in recalling how gestures are performed, and also allows for simpler and more robust signal processing and recognition processes.
  • SUMMARY OF THE INVENTION
  • The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive overview of the invention. It is intended to neither identify key or critical elements of the invention nor delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
  • The present invention relates to a system and methodology to implement a perceptual user interface comprising alternative modalities for controlling computer application programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and verbal commands. A perceptual user interface system is provided that detects and tracks hand and/or object movements, and provides for the control of application programs and manipulation of on-screen objects in response to hand or object movements performed by the user. The system operates in real time, is robust, responsive, and introduces a reduced computational load due to “lightweight” sparse stereo image processing by not imaging every pixel, but only a reduced representation of image pixels. That is, the depth at every pixel in the image is not computed, which is the typical approach in conventional correlation-based stereo systems. The present invention utilizes the depth information at specific locations in the image that correspond to object hypotheses.
  • The system provides a relatively inexpensive capability for the recognition of hand gestures.
  • Mice are particularly suited to fine cursor control, and most users have much experience with them. The disclosed invention can provide a secondary, coarse control that may complement mice in some applications. For example, in a map application, the user might cause the viewpoint to change with a gesture, while using the mouse to select and manipulate particular objects in the view. The present invention may also provide a natural “push-to-talk” or “stop-listening” signal to speech recognition processes. Users were shown to prefer using a perceptual user interface for push-to-talk. The invention combines area cursors with gesture-based manipulation of on-screen objects, and may be configured to be driven by gross or fine movements, and may be helpful to people with limited manual dexterity.
  • A multiple hypothesis tracking framework allows for the detection and tracking of multiple objects. Thus tracking of both hands may be considered for a two-handed interface. Studies show that people naturally assign different tasks to each hand, and that the non-dominant hand can support the task of the dominant hand. Two-handed interfaces are often used to specify spatial relationships that are otherwise more difficult to describe in speech. For example, it is natural to describe the relative sizes of objects by holding up two hands, or to specify how an object (dominant hand) is to be moved with respect to its environment (non-dominant hand). Thus there is provided a system that facilitates the processing of computer-human interaction in response to multiple input modalities. The system processes commands in response to hand gestures or a combination of hand gestures and verbal commands, or in addition to traditional computer-human interaction modalities such as a keyboard and mouse. The user interacts with the computer and controls the application through a series of hand gestures, or a combination of hand gestures and verbal commands, but is also free to operate the system with traditional interaction devices when more appropriate. The system and method provide for certain actions to be performed in response to particular verbal commands. For example, a verbal command “Close” may be used to close a selected window and a verbal command “Raise” may be used to bring the window to the forefront of the display.
  • In accordance with another aspect thereof, the present invention facilitates adapting the system to the particular preferences of an individual user. The system and method allow the user to tailor the system to recognize specific hand gestures and verbal commands and to associate these hand gestures and verbal commands with particular actions to be taken. This capability allows different users, who may prefer to make different motions for a given command, the ability to tailor the system in a way most efficient for their personal use. Similarly, different users can choose to use different verbal commands to perform the same function. For example, one user may choose to say “Release” to stop moving a window while another may wish to say “Quit”.
  • In accordance with another aspect of the present invention, dwell time is used as an alternative modality to complement gestures or verbal commands. Dwell time is the length of time an input device pointer remains in a particular position (or location of the GUI), and is controlled by the user holding one hand stationary while the system is tracking that hand. In response to the hand gesture, or combination of hand gestures, the pointer may be caused to be moved by the system to a location of the GUI. The disclosed invention provides for a modality such that if the pointer dwell time equals or exceeds predetermined dwell criteria, the system reacts accordingly. For example, where the dwell time exceeds a first criteria, the GUI window is selected. Dwelling of the pointer for a longer period of time in a portion of a window invokes a corresponding command to bring the window to the foreground of the GUI display, while dwelling still longer invokes a command to cause the window to be grabbed and moved.
  • In accordance with yet another aspect of the present invention, video cameras are used to view a volume of area. This volume of area is generally in front of the video display (on which the video cameras may be located) and is designated as an engagement volume wherein gesture commands may be performed by the user and recognized by the system. Objects in motion are detected by comparing corresponding patches (subsets of the entire video image) of video from successive video images. By analyzing and comparing the corresponding video patches from successive images, objects in motion are detected and tracked.
  • In accordance with still another aspect of the invention, two video cameras are mounted substantially parallel to each other to generate video images that are used to determine the depth (distance from the camera, display, or other point of reference) of a moving object using a lightweight sparse stereo technique. The lightweight sparse stereo technique reduces the computational requirements of the system and the depth component is used as an element in determining whether that particular object is the nearest object within the engagement volume.
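For the substantially parallel camera arrangement described here, the depth of a tracked point can be recovered from its horizontal disparity between the two images using the standard pinhole relation depth = focal length × baseline / disparity; the sketch below uses made-up camera parameters and is only an illustration of that relation, not the patent's lightweight sparse stereo code.

```python
FOCAL_LENGTH_PX = 700.0  # hypothetical focal length, in pixels
BASELINE_M = 0.12        # hypothetical separation between the two cameras

def depth_from_disparity(u_left, u_right):
    """Depth of a tracked point from its horizontal disparity between the
    left and right images (pinhole model: depth = f * baseline / disparity)."""
    disparity = u_left - u_right
    if disparity <= 0:
        return None  # point at infinity or a mismatched feature
    return FOCAL_LENGTH_PX * BASELINE_M / disparity

# A hand seen 21 pixels apart in the two images is about 4 m from the cameras.
print(depth_from_disparity(340.0, 319.0))  # 4.0
```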
  • The following description and the annexed drawings set forth in detail certain illustrative aspects of the invention. These aspects are indicative, however, of but a few of the various ways in which the principles of the invention may be employed and the present invention is intended to include all such aspects and their equivalents. Other advantages and novel features of the invention will become apparent from the following detailed description of the invention when considered in conjunction with the drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates a system block diagram of components of the present invention for controlling a computer and/or other hardware/software peripherals interfaced thereto.
  • FIG. 2 illustrates a schematic block diagram of a perceptual user interface system, in accordance with an aspect of the present invention.
  • FIG. 3 illustrates a flow diagram of a methodology for implementing a perceptual user interface system, in accordance with an aspect of the present invention.
  • FIG. 4 illustrates a flow diagram of a methodology for determining the presence of moving objects within images, in accordance with an aspect of the present invention.
  • FIG. 5 illustrates a flow diagram of a methodology for tracking a moving object within an image, in accordance with an aspect of the present invention.
  • FIG. 6 illustrates a disparity between two video images captured by two video cameras mounted substantially parallel to each other for the purpose of determining the depth of objects, in accordance with an aspect of the present invention.
  • FIG. 7 illustrates an example of the hand gestures that the system may recognize and the visual feedback provided through the display, in accordance with an aspect of the present invention.
  • FIG. 8 illustrates an alternative embodiment wherein a unique icon is displayed in association with a name of a specific recognized command, in accordance with an aspect of the present invention.
  • FIG. 9 illustrates an engagement plane and engagement volume of both single and multiple monitor implementations, in accordance with an aspect of the present invention.
  • FIG. 10 illustrates a briefing room environment where gestures are utilized to control a screen projector via a computer system configured in accordance with an aspect of the present invention.
  • FIG. 11 illustrates a block diagram of a computer system operable to execute the present invention.
  • FIG. 12 illustrates a network implementation of the present invention.
  • DETAILED DESCRIPTION OF THE INVENTION
  • As used in this application, the terms “component” and “system” are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
  • The present invention relates to a system and methodology for implementing a perceptual user interface comprising alternative modalities for controlling computer programs and manipulating on-screen objects through hand gestures or a combination of hand gestures and/or verbal commands. A perceptual user interface system is provided that tracks hand movements and provides for the control of computer programs and manipulation of on-screen objects in response to hand gestures performed by the user. Similarly the system provides for the control of computer programs and manipulation of on-screen objects in response to verbal commands spoken by the user. Further, the gestures and/or verbal commands may be tailored by a particular user to suit that user's personal preferences. The system operates in real time and is robust, light in weight and responsive. The system provides a relatively inexpensive capability for the recognition of hand gestures and verbal commands.
  • Referring now to FIG. 1, there is illustrated a system block diagram of components of the present invention for controlling a computer and/or other hardware/software peripherals interfaced thereto. The system 100 includes a tracking component 102 for detecting and tracking one or more objects 104 through image capture utilizing cameras (not shown) or other suitable conventional image-capture devices. The cameras operate to capture images of the object(s) 104 in a scene within the image capture capabilities of the cameras so that the images may be further processed to not only detect the presence of the object(s) 104, but also to detect and track object(s) movements. It is appreciated that in more robust implementations, object characteristics such as object features and object orientation may also be detected, tracked, and processed. The object(s) 104 of the present invention include basic hand movements created by one or more hands of a system user and/or other person selected for use with the disclosed system. However, in more robust system implementations, such objects may include many different types of objects with object characteristics, including hand gestures each of which have gesture characteristics including but not limited to, hand movement, finger count, finger orientation, hand rotation, hand orientation, and hand pose (e.g., opened, closed, and partially closed).
  • The tracking component 102 interfaces to a control component 106 of the system 100 that controls all onboard component processes. The control component 106 interfaces to a seeding component 108 that seeds object hypotheses to the tracking component based upon the object characteristics.
  • The object(s) 104 are detected and tracked in the scene such that object characteristic data is processed according to predetermined criteria to associate the object characteristic data with commands for interacting with a user interface component 110. The user interface component 110 interfaces to the control component 106 to receive control instructions that affect presentation of text, graphics, and other output (e.g., audio) provided to the user via the interface component 110. The control instructions are communicated to the user interface component 110 in response to the object characteristic data processed from detection and tracking of the object(s) within a predefined engagement volume space 112 of the scene.
  • A filtering component 114 interfaces to the control component 106 to receive filtering criteria in accordance with user filter configuration data, and to process the filtering criteria such that tracked object(s) of respective object hypotheses are selectively removed from the object hypotheses and/or at least one hypothesis from a set of hypotheses within the volume space 112 and the scene. Objects are detected and tracked either within the volume space 112 or outside the volume space 112. Those objects outside of the volume space 112 are detected, tracked, and ignored, until entering the volume space 112.
  • The system 100 also receives user input via input port(s) 116 such as input from pointing devices, keyboards, interactive input mechanisms such as touch screens, and audio input devices.
  • The subject invention (e.g., in connection with object detection, tracking, and filtering) can employ various artificial intelligence based schemes for carrying out various aspects of the subject invention. For example, a process for determining which object is to be selected for tracking can be facilitated via an automatic classification system and process. Such classification can employ a probabilistic and/or statistical-based analysis (e.g., factoring into the analysis utilities and costs) to prognose or infer an action that a user desires to be automatically performed. For example, a support vector machine (SVM) classifier can be employed. Other classification approaches that can be employed include Bayesian networks, decision trees, and probabilistic classification models providing different patterns of independence. Classification as used herein is also inclusive of statistical regression that is utilized to develop models of priority.
  • As will be readily appreciated from the subject specification, the subject invention can employ classifiers that are explicitly trained (e.g., via generic training data) as well as implicitly trained (e.g., via observing user behavior, receiving extrinsic information) so that the classifier(s) is used to automatically determine, according to predetermined criteria, which object(s) should be selected for tracking and which objects that were being tracked should now be removed from tracking. The criteria can include, but are not limited to, object characteristics such as object size, object speed, direction of movement, distance from one or both cameras, object orientation, object features, and object rotation. For example, with respect to SVMs, which are well understood (it is to be appreciated that other classifier models, such as Naive Bayes, Bayes Net, decision trees and other learning models, may also be utilized), SVMs are configured via a learning or training phase within a classifier constructor and feature selection module. A classifier is a function that maps an input attribute vector, x=(x1, x2, x3, x4, . . . , xn), to a confidence that the input belongs to a class; that is, f(x)=confidence(class). In the case of text-based data collection synchronization classification, for example, attributes are words or phrases or other data-specific attributes derived from the words (e.g., parts of speech, presence of key terms), and the classes are categories or areas of interest (e.g., levels of priorities).
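As a concrete, purely illustrative example of such a classifier, a support vector machine could be trained on a few object characteristics (size, speed, distance) and queried for a decision and a confidence score when choosing which object to track; the feature choices, toy data, and scikit-learn usage below are assumptions for the sketch, not the patent's implementation.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: each row is (object size, speed, distance from
# the cameras) for a candidate object; the label marks whether to track it.
X = np.array([[0.30, 0.8, 0.6],
              [0.05, 0.1, 2.5],
              [0.25, 1.2, 0.7],
              [0.02, 0.0, 3.0],
              [0.35, 0.6, 0.5],
              [0.04, 0.2, 2.8]])
y = np.array([1, 0, 1, 0, 1, 0])

clf = SVC(kernel="rbf").fit(X, y)

candidate = np.array([[0.28, 0.9, 0.65]])
print(clf.predict(candidate))            # 1 -> select this object for tracking
print(clf.decision_function(candidate))  # signed margin, usable as a confidence
```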
  • Referring now to FIG. 2, there is illustrated a schematic block diagram of a perceptual user interface system, in accordance with an aspect of the present invention. The system comprises a computer 200 with a traditional keyboard 202, input pointing device (e.g., a mouse) 204, microphone 206, and display 208. The system further comprises at least one video camera 210, at least one user 212, and software 214. The exemplary system of FIG. 2 is comprised of two video cameras 210 mounted substantially parallel to each other (that is, the rasters are parallel) and the user 212. The first camera is used to detect and track the object, and the second camera is used for determining the depth (or distance) of the object from the camera(s). The computer 200 is operably connected to the keyboard 202, mouse 204 and display 208. Video cameras 210 and microphone 206 are also operably connected to computer 200. The video cameras 210 “look” towards the user 212 and may point downward to capture objects within the volume defined above the keyboard and in front of the user. User 212 is typically an individual that is capable of providing hand gestures, holding objects in a hand, verbal commands, and mouse and/or keyboard input. The hand gestures and/or object(s) appear in video images created by the video cameras 210 and are interpreted by the software 214 as commands to be executed by computer 200. Similarly, microphone 206 receives verbal commands provided by user 212, which are in turn, interpreted by software 214 and executed by computer 200. User 212 can control and operate various application programs on the computer 200 by providing a series of hand gestures or a combination of hand gestures, verbal commands, and mouse/keyboard input.
  • In view of the foregoing structural and functional features described above, methodologies in accordance with various aspects of the present invention will be better appreciated with reference to FIGS. 3-5. While, for purposes of simplicity of explanation, the methodologies of FIGS. 3-5 are shown and described as executing serially, it is to be understood and appreciated that the present invention is not limited by the illustrated order, as some aspects could, in accordance with the present invention, occur in different orders and/or concurrently with other aspects from that shown and described herein. Moreover, not all illustrated features may be required to implement a methodology in accordance with an aspect the present invention.
  • Accordingly, FIG. 3 is a flow diagram that illustrates a high level methodology for detecting the user's hand, tracking movement of the hand and interpreting commands in accordance with an aspect of the invention. While, for purposes of simplicity of explanation, the methodologies shown here and below are described as a series of acts, it is to be understood and appreciated that the present invention is not limited by the order of acts, as some acts may, in accordance with the present invention, occur in different orders and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all illustrated acts may be required to implement a methodology in accordance with the present invention.
  • The methodology begins at 300 where video images are scanned to determine whether any moving objects exist within the field of view (or scene) of the cameras. The system is capable of running one or more object hypothesis models to detect and track objects, whether moving or not moving. In one embodiment, the system runs up to and including six object hypotheses. If more than one object is detected as a result of the multiple hypotheses, the system drops one of the objects if the distance from any other object falls below a threshold distance, for example, five inches. It is assumed that the two hypotheses are redundantly tracking the same object, and one of the hypotheses is removed from consideration. At 302, if NO, no moving object(s) have been detected, and flow returns to 300 where the system continues to scan the current image for moving objects. Alternatively, if YES, object movement has been detected, and flow continues from 302 to 304 where it is determined whether or not one or more moving objects are within the engagement volume. It is appreciated that the depth of the object may be determined before determination of whether the object is within the engagement volume.
  • The engagement volume is defined as a volume of space in front of the video cameras and above the keyboard wherein the user is required to introduce the hand gestures (or object(s)) in order to utilize the system. A purpose of the engagement volume is to provide a means for ignoring all objects and/or gestures in motion except for those intended by the user to effect control of the computer. If a moving object is detected at 302, but is determined not to be within the engagement volume, then the system dismisses the moving object as not being a desired object to track for providing commands. Flow then loops back to the input of 300 to scan for more objects. However, if the moving object is determined to be within the engagement volume, then the methodology proceeds to 306. However, new objects are seeded only when it is determined that the new object is a sufficient distance away from an existing object that is being tracked (in 3-D). At 306, the system determines the distance of each moving object from the video cameras. At 308, if more than one moving object is detected within the engagement volume, then the object closest to the video camera(s) is selected as the desired command object. If by the given application context the user is predisposed to use hand gestures towards the display, the nearest object hypotheses will apply to the hands. In other scenarios, more elaborate criteria for object selection may be used. For example, an application may select a particular object based upon its quality of movement over time. Additionally, a two-handed interaction application may select an object to the left of the dominant hand (for right handed users) as the non-dominant hand. The command object is the object that has been selected for tracking, the movements of which will be analyzed and interpreted for gesture commands. The command object is generally the user's dominant hand. Once the command object is selected, its movement is tracked, as indicated at 310.
  • At 312, the system determines whether the command object is still within the engagement volume. If NO, the object has moved outside the engagement volume, and the system dismisses the object hypothesis and returns to 300 where the current image is processed for moving objects. If YES, the object is still within the engagement volume, and flow proceeds to 314. At 314, the system determines whether the object is still moving. If no movement is detected, flow is along the NO path returning to 300 to process the current camera images for moving objects. If, however, movement is detected, then flow proceeds from 314 to 316. At 316, the system analyzes the movements of the command object to interpret the gestures for specific commands. At 318, it is determined whether the interpreted gesture is a recognized command. If NO, the movement is not interpreted as a recognized command, and flow returns to 310 to continue tracking the object. However, if the object movement is interpreted as a recognized command, flow is to 320 where the system executes the corresponding command. After execution thereof, flow returns to 310 to continue tracking the object. This process may continually execute to detect and interpret gestures.
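The flow of FIG. 3 can be summarized as a control loop such as the sketch below; every callable passed into the function is a placeholder standing in for the detection, engagement-volume, tracking, interpretation, and execution stages described above, and the dictionary-style object fields are assumptions made only to keep the sketch self-contained.

```python
def gesture_loop(get_objects, in_volume, track_step, interpret, execute):
    """Control-flow sketch of FIG. 3. All five callables are supplied by the
    caller and stand in for the detection, engagement-volume test, tracking,
    gesture-interpretation, and command-execution stages described above."""
    while True:
        # 300/302: scan the current images for moving objects.
        moving = [obj for obj in get_objects() if obj["moving"]]
        # 304: ignore anything outside the engagement volume.
        candidates = [obj for obj in moving if in_volume(obj)]
        if not candidates:
            continue
        # 306/308: the moving object nearest the cameras becomes the command object.
        command_object = min(candidates, key=lambda obj: obj["depth"])
        # 310-320: track it and execute any recognized gesture command.
        while in_volume(command_object) and command_object["moving"]:
            track_step(command_object)
            gesture = interpret(command_object)
            if gesture is not None:
                execute(gesture)
```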
  • In accordance with an aspect of the invention, algorithms used to interpret gestures are kept to simple algorithms and are performed on sparse (“lightweight”) images to limit the computational overhead required to properly interpret and execute desired commands in real time. In accordance with another aspect of the invention, the system is able to exploit the presence of motion and depth to minimize computational requirements involved in determining objects that provide gesture commands.
  • Referring now to FIG. 4, there is illustrated a flow diagram of a methodology for determining the presence of moving objects within video images created by one or more video sources, in accordance with an aspect of the present invention. The methodology exploits the notion that attention is often drawn to objects that move. At 400, video data is acquired from one or more video sources. Successive video images are selected from the same video source, and motion is detected by comparing a patch of a current video image, centered on a given location, to a patch from the previous video image centered on the same location. At 402, video patches centered about points (u1,v1) and (u2,v2) are selected from successive video images I1 and I2, respectively. A simple comparison function is utilized wherein the sum of the absolute differences (SAD) over square patches in the two images is obtained. For a patch from image I1 centered on pixel location (u1,v1) and a patch in image I2 centered on (u2,v2), the image comparison function SAD(I1,u1,v1,I2,u2,v2) is defined as:
  • $$\mathrm{SAD}(I_1,u_1,v_1,I_2,u_2,v_2)=\sum_{-\frac{D}{2}\le i,j\le\frac{D}{2}}\bigl|I_1(u_1+i,\;v_1+j)-I_2(u_2+i,\;v_2+j)\bigr|$$
  • where I(u,v) refers to the pixel at (u,v), D is the patch width, and the absolute difference between two pixels is the sum of the absolute differences taken over all available color channels. Regions in the image that have movement can be found by determining points (u,v) such that function SAD(It−1,ut−1,vt−1,It,ut,vt)>τ, where the subscript refers to the image at time t, and τ is a threshold level for motion. At 404, a comparison is made between patches from image I1 and I2 using the sum of the absolute difference algorithm. At 406, the result of the sum of the absolute difference algorithm is compared to a threshold value to determine whether a threshold level of motion exists within the image patch. If SAD≦τ, no sufficient motion exists, and flow proceeds to 410. If at 406, SAD>τ, then sufficient motion exists within the patch, and flow is to 408 where the object is designated for continued tracking. At 410, the system determines whether the current image patch is the last patch to be examined within the current image. If NO, the methodology returns to 402 where a new patch is selected. If YES, then the system returns to 400 to acquire a new video image from the video source.
  • To reduce the computational load, the SAD algorithm is computed on a sparse regular grid within the image. In one embodiment, the sparse regular grid is based on sixteen pixel centers. When the motion detection methodology determines that an object has sufficient motion, then the system tracks the motion of the object. Again, in order to limit (or reduce) the computational load, a position prediction algorithm is used to predict the next position of the moving object. In one embodiment, the prediction algorithm is a Kalman filter. However, it is to be appreciated that any position prediction algorithm can be used.
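  • For illustration, a minimal sketch of the SAD comparison and the sparse-grid motion scan described above is given below (NumPy-based). The grid spacing, patch width D, threshold value, and helper names are example assumptions rather than values taken from the disclosure.

```python
import numpy as np

def sad(img1: np.ndarray, u1: int, v1: int,
        img2: np.ndarray, u2: int, v2: int, D: int = 9) -> float:
    """Sum of absolute differences over D-by-D patches centered on (u1, v1) and (u2, v2),
    summed over all color channels. Images are H x W x C arrays; u indexes columns and
    v indexes rows (a coordinate convention assumed here)."""
    r = D // 2
    p1 = img1[v1 - r:v1 + r + 1, u1 - r:u1 + r + 1].astype(np.int32)
    p2 = img2[v2 - r:v2 + r + 1, u2 - r:u2 + r + 1].astype(np.int32)
    return float(np.abs(p1 - p2).sum())

def detect_motion(prev: np.ndarray, curr: np.ndarray,
                  grid: int = 16, D: int = 9, tau: float = 2000.0):
    """Evaluate SAD on a sparse regular grid (here every 16 pixels) and return the
    grid points whose SAD against the previous frame exceeds the motion threshold tau."""
    h, w = curr.shape[:2]
    r = D // 2
    return [(u, v)
            for v in range(r, h - r, grid)
            for u in range(r, w - r, grid)
            if sad(prev, u, v, curr, u, v, D) > tau]
```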
  • Note that the image operations may use the same SAD function on image patches, which allows for easy SIMD (Single Instruction stream, Multiple Data stream) optimization of the algorithm's implementation, which in turn allows it to run with sufficiently many trackers while still leaving CPU time available to the user.
  • The process of seeding object hypotheses based upon motion may place more than one hypothesis on a given moving object. One advantage of this multiple-hypothesis approach is that a simple, fast, and imperfect tracking algorithm may be used. Thus, if one tracker fails, another may be following the object of interest. Once a given tracker has been seeded, the algorithm updates the position of the object being followed using the same function over successive frames.
  • Referring now to FIG. 5, there is illustrated a flow diagram of a methodology for tracking a moving object within an image, in accordance with an aspect of the present invention. The methodology begins at 500 where, after the motion detection methodology has identified the location of a moving object to be tracked, the next position of the object is predicted. Once identified, the methodology utilizes a prediction algorithm to predict the position of the object in successive frames. The prediction algorithm limits the computational burden on the system. In the successive frames, the moving object should be at the predicted location, or within a narrow range centered on the predicted location. At 502, the methodology selects a small pixel window (e.g., ten pixels) centered on the predicted location. Within this small window, an algorithm executes to determine the actual location of the moving object. At 504, the new position is determined by examining the sum of the absolute difference algorithm over successive video frames acquired at time t and time t−1. The actual location is determined by finding the location (ut, vt) that minimizes:

  • $$\mathrm{SAD}(I_{t-1},\,u_{t-1},\,v_{t-1},\,I_t,\,u_t,\,v_t),$$
  • where It refers to the image at time t, It−1 refers to the image at time t−1, and (ut, vt) refers to the location at time t. Once determined, the actual position is updated, at 506. At 508, motion characteristics are evaluated to determine whether the motion is still greater than the required threshold level. What is evaluated is not only the SAD image-based computation, but also movement of the object over time. The movement parameter is the average movement over a window of time. Thus, if the user pauses the object or hand for a short duration of time, it may not be dropped from consideration. However, if the pause lasts longer, such that it exceeds a predetermined average time parameter, the object will be dropped. If YES, the motion is sufficient, and flow returns to 500 where a new prediction for the next position is determined. If NO, the object motion is insufficient, and the given object is dropped from being tracked, as indicated by flow to 510. At 512, flow is to 402 of FIG. 4 to select a new patch in the image from which to analyze motion.
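  • A minimal sketch of this local search step is shown below, assuming the sad helper from the earlier motion-detection sketch is in scope; the window size and patch width are illustrative, and the Kalman prediction and average-movement bookkeeping are omitted.

```python
import numpy as np

def track_step(prev: np.ndarray, curr: np.ndarray,
               prev_pos: tuple, predicted_pos: tuple,
               window: int = 10, D: int = 9) -> tuple:
    """Search a small window centered on the predicted location for the position (ut, vt)
    in the current frame that minimizes SAD against the patch at the previous position
    in the previous frame."""
    up, vp = prev_pos
    uc, vc = predicted_pos
    h, w = curr.shape[:2]
    r = D // 2
    best_score, best_pos = None, predicted_pos
    for dv in range(-window, window + 1):
        for du in range(-window, window + 1):
            u, v = uc + du, vc + dv
            if r <= u < w - r and r <= v < h - r:
                score = sad(prev, up, vp, curr, u, v, D)
                if best_score is None or score < best_score:
                    best_score, best_pos = score, (u, v)
    return best_pos
```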
  • When determining the depth information of an object (i.e., the distance from the object to the display or any other chosen reference point), a lightweight sparse stereo approach is utilized in accordance with an aspect of the invention. The sparse stereo approach is a region-based approach utilized to find the disparity at only the locations in the image corresponding to the object hypothesis. Note that in the stereo matching process, it is assumed that both cameras are parallel (with aligned rasters). Object hypotheses are supported by frame-to-frame tracking through time in one view and stereo matching across both views. A second calibration issue is the distance between the two cameras (i.e., the baseline), which must be considered to recover depth in real world coordinates. In practice, both calibration issues may be dealt with automatically by fixing the cameras on a prefabricated mounting bracket or semi-automatically by the user presenting objects at a known depth in a calibration routine that requires a short period of time to complete. The accuracy of the transform to world coordinates is improved by accounting for lens distortion effects with a static, pre-computed calibration procedure for a given camera.
  • Binocular disparity is the primary means for recovering depth information from two or more images taken from different viewpoints. Given the two-dimensional position of an object in two views, it is possible to compute the depth of the object. Given that the two cameras are mounted parallel to each other in the same horizontal plane, and given that the two cameras have a focal length f, the three-dimensional position (x,y,z) of an object is computed from the positions of the object in both images, (ul,vl) and (ur,vr), by the following perspective projection equations:
  • $$u_l=f\,\frac{x}{z};\qquad v_l=v_r=f\,\frac{y}{z};\qquad d=u_r-u_l=f\,\frac{b}{z};$$
  • where the disparity, d, is the shift in location of the object in one view with respect to the other, and is related to the baseline b, the distance between the two cameras.
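  • A short numerical illustration of these projection relations follows. The coordinate convention (image coordinates measured from the principal point, focal length in pixels, baseline in meters) and the example values are assumptions for illustration only.

```python
def depth_from_disparity(d: float, f: float, b: float) -> float:
    """z = f * b / d, per the perspective projection equations above."""
    return f * b / d

def position_from_stereo(u: float, v: float, d: float, f: float, b: float):
    """Recover (x, y, z) for an image point (u, v) with disparity d,
    using x = u * z / f and y = v * z / f."""
    z = depth_from_disparity(d, f, b)
    return (u * z / f, v * z / f, z)

# Example: with f = 500 pixels and b = 0.4 m (values chosen for illustration),
# a disparity of 400 pixels corresponds to a depth of 500 * 0.4 / 400 = 0.5 m.
```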
  • The vision algorithm performs 3-dimensional (3-D) tracking and 3-D depth computations. In this process, each object hypothesis is supported only by consistency of the object movement in 3-D. Unlike many conventional computer vision algorithms, the present invention does not rely on fragile appearance models such as skin color models or hand image templates, which are likely invalidated when environmental conditions change or the system is confronted with a different user.
  • Referring now to FIG. 6, there is illustrated a disparity between two video images captured by two video cameras mounted substantially parallel to each other for the purpose of determining the depth of objects, in accordance with an aspect of the present invention. In FIG. 6, a first camera 600 and a second camera 602 (similar to cameras 210) are mounted substantially parallel to each other in the same horizontal plane and laterally aligned. The two cameras (600 and 602) are separated by a distance 604 defined between the longitudinal focal axis of each camera lens, also known as the baseline, b. A first video image 606 is the video image from the first camera 600 and a second video image 608 is the video image from the second camera 602. The disparity d (also item number 610), or shift in the two video images (606 and 608), can be seen by looking to an object 612 in the center of the first image 606, and comparing the location of that object 612 in the first image 606 to the location of that same object 612 in the second image 608. The disparity 610 is illustrated as the difference between a first vertical centerline 614 of the first image 606 that intersects the center of the object 612, and a second vertical centerline 616 of the second image 608. In the first image 606, the object 612 is centered about the vertical centerline 614 with the top of the object 612 located at point (u,v). In the second image 608, the same point (u,v) of the object 612 is located at point (u−d,v), where d is the disparity 610, or shift in the object from the first image 606 with respect to the second image 608. Given disparity d, a depth z can be determined. As will be discussed, in accordance with one aspect of the invention, the depth component z is used in part to determine if an object is within the engagement volume, where the engagement volume is the volume within which objects will be selected by the system.
  • In accordance with another aspect of the present invention, a sparse stereo approach is utilized in order to limit computational requirements. The sparse stereo approach is one that determines the disparity d only at the locations in the image that correspond to a moving object. For a given point (u,v) in the image, the value of disparity d is found such that the sum of the absolute differences over a patch in the first image 606 (i.e., a left image IL) centered on (u,v) and a corresponding patch in the second image 608 (i.e., a right image IR) centered on (u−d,v) is minimized, i.e., the disparity value d that minimizes SAD(Il,u,v,Ir,u−d,v). If an estimate of depth z is available from a previous time, then in order to limit computational requirements, the search for the minimizing disparity d is limited to a range corresponding to depths around the last known depth z.
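  • The disparity search might be sketched as follows, again assuming the sad helper from the motion-detection sketch is in scope; the patch width and the mapping from the last known depth to a disparity range [d_min, d_max] are assumptions.

```python
import numpy as np

def sparse_disparity(left: np.ndarray, right: np.ndarray,
                     u: int, v: int, d_min: int, d_max: int, D: int = 9) -> int:
    """For a tracked point (u, v) in the left image, return the disparity d in
    [d_min, d_max] that minimizes SAD between the left patch at (u, v) and the
    right patch at (u - d, v)."""
    r = D // 2
    best_d, best_score = d_min, None
    for d in range(d_min, d_max + 1):
        if u - d - r < 0:
            break  # right-image patch would fall outside the image
        score = sad(left, u, v, right, u - d, v, D)
        if best_score is None or score < best_score:
            best_score, best_d = score, d
    return best_d

# Given the previous depth z0, the search range can be limited to disparities
# around f * b / z0 (for example, plus or minus a few pixels).
```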
  • In accordance with another aspect of the invention, the search range may be further narrowed by use of an algorithm to predict the object's new location. In one embodiment, the prediction is accomplished by utilization of a Kalman filter.
  • The depth z can also be computed using traditional triangulation techniques. The sparse stereo technique is used when the system operation involves detecting moving objects within a narrow range in front of the display, e.g., within twenty inches. In such cases, the two video cameras are mounted in parallel and separated by a distance equal to the approximate width of the display. However, when the system is implemented in a larger configuration, the distance between the two video cameras may be much greater. In such cases, traditional triangulation algorithms are used to determine the depth.
  • The foregoing discussion has focused on some details of the methodologies associated with locating and tracking an object to effect execution of corresponding and specified commands. An overview follows as to how these capabilities are implemented in one exemplary system.
  • Referring now to FIG. 7, there is illustrated an example of gestures that the system recognizes, and further illustrates visual feedback provided to the user through the display. A user 700 gives commands by virtue of different hand gestures 702 and/or verbal commands 704. The gestures 702 are transmitted to a system computer (not shown) as part of the video images created by a pair of video cameras (706 and 708). Verbal and, more generally, audio commands are input to the system computer through a microphone 710. Typical GUI windows 712, 714, and 716 are displayed in a layered presentation in an upper portion of the display 718, while a lower portion of the display 718 provides visual graphic feedback in the form of icons 720, 722, 724, and 726 for some of the gestures 702 recognized by the system.
  • In one example, the hand icon 720 is displayed when a corresponding gesture 728 is recognized. The name of the recognized command (Move) is also then displayed below the icon 720 to provide additional textual feedback to the user 700. Move and Raise commands may be recognized by dwelling on the window for a period of time. There is also a “flick” or “bump” command to send a window from one monitor to another monitor in a multiple-monitor configuration. This is controlled by moving the hand (or object) to the left or right, and is described in greater detail hereinbelow with respect to FIG. 9B. There are at least two ways to effect a Move: by speech recognition, when voicing the word “Move”, the phrase “Move Window”, or any other associated voice command(s); and by using the dwelling technique. It is appreciated that where more robust image capture and image processing systems are implemented, the pose of the hand may be mapped to any functionality, as described in greater detail below. Moreover, the shape of the hand icon may be changed in association with the captured hand pose to provide visual feedback to the user that the correct hand pose is being processed. However, as a basic implementation, the hand icon is positioned for selecting the window for interaction, or to move the window, or to effect scrolling.
  • A Scroll command may be initiated first by voicing a corresponding command that is processed by speech recognition, and then using the hand (or object) to commence scrolling of the window by moving the hand (or object) up and down for the desired scroll direction.
  • In another example, the single displayed hand icon 720 is presented for all recognized hand gestures 702; however, the corresponding specific command name is displayed below the icon 720. Here, the same hand icon 720 is displayed in accordance with four different hand gestures utilized to indicate four different commands: Move, Close, Raise, and Scroll.
  • In still another aspect of the present invention, a different hand shaped icon is used for each specific command and the name of the command is optionally displayed below the command. In yet another embodiment, audio confirmation is provided by the computer, in addition to the displayed icon and optional command name displayed below the icon.
  • As previously mentioned, FIG. 7 illustrates the embodiment where a single hand shaped icon 720 is used, and the corresponding command recognized by the system is displayed below the icon 720. For example, when the system recognizes, either by virtue of gestures (with hand and/or object) and or verbal commands, the command to move a window, the icon 720 and corresponding command word “MOVE” are displayed by the display 718. Similarly, when the system recognizes a command to close a window, the icon 720 and corresponding command word “CLOSE” may be displayed by the display 718. Additional examples include, but are not limited to, displaying the icon 720 and corresponding command word “RAISE” when the system recognizes a hand gesture to bring a GUI window forward. When the system recognizes a hand gesture corresponding to a scroll command for scrolling a GUI window, the icon 720 and command word “SCROLL” are displayed by the display 718.
  • It is to be appreciated that the disclosed system may be configured to display any number and type of graphical icons in response to one or more hand gestures presented by the system user. Additionally, audio feedback may be used such that a beep or tone may be presented in addition to or in lieu of the graphical feedback. Furthermore, the graphical icon may be used to provide feedback in the form of a color, combination of colors, and/or flashing color or colors. Feedback may also be provided by flashing a border of the selected window, for example, the border in the direction of movement. For example, if the window is to be moved to the right, the right window border could be flashed to indicate the selected direction of window movement. In addition to, or separate from, the visual feedback, a corresponding tone frequency may be emitted to indicate direction of movement, e.g., an upward movement would have an associated high pitch and a downward movement would have a low pitch. Still further, rotational aspects may be provided such that movement to the left effects a counterclockwise rotation of a move icon, or perhaps a leftward tilt of the GUI window in the direction of movement.
  • Referring now to FIG. 8, there is illustrated an alternative embodiment wherein a unique icon is displayed in association with a name of a specific recognized command, in accordance with an aspect of the present invention. Here, each icon-word pair is unique for each recognized command. Icon-word pairs 800, 802, 804, and 806 for the respective commands “MOVE”, “CLOSE”, “RAISE”, and “SCROLL”, are examples of visual feedback capabilities that can be provided.
  • The system is capable of interpreting commands based on hand gestures, verbal commands, or both in combination. A hand is identified as a moving object by the motion detection algorithms, and the hand movement is tracked and interpreted. In accordance with one aspect of the invention, hand gestures and verbal commands are used cooperatively. Speech recognition is performed using suitable voice recognition applications, for example, Microsoft SAPI 5.1, with a simple command and control grammar. However, it is understood that any similar speech recognition system can be used. An inexpensive microphone is placed near the display to receive audio input. However, the microphone can be placed at any location insofar as the audio signals can be received and processed by the system.
  • Following is an example of functionality that is achieved by combining hand gesture and verbal modalities. Interaction with the system can be initiated by a user moving a hand across an engagement plane and into an engagement volume.
  • Referring now to FIG. 9A, there is illustrated the engagement plane and engagement volume for a single monitor system of the present invention. A user 900 is located generally in front of a display 902, which is also within the imaging capabilities of a pair of cameras (906 and 908). A microphone 904 (similar to microphones 206 and 710) is suitably located such that user voice signals are input for processing, e.g., in front of the display 902. The cameras (906 and 908, similar to cameras 200, and 706 and 708) are mounted substantially parallel to each other and on a horizontal plane above the display 902. The two video cameras (906 and 908) are separated by a distance that provides optimum detection and tracking for the given cameras and the engagement volume. However, it is to be appreciated that cameras suitable for wider fields of view and higher resolution may be placed farther apart on a plane different from the top of the display 902, for example, lower and along the sides of the display facing upwards, to capture gesture images for processing in accordance with novel aspects of the present invention. In accordance therewith, more robust image processing capabilities and hypothesis engines can be employed in the system to process greater amounts of data.
  • Between the display 902 and the user 900 is a volume 910 defined as the engagement volume. The system detects and tracks objects inside and outside of the volume 910 to determine the depth of one or more objects with respect to the engagement volume 910. However, those objects determined to be of a depth that is outside of the volume 910 will be ignored. As mentioned hereinabove, the engagement volume 910 is typically defined to be located where the hands and/or objects in the hands of the user 900 are most typically situated, i.e., above a keyboard of the computer system and in front of the cameras (906 and 908) between the user 900 and the display 902 (provided the user 900 is seated in front of the display on which the cameras (906 and 908) are located). However, it is appreciated that the user 900 may be standing while controlling the computer, which requires that the volume 910 be located accordingly to facilitate interface interaction. Furthermore, the objects may include not only the hand(s) of the user, or objects in the hand(s), but other parts of the body, such as the head, torso, or arms, or any other detectable objects. This is described in greater detail hereinbelow.
  • A plane 912 defines a face of the volume 910 that is closest to the user 900, and is called the engagement plane. The user 900 may effect control of the system by moving a hand (or object) through the engagement plane 912 and into the engagement volume 910. However, as noted above, the hand of the user 900 is detected and tracked even when outside the engagement volume 910. However, it would be ignored when outside of the engagement volume 910 insofar as control of the computer is concerned. When the object is moved across the engagement plane 912, feedback is provided to the user in the form of displaying an alpha-blended icon on the display (e.g., an operating system desktop). The icon is designed to be perceived as distinct from other desktop icons and may be viewed as an area cursor. The engagement plane 912 is positioned such that the user's hands do not enter it during normal use of the keyboard and mouse. When the system engages the hand or object, the corresponding hand icon displayed on the desktop is moved to reflect the position of the tracked object (or hand).
  • The engagement and acquisition of the moving hand (or object) is implemented in the lightweight sparse stereo system by looking for the object with a depth that is less than a predetermined distance value. Any such object will be considered the command object until it is moved out of the engagement volume 910, for example, behind the engagement plane 912, or until the hand (or object) is otherwise removed from being a tracked object. In one example, the specified distance is twenty inches.
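  • One possible expression of this depth-based engagement test is sketched below. The object identifiers, units, and the twenty-inch threshold merely mirror the example above; this is an illustrative sketch rather than a definitive implementation.

```python
from typing import Dict, Optional

ENGAGEMENT_DEPTH = 20.0  # inches, matching the example distance above

def update_command_object(current: Optional[int],
                          depths: Dict[int, float],
                          max_depth: float = ENGAGEMENT_DEPTH) -> Optional[int]:
    """Keep the current command object while its depth stays inside the engagement
    volume; otherwise hand over to the nearest object that has crossed the
    engagement plane, if any."""
    if current is not None and depths.get(current, float("inf")) < max_depth:
        return current
    inside = {oid: z for oid, z in depths.items() if z < max_depth}
    return min(inside, key=inside.get) if inside else None
```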
  • In operation, the user 900 moves a hand through the engagement plane 912 and into the engagement volume 910 established for the system. The system detects the hand, tracks the hand as the hand moves from outside of the volume 910 to the inside, and provides feedback by displaying a corresponding hand shaped icon on the display 902. The open microphone 904 placed near the display 902 provides means for the user 900 to invoke one or more verbal commands in order to act upon the selected window under the icon. The window directly underneath the hand shaped icon is the selected window. When a spoken and/or audio command is input to and understood by the system, the interpreted command is displayed along with the hand shaped icon. For example, in one embodiment, by speaking the word “Move”, the user may initiate the continuous (or stepped) movement of the window under the hand shaped icon to follow the movement of the user's hand. The user 900 causes the selected window to move up or down within the display 902 by moving the hand up or down. Lateral motion is also similarly achieved. Movement of the window is terminated when the user hand is moved across the engagement plane 912 and out of the engagement volume 910. Other methods of termination include stopping movement of the hand (or object) for an extended period of time, which is processed by the system as a command to drop the associated hypothesis. Furthermore, as described hereinabove, the Move command may be invoked by dwelling the hand on the window for a period of time, followed by hand motion to initiate the direction of window movement.
  • Alternatively, the user may speak the word “Release” and the system will stop moving the selected window in response to the user's hand motion. Release may also be accomplished by dwelling a bit longer in time while in Move, and/or Scroll modes. The user 900 may also act upon a selected window with other actions. By speaking the words “Close”, “Minimize”, or “Maximize” the selected window is respectively closed, minimized or maximized. By speaking the word “Raise”, the selected window is brought to the foreground, and by speaking “Send to Back”, the selected window is sent behind (to the background) all other open windows. By speaking “Scroll”, the user initiates a scrolling mode on the selected window. The user may control the rate of the scroll by the position of the hand. The hand shaped icon tracks the user's hand position, and the rate of the scrolling of the selected window is proportional to the distance between the current hand icon position and the position of the hand icon at the time the scrolling is initiated. Scrolling can be terminated by the user speaking “Release” or by the user moving their hand behind the engagement plane and out of the engagement volume. These are just a few examples of the voice recognition perceptual computer control capabilities of the disclosed architecture. It is to be appreciated that these voiced commands may also be programmed for execution in response to one or more object movements in accordance with the present invention.
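  • The proportional scrolling behavior described above might be computed as in the short sketch below; the gain constant and sign convention are illustrative assumptions only.

```python
def scroll_rate(icon_y: float, anchor_y: float, gain: float = 2.0) -> float:
    """Scroll rate proportional to the distance between the current hand-icon position
    and the icon position recorded when scrolling was initiated (positive = scroll down)."""
    return gain * (icon_y - anchor_y)

# Example: if the hand icon has moved 50 pixels below the point where "Scroll" was
# invoked, the window scrolls at 2.0 * 50 = 100 units per update.
```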
  • In accordance with another aspect of the invention, dwell time can be used as a modality to control windows in lieu of, or in addition to, verbal commands and other disclosed modalities. Dwell time is defined as the time, after having engaged the system, that the user holds their hand position stationary such that the system hand shaped icon remains over a particular window. For example, by dwelling on a selected window for a short period of time (e.g., two seconds), the system can bring the window to the foreground of all other open windows (i.e., a RAISE command). Similarly, by dwelling a short time longer (e.g., four seconds), the system will grab (or select for dragging) the window, and the user causes the selected window to move up or down within the display by moving a hand up or down (i.e., a MOVE command). Lateral motion is also similarly achieved. Additional control over GUI windows can be accomplished in a similar fashion by controlling the dwell time of the hand shaped icon over the open window.
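  • A minimal dwell-timer sketch follows; the two- and four-second thresholds mirror the examples above, while the window identifiers and the returned command names are hypothetical.

```python
import time
from typing import Optional

DWELL_RAISE_SECONDS = 2.0   # example value from the description
DWELL_MOVE_SECONDS = 4.0    # example value from the description

class DwellTimer:
    """Tracks how long the hand icon has remained over the same window and maps
    dwell time to a command."""
    def __init__(self) -> None:
        self.window_id: Optional[int] = None
        self.start: float = 0.0

    def update(self, window_id: Optional[int], now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        if window_id != self.window_id:
            self.window_id, self.start = window_id, now
            return None
        dwell = now - self.start
        if dwell >= DWELL_MOVE_SECONDS:
            return "MOVE"
        if dwell >= DWELL_RAISE_SECONDS:
            return "RAISE"
        return None
```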
  • In accordance with a more robust aspect of the invention, hand gestures are interpreted by hand motion or by pattern recognition. For example, the user can bring the window to the front (or foreground), on top of all other open windows, by moving a hand from a position closer to the display to a position farther from the display, the hand remaining in the engagement volume 910. Similarly, the user can cause the selected window to be grabbed and moved by bringing the fingers together with the thumb, and subsequently moving the hand. The selected window will move in relation to the user hand movement until the hand is opened up to release the selected window. Additional control over the selected window can be defined in response to particular hand movements or hand gestures. In accordance with another aspect of the present invention, the selected window will move in response to the user pointing their hand, thumb, or finger in a particular direction. For example, if the user points their index finger to the right, the window will move to the right within the display. Similarly, if the user points to the left, up, or down, the selected window will move to the left, up, or down within the display, respectively. Additional window controls can be achieved through the use of similar hand gestures or motions.
  • In accordance with another aspect of the invention, the system is configurable such that an individual user selects the particular hand gestures that they wish to associate with particular commands. The system provides default settings that map a given set of gestures to a given set of commands. This mapping, however, is configurable such that the specific command executed in response to each particular hand gesture is definable by each user. For example, one user may wish to point directly at the screen with their index finger to grab the selected window for movement, while another user may wish to bring their fingers together with their thumb to grab the selected window. Similarly, one user may wish to point a group of fingers up or down in order to move a selected window up or down, while another user may wish to open the palm of their hand toward the cameras and then move their opened hand up or down to move a selected window up or down. All given gestures and commands are configurable by the individual users to best suit that particular user's personal preferences.
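  • Such a user-configurable mapping might be represented as simply as the dictionary sketch below; the gesture labels and command names are invented for illustration and do not correspond to identifiers in the disclosure.

```python
from typing import Dict

DEFAULT_GESTURE_MAP: Dict[str, str] = {
    "point_index_right": "MOVE_RIGHT",
    "pinch_fingers_to_thumb": "GRAB_WINDOW",
    "open_palm_up": "MOVE_UP",
}

def user_gesture_map(overrides: Dict[str, str]) -> Dict[str, str]:
    """Start from the system defaults and apply per-user overrides, so each user can
    bind preferred gestures to commands."""
    mapping = dict(DEFAULT_GESTURE_MAP)
    mapping.update(overrides)
    return mapping

# Example: one user prefers pointing directly at the screen to grab a window.
custom = user_gesture_map({"point_index_at_screen": "GRAB_WINDOW"})
```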
  • Similarly, in accordance with another aspect of the present invention, the system may include a “Record and Define Gesture” mode. In the “Record and Define Gesture” mode, the system records hand gestures performed by the user. The recorded gestures are then stored in the system memory to be recognized during normal operation. The given hand gestures are then associated with a particular command to be performed by the system in response to that particular hand gesture. With such capability, a user may further tailor the system to their personal preference or, similarly, may tailor system operation to respond to specific commands most appropriate for particular applications.
  • In a similar fashion, the user can choose the particular words, from a given set, they wish to use for a particular command. For example, one user may choose to say “Release” to stop moving a window while another may wish to say “Quit”. This capability allows different users, which may prefer to use different words for a given command, the ability to tailor the system in a way most efficient for their personal use.
  • The present invention can be utilized in an expansive list of applications. The following discussion is exemplary of only a few applications with which the present invention may be utilized. One such application is associated with user control of a presentation, or similar type of briefing application, wherein the user makes a presentation on a projection type screen to a group of listeners.
  • Referring now to FIG. 9B, there is illustrated a multiple monitor implementation. Here, the system includes three monitors (or displays) through which the user 900 exercises control of GUI features; a first display 912, a second display 914, and a third display 916. The cameras (906 and 908) are similarly situated as in FIG. 9A, to define the engagement volume 910. By utilizing the “flick” or “bump” motion(s) as performed by a hand 918 of the user 900, the user 900 can move a window 920 from the first display 912 to the second display 914, and further from the second display 914 to the third display 916. The flick motion of the user hand 918 can effect movement of the window 920 from the first display 912 to the third display 916 in a single window movement, or in multiple steps through the displays (914 and 916) using corresponding multiple hand motions. Of course, control by the user 900 occurs only when the user hand 918 breaks the engagement plane 912, and is determined to be a control object (i.e., an object meeting parameters sufficient to effect control of the computer).
  • As mentioned hereinabove, the user 900 is located generally in front of the displays (912, 914, and 916), which is also within the imaging capabilities of the pair of cameras (906 and 908). The microphone 904 is suitably located to receive user voice signals. The cameras (906 and 908) are mounted substantially parallel to each other and on a horizontal plane above the displays (912, 914, and 916), and separated by a distance that provides optimum detection and tracking for the given cameras and the engagement volume 910.
  • In operation, the user 900 moves the hand 918 through the engagement plane 912 and into the engagement volume 910 established for the system. The system, which had detected and tracked the hand 918 before it entered the volume 910, begins providing feedback to the user 900 by displaying the hand shaped icon 922 on one of the displays (912, 914, and 916). The microphone 904 provides additional means for the user 900 to invoke one or more verbal commands in order to act upon the selected window 920 under the corresponding icon 922. The window 920 directly underneath the hand shaped icon is the selected window. When the user hand 918 enters the volume 910, it is recognized as a control object. The corresponding icon 922 is presented by the system on the computer display 912. By dwelling a predetermined amount of time, the associated window is assigned for control. The user 900 causes the selected window to move up or down within the display by invoking the ‘Move’ command as explained above and then moving the hand up or down, or to move across one or more of the monitors (914 and 916) by invoking the ‘Flick’ command and then using the flick hand motion. Of course, if the second display 914 was the initial point of control, the user 900 can cause the window 920 to be moved left to the first display 912, or right to the third display 916. Movement of the window is terminated (or “released”) when the user hand dwells for a time longer than a predetermined dwell time, or when the hand is moved out of the engagement volume 910.
  • Alternatively, the user may speak the word “Release” and the system will stop moving the selected window in response to the user's hand motion. Release may also be accomplished by dwelling a bit while in Move, and/or Scroll modes. The user may also act upon a selected window with other actions. By speaking the words “Close”, “Minimize”, or “Maximize” the selected window is respectively closed, minimized or maximized. By speaking the word “Raise”, the selected window is brought to the foreground, and by speaking “Send to Back”, the selected window is sent behind (to the background) all other open windows. By speaking “Scroll”, the user initiates a scrolling mode on the selected window. The user may control the rate of the scroll by the position of the hand. The hand shaped icon tracks the user's hand position, and the rate of the scrolling of the selected window is proportional to the distance between the current hand icon position and the position of the hand icon at the time the scrolling is initiated. Scrolling can be terminated by the user speaking “Release” or by the user moving their hand behind the engagement plane and out of the engagement volume. These are just a few examples of the voice recognition perceptual computer control capabilities of the disclosed architecture.
  • Referring now to FIG. 10, there is illustrated a briefing room environment where voice and/or gestures are utilized to control a screen projector via a computer system configured in accordance with an aspect of the present invention. The briefing room 1000 comprises a large briefing table 1002 surrounded on three sides by numerous chairs 1004, a computer 1006, a video projector 1008, and a projector screen 1010. Utilization of the present invention adds additional elements comprising the disclosed perceptual software 1012, two video cameras (1014 and 1016) and a microphone 1018. In this application, a user 1020 is positioned between the projector screen 1010 and briefing table 1002 at which the audience is seated. A top face 1022 of an engagement volume 1024 is defined by rectangular area 1026. Similarly, a front surface indicated at 1028 represents an engagement plane.
  • As the user gives the presentation, the user controls the content displayed on the projection screen 1010 and advancement of the slides (or presentation images) by moving their hand(s) through the engagement plane 1028 into the engagement volume 1024, and/or speaking commands recognizable by the system. Once inside the engagement volume 1024, a simple gesture is made to advance to the next slide, back-up to a previous slide, initiate an embedded video, or to effect one of a number of many other presentation capabilities.
  • A similar capability can be implemented for a home media center wherein the user can change selected video sources, change channels, control volume, advance chapter and other similar functions by moving their hand across an engagement plane into an engagement volume and subsequently performing the appropriate hand gesture. Additional applications include perceptual interfaces for TabletPCs, Media center PCs, kiosks, hand held computers, home appliances, video games, and wall sized displays, along with many others.
  • It is appreciated that in more robust implementations, instead of the engagement volume being fixed at a position associated with the location of the cameras, which requires the presenter to operate according to the location of the engagement volume, the system can be configured such that the engagement volume travels with the user (in a “roaming” mode) as the user moves about the room. Thus the cameras would be mounted on a platform that rotates such that the rotation maintains the cameras substantially equidistant from the presenter. The presenter may carry a sensor that allows the system to sense or track the general location of the presenter. The system would then effect rotation of the camera mount to “point” the cameras at the presenter. In response thereto, the engagement volume may be extended to the presenter allowing control of the computer system as the presenter moves about. The process of “extending” the engagement volume can include increasing the depth of the volume such that the engagement plane surface moves to the presenter, or maintaining the volume dimensions but moving the fixed volume to the presenter. This would require on-the-fly focal adjustment of the cameras to track not only quick movements in the depth of objects in the volume, but also the movement of the presenter.
  • Another method of triggering system attention in this mode would be to execute a predefined gesture that is not likely to be made unintentionally, e.g., raising a hand.
  • It is also appreciated that the system is configurable for individual preferences such that the engagement volume of a first user may be different from the volume of a second user. For example, in accordance with a user login, or other unique user information, the user preferences may be retrieved and implemented automatically by the system. This can include automatically elevating the mounted cameras for a taller person by using a telescoping camera stand so that the cameras are at the appropriate height for the particular user, whether sitting or standing. This also includes, but is not limited to, setting the system for the “roaming” mode.
  • Referring now to FIG. 11, there is illustrated a block diagram of a computer operable to execute the present invention. In order to provide additional context for various aspects of the present invention, FIG. 11 and the following discussion are intended to provide a brief, general description of a suitable computing environment 1100 in which the various aspects of the present invention may be implemented. While the invention has been described above in the general context of computer-executable instructions that may run on one or more computers, those skilled in the art will recognize that the invention also may be implemented in combination with other program modules and/or as a combination of hardware and software. Generally, program modules include routines, programs, components, data structures, etc., that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the inventive methods may be practiced with other computer system configurations, including single-processor or multiprocessor computer systems, minicomputers, mainframe computers, as well as personal computers, hand-held computing devices, microprocessor-based or programmable consumer electronics, and the like, each of which may be operatively coupled to one or more associated devices. The illustrated aspects of the invention may also be practiced in distributed computing environments where certain tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
  • With reference again to FIG. 11, the exemplary environment 1100 for implementing various aspects of the invention includes a computer 1102, the computer 1102 including a processing unit 1104, a system memory 1106, and a system bus 1108. The system bus 1108 couples system components including, but not limited to the system memory 1106 to the processing unit 1104. The processing unit 1104 may be any of various commercially available processors. Dual microprocessors and other multi-processor architectures also can be employed as the processing unit 1104.
  • The system bus 1108 can be any of several types of bus structure including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of commercially available bus architectures. The system memory 1106 includes read only memory (ROM) 1110 and random access memory (RAM) 1112. A basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within the computer 1102, such as during start-up, is stored in the ROM 1110.
  • The computer 1102 further includes a hard disk drive 1114, a magnetic disk drive 1116, (e.g., to read from or write to a removable disk 1118) and an optical disk drive 1120, (e.g., reading a CD-ROM disk 1122 or to read from or write to other optical media). The hard disk drive 1114, magnetic disk drive 1116 and optical disk drive 1120 can be connected to the system bus 1108 by a hard disk drive interface 1124, a magnetic disk drive interface 1126 and an optical drive interface 1128, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, computer-executable instructions, and so forth. For the computer 1102, the drives and media accommodate the storage of broadcast programming in a suitable digital format. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, it should be appreciated by those skilled in the art that other types of media which are readable by a computer, such as zip drives, magnetic cassettes, flash memory cards, digital video disks, cartridges, and the like, may also be used in the exemplary operating environment, and further that any such media may contain computer-executable instructions for performing the methods of the present invention.
  • A number of program modules can be stored in the drives and RAM 1112, including an operating system 1130, one or more application programs 1132, other program modules 1134 and program data 1136. It is appreciated that the present invention can be implemented with various commercially available operating systems or combinations of operating systems.
  • A user can enter commands and information into the computer 1102 through a keyboard 1138 and a pointing device, such as a mouse 1140. Other input devices (not shown) may include one or more video cameras, one or more microphones, an IR remote control, a joystick, a game pad, a satellite dish, a scanner, or the like. These and other input devices are often connected to the processing unit 1104 through a serial port interface 1142 that is coupled to the system bus 1108, but may be connected by other interfaces, such as a parallel port, a game port, a firewire port, a universal serial bus (“USB”), an IR interface, etc. A monitor 1144 or other type of display device is also connected to the system bus 1108 via an interface, such as a video adapter 1146. In addition to the monitor 1144, a computer typically includes other peripheral output devices (not shown), such as speakers, printers, etc.
  • The computer 1102 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer(s) 1148. The remote computer(s) 1148 may be a workstation, a server computer, a router, a personal computer, portable computer, microprocessor-based entertainment appliance, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer 1102, although, for purposes of brevity, only a memory storage device 1150 is illustrated. The logical connections depicted include a LAN 1152 and a WAN 1154. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
  • When used in a LAN networking environment, the computer 1102 is connected to the local network 1152 through a network interface or adapter 1156. When used in a WAN networking environment, the computer 1102 typically includes a modem 1158, or is connected to a communications server on the LAN, or has other means for establishing communications over the WAN 1154, such as the Internet. The modem 1158, which may be internal or external, is connected to the system bus 1108 via the serial port interface 1142. In a networked environment, program modules depicted relative to the computer 1102, or portions thereof, may be stored in the remote memory storage device 1150. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
  • Referring now to FIG. 12, there is illustrated a network implementation 1200 of the present invention. The implementation 1200 includes a first perceptual system 1202 and a second perceptual system 1204, both operational according to the disclosed invention. The first system 1202 includes cameras 1206 (also denoted C1 and C2) mounted on a rotational and telescoping camera mount 1208. A first user 1210 located generally in front of the first system 1202 effects control of a GUI content A of the first system 1202 in accordance with the novel aspects of the present invention by introducing hand gestures into an engagement volume 1211 and/or voice signals. The first user 1210 may rove about in front of the cameras 1206 in accordance with the “roaming” operational mode described previously, or may be seated in front of the cameras 1206. The second system 1204 includes cameras 1212 (also denoted C3 and C4) mounted on a rotational and telescoping camera mount 1214. A second user 1216 located generally in front of the second system 1204 effects control of a GUI content B of the second system 1204 in accordance with the novel aspects of the present invention by introducing hand gestures into an engagement volume 1217 and/or voice signals. The second user 1216 may rove about in front of the cameras 1212 in accordance with the “roaming” operational mode described previously, or may be seated in front of the cameras 1212.
  • The first and second systems (1202 and 1204) may be networked in a conventional wired or wireless network 1207 peer configuration (or in a bus configuration by using a hub 1215). This particular system 1200 is employed to present both content A and content B via a single large monitor or display 1218. Thus the monitor 1218 can be driven by either of the systems (1202 and 1204), as can be provided by conventional dual-output video graphics cards, or the separate video information may be transmitted to a third monitor control system 1220 to present the content. Such an implementation finds application where a side-by-side comparison of product features is being presented, or in other similar applications where two or more users may desire to interact. Thus content A and content B may be presented on a split-screen layout of the monitor 1218. Either or both of the users 1210 and 1216 may provide keyboard and/or mouse input to facilitate control according to the present invention.
  • What has been described above includes examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art may recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term “includes” is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.

Claims (18)

1. A method of determining a command, comprising:
capturing an image of an object with a camera;
determining a gesture based at least partly upon the image;
detecting an audio input; and
determining, at one or more processors, the command based at least partly upon the gesture and the audio input.
2. The method of claim 1, further comprising:
determining a depth of the object; and
determining the command based at least partly upon the depth of the object.
3. The method of claim 2, wherein determining the depth of the object includes capturing a second image of the object with a second camera.
4. The method of claim 1, wherein the camera is a video camera.
5. The method of claim 1, wherein the camera detects visible light.
6. The method of claim 1, wherein determining the gesture includes capturing a second image of the object with the camera and comparing the image with the second image.
7. A computer-readable medium having instructions that cause a processor to execute steps, the steps comprising:
capturing an image of an object with a camera;
determining a gesture based at least partly upon the image;
detecting an audio input; and
determining, at one or more processors, a command based at least partly upon the gesture and the audio input.
8. The computer-readable medium of claim 7, the steps further comprising:
determining a depth of the object; and
determining the command based at least partly upon the depth of the object.
9. The computer-readable medium of claim 8, wherein determining the depth of the object includes capturing a second image of the object with a second camera.
10. The computer-readable medium of claim 7, wherein the camera is a video camera.
11. The computer-readable medium of claim 7, wherein the camera detects visible light.
12. The computer-readable medium of claim 7, wherein determining the gesture includes capturing a second image of the object with the camera and comparing the image with the second image.
13. A command determining system, comprising:
a camera configured to capture an image of an object;
a first determiner configured to determine a gesture based at least partly upon the image;
an audio detection unit configured to detect an audio input; and
a second determiner configured to determine the command based at least partly upon the gesture and the audio input.
14. The command determining system of claim 13, further comprising:
a third determiner configured to determine a depth of the object, wherein the second determiner is further configured to determine the command based at least partly upon the depth of the object.
15. The command determining system of claim 14, wherein determining the depth of the object includes capturing a second image of the object with a second camera.
16. The command determining system of claim 13, wherein the camera is a video camera.
17. The command determining system of claim 13, wherein the camera detects visible light.
18. The command determining system of claim 13, wherein determining the gesture includes capturing a second image of the object with the camera and comparing the image with the second image.
US12/494,303 2003-03-25 2009-06-30 Architecture for controlling a computer using hand gestures Abandoned US20090268945A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/494,303 US20090268945A1 (en) 2003-03-25 2009-06-30 Architecture for controlling a computer using hand gestures

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/396,653 US7665041B2 (en) 2003-03-25 2003-03-25 Architecture for controlling a computer using hand gestures
US12/494,303 US20090268945A1 (en) 2003-03-25 2009-06-30 Architecture for controlling a computer using hand gestures

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/396,653 Continuation US7665041B2 (en) 2003-03-25 2003-03-25 Architecture for controlling a computer using hand gestures

Publications (1)

Publication Number Publication Date
US20090268945A1 true US20090268945A1 (en) 2009-10-29

Family

ID=32988815

Family Applications (4)

Application Number Title Priority Date Filing Date
US10/396,653 Expired - Fee Related US7665041B2 (en) 2003-03-25 2003-03-25 Architecture for controlling a computer using hand gestures
US12/494,303 Abandoned US20090268945A1 (en) 2003-03-25 2009-06-30 Architecture for controlling a computer using hand gestures
US12/705,113 Active 2024-12-14 US9652042B2 (en) 2003-03-25 2010-02-12 Architecture for controlling a computer using hand gestures
US12/705,014 Abandoned US20100146464A1 (en) 2003-03-25 2010-02-12 Architecture For Controlling A Computer Using Hand Gestures

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/396,653 Expired - Fee Related US7665041B2 (en) 2003-03-25 2003-03-25 Architecture for controlling a computer using hand gestures

Family Applications After (2)

Application Number Title Priority Date Filing Date
US12/705,113 Active 2024-12-14 US9652042B2 (en) 2003-03-25 2010-02-12 Architecture for controlling a computer using hand gestures
US12/705,014 Abandoned US20100146464A1 (en) 2003-03-25 2010-02-12 Architecture For Controlling A Computer Using Hand Gestures

Country Status (1)

Country Link
US (4) US7665041B2 (en)

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040193413A1 (en) * 2003-03-25 2004-09-30 Wilson Andrew D. Architecture for controlling a computer using hand gestures
US20080141181A1 (en) * 2006-12-07 2008-06-12 Kabushiki Kaisha Toshiba Information processing apparatus, information processing method, and program
US20110182471A1 (en) * 2009-11-30 2011-07-28 Abisee, Inc. Handling information flow in printed text processing
US20110221770A1 (en) * 2010-03-15 2011-09-15 Ezekiel Kruglick Selective motor control classification
US20120102400A1 (en) * 2010-10-22 2012-04-26 Microsoft Corporation Touch Gesture Notification Dismissal Techniques
US20120121123A1 (en) * 2010-11-11 2012-05-17 Hsieh Chang-Tai Interactive device and method thereof
US20120139907A1 (en) * 2010-12-06 2012-06-07 Samsung Electronics Co., Ltd. 3 dimensional (3d) display system of responding to user motion and user interface for the 3d display system
US20130010207A1 (en) * 2011-07-04 2013-01-10 3Divi Gesture based interactive control of electronic equipment
US8396252B2 (en) 2010-05-20 2013-03-12 Edge 3 Technologies Systems and related methods for three dimensional gesture recognition in vehicles
US20130121528A1 (en) * 2011-11-14 2013-05-16 Sony Corporation Information presentation device, information presentation method, information presentation system, information registration device, information registration method, information registration system, and program
US8457353B2 (en) 2010-05-18 2013-06-04 Microsoft Corporation Gestures and gesture modifiers for manipulating a user-interface
US8467599B2 (en) 2010-09-02 2013-06-18 Edge 3 Technologies, Inc. Method and apparatus for confusion learning
US8582866B2 (en) 2011-02-10 2013-11-12 Edge 3 Technologies, Inc. Method and apparatus for disparity computation in stereo images
US8655093B2 (en) 2010-09-02 2014-02-18 Edge 3 Technologies, Inc. Method and apparatus for performing segmentation of an image
US8666144B2 (en) 2010-09-02 2014-03-04 Edge 3 Technologies, Inc. Method and apparatus for determining disparity of texture
US8705877B1 (en) 2011-11-11 2014-04-22 Edge 3 Technologies, Inc. Method and apparatus for fast computational stereo
US8730164B2 (en) 2010-05-28 2014-05-20 Panasonic Corporation Gesture recognition apparatus and method of gesture recognition
US8811938B2 (en) 2011-12-16 2014-08-19 Microsoft Corporation Providing a user interface experience based on inferred vehicle state
US20140280890A1 (en) * 2013-03-15 2014-09-18 Yahoo! Inc. Method and system for measuring user engagement using scroll dwell time
WO2014185808A1 (en) * 2013-05-13 2014-11-20 3Divi Company System and method for controlling multiple electronic devices
US8970589B2 (en) 2011-02-10 2015-03-03 Edge 3 Technologies, Inc. Near-touch interaction with a stereo camera grid structured tessellations
US9123316B2 (en) * 2010-12-27 2015-09-01 Microsoft Technology Licensing, Llc Interactive content creation
US9170666B2 (en) 2010-02-25 2015-10-27 Hewlett-Packard Development Company, L.P. Representative image
US9393695B2 (en) 2013-02-27 2016-07-19 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with person and object discrimination
US9417700B2 (en) 2009-05-21 2016-08-16 Edge3 Technologies Gesture recognition systems and related methods
US9465980B2 (en) 2009-01-30 2016-10-11 Microsoft Technology Licensing, Llc Pose tracking pipeline
US9498885B2 (en) 2013-02-27 2016-11-22 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with confidence-based decision support
US9798302B2 (en) 2013-02-27 2017-10-24 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with redundant system input support
US9804576B2 (en) 2013-02-27 2017-10-31 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with position and derivative decision reference
CN109669553A (en) * 2011-02-28 2019-04-23 Improvements in or relating to optical navigation devices
US10491694B2 (en) 2013-03-15 2019-11-26 Oath Inc. Method and system for measuring user engagement using click/skip in content stream using a probability model
US10721448B2 (en) 2013-03-15 2020-07-21 Edge 3 Technologies, Inc. Method and apparatus for adaptive exposure bracketing, segmentation and scene organization
US10963062B2 (en) 2011-11-23 2021-03-30 Intel Corporation Gesture input with multiple views, displays and physics
US11322171B1 (en) 2007-12-17 2022-05-03 Wai Wu Parallel signal processing system and method
US11331006B2 (en) 2019-03-05 2022-05-17 Physmodo, Inc. System and method for human motion detection and tracking
US11497961B2 (en) 2019-03-05 2022-11-15 Physmodo, Inc. System and method for human motion detection and tracking

Families Citing this family (367)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9722766D0 (en) 1997-10-28 1997-12-24 British Telecomm Portable computers
US9239673B2 (en) 1998-01-26 2016-01-19 Apple Inc. Gesturing with a multipoint sensing device
US8479122B2 (en) 2004-07-30 2013-07-02 Apple Inc. Gestures for touch sensitive input devices
US7840912B2 (en) * 2006-01-30 2010-11-23 Apple Inc. Multi-touch gesture dictionary
US9292111B2 (en) 1998-01-26 2016-03-22 Apple Inc. Gesturing with a multipoint sensing device
US20070177804A1 (en) * 2006-01-30 2007-08-02 Apple Computer, Inc. Multi-touch gesture dictionary
US7614008B2 (en) 2004-07-30 2009-11-03 Apple Inc. Operation of a computer with touch screen interface
JP4052498B2 (en) 1999-10-29 2008-02-27 株式会社リコー Coordinate input apparatus and method
JP2001184161A (en) 1999-12-27 2001-07-06 Ricoh Co Ltd Method and device for inputting information, writing input device, method for managing written data, method for controlling display, portable electronic writing device, and recording medium
US6803906B1 (en) 2000-07-05 2004-10-12 Smart Technologies, Inc. Passive touch system and method of detecting user input
ES2340945T3 (en) * 2000-07-05 2010-06-11 Smart Technologies Ulc PROCEDURE FOR A CAMERA BASED TOUCH SYSTEM.
US8035612B2 (en) * 2002-05-28 2011-10-11 Intellectual Ventures Holding 67 Llc Self-contained interactive video display system
US20050134578A1 (en) * 2001-07-13 2005-06-23 Universal Electronics Inc. System and methods for interacting with a control environment
US6990639B2 (en) * 2002-02-07 2006-01-24 Microsoft Corporation System and process for controlling electronic components in a ubiquitous computing environment using multimodal integration
US20050122308A1 (en) * 2002-05-28 2005-06-09 Matthew Bell Self-contained interactive video display system
US7710391B2 (en) * 2002-05-28 2010-05-04 Matthew Bell Processing an image utilizing a spatially varying pattern
US20040001144A1 (en) 2002-06-27 2004-01-01 Mccharles Randy Synchronization of camera images in camera-based touch system to enhance position determination of fast moving objects
US7161579B2 (en) 2002-07-18 2007-01-09 Sony Computer Entertainment Inc. Hand-held computer interactive device
US8570378B2 (en) 2002-07-27 2013-10-29 Sony Computer Entertainment Inc. Method and apparatus for tracking three-dimensional movements of an object using a depth sensing camera
US8313380B2 (en) 2002-07-27 2012-11-20 Sony Computer Entertainment America Llc Scheme for translating movements of a hand-held controller into inputs for a system
US8686939B2 (en) 2002-07-27 2014-04-01 Sony Computer Entertainment Inc. System, method, and apparatus for three-dimensional input control
US7760248B2 (en) 2002-07-27 2010-07-20 Sony Computer Entertainment Inc. Selective sound source listening in conjunction with computer interactive processing
US9474968B2 (en) 2002-07-27 2016-10-25 Sony Interactive Entertainment America Llc Method and system for applying gearing effects to visual tracking
US9393487B2 (en) 2002-07-27 2016-07-19 Sony Interactive Entertainment Inc. Method for mapping movements of a hand-held controller to game commands
US9682319B2 (en) 2002-07-31 2017-06-20 Sony Interactive Entertainment Inc. Combiner method for altering game gearing
US7815507B2 (en) * 2004-06-18 2010-10-19 Igt Game machine user interface using a non-contact eye motion recognition device
US8460103B2 (en) 2004-06-18 2013-06-11 Igt Gesture controlled casino gaming system
US6954197B2 (en) 2002-11-15 2005-10-11 Smart Technologies Inc. Size/scale and orientation determination of a pointer in a camera-based touch system
WO2004055776A1 (en) * 2002-12-13 2004-07-01 Reactrix Systems Interactive directed light/sound system
US9177387B2 (en) 2003-02-11 2015-11-03 Sony Computer Entertainment Inc. Method and apparatus for real time motion capture
US8456447B2 (en) 2003-02-14 2013-06-04 Next Holdings Limited Touch screen signal processing
US8508508B2 (en) 2003-02-14 2013-08-13 Next Holdings Limited Touch screen signal processing with single-point calibration
US7629967B2 (en) 2003-02-14 2009-12-08 Next Holdings Limited Touch screen signal processing
US7532206B2 (en) 2003-03-11 2009-05-12 Smart Technologies Ulc System and method for differentiating between pointers used to contact touch surface
US7665041B2 (en) * 2003-03-25 2010-02-16 Microsoft Corporation Architecture for controlling a computer using hand gestures
US7256772B2 (en) 2003-04-08 2007-08-14 Smart Technologies, Inc. Auto-aligning touch system and method
US9213365B2 (en) 2010-10-01 2015-12-15 Z124 Method and system for viewing stacked screen displays using gestures
US9207717B2 (en) 2010-10-01 2015-12-08 Z124 Dragging an application to a screen using the application manager
US9182937B2 (en) 2010-10-01 2015-11-10 Z124 Desktop reveal by moving a logical display stack with gestures
US7372977B2 (en) * 2003-05-29 2008-05-13 Honda Motor Co., Ltd. Visual tracking using depth data
US8072470B2 (en) * 2003-05-29 2011-12-06 Sony Computer Entertainment Inc. System and method for providing a real-time three-dimensional interactive environment
US7620202B2 (en) * 2003-06-12 2009-11-17 Honda Motor Co., Ltd. Target orientation estimation using depth sensing
US7038661B2 (en) * 2003-06-13 2006-05-02 Microsoft Corporation Pointing device and cursor for use in intelligent computing environments
JP3752246B2 (en) * 2003-08-11 2006-03-08 学校法人慶應義塾 Hand pattern switch device
US7411575B2 (en) * 2003-09-16 2008-08-12 Smart Technologies Ulc Gesture recognition method and touch system incorporating the same
US7274356B2 (en) 2003-10-09 2007-09-25 Smart Technologies Inc. Apparatus for determining the location of a pointer within a region of interest
US7478171B2 (en) * 2003-10-20 2009-01-13 International Business Machines Corporation Systems and methods for providing dialog localization in a distributed environment and enabling conversational communication using generalized user gestures
EP1676442A2 (en) * 2003-10-24 2006-07-05 Reactrix Systems, Inc. Method and system for managing an interactive video display system
US7831087B2 (en) * 2003-10-31 2010-11-09 Hewlett-Packard Development Company, L.P. Method for visual-based recognition of an object
US7848850B2 (en) * 2003-11-13 2010-12-07 Japan Science And Technology Agency Method for driving robot
US7496385B2 (en) * 2003-12-29 2009-02-24 International Business Machines Corporation Method for viewing information underlying lists and other contexts
US7895537B2 (en) * 2003-12-29 2011-02-22 International Business Machines Corporation Method and apparatus for setting attributes and initiating actions through gestures
US7355593B2 (en) 2004-01-02 2008-04-08 Smart Technologies, Inc. Pointer tracking across multiple overlapping coordinate input sub-regions defining a generally contiguous input region
FI117308B (en) * 2004-02-06 2006-08-31 Nokia Corp Gesture control
US7232986B2 (en) * 2004-02-17 2007-06-19 Smart Technologies Inc. Apparatus for detecting a pointer within a region of interest
JP2005242694A (en) * 2004-02-26 2005-09-08 Mitsubishi Fuso Truck & Bus Corp Hand pattern switching apparatus
US8094927B2 (en) * 2004-02-27 2012-01-10 Eastman Kodak Company Stereoscopic display system with flexible rendering of disparity map according to the stereoscopic fusing capability of the observer
US20050227217A1 (en) * 2004-03-31 2005-10-13 Wilson Andrew D Template matching on interactive surface
US7460110B2 (en) 2004-04-29 2008-12-02 Smart Technologies Ulc Dual mode touch system
US7538759B2 (en) 2004-05-07 2009-05-26 Next Holdings Limited Touch panel display system with illumination and detection provided from a single edge
US7308112B2 (en) * 2004-05-14 2007-12-11 Honda Motor Co., Ltd. Sign based human-machine interaction
US8120596B2 (en) 2004-05-21 2012-02-21 Smart Technologies Ulc Tiled touch system
US7787706B2 (en) * 2004-06-14 2010-08-31 Microsoft Corporation Method for controlling an intensity of an infrared source used to detect objects adjacent to an interactive display surface
US7593593B2 (en) 2004-06-16 2009-09-22 Microsoft Corporation Method and system for reducing effects of undesired signals in an infrared imaging system
US8684839B2 (en) 2004-06-18 2014-04-01 Igt Control of wager-based game using gesture recognition
JP4572615B2 (en) * 2004-07-27 2010-11-04 ソニー株式会社 Information processing apparatus and method, recording medium, and program
US8381135B2 (en) 2004-07-30 2013-02-19 Apple Inc. Proximity detector in handheld device
US8560972B2 (en) 2004-08-10 2013-10-15 Microsoft Corporation Surface UI for gesture-based interaction
US7942744B2 (en) 2004-08-19 2011-05-17 Igt Virtual input system
US8547401B2 (en) 2004-08-19 2013-10-01 Sony Computer Entertainment Inc. Portable augmented reality device and method
US7761814B2 (en) * 2004-09-13 2010-07-20 Microsoft Corporation Flick gesture
US20060072009A1 (en) * 2004-10-01 2006-04-06 International Business Machines Corporation Flexible interaction-based computer interfacing using visible artifacts
EP1849123A2 (en) 2005-01-07 2007-10-31 GestureTek, Inc. Optical flow based tilt sensor
JP2008537190A (en) * 2005-01-07 2008-09-11 ジェスチャー テック,インコーポレイテッド Generation of three-dimensional image of object by irradiating with infrared pattern
US7864159B2 (en) 2005-01-12 2011-01-04 Thinkoptics, Inc. Handheld vision based absolute pointing system
JP4689684B2 (en) * 2005-01-21 2011-05-25 ジェスチャー テック,インコーポレイテッド Tracking based on movement
JP5631535B2 (en) 2005-02-08 2014-11-26 オブロング・インダストリーズ・インコーポレーテッド System and method for a gesture-based control system
CN101133385B (en) * 2005-03-04 2014-05-07 苹果公司 Hand held electronic device, hand held device and operation method thereof
KR100687737B1 (en) * 2005-03-19 2007-02-27 한국전자통신연구원 Apparatus and method for a virtual mouse based on two-hands gesture
US20060215042A1 (en) * 2005-03-24 2006-09-28 Motorola, Inc. Image processing method and apparatus with provision of status information to a user
US9128519B1 (en) 2005-04-15 2015-09-08 Intellectual Ventures Holding 67 Llc Method and system for state-based control of objects
KR101430761B1 (en) 2005-05-17 2014-08-19 퀄컴 인코포레이티드 Orientation-sensitive signal output
US8081822B1 (en) 2005-05-31 2011-12-20 Intellectual Ventures Holding 67 Llc System and method for sensing a feature of an object in an interactive video display
US7362738B2 (en) * 2005-08-09 2008-04-22 Deere & Company Method and system for delivering information to a user
US7911444B2 (en) 2005-08-31 2011-03-22 Microsoft Corporation Input method for surface of interactive display
US20070055938A1 (en) * 2005-09-07 2007-03-08 Avaya Technology Corp. Server-based method for providing internet content to users with disabilities
US7697827B2 (en) 2005-10-17 2010-04-13 Konicek Jeffrey C User-friendlier interfaces for a camera
US8209620B2 (en) * 2006-01-31 2012-06-26 Accenture Global Services Limited System for storage and navigation of application states and interactions
US7599520B2 (en) * 2005-11-18 2009-10-06 Accenture Global Services Gmbh Detection of multiple targets on a plane of interest
US8098277B1 (en) 2005-12-02 2012-01-17 Intellectual Ventures Holding 67 Llc Systems and methods for communication between a reactive video system and a mobile communication device
US8549442B2 (en) * 2005-12-12 2013-10-01 Sony Computer Entertainment Inc. Voice and video control of interactive electronically simulated environment
US8060840B2 (en) 2005-12-29 2011-11-15 Microsoft Corporation Orientation free user interface
US8370383B2 (en) 2006-02-08 2013-02-05 Oblong Industries, Inc. Multi-process interactive systems and methods
US9823747B2 (en) 2006-02-08 2017-11-21 Oblong Industries, Inc. Spatial, multi-modal control device for use with spatial operating system
US9910497B2 (en) 2006-02-08 2018-03-06 Oblong Industries, Inc. Gestural control of autonomous and semi-autonomous systems
US8531396B2 (en) 2006-02-08 2013-09-10 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US8537111B2 (en) 2006-02-08 2013-09-17 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
DE102006037154A1 (en) * 2006-03-27 2007-10-18 Volkswagen Ag Navigation device and method for operating a navigation device
US9274807B2 (en) 2006-04-20 2016-03-01 Qualcomm Incorporated Selective hibernation of activities in an electronic device
US8683362B2 (en) * 2008-05-23 2014-03-25 Qualcomm Incorporated Card metaphor for activities in a computing device
US8296684B2 (en) 2008-05-23 2012-10-23 Hewlett-Packard Development Company, L.P. Navigating among activities in a computing device
US20070265075A1 (en) * 2006-05-10 2007-11-15 Sony Computer Entertainment America Inc. Attachable structure for use with hand-held controller having tracking ability
US7907117B2 (en) * 2006-08-08 2011-03-15 Microsoft Corporation Virtual controller for visual displays
TWI317895B (en) * 2006-08-14 2009-12-01 Lee Chia Hoang A device for controlling a software object and the method for the same
JP4267648B2 (en) * 2006-08-25 2009-05-27 株式会社東芝 Interface device and method thereof
US9317124B2 (en) * 2006-09-28 2016-04-19 Nokia Technologies Oy Command input by hand gestures captured from camera
US8781151B2 (en) 2006-09-28 2014-07-15 Sony Computer Entertainment Inc. Object detection using video input combined with tilt angle information
US8310656B2 (en) 2006-09-28 2012-11-13 Sony Computer Entertainment America Llc Mapping movements of a hand-held controller to the two-dimensional image plane of a display screen
USRE48417E1 (en) 2006-09-28 2021-02-02 Sony Interactive Entertainment Inc. Object direction using video input combined with tilt angle information
US8904312B2 (en) * 2006-11-09 2014-12-02 Navisense Method and device for touchless signing and recognition
US9442607B2 (en) 2006-12-04 2016-09-13 Smart Technologies Inc. Interactive input system and method
KR101304461B1 (en) * 2006-12-04 2013-09-04 삼성전자주식회사 Method and apparatus of gesture-based user interface
CN101636745A (en) 2006-12-29 2010-01-27 格斯图尔泰克股份有限公司 Manipulation of virtual objects using enhanced interactive system
US9311528B2 (en) * 2007-01-03 2016-04-12 Apple Inc. Gesture learning
US7877707B2 (en) * 2007-01-06 2011-01-25 Apple Inc. Detecting and interpreting real-world and security gestures on touch and hover sensitive devices
US7844915B2 (en) 2007-01-07 2010-11-30 Apple Inc. Application programming interfaces for scrolling operations
US8212857B2 (en) * 2007-01-26 2012-07-03 Microsoft Corporation Alternating light sources to reduce specular reflection
GB0703974D0 (en) * 2007-03-01 2007-04-11 Sony Comp Entertainment Europe Entertainment device
WO2008112519A1 (en) * 2007-03-12 2008-09-18 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Fingertip visual haptic sensor controller
US20080252596A1 (en) * 2007-04-10 2008-10-16 Matthew Bell Display Using a Three-Dimensional vision System
US8115753B2 (en) 2007-04-11 2012-02-14 Next Holdings Limited Touch screen system with hover and click input methods
KR101420419B1 (en) * 2007-04-20 2014-07-30 엘지전자 주식회사 Electronic Device And Method Of Editing Data Using the Same And Mobile Communication Terminal
WO2008134452A2 (en) 2007-04-24 2008-11-06 Oblong Industries, Inc. Proteins, pools, and slawx in processing environments
EP2153377A4 (en) * 2007-05-04 2017-05-31 Qualcomm Incorporated Camera-based user input for compact devices
US9176598B2 (en) * 2007-05-08 2015-11-03 Thinkoptics, Inc. Free-space multi-dimensional absolute pointer with improved performance
TWI355615B (en) * 2007-05-11 2012-01-01 Ind Tech Res Inst Moving object detection apparatus and method by us
DE112008001396B4 (en) * 2007-06-05 2015-12-31 Mitsubishi Electric Corp. Vehicle operating device
US8094137B2 (en) 2007-07-23 2012-01-10 Smart Technologies Ulc System and method of detecting contact on a display
US8726194B2 (en) * 2007-07-27 2014-05-13 Qualcomm Incorporated Item selection using enhanced control
JP5091591B2 (en) * 2007-08-30 2012-12-05 株式会社東芝 Information processing apparatus, program, and information processing method
KR20100075460A (en) 2007-08-30 2010-07-02 넥스트 홀딩스 인코포레이티드 Low profile touch panel systems
US8432377B2 (en) 2007-08-30 2013-04-30 Next Holdings Limited Optical touchscreen with improved illumination
US20090058820A1 (en) * 2007-09-04 2009-03-05 Microsoft Corporation Flick-based in situ search from ink, text, or an empty selection region
JP5430572B2 (en) 2007-09-14 2014-03-05 インテレクチュアル ベンチャーズ ホールディング 67 エルエルシー Gesture-based user interaction processing
US9031843B2 (en) * 2007-09-28 2015-05-12 Google Technology Holdings LLC Method and apparatus for enabling multimodal tags in a communication device by discarding redundant information in the tags training signals
US8201108B2 (en) * 2007-10-01 2012-06-12 Vsee Lab, Llc Automatic communication notification and answering method in communication correspondence
US20090100383A1 (en) * 2007-10-16 2009-04-16 Microsoft Corporation Predictive gesturing in graphical user interface
US8005263B2 (en) * 2007-10-26 2011-08-23 Honda Motor Co., Ltd. Hand sign recognition using label assignment
US20090109036A1 (en) * 2007-10-29 2009-04-30 The Boeing Company System and Method for Alternative Communication
US10146320B2 (en) 2007-10-29 2018-12-04 The Boeing Company Aircraft having gesture-based control for an onboard passenger service unit
US8159682B2 (en) 2007-11-12 2012-04-17 Intellectual Ventures Holding 67 Llc Lens system
US9171454B2 (en) * 2007-11-14 2015-10-27 Microsoft Technology Licensing, Llc Magic wand
TW200935272A (en) * 2007-12-03 2009-08-16 Tse-Hsien Yeh Sensing apparatus and operating method thereof
US8542907B2 (en) 2007-12-17 2013-09-24 Sony Computer Entertainment America Llc Dynamic three-dimensional object mapping for user-defined control device
JP2009146333A (en) * 2007-12-18 2009-07-02 Panasonic Corp Spatial input operation display apparatus
US20090172606A1 (en) * 2007-12-31 2009-07-02 Motorola, Inc. Method and apparatus for two-handed computer user interface with gesture recognition
US8413075B2 (en) * 2008-01-04 2013-04-02 Apple Inc. Gesture movies
US8405636B2 (en) 2008-01-07 2013-03-26 Next Holdings Limited Optical position sensing system and optical position sensor assembly
US20100039500A1 (en) * 2008-02-15 2010-02-18 Matthew Bell Self-Contained 3D Vision System Utilizing Stereo Camera and Patterned Illuminator
US9772689B2 (en) * 2008-03-04 2017-09-26 Qualcomm Incorporated Enhanced gesture-based image manipulation
US8259163B2 (en) * 2008-03-07 2012-09-04 Intellectual Ventures Holding 67 Llc Display with built in 3D sensing
US20090254855A1 (en) * 2008-04-08 2009-10-08 Sony Ericsson Mobile Communications, Ab Communication terminals with superimposed user interface
US9639531B2 (en) * 2008-04-09 2017-05-02 The Nielsen Company (Us), Llc Methods and apparatus to play and control playing of media in a web page
WO2009128064A2 (en) * 2008-04-14 2009-10-22 Pointgrab Ltd. Vision based pointing device emulation
JP2009265709A (en) * 2008-04-22 2009-11-12 Hitachi Ltd Input device
US9952673B2 (en) 2009-04-02 2018-04-24 Oblong Industries, Inc. Operating environment comprising multiple client devices, multiple displays, multiple users, and gestural control
US9684380B2 (en) * 2009-04-02 2017-06-20 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
US9495013B2 (en) 2008-04-24 2016-11-15 Oblong Industries, Inc. Multi-modal gestural interface
US9740293B2 (en) 2009-04-02 2017-08-22 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
US9740922B2 (en) 2008-04-24 2017-08-22 Oblong Industries, Inc. Adaptive tracking system for spatial input devices
US10642364B2 (en) 2009-04-02 2020-05-05 Oblong Industries, Inc. Processing tracking and recognition data in gestural recognition systems
US8723795B2 (en) 2008-04-24 2014-05-13 Oblong Industries, Inc. Detecting, representing, and interpreting three-space input: gestural continuum subsuming freespace, proximal, and surface-contact modes
US8902193B2 (en) 2008-05-09 2014-12-02 Smart Technologies Ulc Interactive input system and bezel therefor
US8595218B2 (en) * 2008-06-12 2013-11-26 Intellectual Ventures Holding 67 Llc Interactive display management systems and methods
US20110169730A1 (en) * 2008-06-13 2011-07-14 Pioneer Corporation Sight line input user interface unit, user interface method, user interface program, and recording medium with user interface program recorded
KR20100003913A (en) * 2008-07-02 2010-01-12 삼성전자주식회사 Method and apparatus for communication using 3-dimensional image display
US20110115702A1 (en) * 2008-07-08 2011-05-19 David Seaberg Process for Providing and Editing Instructions, Data, Data Structures, and Algorithms in a Computer System
US20100031202A1 (en) * 2008-08-04 2010-02-04 Microsoft Corporation User-defined gesture set for surface computing
US8847739B2 (en) * 2008-08-04 2014-09-30 Microsoft Corporation Fusing RFID and vision for surface object tracking
US8463053B1 (en) 2008-08-08 2013-06-11 The Research Foundation Of State University Of New York Enhanced max margin learning on multimodal data mining in a multimedia database
JP4720874B2 (en) * 2008-08-14 2011-07-13 ソニー株式会社 Information processing apparatus, information processing method, and information processing program
US8755515B1 (en) 2008-09-29 2014-06-17 Wai Wu Parallel signal processing system and method
US9250797B2 (en) * 2008-09-30 2016-02-02 Verizon Patent And Licensing Inc. Touch gesture interface apparatuses, systems, and methods
EP2350792B1 (en) * 2008-10-10 2016-06-22 Qualcomm Incorporated Single camera tracker
US20100105479A1 (en) 2008-10-23 2010-04-29 Microsoft Corporation Determining orientation in an external reference frame
US8516397B2 (en) * 2008-10-27 2013-08-20 Verizon Patent And Licensing Inc. Proximity interface apparatuses, systems, and methods
US8339378B2 (en) 2008-11-05 2012-12-25 Smart Technologies Ulc Interactive input system with multi-angle reflector
JP5317630B2 (en) * 2008-11-07 2013-10-16 キヤノン株式会社 Image distribution apparatus, method and program
US20100162181A1 (en) * 2008-12-22 2010-06-24 Palm, Inc. Interpreting Gesture Input Including Introduction Or Removal Of A Point Of Contact While A Gesture Is In Progress
JP2010152761A (en) * 2008-12-25 2010-07-08 Sony Corp Input apparatus, control apparatus, control system, electronic apparatus, and control method
US8291348B2 (en) * 2008-12-31 2012-10-16 Hewlett-Packard Development Company, L.P. Computing device and method for selecting display regions responsive to non-discrete directional input actions and intelligent content analysis
US9250788B2 (en) * 2009-03-18 2016-02-02 IdentifyMine, Inc. Gesture handlers of a gesture engine
US9317128B2 (en) * 2009-04-02 2016-04-19 Oblong Industries, Inc. Remote devices used in a markerless installation of a spatial operating environment incorporating gestural control
US10824238B2 (en) 2009-04-02 2020-11-03 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
DE102009017772A1 (en) * 2009-04-16 2010-11-04 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method and system for recognizing an object, and method and system for generating a marking in a screen display by means of a contactless gesture-controlled screen pointer
JP5256109B2 (en) * 2009-04-23 2013-08-07 株式会社日立製作所 Display device
US9498718B2 (en) * 2009-05-01 2016-11-22 Microsoft Technology Licensing, Llc Altering a view perspective within a display environment
US8649554B2 (en) * 2009-05-01 2014-02-11 Microsoft Corporation Method to control perspective for a camera-controlled computer
US20100295782A1 (en) * 2009-05-21 2010-11-25 Yehuda Binder System and method for control based on face or hand gesture detection
US20100306716A1 (en) * 2009-05-29 2010-12-02 Microsoft Corporation Extending standard gestures
US9778921B2 (en) * 2009-06-02 2017-10-03 Apple Inc. Method for creating, exporting, sharing, and installing graphics functional blocks
US8692768B2 (en) 2009-07-10 2014-04-08 Smart Technologies Ulc Interactive input system
WO2011007204A1 (en) * 2009-07-16 2011-01-20 Ondo Inc. Control method of a graphic interface
US9933852B2 (en) 2009-10-14 2018-04-03 Oblong Industries, Inc. Multi-process interactive systems and methods
US9971807B2 (en) 2009-10-14 2018-05-15 Oblong Industries, Inc. Multi-process interactive systems and methods
US20110099476A1 (en) * 2009-10-23 2011-04-28 Microsoft Corporation Decorating a display environment
US9244533B2 (en) * 2009-12-17 2016-01-26 Microsoft Technology Licensing, Llc Camera navigation for presentations
CN102713798B (en) * 2010-01-15 2016-01-13 韩国电子通信研究院 Sight treating apparatus and method
US20110179376A1 (en) * 2010-01-21 2011-07-21 Sony Corporation Three or higher dimensional graphical user interface for tv menu and document navigation
US8499257B2 (en) 2010-02-09 2013-07-30 Microsoft Corporation Handles interactions for human-computer interface
US8928579B2 (en) * 2010-02-22 2015-01-06 Andrew David Wilson Interacting with an omni-directionally projected display
EP2369443B1 (en) * 2010-03-25 2017-01-11 BlackBerry Limited System and method for gesture detection and feedback
US8818027B2 (en) * 2010-04-01 2014-08-26 Qualcomm Incorporated Computing device interface
US8351651B2 (en) * 2010-04-26 2013-01-08 Microsoft Corporation Hand-location post-process refinement in a tracking system
FR2960076B1 (en) * 2010-05-12 2012-06-15 Pi Corporate METHOD AND SYSTEM FOR NON-CONTACT ACQUISITION OF MOVEMENTS OF AN OBJECT.
US9113190B2 (en) 2010-06-04 2015-08-18 Microsoft Technology Licensing, Llc Controlling power levels of electronic devices through user interaction
US8639020B1 (en) 2010-06-16 2014-01-28 Intel Corporation Method and system for modeling subjects from a depth map
US8964004B2 (en) 2010-06-18 2015-02-24 Amchael Visual Technology Corporation Three channel reflector imaging system
US8890803B2 (en) 2010-09-13 2014-11-18 Samsung Electronics Co., Ltd. Gesture control system
EP2428870A1 (en) * 2010-09-13 2012-03-14 Samsung Electronics Co., Ltd. Device and method for controlling gesture for mobile device
EP2622443B1 (en) 2010-10-01 2022-06-01 Z124 Drag move gesture in user interface
US9052800B2 (en) 2010-10-01 2015-06-09 Z124 User interface with stacked application management
US9729658B2 (en) * 2010-10-12 2017-08-08 Chris Trahan System for managing web-based content data and applications
US9195345B2 (en) 2010-10-28 2015-11-24 Microsoft Technology Licensing, Llc Position aware gestures with visual feedback as input method
US9377950B2 (en) * 2010-11-02 2016-06-28 Perceptive Pixel, Inc. Touch-based annotation system with temporary modes
JP5885309B2 (en) * 2010-12-30 2016-03-15 Thomson Licensing User interface, apparatus and method for gesture recognition
GB2490199B (en) * 2011-01-06 2013-08-21 Pointgrab Ltd Computer vision based two hand control of content
KR101795574B1 (en) * 2011-01-06 2017-11-13 삼성전자주식회사 Electronic device controlled by a motion, and control method thereof
KR101858531B1 (en) 2011-01-06 2018-05-17 삼성전자주식회사 Display apparatus controlled by a motion, and motion control method thereof
US9271027B2 (en) * 2011-01-30 2016-02-23 Lg Electronics Inc. Image display apparatus and method for operating the same
US8782566B2 (en) 2011-02-22 2014-07-15 Cisco Technology, Inc. Using gestures to schedule and manage meetings
US9019202B2 (en) 2011-02-23 2015-04-28 Sony Corporation Dynamic virtual remote tagging
US9857868B2 (en) * 2011-03-19 2018-01-02 The Board Of Trustees Of The Leland Stanford Junior University Method and system for ergonomic touch-free interface
US9298287B2 (en) 2011-03-31 2016-03-29 Microsoft Technology Licensing, Llc Combined activation for natural user interface systems
US9842168B2 (en) 2011-03-31 2017-12-12 Microsoft Technology Licensing, Llc Task driven user intents
US9858343B2 (en) 2011-03-31 2018-01-02 Microsoft Technology Licensing Llc Personalization of queries, conversations, and searches
US9244984B2 (en) 2011-03-31 2016-01-26 Microsoft Technology Licensing, Llc Location based conversational understanding
US9760566B2 (en) 2011-03-31 2017-09-12 Microsoft Technology Licensing, Llc Augmented conversational understanding agent to identify conversation context between two humans and taking an agent action thereof
US10642934B2 (en) 2011-03-31 2020-05-05 Microsoft Technology Licensing, Llc Augmented conversational understanding architecture
US20120257035A1 (en) * 2011-04-08 2012-10-11 Sony Computer Entertainment Inc. Systems and methods for providing feedback by tracking user gaze and gestures
US8620113B2 (en) 2011-04-25 2013-12-31 Microsoft Corporation Laser diode modes
US8840466B2 (en) 2011-04-25 2014-09-23 Aquifi, Inc. Method and system to create three-dimensional mapping in a two-dimensional game
US9454962B2 (en) 2011-05-12 2016-09-27 Microsoft Technology Licensing, Llc Sentence simplification for spoken language understanding
US9064006B2 (en) 2012-08-23 2015-06-23 Microsoft Technology Licensing, Llc Translating natural language utterances to keyword search queries
US8845431B2 (en) 2011-05-31 2014-09-30 Microsoft Corporation Shape trace gesturing
US8657683B2 (en) 2011-05-31 2014-02-25 Microsoft Corporation Action selection gesturing
US8760395B2 (en) 2011-05-31 2014-06-24 Microsoft Corporation Gesture recognition techniques
US8740702B2 (en) 2011-05-31 2014-06-03 Microsoft Corporation Action trigger gesturing
JP6074170B2 (en) 2011-06-23 2017-02-01 インテル・コーポレーション Short range motion tracking system and method
US11048333B2 (en) 2011-06-23 2021-06-29 Intel Corporation System and method for close-range movement tracking
US9207767B2 (en) * 2011-06-29 2015-12-08 International Business Machines Corporation Guide mode for gesture spaces
US8810533B2 (en) 2011-07-20 2014-08-19 Z124 Systems and methods for receiving gesture inputs spanning multiple input devices
US9292112B2 (en) * 2011-07-28 2016-03-22 Hewlett-Packard Development Company, L.P. Multimodal interface
US9030487B2 (en) * 2011-08-01 2015-05-12 Lg Electronics Inc. Electronic device for displaying three-dimensional image and method of using the same
US9013366B2 (en) * 2011-08-04 2015-04-21 Microsoft Technology Licensing, Llc Display environment for a plurality of display devices
KR101262700B1 (en) * 2011-08-05 2013-05-08 삼성전자주식회사 Method for Controlling Electronic Apparatus based on Voice Recognition and Motion Recognition, and Electric Apparatus thereof
ES2958183T3 (en) 2011-08-05 2024-02-05 Samsung Electronics Co Ltd Control procedure for electronic devices based on voice and motion recognition, and electronic device that applies the same
US11133096B2 (en) * 2011-08-08 2021-09-28 Smith & Nephew, Inc. Method for non-invasive motion tracking to augment patient administered physical rehabilitation
NO333234B1 (en) * 2011-08-31 2013-04-15 Cisco Tech Inc Video conferencing system, method and computer program storage device
US8648808B2 (en) * 2011-09-19 2014-02-11 Amchael Visual Technology Corp. Three-dimensional human-computer interaction system that supports mouse operations through the motion of a finger and an operation method thereof
US8842057B2 (en) * 2011-09-27 2014-09-23 Z124 Detail on triggers: transitional states
US9019352B2 (en) 2011-11-21 2015-04-28 Amchael Visual Technology Corp. Two-parallel-channel reflector with focal length and disparity control
US8635637B2 (en) 2011-12-02 2014-01-21 Microsoft Corporation User interface presenting an animated avatar performing a media reaction
JP5846662B2 (en) * 2011-12-06 2016-01-20 Thomson Licensing Method and system for responding to user selection gestures for objects displayed in three dimensions
US9100685B2 (en) 2011-12-09 2015-08-04 Microsoft Technology Licensing, Llc Determining audience state or interest using passive sensor data
US11493998B2 (en) 2012-01-17 2022-11-08 Ultrahaptics IP Two Limited Systems and methods for machine control
US8638989B2 (en) 2012-01-17 2014-01-28 Leap Motion, Inc. Systems and methods for capturing motion in three-dimensional space
US9501152B2 (en) 2013-01-15 2016-11-22 Leap Motion, Inc. Free-space user interface and control using virtual constructs
US9679215B2 (en) 2012-01-17 2017-06-13 Leap Motion, Inc. Systems and methods for machine control
US10691219B2 (en) 2012-01-17 2020-06-23 Ultrahaptics IP Two Limited Systems and methods for machine control
US8693731B2 (en) 2012-01-17 2014-04-08 Leap Motion, Inc. Enhanced contrast for object detection and characterization by optical imaging
US9070019B2 (en) 2012-01-17 2015-06-30 Leap Motion, Inc. Systems and methods for capturing motion in three-dimensional space
US8854433B1 (en) 2012-02-03 2014-10-07 Aquifi, Inc. Method and system enabling natural user interface gestures with an electronic system
US9019603B2 (en) 2012-03-22 2015-04-28 Amchael Visual Technology Corp. Two-parallel-channel reflector with focal length and disparity control
US9264660B1 (en) 2012-03-30 2016-02-16 Google Inc. Presenter control during a video conference
US8898687B2 (en) 2012-04-04 2014-11-25 Microsoft Corporation Controlling a media program based on a media reaction
US9477303B2 (en) 2012-04-09 2016-10-25 Intel Corporation System and method for combining three-dimensional tracking with a three-dimensional display for a user interface
US9141197B2 (en) 2012-04-16 2015-09-22 Qualcomm Incorporated Interacting with a device using gestures
JP2013225211A (en) * 2012-04-20 2013-10-31 Nippon Telegr & Teleph Corp <Ntt> Information input device
CA2775700C (en) 2012-05-04 2013-07-23 Microsoft Corporation Determining a future portion of a currently presented media program
US8938124B2 (en) 2012-05-10 2015-01-20 Pointgrab Ltd. Computer vision based tracking of a hand
US9619036B2 (en) 2012-05-11 2017-04-11 Comcast Cable Communications, Llc System and methods for controlling a user experience
US9092394B2 (en) 2012-06-15 2015-07-28 Honda Motor Co., Ltd. Depth based context identification
US9536135B2 (en) 2012-06-18 2017-01-03 Microsoft Technology Licensing, Llc Dynamic hand gesture recognition using depth data
US9836590B2 (en) 2012-06-22 2017-12-05 Microsoft Technology Licensing, Llc Enhanced accuracy of user presence status determination
US8934675B2 (en) 2012-06-25 2015-01-13 Aquifi, Inc. Systems and methods for tracking human hands by performing parts based template matching using images from multiple viewpoints
US9111135B2 (en) 2012-06-25 2015-08-18 Aquifi, Inc. Systems and methods for tracking human hands using parts based template matching using corresponding pixels in bounded regions of a sequence of frames that are a specified distance interval from a reference camera
US20140007115A1 (en) * 2012-06-29 2014-01-02 Ning Lu Multi-modal behavior awareness for human natural command control
US9557634B2 (en) 2012-07-05 2017-01-31 Amchael Visual Technology Corporation Two-channel reflector based single-lens 2D/3D camera with disparity and convergence angle control
CN102854981A (en) * 2012-07-30 2013-01-02 成都西可科技有限公司 Body technology based virtual keyboard character input method
US9360932B1 (en) * 2012-08-29 2016-06-07 Intellect Motion Llc. Systems and methods for virtually displaying real movements of objects in a 3D-space by means of 2D-video capture
US8836768B1 (en) 2012-09-04 2014-09-16 Aquifi, Inc. Method and system enabling natural user interface gestures with user wearable glasses
JP2014067148A (en) * 2012-09-25 2014-04-17 Toshiba Corp Handwritten document processor and handwritten document processing method and program
US10908929B2 (en) 2012-10-15 2021-02-02 Famous Industries, Inc. Human versus bot detection using gesture fingerprinting
US9501171B1 (en) * 2012-10-15 2016-11-22 Famous Industries, Inc. Gesture fingerprinting
US10877780B2 (en) 2012-10-15 2020-12-29 Famous Industries, Inc. Visibility detection using gesture fingerprinting
US11386257B2 (en) 2012-10-15 2022-07-12 Amaze Software, Inc. Efficient manipulation of surfaces in multi-dimensional space using energy agents
US9285893B2 (en) 2012-11-08 2016-03-15 Leap Motion, Inc. Object detection and tracking with variable-field illumination devices
US9423939B2 (en) * 2012-11-12 2016-08-23 Microsoft Technology Licensing, Llc Dynamic adjustment of user interface
KR20140063272A (en) * 2012-11-16 2014-05-27 엘지전자 주식회사 Image display apparatus and method for operating the same
US8761448B1 (en) * 2012-12-13 2014-06-24 Intel Corporation Gesture pre-processing of video stream using a markered region
TWI454968B (en) 2012-12-24 2014-10-01 Ind Tech Res Inst Three-dimensional interactive device and operation method thereof
US10609285B2 (en) 2013-01-07 2020-03-31 Ultrahaptics IP Two Limited Power consumption in motion-capture systems
US9465461B2 (en) 2013-01-08 2016-10-11 Leap Motion, Inc. Object detection and tracking with audio and optical signals
US9459697B2 (en) 2013-01-15 2016-10-04 Leap Motion, Inc. Dynamic, free-space user interactions for machine control
US10241639B2 (en) 2013-01-15 2019-03-26 Leap Motion, Inc. Dynamic user interactions for display control and manipulation of display objects
US9129155B2 (en) 2013-01-30 2015-09-08 Aquifi, Inc. Systems and methods for initializing motion tracking of human hands using template matching within bounded regions determined using a depth map
US9092665B2 (en) 2013-01-30 2015-07-28 Aquifi, Inc Systems and methods for initializing motion tracking of human hands
US9785228B2 (en) 2013-02-11 2017-10-10 Microsoft Technology Licensing, Llc Detecting natural user-input engagement
US8744645B1 (en) 2013-02-26 2014-06-03 Honda Motor Co., Ltd. System and method for incorporating gesture and voice recognition into a single system
US20140258942A1 (en) * 2013-03-05 2014-09-11 Intel Corporation Interaction of multiple perceptual sensing inputs
US9342230B2 (en) 2013-03-13 2016-05-17 Microsoft Technology Licensing, Llc Natural user interface scrolling and targeting
US9292103B2 (en) * 2013-03-13 2016-03-22 Intel Corporation Gesture pre-processing of video stream using skintone detection
US9702977B2 (en) 2013-03-15 2017-07-11 Leap Motion, Inc. Determining positional information of an object in space
US20140281980A1 (en) 2013-03-15 2014-09-18 Chad A. Hage Methods and Apparatus to Identify a Type of Media Presented by a Media Player
US20140298379A1 (en) * 2013-03-15 2014-10-02 Yume, Inc. 3D Mobile and Connected TV Ad Trafficking System
US9298266B2 (en) 2013-04-02 2016-03-29 Aquifi, Inc. Systems and methods for implementing three-dimensional (3D) gesture based graphical user interfaces (GUI) that incorporate gesture reactive interface objects
US10620709B2 (en) 2013-04-05 2020-04-14 Ultrahaptics IP Two Limited Customized gesture interpretation
US9916009B2 (en) 2013-04-26 2018-03-13 Leap Motion, Inc. Non-tactile interface systems and methods
US9747696B2 (en) 2013-05-17 2017-08-29 Leap Motion, Inc. Systems and methods for providing normalized parameters of motions of objects in three-dimensional space
KR102227494B1 (en) * 2013-05-29 2021-03-15 삼성전자주식회사 Apparatus and method for processing an user input using movement of an object
US9696812B2 (en) * 2013-05-29 2017-07-04 Samsung Electronics Co., Ltd. Apparatus and method for processing user input using motion of object
US9430045B2 (en) * 2013-07-17 2016-08-30 Lenovo (Singapore) Pte. Ltd. Special gestures for camera control and image processing operations
DE102013012285A1 (en) * 2013-07-24 2015-01-29 Giesecke & Devrient Gmbh Method and device for value document processing
US9798388B1 (en) 2013-07-31 2017-10-24 Aquifi, Inc. Vibrotactile system to augment 3D input systems
US10281987B1 (en) 2013-08-09 2019-05-07 Leap Motion, Inc. Systems and methods of free-space gestural interaction
TWI505135B (en) * 2013-08-20 2015-10-21 Utechzone Co Ltd Control system for display screen, control apparatus and control method
US9721383B1 (en) 2013-08-29 2017-08-01 Leap Motion, Inc. Predictive information for free space gesture control and communication
US9632572B2 (en) 2013-10-03 2017-04-25 Leap Motion, Inc. Enhanced field of view to augment three-dimensional (3D) sensory space for free-space gesture interpretation
US9996638B1 (en) 2013-10-31 2018-06-12 Leap Motion, Inc. Predictive information for free space gesture control and communication
US10218660B2 (en) * 2013-12-17 2019-02-26 Google Llc Detecting user gestures for dismissing electronic notifications
US9507417B2 (en) 2014-01-07 2016-11-29 Aquifi, Inc. Systems and methods for implementing head tracking based graphical user interfaces (GUI) that incorporate gesture reactive interface objects
US20150199017A1 (en) * 2014-01-10 2015-07-16 Microsoft Corporation Coordinated speech and gesture input
US9613262B2 (en) 2014-01-15 2017-04-04 Leap Motion, Inc. Object detection and tracking for providing a virtual device experience
US9619105B1 (en) 2014-01-30 2017-04-11 Aquifi, Inc. Systems and methods for gesture based interaction with viewpoint dependent user interfaces
US11221680B1 (en) * 2014-03-01 2022-01-11 sigmund lindsay clements Hand gestures used to operate a control panel for a device
US9990046B2 (en) 2014-03-17 2018-06-05 Oblong Industries, Inc. Visual collaboration interface
US9501138B2 (en) 2014-05-05 2016-11-22 Aquifi, Inc. Systems and methods for remapping three-dimensional gestures onto a finite-size two-dimensional surface
US9958946B2 (en) 2014-06-06 2018-05-01 Microsoft Technology Licensing, Llc Switching input rails without a release command in a natural user interface
DE202014103729U1 (en) 2014-08-08 2014-09-09 Leap Motion, Inc. Augmented reality with motion detection
US9953646B2 (en) 2014-09-02 2018-04-24 Belleau Technologies Method and system for dynamic speech recognition and tracking of prewritten script
US9730671B2 (en) * 2014-10-03 2017-08-15 David Thomas Gering System and method of voice activated image segmentation
US9881610B2 (en) 2014-11-13 2018-01-30 International Business Machines Corporation Speech recognition system adaptation based on non-acoustic attributes and face selection based on mouth motion using pixel intensities
US9626001B2 (en) 2014-11-13 2017-04-18 International Business Machines Corporation Speech recognition candidate selection based on non-acoustic input
US9454235B2 (en) * 2014-12-26 2016-09-27 Seungman KIM Electronic apparatus having a sensing unit to input a user command and a method thereof
US9696795B2 (en) 2015-02-13 2017-07-04 Leap Motion, Inc. Systems and methods of creating a realistic grab experience in virtual reality/augmented reality environments
US10429923B1 (en) 2015-02-13 2019-10-01 Ultrahaptics IP Two Limited Interaction engine for creating a realistic experience in virtual reality/augmented reality environments
US10121471B2 (en) * 2015-06-29 2018-11-06 Amazon Technologies, Inc. Language model speech endpointing
US10620807B2 (en) * 2015-09-16 2020-04-14 Lenovo (Singapore) Pte. Ltd. Association of objects in a three-dimensional model with time-related metadata
US10429935B2 (en) 2016-02-08 2019-10-01 Comcast Cable Communications, Llc Tremor correction for gesture recognition
FR3049078B1 (en) * 2016-03-21 2019-11-29 Valeo Vision VOICE AND/OR GESTURE RECOGNITION CONTROL DEVICE AND METHOD FOR INTERIOR LIGHTING OF A VEHICLE
US10074226B2 (en) * 2016-04-05 2018-09-11 Honeywell International Inc. Systems and methods for providing UAV-based digital escort drones in visitor management and integrated access control systems
CN107452381B (en) * 2016-05-30 2020-12-29 中国移动通信有限公司研究院 Multimedia voice recognition device and method
WO2018006224A1 (en) * 2016-07-04 2018-01-11 SZ DJI Technology Co., Ltd. System and method for automated tracking and navigation
US10529302B2 (en) 2016-07-07 2020-01-07 Oblong Industries, Inc. Spatially mediated augmentations of and interactions among distinct devices and applications via extended pixel manifold
US10444983B2 (en) * 2016-09-20 2019-10-15 Rohde & Schwarz Gmbh & Co. Kg Signal analyzing instrument with touch gesture control and method of operating thereof
US20180143693A1 (en) * 2016-11-21 2018-05-24 David J. Calabrese Virtual object manipulation
US10303417B2 (en) 2017-04-03 2019-05-28 Youspace, Inc. Interactive systems for depth-based input
WO2018106276A1 (en) * 2016-12-05 2018-06-14 Youspace, Inc. Systems and methods for gesture-based interaction
US10303259B2 (en) 2017-04-03 2019-05-28 Youspace, Inc. Systems and methods for gesture-based interaction
US10437342B2 (en) 2016-12-05 2019-10-08 Youspace, Inc. Calibration systems and methods for depth-based interfaces with disparate fields of view
US10684758B2 (en) 2017-02-20 2020-06-16 Microsoft Technology Licensing, Llc Unified system for bimanual interactions
US10558341B2 (en) * 2017-02-20 2020-02-11 Microsoft Technology Licensing, Llc Unified system for bimanual interactions on flexible representations of content
US10013979B1 (en) * 2017-04-17 2018-07-03 Essential Products, Inc. Expanding a set of commands to control devices in an environment
US10664041B2 (en) 2017-11-13 2020-05-26 International Business Machines Corporation Implementing a customized interaction pattern for a device
US10585525B2 (en) 2018-02-12 2020-03-10 International Business Machines Corporation Adaptive notification modifications for touchscreen interfaces
US11875012B2 (en) 2018-05-25 2024-01-16 Ultrahaptics IP Two Limited Throwable interface for augmented reality and virtual reality environments
DE102019002794A1 (en) * 2019-04-16 2020-10-22 Daimler Ag Method for controlling lighting
DE102019210009A1 (en) 2019-07-08 2021-01-28 Volkswagen Aktiengesellschaft Method for operating an operating system in a vehicle and operating system in a vehicle
DE102019210008A1 (en) 2019-07-08 2021-01-14 Volkswagen Aktiengesellschaft Method for operating a control system and control system
DE102019210010A1 (en) 2019-07-08 2021-01-14 Volkswagen Aktiengesellschaft Method and operating system for acquiring user input for a device of a vehicle
EP3835924A1 (en) * 2019-12-13 2021-06-16 Treye Tech UG (haftungsbeschränkt) Computer system and method for human-machine interaction
CN111123986A (en) * 2019-12-25 2020-05-08 四川云盾光电科技有限公司 Control device for controlling two-degree-of-freedom turntable based on gestures
CN212972930U (en) 2020-04-21 2021-04-16 上海联影医疗科技股份有限公司 Magnetic resonance system
US11418863B2 (en) 2020-06-25 2022-08-16 Damian A Lynch Combination shower rod and entertainment system
WO2022160085A1 (en) * 2021-01-26 2022-08-04 京东方科技集团股份有限公司 Control method, electronic device, and storage medium
US20220253148A1 (en) * 2021-02-05 2022-08-11 Pepsico, Inc. Devices, Systems, and Methods for Contactless Interfacing

Citations (91)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4288078A (en) * 1979-11-20 1981-09-08 Lugo Julio I Game apparatus
US4627620A (en) * 1984-12-26 1986-12-09 Yang John P Electronic athlete trainer for improving skills in reflex, speed and accuracy
US4630910A (en) * 1984-02-16 1986-12-23 Robotic Vision Systems, Inc. Method of measuring in three-dimensions at high speed
US4645458A (en) * 1985-04-15 1987-02-24 Harald Phillip Athletic evaluation and training apparatus
US4695953A (en) * 1983-08-25 1987-09-22 Blair Preston E TV animation interactively controlled by the viewer
US4702475A (en) * 1985-08-16 1987-10-27 Innovating Training Products, Inc. Sports technique and reaction training system
US4711543A (en) * 1986-04-14 1987-12-08 Blair Preston E TV animation interactively controlled by the viewer
US4751642A (en) * 1986-08-29 1988-06-14 Silva John M Interactive sports simulation system with physiological sensing and psychological conditioning
US4796997A (en) * 1986-05-27 1989-01-10 Synthetic Vision Systems, Inc. Method and system for high-speed, 3-D imaging of an object at a vision station
US4809065A (en) * 1986-12-01 1989-02-28 Kabushiki Kaisha Toshiba Interactive system and related method for displaying data to produce a three-dimensional image of an object
US4817950A (en) * 1987-05-08 1989-04-04 Goo Paul E Video game control unit and attitude sensor
US4843568A (en) * 1986-04-11 1989-06-27 Krueger Myron W Real time perception of and response to the actions of an unencumbered participant/user
US4893183A (en) * 1988-08-11 1990-01-09 Carnegie-Mellon University Robotic vision system
US4901362A (en) * 1988-08-08 1990-02-13 Raytheon Company Method of recognizing patterns
US4925189A (en) * 1989-01-13 1990-05-15 Braeunig Thomas F Body-mounted video game exercise device
US5101444A (en) * 1990-05-18 1992-03-31 Panacea, Inc. Method and apparatus for high speed object location
US5109537A (en) * 1988-06-24 1992-04-28 Kabushiki Kaisha Toshiba Telecommunication apparatus having an id rechecking function
US5139261A (en) * 1989-09-15 1992-08-18 Openiano Renato M Foot-actuated computer game controller serving as a joystick
US5148154A (en) * 1990-12-04 1992-09-15 Sony Corporation Of America Multi-dimensional user interface
US5184295A (en) * 1986-05-30 1993-02-02 Mann Ralph V System and method for teaching physical skills
US5229756A (en) * 1989-02-07 1993-07-20 Yamaha Corporation Image control apparatus
US5229754A (en) * 1990-02-13 1993-07-20 Yazaki Corporation Automotive reflection type display apparatus
US5239464A (en) * 1988-08-04 1993-08-24 Blair Preston E Interactive video system providing repeated switching of multiple tracks of actions sequences
US5239463A (en) * 1988-08-04 1993-08-24 Blair Preston E Method and apparatus for player interaction with animated characters and objects
US5288078A (en) * 1988-10-14 1994-02-22 David G. Capper Control interface apparatus
US5295491A (en) * 1991-09-26 1994-03-22 Sam Technology, Inc. Non-invasive human neurocognitive performance capability testing method and system
US5320538A (en) * 1992-09-23 1994-06-14 Hughes Training, Inc. Interactive aircraft training system and method
US5347306A (en) * 1993-12-17 1994-09-13 Mitsubishi Electric Research Laboratories, Inc. Animated electronic meeting place
US5385519A (en) * 1994-04-19 1995-01-31 Hsu; Chi-Hsueh Running machine
US5405152A (en) * 1993-06-08 1995-04-11 The Walt Disney Company Method and apparatus for an interactive video game with physical feedback
US5525901A (en) * 1993-02-02 1996-06-11 Beaudreau Electric, Inc. Sensor systems for monitoring and measuring angular position in two or three axes
US5611731A (en) * 1995-09-08 1997-03-18 Thrustmaster, Inc. Video pinball machine controller having an optical accelerometer for detecting slide and tilt
US5615132A (en) * 1994-01-21 1997-03-25 Crossbow Technology, Inc. Method and apparatus for determining position and orientation of a moveable object using accelerometers
US5757360A (en) * 1995-05-03 1998-05-26 Mitsubishi Electric Information Technology Center America, Inc. Hand held computer control device
US5864808A (en) * 1994-04-25 1999-01-26 Hitachi, Ltd. Erroneous input processing method and apparatus in information processing system using composite input
US5909189A (en) * 1996-11-14 1999-06-01 Raytheon Company Group tracking
US6067077A (en) * 1998-04-10 2000-05-23 Immersion Corporation Position sensing for force feedback devices
US6072467A (en) * 1996-05-03 2000-06-06 Mitsubishi Electric Information Technology Center America, Inc. (Ita) Continuously variable control of animated on-screen characters
US6181343B1 (en) * 1997-12-23 2001-01-30 Philips Electronics North America Corp. System and method for permitting three-dimensional navigation through a virtual reality environment using camera-based gesture inputs
US6195104B1 (en) * 1997-12-23 2001-02-27 Philips Electronics North America Corp. System and method for permitting three-dimensional navigation through a virtual reality environment using camera-based gesture inputs
US6269172B1 (en) * 1998-04-13 2001-07-31 Compaq Computer Corporation Method for tracking the motion of a 3-D figure
US20020004422A1 (en) * 1997-01-30 2002-01-10 Kabushiki Kaisha Sega Enterprises Input device, game device, and method and recording medium for same
US20020019258A1 (en) * 2000-05-31 2002-02-14 Kim Gerard Jounghyun Methods and apparatus of displaying and evaluating motion data in a motion game apparatus
US6347998B1 (en) * 1999-06-30 2002-02-19 Konami Co., Ltd. Game system and computer-readable recording medium
US20020041327A1 (en) * 2000-07-24 2002-04-11 Evan Hildreth Video-based image control system
US6375572B1 (en) * 1999-10-04 2002-04-23 Nintendo Co., Ltd. Portable game apparatus with acceleration sensor and information storage medium storing a game program
US20020055383A1 (en) * 2000-02-24 2002-05-09 Namco Ltd. Game system and program
US6509889B2 (en) * 1998-12-03 2003-01-21 International Business Machines Corporation Method and apparatus for enabling the adaptation of the input parameters for a computer system pointing device
US20030040350A1 (en) * 2001-08-22 2003-02-27 Masayasu Nakata Game system, puzzle game program, and storage medium having program stored therein
US6542621B1 (en) * 1998-08-31 2003-04-01 Texas Instruments Incorporated Method of dealing with occlusion when tracking multiple objects and people in video sequences
US6545661B1 (en) * 1999-06-21 2003-04-08 Midway Amusement Games, Llc Video game system having a control unit with an accelerometer for controlling a video game
US6600475B2 (en) * 2001-01-22 2003-07-29 Koninklijke Philips Electronics N.V. Single camera system for gesture-based input and target indication
US20030156756A1 (en) * 2002-02-15 2003-08-21 Gokturk Salih Burak Gesture recognition system using depth perceptive sensors
US20030216179A1 (en) * 2002-05-17 2003-11-20 Toshiaki Suzuki Game device changing sound and an image in accordance with a tilt operation
US20040001113A1 (en) * 2002-06-28 2004-01-01 John Zipperer Method and apparatus for spline-based trajectory classification, gesture detection and localization
US20040005083A1 (en) * 2002-03-26 2004-01-08 Kikuo Fujimura Real-time eye detection and tracking under various light conditions
US6753879B1 (en) * 2000-07-03 2004-06-22 Intel Corporation Creating overlapping real and virtual images
US20040155902A1 (en) * 2001-09-14 2004-08-12 Dempski Kelly L. Lab window collaboration
US6795567B1 (en) * 1999-09-16 2004-09-21 Hewlett-Packard Development Company, L.P. Method for efficiently tracking object models in video sequences via dynamic ordering of features
US20040189720A1 (en) * 2003-03-25 2004-09-30 Wilson Andrew D. Architecture for controlling a computer using hand gestures
US20040194129A1 (en) * 2003-03-31 2004-09-30 Carlbom Ingrid Birgitta Method and apparatus for intelligent and automatic sensor control using multimedia database system
US6804396B2 (en) * 2001-03-28 2004-10-12 Honda Giken Kogyo Kabushiki Kaisha Gesture recognition system
US20040204240A1 (en) * 2000-02-22 2004-10-14 Barney Jonathan A. Magical wand and interactive play experience
US20050076161A1 (en) * 2003-10-03 2005-04-07 Amro Albanna Input system and method
US20050085298A1 (en) * 1997-11-25 2005-04-21 Woolston Thomas G. Electronic sword game with input and feedback
US6888960B2 (en) * 2001-03-28 2005-05-03 Nec Corporation Fast optimal linear approximation of the images of variably illuminated solid objects for recognition
US20050151850A1 (en) * 2004-01-14 2005-07-14 Korea Institute Of Science And Technology Interactive presentation system
US20050212753A1 (en) * 2004-03-23 2005-09-29 Marvit David L Motion controlled remote controller
US20050238201A1 (en) * 2004-04-15 2005-10-27 Atid Shamaie Tracking bimanual movements
US20050239548A1 (en) * 2002-06-27 2005-10-27 Hiromu Ueshima Information processor having input system using stroboscope
US20050255434A1 (en) * 2004-02-27 2005-11-17 University Of Florida Research Foundation, Inc. Interactive virtual characters for training including medical diagnosis training
US20060007142A1 (en) * 2003-06-13 2006-01-12 Microsoft Corporation Pointing device and cursor for use in intelligent computing environments
US20060036944A1 (en) * 2004-08-10 2006-02-16 Microsoft Corporation Surface UI for gesture-based interaction
US20060098873A1 (en) * 2000-10-03 2006-05-11 Gesturetek, Inc., A Delaware Corporation Multiple camera control system
US7070500B1 (en) * 1999-09-07 2006-07-04 Konami Corporation Musical player-motion sensing game system
US20060178212A1 (en) * 2004-11-23 2006-08-10 Hillcrest Laboratories, Inc. Semantic gaming and application transformation
US7095401B2 (en) * 2000-11-02 2006-08-22 Siemens Corporate Research, Inc. System and method for gesture interface
US7148913B2 (en) * 2001-10-12 2006-12-12 Hrl Laboratories, Llc Vision-based pointer tracking and object classification method and apparatus
US20070060383A1 (en) * 2005-09-14 2007-03-15 Nintendo Co., Ltd. Video game program and video game system
US20070252898A1 (en) * 2002-04-05 2007-11-01 Bruno Delean Remote control apparatus using gesture recognition
US20080036732A1 (en) * 2006-08-08 2008-02-14 Microsoft Corporation Virtual Controller For Visual Displays
US7372977B2 (en) * 2003-05-29 2008-05-13 Honda Motor Co., Ltd. Visual tracking using depth data
US20080122786A1 (en) * 1997-08-22 2008-05-29 Pryor Timothy R Advanced video gaming methods for education and play using camera based inputs
US20080193043A1 (en) * 2004-06-16 2008-08-14 Microsoft Corporation Method and system for reducing effects of undesired signals in an infrared imaging system
US20090121894A1 (en) * 2007-11-14 2009-05-14 Microsoft Corporation Magic wand
US20100113153A1 (en) * 2006-07-14 2010-05-06 Ailive, Inc. Self-Contained Inertial Navigation System for Interactive Control Using Movable Controllers
US7721231B2 (en) * 2002-02-07 2010-05-18 Microsoft Corporation Controlling an object within an environment using a pointing device
US20100138798A1 (en) * 2003-03-25 2010-06-03 Wilson Andrew D System and method for executing a game process
US20110081969A1 (en) * 2005-08-22 2011-04-07 Akio Ikeda Video game system with wireless modular handheld controller
US7927216B2 (en) * 2005-09-15 2011-04-19 Nintendo Co., Ltd. Video game system with wireless modular handheld controller
US20110124410A1 (en) * 2009-11-20 2011-05-26 Xiaodong Mao Controller for interfacing with a computing program using position, orientation, or motion

Family Cites Families (169)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5469740A (en) 1989-07-14 1995-11-28 Impulse Technology, Inc. Interactive video testing and training system
US5534917A (en) 1991-05-09 1996-07-09 Very Vivid, Inc. Video image based control system
US5417210A (en) * 1992-05-27 1995-05-23 International Business Machines Corporation System and method for augmentation of endoscopic surgery
US6054991A (en) * 1991-12-02 2000-04-25 Texas Instruments Incorporated Method of modeling player position and movement in a virtual reality system
CA2101633A1 (en) 1991-12-03 1993-06-04 Barry J. French Interactive video testing and training system
US5875108A (en) 1991-12-23 1999-02-23 Hoffberg; Steven M. Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
JPH07325934A (en) 1992-07-10 1995-12-12 The Walt Disney Company Method and apparatus for providing enhanced graphics to a virtual world
US5999908A (en) 1992-08-06 1999-12-07 Abelow; Daniel H. Customer-based product design module
IT1257294B (en) * 1992-11-20 1996-01-12 Device for detecting the configuration of a distal physiological unit, for use in particular as an advanced interface for machines and computers
US5495576A (en) 1993-01-11 1996-02-27 Ritchey; Kurtis J. Panoramic image based virtual reality/telepresence audio-visual system and method
US5690582A (en) 1993-02-02 1997-11-25 Tectrix Fitness Equipment, Inc. Interactive exercise apparatus
JP2799126B2 (en) 1993-03-26 1998-09-17 Namco Ltd. Video game device and game input device
US5414643A (en) * 1993-06-14 1995-05-09 Hughes Aircraft Company Method and apparatus for continuous time representation of multiple hypothesis tracking data
US5801943A (en) * 1993-07-23 1998-09-01 Condition Monitoring Systems Traffic surveillance and simulation apparatus
US5454043A (en) 1993-07-30 1995-09-26 Mitsubishi Electric Research Laboratories, Inc. Dynamic and static hand gesture recognition through low-level image analysis
US5423554A (en) 1993-09-24 1995-06-13 Metamedia Ventures, Inc. Virtual reality game method and apparatus
US5980256A (en) 1993-10-29 1999-11-09 Carmein; David E. E. Virtual reality system with enhanced sensory apparatus
JP3419050B2 (en) * 1993-11-19 2003-06-23 Hitachi, Ltd. Input device
US5959574A (en) * 1993-12-21 1999-09-28 Colorado State University Research Foundation Method and system for tracking multiple regional objects by multi-dimensional relaxation
JP2552427B2 (en) * 1993-12-28 1996-11-13 Konami Co., Ltd. TV play system
US5577981A (en) 1994-01-19 1996-11-26 Jarvik; Robert Virtual reality exercise machine and computer controlled video system
US5580249A (en) 1994-02-14 1996-12-03 Sarcos Group Apparatus for simulating mobility of a human
US5732227A (en) 1994-07-05 1998-03-24 Hitachi, Ltd. Interactive information processing system responsive to user manipulation of physical objects and displayed images
US5597309A (en) 1994-03-28 1997-01-28 Riess; Thomas Method and apparatus for treatment of gait problems associated with parkinson's disease
US5528263A (en) * 1994-06-15 1996-06-18 Daniel M. Platzker Interactive projected video image display system
US5524637A (en) 1994-06-29 1996-06-11 Erickson; Jon W. Interactive system for measuring physiological exertion
JPH0844490A (en) 1994-07-28 1996-02-16 Matsushita Electric Ind Co Ltd Interface device
US5563988A (en) 1994-08-01 1996-10-08 Massachusetts Institute Of Technology Method and system for facilitating wireless, full-body, real-time user interaction with a digitally represented visual environment
JPH0863326A (en) * 1994-08-22 1996-03-08 Hitachi Ltd Image processing device/method
US6714665B1 (en) * 1994-09-02 2004-03-30 Sarnoff Corporation Fully automated iris recognition system utilizing wide and narrow fields of view
US5516105A (en) * 1994-10-06 1996-05-14 Exergame, Inc. Acceleration activated joystick
US5638300A (en) 1994-12-05 1997-06-10 Johnson; Lee E. Golf swing analysis system
JPH08161292A (en) 1994-12-09 1996-06-21 Matsushita Electric Ind Co Ltd Method and system for detecting congestion degree
US5594469A (en) * 1995-02-21 1997-01-14 Mitsubishi Electric Information Technology Center America Inc. Hand gesture machine control system
US5682229A (en) 1995-04-14 1997-10-28 Schwartz Electro-Optics, Inc. Laser range camera
DE69634913T2 (en) * 1995-04-28 2006-01-05 Matsushita Electric Industrial Co., Ltd., Kadoma Interface device
DE19516664C1 (en) * 1995-05-05 1996-08-29 Siemens Ag Processor-supported detection of selective target object in image
US5913727A (en) 1995-06-02 1999-06-22 Ahdoot; Ned Interactive movement and contact simulation game
JP3481631B2 (en) * 1995-06-07 2003-12-22 The Trustees of Columbia University in the City of New York Apparatus and method for determining a three-dimensional shape of an object using relative blur in an image due to active illumination and defocus
US5682196A (en) 1995-06-22 1997-10-28 Actv, Inc. Three-dimensional (3D) video presentation system providing interactive 3D presentation with personalized audio responses for multiple viewers
US5702323A (en) 1995-07-26 1997-12-30 Poulton; Craig K. Electronic exercise enhancer
JPH0981309A (en) * 1995-09-13 1997-03-28 Toshiba Corp Input device
US6308565B1 (en) 1995-11-06 2001-10-30 Impulse Technology Ltd. System and method for tracking and assessing movement skills in multidimensional space
US6430997B1 (en) 1995-11-06 2002-08-13 Trazer Technologies, Inc. System and method for tracking and assessing movement skills in multidimensional space
US6073489A (en) 1995-11-06 2000-06-13 French; Barry J. Testing and training system for assessing the ability of a player to complete a task
US6098458A (en) 1995-11-06 2000-08-08 Impulse Technology, Ltd. Testing and training system for assessing movement and agility skills without a confining field
US6176782B1 (en) 1997-12-22 2001-01-23 Philips Electronics North America Corp. Motion-based command generation technology
US5933125A (en) 1995-11-27 1999-08-03 Cae Electronics, Ltd. Method and apparatus for reducing instability in the display of a virtual environment
US5641288A (en) 1996-01-11 1997-06-24 Zaenglein, Jr.; William G. Shooting simulating process and training device using a virtual reality display screen
JP2000510013A (en) 1996-05-08 2000-08-08 リアル ヴィジョン コーポレイション Real-time simulation using position detection
US6173066B1 (en) * 1996-05-21 2001-01-09 Cybernet Systems Corporation Pose determination and tracking by matching 3D objects to a 2D sensor
US6002808A (en) * 1996-07-26 1999-12-14 Mitsubishi Electric Information Technology Center America, Inc. Hand gesture control system
US5989157A (en) 1996-08-06 1999-11-23 Walton; Charles A. Exercising system with electronic inertial game playing
EP0959444A4 (en) 1996-08-14 2005-12-07 Nurakhmed Nurislamovic Latypov Method for following and imaging a subject's three-dimensional position and orientation, method for presenting a virtual space to a subject, and systems for implementing said methods
JP3064928B2 (en) 1996-09-20 2000-07-12 NEC Corporation Subject extraction method
EP0849697B1 (en) 1996-12-20 2003-02-12 Hitachi Europe Limited A hand gesture recognition system and method
US6009210A (en) 1997-03-05 1999-12-28 Digital Equipment Corporation Hands-free interface to a virtual reality environment using head tracking
US6100896A (en) 1997-03-24 2000-08-08 Mitsubishi Electric Information Technology Center America, Inc. System for designing graphical multi-participant environments
US5877803A (en) 1997-04-07 1999-03-02 Tritech Microelectronics International, Ltd. 3-D image detector
US6215898B1 (en) * 1997-04-15 2001-04-10 Interval Research Corporation Data processing system and method
WO1998059478A1 (en) 1997-06-25 1998-12-30 Samsung Electronics Co., Ltd. Programming tool for home networks
JP3077745B2 (en) * 1997-07-31 2000-08-14 NEC Corporation Data processing method and apparatus, information storage medium
US6188777B1 (en) * 1997-08-01 2001-02-13 Interval Research Corporation Method and apparatus for personnel detection and tracking
US6289112B1 (en) 1997-08-22 2001-09-11 International Business Machines Corporation System and method for determining block direction in fingerprint images
US20020036617A1 (en) * 1998-08-21 2002-03-28 Timothy R. Pryor Novel man machine interfaces and applications
US6750848B1 (en) * 1998-11-09 2004-06-15 Timothy R. Pryor More useful man machine interfaces and applications
AUPO894497A0 (en) 1997-09-02 1997-09-25 Xenotech Research Pty Ltd Image processing method and apparatus
EP0905644A3 (en) * 1997-09-26 2004-02-25 Matsushita Electric Industrial Co., Ltd. Hand gesture recognizing device
US6141463A (en) 1997-10-10 2000-10-31 Electric Planet Interactive Method and system for estimating jointed-figure configurations
US6130677A (en) 1997-10-15 2000-10-10 Electric Planet, Inc. Interactive computer vision system
US6072494A (en) 1997-10-15 2000-06-06 Electric Planet, Inc. Method and apparatus for real-time gesture recognition
US6101289A (en) 1997-10-15 2000-08-08 Electric Planet, Inc. Method and apparatus for unencumbered capture of an object
WO1999019840A1 (en) * 1997-10-15 1999-04-22 Electric Planet, Inc. A system and method for generating an animatable character
AU1099899A (en) * 1997-10-15 1999-05-03 Electric Planet, Inc. Method and apparatus for performing a clean background subtraction
EP1059970A2 (en) 1998-03-03 2000-12-20 Arena, Inc. System and method for tracking and assessing movement skills in multidimensional space
US6301370B1 (en) 1998-04-13 2001-10-09 Eyematic Interfaces, Inc. Face recognition from video images
US6159100A (en) 1998-04-23 2000-12-12 Smith; Michael D. Virtual reality game
US6421453B1 (en) * 1998-05-15 2002-07-16 International Business Machines Corporation Apparatus and methods for user recognition employing behavioral passwords
US6077201A (en) 1998-06-12 2000-06-20 Cheng; Chau-Yang Exercise bicycle
JP2000028799A (en) * 1998-07-07 2000-01-28 Fuji Photo Film Co Ltd Radiation image conversion panel for method for reading by condensing light on both side and method for reading radiation image
US6801637B2 (en) 1999-08-10 2004-10-05 Cybernet Systems Corporation Optical body tracker
US20010008561A1 (en) 1999-08-10 2001-07-19 Paul George V. Real-time object tracking system
US7121946B2 (en) 1998-08-10 2006-10-17 Cybernet Systems Corporation Real-time head tracking system for computer games and other applications
US7036094B1 (en) * 1998-08-10 2006-04-25 Cybernet Systems Corporation Behavior recognition system
US6950534B2 (en) 1998-08-10 2005-09-27 Cybernet Systems Corporation Gesture-controlled interfaces for self-service machines and other applications
US6681031B2 (en) * 1998-08-10 2004-01-20 Cybernet Systems Corporation Gesture-controlled interfaces for self-service machines and other applications
IL126284A (en) 1998-09-17 2002-12-01 Netmor Ltd System and method for three dimensional positioning and tracking
EP0991011B1 (en) * 1998-09-28 2007-07-25 Matsushita Electric Industrial Co., Ltd. Method and device for segmenting hand gestures
AU6225199A (en) 1998-10-05 2000-04-26 Scansoft, Inc. Speech controlled computer user interface
AU1930700A (en) 1998-12-04 2000-06-26 Interval Research Corporation Background estimation and segmentation based on range and color
US6222465B1 (en) * 1998-12-09 2001-04-24 Lucent Technologies Inc. Gesture-based computer interface
US6147678A (en) 1998-12-09 2000-11-14 Lucent Technologies Inc. Video hand image-three-dimensional computer interface with multiple degrees of freedom
WO2000036372A1 (en) * 1998-12-16 2000-06-22 3Dv Systems, Ltd. Self gating photosurface
US6570555B1 (en) 1998-12-30 2003-05-27 Fuji Xerox Co., Ltd. Method and apparatus for embodied conversational characters with multimodal input/output in an interface device
US6226388B1 (en) * 1999-01-05 2001-05-01 Sharp Labs Of America, Inc. Method and apparatus for object tracking for automatic controls in video devices
US6363160B1 (en) * 1999-01-22 2002-03-26 Intel Corporation Interface using pattern recognition and tracking
US6377296B1 (en) * 1999-01-28 2002-04-23 International Business Machines Corporation Virtual map system and method for tracking objects
US7003134B1 (en) 1999-03-08 2006-02-21 Vulcan Patents Llc Three dimensional object pose estimation which employs dense depth information
US6299308B1 (en) 1999-04-02 2001-10-09 Cybernet Systems Corporation Low-cost non-imaging eye tracker system for computer control
US6591236B2 (en) * 1999-04-13 2003-07-08 International Business Machines Corporation Method and system for determining available and alternative speech commands
US6503195B1 (en) 1999-05-24 2003-01-07 University Of North Carolina At Chapel Hill Methods and systems for real-time structured light depth extraction and endoscope using real-time structured light depth extraction
US6476834B1 (en) 1999-05-28 2002-11-05 International Business Machines Corporation Dynamic creation of selectable items on surfaces
US6873723B1 (en) 1999-06-30 2005-03-29 Intel Corporation Segmenting three-dimensional video images using stereo
US6738066B1 (en) 1999-07-30 2004-05-18 Electric Planet, Inc. System, method and article of manufacture for detecting collisions between video images generated by a camera and an object depicted on a display
US7113918B1 (en) 1999-08-01 2006-09-26 Electric Planet, Inc. Method for video enabled electronic commerce
US7050606B2 (en) 1999-08-10 2006-05-23 Cybernet Systems Corporation Tracking and gesture recognition system particularly suited to vehicular control applications
US6663491B2 (en) 2000-02-18 2003-12-16 Namco Ltd. Game apparatus, storage medium and computer program that adjust tempo of sound
US7878905B2 (en) 2000-02-22 2011-02-01 Creative Kingdoms, Llc Multi-layered interactive play experience
US6633294B1 (en) 2000-03-09 2003-10-14 Seth Rosenthal Method and apparatus for using captured high density motion for animation
US6980312B1 (en) 2000-04-24 2005-12-27 International Business Machines Corporation Multifunction office device having a graphical user interface implemented with a touch screen
EP1152261A1 (en) 2000-04-28 2001-11-07 CSEM Centre Suisse d'Electronique et de Microtechnique SA Device and method for spatially resolved photodetection and demodulation of modulated electromagnetic waves
US6640202B1 (en) 2000-05-25 2003-10-28 International Business Machines Corporation Elastic sensor mesh system for 3-dimensional measurement, mapping and kinematics applications
US6731799B1 (en) 2000-06-01 2004-05-04 University Of Washington Object segmentation with background extraction and moving boundary techniques
US6788809B1 (en) 2000-06-30 2004-09-07 Intel Corporation System and method for gesture recognition in three dimensions using stereo imaging and color vision
US7039676B1 (en) 2000-10-31 2006-05-02 International Business Machines Corporation Using video image analysis to automatically transmit gestures over a network in a chat or instant messaging session
US6539931B2 (en) * 2001-04-16 2003-04-01 Koninklijke Philips Electronics N.V. Ball throwing assistant
US8035612B2 (en) 2002-05-28 2011-10-11 Intellectual Ventures Holding 67 Llc Self-contained interactive video display system
US7259747B2 (en) 2001-06-05 2007-08-21 Reactrix Systems, Inc. Interactive video display system
US6594616B2 (en) * 2001-06-18 2003-07-15 Microsoft Corporation System and method for providing a mobile input device
JP3420221B2 (en) 2001-06-29 2003-06-23 Konami Computer Entertainment Tokyo, Inc. Game device and program
US6868383B1 (en) * 2001-07-12 2005-03-15 At&T Corp. Systems and methods for extracting meaning from multimodal inputs using finite-state devices
US7274800B2 (en) * 2001-07-18 2007-09-25 Intel Corporation Dynamic gesture recognition from stereo sequences
US6937742B2 (en) 2001-09-28 2005-08-30 Bellsouth Intellectual Property Corporation Gesture activated home appliance
US7394346B2 (en) 2002-01-15 2008-07-01 International Business Machines Corporation Free-space gesture recognition for transaction security and command processing
US6982697B2 (en) 2002-02-07 2006-01-03 Microsoft Corporation System and process for selecting objects in a ubiquitous computing environment
EP1497160B2 (en) 2002-04-19 2010-07-21 IEE International Electronics & Engineering S.A. Safety device for a vehicle
US7348963B2 (en) * 2002-05-28 2008-03-25 Reactrix Systems, Inc. Interactive video display system
US7170492B2 (en) * 2002-05-28 2007-01-30 Reactrix Systems, Inc. Interactive video display system
US7710391B2 (en) 2002-05-28 2010-05-04 Matthew Bell Processing an image utilizing a spatially varying pattern
US7489812B2 (en) 2002-06-07 2009-02-10 Dynamic Digital Depth Research Pty Ltd. Conversion and encoding techniques
US7225414B1 (en) * 2002-09-10 2007-05-29 Videomining Corporation Method and system for virtual touch entertainment
WO2004027685A2 (en) * 2002-09-19 2004-04-01 The Penn State Research Foundation Prosody based audio/visual co-analysis for co-verbal gesture recognition
US20040113933A1 (en) * 2002-10-08 2004-06-17 Northrop Grumman Corporation Split and merge behavior analysis and understanding using Hidden Markov Models
US7576727B2 (en) 2002-12-13 2009-08-18 Matthew Bell Interactive directed light/sound system
JP4235729B2 (en) 2003-02-03 2009-03-11 National University Corporation Shizuoka University Distance image sensor
US9177387B2 (en) 2003-02-11 2015-11-03 Sony Computer Entertainment Inc. Method and apparatus for real time motion capture
DE602004006190T8 (en) 2003-03-31 2008-04-10 Honda Motor Co., Ltd. Device, method and program for gesture recognition
US8072470B2 (en) 2003-05-29 2011-12-06 Sony Computer Entertainment Inc. System and method for providing a real-time three-dimensional interactive environment
US7620202B2 (en) 2003-06-12 2009-11-17 Honda Motor Co., Ltd. Target orientation estimation using depth sensing
US7536032B2 (en) 2003-10-24 2009-05-19 Reactrix Systems, Inc. Method and system for processing captured image information in an interactive video display system
US7308112B2 (en) 2004-05-14 2007-12-11 Honda Motor Co., Ltd. Sign based human-machine interaction
US7704135B2 (en) 2004-08-23 2010-04-27 Harrison Jr Shelton E Integrated game system, method, and device
KR20060070280A (en) 2004-12-20 2006-06-23 Electronics and Telecommunications Research Institute Apparatus and method for a user interface using hand gesture recognition
JP2008537190A (en) 2005-01-07 2008-09-11 GestureTek, Inc. Generation of a three-dimensional image of an object by irradiating it with an infrared pattern
EP3693889A3 (en) 2005-01-07 2020-10-28 QUALCOMM Incorporated Detecting and tracking objects in images
EP1849123A2 (en) 2005-01-07 2007-10-31 GestureTek, Inc. Optical flow based tilt sensor
JP5631535B2 (en) 2005-02-08 2014-11-26 Oblong Industries, Inc. System and method for a gesture-based control system
US7492367B2 (en) * 2005-03-10 2009-02-17 Motus Corporation Apparatus, system and method for interpreting and reproducing physical motion
US7317836B2 (en) * 2005-03-17 2008-01-08 Honda Motor Co., Ltd. Pose estimation based on critical point analysis
KR101430761B1 (en) 2005-05-17 2014-08-19 Qualcomm Incorporated Orientation-sensitive signal output
EP1752748B1 (en) 2005-08-12 2008-10-29 MESA Imaging AG Highly sensitive, fast pixel for use in an image sensor
US20080026838A1 (en) * 2005-08-22 2008-01-31 Dunstan James E Multi-player non-role-playing virtual world games: method for two-way interaction between participants and multi-player virtual world games
JP4262726B2 (en) 2005-08-24 2009-05-13 Nintendo Co., Ltd. Game controller and game system
US7450736B2 (en) 2005-10-28 2008-11-11 Honda Motor Co., Ltd. Monocular tracking of 3D human motion with a coordinated mixture of factor analyzers
US7988558B2 (en) 2006-04-27 2011-08-02 Nintendo Co., Ltd. Game apparatus and storage medium storing game program
JP4679431B2 (en) 2006-04-28 2011-04-27 Nintendo Co., Ltd. Sound output control program and sound output control device
US7701439B2 (en) 2006-07-13 2010-04-20 Northrop Grumman Corporation Gesture recognition simulation system and method
JP5051822B2 (en) 2006-08-02 2012-10-17 Nintendo Co., Ltd. Game device with general-purpose remote control function
JP5395323B2 (en) 2006-09-29 2014-01-22 Brainvision Inc. Solid-state image sensor
JP4926799B2 (en) * 2006-10-23 2012-05-09 Canon Inc. Information processing apparatus and information processing method
US7412077B2 (en) 2006-12-29 2008-08-12 Motorola, Inc. Apparatus and methods for head pose estimation and head gesture detection
US7729530B2 (en) 2007-03-03 2010-06-01 Sergey Antonov Method and apparatus for 3-D data input to a personal computer with a multimedia oriented operating system
US7852262B2 (en) 2007-08-16 2010-12-14 Cybernet Systems Corporation Wireless mobile indoor/outdoor tracking system
TWI338241B (en) 2007-08-23 2011-03-01 Pixart Imaging Inc Interactive image system, interactive device and operative method thereof
US9292092B2 (en) 2007-10-30 2016-03-22 Hewlett-Packard Development Company, L.P. Interactive display system with collaborative gesture detection
US20090221368A1 (en) 2007-11-28 2009-09-03 Ailive Inc. Method and system for creating a shared game space for a networked game
CN101254344B (en) 2008-04-18 2010-06-16 Li Gang Game device, and method, in which the field orientation corresponds in proportion to the display screen dot array
CN102282528A (en) 2008-11-14 2011-12-14 Sony Computer Entertainment Inc. Operating device
US9740187B2 (en) 2012-11-21 2017-08-22 Microsoft Technology Licensing, Llc Controlling hardware in an environment

Patent Citations (102)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4288078A (en) * 1979-11-20 1981-09-08 Lugo Julio I Game apparatus
US4695953A (en) * 1983-08-25 1987-09-22 Blair Preston E TV animation interactively controlled by the viewer
US4630910A (en) * 1984-02-16 1986-12-23 Robotic Vision Systems, Inc. Method of measuring in three-dimensions at high speed
US4627620A (en) * 1984-12-26 1986-12-09 Yang John P Electronic athlete trainer for improving skills in reflex, speed and accuracy
US4645458A (en) * 1985-04-15 1987-02-24 Harald Phillip Athletic evaluation and training apparatus
US4702475A (en) * 1985-08-16 1987-10-27 Innovating Training Products, Inc. Sports technique and reaction training system
US4843568A (en) * 1986-04-11 1989-06-27 Krueger Myron W Real time perception of and response to the actions of an unencumbered participant/user
US4711543A (en) * 1986-04-14 1987-12-08 Blair Preston E TV animation interactively controlled by the viewer
US4796997A (en) * 1986-05-27 1989-01-10 Synthetic Vision Systems, Inc. Method and system for high-speed, 3-D imaging of an object at a vision station
US5184295A (en) * 1986-05-30 1993-02-02 Mann Ralph V System and method for teaching physical skills
US4751642A (en) * 1986-08-29 1988-06-14 Silva John M Interactive sports simulation system with physiological sensing and psychological conditioning
US4809065A (en) * 1986-12-01 1989-02-28 Kabushiki Kaisha Toshiba Interactive system and related method for displaying data to produce a three-dimensional image of an object
US4817950A (en) * 1987-05-08 1989-04-04 Goo Paul E Video game control unit and attitude sensor
US5109537A (en) * 1988-06-24 1992-04-28 Kabushiki Kaisha Toshiba Telecommunication apparatus having an id rechecking function
US5239464A (en) * 1988-08-04 1993-08-24 Blair Preston E Interactive video system providing repeated switching of multiple tracks of actions sequences
US5239463A (en) * 1988-08-04 1993-08-24 Blair Preston E Method and apparatus for player interaction with animated characters and objects
US4901362A (en) * 1988-08-08 1990-02-13 Raytheon Company Method of recognizing patterns
US4893183A (en) * 1988-08-11 1990-01-09 Carnegie-Mellon University Robotic vision system
US5288078A (en) * 1988-10-14 1994-02-22 David G. Capper Control interface apparatus
US4925189A (en) * 1989-01-13 1990-05-15 Braeunig Thomas F Body-mounted video game exercise device
US5229756A (en) * 1989-02-07 1993-07-20 Yamaha Corporation Image control apparatus
US5139261A (en) * 1989-09-15 1992-08-18 Openiano Renato M Foot-actuated computer game controller serving as a joystick
US5229754A (en) * 1990-02-13 1993-07-20 Yazaki Corporation Automotive reflection type display apparatus
US5101444A (en) * 1990-05-18 1992-03-31 Panacea, Inc. Method and apparatus for high speed object location
US5148154A (en) * 1990-12-04 1992-09-15 Sony Corporation Of America Multi-dimensional user interface
US5295491A (en) * 1991-09-26 1994-03-22 Sam Technology, Inc. Non-invasive human neurocognitive performance capability testing method and system
US5320538A (en) * 1992-09-23 1994-06-14 Hughes Training, Inc. Interactive aircraft training system and method
US5525901A (en) * 1993-02-02 1996-06-11 Beaudreau Electric, Inc. Sensor systems for monitoring and measuring angular position in two or three axes
US5405152A (en) * 1993-06-08 1995-04-11 The Walt Disney Company Method and apparatus for an interactive video game with physical feedback
US5347306A (en) * 1993-12-17 1994-09-13 Mitsubishi Electric Research Laboratories, Inc. Animated electronic meeting place
US5615132A (en) * 1994-01-21 1997-03-25 Crossbow Technology, Inc. Method and apparatus for determining position and orientation of a moveable object using accelerometers
US5385519A (en) * 1994-04-19 1995-01-31 Hsu; Chi-Hsueh Running machine
US5864808A (en) * 1994-04-25 1999-01-26 Hitachi, Ltd. Erroneous input processing method and apparatus in information processing system using composite input
US5757360A (en) * 1995-05-03 1998-05-26 Mitsubishi Electric Information Technology Center America, Inc. Hand held computer control device
US5611731A (en) * 1995-09-08 1997-03-18 Thrustmaster, Inc. Video pinball machine controller having an optical accelerometer for detecting slide and tilt
US6072467A (en) * 1996-05-03 2000-06-06 Mitsubishi Electric Information Technology Center America, Inc. (Ita) Continuously variable control of animated on-screen characters
US5909189A (en) * 1996-11-14 1999-06-01 Raytheon Company Group tracking
US20020004422A1 (en) * 1997-01-30 2002-01-10 Kabusiki Kaisha Sega Enterprises Input device, game device, and method and recording medium for same
US20080122786A1 (en) * 1997-08-22 2008-05-29 Pryor Timothy R Advanced video gaming methods for education and play using camera based inputs
US20050085298A1 (en) * 1997-11-25 2005-04-21 Woolston Thomas G. Electronic sword game with input and feedback
US6181343B1 (en) * 1997-12-23 2001-01-30 Philips Electronics North America Corp. System and method for permitting three-dimensional navigation through a virtual reality environment using camera-based gesture inputs
US6195104B1 (en) * 1997-12-23 2001-02-27 Philips Electronics North America Corp. System and method for permitting three-dimensional navigation through a virtual reality environment using camera-based gesture inputs
US6067077A (en) * 1998-04-10 2000-05-23 Immersion Corporation Position sensing for force feedback devices
US6269172B1 (en) * 1998-04-13 2001-07-31 Compaq Computer Corporation Method for tracking the motion of a 3-D figure
US6542621B1 (en) * 1998-08-31 2003-04-01 Texas Instruments Incorporated Method of dealing with occlusion when tracking multiple objects and people in video sequences
US6509889B2 (en) * 1998-12-03 2003-01-21 International Business Machines Corporation Method and apparatus for enabling the adaptation of the input parameters for a computer system pointing device
US6545661B1 (en) * 1999-06-21 2003-04-08 Midway Amusement Games, Llc Video game system having a control unit with an accelerometer for controlling a video game
US6347998B1 (en) * 1999-06-30 2002-02-19 Konami Co., Ltd. Game system and computer-readable recording medium
US7070500B1 (en) * 1999-09-07 2006-07-04 Konami Corporation Musical player-motion sensing game system
US6795567B1 (en) * 1999-09-16 2004-09-21 Hewlett-Packard Development Company, L.P. Method for efficiently tracking object models in video sequences via dynamic ordering of features
US20020072418A1 (en) * 1999-10-04 2002-06-13 Nintendo Co., Ltd. Portable game apparatus with acceleration sensor and information storage medium storing a game program
US6375572B1 (en) * 1999-10-04 2002-04-23 Nintendo Co., Ltd. Portable game apparatus with acceleration sensor and information storage medium storing a game program
US20040204240A1 (en) * 2000-02-22 2004-10-14 Barney Jonathan A. Magical wand and interactive play experience
US20020055383A1 (en) * 2000-02-24 2002-05-09 Namco Ltd. Game system and program
US20020019258A1 (en) * 2000-05-31 2002-02-14 Kim Gerard Jounghyun Methods and apparatus of displaying and evaluating motion data in a motion game apparatus
US6753879B1 (en) * 2000-07-03 2004-06-22 Intel Corporation Creating overlapping real and virtual images
US20020041327A1 (en) * 2000-07-24 2002-04-11 Evan Hildreth Video-based image control system
US20060098873A1 (en) * 2000-10-03 2006-05-11 Gesturetek, Inc., A Delaware Corporation Multiple camera control system
US7095401B2 (en) * 2000-11-02 2006-08-22 Siemens Corporate Research, Inc. System and method for gesture interface
US6600475B2 (en) * 2001-01-22 2003-07-29 Koninklijke Philips Electronics N.V. Single camera system for gesture-based input and target indication
US6888960B2 (en) * 2001-03-28 2005-05-03 Nec Corporation Fast optimal linear approximation of the images of variably illuminated solid objects for recognition
US6804396B2 (en) * 2001-03-28 2004-10-12 Honda Giken Kogyo Kabushiki Kaisha Gesture recognition system
US20030040350A1 (en) * 2001-08-22 2003-02-27 Masayasu Nakata Game system, puzzle game program, and storage medium having program stored therein
US7094147B2 (en) * 2001-08-22 2006-08-22 Nintendo Co., Ltd. Game system, puzzle game program, and storage medium having program stored therein
US7007236B2 (en) * 2001-09-14 2006-02-28 Accenture Global Services Gmbh Lab window collaboration
US20060092267A1 (en) * 2001-09-14 2006-05-04 Accenture Global Services Gmbh Lab window collaboration
US20040155902A1 (en) * 2001-09-14 2004-08-12 Dempski Kelly L. Lab window collaboration
US7148913B2 (en) * 2001-10-12 2006-12-12 Hrl Laboratories, Llc Vision-based pointer tracking and object classification method and apparatus
US7823089B2 (en) * 2002-02-07 2010-10-26 Microsoft Corporation Manipulating objects displayed on a display screen
US7721231B2 (en) * 2002-02-07 2010-05-18 Microsoft Corporation Controlling an object within an environment using a pointing device
US20030156756A1 (en) * 2002-02-15 2003-08-21 Gokturk Salih Burak Gesture recognition system using depth perceptive sensors
US20040005083A1 (en) * 2002-03-26 2004-01-08 Kikuo Fujimura Real-time eye detection and tracking under various light conditions
US20070252898A1 (en) * 2002-04-05 2007-11-01 Bruno Delean Remote control apparatus using gesture recognition
US20030216179A1 (en) * 2002-05-17 2003-11-20 Toshiaki Suzuki Game device changing sound and an image in accordance with a tilt operation
US20050239548A1 (en) * 2002-06-27 2005-10-27 Hiromu Ueshima Information processor having input system using stroboscope
US20040001113A1 (en) * 2002-06-28 2004-01-01 John Zipperer Method and apparatus for spline-based trajectory classification, gesture detection and localization
US20100146455A1 (en) * 2003-03-25 2010-06-10 Microsoft Corporation Architecture For Controlling A Computer Using Hand Gestures
US20040189720A1 (en) * 2003-03-25 2004-09-30 Wilson Andrew D. Architecture for controlling a computer using hand gestures
US7665041B2 (en) * 2003-03-25 2010-02-16 Microsoft Corporation Architecture for controlling a computer using hand gestures
US20100138798A1 (en) * 2003-03-25 2010-06-03 Wilson Andrew D System and method for executing a game process
US20130190089A1 (en) * 2003-03-25 2013-07-25 Andrew Wilson System and method for executing a game process
US20100146464A1 (en) * 2003-03-25 2010-06-10 Microsoft Corporation Architecture For Controlling A Computer Using Hand Gestures
US20040194129A1 (en) * 2003-03-31 2004-09-30 Carlbom Ingrid Birgitta Method and apparatus for intelligent and automatic sensor control using multimedia database system
US7372977B2 (en) * 2003-05-29 2008-05-13 Honda Motor Co., Ltd. Visual tracking using depth data
US7038661B2 (en) * 2003-06-13 2006-05-02 Microsoft Corporation Pointing device and cursor for use in intelligent computing environments
US20060007142A1 (en) * 2003-06-13 2006-01-12 Microsoft Corporation Pointing device and cursor for use in intelligent computing environments
US20050076161A1 (en) * 2003-10-03 2005-04-07 Amro Albanna Input system and method
US20050151850A1 (en) * 2004-01-14 2005-07-14 Korea Institute Of Science And Technology Interactive presentation system
US20050255434A1 (en) * 2004-02-27 2005-11-17 University Of Florida Research Foundation, Inc. Interactive virtual characters for training including medical diagnosis training
US20050212753A1 (en) * 2004-03-23 2005-09-29 Marvit David L Motion controlled remote controller
US20050238201A1 (en) * 2004-04-15 2005-10-27 Atid Shamaie Tracking bimanual movements
US20080193043A1 (en) * 2004-06-16 2008-08-14 Microsoft Corporation Method and system for reducing effects of undesired signals in an infrared imaging system
US20060036944A1 (en) * 2004-08-10 2006-02-16 Microsoft Corporation Surface UI for gesture-based interaction
US20060178212A1 (en) * 2004-11-23 2006-08-10 Hillcrest Laboratories, Inc. Semantic gaming and application transformation
US20110081969A1 (en) * 2005-08-22 2011-04-07 Akio Ikeda Video game system with wireless modular handheld controller
US20070060383A1 (en) * 2005-09-14 2007-03-15 Nintendo Co., Ltd. Video game program and video game system
US7927216B2 (en) * 2005-09-15 2011-04-19 Nintendo Co., Ltd. Video game system with wireless modular handheld controller
US20110172015A1 (en) * 2005-09-15 2011-07-14 Nintendo Co., Ltd. Video game system with wireless modular handheld controller
US20100113153A1 (en) * 2006-07-14 2010-05-06 Ailive, Inc. Self-Contained Inertial Navigation System for Interactive Control Using Movable Controllers
US20080036732A1 (en) * 2006-08-08 2008-02-14 Microsoft Corporation Virtual Controller For Visual Displays
US20090121894A1 (en) * 2007-11-14 2009-05-14 Microsoft Corporation Magic wand
US20110124410A1 (en) * 2009-11-20 2011-05-26 Xiaodong Mao Controller for interfacing with a computing program using position, orientation, or motion

Cited By (85)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8745541B2 (en) 2003-03-25 2014-06-03 Microsoft Corporation Architecture for controlling a computer using hand gestures
US20040193413A1 (en) * 2003-03-25 2004-09-30 Wilson Andrew D. Architecture for controlling a computer using hand gestures
US20080141181A1 (en) * 2006-12-07 2008-06-12 Kabushiki Kaisha Toshiba Information processing apparatus, information processing method, and program
US11322171B1 (en) 2007-12-17 2022-05-03 Wai Wu Parallel signal processing system and method
US9465980B2 (en) 2009-01-30 2016-10-11 Microsoft Technology Licensing, Llc Pose tracking pipeline
US12105887B1 (en) 2009-05-21 2024-10-01 Golden Edge Holding Corporation Gesture recognition systems
US9417700B2 (en) 2009-05-21 2016-08-16 Edge3 Technologies Gesture recognition systems and related methods
US11703951B1 (en) 2009-05-21 2023-07-18 Edge 3 Technologies Gesture recognition systems
US20110182471A1 (en) * 2009-11-30 2011-07-28 Abisee, Inc. Handling information flow in printed text processing
US9542005B2 (en) 2010-02-25 2017-01-10 Hewlett-Packard Development Company, L.P. Representative image
US9170666B2 (en) 2010-02-25 2015-10-27 Hewlett-Packard Development Company, L.P. Representative image
WO2011115706A3 (en) * 2010-03-15 2011-11-17 Empire Technology Development Llc Selective motor control classification
CN102687161A (en) * 2010-03-15 2012-09-19 英派尔科技开发有限公司 Selective motor control classification
WO2011115706A2 (en) * 2010-03-15 2011-09-22 Empire Technology Development Llc Selective motor control classification
US20110221770A1 (en) * 2010-03-15 2011-09-15 Ezekiel Kruglick Selective motor control classification
KR101431351B1 (en) 2010-03-15 2014-08-20 Empire Technology Development LLC Selective motor control classification
US8614663B2 (en) * 2010-03-15 2013-12-24 Empire Technology Development, Llc Selective motor control classification
US8457353B2 (en) 2010-05-18 2013-06-04 Microsoft Corporation Gestures and gesture modifiers for manipulating a user-interface
US9152853B2 (en) 2010-05-20 2015-10-06 Edge 3 Technologies, Inc. Gesture recognition in vehicles
US8396252B2 (en) 2010-05-20 2013-03-12 Edge 3 Technologies Systems and related methods for three dimensional gesture recognition in vehicles
US9891716B2 (en) 2010-05-20 2018-02-13 Microsoft Technology Licensing, Llc Gesture recognition in vehicles
US8625855B2 (en) 2010-05-20 2014-01-07 Edge 3 Technologies Llc Three dimensional gesture recognition in vehicles
US8730164B2 (en) 2010-05-28 2014-05-20 Panasonic Corporation Gesture recognition apparatus and method of gesture recognition
US8467599B2 (en) 2010-09-02 2013-06-18 Edge 3 Technologies, Inc. Method and apparatus for confusion learning
US11967083B1 (en) 2010-09-02 2024-04-23 Golden Edge Holding Corporation Method and apparatus for performing segmentation of an image
US11398037B2 (en) 2010-09-02 2022-07-26 Edge 3 Technologies Method and apparatus for performing segmentation of an image
US10586334B2 (en) 2010-09-02 2020-03-10 Edge 3 Technologies, Inc. Apparatus and method for segmenting an image
US11710299B2 (en) 2010-09-02 2023-07-25 Edge 3 Technologies Method and apparatus for employing specialist belief propagation networks
US9990567B2 (en) 2010-09-02 2018-06-05 Edge 3 Technologies, Inc. Method and apparatus for spawning specialist belief propagation networks for adjusting exposure settings
US9723296B2 (en) 2010-09-02 2017-08-01 Edge 3 Technologies, Inc. Apparatus and method for determining disparity of textured regions
US12087044B2 (en) 2010-09-02 2024-09-10 Golden Edge Holding Corporation Method and apparatus for employing specialist belief propagation networks
US11023784B2 (en) 2010-09-02 2021-06-01 Edge 3 Technologies, Inc. Method and apparatus for employing specialist belief propagation networks
US8644599B2 (en) 2010-09-02 2014-02-04 Edge 3 Technologies, Inc. Method and apparatus for spawning specialist belief propagation networks
US8798358B2 (en) 2010-09-02 2014-08-05 Edge 3 Technologies, Inc. Apparatus and method for disparity map generation
US10909426B2 (en) 2010-09-02 2021-02-02 Edge 3 Technologies, Inc. Method and apparatus for spawning specialist belief propagation networks for adjusting exposure settings
US8891859B2 (en) 2010-09-02 2014-11-18 Edge 3 Technologies, Inc. Method and apparatus for spawning specialist belief propagation networks based upon data classification
US8983178B2 (en) 2010-09-02 2015-03-17 Edge 3 Technologies, Inc. Apparatus and method for performing segment-based disparity decomposition
US8655093B2 (en) 2010-09-02 2014-02-18 Edge 3 Technologies, Inc. Method and apparatus for performing segmentation of an image
US8666144B2 (en) 2010-09-02 2014-03-04 Edge 3 Technologies, Inc. Method and apparatus for determining disparity of texture
US20120102400A1 (en) * 2010-10-22 2012-04-26 Microsoft Corporation Touch Gesture Notification Dismissal Techniques
US20120121123A1 (en) * 2010-11-11 2012-05-17 Hsieh Chang-Tai Interactive device and method thereof
US20120139907A1 (en) * 2010-12-06 2012-06-07 Samsung Electronics Co., Ltd. 3 dimensional (3d) display system of responding to user motion and user interface for the 3d display system
US9123316B2 (en) * 2010-12-27 2015-09-01 Microsoft Technology Licensing, Llc Interactive content creation
US9529566B2 (en) 2010-12-27 2016-12-27 Microsoft Technology Licensing, Llc Interactive content creation
US9323395B2 (en) 2011-02-10 2016-04-26 Edge 3 Technologies Near touch interaction with structured light
US10599269B2 (en) 2011-02-10 2020-03-24 Edge 3 Technologies, Inc. Near touch interaction
US8970589B2 (en) 2011-02-10 2015-03-03 Edge 3 Technologies, Inc. Near-touch interaction with a stereo camera grid structured tessellations
US9652084B2 (en) 2011-02-10 2017-05-16 Edge 3 Technologies, Inc. Near touch interaction
US10061442B2 (en) 2011-02-10 2018-08-28 Edge 3 Technologies, Inc. Near touch interaction
US8582866B2 (en) 2011-02-10 2013-11-12 Edge 3 Technologies, Inc. Method and apparatus for disparity computation in stereo images
CN109669553A (en) * 2011-02-28 2019-04-23 STMicroelectronics (R&D) Ltd. Improvements in or relating to optical navigation devices
US20130010207A1 (en) * 2011-07-04 2013-01-10 3Divi Gesture based interactive control of electronic equipment
US8896522B2 (en) 2011-07-04 2014-11-25 3Divi Company User-centric three-dimensional interactive control environment
US8823642B2 (en) 2011-07-04 2014-09-02 3Divi Company Methods and systems for controlling devices using gestures and related 3D sensor
US8705877B1 (en) 2011-11-11 2014-04-22 Edge 3 Technologies, Inc. Method and apparatus for fast computational stereo
US8718387B1 (en) 2011-11-11 2014-05-06 Edge 3 Technologies, Inc. Method and apparatus for enhanced stereo vision
US11455712B2 (en) 2011-11-11 2022-09-27 Edge 3 Technologies Method and apparatus for enhancing stereo vision
US10037602B2 (en) 2011-11-11 2018-07-31 Edge 3 Technologies, Inc. Method and apparatus for enhancing stereo vision
US8761509B1 (en) 2011-11-11 2014-06-24 Edge 3 Technologies, Inc. Method and apparatus for fast computational stereo
US9672609B1 (en) 2011-11-11 2017-06-06 Edge 3 Technologies, Inc. Method and apparatus for improved depth-map estimation
US9324154B2 (en) 2011-11-11 2016-04-26 Edge 3 Technologies Method and apparatus for enhancing stereo vision through image segmentation
US10825159B2 (en) 2011-11-11 2020-11-03 Edge 3 Technologies, Inc. Method and apparatus for enhancing stereo vision
US20130121528A1 (en) * 2011-11-14 2013-05-16 Sony Corporation Information presentation device, information presentation method, information presentation system, information registration device, information registration method, information registration system, and program
US8948451B2 (en) * 2011-11-14 2015-02-03 Sony Corporation Information presentation device, information presentation method, information presentation system, information registration device, information registration method, information registration system, and program
US11543891B2 (en) 2011-11-23 2023-01-03 Intel Corporation Gesture input with multiple views, displays and physics
US10963062B2 (en) 2011-11-23 2021-03-30 Intel Corporation Gesture input with multiple views, displays and physics
US12061745B2 (en) 2011-11-23 2024-08-13 Intel Corporation Gesture input with multiple views, displays and physics
US9596643B2 (en) 2011-12-16 2017-03-14 Microsoft Technology Licensing, Llc Providing a user interface experience based on inferred vehicle state
US8811938B2 (en) 2011-12-16 2014-08-19 Microsoft Corporation Providing a user interface experience based on inferred vehicle state
US9498885B2 (en) 2013-02-27 2016-11-22 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with confidence-based decision support
US9393695B2 (en) 2013-02-27 2016-07-19 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with person and object discrimination
US9804576B2 (en) 2013-02-27 2017-10-31 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with position and derivative decision reference
US9731421B2 (en) 2013-02-27 2017-08-15 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with person and object discrimination
US9798302B2 (en) 2013-02-27 2017-10-24 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with redundant system input support
US11297150B2 (en) 2013-03-15 2022-04-05 Verizon Media Inc. Method and system for measuring user engagement using click/skip in content stream
US20140280890A1 (en) * 2013-03-15 2014-09-18 Yahoo! Inc. Method and system for measuring user engagement using scroll dwell time
US11206311B2 (en) 2013-03-15 2021-12-21 Verizon Media Inc. Method and system for measuring user engagement using click/skip in content stream
US10721448B2 (en) 2013-03-15 2020-07-21 Edge 3 Technologies, Inc. Method and apparatus for adaptive exposure bracketing, segmentation and scene organization
US10491694B2 (en) 2013-03-15 2019-11-26 Oath Inc. Method and system for measuring user engagement using click/skip in content stream using a probability model
WO2014185808A1 (en) * 2013-05-13 2014-11-20 3Divi Company System and method for controlling multiple electronic devices
US11497961B2 (en) 2019-03-05 2022-11-15 Physmodo, Inc. System and method for human motion detection and tracking
US11547324B2 (en) 2019-03-05 2023-01-10 Physmodo, Inc. System and method for human motion detection and tracking
US11771327B2 (en) 2019-03-05 2023-10-03 Physmodo, Inc. System and method for human motion detection and tracking
US11826140B2 (en) 2019-03-05 2023-11-28 Physmodo, Inc. System and method for human motion detection and tracking
US11331006B2 (en) 2019-03-05 2022-05-17 Physmodo, Inc. System and method for human motion detection and tracking

Also Published As

Publication number Publication date
US9652042B2 (en) 2017-05-16
US20100146455A1 (en) 2010-06-10
US7665041B2 (en) 2010-02-16
US20040189720A1 (en) 2004-09-30
US20100146464A1 (en) 2010-06-10

Similar Documents

Publication Publication Date Title
US9652042B2 (en) Architecture for controlling a computer using hand gestures
US10551930B2 (en) System and method for executing a process using accelerometer signals
US8115732B2 (en) Virtual controller for visual displays
US20190187801A1 (en) Detecting, representing, and interpreting three-space input: gestural continuum subsuming freespace, proximal, and surface-contact modes
US20180203520A1 (en) Operating environment comprising multiple client devices, multiple displays, multiple users, and gestural control
US20180173313A1 (en) Detecting, representing, and interpreting three-space input: gestural continuum subsuming freespace, proximal, and surface-contact modes
EP2344983B1 (en) Method, apparatus and computer program product for providing adaptive gesture analysis
US10642364B2 (en) Processing tracking and recognition data in gestural recognition systems
US9971491B2 (en) Gesture library for natural user input
US20100281440A1 (en) Detecting, Representing, and Interpreting Three-Space Input: Gestural Continuum Subsuming Freespace, Proximal, and Surface-Contact Modes
EP2427857A1 (en) Gesture-based control systems including the representation, manipulation, and exchange of data
WO2013184704A1 (en) Spatial operating environment (soe) with markerless gestural control
Kjeldsen et al. Design issues for vision-based computer interaction systems
Wilson et al. GWindows: Towards robust perception-based UI
Wilson et al. Multimodal sensing for explicit and implicit interaction
Kölsch et al. Touching the visualized invisible: Wearable AR with a multimodal interface

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034564/0001

Effective date: 20141014

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCV Information on status: appeal procedure

Free format text: NOTICE OF APPEAL FILED

STCV Information on status: appeal procedure

Free format text: APPEAL BRIEF (OR SUPPLEMENTAL BRIEF) ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE