WO2022066185A1 - Application gestures - Google Patents

Application gestures

Info

Publication number
WO2022066185A1
Authority
WO
WIPO (PCT)
Prior art keywords
application
user
video stream
physical
processor
Prior art date
Application number
PCT/US2020/053100
Other languages
French (fr)
Inventor
Raphael Dal ZOTTO
Original Assignee
Hewlett-Packard Development Company, L.P.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2020-09-28
Filing date: 2020-09-28
Publication date: 2022-03-31
Application filed by Hewlett-Packard Development Company, L.P.
Priority to PCT/US2020/053100
Publication of WO2022066185A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language


Abstract

In an example implementation according to aspects of the present disclosure, a system, method, and storage medium comprise a camera, a processor, a memory, and instructions to create application gestures. The system receives a video stream comprising physical gestures from a camera. The system identifies a physical gesture from the video stream. The system maps the physical gesture to an application command associated with an application. The system executes the application command as an input to an executing instance of the application.

Description

APPLICATION GESTURES
BACKGROUND
[0001] Interaction between an application and the user goes through a keyboard. Applications have commands that may be activated by keyboard key combinations that facilitate the utilization of features within the applications.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] FIG. 1 illustrates an application gesture system, according to an example;
[0003] FIG. 2 is a block diagram showing an application gesture system, according to an example;
[0004] FIG. 3 is a flow diagram showing an application gesture system, according to an example; and
[0005] FIG. 4 is a computing device for supporting instructions for an application gesture system, according to an example.
DETAILED DESCRIPTION
[0006] In the computing industry, touch-based input continues to be the dominant method by which a human provides input to a computing device. The keyboard and mouse have been mainstays in most computing devices, either as peripherals or integrated. Touch pads and touch screens have augmented the experience, but keyboard and mouse input still dominate input mechanisms. Certain industries may want to limit touch-based input mechanisms. For example, the medical industry may want to limit touch-based input mechanisms due to contamination and cleaning requirements. Described herein is an application gesture system.
[0007] In one implementation, a camera, a memory, and a processor execute instructions to receive a video stream comprising imaging of physical gestures from the camera. The processor identifies a physical gesture from the video stream. The processor maps the physical gesture to an application command associated with an application and executes the application command as an input to an executing instance of the application.
[0008] In another implementation, a processor receives a video stream from a camera and identifies a user from the video stream. The processor loads a mapping of physical gestures to application commands based on the identity of the user. The processor identifies a physical gesture from the video stream and maps the physical gesture to an application command. The processor executes the application command as an input to an executing instance of the application.
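To make the flow of the two implementations above easier to follow, here is a minimal Python sketch of the control loop, mirroring blocks 302-312 of FIG. 3. It is an editorial illustration only, not part of the disclosure: every helper function is a placeholder (possible implementations are sketched after the corresponding paragraphs below), and OpenCV (cv2) is assumed for camera access.

```python
"""Minimal sketch of the FIG. 3 flow; every helper below is a placeholder."""
import cv2  # OpenCV is assumed for camera access


def identify_user(frame) -> str:
    return "default-user"            # placeholder: see the facial-recognition sketch below


def load_mapping(user: str) -> dict:
    return {}                        # placeholder: see the per-user mapping sketch below


def identify_gesture(frame):
    return None                      # placeholder: see the segmentation sketch below


def execute_command(command) -> None:
    print("would send", command)     # placeholder: see the dispatch sketch below


def run(camera_index: int = 0) -> None:
    capture = cv2.VideoCapture(camera_index)         # 302: receive a video stream
    user, mapping = None, {}
    try:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:
                break
            if user is None:
                user = identify_user(frame)           # 304: identify the user
                mapping = load_mapping(user)          # 306: load that user's mapping
                continue
            gesture = identify_gesture(frame)         # 308: identify a physical gesture
            command = mapping.get(gesture)            # 310: map gesture to an application command
            if command is not None:
                execute_command(command)              # 312: execute as input to the application
    finally:
        capture.release()
```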
[0009] FIG. 1 illustrates an application gesture system 100, according to an example. FIG. 1 illustrates how the application gesture system 100 may appear to a user in one example. The system 100 may include a camera 102, a display 104, an application 108, and utilize a gesture 106 as an input.
[0010] The camera 102 may be implemented as a red green blue (RGB) camera device. In another implementation, the camera 102 may be an infrared (IR) camera. In some implementations the camera 102 may be a standalone device, such as a webcam, attached to a host computing device by way of a communication channel such as universal serial bus. In another implementation, the camera 102 may be integrated into the display 104 or chassis in the example of a laptop computing device. The camera 102 may be a digital camera with a charge-coupled device (CCD) image sensor for capturing images and video streams. The camera 102 may capture video streams including imaging of a user and of physical gestures.
[0011] The display 104 may be implemented with varying technologies including but not limited to cathode ray tubes (CRTs), liquid crystal displays (LCDs), organic light emitting diodes (OLEDs), and electronic paper. The display 104 may be implemented as a standalone monitor connected to a computing device via a display protocol such as DisplayPort™ (DISPLAYPORT is a trademark of Video Electronics Standards Association), or High Definition Multimedia Interface (HDMI™) (HDMI is a trademark of HDMI LICENSING, L.L.C., San Jose, California USA). In another implementation, the display 104 may be an integrated display panel connected to the processor 202 utilizing protocols and signaling across physical interfaces common for that purpose (e.g. low-voltage differential signaling (LVDS), OpenLDI, and FPD-Link). The display 104 may be utilized for rendering graphical representations of applications executing within a computing device. The display 104 renders the representations of applications responsive to user input.
[0012] A gesture 106 may be a physical movement of a user's body to communicate meaning. In one implementation, a gesture 106 may be a hand movement. A gesture 106 may be a series of hand movements. A gesture 106 may be a hand positioning with no movement. Simply stated, in this example a gesture 106 may be a discernible positioning of a hand within the field of view of the camera 102. In another example, a physical gesture 106 may correspond to another body part of a person including but not limited to head and facial features.
[0013] An application 108 may be a piece of software being utilized by the user. An application 108 may accept input from the user based on keystroke combinations. Other input may include mouse movements and mouse clicks. In other implementations, the application 108 may accept touchscreen input or touch pad input. The application 108 may include application specific commands that may be executed with input combinations. For example, a specific set of keys pressed at the same time may cause the application 108 to take a specific action. Common application specific commands may include pressing the "control" key and the "B" key at the same time to toggle a bold type font in a word processor. Another common application specific command may include pressing the space bar to unmute a microphone in a video conferencing application. In another implementation, an application 108 may include the operating system. Some operating systems allow for keyboard shortcuts and mouse interactions as inputs. The application gesture system may be able to provide gesture mappings as inputs to the operating system itself. For example, a specific gesture may be utilized to launch a new instance of another application such as a command prompt or terminal window.
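As an illustration of the mapping just described, the snippet below shows one possible shape for an "out of the box" gesture-to-command table, reusing the bold-toggle and unmute examples from the text plus a gesture routed to the operating system. The gesture descriptors, application names, and the terminal shortcut are editorial assumptions; the disclosure does not prescribe a data format.

```python
# One possible shape for an "out of the box" mapping (illustrative only; the
# disclosure does not prescribe a data format). Keys are gesture descriptors,
# values name the target application (or the operating system) and the input.
DEFAULT_MAPPING = {
    "two-finger-swipe-down": {"application": "word processor",     "keys": ["down"]},       # scroll down
    "two-finger-swipe-up":   {"application": "word processor",     "keys": ["up"]},         # scroll up
    "closed-fist":           {"application": "word processor",     "keys": ["ctrl", "b"]},  # toggle bold
    "open-palm":             {"application": "video conferencing", "keys": ["space"]},      # unmute
    "thumbs-up":             {"application": "operating system",   "keys": ["ctrl", "alt", "t"]},  # e.g. open a terminal
}
```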
[0014] During usage, the camera 102 may capture imaging of the user in the form of a video stream. A processor (not shown and described later) may detect a gesture 106 within the captured video stream. The processor may identify the gesture and then map the gesture to an associated application specific command. The mapping may be predefined "out of the box." In this implementation, the physical gesture 106 may be selected from a set of pre-configured gestures. For example, as illustrated in FIG. 1, a gesture 106 corresponding to a finger orientation of the index and middle fingers extended toward the display, the ring and little fingers curled toward the palm, and the thumb perpendicular to the index and middle fingers, combined with a downward movement, may be mapped to the "down arrow" on the keyboard for applications that support scrolling. Likewise, the same finger orientation with an upward movement may be mapped to the "up arrow" on the keyboard for applications that support scrolling. The instantiated and currently running application may be obtained by detecting focus through an application programming interface (API). Utilizing the gesture 106 and the application 108, the corresponding application command may be determined from the mapping. In one implementation, the application command may be executed utilizing an instantiated virtualized keyboard as an input mechanism.
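The paragraph above combines three steps: detect which application has focus, look up the (gesture, application) pair in the mapping, and inject the mapped keystroke. A minimal sketch follows; pyautogui is used as one possible stand-in for the "instantiated virtualized keyboard", and the focus query is left as a stub because the API named in the text is platform specific. The function names and the returned title are assumptions.

```python
import pyautogui  # sends synthetic keystrokes; one possible stand-in for a virtualized keyboard


def focused_application_title() -> str:
    """Placeholder for the focus-detection API mentioned in the text
    (for example GetForegroundWindow on Windows or an AT-SPI query on Linux)."""
    return "word processor"


def dispatch(gesture: str, mapping: dict) -> None:
    entry = mapping.get(gesture)
    if entry is None:
        return                                    # unmapped gesture: ignore it
    if entry["application"] not in focused_application_title().lower():
        return                                    # mapping targets a different application
    pyautogui.hotkey(*entry["keys"])              # e.g. hotkey("ctrl", "b") toggles bold


# dispatch("closed-fist", DEFAULT_MAPPING)  # would send Ctrl+B to the focused word processor
```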
[0015] In another implementation, a user may present their own gesture 106 to be mapped to a keystroke combination or other input. The system may be placed in a "recording mode." The recording mode may capture a calibration video stream during a user-defined period of time. The calibration video stream does not result in the execution of the mapping, but instead results in the capturing of a user-defined physical gesture. The processor may extract a gesture 106 from the calibration video stream. The processor may identify the user-defined gesture through a configuration application. The identification may include a name for the physical gesture to identify the physical gesture for subsequent reuse. The processor may prompt the user via the configuration application (not shown) for a keystroke and an application to create the mapping of the gesture 106 to the keystroke and the application. In practice, the user may click a record gesture button on the configuration application, make the intended gesture 106, click stop record, and then select the keystroke combination or other input and the application for which the gesture 106 may be intended. Likewise, multiple mappings of the same gesture 106 may be created, wherein the keystrokes and the applications may be different, but correspond to the same gesture 106.
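A possible sketch of the "recording mode" is shown below: it captures a calibration clip for a user-defined number of seconds and registers a named, user-defined gesture against an application and keystroke combination. The template-learning step is deliberately left out, and the function and field names are assumptions rather than part of the disclosure.

```python
import time
import cv2


def record_gesture(name: str, application: str, keys: list, mapping: dict,
                   seconds: float = 3.0, camera_index: int = 0) -> None:
    """Capture a calibration clip and register a user-defined gesture.

    The calibration stream is never matched against the mapping; it only
    supplies frames from which a gesture template could later be learned
    (that learning step is out of scope for this sketch).
    """
    capture = cv2.VideoCapture(camera_index)
    frames = []
    deadline = time.time() + seconds              # user-defined recording window
    while time.time() < deadline:
        ok, frame = capture.read()
        if ok:
            frames.append(frame)
    capture.release()

    # Add a new row: gesture descriptor -> application + keystroke combination.
    mapping[name] = {"application": application, "keys": keys, "samples": len(frames)}
```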
[0016] FIG. 2 is a block diagram showing an application gesture system, according to an example. Features illustrated in FIG. 2 may correspond with features illustrated in FIG. 1. As such, reference numbers may correspond to both FIG. 1 and FIG. 2, illustrating the relationship between the two examples.
[0017] As described in FIG. 1, a camera 102 may be utilized for capturing a video stream. The camera 102 may be connected to the processor 202 via a bus such as universal serial bus. The camera 102 may capture a video stream, where the video stream includes a user. The resolution of the camera 102 may provide enough pixels to discern high-contrast changes in an image, allowing the processor 202 to perform feature detection proficiently. Feature detection may be utilized for determining changes in movement, particularly of the hands, within the video stream.
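One inexpensive way to exploit the high-contrast changes mentioned above is frame differencing between consecutive grayscale frames, which localizes moving regions (such as a hand) before a finer gesture classifier runs. The sketch below uses OpenCV; it is an illustrative technique chosen by the editor, not the specific feature-detection method claimed.

```python
import cv2


def moving_regions(previous_gray, current_gray, min_area: int = 500):
    """Return bounding boxes of regions that changed between two grayscale frames."""
    delta = cv2.absdiff(previous_gray, current_gray)           # high-contrast changes between frames
    blurred = cv2.GaussianBlur(delta, (5, 5), 0)               # suppress sensor noise
    _, thresh = cv2.threshold(blurred, 25, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]


# Usage: convert each frame with cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), then
# boxes = moving_regions(previous_gray, current_gray)
```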
[0018] The processor 202 of the system 200 may be implemented as dedicated hardware circuitry or a virtualized logical processor. The dedicated hardware circuitry may be implemented as a central processing unit (CPU). A dedicated hardware CPU may be implemented as a single-core to many-core general purpose processor. A dedicated hardware CPU may also be implemented as a multi-chip solution, where more than one CPU is linked through a bus and processing tasks are scheduled across the CPUs.
[0019] A virtualized logical processor may be implemented across a distributed computing environment. A virtualized logical processor may not have a dedicated piece of hardware supporting it. Instead, the virtualized logical processor may have a pool of resources supporting the task for which it was provisioned. In this implementation, the virtualized logical processor may actually be executed on hardware circuitry; however, the hardware circuitry is not dedicated. The hardware circuitry may be in a shared environment where utilization is time sliced. In some implementations, the virtualized logical processor includes a software layer between any executing application and the hardware circuitry that handles the abstraction and also monitors and saves the application state. Virtual machines (VMs) may be implementations of virtualized logical processors.
[0020] A memory 204 may be implemented in the system 200. The memory 204 may be dedicated hardware circuitry to host instructions for the processor 202 to execute. In another implementation, the memory 204 may be virtualized logical memory. Analogous to the processor 202, dedicated hardware circuitry may be implemented with dynamic RAM (DRAM) or other hardware implementations for storing processor instructions. Additionally, the virtualized logical memory may be implemented in a software abstraction which allows the instructions 206 to be executed on a virtualized logical processor, independent of any dedicated hardware implementation.
[0021] The system 200 may also include instructions 206. The instructions 206 may be implemented in a platform specific language that the processor 202 may decode and execute. The instructions 206 may be stored in the memory 204 during execution. As described in reference to FIG. 1, the instructions 206 may be encoded to perform operations such as receive a video stream comprising imaging of physical gestures from a camera 208, identify a physical gesture from the video stream 210, map the physical gesture to an application command associated with an application 212, and execute the application command as an input to an executing instance of the application 214. In another implementation, multiple users may be able to utilize application gestures. The instructions 206 may also include identifying a user in the video stream. In another implementation, the instructions 206 may recognize a user in the video stream, wherein the recognizing corresponds to matching features short of knowing an identity of the user. Recognizing the user may provide more personal security for the user, as the features are not tied to an identity. The instructions 206 may also load a mapping of physical gestures to application commands based on the identity of the user. As the processor 202 may be able to identify physical gestures in the video stream, the processor may also be able to identify a user based on facial recognition. In one implementation, a user may install an application gesture system on a computing device. The application gesture system may go through a configuration routine where the video stream captures an image of the user's face. Through face detection software and labeling by the user in the configuration process, the processor 202 may be able to subsequently recognize the user's face in the video stream. In another implementation, operating system capabilities for facial detection for log on of a user may be utilized. During the configuration process, the application gesture system may default to an "out of the box" configuration incorporating a set of common physical gestures mapped to established keystroke combinations or other input for commonly used applications. Additionally, during the configuration process, the application gesture system may provide the user an opportunity to introduce user-defined physical gestures. Upon executing the "recording" feature described previously, the user may map their own user-defined physical gesture to a specific application and keystroke combination or other input for that application. As such, the user may have a robust, flexible, physical-gesture-enabled system with a mapping unique to that user. A second user may execute the same configuration process, resulting in a unique mapping for the second user.
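The per-user mapping described above could be persisted as one table per recognized user on non-volatile storage. The sketch below uses a JSON file per user under a hypothetical directory; the storage format, path, and function names are editorial assumptions, not part of the disclosure.

```python
import json
from pathlib import Path

# Hypothetical storage location; the disclosure only requires non-volatile storage.
MAPPING_DIR = Path.home() / ".application_gestures"


def load_mapping(user: str) -> dict:
    """Load the mapping table for a recognized user, or fall back to defaults."""
    path = MAPPING_DIR / f"{user}.json"
    if path.exists():
        return json.loads(path.read_text())
    return {}  # or the "out of the box" DEFAULT_MAPPING sketched earlier


def save_mapping(user: str, mapping: dict) -> None:
    """Persist a user's mapping table, e.g. after the recording mode adds a row."""
    MAPPING_DIR.mkdir(parents=True, exist_ok=True)
    (MAPPING_DIR / f"{user}.json").write_text(json.dumps(mapping, indent=2))
```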
[0022] FIG. 3 is a flow diagram showing an application gesture system, according to an example. References to FIG. 2 may be utilized to describe the features of FIG. 3.
[0023] At 302, the processor 202 receives a video stream from a camera. As described previously, the processor 202 may receive the video stream from the camera utilizing a communication channel such as universal serial bus.
[0024] At 304, the processor 202 identifies a user from the video stream. The video stream may be processed to extract features pertaining to a human face in the video stream. The feature extraction may be determined utilizing a third-party facial recognition library. Many facial recognition libraries are available in the public domain and enabled in a wide variety of computing languages. For example, the library "face-recognition" is implemented in Python and features algorithms to extract features to identify a user's face. Similarly, cloud-based services such as Amazon Rekognition® (AMAZON REKOGNITION is a registered trademark of Amazon Technologies Inc., Seattle WA, USA) may be used to implement facial recognition.
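Using the face-recognition library mentioned above, identifying the user at block 304 might look like the sketch below. The enrollment image path and user name are hypothetical; frames are assumed to arrive from OpenCV in BGR order and are converted to RGB, which the library expects.

```python
import cv2
import face_recognition  # the "face-recognition" Python library named in the text

# Enrollment: one reference photo per user (the path and name are hypothetical).
KNOWN_USERS = {
    "alice": face_recognition.face_encodings(
        face_recognition.load_image_file("alice.jpg"))[0],
}


def identify_user(frame):
    """Return the name of a recognized user in a BGR camera frame, or None."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)        # the library expects RGB images
    names = list(KNOWN_USERS)
    known = [KNOWN_USERS[n] for n in names]
    for encoding in face_recognition.face_encodings(rgb):
        matches = face_recognition.compare_faces(known, encoding)
        if True in matches:
            return names[matches.index(True)]
    return None
```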
[0025] At 306, the processor 202 loads a mapping of physical gestures to application commands based on the identity of the user. The mapping of physical gestures may be stored on a non-volatile storage device and loaded into memory 204 by the processor. The retrieval and loading may occur at the startup of a hosting computing device. The implementation of the application gesture system may correspond to a service that executes quietly (e.g. no visible notification to the user). The mapping may comprise a table. Each row in the table may include a descriptor for a physical gesture, an application name, and a keystroke combination or other input. Each user on the computing device may have their own mapping table. The correct table may be identified based on the identification of the user.
[0026] At 308, the processor 202 identifies a physical gesture from the video stream. The processor 202 may perform feature detection based on the motion within the video stream to identify a hand orientation and motion associated with the hand. In one implementation, the processor 202 may segment and classify the images coming from the video stream. In segmentation, the processor 202 may separate the background from a human body part (e.g. a hand) associated with a physical gesture. The processor 202 may apply a neural network model to identify the gesture within the classification. The OpenCV library for the Python programming language may be utilized for the classification. Additional implementations may utilize publicly available libraries such as OpenPose.
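A rough sketch of the segmentation and classification described at block 308 is shown below: a fixed skin-tone threshold separates a candidate hand region from the background, and the crop is handed to a classifier. Production systems would more likely use a trained segmentation or pose model (e.g. OpenPose) than a color threshold, and the `classify` callable is a placeholder for the neural network model mentioned in the text.

```python
import cv2
import numpy as np

# Very rough skin-tone range in HSV; a trained segmentation or pose model
# (e.g. OpenPose) would normally replace this fixed threshold.
SKIN_LOW = np.array([0, 40, 60], dtype=np.uint8)
SKIN_HIGH = np.array([25, 255, 255], dtype=np.uint8)


def segment_hand(frame):
    """Return a cropped candidate hand region from a BGR frame, or None."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, SKIN_LOW, SKIN_HIGH)                       # foreground vs. background
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))  # largest skin-colored blob
    return frame[y:y + h, x:x + w]


def identify_gesture(frame, classify):
    """`classify` stands in for the neural network model mentioned in the text;
    it maps a hand crop to a gesture descriptor such as "two-finger-swipe-down"."""
    crop = segment_hand(frame)
    return classify(crop) if crop is not None else None
```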
[0027] At 310, the processor 202 maps the physical gesture to an application command associated with an application based on the mapping. In one implementation, the application command may include a keystroke combination or other input that commands the application to execute an associated subroutine. For example, in a word processor, the "Control + U" command toggles the text underline feature. The application gesture system maps the identified physical gesture to a descriptor in the previously described table. The mapped row verifies the application and retrieves the application command or keystroke combination or other input.
[0028] At 312, the processor 202 executes the application command as an input to an executing instance of the application. The retrieved application command or keystroke combination may be executed via standard inputs to the computing device. In another implementation, a virtualized keyboard may be instantiated out of view of the user, wherein the application command may be input as a keystroke combination.
[0029] FIG. 4 is a computing device for supporting instructions for an application gesture system, according to an example. The computing device 400 depicts a processor 202 and a storage medium 404 and, as an example of the computing device 400 performing its operations, the storage medium 404 may include instructions 406-418 that are executable by the processor 202. The processor 202 may be synonymous with the processor 202 referenced in FIG. 2. Additionally, the processor 202 may include but is not limited to central processing units (CPUs). The storage medium 404 can be said to store program instructions that, when executed by processor 202, implement the components of the computing device 400.
[0030] The executable program instructions stored in the storage medium 404 include, as an example, instructions to receive a video stream from a camera 406, instructions to identify a user from the video stream 408, instructions to load a mapping of physical gestures to application commands based on the identity of the user 410, instructions to identify a physical gesture from the video stream 412, instructions to instantiate a virtualized keyboard 414, instructions to map the physical gesture to an application command associated with an application based on the mapping 416, and instructions to execute the application command as an input to an executing instance of the application via the virtualized keyboard 418.
[0031] Storage medium 404 represents generally any number of memory components capable of storing instructions that can be executed by processor 202. Storage medium 404 is non-transitory in the sense that it does not encompass a transitory signal but instead is made up of at least one memory component configured to store the relevant instructions. As a result, the storage medium 404 may be a non-transitory computer-readable storage medium. Storage medium 404 may be implemented in a single device or distributed across devices. Likewise, processor 202 represents any number of processors capable of executing instructions stored by storage medium 404. Processor 202 may be integrated in a single device or distributed across devices. Further, storage medium 404 may be fully or partially integrated in the same device as processor 202, or it may be separate but accessible to that computing device 400 and the processor 202.
[0032] In one example, the program instructions 406-418 may be part of an installation package that, when installed, can be executed by processor 202 to implement the components of the computing device 400. In this case, storage medium 404 may be a portable medium such as a CD, DVD, or flash drive, or a memory maintained by a server from which the installation package can be downloaded and installed. In another example, the program instructions may be part of an application or applications already installed. Here, storage medium 404 can include integrated memory such as a hard drive, solid state drive, or the like.
[0033] It is appreciated that examples described may include various components and features. It is also appreciated that numerous specific details are set forth to provide a thorough understanding of the examples. However, it is appreciated that the examples may be practiced without limitations to these specific details. In other instances, well known methods and structures may not be described in detail to avoid unnecessarily obscuring the description of the examples. Also, the examples may be used in combination with each other.
[0034] Reference in the specification to "an example" or similar language means that a particular feature, structure, or characteristic described in connection with the example is included in at least one example, but not necessarily in other examples. The various instances of the phrase "in one example" or similar phrases in various places in the specification are not necessarily all referring to the same example.
[0035] It is appreciated that the previous description of the disclosed examples is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these examples will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other examples without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the examples shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

WHAT IS CLAIMED IS: 1. A system comprising: a camera; a processor communicatively coupled to the camera; and a memory communicatively coupled to the processor and storing machine readable instructions that when executed cause the processor to: receive a video stream comprising imaging of physical gestures from a camera; identify a physical gesture from the video stream; map the physical gesture to an application command associated with an application; and execute the application command as an input to an executing instance of the application.
2. The system of claim 1 wherein the physical gestures comprise a set of preconfigured gestures.
3. The system of claim 1 wherein the application command comprises a keystroke combination.
4. The system of claim 1, the instructions when executed further cause the processor to: receive a calibration video stream comprising imaging of a user-defined physical gesture from a camera; and receive an identification of the user-defined physical gesture.
5. The system of claim 1, the instructions when executed further cause the processor to: identify a user in the video stream; and load a mapping of physical gestures to application commands based on the identity of the user.
6. A method comprising: receiving a video stream from a camera; identifying a user from the video stream; loading a mapping of physical gestures to application commands based on the identity of the user; identifying a physical gesture from the video stream; mapping the physical gesture to an application command associated with an application based on the mapping; and executing the application command as an input to an executing instance of the application.
7. The method of claim 6 further comprising: receiving a calibration video stream comprising imaging of a user-defined physical gesture from a camera; receiving an identification of the user-defined physical gesture; updating the mapping to include the identification and the application.
8. The method of claim 6 wherein the physical gestures comprise a set of preconfigured gestures.
9. The method of claim 6 wherein the application command comprises a keystroke combination.
10. The method of claim 6 wherein the physical gestures correspond to hand movements.
11. A non-transitory computer readable medium comprising machine readable instructions that when executed cause a processor to: receive a video stream from a camera; identify a user from the video stream; load a mapping of physical gestures to application commands based on the identity of the user; identify a physical gesture from the video stream; instantiate a virtualized keyboard; map the physical gesture to an application command associated with an application based on the mapping; and execute the application command as an input to an executing instance of the application via the virtualized keyboard.
12. The medium of claim 11, the instructions when executed further cause the processor to: receive a calibration video stream comprising imaging of a user-defined physical gesture from a camera; receive an identification of the user-defined physical gesture; update the mapping to include the identification and the application.
13. The medium of claim 11 wherein the physical gestures comprise a set of preconfigured gestures.
14. The medium of claim 11 wherein the application command comprises a keystroke combination.
15. The medium of claim 11 wherein the physical gestures correspond to hand movements.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2020/053100 WO2022066185A1 (en) 2020-09-28 2020-09-28 Application gestures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2020/053100 WO2022066185A1 (en) 2020-09-28 2020-09-28 Application gestures

Publications (1)

Publication Number Publication Date
WO2022066185A1 2022-03-31

Family

ID=80845731

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/053100 WO2022066185A1 (en) 2020-09-28 2020-09-28 Application gestures

Country Status (1)

Country Link
WO (1) WO2022066185A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110173574A1 (en) * 2010-01-08 2011-07-14 Microsoft Corporation In application gesture interpretation
US20150058729A1 (en) * 2011-01-06 2015-02-26 Tivo Inc. Method and apparatus for controls based on concurrent gestures
US20160117081A1 (en) * 2014-10-27 2016-04-28 Thales Avionics, Inc. Controlling entertainment system using combination of inputs from proximity sensor and touch sensor of remote controller
US20170017306A1 (en) * 2013-01-15 2017-01-19 Leap Motion, Inc. Dynamic, free-space user interactions for machine control
US20190025936A1 (en) * 2012-11-08 2019-01-24 Cuesta Technology Holdings, Llc Systems and methods for extensions to alternative control of touch-based devices

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110173574A1 (en) * 2010-01-08 2011-07-14 Microsoft Corporation In application gesture interpretation
US20150058729A1 (en) * 2011-01-06 2015-02-26 Tivo Inc. Method and apparatus for controls based on concurrent gestures
US20190025936A1 (en) * 2012-11-08 2019-01-24 Cuesta Technology Holdings, Llc Systems and methods for extensions to alternative control of touch-based devices
US20170017306A1 (en) * 2013-01-15 2017-01-19 Leap Motion, Inc. Dynamic, free-space user interactions for machine control
US20160117081A1 (en) * 2014-10-27 2016-04-28 Thales Avionics, Inc. Controlling entertainment system using combination of inputs from proximity sensor and touch sensor of remote controller

Similar Documents

Publication Publication Date Title
US10108310B2 (en) Method and apparatus for icon based application control
US9507417B2 (en) Systems and methods for implementing head tracking based graphical user interfaces (GUI) that incorporate gesture reactive interface objects
US11323658B2 (en) Display apparatus and control methods thereof
EP3129871B1 (en) Generating a screenshot
US20170255320A1 (en) Virtual input device using second touch-enabled display
US20180173614A1 (en) Technologies for device independent automated application testing
US20110219340A1 (en) System and method for point, select and transfer hand gesture based user interface
WO2020019616A1 (en) Touch control data processing method and device, intelligent device and storage medium
US9001050B2 (en) Touch screen emulation for a virtual machine
EP3869346B1 (en) Apparatus and method of recognizing external device in a communication system
US20120066624A1 (en) Method and apparatus for controlling movement of graphical user interface objects
US9710137B2 (en) Handedness detection
US20150310267A1 (en) Automated handwriting input for entry fields
US20160170632A1 (en) Interacting With Application Beneath Transparent Layer
WO2017049649A1 (en) Technologies for automated application exploratory testing
US20120162262A1 (en) Information processor, information processing method, and computer program product
US20150205360A1 (en) Table top gestures for mimicking mouse control
WO2022066185A1 (en) Application gestures
WO2023160453A1 (en) Display control method and apparatus, electronic device and storage medium
US9870061B2 (en) Input apparatus, input method and computer-executable program
US20170085784A1 (en) Method for image capturing and an electronic device using the method
US20140152540A1 (en) Gesture-based computer control
CN113117320A (en) Electronic device and method for triggering key macros by using external input signals
US20220129085A1 (en) Input device, input method, medium, and program
US20200183497A1 (en) Operation method and apparatus for service object, and electronic device

Legal Events

Date Code Title Description

Code 121 (Ep: the epo has been informed by wipo that ep was designated in this application)
Ref document number: 20955460; Country of ref document: EP; Kind code of ref document: A1

Code NENP (Non-entry into the national phase)
Ref country code: DE

Code 122 (Ep: pct application non-entry in european phase)
Ref document number: 20955460; Country of ref document: EP; Kind code of ref document: A1