WO2016154834A1 - Gesture matching mechanism - Google Patents

Gesture matching mechanism

Publication number
WO2016154834A1
WO2016154834A1 PCT/CN2015/075339 CN2015075339W WO2016154834A1 WO 2016154834 A1 WO2016154834 A1 WO 2016154834A1 CN 2015075339 W CN2015075339 W CN 2015075339W WO 2016154834 A1 WO2016154834 A1 WO 2016154834A1
Authority
WO
WIPO (PCT)
Prior art keywords
gesture
user
gestures
database
new
Prior art date
Application number
PCT/CN2015/075339
Inventor
Wenlong Li
Xiaolu Shen
Lidan ZHANG
Jose E. LORENZO
Qiang Li
Steven Holmes
Xiaofeng Tong
Yangzhou Du
Mary Smiley
Alok MISHRA
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation
Priority to EP15886821.6A (EP3278260B1)
Priority to US14/911,390 (US10803157B2)
Priority to PCT/CN2015/075339 (WO2016154834A1)
Priority to CN201580077135.0A (CN107615288B)
Publication of WO2016154834A1
Priority to US17/066,138 (US11449592B2)
Priority to US17/947,991 (US11841935B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/36 User authentication by graphic or iconic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/06 Authentication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/06 Authentication
    • H04W12/065 Continuous authentication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/60 Context-dependent security
    • H04W12/68 Gesture-dependent or behaviour-dependent

Definitions

  • Embodiments described herein generally relate to computers. More particularly, embodiments relate to a mechanism for gesture matching.
  • Authentication is implemented in computer security applications to confirm the identity of an individual who is attempting to use a computer system.
  • Common authentication systems may employ biometric (e.g., fingerprint and/or facial recognition) applications to authenticate a user.
  • Such systems may be subject to counterfeit measures.
  • Figure 1 illustrates a gesture matching mechanism at a computing device according to one embodiment.
  • Figure 2 illustrates a gesture matching mechanism according to one embodiment.
  • Figure 3 illustrates avatars displayed by a gesture matching mechanism.
  • Figure 4 is a flow diagram illustrating the operation of a gesture matching mechanism according to one embodiment.
  • Figure 5 illustrates a computer system suitable for implementing embodiments of the present disclosure according to one embodiment.
  • Embodiments provide for a gesture matching mechanism that learns gestures and performs user authentication based on the learned gestures.
  • the gesture matching mechanism learns gestures during a registration phase in which a user registers a number of gestures for later recognition of a user. Subsequently during an authentication phase, a user is prompted to perform a gesture selected from a database in order to determine whether the user’s gesture performance matches the selected gesture. The user is authenticated if a match is detected.
  • the gesture matching mechanism may be implemented to screen for health warnings by monitoring a user’s facial movement over time to detect changes that may indicate a health problem.
  • the gesture matching mechanism may be implemented to perform game control, as well as other applications.
  • Figure 1 illustrates a gesture matching mechanism 110 at a computing device 100 according to one embodiment.
  • computing device 100 serves as a host machine for hosting gesture matching mechanism 110 that includes a combination of any number and type of components for facilitating authentication, health indication and/or game control based on gesture recognition at computing devices, such as computing device 100.
  • Computing device 100 may include large computing systems, such as server computers, desktop computers, etc., and may further include set-top boxes (e.g., Internet-based cable television set-top boxes, etc.), global positioning system (GPS)-based devices, etc.
  • Computing device 100 may include mobile computing devices, such as cellular phones including smartphones, personal digital assistants (PDAs), tablet computers, laptop computers (e.g., notebook, netbook, Ultrabook™, etc.), e-readers, etc.
  • Computing device 100 may include an operating system (OS) 106 serving as an interface between hardware and/or physical resources of computing device 100 and a user.
  • Computing device 100 further includes one or more processors 102, memory devices 104, network devices, drivers, or the like, as well as input/output (I/O) sources 108, such as touchscreens, touch panels, touch pads, virtual or regular keyboards, virtual or regular mice, etc.
  • gesture matching mechanism 110 may be employed at computing device 100, such as a laptop computer, a desktop computer, a smartphone, a tablet computer, etc.
  • gesture matching mechanism 110 may include any number and type of components, such as: reception and capturing logic 201, gesture training module 202, gesture selection engine 203, avatar animation and rendering engine 204, gesture matching component 205 and gesture learning module 206.
  • reception and capturing logic 201 facilitates an image capturing device implemented at image sources 225 at computing device 100 to receive and capture an image associated with a user, such as a live and real-time image of a user. As the live image of the user is received and captured, the user’s movements may be continuously, and in real-time, detected and tracked in live video frames.
  • reception and capturing logic 201 may receive image data from image sources 225, where the image data may be in the form of a sequence of images or frames (e.g., video frames).
  • Image sources 225 may include an image capturing device, such as a camera.
  • Such a device may include various components, such as (but not limited to) an optics assembly, an image sensor, an image/video encoder, etc., that may be implemented in any combination of hardware and/or software.
  • the optics assembly may include one or more optical devices (e.g., lenses, mirrors, etc.) to project an image within a field of view onto multiple sensor elements within the image sensor.
  • the optics assembly may include one or more mechanisms to control the arrangement of these optical device(s).
  • such mechanisms may control focusing operations, aperture settings, exposure settings, zooming operations, shutter speed, effective focal length, etc. Embodiments, however, are not limited to these examples.
  • Image sources 225 may further include one or more image sensors including an array of sensor elements where these elements may be complementary metal oxide semiconductor (CMOS) sensors, charge coupled devices (CCDs), or other suitable sensor element types. These elements may generate analog intensity signals (e.g., voltages), which correspond to light incident upon the sensor.
  • the image sensor may also include analog-to-digital converter(s) (ADC(s)) that convert the analog intensity signals into digitally encoded intensity values.
  • Embodiments, however, are not limited to these examples. For example, an image sensor converts light received through optics assembly into pixel values, where each of these pixel values represents a particular light intensity at the corresponding sensor element. Although these pixel values have been described as digital, they may alternatively be analog.
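  • As a rough illustration of the analog-to-digital step described above, the following sketch quantizes analog intensity signals into digitally encoded intensity values. The function name `quantize`, the default 8-bit depth, and the 1.0 V reference voltage are assumptions made for illustration only; the source does not specify any of them.

```python
def quantize(voltages, v_ref=1.0, bits=8):
    """Map analog voltages in [0, v_ref] to integer codes in [0, 2**bits - 1].

    v_ref and bits are illustrative assumptions, not values from the source.
    """
    levels = (1 << bits) - 1  # highest digital code, e.g. 255 for 8 bits
    codes = []
    for v in voltages:
        # Clamp out-of-range signals so they saturate instead of overflowing.
        clamped = min(max(v, 0.0), v_ref)
        codes.append(round(clamped / v_ref * levels))
    return codes


# Four sensor elements read out at 2-bit depth.
pixel_values = quantize([0.0, 0.34, 0.67, 1.0], bits=2)
```

  • Each returned integer plays the role of a pixel value representing the light intensity at the corresponding sensor element.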
  • the image sensing device may include an image/video encoder to encode and/or compress pixel values.
  • Various techniques, standards, and/or formats (e.g., Moving Picture Experts Group (MPEG), Joint Photographic Experts Group (JPEG), etc.) may be employed for this encoding and/or compression.
  • image sources 225 may be any number and type of components, such as image capturing devices (e.g., one or more cameras, etc.) and image sensing devices, such as (but not limited to) context-aware sensors (e.g., temperature sensors, facial expression and feature measurement sensors working with one or more cameras, environment sensors (such as to sense background colors, lights, etc.), biometric sensors (such as to detect fingerprints, facial points or features, etc.), and the like).
  • Computing device 100 may also include one or more software applications, such as business applications, social network websites, business networking websites, communication applications, games and other entertainment applications, etc., offering one or more user interfaces (e.g., web user interface (WUI), graphical user interface (GUI), touchscreen, etc.) to display the gesture matching and for the user to communicate with other users at other computing devices, while ensuring compatibility with changing technologies, parameters, protocols, standards, etc.
  • gesture matching mechanism 110 operates in two phases.
  • One such phase is a registration phase in which a user registers a number of gestures.
  • gesture training module 202 captures a multitude of gestures for later recognition of a user.
  • gesture training module 202 identifies new gestures from user images captured at reception and capturing logic 201 and adds the gestures to database 240 as animation parameters.
  • each new gesture defines a pose or expression, or a combination of both, in a single frame.
  • each gesture defines a sequence of poses or expressions occurring within a predetermined time frame (e.g., seconds).
  • database 240 may be used to record, store, and maintain data relating to various gestures such as human head, facial, hand and/or finger movements. These gestures may be recorded as sequences of frames where each frame may include multiple features.
  • Database 240 may include a data source, an information storage medium, such as memory (volatile or non-volatile), disk storage, optical storage, etc.
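  • The registration phase described above stores gestures as animation parameters, either as a single frame or as a sequence of frames. A minimal sketch of such a store follows. The class names `Gesture` and `GestureDatabase`, the gesture names, and the encoding of a frame as a flat list of floats are all hypothetical; the source does not describe the layout of database 240.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical encoding: one frame is a flat list of animation parameters.
Frame = List[float]


@dataclass
class Gesture:
    name: str
    frames: List[Frame]

    @property
    def is_static(self) -> bool:
        """True for a single-frame pose, False for a multi-frame sequence."""
        return len(self.frames) == 1


@dataclass
class GestureDatabase:
    gestures: Dict[str, Gesture] = field(default_factory=dict)

    def register(self, gesture: Gesture) -> None:
        """Store a gesture captured during the registration phase."""
        if not gesture.frames:
            raise ValueError("a gesture needs at least one frame")
        self.gestures[gesture.name] = gesture


db = GestureDatabase()
# A single-frame pose and a three-frame sequence (values are made up).
db.register(Gesture("mouth_open", [[0.0, 0.9]]))
db.register(Gesture("wink_then_smile", [[1.0, 0.0], [0.5, 0.3], [0.0, 0.8]]))
```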
  • the second phase is an authentication phase in which a user is authenticated based on recognition of a gesture.
  • a user is prompted to perform a gesture selected from database 240 in order to determine whether the user’s gesture performance matches the selected gesture.
  • Gesture selection engine 203 is implemented to randomly select a gesture from database 240 for user authentication.
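  • The random selection step can be sketched as follows. The function name `select_gesture` is hypothetical, and the use of Python's `secrets` module (rather than a general-purpose pseudo-random generator) is an assumption made here because the selection feeds an authentication decision; the source only states that the gesture is selected randomly.

```python
import secrets


def select_gesture(gesture_names):
    """Pick one registered gesture name at random for the authentication prompt."""
    candidates = list(gesture_names)
    if not candidates:
        raise ValueError("no registered gestures to choose from")
    # secrets.choice draws from the operating system's secure random source.
    return secrets.choice(candidates)
```

  • Because the prompted gesture changes unpredictably between attempts, a recording of one earlier gesture is less likely to match the gesture that is requested next.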
  • Avatar animation and rendering engine 204 translates the selected gesture into an animated avatar on display 230.
  • Display device 230 may be implemented with various display(s) including (but not limited to) liquid crystal displays (LCDs), light emitting diode (LED) displays, plasma displays, and cathode ray tube (CRT) displays.
  • display screen or device 230 visually outputs the avatar to the user.
  • avatar animation and rendering engine 204 uses Pocket which blends shapes to animate a selected avatar.
  • a facial gesture e.g., mouth open, eye wink, etc.
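  • Blend-shape animation is commonly implemented as a linear combination of a neutral mesh and weighted target shapes: each vertex is moved by the weighted sum of its offsets toward the targets. The sketch below shows that general technique under stated assumptions; it is not taken from the source, and the function name `blend`, the two-vertex mesh, and the `mouth_open` target are illustrative only.

```python
def blend(base, targets, weights):
    """Linear blend shapes: vertex = base + sum(w_i * (target_i - base)).

    base    -- list of (x, y, z) vertices for the neutral face
    targets -- dict mapping a shape name to its list of vertices
    weights -- dict mapping a shape name to a weight, typically in [0, 1]
    """
    result = []
    for i, vertex in enumerate(base):
        blended = list(vertex)
        for name, weight in weights.items():
            target_vertex = targets[name][i]
            for axis in range(len(vertex)):
                blended[axis] += weight * (target_vertex[axis] - vertex[axis])
        result.append(tuple(blended))
    return result


# Hypothetical two-vertex "mesh": the mouth_open target lowers both vertices.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shapes = {"mouth_open": [(0.0, -1.0, 0.0), (1.0, -1.0, 0.0)]}
half_open = blend(neutral, shapes, {"mouth_open": 0.5})
```

  • Animating a gesture then amounts to varying the weights from frame to frame, which is one plausible reading of gestures being stored as animation parameters.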
  • Figure 3 illustrates one embodiment of avatars corresponding to selected gestures that are displayed by avatar animation and rendering engine 204. As shown in Figure 3, an avatar dynamically poses facial/head gestures.
  • avatar animation and rendering engine 204 facilitates the prompting of a user to perform the pose of the displayed avatar.
  • reception and capturing logic 201 captures the user’s response.
  • reception and capturing logic 201 captures video within a predetermined time window.
  • gesture matching component 205 compares the captured gesture response with the selected gesture by analyzing the video frame to identify whether the same gesture appears in the predetermined time period.
  • Gesture matching component 205 automatically selects a key frame and determines the temporal sequence across multiple frames to compare the user’s input (e.g., performed gesture) with database 240 to determine if the input matches the selected gesture.
  • G_user and G_database are compared by a temporal sequence matching method, such as Dynamic Time Warping.
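  • Dynamic Time Warping aligns two sequences that may differ in speed by finding the lowest-cost monotonic alignment between their frames. The following is a minimal textbook sketch, not the implementation used by gesture matching component 205. The Euclidean frame distance, the function names, and the acceptance threshold of 0.5 are assumptions.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) Dynamic Time Warping cost between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i and first j frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat frame of seq_b
                                 cost[i][j - 1],      # repeat frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def gestures_match(g_user, g_database, threshold=0.5):
    """Accept when the warped distance is within an assumed threshold."""
    return dtw_distance(g_user, g_database) <= threshold


template = [[0.0], [1.0], [0.0]]                 # stored gesture
slow = [[0.0], [0.0], [1.0], [1.0], [0.0]]       # same gesture, performed slowly
other = [[1.0], [1.0], [1.0]]                    # a different gesture
```

  • Here `slow` matches `template` with zero cost even though it has more frames, while `other` does not match. In practice the threshold would need tuning against real capture data.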
  • Gesture learning module 206 identifies new gestures performed by the user and adds the new gestures to database 240. For instance, if G_user doesn’t match any gesture in database 240, database 240 is updated to include this new gesture. As a result, different gestures may be used for subsequent authentication of the user.
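  • The learning rule described above (add the performed gesture when it matches nothing already stored) can be sketched as below. The function names, the plain-dict database, the generated `learned_N` names, and the threshold are hypothetical; the distance function is passed in so that any sequence matching method can be used.

```python
def learn_if_new(database, g_user, distance, threshold=0.5):
    """Add g_user to the database when it matches no stored gesture.

    database -- dict mapping gesture name to stored gesture data
    distance -- callable(gesture_a, gesture_b) -> float
    Returns the name assigned to the new gesture, or None if one matched.
    """
    for stored in database.values():
        if distance(g_user, stored) <= threshold:
            return None  # already known, nothing to learn
    # Generate an unused placeholder name for the new entry.
    index = len(database) + 1
    while f"learned_{index}" in database:
        index += 1
    name = f"learned_{index}"
    database[name] = g_user
    return name


def l1_distance(a, b):
    """Simple stand-in distance for equal-length parameter vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


known = {"smile": [0.0, 1.0]}
first = learn_if_new(known, [0.0, 0.9], l1_distance)   # close to "smile"
second = learn_if_new(known, [1.0, 0.0], l1_distance)  # unlike anything stored
```

  • A deployment would presumably restrict learning to sessions in which the user has already been authenticated, so that an unverified person cannot enroll gestures; that restriction is an assumption and is not stated in the source.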
  • any number and type of components 201-240 of gesture matching mechanism 110 may not necessarily be at a single computing device and may be allocated among or distributed between any number and type of computing devices, including computing device 100 having (but not limited to) server computing devices, cameras, PDAs, mobile phones (e.g., smartphones, tablet computers, etc.), personal computing devices (e.g., desktop devices, laptop computers, etc.), smart televisions, servers, wearable devices, media players, any smart computing devices, and so forth. Further examples include microprocessors, graphics processors or engines, microcontrollers, application specific integrated circuits (ASICs), and so forth. Embodiments, however, are not limited to these examples.
  • Figure 4 is a flow diagram illustrating a method 400 for facilitating authentication of a user at a gesture matching mechanism operating on a computing device according to one embodiment.
  • Method 400 may be performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, programmable logic, etc.), software (such as instructions run on a processing device), or a combination thereof.
  • method 400 may be performed by gesture matching mechanism 110.
  • the processes of method 400 are illustrated in linear sequences for brevity and clarity in presentation; however, it is contemplated that any number of them can be performed in parallel, asynchronously, or in different orders. For brevity, clarity, and ease of understanding, many of the details discussed with reference to Figures 1 and 2 are not discussed or repeated here.
  • Method 400 begins at block 410 with a gesture being selected from database 240.
  • the selected gesture is displayed as an avatar.
  • the user is prompted to pose according to a gesture being displayed by the avatar.
  • the user pose is captured.
  • video frame data comprising the user pose is analyzed.
  • a determination is made as to whether the captured pose includes a gesture that matches the selected gesture. If not, control is returned to processing block 410, where another gesture is selected for authentication. If there is a determination that the captured pose includes a gesture that matches the selected gesture, the user is authenticated at processing block 470.
  • a determination is made as to whether one or more poses included unrecognized gestures. If so, the gestures are added to the database at processing block 490.
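  • The control flow of method 400 can be sketched as a loop over select, display, capture, and match steps. The four callables are hypothetical stand-ins for components 203, 204, 201, and 205. The `max_attempts` limit is an assumption added so that the sketch terminates; the source describes returning to block 410 after a failed match without stating a limit.

```python
def authenticate(select, display, capture, matches, max_attempts=3):
    """Run the prompt-and-match loop; return True once a response matches.

    select  -- callable() -> gesture chosen from the database (block 410)
    display -- callable(gesture) that shows the avatar and prompts the user
    capture -- callable() -> the user's captured response
    matches -- callable(response, gesture) -> bool
    """
    for _ in range(max_attempts):
        gesture = select()
        display(gesture)
        response = capture()
        if matches(response, gesture):
            return True  # user is authenticated (block 470)
        # No match: fall through and select another gesture.
    return False


# Stubbed run: the first response is wrong, the second one matches.
prompts = iter(["wink", "smile"])
responses = iter(["frown", "smile"])
shown = []
authenticated = authenticate(
    select=lambda: next(prompts),
    display=shown.append,
    capture=lambda: next(responses),
    matches=lambda response, gesture: response == gesture,
)
```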
  • gesture matching mechanism 110 may also be implemented to screen for health warnings.
  • gesture matching mechanism 110 may monitor a user’s facial movement (e.g., mouth) over time and analyze the movements for micro changes that may indicate a stroke.
  • gesture matching mechanism 110 may be implemented to perform game control.
  • Figure 5 illustrates one embodiment of a computer system 500.
  • Computing system 500 includes bus 505 (or, for example, a link, an interconnect, or another type of communication device or interface to communicate information) and processor 510 coupled to bus 505 that may process information. While computing system 500 is illustrated with a single processor, computing system 500 may include multiple processors and/or co-processors, such as one or more of central processors, graphics processors, and physics processors, etc. Computing system 500 may further include random access memory (RAM) or other dynamic storage device 520 (referred to as main memory), coupled to bus 505, that may store information and instructions that may be executed by processor 510. Main memory 520 may also be used to store temporary variables or other intermediate information during execution of instructions by processor 510.
  • Computing system 500 may also include read only memory (ROM) and/or other storage device 530 coupled to bus 505 that may store static information and instructions for processor 510.
  • Data storage device 540 may be coupled to bus 505 to store information and instructions.
  • Data storage device 540, such as a magnetic disk or optical disc and corresponding drive, may be coupled to computing system 500.
  • Computing system 500 may also be coupled via bus 505 to display device 550, such as a cathode ray tube (CRT), liquid crystal display (LCD) or Organic Light Emitting Diode (OLED) array, to display information to a user.
  • User input device 560 including alphanumeric and other keys, may be coupled to bus 505 to communicate information and command selections to processor 510.
  • cursor control 570, such as a mouse, a trackball, a touchscreen, a touchpad, or cursor direction keys, may communicate direction information and command selections to processor 510 and control cursor movement on display 550.
  • Camera and microphone arrays 590 of computer system 500 may be coupled to bus 505 to observe gestures, record audio and video and to receive and transmit visual and audio commands.
  • Computing system 500 may further include network interface(s) 580 to provide access to a network, such as a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a personal area network (PAN), Bluetooth, a cloud network, a mobile network (e.g., 3rd Generation (3G), etc.), an intranet, the Internet, etc.
  • Network interface(s) 580 may include, for example, a wireless network interface having antenna 585, which may represent one or more antenna(e).
  • Network interface(s) 580 may also include, for example, a wired network interface to communicate with remote devices via network cable 587, which may be, for example, an Ethernet cable, a coaxial cable, a fiber optic cable, a serial cable, or a parallel cable.
  • Network interface(s) 580 may provide access to a LAN, for example, by conforming to IEEE 802.11b and/or IEEE 802.11g standards, and/or the wireless network interface may provide access to a personal area network, for example, by conforming to Bluetooth standards.
  • Other wireless network interfaces and/or protocols, including previous and subsequent versions of the standards, may also be supported.
  • network interface(s) 580 may provide wireless communication using, for example, Time Division Multiple Access (TDMA) protocols, Global System for Mobile Communications (GSM) protocols, Code Division Multiple Access (CDMA) protocols, and/or any other type of wireless communications protocols.
  • Network interface(s) 580 may include one or more communication interfaces, such as a modem, a network interface card, or other well-known interface devices, such as those used for coupling to the Ethernet, token ring, or other types of physical wired or wireless attachments for purposes of providing a communication link to support a LAN or a WAN, for example.
  • the computer system may also be coupled to a number of peripheral devices, clients, control surfaces, consoles, or servers via a conventional network infrastructure, including an Intranet or the Internet, for example.
  • computing system 500 may vary from implementation to implementation depending upon numerous factors, such as price constraints, performance requirements, technological improvements, or other circumstances.
  • Examples of the electronic device or computer system 500 may include without limitation a mobile device, a personal digital assistant, a mobile computing device, a smartphone, a cellular telephone, a handset, a one-way pager, a two-way pager, a messaging device, a computer, a personal computer (PC), a desktop computer, a laptop computer, a notebook computer, a handheld computer, a tablet computer, a server, a server array or server farm, a web server, a network server, an Internet server, a work station, a mini-computer, a main frame computer, a supercomputer, a network appliance, a web appliance, a distributed computing system, multiprocessor systems, processor-based systems, consumer electronics, programmable consumer electronics, television, digital television, set top box,
  • Embodiments may be implemented as any or a combination of: one or more microchips or integrated circuits interconnected using a parentboard, hardwired logic, software stored by a memory device and executed by a microprocessor, firmware, an application specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA).
  • logic may include, by way of example, software or hardware and/or combinations of software and hardware.
  • Embodiments may be provided, for example, as a computer program product which may include one or more machine-readable media having stored thereon machine-executable instructions that, when executed by one or more machines such as a computer, network of computers, or other electronic devices, may result in the one or more machines carrying out operations in accordance with embodiments described herein.
  • a machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs (Compact Disc-Read Only Memories), and magneto-optical disks, ROMs, RAMs, EPROMs (Erasable Programmable Read Only Memories), EEPROMs (Electrically Erasable Programmable Read Only Memories), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing machine-executable instructions.
  • embodiments may be downloaded as a computer program product, wherein the program may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of one or more data signals embodied in and/or modulated by a carrier wave or other propagation medium via a communication link (e.g., a modem and/or network connection).
  • references to “one embodiment”, “an embodiment”, “example embodiment”, “various embodiments”, etc., indicate that the embodiment(s) so described may include particular features, structures, or characteristics, but not every embodiment necessarily includes the particular features, structures, or characteristics. Further, some embodiments may have some, all, or none of the features described for other embodiments.
  • Coupled is used to indicate that two or more elements co-operate or interact with each other, but they may or may not have intervening physical or electrical components between them.
  • Example 1 includes an apparatus to facilitate gesture matching.
  • the apparatus includes a gesture selection engine to select a gesture from a database during an authentication phase, an avatar animation and rendering engine to translate a selected gesture into an animated avatar for display at a display device with a prompt for a user to perform the selected gesture, reception and capturing logic to capture, in real-time, an image of a user and a gesture matching component to compare the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  • Example 2 includes the subject matter of Example 1, wherein the gesture matching component authenticates the user if the gesture performed by the user in the captured image matches the selected gesture.
  • Example 3 includes the subject matter of Example 2, wherein the gesture matching component selects a key frame from the user image and determines a temporal sequence across multiple frames to compare the gesture performed by the user to the selected gesture.
  • Example 4 includes the subject matter of Example 3, wherein the comparison is performed using a temporal sequence matching process.
  • Example 5 includes the subject matter of Example 1, further comprising a gesture training module to identify gestures from images of a user captured at reception and capturing logic during a registration phase and store the gestures in the database for recognition.
  • Example 6 includes the subject matter of Example 5, wherein the gesture training module stores the gestures as animation parameters.
  • Example 7 includes the subject matter of Example 6, wherein one of the captured gestures is selected from the database by the gesture selection engine during the authentication phase.
  • Example 8 includes the subject matter of Example 1, further comprising a gesture learning module to identify new gestures performed by the user and add the new gestures to the database.
  • Example 9 includes the subject matter of Example 8, wherein the gesture learning module identifies a new gesture upon the gesture matching component determining that the gesture performed by the user does not match a gesture in the database.
  • Example 10 includes a method to facilitate gesture matching comprising selecting a gesture from a database during an authentication phase, translating the selected gesture into an animated avatar, displaying the avatar, prompting a user to perform the selected gesture, capturing a real-time image of the user and comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  • Example 11 includes the subject matter of Example 10, further comprising authenticating the user if the gesture performed by the user in the captured image matches the selected gesture.
  • Example 12 includes the subject matter of Example 11, wherein comparing the gesture performed by the user to the selected gesture comprises selecting a key frame from the user image and determining a temporal sequence across multiple frames.
  • Example 13 includes the subject matter of Example 11, wherein the comparison is performed using a temporal sequence matching process.
  • Example 14 includes the subject matter of Example 10, further comprising performing a registration process prior to the authentication phase.
  • Example 15 includes the subject matter of Example 14, wherein the registration process comprises identifying gestures from captured images of the user and storing the gestures in the database for recognition.
  • Example 16 includes the subject matter of Example 15, wherein the gestures are stored as animation parameters.
  • Example 17 includes the subject matter of Example 16, wherein one of the captured gestures is selected from the database during the authentication phase.
  • Example 18 includes the subject matter of Example 10, further comprising identifying new gestures performed by the user and adding the new gestures to the database.
  • Example 19 includes the subject matter of Example 18, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
  • Example 20 includes at least one machine-readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to carry out operations according to any one of Examples 10 to 19.
  • Example 21 includes an apparatus to facilitate gesture matching, comprising means for selecting a gesture from a database during an authentication phase, means for translating the selected gesture into an animated avatar, means for displaying the avatar, means for prompting a user to perform the selected gesture, means for capturing a real-time image of the user and means for comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  • Example 22 includes the subject matter of Example 21, further comprising means for performing a registration process prior to the authentication phase.
  • Example 23 includes the subject matter of Example 22, wherein the means for registration comprises means for identifying gestures from captured images of the user and means for storing the gestures in the database for recognition.
  • Example 24 includes the subject matter of Example 22, further comprising means for identifying new gestures performed by the user and means for adding the new gestures to the database.
  • Example 25 includes the subject matter of Example 24, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
  • Example 26 includes at least one machine-readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to carry out operations comprising selecting a gesture from a database during an authentication phase, translating the selected gesture into an animated avatar, displaying the avatar, prompting a user to perform the selected gesture, capturing a real-time image of the user and comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  • Example 27 includes the subject matter of Example 26, comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to further carry out operations comprising performing a registration process prior to the authentication phase.
  • Example 28 includes the subject matter of Example 27, wherein the registration process comprises identifying gestures from captured images of the user and storing the gestures in the database for recognition.
  • Example 29 includes the subject matter of Example 26, comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to further carry out operations comprising identifying new gestures performed by the user and adding the new gestures to the database.
  • Example 30 includes the subject matter of Example 29, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.


Abstract

A mechanism is described to facilitate gesture matching according to one embodiment. A method of embodiments, as described herein, includes selecting a gesture from a database during an authentication phase, translating the selected gesture into an animated avatar, displaying the avatar, prompting a user to perform the selected gesture, capturing a real-time image of the user and comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.

Description

GESTURE MATCHING MECHANISM
FIELD
Embodiments described herein generally relate to computers. More particularly, embodiments relate to a mechanism for gesture matching.
BACKGROUND
Authentication is implemented in computer security applications to confirm the identity of an individual who is attempting to use a computer system. Common authentication systems may employ biometric (e.g., fingerprint and/or facial recognition) applications to authenticate a user. However, such systems may be subject to counterfeit measures.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
Figure 1 illustrates a gesture matching mechanism at a computing device according to one embodiment.
Figure 2 illustrates a gesture matching mechanism according to one embodiment.
Figure 3 illustrates avatars displayed by the gesture matching mechanism.
Figure 4 is a flow diagram illustrating the operation of a gesture matching mechanism according to one embodiment.
Figure 5 illustrates a computer system suitable for implementing embodiments of the present disclosure according to one embodiment.
DETAILED DESCRIPTION
In the following description, numerous specific details are set forth. However, embodiments, as described herein, may be practiced without these specific details. In other instances, well-known circuits, structures and techniques have not been shown in detail in order not to obscure the understanding of this description.
Embodiments provide for a gesture matching mechanism that learns gestures and performs user authentication based on the learned gestures. In one embodiment, the gesture matching mechanism learns gestures during a registration phase in which a user registers a number of gestures for later recognition of the user. Subsequently, during an authentication phase, a user is prompted to perform a gesture selected from a database in order to determine whether the user’s gesture performance matches the selected gesture. The user is authenticated if a match is detected. In another embodiment, the gesture matching mechanism may be implemented to screen for health warnings by monitoring a user’s facial movement over time to detect changes that may indicate a health problem. In a further embodiment, the gesture matching mechanism may be implemented to perform game control, as well as other applications.
Figure 1 illustrates a gesture matching mechanism 110 at a computing device 100 according to one embodiment. In one embodiment, computing device 100 serves as a host machine for hosting gesture matching mechanism 110 that includes a combination of any number and type of components for facilitating authentication, health indication and/or game control based on gesture recognition at computing devices, such as computing device 100. Computing device 100 may include large computing systems, such as server computers, desktop computers, etc., and may further include set-top boxes (e.g., Internet-based cable television set-top boxes, etc.), global positioning system (GPS)-based devices, etc. Computing device 100 may include mobile computing devices, such as cellular phones including smartphones, personal digital assistants (PDAs), tablet computers, laptop computers (e.g., notebook, netbook, Ultrabook™, etc.), e-readers, etc.
Computing device 100 may include an operating system (OS) 106 serving as an interface between hardware and/or physical resources of the computing device 100 and a user. Computing device 100 further includes one or more processors 102, memory devices 104, network devices, drivers, or the like, as well as input/output (I/O) sources 108, such as touchscreens, touch panels, touch pads, virtual or regular keyboards, virtual or regular mice, etc. It is to be noted that terms like “node”, “computing node”, “server”, “server device”, “cloud computer”, “cloud server”, “cloud server computer”, “machine”, “host machine”, “device”, “computing device”, “computer”, “computing system”, and the like, may be used interchangeably throughout this document. It is to be further noted that terms like “application”, “software application”, “program”, “software program”, “package”, and “software package” may be used interchangeably throughout this document. Similarly, terms like “job”, “input”, “request” and “message” may be used interchangeably throughout this document.
Figure 2 illustrates a gesture matching mechanism 110 according to one embodiment. In one embodiment, gesture matching mechanism 110 may be employed at computing device 100, such as a laptop computer, a desktop computer, a smartphone, a tablet computer, etc. In one embodiment, gesture matching mechanism 110 may include any number and type of components, such as: reception and capturing logic 201, gesture training module 202, gesture selection engine 203, avatar animation and rendering engine 204, gesture matching component 205 and gesture learning module 206.
In one embodiment, reception and capturing logic 201 facilitates an image capturing device implemented at image sources 225 at computing device 100 to receive and capture an image associated with a user, such as a live and real-time image of a user. As the live image of the user is received and captured, the user’s movements may be continuously, and in real-time, detected and tracked in live video frames. In embodiments, reception and capturing logic 201 may receive image data from image sources 225, where the image data may be in the form of a sequence of images or frames (e.g., video frames). Image sources 225 may include an image capturing device, such as a camera. Such a device may include various components, such as (but not limited to) an optics assembly, an image sensor, an image/video encoder, etc., that may be implemented in any combination of hardware and/or software. The optics assembly may include one or more optical devices (e.g., lenses, mirrors, etc.) to project an image within a field of view onto multiple sensor elements within the image sensor. In addition, the optics assembly may include one or more mechanisms to control the arrangement of these optical device(s). For example, such mechanisms may control focusing operations, aperture settings, exposure settings, zooming operations, shutter speed, effective focal length, etc. Embodiments, however, are not limited to these examples.
Image sources 225 may further include one or more image sensors including an array of sensor elements where these elements may be complementary metal oxide semiconductor (CMOS) sensors, charge coupled devices (CCDs), or other suitable sensor element types. These elements may generate analog intensity signals (e.g., voltages), which correspond to light incident upon the sensor. In addition, the image sensor may also include analog-to-digital converter(s) (ADC(s)) that convert the analog intensity signals into digitally encoded intensity values. Embodiments, however, are not limited to these examples. For example, an image sensor converts light received through the optics assembly into pixel values, where each of these pixel values represents a particular light intensity at the corresponding sensor element. Although these pixel values have been described as digital, they may alternatively be analog. As described above, the image sensing device may include an image/video encoder to encode and/or compress pixel values. Various techniques, standards, and/or formats (e.g., Moving Picture Experts Group (MPEG), Joint Photographic Expert Group (JPEG), etc.) may be employed for this encoding and/or compression.
As aforementioned, image sources 225 may be any number and type of components, such as image capturing devices (e.g., one or more cameras, etc.) and image sensing devices, such as (but not limited to) context-aware sensors (e.g., temperature sensors, facial expression and feature measurement sensors working with one or more cameras, environment sensors (such as to sense background colors, lights, etc.), biometric sensors (such as to detect fingerprints, facial points or features, etc.), and the like). Computing device 100 may also include one or more software applications, such as business applications, social network websites, business networking websites, communication applications, games and other entertainment applications, etc., offering one or more user interfaces (e.g., web user interface (WUI), graphical user interface (GUI), touchscreen, etc.) to display the gesture matching and for the user to communicate with other users at other computing devices, while ensuring compatibility with changing technologies, parameters, protocols, standards, etc.
According to one embodiment, gesture matching mechanism 110 operates in two phases. One such phase is a registration phase in which a user registers a number of gestures. In such an embodiment, gesture training module 202 captures a multitude of gestures for later recognition of a user. In one embodiment, gesture training module 202 identifies new gestures from user images captured at reception and capturing logic 201 and adds the gestures to database 240 as animation parameters. According to one embodiment, each new gesture defines a combination of a pose or expression in a single frame. In other embodiments, each gesture defines a sequence of poses or expressions occurring within a predetermined time frame (e.g., seconds). In some embodiments, database 240 may be used to record, store, and maintain data relating to various gestures such as human head, facial, hand and/or finger movements. These gestures may be recorded as sequences of frames where each frame may include multiple features. Database 240 may include a data source or an information storage medium, such as memory (volatile or non-volatile), disk storage, optical storage, etc.
The second phase is an authentication phase in which a user is authenticated based on recognition of a gesture. In one embodiment, a user is prompted to perform a gesture selected from database 240 in order to determine whether the user’s gesture performance matches the selected gesture. Gesture selection engine 203 is implemented to randomly select a gesture from database 240 for user authentication. Avatar animation and rendering engine 204 translates the selected gesture into an animated avatar on display device 230. Display device 230 may be implemented with various display(s) including (but not limited to) liquid crystal displays (LCDs), light emitting diode (LED) displays, plasma displays, and cathode ray tube (CRT) displays.
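The registration phase described above can be sketched as a small in-memory store. This is only an illustration under stated assumptions: the names `Gesture`, `GestureDatabase` and `register`, and the choice of a flat list of floats per frame, are hypothetical and do not come from the source, which says only that gestures are stored as animation parameters and may span one frame or a sequence of frames.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical representation: one vector of animation parameters per frame.
Frame = List[float]


@dataclass
class Gesture:
    name: str
    frames: List[Frame]  # a single-frame pose has exactly one entry


@dataclass
class GestureDatabase:
    """Stand-in for database 240; a real system would persist this."""
    gestures: Dict[str, Gesture] = field(default_factory=dict)

    def register(self, name: str, frames: List[Frame]) -> Gesture:
        # Reject empty gestures and frames of inconsistent width.
        if not frames:
            raise ValueError("a gesture needs at least one frame")
        width = len(frames[0])
        if any(len(f) != width for f in frames):
            raise ValueError("all frames must have the same number of parameters")
        gesture = Gesture(name, [list(f) for f in frames])
        self.gestures[name] = gesture
        return gesture


db = GestureDatabase()
# A gesture defined as a sequence of frames within a time window.
db.register("mouth_open", [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
# A gesture defined by a single frame.
db.register("neutral_pose", [[0.0, 0.0]])
```

Storing per-frame parameter vectors keeps single-frame poses and multi-frame sequences in one shape, so later comparison code can treat both uniformly.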
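Random selection of a registered gesture might look like the following sketch. The function name `select_gesture` and the `exclude` parameter are hypothetical, and using the standard library `secrets` module for unpredictable choice is an assumption; the source states only that the gesture is selected randomly.

```python
import secrets
from typing import Optional, Sequence


def select_gesture(names: Sequence[str], exclude: Optional[str] = None) -> str:
    """Pick one registered gesture name at random.

    `exclude` lets a caller avoid repeating the gesture that was just tried.
    """
    candidates = [n for n in names if n != exclude]
    if not candidates:
        raise ValueError("no gesture available to select")
    return secrets.choice(candidates)


registered = ["mouth_open", "eye_wink", "head_nod"]
first = select_gesture(registered)
retry = select_gesture(registered, exclude=first)
```

An unpredictable selection is what makes a replayed recording of an earlier session less useful, since the prompted gesture is not known in advance.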
In one embodiment, display screen or device 230 visually outputs the avatar to the user. In further embodiments, avatar animation and rendering engine 204 uses [Figure PCTCN2015075339-appb-000001] Pocket [Figure PCTCN2015075339-appb-000002], which blends shapes to animate a selected avatar. In this embodiment, a facial gesture (e.g., mouth open, eye wink, etc.) may be represented by the blend shape parameters that correspond to facial gesture data. Figure 3 illustrates one embodiment of avatars corresponding to selected gestures that are displayed by avatar animation and rendering engine 204. As shown in Figure 3, an avatar dynamically poses facial/head gestures.
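As a rough illustration of how blend shape parameters can drive an avatar, the sketch below uses the common linear blend shape formulation, in which each weight moves the neutral mesh toward a target shape. This formulation, the function name `blend`, and the toy four-value "mesh" are assumptions for illustration; the source does not specify how the rendering engine combines shapes.

```python
from typing import Dict, List

# Hypothetical mesh: a flat list of vertex coordinates.
Mesh = List[float]


def blend(neutral: Mesh, targets: Dict[str, Mesh], weights: Dict[str, float]) -> Mesh:
    """Linear blend: result = neutral + sum_k w_k * (target_k - neutral)."""
    result = list(neutral)
    for name, weight in weights.items():
        target = targets[name]
        for i, (base, shaped) in enumerate(zip(neutral, target)):
            result[i] += weight * (shaped - base)
    return result


neutral = [0.0, 0.0, 0.0, 1.0]
targets = {
    "mouth_open": [0.0, -1.0, 0.0, 1.0],  # moves the second coordinate
    "eye_wink": [0.0, 0.0, 0.0, 0.0],     # moves the fourth coordinate
}
# One animation frame: mouth half open, eye fully closed.
frame = blend(neutral, targets, {"mouth_open": 0.5, "eye_wink": 1.0})
```

Under this formulation a stored gesture is just a sequence of weight dictionaries, one per frame, which is consistent with storing gestures as animation parameters.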
According to one embodiment, avatar animation and rendering engine 204 facilitates the prompting of a user to perform the pose of the displayed avatar. Referring back to Figure 2, reception and capturing logic 201 captures the user’s response. According to one embodiment, reception and capturing logic 201 captures video within a predetermined time window. Subsequently, gesture matching component 205 compares the captured gesture response with the selected gesture by analyzing the video frames to identify whether the same gesture appears in the predetermined time period.
Gesture matching component 205 automatically selects a key frame and determines the temporal sequence across multiple frames to compare the user’s input (e.g., performed gesture) with database 240 to determine if the input matches the selected gesture. In one embodiment, the user’s gesture is recorded as a sequence of t frames: G_user = {p_1, p_2, ..., p_t}, where p_i denotes the pose and expression parameters for the i-th frame. Similarly, each gesture in the database can be represented as a sequence of s frames: G_database = {p_1, p_2, ..., p_s}. G_user and G_database are compared by a temporal sequence matching method, such as Dynamic Time Warping. If there is a match, the user is authenticated. Gesture learning module 206 identifies new gestures performed by the user and adds the new gestures to database 240. For instance, if G_user does not match any gesture in database 240, database 240 is updated to include this new gesture. As a result, different gestures may be used for subsequent authentication of the user.
It is contemplated that any number and type of components 201-240 of gesture matching mechanism 110 may not necessarily be at a single computing device and may be allocated among or distributed between any number and type of computing devices, including computing device 100 having (but not limited to) server computing devices, cameras, PDAs, mobile phones (e.g., smartphones, tablet computers, etc.), personal computing devices (e.g., desktop devices, laptop computers, etc.), smart televisions, servers, wearable devices, media players, any smart computing devices, and so forth. Further examples include microprocessors, graphics processors or engines, microcontrollers, application specific integrated circuits (ASICs), and so forth. Embodiments, however, are not limited to these examples.
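The comparison of the user's frame sequence with a stored sequence by Dynamic Time Warping can be sketched as follows. This is a textbook DTW with Euclidean per-frame cost, not necessarily the variant used in any embodiment; the function names and the `threshold` value are hypothetical, since the source does not say how a match is decided from the warped distance.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]  # pose and expression parameters for one frame


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two parameter vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(g_user: List[Frame], g_db: List[Frame]) -> float:
    """Classic O(t*s) dynamic time warping cost between two frame sequences."""
    t, s = len(g_user), len(g_db)
    if t == 0 or s == 0:
        raise ValueError("both sequences need at least one frame")
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning the first i and first j frames.
    cost = [[inf] * (s + 1) for _ in range(t + 1)]
    cost[0][0] = 0.0
    for i in range(1, t + 1):
        for j in range(1, s + 1):
            d = frame_distance(g_user[i - 1], g_db[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[t][s]


def matches(g_user: List[Frame], g_db: List[Frame], threshold: float = 0.5) -> bool:
    # The threshold is an illustrative assumption, not a value from the source.
    return dtw_distance(g_user, g_db) <= threshold


stored = [[0.0], [0.5], [1.0]]
slow = [[0.0], [0.0], [0.5], [0.5], [1.0]]  # same gesture, performed more slowly
other = [[1.0], [1.0], [1.0]]               # a different gesture
print(matches(slow, stored), matches(other, stored))  # True False
```

Because DTW allows frames to be repeated along the alignment path, a user who performs the gesture faster or slower than the stored version can still match, which is why a temporal sequence matching method suits sequences of different lengths t and s.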
Figure 4 is a flow diagram illustrating a method 400 for facilitating authentication of a user at a gesture matching mechanism operating on a computing device according to one embodiment. Method 400 may be performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, programmable logic, etc.), software (such as instructions run on a processing device), or a combination thereof. In one embodiment, method 400 may be performed by gesture matching mechanism 110. The processes of method 400 are illustrated in linear sequences for brevity and clarity in presentation; however, it is contemplated that any number of them can be performed in parallel, asynchronously, or in different orders. For brevity, clarity, and ease of understanding, many of the details discussed with reference to Figures 1 and 2 are not discussed or repeated here.
Method 400 begins at block 410 with a gesture being selected from database 240. At processing block 420, the selected gesture is displayed as an avatar. At processing block 430, the user is prompted to pose according to the gesture being displayed by the avatar. At processing block 440, the user pose is captured. At processing block 450, video frame data comprising the user pose is analyzed. At decision block 460, a determination is made as to whether the captured pose includes a gesture that matches the selected gesture. If not, control is returned to processing block 410, where another gesture is selected for authentication. If there is a determination that the captured pose includes a gesture that matches the selected gesture, the user is authenticated at processing block 470. At decision block 480, a determination is made as to whether one or more poses included unrecognized gestures. If so, the gestures are added to the database at processing block 490.
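The control flow of method 400 can be sketched as a loop over the numbered blocks. Everything here other than the block numbers is an assumption for illustration: the callables `select`, `prompt_and_capture`, `is_match` and `find_new_gestures` are hypothetical stand-ins for the engines described earlier, and the `max_attempts` cap is not in the source, which simply returns to block 410 on a failed match.

```python
from typing import Callable, Dict, List

Gesture = List[List[float]]  # sequence of per-frame parameter vectors


def authenticate(
    database: Dict[str, Gesture],
    select: Callable[[List[str]], str],
    prompt_and_capture: Callable[[str, Gesture], Gesture],
    is_match: Callable[[Gesture, Gesture], bool],
    find_new_gestures: Callable[[Gesture], Dict[str, Gesture]],
    max_attempts: int = 3,  # assumed cap; the flow diagram loops without one
) -> bool:
    for _ in range(max_attempts):
        name = select(sorted(database))                # block 410
        selected = database[name]
        captured = prompt_and_capture(name, selected)  # blocks 420-440
        if is_match(captured, selected):               # blocks 450-460
            # Block 470: the user is authenticated.
            # Blocks 480-490: store any unrecognized gestures that were seen.
            database.update(find_new_gestures(captured))
            return True
    return False


db = {"mouth_open": [[0.0], [1.0]]}
ok = authenticate(
    db,
    select=lambda names: names[0],
    prompt_and_capture=lambda name, gesture: gesture,  # stub user who mirrors the avatar
    is_match=lambda captured, selected: captured == selected,
    find_new_gestures=lambda captured: {"learned_1": [[0.5]]},
)
```

Passing the steps in as callables keeps the sketch independent of any particular camera, renderer, or matching method; a DTW-based matcher could be supplied as `is_match`.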
Although described with reference to authentication, other embodiments may feature gesture matching mechanism 110 being implemented to screen for health warnings. In such embodiments, gesture matching mechanism 110 may monitor a user’s facial movement (e.g., mouth) over time and analyze the movements for micro changes that may indicate a stroke. In a further embodiment, gesture matching mechanism 110 may be implemented to perform game control.
Figure 5 illustrates one embodiment of a computing system 500. Computing system 500 includes bus 505 (or, for example, a link, an interconnect, or another type of communication device or interface to communicate information) and processor 510 coupled to bus 505 that may process information. While computing system 500 is illustrated with a single processor, computing system 500 may include multiple processors and/or co-processors, such as one or more of central processors, graphics processors, and physics processors, etc. Computing system 500 may further include random access memory (RAM) or other dynamic storage device 520 (referred to as main memory), coupled to bus 505, that may store information and instructions that may be executed by processor 510. Main memory 520 may also be used to store temporary variables or other intermediate information during execution of instructions by processor 510.
Computing system 500 may also include read only memory (ROM) and/or other storage device 530 coupled to bus 505 that may store static information and instructions for processor 510. Data storage device 540 may be coupled to bus 505 to store information and instructions. Data storage device 540, such as a magnetic disk or optical disc and corresponding drive, may be coupled to computing system 500.
Computing system 500 may also be coupled via bus 505 to display device 550, such as a cathode ray tube (CRT), liquid crystal display (LCD) or Organic Light Emitting Diode (OLED) array, to display information to a user. User input device 560, including alphanumeric and other keys, may be coupled to bus 505 to communicate information and command selections to processor 510. Another type of user input device 560 is cursor control 570, such as a mouse, a trackball, a touchscreen, a touchpad, or cursor direction keys to communicate direction information and command selections to processor 510 and to control cursor movement on display 550. Camera and microphone arrays 590 of computing system 500 may be coupled to bus 505 to observe gestures, record audio and video and to receive and transmit visual and audio commands.
Computing system 500 may further include network interface(s) 580 to provide access to a network, such as a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a personal area network (PAN), Bluetooth, a cloud network, a mobile network (e.g., 3rd Generation (3G), etc.), an intranet, the Internet, etc. Network interface(s) 580 may include, for example, a wireless network interface having antenna 585, which may represent one or more antenna(e). Network interface(s) 580 may also include, for example, a wired network interface to communicate with remote devices via network cable 587, which may be, for example, an Ethernet cable, a coaxial cable, a fiber optic cable, a serial cable, or a parallel cable.
Network interface(s) 580 may provide access to a LAN, for example, by conforming to IEEE 802.11b and/or IEEE 802.11g standards, and/or the wireless network interface may provide access to a personal area network, for example, by conforming to Bluetooth standards. Other wireless network interfaces and/or protocols, including previous and subsequent versions of the standards, may also be supported.
In addition to, or instead of, communication via the wireless LAN standards, network interface(s) 580 may provide wireless communication using, for example, Time Division Multiple Access (TDMA) protocols, Global System for Mobile Communications (GSM) protocols, Code Division Multiple Access (CDMA) protocols, and/or any other type of wireless communications protocols.
Network interface(s) 580 may include one or more communication interfaces, such as a modem, a network interface card, or other well-known interface devices, such as those used for coupling to the Ethernet, token ring, or other types of physical wired or wireless attachments for purposes of providing a communication link to support a LAN or a WAN, for example. In this manner, the computer system may also be coupled to a number of peripheral devices, clients, control surfaces, consoles, or servers via a conventional network infrastructure, including an Intranet or the Internet, for example.
It is to be appreciated that a lesser or more equipped system than the example described above may be preferred for certain implementations. Therefore, the configuration of computing system 500 may vary from implementation to implementation depending upon numerous factors, such as price constraints, performance requirements, technological improvements, or other circumstances. Examples of the electronic device or computer system 500 may include without limitation a mobile device, a personal digital assistant, a mobile computing device, a smartphone, a cellular telephone, a handset, a one-way pager, a two-way pager, a messaging device, a computer, a personal computer (PC) , a desktop computer, a laptop computer, a notebook computer, a handheld computer, a tablet computer, a server, a server array or server farm, a web server, a network server, an Internet server, a work station, a mini-computer, a main frame computer, a supercomputer, a network appliance, a web appliance, a distributed computing system, multiprocessor systems, processor-based systems, consumer electronics, programmable consumer electronics, television, digital television, set top box, wireless access point, base station, subscriber station, mobile subscriber center, radio network controller, router, hub, gateway, bridge, switch, machine, or combinations thereof.
Embodiments may be implemented as any or a combination of: one or more microchips or integrated circuits interconnected using a parentboard, hardwired logic, software stored by a memory device and executed by a microprocessor, firmware, an application specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA). The term "logic" may include, by way of example, software or hardware and/or combinations of software and hardware.
Embodiments may be provided, for example, as a computer program product which may include one or more machine-readable media having stored thereon machine-executable instructions that, when executed by one or more machines such as a computer, network of computers, or other electronic devices, may result in the one or more machines carrying out operations in accordance with embodiments described herein. A machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs (Compact Disc-Read Only Memories), and magneto-optical disks, ROMs, RAMs, EPROMs (Erasable Programmable Read Only Memories), EEPROMs (Electrically Erasable Programmable Read Only Memories), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing machine-executable instructions.
Moreover, embodiments may be downloaded as a computer program product, wherein the program may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of one or more data signals embodied in and/or modulated by a carrier wave or other propagation medium via a communication link (e.g., a modem and/or network connection).
References to “one embodiment”, “an embodiment”, “example embodiment”, “various embodiments”, etc., indicate that the embodiment(s) so described may include particular features, structures, or characteristics, but not every embodiment necessarily includes the particular features, structures, or characteristics. Further, some embodiments may have some, all, or none of the features described for other embodiments.
In the following description and claims, the term “coupled” along with its derivatives, may be used. “Coupled” is used to indicate that two or more elements co-operate or interact with each other, but they may or may not have intervening physical or electrical components between them.
As used in the claims, unless otherwise specified the use of the ordinal adjectives “first”, “second”, “third”, etc., to describe a common element, merely indicate that different instances of like elements are being referred to, and are not intended to imply that the elements so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
The following clauses and/or examples pertain to further embodiments or examples. Specifics in the examples may be used anywhere in one or more embodiments. The various features of the different embodiments or examples may be variously combined with some features included and others excluded to suit a variety of different applications. Examples may include subject matter such as a method, means for performing acts of the method, at least one machine-readable medium including instructions that, when performed by a machine, cause the machine to perform acts of the method, or an apparatus or system for facilitating hybrid communication according to embodiments and examples described herein.
Some embodiments pertain to Example 1 that includes an apparatus to facilitate gesture matching. The apparatus includes a gesture selection engine to select a gesture from a database during an authentication phase, an avatar animation and rendering engine to translate a selected gesture into an animated avatar for display at a display device with a prompt for a user to perform the selected gesture, reception and capturing logic to capture, in real-time, an image of a user and a gesture matching component to compare the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
Example 2 includes the subject matter of Example 1, wherein the gesture matching component authenticates the user if the gesture performed by the user in the captured image matches the selected gesture.
Example 3 includes the subject matter of Example 2, wherein the gesture matching component selects a key frame from the user image and determines a temporal sequence across multiple frames to compare the gesture performed by the user to the selected gesture.
Example 4 includes the subject matter of Example 3, wherein the comparison is performed using a temporal sequence matching process.
Example 5 includes the subject matter of Example 1, further comprising a gesture training module to identify gestures from images of a user captured at reception and capturing logic during a registration phase and store the gestures in the database for recognition.
Example 6 includes the subject matter of Example 5, wherein the gesture training module stores the gestures as animation parameters.
Example 7 includes the subject matter of Example 6, wherein one of the captured gestures is selected from the database by the gesture selection engine during the authentication phase.
Example 8 includes the subject matter of Example 1, further comprising a gesture learning module to identify new gestures performed by the user and add the new gestures to the database.
Example 9 includes the subject matter of Example 8, wherein the gesture learning module identifies a new gesture upon the gesture matching component determining that the gesture performed by the user does not match a gesture in the database.
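The registration and learning behavior of Examples 5 to 9 can be sketched as a small in-memory store. This is a minimal sketch under stated assumptions, not the implementation described in the source: the class name `GestureDatabase`, its methods, the representation of a gesture as a list of animation-parameter frames, and the injected `matcher` callable are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Assumption: a gesture is stored as a list of per-frame animation parameters.
Gesture = List[List[float]]


@dataclass
class GestureDatabase:
    """Hypothetical in-memory gesture store keyed by gesture name."""
    matcher: Callable[[Gesture, Gesture], bool]
    gestures: Dict[str, Gesture] = field(default_factory=dict)

    def register(self, name: str, gesture: Gesture) -> None:
        """Registration phase: store a copy of the gesture's parameters."""
        self.gestures[name] = [list(frame) for frame in gesture]

    def find_match(self, performed: Gesture) -> Optional[str]:
        """Return the name of the first stored gesture that matches, if any."""
        for name, stored in self.gestures.items():
            if self.matcher(performed, stored):
                return name
        return None

    def learn_if_new(self, performed: Gesture, name: str) -> bool:
        """Add the gesture only when it matches nothing already stored."""
        if self.find_match(performed) is not None:
            return False
        self.register(name, performed)
        return True
```

Passing the comparison function in as `matcher` keeps the store independent of whichever matching process is actually used.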
Some embodiments pertain to Example 10 that includes a method to facilitate gesture matching comprising selecting a gesture from a database during an authentication phase, translating the selected gesture into an animated avatar, displaying the avatar, prompting a user to perform the selected gesture, capturing a real-time image of the user and comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
Example 11 includes the subject matter of Example 10, further comprising authenticating the user if the gesture performed by the user in the captured image matches the selected gesture.
Example 12 includes the subject matter of Example 11, wherein comparing the gesture performed by the user to the selected gesture comprises selecting a key frame from the user image and determining a temporal sequence across multiple frames.
Example 13 includes the subject matter of Example 11, wherein the comparison is performed using a temporal sequence matching process.
Example 14 includes the subject matter of Example 10, further comprising performing a registration process prior to the authentication phase.
Example 15 includes the subject matter of Example 14, wherein the registration process comprises identifying gestures from captured images of the user and storing the gestures in the database for recognition.
Example 16 includes the subject matter of Example 15, wherein the gestures are stored as animation parameters.
Example 17 includes the subject matter of Example 16, wherein one of the captured gestures is selected from the database during the authentication phase.
Example 18 includes the subject matter of Example 10, further comprising identifying new gestures performed by the user and adding the new gestures to the database.
Example 19 includes the subject matter of Example 18, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
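The method steps of Examples 10 and 11 can be sketched as a single function whose collaborators are passed in as callables. This is an illustrative sketch only: the function name `authenticate`, the parameter names, the prompt text, and the use of random selection are assumptions, since the examples state only that a gesture is selected from the database.

```python
import random
from typing import Callable, Dict, List, Optional

# Assumption: a gesture is a list of per-frame animation parameters.
Gesture = List[List[float]]


def authenticate(
    database: Dict[str, Gesture],
    render_avatar: Callable[[Gesture], object],
    display: Callable[[object, str], None],
    capture: Callable[[], Gesture],
    matches: Callable[[Gesture, Gesture], bool],
    rng: Optional[random.Random] = None,
) -> bool:
    """Run one authentication attempt and report whether the gesture matched."""
    if not database:
        raise ValueError("no registered gestures to select from")
    rng = rng or random.Random()
    # Select a gesture from the database (random choice is an assumption).
    name = rng.choice(sorted(database))
    selected = database[name]
    # Translate the selected gesture into an animated avatar.
    avatar = render_avatar(selected)
    # Display the avatar and prompt the user to perform the gesture.
    display(avatar, f"Please perform the gesture: {name}")
    # Capture the user's performance.
    performed = capture()
    # Compare the performed gesture to the selected one.
    return matches(performed, selected)
```

A caller would supply real rendering, display and camera functions; in a test, simple stand-ins such as lambdas are enough to exercise the control flow.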
Some embodiments pertain to Example 20 that includes at least one machine-readable medium comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to carry out operations according to any one of Examples 10 to 19.
Some embodiments pertain to Example 21 that includes an apparatus to facilitate gesture matching, comprising means for selecting a gesture from a database during an authentication phase, means for translating the selected gesture into an animated avatar, means for displaying the avatar, means for prompting a user to perform the selected gesture, means for capturing a real-time image of the user and means for comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
Example 22 includes the subject matter of Example 21, further comprising means for performing a registration process prior to the authentication phase.
Example 23 includes the subject matter of Example 22, wherein the means for registration comprises means for identifying gestures from captured images of the user and means for storing the gestures in the database for recognition.
Example 24 includes the subject matter of Example 22, further comprising means for identifying new gestures performed by the user and means for adding the new gestures to the database.
Example 25 includes the subject matter of Example 24, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
Some embodiments pertain to Example 26 that includes at least one machine-readable medium comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to carry out operations comprising selecting a gesture from a database during an authentication phase, translating the selected gesture into an animated avatar, displaying the avatar, prompting a user to perform the selected gesture, capturing a real-time image of the user and comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
Example 27 includes the subject matter of Example 26, comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to further carry out operations comprising performing a registration process prior to the authentication phase.
Example 28 includes the subject matter of Example 27, wherein the registration process comprises identifying gestures from captured images of the user and storing the gestures in the database for recognition.
Example 29 includes the subject matter of Example 26, comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to further carry out operations comprising identifying new gestures performed by the user and adding the new gestures to the database.
Example 30 includes the subject matter of Example 29, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
The drawings and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, orders of processes described herein may be changed and are not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts necessarily need to be performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of embodiments is by no means limited by these specific examples. Numerous variations, whether explicitly given in the specification or not, such as differences in structure, dimension, and use of material, are possible. The scope of embodiments is at least as broad as given by the following claims.

Claims (30)

  1. An apparatus to facilitate gesture matching, comprising:
    a gesture selection engine to select a gesture from a database during an authentication phase;
    an avatar animation and rendering engine to translate the selected gesture into an animated avatar for display at a display device with a prompt for a user to perform the selected gesture;
    reception and capturing logic to capture, in real-time, an image of a user; and
    a gesture matching component to compare the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  2. The apparatus of claim 1 wherein the gesture matching component authenticates the user if the gesture performed by the user in the captured image matches the selected gesture.
  3. The apparatus of claim 2 wherein the gesture matching component selects a key frame from the user image and determines a temporal sequence across multiple frames to compare the gesture performed by the user to the selected gesture.
  4. The apparatus of claim 3 wherein the comparison is performed using a temporal sequence matching process.
  5. The apparatus of claim 1 further comprising a gesture training module to identify gestures from images of a user captured at reception and capturing logic during a registration phase and store the gestures in the database for recognition.
  6. The apparatus of claim 5 wherein the gesture training module stores the gestures as animation parameters.
  7. The apparatus of claim 6 wherein one of the captured gestures is selected from the database by the gesture selection engine during the authentication phase.
  8. The apparatus of claim 1 further comprising a gesture learning module to identify new gestures performed by the user and add the new gestures to the database.
  9. The apparatus of claim 8 wherein the gesture learning module identifies a new gesture upon the gesture matching component determining that the gesture performed by the user does not match a gesture in the database.
  10. A method to facilitate gesture matching, comprising:
    selecting a gesture from a database during an authentication phase;
    translating the selected gesture into an animated avatar;
    displaying the avatar;
    prompting a user to perform the selected gesture;
    capturing a real-time image of the user; and
    comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  11. The method of claim 10 further comprising authenticating the user if the gesture performed by the user in the captured image matches the selected gesture.
  12. The method of claim 11 wherein comparing the gesture performed by the user to the selected gesture comprises:
    selecting a key frame from the user image; and
    determining a temporal sequence across multiple frames.
  13. The method of claim 11 wherein the comparison is performed using a temporal sequence matching process.
  14. The method of claim 10 further comprising performing a registration process prior to the authentication phase.
  15. The method of claim 14 wherein the registration process comprises:
    identifying gestures from captured images of the user; and
    storing the gestures in the database for recognition.
  16. The method of claim 15 wherein the gestures are stored as animation parameters.
  17. The method of claim 16 wherein one of the captured gestures is selected from the database during the authentication phase.
  18. The method of claim 10 further comprising:
    identifying new gestures performed by the user; and
    adding the new gestures to the database.
  19. The method of claim 18 wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
  20. At least one machine-readable medium comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to carry out operations according to any one of claims 10 to 19.
  21. An apparatus to facilitate gesture matching, comprising:
    means for selecting a gesture from a database during an authentication phase;
    means for translating the selected gesture into an animated avatar;
    means for displaying the avatar;
    means for prompting a user to perform the selected gesture;
    means for capturing a real-time image of the user; and
    means for comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  22. The apparatus of claim 21 further comprising means for performing a registration process prior to the authentication phase.
  23. The apparatus of claim 22 wherein the means for registration comprises:
    means for identifying gestures from captured images of the user; and
    means for storing the gestures in the database for recognition.
  24. The apparatus of claim 22 further comprising:
    means for identifying new gestures performed by the user; and
    means for adding the new gestures to the database.
  25. The apparatus of claim 24 wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
  26. At least one machine-readable medium comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to carry out operations comprising:
    selecting a gesture from a database during an authentication phase;
    translating the selected gesture into an animated avatar;
    displaying the avatar;
    prompting a user to perform the selected gesture;
    capturing a real-time image of the user; and
    comparing the gesture performed by the user in the captured image to the selected gesture to determine whether there is a match.
  27. The machine-readable medium of claim 26, comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to further carry out operations comprising performing a registration process prior to the authentication phase.
  28. The machine-readable medium of claim 27, wherein the registration process comprises:
    identifying gestures from captured images of the user; and
    storing the gestures in the database for recognition.
  29. The machine-readable medium of claim 26, comprising a plurality of instructions that in response to being executed on a computing device, causes the computing device to further carry out operations comprising:
    identifying new gestures performed by the user; and
    adding the new gestures to the database.
  30. The machine-readable medium of claim 29, wherein a new gesture is identified upon determining that the gesture performed by the user does not match a gesture in the database.
PCT/CN2015/075339 2015-03-28 2015-03-28 Gesture matching mechanism WO2016154834A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
EP15886821.6A EP3278260B1 (en) 2015-03-28 2015-03-28 Gesture matching mechanism
US14/911,390 US10803157B2 (en) 2015-03-28 2015-03-28 Gesture matching mechanism
PCT/CN2015/075339 WO2016154834A1 (en) 2015-03-28 2015-03-28 Gesture matching mechanism
CN201580077135.0A CN107615288B (en) 2015-03-28 2015-03-28 Gesture matching mechanism
US17/066,138 US11449592B2 (en) 2015-03-28 2020-10-08 Gesture matching mechanism
US17/947,991 US11841935B2 (en) 2015-03-28 2022-09-19 Gesture matching mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/075339 WO2016154834A1 (en) 2015-03-28 2015-03-28 Gesture matching mechanism

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US14/911,390 A-371-Of-International US10803157B2 (en) 2015-03-28 2015-03-28 Gesture matching mechanism
US17/066,138 Continuation US11449592B2 (en) 2015-03-28 2020-10-08 Gesture matching mechanism

Publications (1)

Publication Number Publication Date
WO2016154834A1 (en) 2016-10-06

Family

ID=57003779

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/075339 WO2016154834A1 (en) 2015-03-28 2015-03-28 Gesture matching mechanism

Country Status (4)

Country Link
US (3) US10803157B2 (en)
EP (1) EP3278260B1 (en)
CN (1) CN107615288B (en)
WO (1) WO2016154834A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107124664A (en) * 2017-05-25 2017-09-01 百度在线网络技术(北京)有限公司 Exchange method and device applied to net cast

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107615288B (en) 2015-03-28 2021-10-22 英特尔公司 Gesture matching mechanism
US11256923B2 (en) * 2016-05-12 2022-02-22 Arris Enterprises Llc Detecting sentinel frames in video delivery using a pattern analysis
TWI652614B (en) * 2017-05-16 2019-03-01 緯創資通股份有限公司 Portable electronic device and operating method thereof
US11281760B2 (en) 2018-07-18 2022-03-22 Samsung Electronics Co., Ltd. Method and apparatus for performing user authentication
CN109801625A (en) * 2018-12-29 2019-05-24 百度在线网络技术(北京)有限公司 Control method, device, user equipment and the storage medium of virtual speech assistant
US20200275271A1 (en) * 2019-02-21 2020-08-27 Alibaba Group Holding Limited Authentication of a user based on analyzing touch interactions with a device
CN110007765A (en) * 2019-04-11 2019-07-12 上海星视度科技有限公司 A kind of man-machine interaction method, device and equipment
US11146442B1 (en) * 2019-08-15 2021-10-12 Facebook, Inc. Presenting a user profile page including an animation associated with a type of life event described by content posted to the user profile page
US11328047B2 (en) * 2019-10-31 2022-05-10 Microsoft Technology Licensing, Llc. Gamified challenge to detect a non-human user
WO2021171607A1 (en) * 2020-02-28 2021-09-02 日本電気株式会社 Authentication terminal, entrance/exit management system, entrance/exit management method, and program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090307595A1 (en) * 2008-06-09 2009-12-10 Clark Jason T System and method for associating semantically parsed verbal communications with gestures
CN103279253A (en) * 2013-05-23 2013-09-04 广东欧珀移动通信有限公司 Method and terminal device for theme setting
CN103714282A (en) * 2013-12-20 2014-04-09 天津大学 Interactive type identification method based on biological features

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6421453B1 (en) 1998-05-15 2002-07-16 International Business Machines Corporation Apparatus and methods for user recognition employing behavioral passwords
US8688148B2 (en) * 2005-10-25 2014-04-01 Qualcomm Incorporated Dynamic resource matching system
US7558622B2 (en) * 2006-05-24 2009-07-07 Bao Tran Mesh network stroke monitoring appliance
WO2008134745A1 (en) * 2007-04-30 2008-11-06 Gesturetek, Inc. Mobile video-based therapy
JP2011133977A (en) * 2009-12-22 2011-07-07 Sony Corp Image processor, image processing method, and program
US8594374B1 (en) * 2011-03-30 2013-11-26 Amazon Technologies, Inc. Secure device unlock with gaze calibration
US8897500B2 (en) * 2011-05-05 2014-11-25 At&T Intellectual Property I, L.P. System and method for dynamic facial features for speaker recognition
CN104170358B (en) 2012-04-09 2016-05-11 英特尔公司 For the system and method for incarnation management and selection
US9092600B2 (en) * 2012-11-05 2015-07-28 Microsoft Technology Licensing, Llc User authentication on augmented reality display device
GB2525516B (en) * 2012-11-14 2020-04-22 Weiss Golan Biometric methods and systems for enrollment and authentication
US8856541B1 (en) * 2013-01-10 2014-10-07 Google Inc. Liveness detection
CN103218842B (en) * 2013-03-12 2015-11-25 西南交通大学 A kind of voice synchronous drives the method for the three-dimensional face shape of the mouth as one speaks and facial pose animation
US9274607B2 (en) * 2013-03-15 2016-03-01 Bruno Delean Authenticating a user using hand gesture
US9348989B2 (en) * 2014-03-06 2016-05-24 International Business Machines Corporation Contemporaneous gesture and keyboard entry authentication
CN107615288B (en) 2015-03-28 2021-10-22 英特尔公司 Gesture matching mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3278260A4 *

Also Published As

Publication number Publication date
CN107615288A (en) 2018-01-19
CN107615288B (en) 2021-10-22
US20210026941A1 (en) 2021-01-28
US20230019957A1 (en) 2023-01-19
US10803157B2 (en) 2020-10-13
US11449592B2 (en) 2022-09-20
EP3278260A4 (en) 2018-08-29
EP3278260A1 (en) 2018-02-07
EP3278260B1 (en) 2021-03-17
US11841935B2 (en) 2023-12-12
US20180060550A1 (en) 2018-03-01

Similar Documents

Publication Publication Date Title
US11841935B2 (en) Gesture matching mechanism
US9489760B2 (en) Mechanism for facilitating dynamic simulation of avatars corresponding to changing user performances as detected at computing devices
KR102374446B1 (en) Avatar selection mechanism
US9952676B2 (en) Wearable device with gesture recognition mechanism
Wang et al. Enabling live video analytics with a scalable and privacy-aware framework
US9852495B2 (en) Morphological and geometric edge filters for edge enhancement in depth images
US10143421B2 (en) System and method for user nudging via wearable devices
US20210192031A1 (en) Motion-based credentials using magnified motion
US20160086088A1 (en) Facilitating dynamic affect-based adaptive representation and reasoning of user behavior on computing devices
US9792673B2 (en) Facilitating projection pre-shaping of digital images at computing devices
WO2016045050A1 (en) Facilitating efficient free in-plane rotation landmark tracking of images on computing devices
US20160378296A1 (en) Augmented Reality Electronic Book Mechanism
US20170163957A1 (en) Powering unpowered objects for tracking, augmented reality, and other experiences
US9392189B2 (en) Mechanism for facilitating fast and efficient calculations for hybrid camera arrays
US9256780B1 (en) Facilitating dynamic computations for performing intelligent body segmentations for enhanced gesture recognition on computing devices
WO2017112095A1 (en) Code filters for coded light depth acquisition in depth images
US10733491B2 (en) Fingerprint-based experience generation

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase Ref document number: 14911390; Country of ref document: US
121 EP: the EPO has been informed by WIPO that EP was designated in this application Ref document number: 15886821; Country of ref document: EP; Kind code of ref document: A1
REEP Request for entry into the European phase Ref document number: 2015886821; Country of ref document: EP
NENP Non-entry into the national phase Ref country code: DE