WO2017062263A1 - System for gesture recognition

Info

Publication number
WO2017062263A1
Authority
WO
WIPO (PCT)
Prior art keywords
filtering
light
invisible light
film
camera
Prior art date
2015-10-09
Application number
PCT/US2016/054567
Other languages
French (fr)
Inventor
Yong Rui
Zhiwei Li
Rui CAI
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-10-09
Filing date
2016-09-30
Publication date
2017-04-13
Application filed by Microsoft Technology Licensing, LLC
Publication of WO2017062263A1

Classifications

    • G02B5/20 Filters
    • G02B5/208 Filters for use with infrared or ultraviolet radiation, e.g. for separating visible light from infrared and/or ultraviolet radiation
    • G02B5/28 Interference filters
    • G02B5/285 Interference filters comprising deposited thin solid films
    • G02B5/286 Interference filters comprising deposited thin solid films having four or fewer layers, e.g. for achieving a colour effect
    • G06F3/014 Hand-worn input/output arrangements, e.g. data gloves
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/1341 Sensing with light passing through the finger
    • H04N23/50 Cameras or camera modules comprising electronic image sensors; Constructional details
    • H04N5/33 Transforming infrared radiation


Abstract

In the present disclosure, accessory equipment for gesture recognition is provided. The equipment can be used with a camera on an electronic device such as a multifunction portable device. The equipment includes in part a first filtering layer and a second filtering layer. The first filtering layer filters out visible light from the ambient light to obtain invisible light. The invisible light then arrives at the second filtering layer, where its spectrum is shifted to a range of the visible spectrum. The shifted light is then received by the camera, which images it for gesture recognition. By recognizing gestures based on such images, background noise can be significantly reduced or eliminated.

Description

SYSTEM FOR GESTURE RECOGNITION
BACKGROUND
[0001] Gesture recognition allows a device to recognize gestures that originate from a bodily motion or state, for example, from the face or hand of a user. In this way, the human-machine interface (HMI) of the device enables the user to communicate and interact with the device naturally, without mechanical input devices. For instance, while the user performs a gesture with his/her finger(s), the device recognizes the gesture and then acts accordingly. As an example, in response to a swipe gesture by the user, a cursor or another object may be moved on the display screen of the device.
[0002] In general, gesture recognition is performed using techniques of computer vision and image processing. In gesture recognition, it is usually necessary to separate the foreground, such as a finger, from the background. This fundamental process is very challenging for some cameras, such as the RGB cameras on mobile phones, especially when the background contains abundant texture.
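To see why, consider the kind of color-based segmentation an RGB camera must rely on. The following Python sketch (assuming OpenCV is available; the skin-tone thresholds are illustrative values, not drawn from this disclosure) works tolerably against plain backgrounds but leaks wherever the background contains skin-like or textured regions:

```python
import cv2
import numpy as np

def segment_hand_rgb(frame_bgr: np.ndarray) -> np.ndarray:
    """Naive skin-color segmentation on an ordinary BGR camera frame.

    Works tolerably against plain backgrounds, but any skin-toned or
    textured clutter in the background leaks into the mask, which is
    the core difficulty described above.
    """
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Rough skin-tone range in HSV; highly lighting-dependent.
    lower = np.array([0, 40, 60], dtype=np.uint8)
    upper = np.array([25, 180, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # Morphological opening removes speckle, but cannot remove
    # skin-colored background objects.
    kernel = np.ones((5, 5), np.uint8)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```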
SUMMARY
[0003] In accordance with implementations of the subject matter described herein, there is provided accessory equipment for gesture recognition. The equipment may be used with a camera on a primary device such as a multifunction portable electronic device. The equipment includes in part a first filtering layer and a second filtering layer. The first filtering layer filters out visible light from the ambient light to obtain invisible light. The invisible light then arrives at the second filtering layer, where the spectrum of the invisible light is shifted to a range of the visible spectrum. The shifted invisible light is then received by the camera for gesture recognition. That is, the imaging at the camera is performed on the basis of light in the form of visible light but containing information of the invisible spectrum. By recognizing gestures based on such images, the background noise can be significantly reduced or eliminated.
[0004] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] FIG. 1 shows a block diagram of an environment where implementations of the subject matter described herein can be implemented; [0006] FIGS. 2A, 2B and 2C show schematic diagrams of example accessory equipment in accordance with one implementation of the subject matter described herein;
[0007] FIG. 3 shows a block diagram of the optical filtering portion of the accessory equipment in accordance with implementations of the subject matter described herein;
[0008] FIG. 4 shows an example image of a user's hand which is obtained based on the light filtered by the accessory equipment in accordance with one implementation of the subject matter described herein;
[0009] FIG. 5 shows a block diagram of a device that can be used with the accessory equipment in accordance with implementations of the subject matter described herein;
[0010] FIG. 6 shows a block diagram of an integrated headset in accordance with implementations of the subject matter described herein; and
[0011] FIG. 7 shows a flowchart of a method for using the accessory equipment in accordance with one implementation of the subject matter described herein.
[0012] Throughout the drawings, the same or similar reference symbols are used to indicate the same or similar elements.
DETAILED DESCRIPTION
[0013] The subject matter described herein will now be discussed with reference to several example implementations. It should be understood that these implementations are discussed only to enable those skilled in the art to better understand and thus implement the subject matter described herein, rather than to suggest any limitations on the scope of the subject matter.
[0014] As used herein, the term "includes" and its variants are to be read as open terms that mean "includes, but is not limited to." The term "or" is to be read as "and/or" unless the context clearly indicates otherwise. The term "based on" is to be read as "based at least in part on." The terms "one implementation" and "an implementation" are to be read as "at least one implementation." The term "another implementation" is to be read as "at least one other implementation." The terms "first," "second," "third" and the like may refer to different or same objects. Other definitions, explicit and implicit, can be included below.
[0015] In accordance with implementations of the subject matter described herein, accessory equipment for gesture recognition is provided. The equipment may be used with a camera on an electronic device such as a multifunction portable device. The equipment includes in part a first filtering layer and a second filtering layer. The first filtering layer filters out visible light from the ambient light to obtain invisible light. The invisible light is transmitted to the second filtering layer, which shifts the spectrum of the invisible light to a range of the visible spectrum. The shifted invisible light is then captured by the camera for gesture recognition. That is, the camera captures the image in the form of visible light but containing information of the invisible light. By recognizing gestures using such images, the background noise can be significantly reduced or eliminated.
[0016] FIG. 1 illustrates a block diagram of an environment where implementations of the subject matter described herein can be implemented. As shown, accessory equipment 100 can be used with a primary device 110. In this example, the primary device 110 is at least partially contained in the accessory equipment 100. In alternative implementations, the accessory equipment 100 and the primary device 110 may work together in other suitable manners. The accessory equipment 100 includes an optical filtering portion 102 and the primary device 110 includes a camera 112. In accordance with implementations of the subject matter described herein, the optical filtering portion 102 is arranged in such a way that the ambient light arrives at the camera 112 of the primary device 110 via the optical filtering portion 102. That is, the optical filtering portion 102 is located in the optical path towards the camera 112. In this way, when a user performs a gesture, the light will first be received by the optical filtering portion 102 of the accessory equipment 100. The light filtered by the optical filtering portion 102 is sensed by the camera 112 for imaging and gesture recognition, as will be discussed below.
[0017] In some implementations, the accessory equipment 100 is a container which can contain at least a part of the primary device 110. FIGS. 2A-2C show schematic diagrams of an example implementation of the accessory equipment 100. In this example, the accessory equipment 100 is a box-shaped container and the primary device 110 is a smart phone which can be contained in the accessory equipment 100. The screen area of the display of the primary device 110 can be divided into two or more screen parts, for example. In the shown example, there are two screen parts 114 and 116. The accessory equipment 100 includes one or more viewing holes. In the shown example, there are two viewing holes 124 and 126 which allow the user to view the content rendered on the screen parts 114 and 116, respectively. By rendering virtual reality (VR) content for the right and left eyes on the screen parts 114 and 116, respectively, the accessory equipment 100 can be worn and used by the user as a VR headset.
[0018] As shown in FIG. 2C, the accessory equipment 100 further includes a camera hole 128 via which the camera 112 of the primary device 110 can receive light. In those implementations where the camera 112 and the display are located on opposite sides of the primary device 110, the camera hole 128 and the viewing holes 124 and 126 are located on opposite sides of the accessory equipment 100 when the accessory equipment 100 encapsulates the primary device 110. Other relative locations of the camera hole 128 and the viewing holes 124 and 126 are possible as well. It is to be understood that the primary device 110 may include more than one camera 112 in some implementations. Accordingly, the accessory equipment 100 may have more than one camera hole 128.
[0019] In some implementations, the optical filtering portion 102 is in the form of one or more thin films. In such implementations, the thin film(s) may be arranged on the accessory equipment 100 to cover the camera hole 128. In this way, the light will pass through the film(s) before reaching the camera 112. Other arrangements are possible as well. It is to be understood that the optical filtering portion 102 does not necessarily have to be implemented as a film(s). For example, one or more optical filters can act as the optical filtering portion 102. Example implementations will be described in the following paragraphs.
[0020] FIG. 3 shows a block diagram of the optical filtering portion 102 of the accessory equipment 100 in accordance with implementations of the subject matter described herein. In general, the optical filtering portion 102 includes two filtering layers 310 and 320. The ambient light first arrives at the first filtering layer 310 and then reaches the second filtering layer 320. The first filtering layer 310 is a visible light filtering layer which is used to filter out the visible light from the received ambient light. As a result, the light reaching the second filtering layer 320 only contains the invisible light, such as infrared light. The second filtering layer 320 is used to shift the spectrum of the received invisible light to a range of the visible spectrum. In one implementation, the wavelength of the invisible light may be shifted into the range of 380 nm to 760 nm, for example. Various techniques for spectrum shifting, whether currently known or to be developed in the future, can be used; examples will be described below.
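The role of the two layers can be sketched numerically. The Python model below is deliberately idealized: the visible band follows paragraph [0020], while the 760-1100 nm working band and the linear remapping are assumptions made for illustration only, not a characterization of any real film:

```python
import numpy as np

VISIBLE = (380.0, 760.0)   # nm, per paragraph [0020]
NEAR_IR = (760.0, 1100.0)  # nm, assumed working band of layer 320

def layer_310_transmission(wavelengths_nm: np.ndarray) -> np.ndarray:
    """Idealized visible-cut filter: blocks 380-760 nm, passes the rest."""
    lo, hi = VISIBLE
    return np.where((wavelengths_nm >= lo) & (wavelengths_nm <= hi), 0.0, 1.0)

def layer_320_shift(wavelengths_nm: np.ndarray) -> np.ndarray:
    """Linearly remap the assumed near-IR band onto the visible range.

    Real wavelength-shifting materials are not linear; this mapping only
    illustrates the spectral bookkeeping of FIG. 3.
    """
    ir_lo, ir_hi = NEAR_IR
    vis_lo, vis_hi = VISIBLE
    frac = (wavelengths_nm - ir_lo) / (ir_hi - ir_lo)
    return vis_lo + frac * (vis_hi - vis_lo)

wavelengths = np.linspace(300.0, 1100.0, 9)
survivors = wavelengths[layer_310_transmission(wavelengths) > 0.5]
infrared = survivors[survivors >= NEAR_IR[0]]
print(layer_320_shift(infrared))  # every value now lies inside 380-760 nm
```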
[0021] In some implementations, the first and second filtering layers 310 and 320 may be implemented as one or more films, as described above. In one implementation, the filtering layers 310 and 320 are implemented as two separate layers for filtering the visible light and shifting the invisible light, respectively. These layers may fit to each other. In another implementation, instead of the dual-film configuration, the first and second filtering layers 310 and 320 may be integrally formed as a single film that not only filters out the visible light but also shifts the spectrum of the remaining invisible light. Instead of or in addition to the thin film(s), in other implementations, the first and/or second filtering layers 310 and 320 may be implemented as optical filters, lenses and/or other suitable optical devices.
[0022] The first and/or second filtering layers 310 and 320 can be made of any suitable material that is able to filter out the visible light of certain wavelengths and/or shift the invisible spectrum. For example, in some implementations, the first and/or second filtering layers 310 and 320 may be implemented as absorptive filters to which various inorganic or organic compounds are added. The compounds are used to absorb visible light of certain wavelengths while allowing the invisible light to transmit. Examples of the compounds include oxides. In one implementation, the compounds are added on a glass substrate. Alternatively, the compounds can also be added to plastic (often polycarbonate or acrylic) to produce gel filters which are lighter and cheaper than glass-based filters. In yet another implementation, resin can be used to form the first and/or second filtering layers 310 and 320.
[0023] Alternatively, or in addition, the first and/or second filtering layers 310 and 320 can be implemented as interference filters. Optical coatings with different refractive indexes are built up upon a substrate which can be made of glass or resin, for example. The interfaces between the coating layers of different refractive index produce phased reflections, thereby selectively reinforcing the invisible light and interfering with the visible light. The coating layers can be added by vacuum deposition. By controlling the thickness and number of the coating layers, the wavelength of the passband of the filtering layers can be tuned and made as wide or narrow as desired. Any other suitable implementations are possible as well.
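The link between coating thickness and the tuned wavelength can be illustrated with the quarter-wave rule. In the Python sketch below, the materials and refractive indices are assumed examples; a real design would use a full transfer-matrix computation over the whole stack:

```python
def quarter_wave_thickness_nm(center_wavelength_nm: float,
                              refractive_index: float) -> float:
    """Physical thickness giving an optical thickness of lambda/4.

    Stacking alternating high- and low-index quarter-wave layers
    produces the phased reflections described above, centered on the
    chosen wavelength; more layer pairs narrow the passband.
    """
    return center_wavelength_nm / (4.0 * refractive_index)

# Assumed materials: TiO2 (high index, ~2.4) and SiO2 (low, ~1.46),
# tuned here to an 850 nm infrared band as an example target.
for name, n in (("TiO2", 2.4), ("SiO2", 1.46)):
    print(name, round(quarter_wave_thickness_nm(850.0, n), 1), "nm")
```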
[0024] By filtering and processing the light with the optical filtering portion 102 as described above, the camera 112 of the primary device 110 may sense and capture light in the form of visible light but carrying the information of invisible light. In this way, even off-the-shelf cameras can be used directly with the accessory equipment 100. Conventional cameras usually include internal filters which filter out infrared light. By shifting the infrared light into the range of the visible spectrum, the cameras are able to directly sense and process the received light within the visible spectrum, such that the imaging process can be correctly completed.
[0025] In the meantime, by generating images based on the essentially invisible light, noise in the ambient light can be significantly reduced or eliminated. FIG. 4 shows an example image 400 of a user's hand which is generated by the camera 112 of the primary device 110 based on the light filtered by the optical filtering portion 102 of the accessory equipment 100. The image 400 can be considered an infrared image. It can be seen that the infrared image contains little noise. The primary device 110 can apply a gesture recognition process to the image 400 to recognize the user's gesture and act accordingly. It would be appreciated that, compared with normal images obtained directly from the ambient light, the foreground and background can be separated more accurately and efficiently by use of the infrared image 400, which in turn improves the accuracy and efficiency of the gesture recognition.
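Against such a low-noise infrared image, the foreground separation step reduces to a simple global threshold. A minimal OpenCV sketch, assuming any single-channel infrared frame stands in for image 400:

```python
import cv2
import numpy as np

def segment_hand_ir(ir_image: np.ndarray):
    """Return the hand contour from a single-channel infrared frame.

    Because the filtering portion 102 has already removed the textured
    visible background, a global Otsu threshold followed by taking the
    largest contour is typically sufficient.
    """
    _, mask = cv2.threshold(ir_image, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea) if contours else None
```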
[0026] FIG. 5 shows a block diagram of a primary device 110 that can be used with the accessory equipment 100 in accordance with implementations of the subject matter described herein. Examples of the primary device 110 include, but are not limited to, a multifunction portable device such as a mobile phone or a tablet computer, a desktop personal computer (PC), or the like. It is to be understood that the description of the primary device 110 is not intended to suggest any limitation as to the scope of use or functionality of the subject matter described herein, as various implementations may be implemented in diverse general-purpose or special-purpose computing environments.
[0027] As shown, the primary device 110 includes at least one processing unit (or processor) 510, a memory 520, storage 530, one or more input devices 540, one or more output devices 550, and one or more communication connections 560. Specifically, the input device(s) 540 may include the camera 112. The camera 112 is configured to receive light filtered by the optical filtering portion 102 of the accessory equipment 100 and to generate one or more images such as infrared images. The processing unit 510 executes computer-executable instructions and may be a real or a virtual processor. The processing unit 510 may recognize the user's gestures based on the generated one or more images. Then the processing unit 510 may control the primary device 110 to act according to the recognized gestures.
[0028] The one or more output devices 550 may include a display 552 such as a touch-sensitive screen display for rendering content to the user. In some implementations, the screen area of the display 552 may be divided into at least two parts for rendering VR content for the right and left eyes of the user, respectively. The primary device 110 can be put into the accessory equipment 100 which can be manufactured as a container, as shown in FIGS. 2A-2C. The user may wear the accessory equipment 100 as a headset and view the VR content through at least one viewing hole of the accessory equipment 100, such as the viewing holes 124 and 126. The camera 112 is aligned with the camera hole 128 covered by the optical filtering portion 102. In this way, the user may perform gestures using his/her fingers to operate the primary device 110. For example, the user is enabled to manipulate the VR content rendered on the display 552.
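The split-screen rendering described here amounts to drawing the scene twice with a small horizontal offset. A schematic Python sketch, where the pixel offset is a placeholder rather than a real per-eye projection:

```python
import numpy as np

def render_stereo(frame: np.ndarray, eye_offset_px: int = 12) -> np.ndarray:
    """Tile left- and right-eye views side by side, as on parts 114/116.

    The horizontal shift is a stand-in for true per-eye projection; a
    production renderer would also correct for the viewing-hole optics.
    """
    half = frame.shape[1] // 2
    left = np.roll(frame, eye_offset_px, axis=1)[:, :half]
    right = np.roll(frame, -eye_offset_px, axis=1)[:, :half]
    return np.concatenate([left, right], axis=1)
```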
[0029] The storage 530 may be removable or non-removable, and may include computer-readable storage media such as flash drives, magnetic disks or any other medium which can be used to store information and which can be accessed within the primary device 110. The communication connection(s) 560 enables communication over a communication medium to another computing entity. Additionally, functionality of the components of the primary device 110 may be implemented in a single computing machine or in multiple computing machines that are able to communicate over communication connections.
[0030] In the examples described above, the accessory equipment 100 and primary device 110 are separate from one another. In alternative implementations, the accessory equipment 100 and primary device 110 can be integrated into a single device. That is, the subject matter as described herein can be implemented as either separate devices or an integrated device or system such as a headset that has all the necessary elements built therein. FIG. 6 shows a block diagram of such a headset in accordance with implementations of the subject matter described herein.
[0031] As shown, the headset 600 includes not only the camera 112 but also the optical filtering portion 102. The user can directly wear the headset 600 without assembling the primary device with the accessory equipment as in the example shown in FIGS. 1 and 2A-2C. In some implementations, the filtering layers 310 and 320 of the optical filtering portion 102 may be removably attached to the lens of the camera 112. It is also possible to directly integrate the filtering layers 310 and 320 into the lens of the camera 112. Further, the headset 600 may include a display 610 for rendering content such as VR content, a processing unit (not shown) for controlling the elements of the headset 600, and any other necessary elements. In this example, the display 610 and the camera 112 are located at opposite sides of the headset 600. Other arrangements are possible depending on the form factor of the headset 600. It is to be understood that all the features described above with reference to FIGS. 1-5 apply to the example shown in FIG. 6.
[0032] FIG. 7 shows a flowchart of a method 700 for using the accessory equipment in accordance with implementations of the subject matter described herein. As shown, in step 710, visible light is filtered out from ambient light to obtain invisible light. Next, in step 720, the spectrum of the invisible light is shifted to a range of visible spectrum. As described above, the shifted invisible light can be used by the camera to generate an image with reduced noise for recognizing a gesture. In this way, accuracy of the gesture recognition is improved. It is to be understood that all the features described above with reference to FIGS. 1-6 apply to the method 700 shown in FIG. 7.
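On the software side, the camera-facing counterpart of method 700 reduces to a capture-segment-recognize loop. A minimal sketch, assuming the segment_hand_ir helper sketched under paragraph [0025] and any camera reachable through cv2.VideoCapture:

```python
import cv2

def run_gesture_loop(camera_index: int = 0) -> None:
    """Consume frames already filtered per steps 710-720 and recognize."""
    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            hand = segment_hand_ir(gray)  # sketched under paragraph [0025]
            if hand is not None:
                x, y, w, h = cv2.boundingRect(hand)
                # A full system would classify the contour against a
                # gesture vocabulary; here we only report hand position.
                print(f"hand at ({x + w // 2}, {y + h // 2})")
            if cv2.waitKey(1) == 27:  # ESC exits
                break
    finally:
        cap.release()
```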
[0033] The functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-Programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
[0034] Program code for carrying out methods of the subject matter described herein may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
[0035] In the context of this disclosure, a machine readable medium may be any tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine readable medium may be a machine readable signal medium or a machine readable storage medium. A machine readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
[0036] Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are contained in the above discussions, these should not be construed as limitations on the scope of the subject matter described herein, but rather as descriptions of features that may be specific to particular implementations. Certain features that are described in the context of separate implementations may also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation may also be implemented in multiple implementations separately or in any suitable sub-combination.
[0037] Some example implementations of the subject matter described herein are listed below.
[0038] In some implementations, a system is provided. The system comprises an optical filtering portion including a first filtering layer for filtering out visible light from ambient light to obtain invisible light, and a second filtering layer for shifting spectrum of the invisible light to a range of visible spectrum. The system further comprises a camera configured to generate an image based on the shifted invisible light in the range of visible spectrum; a display configured to render content to a user; and a processor configured to recognize a gesture performed by the user based on the generated image and to control the rendered content based on the recognized gesture.
[0039] In some implementations, the first and second filtering layers are integrated as a single film. In some implementations, the first filtering layer is a first film, and the second filtering layer is a second film that is different from the first film; the first and second films fit to each other. In some implementations, at least one of the first and second filtering layers is made of glass, resin, or gel. In some implementations, at least one of the first and second filtering layers includes optical coatings with different refractive indexes for reinforcing the invisible light and interfering with the visible light. In some implementations, at least one of the first and second filtering layers includes a compound for absorbing the visible light and transmitting the invisible light.
[0040] In some implementations, the display and the camera are located at opposite sides of a device. In some implementations, the content rendered on the display includes virtual reality (VR) content. In some implementations, a screen area of the display is divided into a plurality of parts for rendering the VR content.
[0041] In some implementations, equipment for use with a camera is provided. The equipment comprises a first filtering layer for filtering out visible light from ambient light to obtain invisible light; and a second filtering layer for shifting spectrum of the invisible light to a range of visible spectrum, where the shifted invisible light is captured and used by the camera to generate an image with reduced noise for recognizing a gesture.
[0042] In some implementations, at least one of the first and second filtering layers includes a film. In some implementations, the first and second filtering layers are integrally formed as a film. In some implementations, the first filtering layer is a first film, and the second filtering layer is a second film that is different from the first film; the first and second films fit to each other. In some implementations, at least one of the first and second filtering layers is made of glass, resin, or gel. In some implementations, at least one of the first and second filtering layers includes optical coatings with different refractive indexes for reinforcing the invisible light and interfering with the visible light. In some implementations, at least one of the first and second filtering layers includes a compound for absorbing the visible light and transmitting the invisible light.
[0043] In some implementations, the equipment comprises a container for containing a multifunction portable device including the camera, the container having a camera hole that allows the camera to capture the shifted invisible light, the first and second filtering layers covering the camera hole. In some implementations, the equipment is a wearable headset that allows a user to control the multifunction portable device by the gesture. In some implementations, the equipment further comprises at least one viewing hole that allows a user to view content rendered on a display of the multifunction portable device.
[0044] In some implementations, a method is provided. The method comprises filtering out visible light from ambient light to obtain invisible light; and shifting the spectrum of the invisible light to a range of the visible spectrum, the shifted invisible light being captured by a camera for recognizing a gesture.
[0045] Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims

1. A system comprising:
an optical filtering portion, including
a first filtering layer for filtering out visible light from ambient light to obtain invisible light, and
a second filtering layer for shifting spectrum of the invisible light to a range of visible spectrum;
a camera configured to generate an image based on the shifted invisible light in the range of visible spectrum;
a display configured to render content to a user; and
a processor configured to recognize a gesture performed by the user based on the generated image and to control the rendered content based on the recognized gesture.
2. The system of claim 1, wherein the first and second filtering layers are integrated as a single film.
3. The system of claim 1, wherein the first filtering layer is a first film, and wherein the second filtering layer is a second film that is different from the first film, the first and second films fitting to each other.
4. The system of claim 1, wherein at least one of the first and second filtering layers is made of glass, resin, or gel.
5. The system of claim 1, wherein at least one of the first and second filtering layers includes optical coatings with different refractive indexes for reinforcing the invisible light and interfering with the visible light.
6. The system of claim 1, wherein at least one of the first and second filtering layers includes a compound for absorbing the visible light and transmitting the invisible light.
7. The system of claim 1, wherein the display and the camera are located at opposite sides of a device.
8. The system of claim 1, wherein the content rendered on the display includes virtual reality (VR) content.
9. The system of claim 8, wherein a screen area of the display is divided into a plurality of parts for rendering the VR content.
10. Equipment for use with a camera comprising:
a first filtering layer for filtering out visible light from ambient light to obtain invisible light; and
a second filtering layer for shifting spectrum of the invisible light to a range of visible spectrum, the shifted invisible light being used by the camera to generate an image with reduced noise for recognizing a gesture.
11. The equipment of claim 10, wherein at least one of the first and second filtering layers includes a film.
12. The equipment of claim 10, wherein the first and second filtering layers are integrally formed as a film.
13. The equipment of claim 10, wherein the first filtering layer is a first film, and wherein the second filtering layer is a second film that is different from the first film, the first and second films fitting to each other.
14. The equipment of claim 10, wherein at least one of the first and second filtering layers is made of glass, resin, or gel.
15. The equipment of claim 10, wherein at least one of the first and second filtering layers includes optical coatings with different refractive indexes for reinforcing the invisible light and interfering with the visible light.
PCT/US2016/054567 (priority 2015-10-09, filed 2016-09-30): System for gesture recognition, published as WO2017062263A1 (en)

Applications Claiming Priority (2)

Application Number    Priority Date    Filing Date    Title
CN201510649556.8      2015-10-09
CN201510649556.8A     2015-10-09       2015-10-09     System used for posture recognition (published as CN106570441A (en))

Publications (1)

Publication Number Publication Date
WO2017062263A1 (en)    2017-04-13

Family

ID=57145035

Family Applications (1)

Application Number    Priority Date    Filing Date    Title
PCT/US2016/054567 (WO2017062263A1)    2015-10-09    2016-09-30    System for gesture recognition

Country Status (2)

Country Link
CN (1) CN106570441A (en)
WO (1) WO2017062263A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107172338B (en) * 2017-06-30 2021-01-15 联想(北京)有限公司 Camera and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7312434B1 (en) * 2006-12-26 2007-12-25 Eaton Corporation Method of filtering spectral energy
US20080048936A1 (en) * 2006-08-10 2008-02-28 Karlton Powell Display and display screen configured for wavelength conversion
WO2015003721A1 (en) * 2013-07-09 2015-01-15 Danmarks Tekniske Universitet Multi-channel up-conversion infrared spectrometer and method of detecting a spectral distribution of light

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101184161A (en) * 2007-12-10 2008-05-21 深圳市通宝莱科技有限公司 Video camera suitable for all-weather use
AU2011220382A1 (en) * 2010-02-28 2012-10-18 Microsoft Corporation Local advertising content on an interactive head-mounted eyepiece
CN102156859B (en) * 2011-04-21 2012-10-03 刘津甦 Sensing method for gesture and spatial location of hand
US20130021374A1 (en) * 2011-07-20 2013-01-24 Google Inc. Manipulating And Displaying An Image On A Wearable Computing System
US20140055322A1 (en) * 2012-08-21 2014-02-27 Hon Hai Precision Industry Co., Ltd. Display system and head-mounted display apparatus
US9218673B2 (en) * 2012-10-11 2015-12-22 Nike, Inc. Method and system for manipulating camera light spectrum for sample article false color rendering
CN105357426B (en) * 2014-11-03 2019-01-15 苏州思源科安信息技术有限公司 Photoelectronic imaging method and mobile terminal for mobile terminal visible light and bio-identification combined system

Also Published As

Publication number Publication date
CN106570441A (en) 2017-04-19

Similar Documents

Publication Publication Date Title
US11676349B2 (en) Wearable augmented reality devices with object detection and tracking
TWI635349B (en) Apparatus and method to maximize the display area of a mobile device by combining camera and camera icon functionality
US11706520B2 (en) Under-display camera and sensor control
CN106575154B (en) Intelligent transparency of holographic objects
US10203761B2 (en) Glass type mobile terminal
EP2940571B1 (en) Mobile terminal and controlling method thereof
US11393254B2 (en) Hand-over-face input sensing for interaction with a device having a built-in camera
US11782514B2 (en) Wearable device and control method thereof, gesture recognition method, and control system
JP5300825B2 (en) Instruction receiving device, instruction receiving method, computer program, and recording medium
US20140317576A1 (en) Method and system for responding to user's selection gesture of object displayed in three dimensions
EP3095074A1 (en) 3d silhouette sensing system
CN104081307A (en) Image processing apparatus, image processing method, and program
CN103067727A (en) Three-dimensional 3D glasses and three-dimensional 3D display system
CN206595991U (en) A kind of double-camera mobile terminal
US11736679B2 (en) Reverse pass-through glasses for augmented reality and virtual reality devices
US20160188963A1 (en) Method of face detection, method of image processing, face detection device and electronic system including the same
CN112462937B (en) Local perspective method and device of virtual reality equipment and virtual reality equipment
WO2017062263A1 (en) System for gesture recognition
EP3186956B1 (en) Display device and method of controlling therefor
US20190266738A1 (en) Mobile terminal and method for controlling the same
CN107340962B (en) Input method and device based on virtual reality equipment and virtual reality equipment
EP3088991A1 (en) Wearable device and method for enabling user interaction
CN103841331A (en) Electronic equipment capable of self-defining picture content and automatic tracking photographic method
Bhowmik Natural and intuitive user interfaces with perceptual computing technologies
US20230362348A1 (en) Reverse pass-through glasses for augmented reality and virtual reality devices

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 16782344

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 16782344

Country of ref document: EP

Kind code of ref document: A1