WO2017075932A1 - Gesture control method and system based on three-dimensional display - Google Patents

Gesture control method and system based on three-dimensional display Download PDF

Info

Publication number
WO2017075932A1
Authority
WO
WIPO (PCT)
Prior art keywords
hand
gesture
dimensional
control
control instruction
Prior art date
Application number
PCT/CN2016/076748
Other languages
English (en)
French (fr)
Inventor
黄源浩
肖振中
许宏淮
钟亮洪
Original Assignee
深圳奥比中光科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳奥比中光科技有限公司 filed Critical 深圳奥比中光科技有限公司
Publication of WO2017075932A1 publication Critical patent/WO2017075932A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Definitions

  • The invention relates to gesture operation control technology, and in particular to a three-dimensional display-based gesture control method and system for natural human-machine interaction.
  • a gesture control method based on three-dimensional display includes the following steps:
  • the step of acquiring position information of the hand and establishing a motion trajectory of the hand in the three-dimensional space coordinate system includes:
  • a series of continuous depth information of the hand is obtained, and a motion trajectory of the hand in the three-dimensional space coordinate system is formed according to the depth information.
  • the operating space of the hand is in linear relationship with the three-dimensional spatial coordinate system, wherein the operating space is a real space in which the hand performs a series of continuous actions.
  • the step of recognizing the gesture of the hand according to the motion trajectory of the hand in the three-dimensional coordinate system includes:
  • the hand contour feature information is extracted, and the feature information is combined with the position information to identify and classify the gesture action.
  • the step of reading the corresponding control instruction according to the gesture action and controlling the operation object in the three-dimensional image according to the control instruction comprises:
  • the control instruction corresponding to the gesture action is read from the stored correspondence between gesture actions and control instructions, and the operation object in the three-dimensional image is controlled to perform the corresponding action according to the control instruction
  • a non-contact, three-dimensional display-based gesture control system for natural human-machine interaction is also provided.
  • a gesture control system based on three-dimensional display comprising:
  • An information acquisition module configured to acquire location information of the hand
  • a coordinate establishing module configured to establish a motion trajectory of the hand in the three-dimensional space coordinate system according to the position information
  • a gesture recognition module configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system
  • an operation control module configured to read a corresponding control instruction according to the gesture action, and control an operation object in the three-dimensional image according to the control instruction.
  • the information acquisition module is further configured to acquire a series of continuous depth information of the hand; the coordinate establishing module is further configured to form a motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
  • the operating space of the hand is in linear relationship with the three-dimensional spatial coordinate system, wherein the operating space is a real space in which the hand performs a series of continuous actions.
  • the gesture recognition module is further configured to extract hand contour feature information, and perform feature matching in combination with the position information to recognize and classify the gesture action.
  • the operation control module includes a storage module, a reading module, and an execution module; the storage module is configured to store a correspondence between each gesture action and the corresponding control instruction; the reading module is configured to read, after a gesture action is recognized, the control instruction corresponding to the gesture action from the correspondence; the execution module is configured to control the operation object in the three-dimensional image to perform the corresponding action according to the control instruction.
  • a non-contact, three-dimensional display-based gesture control device for natural human-machine interaction is also provided.
  • a gesture control device based on three-dimensional display comprising: a depth camera, a three-dimensional display and a processor;
  • the depth camera is configured to acquire a depth image of the hand and output the image to the processor;
  • the processor acquires position information of the hand according to the depth image, and establishes a motion trajectory of the hand in the three-dimensional space coordinate system according to the position information; the processor is further configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system; the processor is further configured to read a corresponding control instruction according to the gesture action, and control an operation object in the three-dimensional image according to the control instruction;
  • the processor is further configured to control the three-dimensional display to display the gesture action according to the gesture action, and display a trajectory of the control instruction corresponding to the gesture action.
  • the above-mentioned three-dimensional display-based gesture control method, system, and device acquire the position information of the hand, establish the motion trajectory of the hand in the three-dimensional space coordinate system, recognize the gesture action of the hand according to the motion trajectory, and finally
  • read the corresponding control instruction according to the gesture action and control the operation object in the three-dimensional image according to the control instruction. That is, the user's gesture action is recognized, and the operation object is then controlled according to the gesture action to perform the corresponding operation. Therefore, natural human-machine interaction can be realized without touching the display screen.
  • FIG. 1 is a flow chart of a gesture control method based on three-dimensional display
  • FIG. 2 is a schematic diagram of an object depth calculation model
  • Figure 3 (a) is a first schematic diagram of the hand moving in correspondence with the cursor in the three-dimensional display
  • Figure 3 (b) is a second schematic diagram of the hand moving in correspondence with the cursor in the three-dimensional display
  • Figure 3 (c) is a first schematic diagram of a gripped object following the movement of the hand in the three-dimensional display
  • Figure 3 (d) is a second schematic diagram of a gripped object following the movement of the hand in the three-dimensional display
  • Figure 3 (e) is a third schematic diagram of the grasping object following the hand movement in the three-dimensional display
  • Figure 3 (f) is a fourth schematic diagram of the grasping object following the hand movement in the three-dimensional display
  • Figure 3 (g) is a fifth schematic diagram of the grasping object following the movement of the hand in the three-dimensional display
  • Figure 4 is a block diagram of a gesture control system based on a three-dimensional display.
  • As shown in FIG. 1, it is a flowchart of a gesture control method based on three-dimensional display.
  • the depth image of the hand is acquired; a depth image, also called a range image, is an image or image channel whose information relates to the distance of object surfaces in the scene from the viewpoint.
  • the gray value of a pixel in the depth image corresponds to the depth value of the corresponding point in the scene.
  • the information contained in the depth image is the depth information.
  • the depth image has two properties: color independence, and the property that the direction in which the gray value changes coincides with the Z direction of the field of view captured by the camera.
  • color independence means that, compared with a color image, the depth image is not disturbed by illumination, shadows, or changes in the environment.
  • the gray value changing in the same direction as the Z direction of the camera's field of view means that the depth image can be used to reconstruct a 3D spatial region within a certain range, and the problems of an object being occluded or parts of the same object overlapping can be solved to some extent. Based on the depth information, it is easy to separate the foreground from the background, which can reduce the difficulty of image recognition.
  • classified by imaging principle, depth images are mainly obtained by the time-of-flight method, structured light, and 3D laser scanning, and are mainly used for human-computer interaction; pattern recognition is performed using the depth images.
  • the depth image can be obtained by the following method:
  • the first method, based on the time-of-flight principle, calculates the depth information of the object surface by measuring the time difference between emitting light and receiving the light reflected back from the object surface.
  • the second method is similar to structured-light coding: a known infrared pattern is projected into the scene, and the distance is measured from the deformation of the pattern recorded by an infrared CMOS camera.
  • the working mode mainly identifies the human body and related actions, and the core of human-body recognition is the skeleton; by tracking the skeleton, the motion of the human body is mapped onto the computer, where related simulation and operations are performed.
  • the method of acquiring a depth image in the present invention is not limited to the above method.
  • when the depth information of the hand is collected, the features of the gesture action are detected, and the control instruction corresponding to the gesture action is given according to the gesture action.
  • for example, according to the mapping relationship, the 3D display cursor follows the hand, a grip of the hand corresponds to a grab-object instruction, a grip moving forward corresponds to an enlargement instruction, a grip moving backward corresponds to a reduction instruction, and so on.
  • after a gesture action is detected, the control instruction corresponding to that gesture action can be output.
  • the pre-collected gesture actions, such as a grip or a grip moving forward, are stored, and control instructions corresponding to the grip and the grip moving forward are set. Therefore, when the user performs an action such as a grip or a grip moving forward, the control instruction corresponding to that action is executed accordingly. That is, once the data of the gesture actions has been collected in advance, any user who performs the corresponding gesture action can have the control instruction corresponding to that gesture action executed.
  • for example, an open palm and a grip respectively represent the control instructions for zooming in and zooming out; or a grip with the thumb up or down respectively represents the control instructions for zooming in and zooming out; or moving the hand forward or backward after a grip respectively represents the control instructions for zooming in and zooming out.
  • gesture actions like these are collected, stored, and assigned corresponding control instructions, so that a correspondence between gesture actions and control instructions is established. Therefore, when a gesture action is recognized, the corresponding operation can be performed according to the corresponding control instruction.
  • in other embodiments, a different control instruction can also be assigned to each gesture action.
  • the collection of gesture actions includes many classes, and each class also includes many different instances. This data collection is carried out in a natural environment, in real rooms or offices, under different lighting and angles, which makes the collected gesture data more practical.
  • a gesture control method based on three-dimensional display includes the following steps:
  • Step S110 acquiring position information of the hand and establishing a motion trajectory of the hand in the three-dimensional space coordinate system.
  • This embodiment can acquire the position information of the hand based on the depth image.
  • the step of acquiring the position information of the hand and establishing the motion track of the hand in the three-dimensional coordinate system includes:
  • a series of continuous depth information of the hand is obtained, and a motion trajectory of the hand in the three-dimensional space coordinate system is formed according to the depth information.
  • this embodiment may use a depth camera, based on depth image acquisition technology, to collect image data of a series of continuous hand movements, and then extract a series of continuous depth information of the hand from the image data.
  • the operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, where the operating space is the real space of the hand's series of continuous movements; from the image data collected by the depth camera in the operating space, a series of continuous depth information and two-dimensional coordinate information of the hand can be obtained.
  • the three-dimensional space coordinate system refers to the space coordinate system corresponding to the stereoscopic image data used for displaying the three-dimensional image; after the depth information of the hand is acquired, the corresponding coordinate point can be found in the three-dimensional space coordinate system.
  • the skeleton information of the hand is tracked and the hand's motion is mapped onto the computer; at the same time, the depth information of the hand is collected, and the depth information and skeleton information are combined to obtain the corresponding coordinate point of the hand in the three-dimensional space coordinate system, which is used to locate the control position in the three-dimensional image.
  • as the hand moves, by tracking the skeleton information of the hand, the motion trajectory of the hand in the three-dimensional space coordinate system can be tracked in turn; that is, tracking of the hand's motion trajectory is completed, and the actual motion trajectory is converted into the three-dimensional space coordinate system.
  • the acquisition of the depth information may adopt a parallel stereo vision model: suppose the extrinsic parameters of camera C1 are represented by rotation matrix R1 and translation vector t1, and the extrinsic parameters of camera C2 are represented by rotation matrix R2 and translation vector t2; if R1 = R2, the left and right cameras are placed in parallel, their relative position differs only by a translation, and such a system is a parallel stereo vision system.
  • in a parallel stereo vision system, the optical axes of the two cameras are parallel to each other, the x-axes of the left and right camera coordinate systems coincide, and the epipolar lines are parallel to each other.
  • the only difference between the two camera coordinate systems is a translation B along the x-axis (i.e., the "baseline").
  • An object depth calculation model as shown in FIG. 2 is established to calculate the object depth.
  • the positions of the left and right camera optical centers are C1 and Cr, respectively
  • B is the translation vector between the two camera optical centers
  • f is the focal length of the camera.
  • p1 and pr are the projection points of the spatial point P on the left and right image planes, respectively.
  • Z is the depth information sought, that is, the distance from the spatial point P to the line C1Cr connecting the camera optical centers.
  • L and R are the feet of the perpendiculars dropped from the camera optical centers onto the image planes.
  • H is the foot of the perpendicular dropped from the spatial point P onto the image plane.
  • by similar triangles, (|B| - (|Lp1| - |Rpr|)) / (Z - f) = |B| / Z, and simplifying gives the depth formula Z = f * |B| / (|Lp1| - |Rpr|), where
  • |Lp1| - |Rpr| is the disparity of the corresponding matched points obtained in stereo matching (the difference x1 - x2 between the image positions of the spatial point P on the two image planes), and the camera optical-center distance |B| and focal length f are obtained by camera calibration.
  • in step S120, the gesture action of the hand is recognized according to the motion trajectory of the hand in the three-dimensional space coordinate system.
  • the step of recognizing the gesture action of the hand according to the movement track of the hand in the three-dimensional coordinate system includes:
  • the hand contour feature information is extracted, and the feature information is combined with the position information to identify and classify the gesture action.
  • in this embodiment, gesture point cloud data is obtained from the depth information of the gesture through three-dimensional point cloud computation; the computed gesture point cloud data contains only the three-dimensional coordinate position information of the hand joint points and the palm center point. The gesture point cloud
  • data is then filtered to remove the noise interference points in the gesture point cloud data, yielding the gesture point cloud information.
  • the three-dimensional gesture point cloud information is plane-registered by rotation and translation, the registered point cloud information is saved, and the contour key-point information of the gesture point cloud information is then extracted; the contour feature points include the fingertip points, fingertip valley points, and the palm center point.
  • since the contour feature point information, combined with the pixel depth values of the depth image, maps out the depth values of the contour feature points, a distance threshold is applied by the Euclidean distance method to screen out the key fingertip information, and from the fingertip point information and the corresponding fingertip valley information,
  • combined with the registration plane, five finger feature vectors are obtained, and the gesture action is recovered from these feature vectors.
  • in step S130, a corresponding control instruction is read according to the gesture action, and an operation object in the three-dimensional image is controlled according to the control instruction.
  • the step of reading the corresponding control instruction according to the gesture action, and controlling the operation object in the three-dimensional image according to the control instruction includes:
  • the control instruction corresponding to the gesture action is read from the corresponding relationship.
  • the control instruction corresponding to the gesture action is found, and the operation object in the three-dimensional image is controlled according to the control instruction.
  • the three-dimensional image of this embodiment can be a spatial stereoscopic image obtained by using true three-dimensional stereoscopic image display technology.
  • true three-dimensional image display technology refers to a technology, based on holographic display technology or volumetric three-dimensional display technology, that displays stereoscopic image data within a certain physical spatial range to form a stereoscopic image in real space.
  • the stereoscopic image data is image data having a three-dimensional space coordinate system, and the information of each volume pixel includes at least position information and image information of the pixel at the point.
  • the holographic display technology herein mainly includes the traditional hologram (transmission holographic display images, reflection holographic display images, image-plane holographic display images, rainbow holographic display images, synthetic holographic display images, etc.) and the computer-generated hologram (CGH, Computer Generated Hologram).
  • the computer hologram floats in the air and has a wide color gamut.
  • the object used to generate the hologram is described by a mathematical model generated in the computer, and the physical interference of the light waves is replaced by computation steps.
  • in each step, the intensity pattern in the CGH model can be determined and output to a reconfigurable device, which re-modulates the light wave information and reconstructs the output.
  • in plain terms, CGH obtains the interference pattern of a computer graphic (a virtual object) through computation, replacing the interference process in which a traditional hologram records the light waves of a physical object; the diffraction process of hologram reconstruction is unchanged in principle, except that a device capable of reconfiguring the light wave information is added, so that holographic display of different static and dynamic computer graphics can be realized.
  • the spatial stereoscopic display device comprises a 360-degree holographic phantom imaging system, the system comprising a light source, a controller, and a beam splitter; the light source may be a spotlight, and the controller comprises one or more processors.
  • the processor receives the stereoscopic image data through a communication interface, obtains the interference pattern of the computer graphic (virtual object) after processing, outputs the interference pattern to the beam splitter, and presents the interference pattern by the light projected from the light source onto the beam splitter,
  • so that a spatial stereoscopic image is formed.
  • the beam splitter here can be a special lens, a four-sided pyramid, or the like.
  • the spatial stereoscopic display device can also be based on a holographic projection device, for example by forming a stereoscopic image on air, special lenses, fog screens, and the like. Therefore, the spatial stereoscopic display device 8 can also be one of devices such as an air holographic projection device, a laser-beam holographic projection device, a holographic projection device having a 360-degree holographic display screen (the principle is to project an image onto a mirror rotating at high speed, thereby realizing a holographic image), and a fog-screen stereoscopic imaging system.
  • volumetric three-dimensional display technology refers to using the human visual mechanism itself to create a displayed object composed of voxel particles instead of molecular particles; besides seeing the shape carried by the light waves, one can also touch the real presence of the voxels. It excites, by appropriate means, the material located within a transparent display volume, and forms voxels through the absorption or scattering of visible radiation; when material at many positions within the volume has been excited, a three-dimensional image composed of many dispersed voxels is formed in three-dimensional space.
  • the present invention can also adopt the following methods:
  • Rotating-body scanning technology: rotating-body scanning technology is mainly used for the display of dynamic objects.
  • a series of two-dimensional images is projected onto a rotating or moving screen while the screen moves at a speed imperceptible to the viewer; because of the persistence of human vision, a three-dimensional object is formed in the human eye. Therefore, a display system using this stereoscopic display technology can realize true three-dimensional display of images (viewable through 360°).
  • Light beams of different colors in the system are projected onto the display medium by the light deflector, so that the medium exhibits rich colors.
  • the display medium allows the beam to produce discrete visible spots, which are voxels, corresponding to any point in the three-dimensional image.
  • a set of voxels is used to create an image, and the observer can observe this true three-dimensional image from any viewpoint.
  • the imaging space in a display device based on a rotating body scanning technique can be generated by rotation or translation of a screen.
  • the voxel is activated on the emitting surface as the screen sweeps across the imaging space.
  • the system includes subsystems such as a laser system, a computer control system, and a rotating display system.
  • Static volumetric imaging technology forms a three-dimensional stereoscopic image based on frequency up-conversion technology.
  • so-called frequency up-conversion three-dimensional display uses the fact that the imaging-space medium, after absorbing multiple photons, spontaneously radiates a kind of fluorescence, thereby producing visible pixel points.
  • the basic principle is to use two mutually perpendicular infrared laser beams that intersect inside the up-conversion material: after two resonant absorptions by the up-conversion material, electrons of the luminescent centers are excited to a high excitation level, and the subsequent downward level transition can produce the emission of visible light, so that such a point in the up-conversion material becomes a bright luminous spot.
  • if the intersection point of the two laser beams is scanned through three-dimensional space inside the up-conversion material along a certain trajectory, the region swept by the intersection point becomes a bright band emitting visible fluorescence; that is, a three-dimensional figure identical to the trajectory of the laser intersection point can be displayed. With this display method, the naked eye can see a three-dimensional image viewable from all directions through 360 degrees.
  • the three-dimensional image in the present invention can also be displayed on the display screen based on the 3D display technology.
  • the display screen mentioned here is based on the 3D display technology, and utilizes the left and right eye parallax of the human eye to enable the human eye to reconstruct the image displayed on the display screen to obtain a virtual 3D stereoscopic image.
  • the display screen is divided into two types: glasses-type display devices and naked-eye display devices.
  • the glasses type display device is realized by using a flat display screen together with 3D glasses.
  • the naked-eye display device, that is, the naked-eye 3D display, consists of four parts: a 3D stereoscopic display terminal, playback software, production software, and application technology; it is a cross-disciplinary stereoscopic display system integrating modern high technologies such as optics, photography, electronic computing, automatic control, software, and 3D animation production.
  • stereoscopic image data having a three-dimensional spatial coordinate system can be converted into image data input to different display devices as needed.
  • the different display devices use different hardware devices based on the imaging mode of the three-dimensional image. For details, refer to related content in the prior art.
  • when the hand performs an open action, the skeleton is tracked, the hand is recognized as performing the open action, and the control instruction corresponding to the open action is looked up; assuming the control instruction corresponding to the open action is the starting action, only the cursor corresponding to the hand is displayed at this time.
  • when the hand is moved in this open state, the instruction fed back to the computer through skeleton tracking is only to track the motion trajectory of the hand, that is, the displayed cursor follows the motion trajectory of the hand. Since the gesture operating space corresponds to the three-dimensional space coordinate system, the movement of the hand within the operating space has a corresponding movement in the three-dimensional space coordinate system.
  • when it is determined that an operation object needs to be operated, the hand is moved so that the cursor corresponding to the hand is within the control area of the operation object.
  • take grip-and-move-forward and grip-and-move-backward representing the reduction and enlargement instructions as an example.
  • when the grip is recognized, the starting position of the hand is acquired, and the palm-center position of the hand is generally used as the starting position.
  • the motion trajectory of the hand is then tracked; when the movement is recognized as forward, the corresponding control instruction is to reduce the operation object, and when the movement is recognized as backward, the corresponding control instruction is to enlarge the operation object.
  • in other embodiments, when the gripping action is the selection instruction, if the cursor corresponding to the hand is within the control area of an operation object and the gripping action is recognized, the object at which the hand's cursor is located at that moment is taken as the operation object; that is, the current object is selected, and the current object can then be moved, copied, and pasted.
  • as shown in FIG. 3(c), when the hand is in the grip state and moves forward, the cursor corresponding to the hand in the three-dimensional display is gradually enlarged.
  • as shown in FIG. 3(d), when the hand is in the grip state and moves backward, the cursor corresponding to the hand in the three-dimensional display is gradually reduced.
  • the three-dimensional display can be located below the depth camera or to its side; as shown in FIG. 3(e) and FIG. 3(f), the placement position of the three-dimensional display does not affect the display of the three-dimensional operating space.
  • in other embodiments, when the open-and-rotate-finger action is the rotation instruction, if the cursor corresponding to the hand is within the control area of an operation object and the open-and-rotate-finger action is recognized, the object at which the hand's cursor is located at that moment is taken as the operation object; that is, the current object is rotated.
  • a gesture control device based on three-dimensional display includes a depth camera, a three-dimensional display, and a processor.
  • the depth camera is used to acquire a depth image of the hand and output to the processor.
  • the processor acquires position information of the hand according to the depth image, and establishes a motion trajectory of the hand in the three-dimensional space coordinate system according to the position information; the processor is further configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system; the processor is further configured to read a corresponding control instruction according to the gesture action, and control an operation object in the three-dimensional image according to the control instruction.
  • the processor is further configured to control the three-dimensional display to display the gesture action according to the gesture action, and display a trajectory of the control instruction corresponding to the gesture action.
  • the gesture action is acquired in the operating space (the real space in which the hand performs a series of continuous actions), and the three-dimensional image, using holographic display technology, can achieve a naked-eye three-dimensional effect; that is, the real space and the virtual display correspond and can be displayed in real time. Therefore, when the user operates a displayed object of the three-dimensional display, operations such as gripping and grip-moving can be performed on the displayed object accurately.
  • referring to FIG. 3(g), suppose the user needs to rotate and move a stereoscopic image to be operated, such as a racket,
  • displayed by a three-dimensional display (a three-dimensional display using holographic display technology); since the operating space and the virtual display correspond in real time, the user only needs to find, in the operating space, the position corresponding to the stereoscopic image and make a gripping action.
  • the depth camera detects the user's gesture and transmits it to the processor.
  • the processor controls the three-dimensional display to display the stereoscopic image to be operated (e.g. the racket) in a state of being gripped by the user's hand.
  • when the user then performs a grip-and-move (or swings the arm) in the operating space, the processor controls the three-dimensional display to display the trajectory along which the stereoscopic image to be operated is moved (or swung).
  • As shown in FIG. 4, it is a block diagram of a gesture control system based on three-dimensional display, which includes:
  • An information acquisition module configured to acquire location information of the hand
  • a coordinate establishing module configured to establish a motion trajectory of the hand in the three-dimensional space coordinate system according to the position information
  • a gesture recognition module configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system
  • an operation control module configured to read a corresponding control instruction according to the gesture action, and control an operation object in the three-dimensional image according to the control instruction.
  • the information acquisition module is further configured to acquire a series of continuous depth information of the hand
  • the coordinate establishing module is further configured to form a motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
  • the operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, wherein the operating space is the real space in which the hand performs a series of continuous actions.
  • the gesture recognition module is further configured to extract hand contour feature information, and perform feature matching in combination with the position information to recognize and classify the gesture action.
  • the operation control module includes a storage module, a reading module, and an execution module.
  • the storage module is configured to store a correspondence between each gesture action and a corresponding control instruction.
  • the reading module is configured to read, after the gesture action is recognized, a control instruction corresponding to the gesture action from the correspondence relationship.
  • the execution module is configured to control the operation object in the three-dimensional image to perform a corresponding action according to the control instruction.
  • the above-mentioned three-dimensional display-based gesture control method, system, and device acquire the position information of the hand, establish the motion trajectory of the hand in the three-dimensional space coordinate system, recognize the gesture action of the hand according to the motion trajectory, and finally
  • read the corresponding control instruction according to the gesture action and control the operation object in the three-dimensional image according to the control instruction. That is, the user's gesture action is recognized, and the operation object is then controlled according to the gesture action to perform the corresponding operation. Therefore, natural human-machine interaction can be realized without touching the display screen.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A gesture control method based on three-dimensional display: position information of a hand is acquired and a motion trajectory of the hand in a three-dimensional space coordinate system is established (S110); a gesture action of the hand is recognized according to the motion trajectory (S120); finally, a corresponding control instruction is read according to the gesture action, and an operation object in a three-dimensional image is controlled according to the control instruction (S130). That is, the user's gesture action is recognized, and the operation object is then controlled according to the gesture action to perform the corresponding operation. Natural human-machine interaction can therefore be achieved without touching a display screen. A gesture control system and device based on three-dimensional display are also provided.

Description

Gesture control method and system based on three-dimensional display [Technical Field]
The present invention relates to gesture operation control technology, and in particular to a three-dimensional display-based gesture control method and system for natural human-machine interaction.
[Background Art]
With the rapid development and wide application of fields such as human-computer interaction, robotics, and virtual reality, new three-dimensional interactive input technology has become a hot topic for many researchers in the field of human-machine virtual interaction. As this technology develops and deepens, the public's demands for its use grow ever higher, and non-contact, high-speed, real-time positioning and three-dimensional operation have become the directions of its development. The traditional approach of controlling a display with a mouse or a touch screen therefore can no longer meet these demands.
[Summary of the Invention]
On this basis, it is necessary to provide a non-contact, three-dimensional display-based gesture control method for natural human-machine interaction.
A gesture control method based on three-dimensional display includes the following steps:
acquiring position information of a hand, and establishing a motion trajectory of the hand in a three-dimensional space coordinate system;
recognizing a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system;
reading a corresponding control instruction according to the gesture action, and controlling an operation object in a three-dimensional image according to the control instruction.
In one embodiment, the step of acquiring the position information of the hand and establishing the motion trajectory of the hand in the three-dimensional space coordinate system includes:
acquiring a series of continuous depth information of the hand, and forming the motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
In one embodiment, the operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, where the operating space is the real space in which the hand performs a series of continuous actions.
In one embodiment, the step of recognizing the gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system includes:
extracting hand contour feature information, and performing feature matching in combination with the position information to recognize and classify the gesture action.
In one embodiment, the step of reading the corresponding control instruction according to the gesture action and controlling the operation object in the three-dimensional image according to the control instruction includes:
storing a correspondence between each gesture action and its corresponding control instruction;
after a gesture action is recognized, reading the control instruction corresponding to the gesture action from the correspondence;
controlling, according to the control instruction, the operation object in the three-dimensional image to perform the corresponding action.
In addition, a non-contact, three-dimensional display-based gesture control system for natural human-machine interaction is also provided.
A gesture control system based on three-dimensional display includes:
an information acquisition module, configured to acquire position information of a hand;
a coordinate establishing module, configured to establish a motion trajectory of the hand in a three-dimensional space coordinate system according to the position information;
a gesture recognition module, configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system;
an operation control module, configured to read a corresponding control instruction according to the gesture action, and control an operation object in a three-dimensional image according to the control instruction.
In one embodiment, the information acquisition module is further configured to acquire a series of continuous depth information of the hand; the coordinate establishing module is further configured to form the motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
In one embodiment, the operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, where the operating space is the real space in which the hand performs a series of continuous actions.
In one embodiment, the gesture recognition module is further configured to extract hand contour feature information, and perform feature matching in combination with the position information to recognize and classify the gesture action.
In one embodiment, the operation control module includes a storage module, a reading module, and an execution module; the storage module is configured to store a correspondence between each gesture action and its corresponding control instruction; the reading module is configured to read, after a gesture action is recognized, the control instruction corresponding to the gesture action from the correspondence; and the execution module is configured to control, according to the control instruction, the operation object in the three-dimensional image to perform the corresponding action.
In addition, a non-contact, three-dimensional display-based gesture control device for natural human-machine interaction is also provided.
A gesture control device based on three-dimensional display includes a depth camera, a three-dimensional display, and a processor;
the depth camera is configured to acquire a depth image of a hand and output it to the processor;
the processor acquires position information of the hand according to the depth image, and establishes a motion trajectory of the hand in a three-dimensional space coordinate system according to the position information; the processor is further configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system; the processor is further configured to read a corresponding control instruction according to the gesture action, and control an operation object in a three-dimensional image according to the control instruction;
the processor is further configured to control the three-dimensional display to display the gesture action according to the gesture action, and to display the trajectory along which the control instruction corresponding to the gesture action is executed.
The above three-dimensional display-based gesture control method, system, and device acquire the position information of the hand, establish the motion trajectory of the hand in the three-dimensional space coordinate system, recognize the gesture action of the hand according to the motion trajectory, and finally read the corresponding control instruction according to the gesture action and control the operation object in the three-dimensional image according to the control instruction. That is, the user's gesture action is recognized, and the operation object is then controlled according to the gesture action to perform the corresponding operation. Natural human-machine interaction can therefore be achieved without touching a display screen.
[Brief Description of the Drawings]
FIG. 1 is a flowchart of a gesture control method based on three-dimensional display;
FIG. 2 is a schematic diagram of an object depth calculation model;
FIG. 3(a) is a first schematic diagram of the hand moving in correspondence with the cursor in the three-dimensional display;
FIG. 3(b) is a second schematic diagram of the hand moving in correspondence with the cursor in the three-dimensional display;
FIG. 3(c) is a first schematic diagram of a gripped object following the movement of the hand in the three-dimensional display;
FIG. 3(d) is a second schematic diagram of a gripped object following the movement of the hand in the three-dimensional display;
FIG. 3(e) is a third schematic diagram of a gripped object following the movement of the hand in the three-dimensional display;
FIG. 3(f) is a fourth schematic diagram of a gripped object following the movement of the hand in the three-dimensional display;
FIG. 3(g) is a fifth schematic diagram of a gripped object following the movement of the hand in the three-dimensional display;
FIG. 4 is a block diagram of a gesture control system based on three-dimensional display.
[Detailed Description of the Embodiments]
To facilitate understanding of the present invention, the invention is described more fully below with reference to the accompanying drawings, in which preferred embodiments of the invention are shown. The invention may, however, be embodied in many different forms and is not limited to the embodiments described herein; rather, these embodiments are provided so that the disclosure of the invention will be understood more thoroughly and completely.
It should be noted that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or intervening elements may be present at the same time. The terms "vertical", "horizontal", "left", "right" and similar expressions used herein are for illustrative purposes only.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field to which the invention belongs. The terms used in the specification of the invention herein are for the purpose of describing specific embodiments only and are not intended to limit the invention. The term "and/or" used herein includes any and all combinations of one or more of the associated listed items.
As shown in FIG. 1, a flowchart of a gesture control method based on three-dimensional display is provided.
A depth image of the hand is acquired. A depth image, also called a range image, is an image or image channel whose information relates to the distance of object surfaces in the scene from the viewpoint. The gray value of a pixel in the depth image corresponds to the depth value of the corresponding point in the scene. The information contained in the depth image is the depth information.
A depth image has two properties: color independence, and the property that the direction in which the gray value changes coincides with the Z direction of the field of view captured by the camera. Color independence means that, compared with a color image, the depth image is not disturbed by illumination, shadows, or changes in the environment. The gray value changing in the same direction as the Z direction of the camera's field of view means that the depth image can be used to reconstruct a 3D spatial region within a certain range, and the problems of an object being occluded or parts of the same object overlapping can be solved to some extent. Based on the depth information, the foreground can easily be separated from the background, which reduces the difficulty of image recognition.
Classified by imaging principle, depth images are mainly obtained by the time-of-flight method, structured light, and three-dimensional laser scanning, and are mainly used for human-computer interaction; pattern recognition is performed using the depth images.
In the present invention, the depth image may be acquired by the following methods. The first method is based on the time-of-flight principle: the depth information of the object surface is calculated by measuring the time difference between emitting light and receiving the light reflected back from the object surface. The second method is similar to structured-light coding: a known infrared pattern is projected into the scene, and the distance is measured from the deformation of the pattern recorded by an infrared CMOS camera. The working mode mainly identifies the human body and related actions, and the core of human-body recognition is the skeleton: by tracking the skeleton, the body's motion is mapped onto the computer, where related simulation and operations are performed. Of course, the method of acquiring the depth image in the present invention is not limited to the above methods.
When the depth information of the hand is collected, the features of the gesture action are detected, and the control instruction corresponding to the gesture action is given according to the gesture action. For example, according to the mapping relationship, the 3D display cursor follows the hand, a grip of the hand corresponds to a grab-object instruction, a grip moving forward corresponds to an enlargement instruction, a grip moving backward corresponds to a reduction instruction, and so on. After a gesture action is detected, the control instruction corresponding to that gesture action can be output.
In the present invention, pre-collected gesture actions such as a grip or a grip moving forward are stored, and control instructions corresponding to the grip and the grip moving forward are set. Therefore, when the user performs an action such as a grip or a grip moving forward, the control instruction corresponding to that action is executed accordingly. That is, once the data of the gesture actions has been collected in advance, any user who performs the corresponding gesture action can have the control instruction corresponding to that gesture action executed.
For example, an open palm and a grip may respectively represent the control instructions for zooming in and zooming out; or a grip with the thumb up or down may respectively represent the control instructions for zooming in and zooming out; or moving the hand forward or backward after a grip may respectively represent the control instructions for zooming in and zooming out. Gesture actions like these are collected, stored, and assigned corresponding control instructions, so that a correspondence between gesture actions and control instructions is established. When a gesture action is recognized, the corresponding operation can then be performed according to the corresponding control instruction. In other embodiments, a different control instruction may also be assigned to each gesture action.
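A minimal sketch of such a stored correspondence between gesture actions and control instructions follows; the gesture names and instruction strings are illustrative assumptions, not values prescribed by the patent.

```python
from typing import Optional

# Illustrative correspondence table (gesture action -> control instruction).
GESTURE_TO_INSTRUCTION = {
    "palm_open":          "show_cursor",
    "grip":               "grab_object",
    "grip_move_forward":  "zoom_out",
    "grip_move_backward": "zoom_in",
    "open_rotate_finger": "rotate_object",
}

def instruction_for(gesture: str) -> Optional[str]:
    """Read the control instruction stored for a recognised gesture action."""
    return GESTURE_TO_INSTRUCTION.get(gesture)

print(instruction_for("grip"))  # grab_object
```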
The collection of gesture actions includes many classes, and each class includes many different instances. This data collection is carried out in natural environments, in real rooms or offices, under different lighting and angles, which makes the collected gesture data more practical.
In this embodiment, a gesture control method based on three-dimensional display includes the following steps:
Step S110: acquire position information of the hand, and establish a motion trajectory of the hand in the three-dimensional space coordinate system. In this embodiment, the position information of the hand may be acquired based on the depth image.
Specifically, the step of acquiring the position information of the hand and establishing the motion trajectory of the hand in the three-dimensional space coordinate system includes:
acquiring a series of continuous depth information of the hand, and forming the motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information. This embodiment may use a depth camera, based on depth image acquisition technology, to collect image data of a series of continuous hand movements, and then extract the series of continuous depth information of the hand from the image data.
The operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, where the operating space is the real space of the hand's series of continuous movements; from the image data collected by the depth camera in the operating space, a series of continuous depth information and two-dimensional coordinate information of the hand can be obtained. The above three-dimensional space coordinate system refers to the space coordinate system corresponding to the stereoscopic image data used for displaying the three-dimensional image.
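The linear correspondence between the hand's operating space and the three-dimensional space coordinate system can be pictured as a simple affine mapping; the bounds used below are assumed values for illustration only.

```python
import numpy as np

# Assumed bounds of the real operating space in front of the depth camera (metres).
OP_MIN = np.array([-0.30, -0.20, 0.40])
OP_MAX = np.array([ 0.30,  0.20, 1.00])
# Assumed bounds of the display's three-dimensional space coordinate system.
SCENE_MIN = np.array([0.0, 0.0, 0.0])
SCENE_MAX = np.array([1.0, 1.0, 1.0])

def map_hand_to_scene(hand_xyz):
    """Linearly map a hand position from the operating space to scene coordinates."""
    p = (np.asarray(hand_xyz, dtype=float) - OP_MIN) / (OP_MAX - OP_MIN)
    p = np.clip(p, 0.0, 1.0)  # keep the cursor inside the scene volume
    return SCENE_MIN + p * (SCENE_MAX - SCENE_MIN)

print(map_hand_to_scene([0.0, 0.0, 0.7]))  # centre of the scene: [0.5 0.5 0.5]
```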
After the depth information of the hand is acquired, the corresponding coordinate point can be found in the three-dimensional space coordinate system. The skeleton information of the hand is tracked and the hand's motion is mapped onto the computer; at the same time, the depth information of the hand is collected, and the depth information and the skeleton information are combined to obtain the corresponding coordinate point of the hand in the three-dimensional space coordinate system, which is used to locate the control position in the three-dimensional image. As the hand moves, by tracking the skeleton information of the hand, the motion trajectory of the hand in the three-dimensional space coordinate system can be tracked in turn; that is, tracking of the hand's motion trajectory is completed, and the actual motion trajectory is converted into the three-dimensional space coordinate system.
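A small sketch of how the coordinate points obtained above could be accumulated into a motion trajectory; the moving-average smoothing window is an assumption added for illustration, not a step taken from the patent.

```python
from collections import deque
import numpy as np

class HandTrajectory:
    """Accumulates successive hand coordinate points into a motion trajectory."""

    def __init__(self, window: int = 5):
        self._window = deque(maxlen=window)  # assumed 5-frame smoothing window
        self.points = []  # trajectory in the three-dimensional space coordinate system

    def add(self, xyz) -> np.ndarray:
        self._window.append(np.asarray(xyz, dtype=float))
        smoothed = np.mean(self._window, axis=0)  # suppress jitter in the depth data
        self.points.append(smoothed)
        return smoothed

traj = HandTrajectory()
for point in ([0.50, 0.50, 0.70], [0.50, 0.52, 0.68], [0.51, 0.53, 0.66]):
    traj.add(point)
print(len(traj.points))  # 3
```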
In one embodiment, the depth information may be acquired using a parallel stereo vision model. Suppose the extrinsic parameters of camera C1 are represented by rotation matrix R1 and translation vector t1, and the extrinsic parameters of camera C2 are represented by rotation matrix R2 and translation vector t2. If R1 = R2, i.e. the left and right cameras are placed in parallel and their relative position differs only by a translation, such a stereo vision system is a parallel stereo vision system.
Taking the camera coordinate system of camera C1 as the world coordinate system, we have:
t1 = (0, 0, 0)^T, R1 = R2 = I.
In a parallel stereo vision system, the optical axes of the two cameras are parallel to each other, the x-axes of the left and right camera coordinate systems coincide, and the epipolar lines are parallel to each other; the only difference between the two camera coordinate systems is a translation B along the x-axis (i.e., the "baseline").
An object depth calculation model as shown in FIG. 2 is established to calculate the object depth.
As shown in FIG. 2, the positions of the optical centers (i.e., lens centers) of the left and right cameras are C1 and Cr respectively, B is the translation vector between the two camera optical centers, and f is the focal length of the cameras. Let P be a point in space, and let p1 and pr be the projection points of P on the left and right image planes respectively. Z is the depth information sought, i.e., the distance from the spatial point P to the line C1Cr connecting the camera optical centers. L and R are the feet of the perpendiculars dropped from the camera optical centers onto the image planes, and H is the foot of the perpendicular dropped from the spatial point P onto the image plane.
The following linear relation then holds:
(|B| - (|Lp1| - |Rpr|)) / (Z - f) = |B| / Z
Combining and simplifying the above expression and solving, we obtain
Z = f * |B| / (|Lp1| - |Rpr|)
The above formula is the formula for solving the depth information, where |Lp1| - |Rpr| is the disparity of the corresponding matched points obtained in stereo matching, i.e., the difference x1 - x2 between the image positions of the spatial point P on the image planes, and the camera optical-center distance |B| and the focal length f are obtained by camera calibration.
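The depth formula above can be applied per pixel once stereo matching has produced disparities; the sketch below assumes a rectified (parallel) pair with calibration values already known.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Z = f * |B| / (|Lp1| - |Rpr|) for a parallel stereo pair.

    disparity_px    : disparity |Lp1| - |Rpr| in pixels, from stereo matching
    focal_length_px : focal length f in pixels, from camera calibration
    baseline_m      : optical-centre distance |B| in metres, from calibration
    """
    d = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(d.shape, np.inf)
    valid = d > 0  # zero or negative disparity carries no usable depth
    depth[valid] = focal_length_px * baseline_m / d[valid]
    return depth

# With f = 800 px and |B| = 6 cm, disparities of 4 px and 8 px give 12 m and 6 m.
print(depth_from_disparity(np.array([4.0, 8.0]), 800.0, 0.06))
```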
Step S120: recognize the gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system.
Specifically, the step of recognizing the gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system includes:
extracting hand contour feature information, and performing feature matching in combination with the position information to recognize and classify the gesture action.
In this embodiment, gesture point cloud data is obtained from the depth information of the gesture through three-dimensional point cloud computation; the computed gesture point cloud data contains only the three-dimensional coordinate position information of the hand joint points and the palm center point. The gesture point cloud data is then filtered to remove noise interference points, yielding the gesture point cloud information. The three-dimensional gesture point cloud information is plane-registered by rotation and translation, the registered gesture point cloud information is saved, and the contour key-point information of the gesture point cloud information is then extracted; the contour feature points include the fingertip points, the fingertip valley points, and the palm center point.
Since the contour feature point information, combined with the pixel depth values of the depth image, maps out the depth values of the contour feature points, a distance threshold is applied by the Euclidean distance method to screen out the key fingertip information; from the fingertip point information and the corresponding fingertip valley information, combined with the registration plane, five finger feature vectors are obtained, and the gesture action is recovered from these feature vectors.
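A hedged sketch of the Euclidean-distance screening step above: fingertip candidates too close to the palm centre are discarded, and the survivors yield finger feature vectors. The 5 cm threshold and the input points are assumptions for illustration.

```python
import numpy as np

def fingertip_vectors(fingertips, palm_center, min_dist=0.05):
    """Keep fingertip candidates whose Euclidean distance from the palm centre
    exceeds a threshold, and return one feature vector per kept fingertip
    (fingertip minus palm centre)."""
    palm = np.asarray(palm_center, dtype=float)
    tips = np.asarray(fingertips, dtype=float)
    dist = np.linalg.norm(tips - palm, axis=1)
    keep = dist > min_dist  # assumed 5 cm threshold screens out noise points
    return tips[keep] - palm

tips = [[0.02, 0.00, 0.00],   # too close to the palm centre: rejected
        [0.09, 0.01, 0.00],
        [0.08, -0.02, 0.01]]
print(fingertip_vectors(tips, [0.0, 0.0, 0.0]).shape)  # (2, 3)
```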
Step S130: read the corresponding control instruction according to the gesture action, and control the operation object in the three-dimensional image according to the control instruction.
Specifically, the step of reading the corresponding control instruction according to the gesture action and controlling the operation object in the three-dimensional image according to the control instruction includes:
storing the correspondence between each gesture action and its corresponding control instruction;
after a gesture action is recognized, reading the control instruction corresponding to the gesture action from the correspondence;
controlling, according to the control instruction, the operation object in the three-dimensional image to perform the corresponding action.
After the gesture action is recognized, the control instruction corresponding to the gesture action is found according to the pre-stored correspondence between gesture actions and control instructions, and the operation object in the three-dimensional image is controlled according to that control instruction.
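Step S130 can be pictured as a lookup followed by a dispatch onto the operation object; the OperationObject class, the handler names, and the zoom factors below are illustrative assumptions rather than the patent's implementation.

```python
class OperationObject:
    """A stand-in for an object in the three-dimensional image."""

    def __init__(self):
        self.scale = 1.0
        self.selected = False

    def zoom(self, factor: float) -> None:
        self.scale *= factor

    def select(self) -> None:
        self.selected = True

# Correspondence from recognised gesture action to the action applied to the object.
HANDLERS = {
    "grip":               lambda obj: obj.select(),
    "grip_move_forward":  lambda obj: obj.zoom(0.9),  # reduce the operation object
    "grip_move_backward": lambda obj: obj.zoom(1.1),  # enlarge the operation object
}

def execute(gesture: str, obj: OperationObject) -> None:
    """Read the control instruction for the gesture and apply it to the object."""
    handler = HANDLERS.get(gesture)
    if handler is not None:
        handler(obj)

target = OperationObject()
execute("grip", target)
execute("grip_move_backward", target)
print(target.selected, round(target.scale, 2))  # True 1.1
```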
The three-dimensional image in this embodiment may be a spatial stereoscopic image obtained using true three-dimensional stereoscopic image display technology. True three-dimensional stereoscopic image display technology refers to a technology, based on holographic display technology or volumetric three-dimensional display technology, that displays stereoscopic image data within a certain physical spatial range to form a stereoscopic image in real space. The stereoscopic image data is image data having a three-dimensional space coordinate system, and the information of each volume pixel includes at least the position information and the image information of the pixel at that point.
The holographic display technology herein mainly includes the traditional hologram (transmission holographic display images, reflection holographic display images, image-plane holographic display images, rainbow holographic display images, synthetic holographic display images, etc.) and the computer-generated hologram (CGH, Computer Generated Hologram). A computer hologram floats in the air and has a wide color gamut. In a computer hologram, the object used to generate the hologram is described by a mathematical model generated in the computer, and the physical interference of the light waves is replaced by computation steps; in each step, the intensity pattern in the CGH model can be determined and output to a reconfigurable device, which re-modulates the light wave information and reconstructs the output. In plain terms, CGH obtains the interference pattern of a computer graphic (a virtual object) through computation, replacing the interference process in which a traditional hologram records the light waves of a physical object; the diffraction process of hologram reconstruction is unchanged in principle, except that a device capable of reconfiguring the light wave information is added, so that holographic display of different static and dynamic computer graphics can be realized.
Based on holographic display technology, in some embodiments of the present invention the spatial stereoscopic display device includes a 360-degree holographic phantom imaging system comprising a light source, a controller, and a beam splitter. The light source may be a spotlight; the controller includes one or more processors, which receive the stereoscopic image data through a communication interface, obtain the interference pattern of the computer graphic (virtual object) after processing, and output the interference pattern to the beam splitter, where it is presented by the light projected from the light source onto the beam splitter, forming a spatial stereoscopic image. The beam splitter here may be a special lens, a four-sided pyramid, or the like.
In addition to the above 360-degree holographic phantom imaging system, the spatial stereoscopic display device may also be based on holographic projection equipment, for example by forming a stereoscopic image on air, special lenses, a fog screen, and the like. Therefore, the spatial stereoscopic display device 8 may also be one of devices such as an air holographic projection device, a laser-beam holographic projection device, a holographic projection device having a 360-degree holographic display screen (whose principle is to project an image onto a mirror rotating at high speed, thereby realizing a holographic image), and a fog-screen stereoscopic imaging system.
Volumetric three-dimensional display technology, by contrast, uses the human visual mechanism itself to create a displayed object composed of voxel particles instead of molecular particles; besides seeing the shape carried by the light waves, one can also touch the real presence of the voxels. It excites, by appropriate means, the material located within a transparent display volume, and forms voxels through the absorption or scattering of visible radiation; when material at many positions within the volume has been excited, a three-dimensional image composed of many dispersed voxels is formed in three-dimensional space.
The present invention may also adopt the following methods:
(1) Rotating-body scanning technology, which is mainly used for the display of dynamic objects. In this technology, a series of two-dimensional images is projected onto a rotating or moving screen while the screen moves at a speed imperceptible to the viewer; because of the persistence of human vision, a three-dimensional object is formed in the human eye. A display system using this stereoscopic display technology can therefore realize true three-dimensional display of images (viewable through 360°). Light beams of different colors in the system are projected onto the display medium by a light deflector, so that the medium exhibits rich colors. At the same time, this display medium allows the beams to produce discrete visible light spots; these spots are voxels, each corresponding to a point in the three-dimensional image. Groups of voxels are used to build the image, and the observer can view this true three-dimensional image from any viewpoint. The imaging space of a display device based on rotating-body scanning technology can be generated by the rotation or translation of the screen, and voxels are activated on the emitting surface as the screen sweeps through the imaging space. The system includes subsystems such as a laser system, a computer control system, and a rotating display system.
(2) Static volumetric imaging technology, which forms a three-dimensional stereoscopic image based on frequency up-conversion technology. So-called frequency up-conversion three-dimensional display uses the fact that the imaging-space medium, after absorbing multiple photons, spontaneously radiates a kind of fluorescence, thereby producing visible pixel points. The basic principle is to use two mutually perpendicular infrared laser beams that intersect inside the up-conversion material: after two resonant absorptions by the up-conversion material, electrons of the luminescent centers are excited to a high excitation level, and the subsequent downward level transition can produce the emission of visible light, so that such a point in the up-conversion material becomes a bright luminous spot. If the intersection point of the two laser beams is scanned through three-dimensional space inside the up-conversion material along a certain trajectory, the region swept by the intersection point becomes a bright band emitting visible fluorescence; that is, a three-dimensional figure identical to the trajectory of the laser intersection point can be displayed. With this display method, the naked eye can see a three-dimensional image viewable from all directions through 360 degrees.
Of course, the three-dimensional image in the present invention may also be a 3D image obtained by display on a display screen based on 3D display technology. The display screen mentioned here, based on 3D display technology, uses the parallax between the left and right eyes so that the human eye reconstructs the image displayed on the screen to obtain a virtual 3D stereoscopic image. Display screens fall into two broad categories: glasses-type display devices and naked-eye display devices. A glasses-type display device is realized with a flat display screen used together with 3D glasses. A naked-eye display device, i.e. a naked-eye 3D display, consists of four parts: a 3D stereoscopic display terminal, playback software, production software, and application technology; it is a cross-disciplinary stereoscopic display system integrating modern high technologies such as optics, photography, electronic computing, automatic control, software, and 3D animation production.
Based on the different three-dimensional imaging modes described above, the stereoscopic image data having a three-dimensional space coordinate system can be converted as needed into image data to be input to different display devices. These different display devices use different hardware according to the imaging mode of the three-dimensional image; for details, reference may be made to the related content of the prior art.
In one embodiment, when the hand performs an open (palm-open) action, the skeleton is tracked, the hand is recognized as performing the open action, and the control instruction corresponding to the open action is looked up; assuming that the control instruction corresponding to the open action is the starting action, only the cursor corresponding to the hand is displayed at this time. When the hand is moved while in this open state, the instruction fed back to the computer through skeleton tracking is simply to track the motion trajectory of the hand, i.e. the displayed cursor follows the motion trajectory of the hand. Since the gesture operating space corresponds to the three-dimensional space coordinate system, the movement of the hand within the operating space has a corresponding movement in the three-dimensional space coordinate system.
As shown in FIG. 3(a) and FIG. 3(b), when it is determined that a certain operation object needs to be operated, the hand is moved so that the cursor corresponding to the hand lies within the control area of the operation object. Take grip-and-move-forward and grip-and-move-backward representing the reduction and enlargement instructions as an example: when the grip is recognized, the starting position of the hand is acquired, generally taking the palm-center position of the hand as the starting position. The motion trajectory of the hand is then tracked; when the movement is recognized as forward, the corresponding control instruction is to reduce the operation object, and when the movement is recognized as backward, the corresponding control instruction is to enlarge the operation object.
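The grip-and-move example above amounts to comparing the current palm-centre depth with the depth recorded when the grip was first recognised; the 2 cm dead zone and the direction convention below are assumptions for illustration.

```python
from typing import Optional

def zoom_instruction(start_z: float, current_z: float,
                     dead_zone: float = 0.02) -> Optional[str]:
    """Decide a zoom instruction once a grip has been recognised.

    start_z / current_z: palm-centre depth (metres) at the start of the grip
    and at the current frame. Moving forward (depth decreasing under the
    assumed camera placement) reduces the object, moving backward enlarges it;
    a small dead zone keeps jitter from triggering commands.
    """
    delta = current_z - start_z
    if delta < -dead_zone:
        return "zoom_out"  # hand moved forward: reduce the operation object
    if delta > dead_zone:
        return "zoom_in"   # hand moved backward: enlarge the operation object
    return None

print(zoom_instruction(0.60, 0.55))  # zoom_out
```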
In other embodiments, when the gripping action is the selection instruction, if the cursor corresponding to the hand is within the control area of an operation object and the gripping action is recognized, the object at which the hand's cursor is located at that moment is taken as the operation object; that is, the current object is selected, and operations such as moving, copying, and pasting can then be performed on it.
Specifically, as shown in FIG. 3(c), when the hand is in the grip state and moves forward, the cursor corresponding to the hand in the three-dimensional display is gradually enlarged; as shown in FIG. 3(d), when the hand is in the grip state and moves backward, the cursor corresponding to the hand in the three-dimensional display is gradually reduced. The three-dimensional display may be located below the depth camera or to its side; as shown in FIG. 3(e) and FIG. 3(f), the placement position of the three-dimensional display does not affect the display of the three-dimensional operating space.
In other embodiments, when the open-and-rotate-finger action is the rotation instruction, if the cursor corresponding to the hand is within the control area of an operation object and the open-and-rotate-finger action is recognized, the object at which the hand's cursor is located at that moment is taken as the operation object; that is, the current object is rotated.
Based on the embodiments described above, a gesture control device based on three-dimensional display includes a depth camera, a three-dimensional display, and a processor.
The depth camera is configured to acquire a depth image of the hand and output it to the processor.
The processor acquires the position information of the hand according to the depth image, and establishes the motion trajectory of the hand in the three-dimensional space coordinate system according to the position information; the processor is further configured to recognize the gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system; the processor is further configured to read the corresponding control instruction according to the gesture action, and control the operation object in the three-dimensional image according to the control instruction.
The processor is further configured to control the three-dimensional display to display the gesture action according to the gesture action, and to display the trajectory along which the control instruction corresponding to the gesture action is executed.
In this embodiment, because the gesture action is collected in the operating space (the real space in which the hand performs a series of continuous actions), the three-dimensional image, using holographic display technology, can achieve a naked-eye three-dimensional effect; that is, the real space and the virtual display correspond and can be displayed in real time. Therefore, when the user operates a displayed object of the three-dimensional display, operations such as gripping and grip-moving can be performed on the displayed object accurately.
Referring to FIG. 3(g): suppose, for example, that the user needs to rotate and move a stereoscopic image to be operated (such as a racket) displayed by a three-dimensional display (a three-dimensional display using holographic display technology). Since the operating space and the virtual display correspond and are displayed in real time, the user only needs to find, in the operating space, the position corresponding to the stereoscopic image to be operated and make a gripping action. At this time, the depth camera detects the user's gesture action and transmits it to the processor, and the processor controls the three-dimensional display to show the stereoscopic image to be operated (e.g. the racket) in a state of being gripped by the user's hand. When the user then performs a grip-and-move (or swings the arm) in the operating space, the processor controls the three-dimensional display to show the trajectory along which the stereoscopic image to be operated is moved (or swung).
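A minimal event-loop sketch of the device just described (depth camera, processor, three-dimensional display). All component interfaces here are assumptions used to show the data flow, not a real camera or display API.

```python
def run(depth_camera, processor, display, frames: int = 100) -> None:
    """One pass of the assumed pipeline per captured frame."""
    for _ in range(frames):
        depth_image = depth_camera.capture()             # depth image of the hand
        hand_xyz = processor.locate_hand(depth_image)    # position information
        gesture = processor.update_trajectory(hand_xyz)  # trajectory -> gesture action
        if gesture is None:
            continue
        instruction = processor.lookup_instruction(gesture)
        display.apply(instruction)                       # act on the operation object
        display.show_gesture_feedback(gesture)           # echo the gesture and its trajectory
```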
As shown in FIG. 4, a block diagram of a gesture control system based on three-dimensional display is provided. The system includes:
an information acquisition module, configured to acquire position information of the hand;
a coordinate establishing module, configured to establish a motion trajectory of the hand in the three-dimensional space coordinate system according to the position information;
a gesture recognition module, configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system;
an operation control module, configured to read a corresponding control instruction according to the gesture action, and control an operation object in the three-dimensional image according to the control instruction.
The information acquisition module is further configured to acquire a series of continuous depth information of the hand, and the coordinate establishing module is further configured to form the motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
The operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, where the operating space is the real space in which the hand performs a series of continuous actions.
The gesture recognition module is further configured to extract hand contour feature information, and perform feature matching in combination with the position information to recognize and classify the gesture action.
The operation control module includes a storage module, a reading module, and an execution module. The storage module is configured to store the correspondence between each gesture action and its corresponding control instruction; the reading module is configured to read, after a gesture action is recognized, the control instruction corresponding to the gesture action from the correspondence; and the execution module is configured to control, according to the control instruction, the operation object in the three-dimensional image to perform the corresponding action.
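A structural sketch of the four modules listed above, wired together in the order information acquisition → coordinate establishment → gesture recognition → operation control; the class and method names are illustrative assumptions, not the patent's reference implementation.

```python
class InformationAcquisitionModule:
    def acquire(self, depth_frame: dict):
        """Return the hand position extracted from one depth frame (assumed format)."""
        return depth_frame["hand_xyz"]

class CoordinateEstablishingModule:
    def __init__(self):
        self.trajectory = []

    def update(self, hand_xyz):
        self.trajectory.append(hand_xyz)
        return self.trajectory

class GestureRecognitionModule:
    def recognise(self, trajectory):
        """Placeholder classifier; a real one would match contour features."""
        return "grip" if len(trajectory) >= 3 else None

class OperationControlModule:
    CORRESPONDENCE = {"grip": "grab_object"}

    def execute(self, gesture):
        return self.CORRESPONDENCE.get(gesture)

acquire = InformationAcquisitionModule()
coords = CoordinateEstablishingModule()
recognise = GestureRecognitionModule()
control = OperationControlModule()

for frame in ({"hand_xyz": (0.5, 0.5, 0.7)},
              {"hand_xyz": (0.5, 0.5, 0.6)},
              {"hand_xyz": (0.5, 0.5, 0.5)}):
    trajectory = coords.update(acquire.acquire(frame))
    gesture = recognise.recognise(trajectory)
    if gesture:
        print(control.execute(gesture))  # grab_object (printed on the third frame)
```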
The above three-dimensional display-based gesture control method, system, and device acquire the position information of the hand, establish the motion trajectory of the hand in the three-dimensional space coordinate system, recognize the gesture action of the hand according to the motion trajectory, and finally read the corresponding control instruction according to the gesture action and control the operation object in the three-dimensional image according to the control instruction. That is, the user's gesture action is recognized, and the operation object is then controlled according to the gesture action to perform the corresponding operation. Natural human-machine interaction can therefore be achieved without touching a display screen.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features of the above embodiments are described; however, as long as there is no contradiction in the combination of these technical features, they should all be considered to fall within the scope of this specification.
The above embodiments merely express several implementations of the present invention, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be pointed out that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of the patent of the present invention shall be subject to the appended claims.

Claims (11)

  1. A gesture control method based on three-dimensional display, comprising the following steps:
    acquiring position information of a hand, and establishing a motion trajectory of the hand in a three-dimensional space coordinate system;
    recognizing a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system;
    reading a corresponding control instruction according to the gesture action, and controlling an operation object in a three-dimensional image according to the control instruction.
  2. The gesture control method based on three-dimensional display according to claim 1, wherein the step of acquiring the position information of the hand and establishing the motion trajectory of the hand in the three-dimensional space coordinate system comprises:
    acquiring a series of continuous depth information of the hand, and forming the motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
  3. The gesture control method based on three-dimensional display according to claim 1, wherein the operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, and the operating space is the real space in which the hand performs a series of continuous actions.
  4. The gesture control method based on three-dimensional display according to claim 1, wherein the step of recognizing the gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system comprises:
    extracting hand contour feature information, and performing feature matching in combination with the position information to recognize and classify the gesture action.
  5. The gesture control method based on three-dimensional display according to claim 1, wherein the step of reading the corresponding control instruction according to the gesture action and controlling the operation object in the three-dimensional image according to the control instruction comprises:
    storing a correspondence between each gesture action and its corresponding control instruction;
    after a gesture action is recognized, reading the control instruction corresponding to the gesture action from the correspondence;
    controlling, according to the control instruction, the operation object in the three-dimensional image to perform the corresponding action.
  6. A gesture control system based on three-dimensional display, comprising:
    an information acquisition module, configured to acquire position information of a hand;
    a coordinate establishing module, configured to establish a motion trajectory of the hand in a three-dimensional space coordinate system according to the position information;
    a gesture recognition module, configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system;
    an operation control module, configured to read a corresponding control instruction according to the gesture action, and control an operation object in a three-dimensional image according to the control instruction.
  7. The gesture control system based on three-dimensional display according to claim 6, wherein the information acquisition module is further configured to acquire a series of continuous depth information of the hand, and the coordinate establishing module is further configured to form the motion trajectory of the hand in the three-dimensional space coordinate system according to the depth information.
  8. The gesture control system based on three-dimensional display according to claim 6, wherein the operating space of the hand is in linear correspondence with the three-dimensional space coordinate system, and the operating space is the real space in which the hand performs a series of continuous actions.
  9. The gesture control system based on three-dimensional display according to claim 6, wherein the gesture recognition module is further configured to extract hand contour feature information, and perform feature matching in combination with the position information to recognize and classify the gesture action.
  10. The gesture control system based on three-dimensional display according to claim 6, wherein the operation control module comprises a storage module, a reading module, and an execution module; the storage module is configured to store a correspondence between each gesture action and its corresponding control instruction; the reading module is configured to read, after a gesture action is recognized, the control instruction corresponding to the gesture action from the correspondence; and the execution module is configured to control, according to the control instruction, the operation object in the three-dimensional image to perform the corresponding action.
  11. A gesture control device based on three-dimensional display, comprising a depth camera, a three-dimensional display, and a processor;
    the depth camera is configured to acquire a depth image of a hand and output it to the processor;
    the processor acquires position information of the hand according to the depth image, and establishes a motion trajectory of the hand in a three-dimensional space coordinate system according to the position information; the processor is further configured to recognize a gesture action of the hand according to the motion trajectory of the hand in the three-dimensional space coordinate system; the processor is further configured to read a corresponding control instruction according to the gesture action, and control an operation object in a three-dimensional image according to the control instruction;
    the processor is further configured to control the three-dimensional display to display the gesture action according to the gesture action, and to display the trajectory along which the control instruction corresponding to the gesture action is executed.
PCT/CN2016/076748 2015-11-02 2016-03-18 Gesture control method and system based on three-dimensional display WO2017075932A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510735569.7A CN105353873B (zh) 2015-11-02 2015-11-02 Gesture control method and system based on three-dimensional display
CN201510735569.7 2015-11-02

Publications (1)

Publication Number Publication Date
WO2017075932A1 true WO2017075932A1 (zh) 2017-05-11

Family

ID=55329857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/076748 WO2017075932A1 (zh) 2015-11-02 2016-03-18 Gesture control method and system based on three-dimensional display

Country Status (2)

Country Link
CN (1) CN105353873B (zh)
WO (1) WO2017075932A1 (zh)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678425A (zh) * 2017-08-29 2018-02-09 南京理工大学 一种基于Kinect手势识别的小车控制装置
CN108776994A (zh) * 2018-05-24 2018-11-09 长春理工大学 基于真三维显示系统的Roesser模型及其实现方法
CN109240494A (zh) * 2018-08-23 2019-01-18 京东方科技集团股份有限公司 电子显示板的控制方法、计算机可读存储介质和控制系统
CN110659543A (zh) * 2018-06-29 2020-01-07 比亚迪股份有限公司 基于手势识别的车辆控制方法、系统及车辆
CN110794959A (zh) * 2019-09-25 2020-02-14 苏州联游信息技术有限公司 一种基于图像识别的手势交互ar投影方法及装置
CN111142664A (zh) * 2019-12-27 2020-05-12 恒信东方文化股份有限公司 一种多人实时手部追踪系统及追踪方法
CN111242084A (zh) * 2020-01-21 2020-06-05 深圳市优必选科技股份有限公司 机器人控制方法、装置、机器人及计算机可读存储介质
CN111949134A (zh) * 2020-08-28 2020-11-17 深圳Tcl数字技术有限公司 人机交互方法、设备及计算机可读存储介质
CN112329540A (zh) * 2020-10-10 2021-02-05 广西电网有限责任公司电力科学研究院 一种面向架空输电线路作业到位监督的识别方法及系统
CN113065383A (zh) * 2020-01-02 2021-07-02 中车株洲电力机车研究所有限公司 一种基于三维手势识别的车载交互方法、装置及系统
CN115840507A (zh) * 2022-12-20 2023-03-24 北京帮威客科技有限公司 一种基于3d图像控制的大屏设备交互方法
CN117278735A (zh) * 2023-09-15 2023-12-22 山东锦霖智能科技集团有限公司 一种沉浸式图像投影设备

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105353873B (zh) * 2015-11-02 2019-03-15 深圳奥比中光科技有限公司 基于三维显示的手势操控方法和系统
CN105589293A (zh) * 2016-03-18 2016-05-18 严俊涛 全息投影方法及全息投影系统
CN105955461A (zh) * 2016-04-25 2016-09-21 乐视控股(北京)有限公司 一种交互界面管理方法和系统
US10377042B2 (en) * 2016-06-17 2019-08-13 Intel Corporation Vision-based robot control system
CN108073267B (zh) * 2016-11-10 2020-06-16 腾讯科技(深圳)有限公司 基于运动轨迹的三维控制方法及装置
CN106774849B (zh) * 2016-11-24 2020-03-17 北京小米移动软件有限公司 虚拟现实设备控制方法及装置
CN106933347A (zh) * 2017-01-20 2017-07-07 深圳奥比中光科技有限公司 三维操控空间的建立方法及设备
CN106919928A (zh) * 2017-03-08 2017-07-04 京东方科技集团股份有限公司 手势识别系统、方法及显示设备
WO2018196552A1 (zh) * 2017-04-25 2018-11-01 腾讯科技(深圳)有限公司 用于虚拟现实场景中的手型显示方法及装置
CN107368194A (zh) * 2017-07-21 2017-11-21 上海爱优威软件开发有限公司 终端设备的手势操控方法
CN107463261B (zh) * 2017-08-11 2021-01-15 北京铂石空间科技有限公司 立体交互系统及方法
CN110989835B (zh) * 2017-09-11 2023-04-28 大连海事大学 一种基于手势识别的全息投影装置的工作方法
CN107976183A (zh) * 2017-12-18 2018-05-01 北京师范大学珠海分校 一种空间数据测量方法及装置
CN108052237B (zh) * 2018-01-05 2022-01-14 上海昶音通讯科技有限公司 一种3d投影触摸装置及其触摸方法
CN108363482A (zh) * 2018-01-11 2018-08-03 江苏四点灵机器人有限公司 一种基于双目结构光的三维手势控制智能电视的方法
WO2019169644A1 (zh) * 2018-03-09 2019-09-12 彼乐智慧科技(北京)有限公司 一种信号输入的方法及装置
CN108681402A (zh) * 2018-05-16 2018-10-19 Oppo广东移动通信有限公司 识别交互方法、装置、存储介质及终端设备
KR102155378B1 (ko) * 2018-09-19 2020-09-14 주식회사 브이터치 객체 제어를 지원하기 위한 방법, 시스템 및 비일시성의 컴퓨터 판독 가능 기록 매체
CN112714900A (zh) * 2018-10-29 2021-04-27 深圳市欢太科技有限公司 显示屏操作方法、电子设备、可读存储介质
CN109732606A (zh) * 2019-02-13 2019-05-10 深圳大学 机械臂的远程控制方法、装置、系统及存储介质
CN110058688A (zh) * 2019-05-31 2019-07-26 安庆师范大学 一种投影用动态手势翻页的系统及其方法
CN110456957B (zh) * 2019-08-09 2022-05-03 北京字节跳动网络技术有限公司 显示交互方法、装置、设备、存储介质
CN110889390A (zh) * 2019-12-05 2020-03-17 北京明略软件系统有限公司 姿势识别方法、装置、控制设备和机器可读存储介质
CN112241204B (zh) * 2020-12-17 2021-08-27 宁波均联智行科技股份有限公司 一种车载ar-hud的手势交互方法和系统
CN114701409B (zh) * 2022-04-28 2023-09-05 东风汽车集团股份有限公司 一种手势交互式智能座椅调节方法和系统

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102226880A (zh) * 2011-06-03 2011-10-26 北京新岸线网络技术有限公司 一种基于虚拟现实的体感操作方法及系统
US20110291926A1 (en) * 2002-02-15 2011-12-01 Canesta, Inc. Gesture recognition system using depth perceptive sensors
CN102270035A (zh) * 2010-06-04 2011-12-07 三星电子株式会社 以非触摸方式来选择和操作对象的设备和方法
CN102411426A (zh) * 2011-10-24 2012-04-11 由田信息技术(上海)有限公司 电子装置的操作方法
CN102426480A (zh) * 2011-11-03 2012-04-25 康佳集团股份有限公司 一种人机交互系统及其实时手势跟踪处理方法
CN104182035A (zh) * 2013-05-28 2014-12-03 中国电信股份有限公司 一种操控电视应用程序的方法和系统
CN104541232A (zh) * 2012-09-28 2015-04-22 英特尔公司 多模态触摸屏仿真器
CN104571510A (zh) * 2014-12-30 2015-04-29 青岛歌尔声学科技有限公司 一种3d场景中输入手势的系统和方法
CN105353873A (zh) * 2015-11-02 2016-02-24 深圳奥比中光科技有限公司 基于三维显示的手势操控方法和系统

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102236414A (zh) * 2011-05-24 2011-11-09 北京新岸线网络技术有限公司 三维显示空间中的图片操作方法和系统
CN102650906B (zh) * 2012-04-06 2015-11-04 深圳创维数字技术有限公司 一种用户界面的控制方法及装置
KR20140052640A (ko) * 2012-10-25 2014-05-07 삼성전자주식회사 커서를 디스플레이에 디스플레이하기 위한 방법과 상기 방법을 수행할 수 있는 시스템
CN103176605A (zh) * 2013-03-27 2013-06-26 刘仁俊 一种手势识别控制装置及控制方法
CN103488292B (zh) * 2013-09-10 2016-10-26 青岛海信电器股份有限公司 一种立体应用图标的控制方法及装置

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110291926A1 (en) * 2002-02-15 2011-12-01 Canesta, Inc. Gesture recognition system using depth perceptive sensors
CN102270035A (zh) * 2010-06-04 2011-12-07 三星电子株式会社 以非触摸方式来选择和操作对象的设备和方法
CN102226880A (zh) * 2011-06-03 2011-10-26 北京新岸线网络技术有限公司 一种基于虚拟现实的体感操作方法及系统
CN102411426A (zh) * 2011-10-24 2012-04-11 由田信息技术(上海)有限公司 电子装置的操作方法
CN102426480A (zh) * 2011-11-03 2012-04-25 康佳集团股份有限公司 一种人机交互系统及其实时手势跟踪处理方法
CN104541232A (zh) * 2012-09-28 2015-04-22 英特尔公司 多模态触摸屏仿真器
CN104182035A (zh) * 2013-05-28 2014-12-03 中国电信股份有限公司 一种操控电视应用程序的方法和系统
CN104571510A (zh) * 2014-12-30 2015-04-29 青岛歌尔声学科技有限公司 一种3d场景中输入手势的系统和方法
CN105353873A (zh) * 2015-11-02 2016-02-24 深圳奥比中光科技有限公司 基于三维显示的手势操控方法和系统

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678425A (zh) * 2017-08-29 2018-02-09 南京理工大学 一种基于Kinect手势识别的小车控制装置
CN108776994B (zh) * 2018-05-24 2022-10-25 长春理工大学 基于真三维显示系统的Roesser模型及其实现方法
CN108776994A (zh) * 2018-05-24 2018-11-09 长春理工大学 基于真三维显示系统的Roesser模型及其实现方法
CN110659543A (zh) * 2018-06-29 2020-01-07 比亚迪股份有限公司 基于手势识别的车辆控制方法、系统及车辆
CN110659543B (zh) * 2018-06-29 2023-07-14 比亚迪股份有限公司 基于手势识别的车辆控制方法、系统及车辆
CN109240494A (zh) * 2018-08-23 2019-01-18 京东方科技集团股份有限公司 电子显示板的控制方法、计算机可读存储介质和控制系统
CN109240494B (zh) * 2018-08-23 2023-09-12 京东方科技集团股份有限公司 电子显示板的控制方法、计算机可读存储介质和控制系统
CN110794959A (zh) * 2019-09-25 2020-02-14 苏州联游信息技术有限公司 一种基于图像识别的手势交互ar投影方法及装置
CN111142664B (zh) * 2019-12-27 2023-09-01 恒信东方文化股份有限公司 一种多人实时手部追踪系统及追踪方法
CN111142664A (zh) * 2019-12-27 2020-05-12 恒信东方文化股份有限公司 一种多人实时手部追踪系统及追踪方法
CN113065383A (zh) * 2020-01-02 2021-07-02 中车株洲电力机车研究所有限公司 一种基于三维手势识别的车载交互方法、装置及系统
CN113065383B (zh) * 2020-01-02 2024-03-29 中车株洲电力机车研究所有限公司 一种基于三维手势识别的车载交互方法、装置及系统
CN111242084A (zh) * 2020-01-21 2020-06-05 深圳市优必选科技股份有限公司 机器人控制方法、装置、机器人及计算机可读存储介质
CN111949134A (zh) * 2020-08-28 2020-11-17 深圳Tcl数字技术有限公司 人机交互方法、设备及计算机可读存储介质
CN112329540A (zh) * 2020-10-10 2021-02-05 广西电网有限责任公司电力科学研究院 一种面向架空输电线路作业到位监督的识别方法及系统
CN115840507A (zh) * 2022-12-20 2023-03-24 北京帮威客科技有限公司 一种基于3d图像控制的大屏设备交互方法
CN115840507B (zh) * 2022-12-20 2024-05-24 北京帮威客科技有限公司 一种基于3d图像控制的大屏设备交互方法
CN117278735A (zh) * 2023-09-15 2023-12-22 山东锦霖智能科技集团有限公司 一种沉浸式图像投影设备
CN117278735B (zh) * 2023-09-15 2024-05-17 山东锦霖智能科技集团有限公司 一种沉浸式图像投影设备

Also Published As

Publication number Publication date
CN105353873A (zh) 2016-02-24
CN105353873B (zh) 2019-03-15

Similar Documents

Publication Publication Date Title
WO2017075932A1 (zh) Gesture control method and system based on three-dimensional display
US11954808B2 (en) Rerendering a position of a hand to decrease a size of a hand to create a realistic virtual/augmented reality environment
US10394334B2 (en) Gesture-based control system
CN107004279B (zh) 自然用户界面相机校准
US9939914B2 (en) System and method for combining three-dimensional tracking with a three-dimensional display for a user interface
CN102959616B (zh) 自然交互的交互真实性增强
CN116324680A (zh) 用于操纵环境中的对象的方法
US20130335405A1 (en) Virtual object generation within a virtual environment
CN112771539A (zh) 采用使用神经网络从二维图像预测的三维数据以用于3d建模应用
US10540812B1 (en) Handling real-world light sources in virtual, augmented, and mixed reality (xR) applications
CN110546595B (zh) 导航全息图像
US9202309B2 (en) Methods and apparatus for digital stereo drawing
US20120274745A1 (en) Three-dimensional imager and projection device
JP2016525741A (ja) 共有ホログラフィックオブジェクトおよびプライベートホログラフィックオブジェクト
JP2011022984A (ja) 立体映像インタラクティブシステム
Starck et al. The multiple-camera 3-d production studio
EP2932358A1 (en) Direct interaction system for mixed reality environments
US20150058782A1 (en) System and method for creating and interacting with a surface display
Schütt et al. Semantic interaction in augmented reality environments for microsoft hololens
Corbett-Davies et al. An advanced interaction framework for augmented reality based exposure treatment
JP5597087B2 (ja) 仮想物体操作装置
Planche et al. Physics-based differentiable depth sensor simulation
Zhang et al. Virtual reality aided high-quality 3D reconstruction by remote drones
Xiang et al. Tsfps: An accurate and flexible 6dof tracking system with fiducial platonic solids
Sobota et al. Mixed reality: a known unknown

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16861205

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16861205

Country of ref document: EP

Kind code of ref document: A1