WO2020037594A1 - Gesture recognition method and device based on hyperspectral imaging - Google Patents

Gesture recognition method and device based on hyperspectral imaging

Info

Publication number
WO2020037594A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
hyperspectral
gesture recognition
images
hyperspectral imaging
Prior art date
Application number
PCT/CN2018/101917
Other languages
English (en)
French (fr)
Inventor
王星泽
李梓彤
蒲庆
舒远
阮思纯
徐炜文
Original Assignee
合刃科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 合刃科技(深圳)有限公司
Priority to PCT/CN2018/101917 (WO2020037594A1, zh)
Priority to CN201880001197.7A (CN109196518B, zh)
Publication of WO2020037594A1 (zh)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06V40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Definitions

  • the invention belongs to the technical field of gesture recognition, and particularly relates to a gesture recognition method based on hyperspectral imaging, and also relates to a gesture recognition device based on hyperspectral imaging.
  • gesture recognition is a topic that recognizes human gestures through mathematical algorithms. Users can use simple gestures to control or interact with the device and let the computer understand human behavior. Its core technologies are gesture segmentation algorithms, gesture analysis algorithms, and gesture recognition algorithms.
  • gesture operation is more convenient, interaction types are more diverse, and the interaction process is more intuitive.
  • There are many commercial products controlled by gestures on the market today, such as Microsoft's Kinect, Google's Google Glass, and Leap Motion.
  • Users' demands for diverse interaction keep rising, and the range of applications of gesture recognition will keep widening.
  • the general gesture recognition process is: image acquisition, feature extraction, classification and matching.
  • most common products are based on visible light recognition systems.
  • imaging results from other bands of light are lacking. Once the scene is too dark or the background resembles the hand, the signal-to-noise ratio is poor and the accuracy and effectiveness of recognition drop sharply. Even if an active infrared imaging device is used, overexposure and noise may occur because the acquisition device is too sensitive, which degrades subsequent recognition.
  • In order to solve the problem of insufficient accuracy and effectiveness of visible-light-based recognition systems in the prior art, the present invention discloses a gesture recognition method based on hyperspectral imaging to improve accuracy and effectiveness; a corresponding gesture recognition device based on hyperspectral imaging is also disclosed.
  • the invention discloses a hyperspectral imaging-based gesture recognition method, which includes the following steps:
  • a hyperspectral imaging system collects images of gestures under the joint illumination of visible and infrared light sources to obtain a hyperspectral image;
  • target feature fusion processing is performed on the hyperspectral image to form a single-channel image;
  • the single-channel image is denoised to obtain a denoised image;
  • the denoised image is recognized through image recognition and motion tracking algorithms to obtain a gesture recognition result.
  • the step of performing target feature fusion processing on the hyperspectral image to form a single-channel image includes: processing the images of the different bands in the hyperspectral image according to a target-feature matching mechanism, discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
  • the matching mechanism uses a trained convolutional neural network algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set.
  • the training of the convolutional neural network algorithm includes: matching the target features of each input band image against the target features of each band image preset in the training set, and adding images whose matching degree exceeds a preset matching-degree threshold to the training set.
  • the image recognition and motion tracking algorithm specifically uses a convolutional neural network algorithm.
  • other algorithms suitable for gesture image recognition or motion tracking can also be used.
  • a hyperspectral imaging-based gesture recognition device disclosed by the present invention includes a visible light source, an infrared light source, a hyperspectral imaging system, an image processing unit, and a gesture recognition unit; wherein,
  • the visible light source and the infrared light source irradiate a target gesture together;
  • the hyperspectral imaging system collects images of gestures under the joint illumination of visible light sources and infrared light sources to obtain hyperspectral images
  • the image processing unit performs target feature fusion processing on the hyperspectral image to form a single-channel image, and denoises the single-channel image to obtain a denoised image;
  • the gesture recognition unit recognizes the denoised image through an image recognition and motion tracking algorithm to obtain a gesture recognition result.
  • a database unit is further included, and the database unit stores a training set of a convolutional neural network algorithm.
  • the image processing unit, based on the convolutional neural network algorithm, matches the target features of the images of different bands in the hyperspectral image against the target features of each band image in the training set, and fuses the images of bands whose target features satisfy the matching mechanism into a single-channel image.
  • the hyperspectral imaging system is a single hyperspectral camera.
  • the hyperspectral imaging system is a plurality of hyperspectral cameras, which respectively collect hyperspectral images of different wavebands.
  • the present invention can simultaneously acquire gesture image information in the visible light band and the infrared light band, and effectively fuse target features in a multi-channel hyperspectral image in gesture recognition, so that the target feature region contrast is enhanced, thereby effectively performing image segmentation. Extracting hand contour information for subsequent gesture recognition will help improve the accuracy and effectiveness of gesture recognition.
  • FIG. 1 is a schematic flowchart of a gesture recognition method based on hyperspectral imaging disclosed in Embodiment 1 of the present invention.
  • FIG. 2 is a schematic diagram of steps S100 to S300 in FIG. 1.
  • FIG. 3 is a schematic diagram of step S400 in FIG. 1.
  • FIG. 4 is a schematic diagram of a target feature fusion step in FIG. 2.
  • FIG. 5 is a schematic diagram of a convolutional neural network algorithm training principle in a hyperspectral imaging-based gesture recognition method disclosed in Embodiment 1 of the present invention.
  • FIG. 6 is a schematic flowchart of a gesture recognition method based on hyperspectral imaging disclosed in Embodiment 2 of the present invention.
  • FIG. 7 is a schematic flowchart of a gesture recognition method based on hyperspectral imaging disclosed in Embodiment 3 of the present invention.
  • FIG. 8 is a schematic flowchart of a gesture recognition method based on hyperspectral imaging disclosed in Embodiment 4 of the present invention.
  • FIG. 9 is a structural block diagram of a gesture recognition device based on hyperspectral imaging disclosed in Embodiment 5 of the present invention.
  • a gesture recognition method based on hyperspectral imaging mainly includes the following steps S100 to S400:
  • a hyperspectral imaging system is used to acquire an image of a gesture under the illumination of a visible light source and an infrared light source together to obtain a hyperspectral image.
  • in step S100, the user's hand is in an environment with background influence and is illuminated jointly by a visible light source and an infrared light source; image acquisition is then performed by a hyperspectral imaging system to obtain a hyperspectral image (that is, an image with multiple channels).
  • the hyperspectral image is subjected to target feature fusion processing to form a single-channel image.
  • Step S200 is also shown in FIG. 2 and includes: processing the images of the different bands in the hyperspectral image according to the target-feature matching mechanism (shown in FIG. 4), discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
  • the target features can be selected from the target features
  • the matching mechanism can be a trained convolutional neural network (CNN) algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set; the images of bands whose target features do not satisfy the matching mechanism are then discarded, and the images of bands whose target features satisfy it are fused into a single-channel image.
  • in the schematic diagram of target feature fusion shown in FIG. 4, the visible light band 1 image and the visible light band 2 image, which do not satisfy the matching mechanism, are discarded, and the infrared band image and the ultraviolet band image, which satisfy it, are fused into a single-channel image.
  • the training of the convolutional neural network algorithm includes: matching the target features of each input band image against the target features of each band image preset in the training set, and adding images whose matching degree exceeds a preset matching-degree threshold to the training set.
  • the image recognition and motion tracking algorithm in step S400 can also use a convolutional neural network algorithm; other algorithms suitable for gesture image recognition or motion tracking can also be used.
  • this embodiment has at least the following beneficial effects:
  • gesture image information in the visible and infrared bands can be collected at the same time; in addition to visible-band image information, infrared-band image information can also be used for recognition, which helps improve the accuracy and effectiveness of gesture recognition.
  • the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 2 takes gesture recognition for smart display devices as an example: it collects the image information that the gesture presents in the visible and infrared bands and applies target feature fusion processing to the images, improving the accuracy and effectiveness of recognition.
  • a gesture recognition method based on hyperspectral imaging disclosed in Embodiment 2 mainly includes the following steps S110 to S710:
  • S110 The smart display device is initialized, and the hyperspectral imaging system, the visible light source, and the infrared light source are turned on.
  • S210 Detect whether there is a user gesture, and if not, perform re-detection within a preset time.
  • S310 A user gesture is detected, and a hyperspectral imaging system is used to acquire an image of the gesture under the common illumination of the visible light source and the infrared light source to obtain a hyperspectral image.
  • S410 Process the images of the different bands in the hyperspectral image according to the target-feature matching mechanism, discard the images of bands whose target features do not satisfy the matching mechanism, and fuse the images of bands whose target features satisfy it into a single-channel image.
  • S510 Denoise the single-channel image to obtain a denoised image.
  • S610 Recognize the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
  • S710 Control the smart display device to perform a corresponding preset action according to a gesture recognition result.
  • the preset actions in step S710 include channel switching, volume adjustment, menu setting, or actions of a person or an object on the screen.
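  • in a game scenario, the hyperspectral imaging-based gesture recognition method of Embodiment 3 is applied, and a hyperspectral system (hyperspectral camera) can be used to capture a video stream of the user's hand movements.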
  • a method for gesture recognition based on hyperspectral imaging mainly includes the following steps S120 to S720:
  • S120 The video game screen is initialized, and a hyperspectral imaging system, a visible light source, and an infrared light source are turned on.
  • S220 Capture a video stream of the user's hand movements, analyze whether there is a user gesture, and if not, recapture within a preset time.
  • S320 Capture a user gesture, and acquire a hyperspectral image by collecting an image of the gesture under the common illumination of the visible light source and the infrared light source through a hyperspectral imaging system.
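  • S420 Based on the convolutional neural network algorithm, match the target features of the different band images in the hyperspectral image against the target features of each band image in the training set, then discard the images of bands whose target features do not satisfy the matching mechanism and fuse the images of bands whose target features satisfy it into a single-channel image.
  • S520 Denoise the single-channel image to obtain a denoised image.
  • S620 Recognize the denoised image through the convolutional neural network algorithm to obtain a gesture recognition result.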
  • S720 Control the electronic game screen to execute a corresponding preset action according to a gesture recognition result.
  • the preset actions in step S720 include actions of a person or an object corresponding to the preset gesture, and so on.
  • a gesture recognition method based on hyperspectral imaging disclosed in Embodiment 4 mainly includes the following steps S130 to S730:
  • S130 The video game screen is initialized, and the hyperspectral imaging system, the visible light source, and the infrared light source are turned on.
  • S230 Capture a video stream of the user's hand movements, analyze whether there is a user gesture, and if not, recapture within a preset time.
  • S330 Capture a gesture of a user wearing gloves, and acquire a hyperspectral image by acquiring an image of the gesture under the illumination of a visible light source and an infrared light source through a hyperspectral imaging system.
  • S430 Based on the convolutional neural network algorithm, match the target features of the different band images in the hyperspectral image against the target features of each band image in the training set, discard the images carrying glove target features, and fuse the hand images of bands whose target features satisfy the matching mechanism into a single-channel image.
  • S530 Denoise the single-channel image to obtain a denoised image from which the glove target features have been removed.
  • S630 Recognize the denoised image through the convolutional neural network algorithm to obtain a gesture recognition result.
  • S730 Control the electronic game screen to perform a corresponding preset action according to a gesture recognition result.
  • the preset actions in step S730 include actions of a person or an object corresponding to the preset gesture, and so on.
  • a hyperspectral imaging-based gesture recognition device disclosed in Embodiment 5 includes a visible light source 10, an infrared light source 20, a hyperspectral imaging system 30, an image processing unit 40, and a gesture recognition unit 50.
  • the visible light source 10 and the infrared light source 20 are used to jointly illuminate the target gesture; the hyperspectral imaging system 30 is used to collect images of gestures under the common illumination of the visible light source 10 and the infrared light source 20 to obtain a hyperspectral image. That is, the user's hand is in an environment with a background influence, and is irradiated with the visible light source 10 and the infrared light source 20, and then the image is collected by the hyperspectral imaging system 30 to obtain a hyperspectral image (that is, an image with multiple channels).
  • the hyperspectral imaging system 30 may be a single hyperspectral camera, or may be multiple hyperspectral cameras.
  • the multiple hyperspectral cameras may collect hyperspectral images in different wavebands, respectively.
  • the image processing unit 40 performs target feature fusion processing on the hyperspectral image to form a single-channel image, and performs denoising processing on the single-channel image to obtain a denoised image.
  • the target feature fusion processing refers to processing the images of the different bands in a hyperspectral image according to a target-feature matching mechanism, discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
  • the matching mechanism may use a trained convolutional neural network (CNN) algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set; the images of bands whose target features do not satisfy the matching mechanism are then discarded, and the images of bands whose target features satisfy it are fused into a single-channel image.
  • the training of the convolutional neural network algorithm includes: matching the target features of each input band image against the target features of each band image preset in the training set, and adding images whose matching degree exceeds a preset matching-degree threshold to the training set.
  • the image processing unit 40 can therefore, based on the convolutional neural network algorithm, match the target features of the images of different bands in the hyperspectral image against the target features of each band image in the training set, and fuse the images of bands whose target features satisfy the matching mechanism into a single-channel image.
  • the gesture recognition unit 50 is configured to recognize the denoised image by using an image recognition and motion tracking algorithm to obtain a gesture recognition result.
  • image recognition and motion tracking algorithms can also use convolutional neural network algorithms. Of course, other algorithms suitable for gesture image recognition or motion tracking can also be used.
  • the hyperspectral imaging-based gesture recognition device of Embodiment 5 further includes a database unit 60, and the database unit 60 stores a convolutional neural network algorithm training set.
  • gesture image information in the visible and infrared bands can be collected at the same time; in addition to visible-band image information, infrared-band image information can also be used for recognition, which helps improve the accuracy and effectiveness of gesture recognition.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

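A gesture recognition method and device based on hyperspectral imaging. The gesture recognition method includes: acquiring, through a hyperspectral imaging system (30), an image of a gesture jointly illuminated by a visible light source (10) and an infrared light source (20) to obtain a hyperspectral image (S100); performing target feature fusion on the multi-channel hyperspectral image to form a single-channel image (S200); denoising the single-channel image to obtain a denoised image (S300); and recognizing the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result (S400). By acquiring gesture image information in the visible and infrared bands at the same time and effectively fusing the target features of the multi-channel hyperspectral image during gesture recognition, the contrast of the target feature region is enhanced, so that the image can be segmented effectively and the hand contour extracted for subsequent gesture recognition, which helps improve the accuracy and effectiveness of gesture recognition.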

Description

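Gesture recognition method and device based on hyperspectral imaging
Technical Field
The present invention belongs to the technical field of gesture recognition. It relates specifically to a gesture recognition method based on hyperspectral imaging and to a gesture recognition device based on hyperspectral imaging.
Background Art
In computer science, gesture recognition is the topic of recognizing human gestures through mathematical algorithms. Users can control or interact with a device using simple gestures, letting the computer understand human behavior. Its core technologies are gesture segmentation algorithms, gesture analysis algorithms, and gesture recognition algorithms.
As image recognition and motion tracking technologies mature, more and more devices can be operated and interacted with through gestures. Compared with traditional mouse and keyboard input, gesture operation is more convenient, offers more varied kinds of interaction, and makes the interaction more intuitive. Several gesture-controlled commercial products are already on the market, such as Microsoft's Kinect, Google's Google Glass, and Leap Motion. Users' demands for varied interaction keep rising, and the range of applications of gesture recognition will keep widening.
The general gesture recognition process is: image acquisition, feature extraction, and classification matching. Most common products today are recognition systems based on visible light and lack imaging results from other bands. Once the scene is too dark or the background resembles the hand, the signal-to-noise ratio is poor, and recognition accuracy and effectiveness drop sharply. Even with an active infrared imaging device, an overly sensitive acquisition device may cause overexposure and noise, degrading subsequent recognition.
Technical Problem
To solve the problem of insufficient accuracy and effectiveness of visible-light-based recognition systems in the prior art, the present invention discloses a gesture recognition method based on hyperspectral imaging to improve accuracy and effectiveness; a corresponding gesture recognition device based on hyperspectral imaging is also disclosed.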
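Technical Solution
The gesture recognition method based on hyperspectral imaging disclosed by the present invention includes the following steps:
acquiring, through a hyperspectral imaging system, an image of a gesture jointly illuminated by a visible light source and an infrared light source to obtain a hyperspectral image;
performing target feature fusion processing on the hyperspectral image to form a single-channel image;
denoising the single-channel image to obtain a denoised image;
recognizing the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
In a further scheme, the step of performing target feature fusion processing on the hyperspectral image to form a single-channel image includes: processing the images of the different bands in the hyperspectral image according to a target-feature matching mechanism, discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
In a further scheme, the matching mechanism uses a trained convolutional neural network algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set.
In a further scheme, training of the convolutional neural network algorithm includes: matching the target features of each input band image against the target features of each band image preset in the training set, and adding the images whose matching degree exceeds a preset matching-degree threshold to the training set.
In a further scheme, the image recognition and motion tracking algorithm specifically uses a convolutional neural network algorithm. Other algorithms suitable for gesture image recognition or motion tracking may of course also be used.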
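The four method steps can be sketched as a thin pipeline. This is a minimal illustration and not the patented implementation: the function name `recognize_gesture`, the `(height, width, bands)` array layout, and the three pluggable callables are assumptions introduced here.

```python
from typing import Callable

import numpy as np

Cube = np.ndarray   # hyperspectral image, assumed shape (height, width, bands)
Image = np.ndarray  # single-channel image, shape (height, width)


def recognize_gesture(
    cube: Cube,
    fuse: Callable[[Cube], Image],
    denoise: Callable[[Image], Image],
    classify: Callable[[Image], str],
) -> str:
    """Run fusion, denoising and recognition on one acquired hyperspectral image."""
    if cube.ndim != 3:
        raise ValueError("expected a (height, width, bands) array")
    single = fuse(cube)      # target feature fusion -> single-channel image
    clean = denoise(single)  # denoising
    return classify(clean)   # image recognition / motion tracking
```

Keeping each stage as a callable mirrors the text's remark that the recognition algorithm may be a convolutional neural network or any other suitable algorithm.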
 
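The gesture recognition device based on hyperspectral imaging correspondingly disclosed by the present invention includes a visible light source, an infrared light source, a hyperspectral imaging system, an image processing unit, and a gesture recognition unit, wherein:
the visible light source and the infrared light source jointly illuminate the target gesture;
the hyperspectral imaging system acquires an image of the gesture jointly illuminated by the visible light source and the infrared light source to obtain a hyperspectral image;
the image processing unit performs target feature fusion processing on the hyperspectral image to form a single-channel image, and denoises the single-channel image to obtain a denoised image;
the gesture recognition unit recognizes the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
In a further scheme, the device also includes a database unit that stores the training set of the convolutional neural network algorithm.
In a further scheme, the image processing unit, based on the convolutional neural network algorithm, matches the target features of the images of different bands in the hyperspectral image against the target features of each band image in the training set, and fuses the images of bands whose target features satisfy the matching mechanism into a single-channel image.
In a further scheme, the hyperspectral imaging system is a single hyperspectral camera.
In a further scheme, the hyperspectral imaging system is a plurality of hyperspectral cameras, each acquiring hyperspectral images of a different band.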
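Beneficial Effects
The present invention has at least the following beneficial effects:
(1) The invention can acquire gesture image information in the visible and infrared bands at the same time and effectively fuse the target features of the multi-channel hyperspectral image during gesture recognition, enhancing the contrast of the target feature region, so that the image can be segmented effectively and the hand contour extracted for subsequent gesture recognition, which helps improve the accuracy and effectiveness of gesture recognition.
(2) Target feature fusion processing can eliminate occlusions or dark spots and reduce environmental interference.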
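Brief Description of the Drawings
FIG. 1 is an overall flowchart of the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 1 of the present invention.
FIG. 2 is a schematic diagram of the principle of steps S100 to S300 in FIG. 1.
FIG. 3 is a schematic diagram of the principle of step S400 in FIG. 1.
FIG. 4 is a schematic diagram of the principle of the target feature fusion step in FIG. 2.
FIG. 5 is a schematic diagram of the training principle of the convolutional neural network algorithm in the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 1 of the present invention.
FIG. 6 is an overall flowchart of the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 2 of the present invention.
FIG. 7 is an overall flowchart of the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 3 of the present invention.
FIG. 8 is an overall flowchart of the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 4 of the present invention.
FIG. 9 is a structural block diagram of the gesture recognition device based on hyperspectral imaging disclosed in Embodiment 5 of the present invention.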
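Best Mode for Carrying Out the Invention
To make the technical problem solved by the present invention, its technical solution, and its beneficial effects clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Embodiment 1
Referring to FIGS. 1 to 5, the gesture recognition method based on hyperspectral imaging disclosed in this embodiment mainly includes the following steps S100 to S400:
S100: Acquire, through a hyperspectral imaging system, an image of a gesture jointly illuminated by a visible light source and an infrared light source to obtain a hyperspectral image.
Step S100 is shown in FIG. 2: the user's hand is in an environment with background influence and is illuminated jointly by a visible light source and an infrared light source; the hyperspectral imaging system then acquires an image, yielding a hyperspectral image (that is, an image with multiple channels).
S200: Perform target feature fusion processing on the hyperspectral image to form a single-channel image.
Step S200 is also shown in FIG. 2 and includes: processing the images of the different bands in the hyperspectral image according to the target-feature matching mechanism (shown in FIG. 4), discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
In this embodiment, the target feature may be selected from the target features, and the matching mechanism may use a trained convolutional neural network (CNN) algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set; the images of bands whose target features do not satisfy the matching mechanism are then discarded, and the images of bands whose target features satisfy it are fused into a single-channel image.
In the target feature fusion example shown in FIG. 4, the visible band 1 image and the visible band 2 image, which do not satisfy the matching mechanism, are discarded, and the infrared band image and the ultraviolet band image, which satisfy it, are fused into a single-channel image.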
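The band selection and fusion of step S200 can be illustrated with a small NumPy sketch. The source does not say how a band is scored or how the retained bands are combined, so everything specific here is an assumption: `match_scores` stands in for the output of the trained CNN, `threshold` is a hypothetical acceptance level, and plain averaging is used as the fusion rule.

```python
import numpy as np


def fuse_matching_bands(cube, match_scores, threshold=0.5):
    """Fuse the bands whose match score reaches `threshold` into one channel.

    cube: array of shape (height, width, bands).
    match_scores: one score per band (assumed to come from the matching CNN).
    Averaging is an assumed fusion rule; the source does not specify one.
    """
    cube = np.asarray(cube, dtype=float)
    scores = np.asarray(match_scores, dtype=float)
    if cube.ndim != 3 or scores.shape != (cube.shape[2],):
        raise ValueError("need one match score per band")
    keep = scores >= threshold  # bands that satisfy the matching mechanism
    if not keep.any():
        raise ValueError("no band satisfies the matching mechanism")
    return cube[:, :, keep].mean(axis=2)  # drop the rest, fuse the kept bands
```

With four bands ordered as in FIG. 4 (visible 1, visible 2, infrared, ultraviolet) and scores that reject the two visible bands, only the infrared and ultraviolet images contribute to the fused result.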
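As shown in FIG. 5, training of the convolutional neural network algorithm includes: matching the target features of each input band image against the target features of each band image preset in the training set, and adding the images whose matching degree exceeds a preset matching-degree threshold to the training set.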
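The training-set update rule can be sketched as follows. The scorer `match_degree` is a hypothetical callable standing in for the CNN-based comparison, and scoring each candidate against the set as it stood on entry is a design choice made here, not something the source states.

```python
def update_training_set(training_set, candidates, match_degree, threshold):
    """Append every candidate whose matching degree exceeds `threshold`.

    match_degree(candidate, training_set) -> float is a hypothetical scorer.
    Returns the number of images added.
    """
    added = 0
    snapshot = list(training_set)  # score against the set as it was on entry
    for image in candidates:
        if match_degree(image, snapshot) > threshold:
            training_set.append(image)
            added += 1
    return added
```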
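S300: Denoise the single-channel image to obtain a denoised image.
S400: Recognize the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
The image recognition and motion tracking algorithm of step S400 may likewise be a convolutional neural network algorithm; other algorithms suitable for gesture image recognition or motion tracking may of course also be used.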
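Step S300 does not name a denoising technique. As one possible instance, the sketch below applies a median filter, which is an assumption chosen because it suppresses isolated noise pixels; the function name and the edge-replication padding are likewise illustrative.

```python
import numpy as np


def median_denoise(image, size=3):
    """Apply a size x size median filter with edge replication."""
    if size % 2 != 1:
        raise ValueError("size must be odd")
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    # windows has shape (height, width, size, size): one patch per output pixel
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(2, 3))
```

The output has the same shape as the input, so it can be passed directly to the recognition step.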
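This embodiment therefore has at least the following beneficial effects:
(1) Gesture image information in the visible and infrared bands can be acquired at the same time; in addition to visible-band image information, infrared-band image information can also be used for recognition, which helps improve the accuracy and effectiveness of gesture recognition.
(2) Target feature fusion processing can eliminate occlusions or dark spots and reduce environmental interference.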
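Embodiment 2
As smart display devices advance in interactive technology, many displays such as televisions and computers can now be controlled by gestures, including channel switching, volume adjustment, and menu settings on a television, or the actions of characters or objects on the screen. The gesture recognition method based on hyperspectral imaging disclosed in Embodiment 2 takes gesture recognition for such smart display devices as an example: it acquires the image information that the gesture presents in the visible and infrared bands and applies target feature fusion processing to the images, improving recognition accuracy and effectiveness.
Referring to FIG. 6, the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 2 mainly includes the following steps S110 to S710:
S110: The smart display device is initialized, and the hyperspectral imaging system, the visible light source, and the infrared light source are turned on.
S210: Detect whether a user gesture is present; if not, detect again within a preset time.
S310: When a user gesture is detected, acquire, through the hyperspectral imaging system, an image of the gesture jointly illuminated by the visible light source and the infrared light source to obtain a hyperspectral image.
S410: Process the images of the different bands in the hyperspectral image according to the target-feature matching mechanism, discard the images of bands whose target features do not satisfy the matching mechanism, and fuse the images of bands whose target features satisfy it into a single-channel image.
S510: Denoise the single-channel image to obtain a denoised image.
S610: Recognize the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
S710: Control the smart display device to perform the corresponding preset action according to the gesture recognition result.
The preset actions in step S710 include channel switching, volume adjustment, menu settings, or the actions of characters or objects on the screen, and so on.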
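Step S710 maps a recognition result to a preset action. A dispatch table is one simple way to express that mapping; the gesture labels and the action strings below are hypothetical, since the source names the kinds of action but not which gesture triggers which.

```python
from typing import Callable, Dict, Optional


def dispatch_gesture(
    label: Optional[str], actions: Dict[str, Callable[[], str]]
) -> Optional[str]:
    """Run the preset action bound to `label`; return None when nothing applies."""
    if label is None:  # no gesture detected: the caller should detect again
        return None
    action = actions.get(label)
    return action() if action is not None else None


# Hypothetical gesture labels bound to the kinds of preset action named in S710.
ACTIONS = {
    "swipe_left": lambda: "channel switched",
    "thumb_up": lambda: "volume raised",
    "open_palm": lambda: "menu opened",
}
```

Returning `None` for a missing or unknown gesture leaves the retry policy of step S210 to the caller.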
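Embodiment 3
Many video games support motion-sensing control. In a game scenario, the gesture recognition method based on hyperspectral imaging of Embodiment 3 can use a hyperspectral system (hyperspectral camera) to capture a video stream of the user's hand movements. For each video frame, band images whose brightness is too low are discarded first, SVD is then used to reduce the data size, and a convolutional neural network algorithm then performs recognition and selects the best class. If the ambient light is too dark, the visible-band images may be discarded; after SVD processing, more of the non-visible-band image information is retained, so the features input to the convolutional neural network algorithm are more distinct and recognition accuracy is higher.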
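The per-frame preprocessing described here (drop dark bands, then shrink the data with SVD) can be sketched in NumPy. The source does not state how brightness is measured or how the SVD is applied, so both choices below are assumptions: brightness is taken as the mean intensity of a band, and the SVD is applied to the pixel-by-band matrix so that each pixel's spectrum is projected onto the leading singular directions.

```python
import numpy as np


def drop_dark_bands(cube, min_mean):
    """Keep only the bands whose mean intensity is at least `min_mean`."""
    cube = np.asarray(cube, dtype=float)
    keep = cube.mean(axis=(0, 1)) >= min_mean
    return cube[:, :, keep]


def svd_reduce(cube, rank):
    """Project per-pixel spectra onto the top `rank` right singular vectors."""
    h, w, b = cube.shape
    pixels = cube.reshape(h * w, b)  # one row per pixel, one column per band
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    rank = min(rank, vt.shape[0])
    return (pixels @ vt[:rank].T).reshape(h, w, rank)
```

In a dark room the visible bands would fall below `min_mean` and be removed before the SVD, which matches the text's observation that the remaining non-visible bands then dominate the reduced representation.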
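As shown in FIG. 7, the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 3 mainly includes the following steps S120 to S720:
S120: The video game screen is initialized, and the hyperspectral imaging system, the visible light source, and the infrared light source are turned on.
S220: Capture a video stream of the user's hand movements and analyze whether a user gesture is present; if not, capture again within a preset time.
S320: When a user gesture is captured, acquire, through the hyperspectral imaging system, an image of the gesture jointly illuminated by the visible light source and the infrared light source to obtain a hyperspectral image.
S420: Based on the convolutional neural network algorithm, match the target features of the different band images in the hyperspectral image against the target features of each band image in the training set, then discard the images of bands whose target features do not satisfy the matching mechanism and fuse the images of bands whose target features satisfy it into a single-channel image.
S520: Denoise the single-channel image to obtain a denoised image.
S620: Recognize the denoised image through the convolutional neural network algorithm to obtain a gesture recognition result.
S720: Control the video game screen to perform the corresponding preset action according to the gesture recognition result.
The preset actions in step S720 include the actions of characters or objects corresponding to preset gestures, and so on.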
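Embodiment 4
When a user controls a video game by gesture while wearing gloves, the visible-light target features of the glove differ greatly from those of a bare hand, so the recognition rate of existing products is unsatisfactory; raising accuracy would require a new recognition algorithm or a newly trained model, which greatly increases cost. The gesture recognition method based on hyperspectral imaging of Embodiment 4 can instead recognize the part of the human body's infrared radiation that passes through the glove; combined with the convolutional neural network algorithm and a training set that already contains infrared-spectrum target features, gestures can be recognized accurately.
As shown in FIG. 8, the gesture recognition method based on hyperspectral imaging disclosed in Embodiment 4 mainly includes the following steps S130 to S730:
S130: The video game screen is initialized, and the hyperspectral imaging system, the visible light source, and the infrared light source are turned on.
S230: Capture a video stream of the user's hand movements and analyze whether a user gesture is present; if not, capture again within a preset time.
S330: When a gesture of a user wearing gloves is captured, acquire, through the hyperspectral imaging system, an image of the gesture jointly illuminated by the visible light source and the infrared light source to obtain a hyperspectral image.
S430: Based on the convolutional neural network algorithm, match the target features of the different band images in the hyperspectral image against the target features of each band image in the training set, discard the images carrying glove target features, and fuse the hand images of bands whose target features satisfy the matching mechanism into a single-channel image.
S530: Denoise the single-channel image to obtain a denoised image from which the glove target features have been removed.
S630: Recognize the denoised image through the convolutional neural network algorithm to obtain a gesture recognition result.
S730: Control the video game screen to perform the corresponding preset action according to the gesture recognition result.
The preset actions in step S730 include the actions of characters or objects corresponding to preset gestures, and so on.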
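Embodiment 5
As shown in FIG. 9, the gesture recognition device based on hyperspectral imaging correspondingly disclosed in Embodiment 5 includes a visible light source 10, an infrared light source 20, a hyperspectral imaging system 30, an image processing unit 40, and a gesture recognition unit 50.
The visible light source 10 and the infrared light source 20 jointly illuminate the target gesture. The hyperspectral imaging system 30 acquires an image of the gesture jointly illuminated by the visible light source 10 and the infrared light source 20 to obtain a hyperspectral image. That is, the user's hand is in an environment with background influence and is illuminated jointly by the visible light source 10 and the infrared light source 20; the hyperspectral imaging system 30 then acquires an image, yielding a hyperspectral image (that is, an image with multiple channels).
In this embodiment, the hyperspectral imaging system 30 may be a single hyperspectral camera or a plurality of hyperspectral cameras; the plurality of hyperspectral cameras may each acquire hyperspectral images of a different band.
The image processing unit 40 performs target feature fusion processing on the hyperspectral image to form a single-channel image, and denoises the single-channel image to obtain a denoised image. Target feature fusion processing means processing the images of the different bands in the hyperspectral image according to a target-feature matching mechanism, discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
In this embodiment, the matching mechanism may use a trained convolutional neural network (CNN) algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set; the images of bands whose target features do not satisfy the matching mechanism are then discarded, and the images of bands whose target features satisfy it are fused into a single-channel image. Training of the convolutional neural network algorithm includes: matching the target features of each input band image against the target features of each band image preset in the training set, and adding the images whose matching degree exceeds a preset matching-degree threshold to the training set. The image processing unit 40 can therefore, based on the convolutional neural network algorithm, match the target features of the images of different bands in the hyperspectral image against the target features of each band image in the training set, and fuse the images of bands whose target features satisfy the matching mechanism into a single-channel image.
The gesture recognition unit 50 recognizes the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result. Specifically, the image recognition and motion tracking algorithm may likewise be a convolutional neural network algorithm; other algorithms suitable for gesture image recognition or motion tracking may of course also be used.
In a further scheme, the gesture recognition device based on hyperspectral imaging of Embodiment 5 also includes a database unit 60, which stores the training set of the convolutional neural network algorithm.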
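The unit structure of Embodiment 5 can be mirrored in software as a small container type. This is only a structural sketch: the class name, the attribute names, and the use of plain callables for units 30, 40 and 50 are assumptions, and the two light sources are hardware that the sketch does not model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

import numpy as np


@dataclass
class GestureRecognitionDevice:
    """Wires the units of Embodiment 5 together; light sources are not modeled."""

    acquire: Callable[[], np.ndarray]            # hyperspectral imaging system 30
    process: Callable[[np.ndarray], np.ndarray]  # image processing unit 40
    recognize: Callable[[np.ndarray], str]       # gesture recognition unit 50
    training_set: List[np.ndarray] = field(default_factory=list)  # database unit 60

    def run_once(self) -> str:
        cube = self.acquire()
        image = self.process(cube)  # fusion and denoising -> single channel
        return self.recognize(image)
```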
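In summary, Embodiments 2 to 5 above likewise have the following beneficial effects:
(1) Gesture image information in the visible and infrared bands can be acquired at the same time; in addition to visible-band image information, infrared-band image information can also be used for recognition, which helps improve the accuracy and effectiveness of gesture recognition.
(2) Target feature fusion processing can eliminate occlusions or dark spots and reduce environmental interference.
Although embodiments of the present invention have been shown and described above, a person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.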

Claims (10)

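  1. A gesture recognition method based on hyperspectral imaging, characterized by comprising the following steps:
    acquiring, through a hyperspectral imaging system, an image of a gesture jointly illuminated by a visible light source and an infrared light source to obtain a hyperspectral image;
    performing target feature fusion processing on the hyperspectral image to form a single-channel image;
    denoising the single-channel image to obtain a denoised image;
    recognizing the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
  2. The gesture recognition method based on hyperspectral imaging according to claim 1, characterized in that performing target feature fusion processing on the hyperspectral image to form a single-channel image comprises: processing the images of the different bands in the hyperspectral image according to a target-feature matching mechanism, discarding the images of bands whose target features do not satisfy the matching mechanism, and fusing the images of bands whose target features satisfy it into a single-channel image.
  3. The gesture recognition method based on hyperspectral imaging according to claim 2, characterized in that the matching mechanism uses a trained convolutional neural network algorithm, which matches the target features of the different band images in the hyperspectral image against the target features of each band image in the training set.
  4. The gesture recognition method based on hyperspectral imaging according to claim 3, characterized in that training of the convolutional neural network algorithm comprises: matching the target features of each input band image against the target features of each band image preset in the training set, and adding the images whose matching degree exceeds a preset matching-degree threshold to the training set.
  5. The gesture recognition method based on hyperspectral imaging according to any one of claims 1 to 4, characterized in that the image recognition and motion tracking algorithm specifically uses a convolutional neural network algorithm.
  6. A gesture recognition device based on hyperspectral imaging, characterized by comprising a visible light source, an infrared light source, a hyperspectral imaging system, an image processing unit, and a gesture recognition unit, wherein:
    the visible light source and the infrared light source jointly illuminate the target gesture;
    the hyperspectral imaging system acquires an image of the gesture jointly illuminated by the visible light source and the infrared light source to obtain a hyperspectral image;
    the image processing unit performs target feature fusion processing on the hyperspectral image to form a single-channel image, and denoises the single-channel image to obtain a denoised image;
    the gesture recognition unit recognizes the denoised image through image recognition and motion tracking algorithms to obtain a gesture recognition result.
  7. The gesture recognition device based on hyperspectral imaging according to claim 6, characterized by further comprising a database unit that stores the training set of the convolutional neural network algorithm.
  8. The gesture recognition device based on hyperspectral imaging according to claim 7, characterized in that the image processing unit, based on the convolutional neural network algorithm, matches the target features of the images of different bands in the hyperspectral image against the target features of each band image in the training set, and fuses the images of bands whose target features satisfy the matching mechanism into a single-channel image.
  9. The gesture recognition device based on hyperspectral imaging according to any one of claims 6 to 8, characterized in that the hyperspectral imaging system is a single hyperspectral camera.
  10. The gesture recognition device based on hyperspectral imaging according to any one of claims 6 to 8, characterized in that the hyperspectral imaging system is a plurality of hyperspectral cameras, each acquiring hyperspectral images of a different band.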
PCT/CN2018/101917 2018-08-23 2018-08-23 Gesture recognition method and device based on hyperspectral imaging WO2020037594A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/101917 WO2020037594A1 (zh) 2018-08-23 2018-08-23 Gesture recognition method and device based on hyperspectral imaging
CN201880001197.7A CN109196518B (zh) 2018-08-23 2018-08-23 Gesture recognition method and device based on hyperspectral imaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/101917 WO2020037594A1 (zh) 2018-08-23 2018-08-23 Gesture recognition method and device based on hyperspectral imaging

Publications (1)

Publication Number Publication Date
WO2020037594A1 (zh) 2020-02-27

Family

ID=64938512

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/101917 WO2020037594A1 (zh) 2018-08-23 2018-08-23 Gesture recognition method and device based on hyperspectral imaging

Country Status (2)

Country Link
CN (1) CN109196518B (zh)
WO (1) WO2020037594A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
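CN111539360A (zh) * 2020-04-28 2020-08-14 重庆紫光华山智安科技有限公司 Seat belt wearing recognition method, device, and electronic equipment
CN112161937A (zh) * 2020-11-04 2021-01-01 安徽大学 Wheat flour gluten strength detection method based on cascade forest and convolutional neural network
CN112257619A (zh) * 2020-10-27 2021-01-22 北京澎思科技有限公司 Target re-identification method, device, equipment, and storage medium
CN113436111A (zh) * 2021-07-21 2021-09-24 西北工业大学 Hyperspectral remote sensing image denoising method based on network architecture search
CN117315430A (zh) * 2023-11-28 2023-12-29 华侨大学 Incomplete-modality feature fusion method for large-scale vehicle re-identification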

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
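CN109508670B (zh) * 2018-11-12 2021-10-12 东南大学 Static gesture recognition method based on an infrared camera
CN110243769A (zh) * 2019-07-30 2019-09-17 南阳理工学院 Hyperspectral sub-pixel target recognition system and method assisted by multi-source information
CN113298092A (zh) * 2021-05-28 2021-08-24 有米科技股份有限公司 Neural network training method and device for multi-level image contour information extraction
CN114390760B (zh) * 2022-01-20 2023-11-21 北方工业大学 Light control method and system
CN114782502B (zh) * 2022-06-16 2022-11-04 浙江宇视科技有限公司 Multispectral multi-sensor cooperative processing method and device, and storage medium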

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120268374A1 (en) * 2011-04-25 2012-10-25 Heald Arthur D Method and apparatus for processing touchless control commands
US20130279756A1 (en) * 2010-12-16 2013-10-24 Ovadya Menadeva Computer vision based hand identification
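CN106204601A (zh) * 2016-07-15 2016-12-07 华东师范大学 Parallel registration method for in-vivo hyperspectral sequence images based on band scanning
CN107436685A (zh) * 2017-07-31 2017-12-05 京东方科技集团股份有限公司 Display device, self-luminous display panel, and gesture recognition method
CN107679491A (zh) * 2017-09-29 2018-02-09 华中师范大学 3D convolutional neural network sign language recognition method fusing multi-modal data
CN108090477A (zh) * 2018-01-23 2018-05-29 北京易智能科技有限公司 Face recognition method and device based on multispectral fusion
CN108304789A (zh) * 2017-12-12 2018-07-20 北京深醒科技有限公司 Face recognition method and device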

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161607A1 (en) * 2015-12-04 2017-06-08 Pilot Ai Labs, Inc. System and method for improved gesture recognition using neural networks
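WO2018076371A1 (zh) * 2016-10-31 2018-05-03 深圳市大疆创新科技有限公司 Action recognition method, network training method, device, and equipment
CN108229515A (zh) * 2016-12-29 2018-06-29 北京市商汤科技开发有限公司 Object classification method and device for hyperspectral images, and electronic equipment
CN106709477A (zh) * 2017-02-23 2017-05-24 哈尔滨工业大学深圳研究生院 Face recognition method and system based on adaptive score fusion and deep learning
CN108197585A (zh) * 2017-12-13 2018-06-22 北京深醒科技有限公司 Face recognition method and device
CN108197580B (zh) * 2018-01-09 2019-07-23 吉林大学 Gesture recognition method based on 3D convolutional neural network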

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130279756A1 (en) * 2010-12-16 2013-10-24 Ovadya Menadeva Computer vision based hand identification
US20120268374A1 (en) * 2011-04-25 2012-10-25 Heald Arthur D Method and apparatus for processing touchless control commands
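CN106204601A (zh) * 2016-07-15 2016-12-07 华东师范大学 Parallel registration method for in vivo hyperspectral sequence images based on band scanning
CN107436685A (zh) * 2017-07-31 2017-12-05 京东方科技集团股份有限公司 Display device, self-luminous display panel and gesture recognition method
CN107679491A (zh) * 2017-09-29 2018-02-09 华中师范大学 3D convolutional neural network sign language recognition method fusing multi-modal data
CN108304789A (zh) * 2017-12-12 2018-07-20 北京深醒科技有限公司 Face recognition method and apparatus
CN108090477A (zh) * 2018-01-23 2018-05-29 北京易智能科技有限公司 Face recognition method and apparatus based on multispectral fusion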

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
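CN111539360A (zh) * 2020-04-28 2020-08-14 重庆紫光华山智安科技有限公司 Seat belt wearing recognition method, apparatus and electronic device
CN111539360B (zh) * 2020-04-28 2022-11-22 重庆紫光华山智安科技有限公司 Seat belt wearing recognition method, apparatus and electronic device
CN112257619A (zh) * 2020-10-27 2021-01-22 北京澎思科技有限公司 Target re-identification method, apparatus, device and storage medium
CN112161937A (zh) * 2020-11-04 2021-01-01 安徽大学 Method for detecting wheat flour gluten strength based on cascade forest and convolutional neural network
CN113436111A (zh) * 2021-07-21 2021-09-24 西北工业大学 Hyperspectral remote sensing image denoising method based on network architecture search
CN113436111B (zh) * 2021-07-21 2024-01-09 西北工业大学 Hyperspectral remote sensing image denoising method based on network architecture search
CN117315430A (zh) * 2023-11-28 2023-12-29 华侨大学 Incomplete modal feature fusion method for wide-area vehicle re-identification
CN117315430B (zh) * 2023-11-28 2024-03-12 华侨大学 Incomplete modal feature fusion method for wide-area vehicle re-identification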

Also Published As

Publication number Publication date
CN109196518A (zh) 2019-01-11
CN109196518B (zh) 2022-06-07

Similar Documents

Publication Publication Date Title
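WO2020037594A1 (zh) Gesture recognition method and device based on hyperspectral imaging
CN110209273B (zh) Gesture recognition method, interaction control method, apparatus, medium and electronic device
Gorodnichy et al. Nouse ‘use your nose as a mouse’ perceptual vision technology for hands-free games and interfaces
CN106774850B (zh) Mobile terminal and interaction control method thereof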
US10424116B2 (en) Display apparatus and controlling method thereof
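WO2020078119A1 (zh) Method, apparatus and system for simulating a user wearing clothing and accessories
CN108919958A (zh) Image transmission method, apparatus, terminal device and storage medium
CN108491072B (zh) Virtual reality interaction method and apparatus
KR20170056860A (ko) Image generation method and apparatus
CN103092332A (zh) Television digital image interaction method and system
CN114138121A (zh) User gesture recognition method, apparatus, system, storage medium and computing device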
Thabet et al. Fast marching method and modified features fusion in enhanced dynamic hand gesture segmentation and detection method under complicated background
Lo et al. Augmediated reality system based on 3D camera selfgesture sensing
US11682045B2 (en) Augmented reality advertisements on objects
Ueng et al. Vision based multi-user human computer interaction
Lin et al. An eye-tracking and head-control system using movement increment-coordinate method
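CN112651270A (zh) Gaze information determination method, apparatus, terminal device and display object
CN114581535B (zh) Method, apparatus, storage medium and device for annotating user skeletal key points in images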
Ogata et al. Automatic threshold-setting method for iris detection for brown eyes in an eye–gaze interface system with a visible light camera
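KR102305404B1 (ko) Hand gesture detection method for a wearable augmented reality device using infrared images, and wearable augmented reality device capable of hand gesture detection using infrared images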
Taaban et al. Eye tracking based mobile application
Ferhat et al. Eye-tracking with webcam-based setups: Implementation of a real-time system and an analysis of factors affecting performance
De Beugher et al. Automatic analysis of in-the-wild mobile eye-tracking experiments
JP7296069B2 (ja) Gaze input device and gaze input method
US20240040099A1 (en) Depth of field in video based on gaze

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 18931124; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23/06/2021)

122 Ep: pct application non-entry in european phase. Ref document number: 18931124; Country of ref document: EP; Kind code of ref document: A1