CN110959160A - Gesture recognition method, apparatus, and device - Google Patents
Gesture recognition method, apparatus, and device
- Publication number
- CN110959160A
- Authority
- CN
- China
- Prior art keywords
- image
- images
- gesture recognition
- video segment
- recognition result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Concept | Category | Sections | Count |
---|---|---|---|
method | Methods | title, claims, abstract, description | 62 |
deep learning | Methods | claims, abstract, description | 47 |
action | Effects | claims, abstract, description | 37 |
calculation method | Methods | claims, abstract, description | 34 |
fusion | Effects | claims, abstract, description | 26 |
optical effect | Effects | claims, description | 133 |
machine learning | Methods | claims, description | 50 |
locomotion | Effects | claims, description | 34 |
processing | Methods | claims, description | 31 |
support-vector machine | Methods | claims, description | 17 |
neural network model | Methods | claims, description | 10 |
change | Effects | claims, description | 7 |
neuron | Anatomy | claims, description | 5 |
process | Effects | abstract, description | 13 |
effect | Effects | abstract, description | 5 |
interaction | Effects | abstract, description | 5 |
function | Effects | description | 18 |
training | Methods | description | 16 |
deep learning model | Methods | description | 14 |
diagram | Methods | description | 11 |
extract | Substances | description | 10 |
communication | Methods | description | 4 |
engineering process | Methods | description | 2 |
temporal effect | Effects | description | 2 |
Averaging | Methods | description | 1 |
buffer | Substances | description | 1 |
cellular effect | Effects | description | 1 |
detection method | Methods | description | 1 |
fiber | Substances | description | 1 |
image analysis | Methods | description | 1 |
immersion | Methods | description | 1 |
improvement | Effects | description | 1 |
modification | Effects | description | 1 |
modification | Methods | description | 1 |
research | Methods | description | 1 |
segmentation | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/809—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
Abstract
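This application provides a gesture recognition method and relates to the field of human-computer interaction technologies. The method includes: extracting M images from a first video segment in a video stream; performing gesture recognition on the M images by using a deep learning algorithm, to obtain a gesture recognition result corresponding to the first video segment; and fusing the gesture recognition results of N consecutive video segments, including the first video segment, to obtain a fused gesture recognition result. This recognition process does not need to segment or track the gesture in the video stream. Instead, a deep learning algorithm with a relatively fast calculation speed recognizes the action in each phase, and the per-phase actions are then fused, which increases gesture recognition speed and reduces gesture recognition latency.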
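The flow in the abstract (sample M images per video segment, classify them, then fuse the results of N consecutive segments) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the evenly spaced sampling, the averaging fusion rule, and the names `sample_frames`, `fuse`, `recognize_stream`, and `classify` are all assumptions introduced here. The abstract only says that the results are fused and does not specify how.

```python
from collections import deque
from typing import Callable, List, Sequence


def sample_frames(segment: Sequence, m: int) -> list:
    """Pick M roughly evenly spaced frames from one video segment (assumed sampling rule)."""
    if m <= 0 or not segment:
        return []
    if m >= len(segment):
        return list(segment)
    step = len(segment) / m
    return [segment[int(i * step)] for i in range(m)]


def fuse(results: Sequence[List[float]]) -> int:
    """Fuse per-segment class scores by averaging, then return the best class index.

    Averaging is an assumption of this sketch; any other fusion rule could be swapped in.
    """
    n_classes = len(results[0])
    mean = [sum(r[c] for r in results) / len(results) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


def recognize_stream(
    segments: Sequence[Sequence],
    classify: Callable[[list], List[float]],
    m: int,
    n: int,
) -> List[int]:
    """Classify each segment from M sampled frames and fuse the last N segment results.

    `classify` stands in for the deep learning model (hypothetical interface):
    it maps a list of frames to one score per gesture class.
    """
    window = deque(maxlen=n)  # holds the results of the N most recent segments
    fused = []
    for segment in segments:
        window.append(classify(sample_frames(segment, m)))
        if len(window) == n:
            fused.append(fuse(window))
    return fused
```

Because only M frames per segment reach the classifier and the fusion step works on small score vectors, the per-segment cost stays bounded, which is in line with the latency argument the abstract makes. No hand segmentation or tracking step appears anywhere in the loop.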
Description
PCT national-phase application; the description has been published.
Claims (30)
- PCT national-phase application; the claims have been published.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/095388 WO2019023921A1 (zh) | 2017-08-01 | 2017-08-01 | Gesture recognition method, apparatus, and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110959160A (zh) | 2020-04-03 |
Family
ID=65232224
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201780093539.8A Pending CN110959160A (zh) | Gesture recognition method, apparatus, and device | 2017-08-01 | 2017-08-01 |
Country Status (6)
Country | Link |
---|---|
US (1) | US11450146B2 (zh) |
EP (1) | EP3651055A4 (zh) |
KR (1) | KR102364993B1 (zh) |
CN (1) | CN110959160A (zh) |
BR (1) | BR112020001729A8 (zh) |
WO (1) | WO2019023921A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115809006A (zh) * | 2022-12-05 | 2023-03-17 | 北京拙河科技有限公司 | Method and device for controlling manual instructions through a picture |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10678244B2 (en) | 2017-03-23 | 2020-06-09 | Tesla, Inc. | Data synthesis for autonomous control systems |
US11409692B2 (en) | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
US11893393B2 (en) | 2017-07-24 | 2024-02-06 | Tesla, Inc. | Computational array microprocessor system with hardware arbiter managing memory requests |
US11157441B2 (en) | 2017-07-24 | 2021-10-26 | Tesla, Inc. | Computational array microprocessor system using non-consecutive data formatting |
US10671349B2 (en) | 2017-07-24 | 2020-06-02 | Tesla, Inc. | Accelerated mathematical engine |
US11561791B2 (en) | 2018-02-01 | 2023-01-24 | Tesla, Inc. | Vector computational unit receiving data elements in parallel from a last row of a computational array |
US11215999B2 (en) | 2018-06-20 | 2022-01-04 | Tesla, Inc. | Data pipeline and deep learning system for autonomous driving |
US11361457B2 (en) | 2018-07-20 | 2022-06-14 | Tesla, Inc. | Annotation cross-labeling for autonomous control systems |
US11636333B2 (en) | 2018-07-26 | 2023-04-25 | Tesla, Inc. | Optimizing neural network structures for embedded systems |
US11562231B2 (en) | 2018-09-03 | 2023-01-24 | Tesla, Inc. | Neural networks for embedded devices |
CA3115784A1 (en) | 2018-10-11 | 2020-04-16 | Matthew John COOPER | Systems and methods for training machine models with augmented data |
US11196678B2 (en) | 2018-10-25 | 2021-12-07 | Tesla, Inc. | QOS manager for system on a chip communications |
US11816585B2 (en) | 2018-12-03 | 2023-11-14 | Tesla, Inc. | Machine learning models operating at different frequencies for autonomous vehicles |
US11537811B2 (en) | 2018-12-04 | 2022-12-27 | Tesla, Inc. | Enhanced object detection for autonomous vehicles based on field view |
US11610117B2 (en) | 2018-12-27 | 2023-03-21 | Tesla, Inc. | System and method for adapting a neural network model on a hardware platform |
US10997461B2 (en) | 2019-02-01 | 2021-05-04 | Tesla, Inc. | Generating ground truth for machine learning from time series elements |
US11567514B2 (en) | 2019-02-11 | 2023-01-31 | Tesla, Inc. | Autonomous and user controlled vehicle summon to a target |
US10956755B2 (en) | 2019-02-19 | 2021-03-23 | Tesla, Inc. | Estimating object properties using visual image data |
WO2020251385A1 (en) * | 2019-06-14 | 2020-12-17 | Ringcentral, Inc., (A Delaware Corporation) | System and method for capturing presentation gestures |
CN110728209B (zh) * | 2019-09-24 | 2023-08-08 | 腾讯科技(深圳)有限公司 | Posture recognition method and apparatus, electronic device, and storage medium |
CN111368770B (zh) * | 2020-03-11 | 2022-06-07 | 桂林理工大学 | Gesture recognition method based on skeleton point detection and tracking |
CN114600072A (zh) * | 2020-03-20 | 2022-06-07 | 华为技术有限公司 | Method and system for gesture-based control of a device |
CN112115801B (zh) * | 2020-08-25 | 2023-11-24 | 深圳市优必选科技股份有限公司 | Dynamic gesture recognition method and apparatus, storage medium, and terminal device |
US11481039B2 (en) * | 2020-08-28 | 2022-10-25 | Electronics And Telecommunications Research Institute | System for recognizing user hand gesture and providing virtual reality content based on deep learning using transfer learning |
US20220129667A1 (en) * | 2020-10-26 | 2022-04-28 | The Boeing Company | Human Gesture Recognition for Autonomous Aircraft Operation |
US20220292285A1 (en) * | 2021-03-11 | 2022-09-15 | International Business Machines Corporation | Adaptive selection of data modalities for efficient video recognition |
WO2022217290A1 (en) * | 2021-04-09 | 2022-10-13 | Google Llc | Using a machine-learned module for radar-based gesture detection in an ambient computer environment |
CN114564104A (zh) * | 2022-02-17 | 2022-05-31 | 西安电子科技大学 | Conference presentation system based on dynamic gesture control in video |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102155933A (zh) * | 2011-03-08 | 2011-08-17 | 西安工程大学 | Method for measuring galloping of transmission conductors based on video difference analysis |
CN102395984A (zh) * | 2009-04-14 | 2012-03-28 | 皇家飞利浦电子股份有限公司 | Key frame extraction for video content analysis |
CN103514608A (zh) * | 2013-06-24 | 2014-01-15 | 西安理工大学 | Moving object detection and extraction method based on a motion attention fusion model |
CN103984937A (zh) * | 2014-05-30 | 2014-08-13 | 无锡慧眼电子科技有限公司 | Pedestrian counting method based on the optical flow method |
US20160092726A1 (en) * | 2014-09-30 | 2016-03-31 | Xerox Corporation | Using gestures to train hand detection in ego-centric video |
CN105550699A (zh) * | 2015-12-08 | 2016-05-04 | 北京工业大学 | Video recognition and classification method based on CNN fusion of spatio-temporal salient information |
CN105787458A (zh) * | 2016-03-11 | 2016-07-20 | 重庆邮电大学 | Infrared behavior recognition method based on adaptive fusion of hand-crafted features and deep learning features |
US20170206405A1 (en) * | 2016-01-14 | 2017-07-20 | Nvidia Corporation | Online detection and classification of dynamic gestures with recurrent convolutional neural networks |
CN106991372A (zh) * | 2017-03-02 | 2017-07-28 | 北京工业大学 | Dynamic gesture recognition method based on a hybrid deep learning model |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5149033B2 (ja) * | 2008-02-26 | 2013-02-20 | 岐阜車体工業株式会社 | Motion analysis method, motion analysis device, and motion evaluation device using the motion analysis device |
US20120056846A1 (en) * | 2010-03-01 | 2012-03-08 | Lester F. Ludwig | Touch-based user interfaces employing artificial neural networks for hdtp parameter and symbol derivation |
JP5604256B2 (ja) * | 2010-10-19 | 2014-10-08 | 日本放送協会 | Human motion detection device and program therefor |
CN102854983B (zh) | 2012-09-10 | 2015-12-02 | 中国电子科技集团公司第二十八研究所 | Human-computer interaction method based on gesture recognition |
US9829984B2 (en) * | 2013-05-23 | 2017-11-28 | Fastvdo Llc | Motion-assisted visual language for human computer interfaces |
KR102214922B1 (ko) * | 2014-01-23 | 2021-02-15 | 삼성전자주식회사 | Method for generating a feature vector for action recognition, method for generating a histogram, and method for training a classifier |
CN104182772B (zh) * | 2014-08-19 | 2017-10-24 | 大连理工大学 | Gesture recognition method based on deep learning |
CN106295531A (zh) * | 2016-08-01 | 2017-01-04 | 乐视控股(北京)有限公司 | Gesture recognition method and apparatus, and virtual reality terminal |
-
2017
- 2017-08-01 CN CN201780093539.8A patent/CN110959160A/zh active Pending
- 2017-08-01 WO PCT/CN2017/095388 patent/WO2019023921A1/zh unknown
- 2017-08-01 BR BR112020001729A patent/BR112020001729A8/pt active Search and Examination
- 2017-08-01 EP EP17920578.6A patent/EP3651055A4/en active Pending
- 2017-08-01 KR KR1020207005925A patent/KR102364993B1/ko active IP Right Grant
-
2020
- 2020-01-29 US US16/776,282 patent/US11450146B2/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102395984A (zh) * | 2009-04-14 | 2012-03-28 | 皇家飞利浦电子股份有限公司 | Key frame extraction for video content analysis |
CN102155933A (zh) * | 2011-03-08 | 2011-08-17 | 西安工程大学 | Method for measuring galloping of transmission conductors based on video difference analysis |
CN103514608A (zh) * | 2013-06-24 | 2014-01-15 | 西安理工大学 | Moving object detection and extraction method based on a motion attention fusion model |
CN103984937A (zh) * | 2014-05-30 | 2014-08-13 | 无锡慧眼电子科技有限公司 | Pedestrian counting method based on the optical flow method |
US20160092726A1 (en) * | 2014-09-30 | 2016-03-31 | Xerox Corporation | Using gestures to train hand detection in ego-centric video |
CN105550699A (zh) * | 2015-12-08 | 2016-05-04 | 北京工业大学 | Video recognition and classification method based on CNN fusion of spatio-temporal salient information |
US20170206405A1 (en) * | 2016-01-14 | 2017-07-20 | Nvidia Corporation | Online detection and classification of dynamic gestures with recurrent convolutional neural networks |
CN105787458A (zh) * | 2016-03-11 | 2016-07-20 | 重庆邮电大学 | Infrared behavior recognition method based on adaptive fusion of hand-crafted features and deep learning features |
CN106991372A (zh) * | 2017-03-02 | 2017-07-28 | 北京工业大学 | Dynamic gesture recognition method based on a hybrid deep learning model |
Non-Patent Citations (6)
Title |
---|
KAREN SIMONYAN et al.: "Two-Stream Convolutional Networks", 《ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014)》 * |
KAREN SIMONYAN et al.: "Two-Stream Convolutional Networks", 《ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014)》, 8 December 2014 (2014-12-08) * |
PAVLO MOLCHANOV et al.: "Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》, pages 4207 - 4215 * |
PAVLO MOLCHANOV et al.: "Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》, 12 December 2016 (2016-12-12), pages 4209 * |
张轩阁 et al.: "Micro-expression recognition based on global optical flow features", 《Pattern Recognition and Artificial Intelligence》, vol. 29, no. 8, pages 760 - 768 * |
谢剑斌 et al.: "《Visual Perception and Intelligent Video Surveillance》", vol. 1, 31 March 2012, National University of Defense Technology Press, pages: 276 - 280 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
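CN115809006A (zh) * | 2022-12-05 | 2023-03-17 | 北京拙河科技有限公司 | Method and device for controlling manual instructions through a picture |
CN115809006B (zh) * | 2022-12-05 | 2023-08-08 | 北京拙河科技有限公司 | Method and device for controlling manual instructions through a picture |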
Also Published As
Publication number | Publication date |
---|---|
KR20200036002A (ko) | 2020-04-06 |
WO2019023921A1 (zh) | 2019-02-07 |
BR112020001729A2 (pt) | 2020-07-21 |
US11450146B2 (en) | 2022-09-20 |
EP3651055A1 (en) | 2020-05-13 |
US20200167554A1 (en) | 2020-05-28 |
KR102364993B1 (ko) | 2022-02-17 |
EP3651055A4 (en) | 2020-10-21 |
BR112020001729A8 (pt) | 2023-04-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110959160A (zh) | Gesture recognition method, apparatus, and device | |
US10168794B2 (en) | Motion-assisted visual language for human computer interfaces | |
Biswas et al. | Gesture recognition using microsoft kinect® | |
EP2864933B1 (en) | Method, apparatus and computer program product for human-face features extraction | |
US8442269B2 (en) | Method and apparatus for tracking target object | |
US20160300100A1 (en) | Image capturing apparatus and method | |
CN106648078B (zh) | Multimodal interaction method and system applied to an intelligent robot | |
US20230082789A1 (en) | Methods and systems for hand gesture-based control of a device | |
CN103353935A (zh) | 3D dynamic gesture recognition method for a smart home system | |
CN103105924A (zh) | Human-computer interaction method and apparatus | |
KR102434397B1 (ko) | Apparatus and method for real-time multi-object tracking based on global motion | |
US20220291755A1 (en) | Methods and systems for hand gesture-based control of a device | |
CN113194253A (zh) | Photographing method and apparatus for removing image reflections, and electronic device | |
JP2016099643A (ja) | Image processing device, image processing method, and image processing program | |
KR101146417B1 (ko) | Apparatus and method for tracking important faces in an unmanned surveillance robot | |
JP2012068948A (ja) | Face attribute estimation device and method | |
KR101909326B1 (ko) | Method and system for controlling a user interface using a triangular mesh model according to changes in facial motion | |
CN114613006A (zh) | Long-distance gesture recognition method and apparatus | |
US11847823B2 (en) | Object and keypoint detection system with low spatial jitter, low latency and low power usage | |
Zhang et al. | Eye detection for electronic map control application | |
KR20230166840A (ko) | Method for identifying an object's movement path using artificial intelligence | |
KR101517932B1 (ko) | Ultra-wide-angle stereo camera system apparatus and method for hand gesture recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||