CN110046544A - Digital gesture identification method based on convolutional neural networks - Google Patents

Digital gesture identification method based on convolutional neural networks

Info

Publication number
CN110046544A
Authority
CN
China
Prior art keywords
image
convolutional neural
neural networks
gestures
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910147442.1A
Other languages
Chinese (zh)
Inventor
张国山
赵阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-02-27
Filing date
2019-02-27
Publication date
2019-07-23
Application filed by Tianjin University
Priority to CN201910147442.1A
Publication of CN110046544A
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a digit gesture recognition method based on a convolutional neural network (CNN), comprising the following steps: acquiring gesture images for the 10 digit classes with a Kinect depth camera and filtering the gesture images; building a sample set for each digit gesture from the filtered images, as follows: applying morphological preprocessing to the filtered gesture images, labeling the images by class to obtain the sample set for each digit gesture, and dividing the samples into a training set and a test set; constructing the convolutional neural network; inputting the training sample set, extracting image features, and performing classification training; and recognizing the images in the test data set with the trained convolutional neural network.

Description

Digital gesture identification method based on convolutional neural networks
Technical field
The present invention relates to the fields of deep learning and image processing, and in particular to gesture recognition based on a convolutional neural network (CNN).
Background art
Gesture recognition has long been a popular research topic. Digit gesture recognition must address data acquisition, image processing and selection, the choice of input sample representation, the choice of pattern recognition classifier, and supervised training of the recognizer on a sample set.
Gestures are an indispensable part of person-to-person communication, and gesture recognition technology has opened a new era of interaction between humans and machines, devices, or computers. With the development of science and technology, gesture recognition has moved from the data glove era, which relied on external auxiliary equipment, to the stage of pattern classification based on computer vision. Currently popular vision-based gesture recognition is divided into three stages: segmentation, feature extraction, and recognition. Gesture segmentation is the basis of gesture recognition; its goal is to segment the gesture from an image with a complex background. Because skin color shows a certain clustering behavior in color space, most current gesture segmentation methods rely on skin color features (YUV, HSV, YCbCr, etc.) or geometric features (such as an ellipse model or a graph model). Current research work carries out gesture detection and recognition separately; the main research direction is how to apply newer techniques such as mathematical morphology, neural network algorithms, and genetic algorithms to gesture recognition as recognition technology continues to be refined.
The main difficulties of gesture recognition research are as follows. In the data processing stage, the system separates the video captured by the camera into frames, isolates single gesture images from the video frames, and applies preprocessing such as smoothing and sharpening to the data. It then detects whether a gesture is present; if one is detected, the gesture is separated from the background. In the gesture analysis stage, features of the gesture are detected and the corresponding feature parameters are estimated with the selected gesture model. In the recognition and classification stage, after feature extraction and model parameter estimation, a classification algorithm assigns points or trajectories in the parameter space to different subspaces, and the recognized information is finally converted into a specific meaning for use in the actual application. Illumination and pixel quality both affect the accuracy of the recognition system to varying degrees. A Kinect depth image, by contrast, is independent of ambient lighting and shadows, and its pixels clearly express the surface geometry of the scene. The Kinect depth camera is a motion-sensing input device made by Microsoft for its Xbox 360 game console and Windows PCs. As a motion-sensing peripheral, it is in effect a 3D motion-sensing camera that uses a spatial positioning technology (Light Coding). Kinect has three lenses: the middle lens is an RGB color camera, and the left and right lenses are an infrared emitter and an infrared CMOS camera, which together form the 3D depth sensor. It supports real-time motion capture, image recognition, microphone input, speech recognition, and community interaction, allowing players to interact with the Xbox 360 through a natural user interface using body gestures and voice commands.
As an important component of intelligent computer interfaces, digit gesture recognition is of great significance. The technology can greatly improve the efficiency of computer use and is the best future input method for fields such as office automation, smart homes, and robot interaction control. At present, the problems in gesture recognition lie mainly in three aspects: 1) the acquisition of data sets; 2) the preprocessing of gesture images, where accurately detecting and separating the hand pose from the picture is a major issue in image detection; 3) combining digit gesture recognition with neural networks to achieve the best recognition performance.
Summary of the invention
The object of the present invention is to provide an automatic digit gesture recognition method based on a convolutional neural network with better recognition performance. The technical solution is as follows:
A digit gesture recognition method based on a convolutional neural network, comprising the following steps:
(1) Acquire gesture images for the 10 digit classes with a Kinect depth camera and filter the gesture images. Build the sample set for each digit gesture from the filtered images as follows: apply morphological preprocessing to the filtered gesture images; label the images by class to obtain the sample set for each digit gesture, and divide the samples into a training set and a test set;
(2) Construct the convolutional neural network (CNN):
(2a) Import the digit gesture images of each class into the convolutional neural network as the input layer, with size [320, 320, 3, 59];
(2b) Construct an 8-layer convolutional neural network and apply operations such as convolution, down-sampling, and pooling to the pixels of the input image to obtain the feature maps of each layer;
(2c) Use the output of each layer as the input of the next layer; after the 8 layers, the result converges at the fully connected (fc) layer and is output through the softmax classifier of the output layer;
(3) Input the training sample set, extract image features, and perform classification training:
(3a) Classify the image feature vectors with the softmax classifier;
(3b) Train on the training sample set with the convolutional neural network algorithm to obtain the trained model as a .mat file;
(4) Recognize the images in the test data set with the trained convolutional neural network.
Preferably, the filtering in step (1) is performed as follows: using a depth image filtering algorithm based on a joint bilateral filter, the depth image and the color image of the gesture captured by the Kinect at the same instant are taken as input; Gaussian kernel functions are used to compute the spatial distance weight of the depth image and the gray-level weight of the RGB color image; the two weights are multiplied to obtain the joint filter weight, from which the joint bilateral filter is designed; and the filter is convolved with the noisy image to filter the Kinect depth image.
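The joint bilateral filtering described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the patented implementation: the function name, the window radius, the two sigma values, and the choice to treat a depth of 0 as a hole that is skipped in the average are all assumptions made for the example.

```python
import math

def joint_bilateral_filter(depth, guide, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Filter `depth` with spatial weights and gray-level weights from `guide`.

    depth, guide: equally sized 2D lists; `guide` is the grayscale version of
    the color image captured at the same instant. A depth of 0 is treated as
    a hole (assumption) and excluded from the weighted average.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w) or depth[yy][xx] == 0:
                        continue
                    # Spatial distance weight (Gaussian kernel)
                    ws = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    # Gray-level weight taken from the guide image (Gaussian kernel)
                    diff = guide[yy][xx] - guide[y][x]
                    wr = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                    # Joint weight is the product of the two weights
                    num += ws * wr * depth[yy][xx]
                    den += ws * wr
            out[y][x] = num / den if den > 0 else 0.0
    return out
```

Because the gray-level weight comes from the color image rather than from the noisy depth map, depth values are averaged only across pixels that look similar in the color image, which is what lets the filter smooth noise while keeping the hand contour sharp.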
The CNN-based gesture recognition of the present invention efficiently acquires depth images of gesture features, denoises them, and automatically recognizes and outputs the digit gesture images; the recognition accuracy can reach about 93%. The algorithm is reasonably robust to illumination changes, simple geometric deformation, and additive noise, and can be used in fields related to digit gesture recognition. With extension, the algorithm can also be used for automatic recognition of other gestures.
Brief description of the drawings
Fig. 1 is the flow chart of the algorithm of the present invention.
Fig. 2 shows the preprocessed data set.
Fig. 3 shows the overall training of the network.
Fig. 4 shows the recognition results for test set pictures.
Detailed description of the embodiments
The purpose of the present invention is to acquire a data set with a Kinect depth camera, preprocess and denoise it with image morphology, and then automatically recognize digit gestures with the constructed convolutional neural network, so as to meet practical requirements. The method mainly comprises the following steps:
(1) Obtain the sample set for each digit gesture:
(1a) Acquire the data set for the 10 digit classes;
(1b) Apply morphological preprocessing to the acquired images;
(1c) Label the images by class and divide the collected digit gesture data set into a training set and a test set;
(2) Construct the convolutional neural network (CNN):
(2a) Import the digit gesture images of each class into the convolutional neural network as the input layer, with size [320, 320, 3, 59];
(2b) Construct an 8-layer convolutional neural network and apply operations such as convolution, down-sampling, and pooling to the pixels of the input image to obtain the feature maps of each layer;
(2c) Use the output of each layer as the input of the next layer; after the 8 layers, the result converges at the fully connected (fc) layer and is output through the softmax classifier of the output layer;
(3) Input the training sample set, extract image features, and perform classification training:
(3a) Classify the image feature vectors with the softmax classifier;
(3b) Train on the training sample set with the convolutional neural network algorithm to obtain the trained model as a .mat file;
(4) Automatically recognize the pictures in the test data set with the trained convolutional neural network.
The test sample set is input into the trained convolutional neural network; that is, the trained model .mat file is loaded and used to test the test sample set, so that each digit gesture picture is recognized automatically and the test result is output.
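Step (1b) does not say which morphological operations are applied. One common choice for cleaning a binary hand mask is an opening (erosion followed by dilation), which removes isolated speckles while keeping the main region. The sketch below assumes a 3x3 square structuring element and a binary mask stored as nested lists; both are illustrative assumptions rather than details from the patent.

```python
def erode(img):
    """3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def dilate(img):
    """3x3 binary dilation; pixels outside the image count as background."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def opening(img):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(img))
```

Applied to a mask that contains a solid 3x3 block plus a single stray pixel, `opening` returns the block unchanged and drops the stray pixel.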
The specific steps of the invention are described below with reference to Fig. 1:
(1) Obtain the sample set for each digit gesture;
Images of 10 classes of digit gestures are collected as the data set, comprising 1050 images for each of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, for 10500 images in total. The images of each class are captured with the Kinect depth camera from different angles and under different lighting. Depth images captured by the Kinect generally contain noise and holes, and using them directly for recognition gives poor results. We therefore use a depth image filtering algorithm based on a joint bilateral filter, taking the depth image and the color image captured by the Kinect at the same instant as input. First, Gaussian kernel functions are used to compute the spatial distance weight of the depth image and the gray-level weight of the RGB color image; the two weights are then multiplied to obtain the joint filter weight, and the joint bilateral filter is designed with the fast Gauss transform in place of the Gaussian kernel function. Finally, the filter is convolved with the noisy image to filter the Kinect depth image. Then 10000 images are randomly selected as the training sample set and manually labeled by class, and the remaining 500 serve as the test sample set. The result is a training sample set of 10000 images and a test sample set of 500 images.
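The random 10000/500 split described above might look like the following sketch. The file naming scheme and the fixed seed are hypothetical, and the text does not say whether the split is stratified by class; a plain shuffle is assumed here.

```python
import random

def split_dataset(samples, n_train=10000, seed=0):
    """Shuffle a copy of the samples and cut it into training and test parts."""
    rng = random.Random(seed)      # fixed seed only to make the example repeatable
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical file names; each sample is (path, digit label), 1050 per digit.
samples = [(f"digit{d}_{i:04d}.png", d) for d in range(10) for i in range(1050)]
train, test = split_dataset(samples)
```

With 10 classes of 1050 images each, this yields 10000 training samples and 500 test samples with no overlap between the two sets.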
(2) Construct the convolutional neural network (CNN):
The digit gesture images of each class are imported into the convolutional neural network as the input layer, with size [320, 320, 3, 59]. An 8-layer convolutional neural network is constructed, and operations such as convolution, down-sampling, and pooling are applied to the pixels of the input image to obtain the feature maps of each layer. The output of each layer serves as the input of the next layer; after the 8 layers, the result converges at the fully connected (fc) layer and is output through the softmax classifier of the output layer;
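The text gives the input size [320, 320, 3, 59], which reads as height, width, channels, and batch size, but it does not list the kernel sizes or strides of the 8 layers. The sketch below only shows how the spatial size of a feature map follows from such hyperparameters; the layer values in the example are hypothetical and chosen purely for illustration.

```python
def out_size(n, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * pad - kernel) // stride + 1

def trace(size, layers):
    """Return the feature map size after each (kernel, stride, pad) layer."""
    sizes = [size]
    for kernel, stride, pad in layers:
        size = out_size(size, kernel, stride, pad)
        sizes.append(size)
    return sizes

# Hypothetical hyperparameters (kernel, stride, pad); not taken from the patent.
layers = [(11, 4, 0), (3, 2, 0), (5, 1, 2), (3, 2, 0)]
sizes = trace(320, layers)
```

A trace like this is a quick way to check that a chosen stack of layers reduces a 320x320 input to the 1x1 spatial size expected at the fully connected layer.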
The training sample set is input, image features are extracted, and classification training is performed. The image feature vectors are classified with the softmax classifier. The training sample set is trained with the convolutional neural network algorithm to obtain the trained model as a .mat file;
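As a brief illustration of the softmax classification step, the sketch below converts a vector of class scores into probabilities, computes the cross-entropy loss for a known label, and picks the predicted digit. The function names are illustrative, and cross-entropy is assumed as the training loss because it is the usual companion of softmax; the text does not name the loss.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def predict(scores):
    """Index of the class with the highest softmax probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)
```

For the 10 digit classes, `scores` would be the 10 outputs of the last fully connected layer for one image.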
The basic procedure of the convolutional neural network algorithm is as follows: randomly initialize the network weights and the neuron thresholds; perform forward propagation according to formula (1):
Compute the inputs and outputs of the hidden neurons and output neurons layer by layer, where E denotes the output error, d denotes the ground truth, and w_jk and v_ij denote the weights and thresholds of each layer.
Perform error back-propagation according to formula (2):
Here θ is the learning rate of the back-propagation algorithm (θ = 0.001 in the present invention); n is the number of input vector elements (n = 320*320*3*59 in the present invention); m is the number of hidden layer outputs (in the present invention m varies with the output of the convolutional layers); and l is the number of output layer outputs (l = 1*1*4096*59 in the present invention). The negative sign in the formula indicates gradient descent in weight space, that is, the weights change in the direction that decreases E. The weights and thresholds are corrected by this formula until the termination condition is met.
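Formulas (1) and (2) are not reproduced in this text. Based only on the symbol definitions given here (E, d, w_jk, v_ij, θ, n, m, l) and on the remark about the negative sign, a standard back-propagation form consistent with the description would be the following. This is an assumed reconstruction, not the formulas as filed, and the symbol o_k for the k-th network output is introduced here for illustration.

```latex
% Formula (1), assumed form: squared output error after the forward pass
E = \frac{1}{2} \sum_{k=1}^{l} \left( d_k - o_k \right)^2

% Formula (2), assumed form: gradient-descent corrections with learning rate \theta
\Delta w_{jk} = -\theta \, \frac{\partial E}{\partial w_{jk}}, \qquad
\Delta v_{ij} = -\theta \, \frac{\partial E}{\partial v_{ij}},
\qquad i = 1,\dots,n, \quad j = 1,\dots,m, \quad k = 1,\dots,l
```

Under this reading, each update moves a weight against the gradient of E, which matches the statement that the weights change in the direction that decreases E.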
(3) Automatically recognize the digit gesture images with the trained convolutional neural network.
The test sample set is input into the trained convolutional neural network; that is, the trained model .mat file is loaded and used to test the test sample set, so that each digit gesture picture is recognized automatically and the test result is output.
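One simple way to summarize the test result is an overall accuracy together with a confusion matrix over the 10 digits. The sketch below assumes the predictions and true labels are already available as lists of integers from 0 to 9; how they are produced from the .mat model is outside its scope.

```python
def evaluate(predictions, labels, n_classes=10):
    """Return (accuracy, confusion), where confusion[true][predicted] counts samples."""
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for predicted, true in zip(predictions, labels):
        confusion[true][predicted] += 1
    correct = sum(confusion[c][c] for c in range(n_classes))
    return correct / len(labels), confusion
```

The off-diagonal entries of the confusion matrix show which digit gestures are mistaken for one another, which a single accuracy figure hides.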
Compared with the prior art, the present invention has the following features and advantages:
First, the present invention applies a convolutional neural network to digit gesture recognition. The data set consists of images of 10 classes of digit gestures, with 1050 images for each of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, for 10500 images in total. The images of each class are captured with the Kinect depth camera from different angles and under different lighting, and morphological filtering is then applied to the 10500 captured pictures to remove noise. Experiments on the test set show that most gesture images are recognized correctly, as shown in Fig. 4. Table 2 compares the recognition accuracy of current traditional algorithms with that of the method of the present invention. As the table shows, the recognition accuracy of the method of the present invention is comparatively good.
Second, the present invention constructs parallel pooling layers. The benefit of this structure is that, when outputs of the same dimension are produced on the training data set, it effectively reduces the top-1 and top-5 error rates (top-1: the class with the highest probability is the correct answer; top-5: the correct answer is among the five classes with the highest probability). In the CNN structure, a feature extraction layer connects the input of each neuron to a local receptive field in the previous layer and extracts the local feature. Once a local feature has been extracted, its positional relationship to the other features is also determined, which helps the extraction of feature vectors.
Third, when acquiring digit gesture images, the present invention uses a Kinect depth image filtering algorithm based on a joint bilateral filter, which better preserves the relevant features of the original image and helps improve recognition accuracy.
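The top-1 and top-5 measures mentioned in the second point can be computed as in the sketch below. It assumes each test image has a list of per-class scores and an integer label; the function name is illustrative.

```python
def top_k_accuracy(score_rows, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)
```

The corresponding error rate is one minus this value, so lowering the top-5 error means the correct digit falls outside the five best guesses less often.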

Claims (2)

1. A digit gesture recognition method based on a convolutional neural network, comprising the following steps:
(1) Acquire gesture images for the 10 digit classes with a Kinect depth camera and filter the gesture images. Build the sample set for each digit gesture from the filtered images as follows: apply morphological preprocessing to the filtered gesture images; label the images by class to obtain the sample set for each digit gesture, and divide the samples into a training set and a test set;
(2) Construct the convolutional neural network (CNN):
(2a) Import the digit gesture images of each class into the convolutional neural network as the input layer, with size [320, 320, 3, 59];
(2b) Construct an 8-layer convolutional neural network and apply operations such as convolution, down-sampling, and pooling to the pixels of the input image to obtain the feature maps of each layer;
(2c) Use the output of each layer as the input of the next layer; after the 8 layers, the result converges at the fully connected (fc) layer and is output through the softmax classifier of the output layer;
(3) Input the training sample set, extract image features, and perform classification training:
(3a) Classify the image feature vectors with the softmax classifier;
(3b) Train on the training sample set with the convolutional neural network algorithm to obtain the trained model as a .mat file;
(4) Recognize the images in the test data set with the trained convolutional neural network.
2. The method according to claim 1, characterized in that the filtering in step (1) is performed as follows: using a depth image filtering algorithm based on a joint bilateral filter, the depth image and the color image of the gesture captured by the Kinect at the same instant are taken as input; Gaussian kernel functions are used to compute the spatial distance weight of the depth image and the gray-level weight of the RGB color image; the two weights are multiplied to obtain the joint filter weight, from which the joint bilateral filter is designed; and the filter is convolved with the noisy image to filter the Kinect depth image.
CN201910147442.1A 2019-02-27 2019-02-27 Digital gesture identification method based on convolutional neural networks Pending CN110046544A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910147442.1A CN110046544A (en) 2019-02-27 2019-02-27 Digital gesture identification method based on convolutional neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910147442.1A CN110046544A (en) 2019-02-27 2019-02-27 Digital gesture identification method based on convolutional neural networks

Publications (1)

Publication Number Publication Date
CN110046544A true CN110046544A (en) 2019-07-23

Family

ID=67274219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910147442.1A Pending CN110046544A (en) 2019-02-27 2019-02-27 Digital gesture identification method based on convolutional neural networks

Country Status (1)

Country Link
CN (1) CN110046544A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569852A (en) * 2019-09-10 2019-12-13 瑞森网安(福建)信息科技有限公司 Image identification method based on convolutional neural network
CN110795990A (en) * 2019-09-11 2020-02-14 中国海洋大学 Gesture recognition method for underwater equipment
CN111142399A (en) * 2020-01-09 2020-05-12 四川轻化工大学 Embedded intelligent home automation control test system based on computer
CN111767860A (en) * 2020-06-30 2020-10-13 阳光学院 Method and terminal for realizing image recognition through convolutional neural network
CN113792573A (en) * 2021-07-13 2021-12-14 浙江理工大学 Static gesture recognition method for wavelet transformation low-frequency information and Xception network
TWI760769B (en) * 2020-06-12 2022-04-11 國立中央大學 Computing device and method for generating a hand gesture recognition model, and hand gesture recognition device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296667A (en) * 2016-08-01 2017-01-04 乐视控股(北京)有限公司 Hand detection method and system
CN107358576A (en) * 2017-06-24 2017-11-17 天津大学 Depth map super resolution ratio reconstruction method based on convolutional neural networks
CN107742095A (en) * 2017-09-23 2018-02-27 天津大学 Chinese sign Language Recognition Method based on convolutional neural networks
CN109344701A (en) * 2018-08-23 2019-02-15 武汉嫦娥医学抗衰机器人股份有限公司 A kind of dynamic gesture identification method based on Kinect

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GONGFA LI等: ""Hand gesture recognition based on convolution neural network"", 《CLUSTER COMPUTING》 *
李知菲: ""基于联合双边滤波器的Kinect深度图像滤波算法"", 《计算机应用》 *
杨文斌等: ""基于卷积神经网络的手势识别方法"", 《安徽工程大学学报》 *
胡茗: ""基于CNN的手势姿态估计在手势识别中的应用"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
陈祖雪: ""基于深度卷积神经网络的手势识别研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569852A (en) * 2019-09-10 2019-12-13 瑞森网安(福建)信息科技有限公司 Image identification method based on convolutional neural network
CN110569852B (en) * 2019-09-10 2021-10-15 瑞森网安(福建)信息科技有限公司 Image identification method based on convolutional neural network
CN110795990A (en) * 2019-09-11 2020-02-14 中国海洋大学 Gesture recognition method for underwater equipment
CN110795990B (en) * 2019-09-11 2022-04-29 中国海洋大学 Gesture recognition method for underwater equipment
CN111142399A (en) * 2020-01-09 2020-05-12 四川轻化工大学 Embedded intelligent home automation control test system based on computer
TWI760769B (en) * 2020-06-12 2022-04-11 國立中央大學 Computing device and method for generating a hand gesture recognition model, and hand gesture recognition device
CN111767860A (en) * 2020-06-30 2020-10-13 阳光学院 Method and terminal for realizing image recognition through convolutional neural network
CN113792573A (en) * 2021-07-13 2021-12-14 浙江理工大学 Static gesture recognition method for wavelet transformation low-frequency information and Xception network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20190723