CN108537147A - Gesture recognition method based on deep learning - Google Patents

Gesture recognition method based on deep learning

Info

Publication number
CN108537147A
CN108537147A
Authority
CN
China
Prior art keywords
gesture
profile
dynamic
binaryzation
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810242638.4A
Other languages
Chinese (zh)
Other versions
CN108537147B (en)
Inventor
董训锋
陈镜超
李国振
马啸天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donghua University
National Dong Hwa University
Original Assignee
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donghua University filed Critical Donghua University
Priority to CN201810242638.4A priority Critical patent/CN108537147B/en
Publication of CN108537147A publication Critical patent/CN108537147A/en
Application granted granted Critical
Publication of CN108537147B publication Critical patent/CN108537147B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/113 Recognition of static hand signs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image

Abstract

The present invention provides a gesture recognition method based on deep learning, characterized by comprising the following steps: training a binarized convolutional neural network using a gesture training set and test set; segmenting the preprocessed original image on the basis of the color information reflected by skin color and extracting the gesture contour; judging the gesture instruction corresponding to the gesture contour using the trained binarized convolutional neural network; locating the start and stop of the dynamic gesture corresponding to a series of gesture contours, tracking the gesture trajectory using the TLD algorithm, correcting deviations during tracking using a Haar classifier, and then recognizing the dynamic gesture using an HMM algorithm. The method provided by the invention addresses the low recognition accuracy, poor stability, poor real-time performance and limited gesture vocabulary common in traditional gesture recognition.

Description

Gesture recognition method based on deep learning
Technical field
The present invention relates to a gesture recognition method based on deep learning, and belongs to the technical field of gesture recognition.
Background technology
The advent of the computer has profoundly influenced human production and daily life: on the one hand it has greatly improved the efficiency of information processing, and on the other hand it has driven the development of intelligent living. How to interact with computers efficiently and conveniently has therefore become a focus of research.
With the development of information technology, human-computer interaction (HCI) has become an important part of daily life. As an emerging mode of human-computer interaction, gesture recognition technology has broad application prospects in many fields: (1) Digital life and entertainment. For example, in 2008 Ericsson launched the smartphone R520m, which captures the user's gesture information through its built-in camera so that gestures can serve as the keyboard or touch screen of the phone interface, enabling control of the alarm clock and incoming calls. (2) Scientific and technological innovation. Space exploration and military engineering frequently involve hazardous environments, or special environments unsuitable for direct human control; in such cases a robot can be remotely operated by gestures to obtain the relevant information. (3) Intelligent transportation, for example autonomous driving. As early as 2010, Google publicly unveiled its driverless car, opening a new era of intelligent transportation.
Gesture recognition technology plays the following roles in the field of human-computer interaction:
(1) For the user, it makes products easier to use, saves the user time, and improves the user experience;
(2) For the product, it eliminates redundant operating instructions; the product need only provide brief guidance on the relevant common gestures.
Summary of the invention
The technical problem to be solved by the present invention is that traditional gesture recognition algorithms generally suffer from low recognition accuracy, poor stability, poor real-time performance and a limited gesture vocabulary.
To solve the above technical problem, the present invention provides a gesture recognition method based on deep learning, characterized by comprising the following steps:
Step 1: train a binarized convolutional neural network using a gesture training set and test set;
Step 2: after acquiring an original gesture image, preprocess the original gesture image to remove the influence of illumination on the original image;
Step 3: using the color information reflected by skin color, segment the preprocessed original image on the basis of that color information and extract the gesture contour;
Step 4: judge whether the gesture contour extracted in step 3 is the start or stop of a dynamic gesture; if so, the gesture contours extracted from the subsequent series of images form a dynamic gesture, go to step 6; if not, the gesture contour is a static gesture, go to step 5;
Step 5: judge the gesture instruction corresponding to the gesture contour using the trained binarized convolutional neural network;
Step 6: locate the start and stop of the dynamic gesture corresponding to the series of gesture contours, track the gesture trajectory using the TLD algorithm, correct deviations during tracking using a Haar classifier, and then recognize the dynamic gesture using an HMM algorithm.
Preferably, in step 2, the preprocessing comprises brightness correction and light compensation;
during brightness correction, highlighted regions in the original gesture image are corrected with a modified exponential transform, dark regions in the original gesture image are corrected with a parameterized logarithmic transform, and other regions are left uncorrected;
light compensation is performed on the basis of a dynamic threshold: the original gesture image is transformed into the YCbCr color space using the total-reflection (perfect reflector) theory, and the set of points with the largest Y components in the YCbCr color space image is taken as the white reference points.
Preferably, in step 3, the original image is segmented using a skin color segmentation algorithm based on the YCbCr color space.
The method provided by the invention addresses the low recognition accuracy, poor stability, poor real-time performance and limited gesture vocabulary common in traditional gesture recognition.
Owing to the above technical solution, the present invention has the following advantages and positive effects compared with the prior art:
The present invention improves on conventional gesture recognition algorithms. It uses an improved illumination compensation strategy so that the original image is easier to process, an improved skin color model to segment gestures and improve segmentation accuracy, an improved deep convolutional network to classify static gestures and raise the static gesture recognition rate, and improved TLD and HMM algorithms to track and recognize dynamic gestures, improving the robustness, real-time performance and recognition rate of the gesture system.
Description of the drawings
Fig. 1 is a schematic diagram of the system structure of the gesture recognition system based on deep learning according to the present invention;
Fig. 2 is a structural diagram of the binarized convolutional neural network of the present invention;
Fig. 3 is a framework diagram of the TLD algorithm;
Fig. 4 is a detailed flowchart of the TLD algorithm;
Fig. 5 is a flowchart of the improved TLD algorithm;
Fig. 6 is a flowchart of the system software design.
Detailed description of the embodiments
The present invention is further explained below with reference to specific embodiments. It should be understood that these embodiments are merely illustrative of the present invention and do not limit its scope. In addition, after reading the teachings of the present invention, those skilled in the art may make various changes or modifications to the present invention, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.
An embodiment of the present invention relates to a gesture recognition method based on deep learning which, as shown in Fig. 1, comprises the following steps:
Step 1: train a binarized convolutional neural network using a gesture training set and test set;
Step 2: after acquiring an original gesture image, preprocess the original gesture image to remove the influence of illumination on the original image;
Step 3: using the color information reflected by skin color, segment the preprocessed original image on the basis of that color information and extract the gesture contour;
Step 4: judge whether the gesture contour extracted in step 3 is the start or stop of a dynamic gesture; if so, the gesture contours extracted from the subsequent series of images form a dynamic gesture, go to step 6; if not, the gesture contour is a static gesture, go to step 5;
Step 5: judge the gesture instruction corresponding to the gesture contour using the trained binarized convolutional neural network;
Step 6: locate the start and stop of the dynamic gesture corresponding to the series of gesture contours, track the gesture trajectory using the TLD algorithm, correct deviations during tracking using a Haar classifier, and then recognize the dynamic gesture using an HMM algorithm.
Each of the above steps is described in further detail below with reference to the embodiment:
1. The preprocessing of the original gesture image in step 2 mainly comprises brightness correction based on exponential and logarithmic transforms, and light compensation based on a dynamic threshold. Specifically:
(1) Brightness correction based on exponential and logarithmic transforms.
The exponential transform corrects only the overly bright regions of an image well, while the logarithmic transform corrects the dark regions well. The two are combined into a light compensation strategy for the human hand, as shown in formula (1): highlighted regions are corrected with a modified exponential transform, dark regions are corrected with a parameterized logarithmic transform, and other regions are left uncorrected.
The parameters in formula (1) are as follows:
g(x, y) denotes the corrected image; f(x, y) denotes the original gesture image; a denotes the highlight adjustment coefficient, a = 0 in this embodiment; b denotes the average brightness of the image, b = 120/log T1 in this embodiment; c denotes a positive constant obtained by experimental debugging, c = T2 in this embodiment; d denotes a positive constant obtained by experimental debugging, d = 1/(255 − T2) in this embodiment; T1 denotes the lower light threshold under dark illumination, T1 = 115 in this embodiment; T2 denotes the upper light threshold under bright illumination, T2 = 135 in this embodiment.
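This piecewise correction can be sketched in Python as follows. The exact exponential and logarithmic expressions of formula (1) are not reproduced above, so the curve shapes below are assumptions; the thresholds T1, T2 and the coefficients b, c, d follow the values given in this embodiment (the highlight adjustment coefficient a = 0 is omitted since it vanishes here).

```python
import numpy as np

# Hedged sketch of the brightness correction of formula (1): log correction
# for dark pixels, a compressing exponential correction for bright pixels,
# mid-range pixels unchanged. The curve shapes are assumed; the constants
# follow the embodiment (T1 = 115, T2 = 135, b = 120/log T1, c = T2,
# d = 1/(255 - T2)).
T1, T2 = 115.0, 135.0
b = 120.0 / np.log(T1)
c = T2
d = 1.0 / (255.0 - T2)

def correct_brightness(f: np.ndarray) -> np.ndarray:
    """f: grayscale image as a uint8 array; returns the corrected image g."""
    f = f.astype(np.float64)
    g = f.copy()
    dark, bright = f < T1, f > T2
    g[dark] = b * np.log(1.0 + f[dark])          # parameterized log transform
    g[bright] = c + (255.0 - c) * (1.0 - np.exp(-d * (f[bright] - T2)))
    return np.clip(g, 0, 255).astype(np.uint8)
```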
(2) Light compensation based on a dynamic threshold.
Based on the total-reflection (perfect reflector) theory, the image is transformed into the YCbCr color space, and the set of points with the largest Y components in the YCbCr color space is taken as the white reference points. The detailed flow is as follows:
Assume the original gesture image is f(x, y) with size m × n. Then:
Step 1: transform the original gesture image f(x, y) from the RGB color space to the YCbCr color space using formula (2), the standard conversion:
Y = 0.299R + 0.587G + 0.114B
Cb = −0.1687R − 0.3313G + 0.5B + 128
Cr = 0.5R − 0.4187G − 0.0813B + 128 (2)
Step 2: obtain the white reference points.
(a) Divide the transformed image into M × N blocks; in this embodiment, M = 3, N = 4;
(b) For each block, compute the average values M_b and M_r of the C_b and C_r components in the YCbCr space;
(c) Use M_b and M_r to compute the mean absolute errors D_b and D_r of the C_b and C_r components of each block, using formula (3):
D_b = (1/sum) Σ |C_b(i, j) − M_b|, D_r = (1/sum) Σ |C_r(i, j) − M_r| (3)
In formula (3), C_b(i, j) denotes the offset of the B component of each pixel relative to the brightness, C_r(i, j) denotes the offset of the R component relative to the brightness, and sum denotes the total number of pixels in the current block.
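The flow up to the block statistics can be sketched as follows. Since the text stops after step (c), the selection of the white reference points and the per-channel gains in this Python sketch follow the usual dynamic-threshold white-balance procedure and are assumptions.

```python
import numpy as np

def rgb_to_ycbcr(img):
    """Standard BT.601 full-range RGB-to-YCbCr conversion (formula (2))."""
    img = img.astype(np.float64)
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    Y = 0.299 * R + 0.587 * G + 0.114 * B
    Cb = -0.1687 * R - 0.3313 * G + 0.5 * B + 128.0
    Cr = 0.5 * R - 0.4187 * G - 0.0813 * B + 128.0
    return Y, Cb, Cr

def light_compensate(img, M=3, N=4):
    """Dynamic-threshold light compensation; img is an (h, w, 3) RGB array."""
    Y, Cb, Cr = rgb_to_ycbcr(img)
    h, w = Y.shape
    mask = np.zeros((h, w), dtype=bool)
    for bi in range(M):
        for bj in range(N):
            ys = slice(bi * h // M, (bi + 1) * h // M)
            xs = slice(bj * w // N, (bj + 1) * w // N)
            cb, cr = Cb[ys, xs] - 128.0, Cr[ys, xs] - 128.0
            Mb, Mr = cb.mean(), cr.mean()          # per-block means
            Db = np.abs(cb - Mb).mean()            # mean absolute errors,
            Dr = np.abs(cr - Mr).mean()            # formula (3)
            # near-neutral pixels become white-point candidates (assumed rule)
            mask[ys, xs] = (np.abs(cb - Mb) < 1.5 * Db) & \
                           (np.abs(cr - Mr) < 1.5 * Dr)
    thr = np.percentile(Y[mask], 90)               # brightest 10% (assumed)
    white = mask & (Y >= thr)
    gains = 255.0 / img[white].mean(axis=0)        # per-channel RGB gains
    return np.clip(img.astype(np.float64) * gains, 0, 255).astype(np.uint8)
```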
2. When the preprocessed original image is segmented on the basis of color information in step 3, a skin color segmentation algorithm based on the YCbCr color space is used. Specifically:
The YCbCr color space is also known as the YUV color space: Y denotes luminance, and Cr and Cb denote chrominance and saturation, where Cr reflects the difference between the red part of the RGB input signal and the brightness value of the RGB signal, and Cb reflects the difference between the blue part of the RGB input signal and the brightness value of the RGB signal. The conversion from the RGB color space to YCrCb is given by formula (4) (the same standard conversion as formula (2)).
Through repeated experiments, the basic parameter ranges are as follows:
77 ≤ Cb ≤ 127 AND 132 ≤ Cr ≤ 172 (5)
However, formula (5) covers too wide a skin color range; the interval given is too large, and therefore readily admits interference from orange or brown objects. Aiming at the distinctive features of yellow skin tones, the present invention adjusts these ranges through repeated debugging to obtain the tightened values of formula (6), which effectively exclude the interference of skin-colored objects.
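A minimal Python sketch of the segmentation follows. It uses the published base ranges of formula (5); the tightened ranges of formula (6) are not reproduced above, so in practice the constants below would be narrowed accordingly.

```python
import numpy as np

def skin_mask(img):
    """Binary skin mask from the Cb/Cr thresholds of formula (5);
    img is an (h, w, 3) RGB array."""
    img = img.astype(np.float64)
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    Cb = -0.1687 * R - 0.3313 * G + 0.5 * B + 128.0   # formula (4)
    Cr = 0.5 * R - 0.4187 * G - 0.0813 * B + 128.0
    return (Cb >= 77) & (Cb <= 127) & (Cr >= 132) & (Cr <= 172)
```

The gesture contour can then be extracted from this binary mask, for example with a standard contour-following routine such as OpenCV's findContours.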
3. The binarized convolutional neural network in step 1 is a binary convolutional neural network (BCNN) built on the basis of MPCNN. Specifically:
Current deep convolutional neural network algorithms share a common defect: enormous computational cost. Optimization of network computation therefore focuses on two aspects. Here, on the basis of the MPCNN gesture classification method, a gesture classification method using a binary convolutional neural network (BCNN) is proposed: the neural network is modified with a binarization approximation strategy to reduce its consumption of computing resources. A binarized network reduces computing resource consumption in two main ways: first, the original double-precision weights are represented by binarized approximate weights, reducing the memory occupied by the network during computation; second, the inputs and weights involved in the multiplications that dominate the computation of each layer are replaced by their binarized approximations, so that these multiplications can be simplified to additions and subtractions or even bit operations. The transformation covers both the convolution blocks and the fully connected blocks.
(1) Binarization of the convolution block.
The binarization approximation of the convolutional neural network is carried out as follows:
First, during forward propagation, the weight matrix w of the convolutional network is binarized according to formula (7) to obtain w_b, while the original weight matrix w is retained:
w_b = sign(w) (7)
In formula (7), w_b is the weight matrix obtained after the binarization approximation; c_f, w_f and h_f denote the number, width and height of the convolution kernels, with w ∈ R^(c_f × w_f × h_f). In the standard sign function, sign(w) = 0 when w = 0; here, to achieve binarization, no third value is allowed, so sign(w) = 1 is stipulated when w = 0.
Second, a binarization activation layer is added before each layer to binarize the node values, replacing the original ReLU activation layer, as shown in formula (8):
X_b^(i) = L(X^(i−1)) = sign(X^(i−1)) (8)
In formula (8), X_b^(i) is the binarized input of the i-th layer of the network, with X ∈ R^(c × w × h), where c, w and h denote the number of channels, width and height of the input image; L(X^(i−1)) is the value produced by the binarization activation layer of the i-th layer; X^(i−1) denotes the input of the (i−1)-th layer of the binarized network. The sign function in formula (8) is the same modified sign function as in formula (7). Finally, the weights w_b are used for the convolution operation in the binarized convolutional layer, as shown in formula (9):
L_b(X_b) = X_b ⊛ w_b (9)
In formula (9), L_b(X_b) is the layer function of the binarized network; ⊛ denotes the convolution operation; X_b and w_b are obtained from formulas (8) and (7) respectively.
The structure of the convolution block also requires some adjustment: the BatchNorm normalization layer and the binarization activation layer are placed before the convolution operation, to prevent the output of the binarization activation layer from becoming mostly 1 after passing through the max-pooling layer. The specific network structure is shown in Fig. 2.
The back-propagation process during training is as follows: the last layer computes the gradient, and from the penultimate layer down to the first layer the gradients of the node values and of the weights are propagated backwards layer by layer; the floating-point weights w retained before binarization are then updated to obtain w_u, and the relaxation (clipping) operation of formula (10) is applied.
In formula (10), w_u denotes the floating-point weights after the update during forward propagation; σ(w_u) denotes the probability that the weight w_u > 0; clip() denotes the max/min clipping function.
(2) Binarization of the fully connected block.
The binarization of the fully connected block is almost identical to that of the convolution block, except that the binarized convolutional layer is replaced by a binarized fully connected layer and the max-pooling layer is removed. The binarized fully connected layer is computed as shown in formula (11):
L_b(X_b) = w_b X_b (11)
In formula (11), L_b(X_b) is the layer function of the binarized fully connected layer; X_b and w_b are obtained from formulas (8) and (7) respectively. The binarized fully connected layer omits the bias b.
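One convolution block of the structure just described can be sketched in PyTorch as follows. The BatchNorm-before-activation ordering and the sign(0) := 1 convention follow the text; the straight-through gradient and the layer sizes are assumptions. After each optimizer step, the retained real-valued weights would additionally be clipped to [−1, 1] in the spirit of formula (10).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinActive(torch.autograd.Function):
    """Binarization activation of formula (8), with sign(0) defined as 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        out = torch.sign(x)
        out[out == 0] = 1.0                      # the text stipulates sign(0) = 1
        return out

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # straight-through estimator (assumed): pass gradients where |x| <= 1
        return grad_out * (x.abs() <= 1).float()

class BinConvBlock(nn.Module):
    """BatchNorm -> binarization activation -> binarized convolution."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.bn = nn.BatchNorm2d(c_in)           # BN placed before activation
        self.conv = nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False)

    def forward(self, x):
        xb = BinActive.apply(self.bn(x))         # formula (8)
        wb = BinActive.apply(self.conv.weight)   # formula (7); real w retained
        return F.conv2d(xb, wb, padding=self.conv.padding[0])  # formula (9)
```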
4. In step 6, the gesture trajectory is tracked with the TLD algorithm, deviations during tracking are corrected with a Haar classifier, and the dynamic gesture is then recognized with the HMM algorithm. Specifically:
4.1. The TLD algorithm framework consists of three parts: tracking, learning and detection, as shown in Fig. 3.
In the algorithm framework, the three parts cooperate and compensate for one another to accomplish object tracking. The tracking module presupposes that the moving object is not fast, that the object does not undergo large displacements between adjacent frames, and that the tracked target always remains within the camera's field of view; the moving target is estimated on this basis, and if the target disappears from the field of view, tracking fails. The detection module presupposes that the video frames do not interfere with each other; using the model detected and learned previously, the detection algorithm searches for the target in every frame and marks the regions where the target may appear. When the detection module makes an error, the learning module evaluates the error according to the results obtained by the tracking module, generates training samples, and updates the target model of the detection module and the key feature points of the tracking module so as to avoid similar errors. The detailed flowchart of the TLD algorithm is shown in Fig. 4.
The TLD algorithm offers good real-time target tracking, and when the target is occluded or leaves the camera's field of view and later reappears, it can still be recognized and tracked. However, the algorithm requires the tracked target to be selected manually with the mouse at initialization, which hinders automated target tracking; meanwhile, the LBP features used in the detection module, although simple to compute and therefore easy to run in real time, produce position deviations during tracking that cause tracking to fail. Therefore, on the basis of the original TLD algorithm and in view of the characteristics of static gesture recognition and gesture tracking, this system improves the algorithm as follows:
To solve the problem that a target region must be selected manually at algorithm initialization, the static gesture recognition database is added to the detection module: when a gesture matching the gesture database appears in a video frame, the TLD tracking algorithm is initialized automatically. Meanwhile, since a trained static gesture database is used, the learning module of the original TLD algorithm can be removed; when the user's gesture changes, it is only necessary to search the video frames again for a gesture present in the gesture database and then reinitialize the TLD algorithm. The improved TLD algorithm flow is shown in Fig. 5.
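A minimal Python sketch of this auto-initialization follows. Here detect_gesture() is a hypothetical stand-in for the static gesture detector backed by the gesture database, and the TLD tracker is the one shipped in opencv-contrib (cv2.legacy); both the function name and the re-detection policy are assumptions.

```python
import cv2

def track_gesture(video_path, detect_gesture):
    """Yield gesture bounding boxes; detect_gesture(frame) returns an
    (x, y, w, h) box when a database gesture is found, else None."""
    cap = cv2.VideoCapture(video_path)
    tracker = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if tracker is None:
            box = detect_gesture(frame)          # match against gesture database
            if box is not None:
                tracker = cv2.legacy.TrackerTLD_create()
                tracker.init(frame, box)         # auto-initialization, no mouse
            continue
        ok, box = tracker.update(frame)
        if ok:
            yield box                            # trajectory point for the HMM
        else:
            tracker = None                       # gesture changed or lost: re-detect
    cap.release()
```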
4.2. Correcting deviations during tracking with a Haar classifier.
Building a Haar classifier involves two main steps: extracting Haar features and training the classifier. Haar features mainly include center features, linear features, edge features and diagonal features. To obtain the final Haar classifier, the present invention trains with an improved Adaboost algorithm: different weak classifiers are first trained from the Haar features extracted from the samples, and these weak classifiers are then combined into the final strong classifier, i.e. the Haar classifier needed here.
The implementation of the improved Adaboost algorithm is as follows:
Assume X is the sample space and Y is the set of sample class labels. For a typical two-class problem, Y = {0, 1}. Let S = {(x_i, y_i) | i = 1, 2, 3, ..., m} be the labeled training sample set, where x_i ∈ X and y_i ∈ Y, and assume a total of T iterations are performed to reach the final target.
Step 1: initialize the weights of the m samples:
D_1(i) = 1/m, i = 1, 2, ..., m (12)
where D_t(i) denotes the weight of sample (x_i, y_i) in the t-th iteration.
Step 2: for each t = 1, 2, 3, ..., T, compute:
(a) For each feature f of sample x, train a weak classifier h(x, f, p, θ):
h(x, f, p, θ) = 1 if p·f(x) < p·θ, and 0 otherwise (13)
In formula (13), θ denotes the threshold of the weak classifier corresponding to f, and p adjusts the direction of the inequality. Compute the classification error rate ε_f of the weak classifier for every feature, weighted by q_i:
ε_f = Σ_i q_i |h_t(x, f, p, θ) − y_i| (14)
In formula (14), y_i denotes an element of the sample class label space and q_i denotes the weight of the i-th training sample.
(b) Pick the best weak classifier h_t, the one with the minimum error rate ε_t:
ε_t = min_{f, p, θ} Σ_i q_i |h_t(x, f, p, θ) − y_i| (15)
(c) Update the sample weights using the best weak classifier:
D_{t+1}(i) = D_t(i) β_t^(1 − e_i) (16)
β_t = ε_t/(1 − ε_t) (17)
In formula (16), D_{t+1}(i) denotes the probability value of training sample i in round t + 1; D_{t+1} is related to D_t by iteration, so D_{t+1} can be updated from D_t.
In formula (17), β_t denotes the normalization constant.
If sample x_i is correctly classified, then e_i = 0; otherwise e_i = 1.
Step 3: form the final Haar classifier C(x):
C(x) = 1 if Σ_{t=1}^{T} α_t h_t(x) ≥ (1/2) Σ_{t=1}^{T} α_t, and 0 otherwise (18)
α_t = log(1/β_t) (19)
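One boosting round of this procedure can be sketched in numpy as follows; the exhaustive threshold search is the standard Viola-Jones construction and is assumed here, while the error, β_t and weight-update rules follow formulas (14)-(17).

```python
import numpy as np

def adaboost_round(Fv, y, q):
    """One round: Fv is an (m, n_features) matrix of Haar feature values,
    y the {0, 1} labels, q the current sample weights (summing to 1)."""
    m, n = Fv.shape
    best = (None, None, None, np.inf)            # (f, p, theta, eps)
    for f in range(n):
        for theta in np.unique(Fv[:, f]):
            for p in (+1, -1):
                h = (p * Fv[:, f] < p * theta).astype(float)  # formula (13)
                eps = np.sum(q * np.abs(h - y))               # formula (14)
                if eps < best[3]:
                    best = (f, p, theta, eps)                 # formula (15)
    f, p, theta, eps = best
    beta = eps / (1.0 - eps)                                  # formula (17)
    h = (p * Fv[:, f] < p * theta).astype(float)
    e = (h != y).astype(float)                   # e_i = 0 iff classified correctly
    q = q * beta ** (1.0 - e)                    # formula (16)
    return (f, p, theta, np.log(1.0 / beta)), q / q.sum()     # alpha_t, new q
```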
4.3. Dynamic gesture trajectory recognition based on HMM.
In the present invention, dynamic gesture trajectories are recognized with a Hidden Markov Model; the recognition process corresponds to the three problems solved with Hidden Markov Models:
(1) The estimation problem.
Given a Hidden Markov Model λ = (π, A, B) and an observation sequence O = (o_1, o_2, ..., o_T) generated by the model, compute the likelihood P(O | λ) of the observation sequence O. An efficient algorithm for this problem is the forward-backward recursion algorithm.
The forward variable is defined as:
α_t(i) = P(o_1, o_2, ..., o_t, q_t = θ_i | λ), 1 ≤ t ≤ T (19)
In formula (19), P(·) denotes the likelihood of the observation sequence; o_1, o_2, ..., o_t denotes the observation sequence; q_t denotes the state at time t; θ_i denotes a system state value; λ denotes the Hidden Markov Model; T denotes the total observation time; t denotes the time index, taking values from 1 to T.
Write b_j(o_t) = b_jk when o_t = v_k, where b_j(o_t) denotes an entry of the observation probability matrix, b_jk denotes the system observation matrix at any time t, and v_k denotes the k-th observation symbol. The steps of the forward algorithm are:
Initialization:
α_1(i) = π_i b_i(o_1), 1 ≤ i ≤ N (20)
In formula (20), α_1(i) denotes the probability of observing o_1 at time 1 with the hidden state being θ_i; π_i denotes the initial probability distribution.
Recursion:
α_{t+1}(j) = [Σ_{i=1}^{N} α_t(i) a_ij] b_j(o_{t+1}), 1 ≤ t ≤ T − 1, 1 ≤ j ≤ N (21)
In formula (21), α_{t+1}(j) denotes the probability of being in hidden state θ_j at time t + 1, and a_ij denotes the system state transition matrix at any time t.
Termination, computing P(O | λ):
P(O | λ) = Σ_{i=1}^{N} α_T(i)
where P(O | λ) denotes the likelihood of generating the observation sequence O under the current model λ. The backward variable is defined as:
β_t(i) = P(o_{t+1}, o_{t+2}, ..., o_T | q_t = θ_i, λ), 1 ≤ t ≤ T (22)
In formula (22), β_t(i) denotes the backward probability at time t used in computing P(O | λ).
The steps of the backward algorithm are:
Initialization:
β_T(i) = 1, 1 ≤ i ≤ N (23)
Recursion:
β_t(i) = Σ_{j=1}^{N} a_ij b_j(o_{t+1}) β_{t+1}(j), t = T − 1, T − 2, ..., 1, 1 ≤ i ≤ N (24)
Computing P(O | λ): using the forward algorithm for the first half of the computation, say the period 0 to t, and the backward algorithm for the second half, the period t to T, the probability can be obtained as:
P(O | λ) = Σ_{i=1}^{N} α_t(i) β_t(i)
(2) The decoding problem.
For a Hidden Markov Model λ = (π, A, B), first take an observation sequence O = (o_1, o_2, ..., o_T) generated by the model; on the basis of this observation sequence, compute the optimal state sequence Q* that the model passes through while generating the observation sequence. The Viterbi algorithm is used here.
(3) The learning problem.
With the parameters of the Hidden Markov Model unknown, and given an observation sequence O = (o_1, o_2, ..., o_T) generated by the model, adjust the model parameters so that the likelihood P(O | λ) is maximized. In this system, the learning problem is solved with the usual Baum-Welch algorithm.
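For the estimation problem, the forward recursion above amounts to a few lines of numpy; this is a generic sketch of formulas (20)-(21), not code from the patent.

```python
import numpy as np

def forward_likelihood(pi, A, B, O):
    """P(O | lambda) by the forward algorithm. pi: (N,) initial distribution,
    A: (N, N) state transition matrix, B: (N, K) observation matrix,
    O: observation sequence as symbol indices."""
    alpha = pi * B[:, O[0]]                      # initialization, formula (20)
    for o in O[1:]:
        alpha = (alpha @ A) * B[:, o]            # recursion, formula (21)
    return alpha.sum()                           # termination: sum of alpha_T(i)
```

For example, with pi = np.array([0.6, 0.4]), A = np.array([[0.7, 0.3], [0.4, 0.6]]), B = np.array([[0.5, 0.5], [0.1, 0.9]]) and O = [0, 1, 1], the call returns P(O | λ) for that toy model.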
The gesture recognition platform captures gesture images through a camera and converts the gesture commands in them into instructions the computer can execute. A sample database is needed first; static gesture recognition and dynamic gesture trajectory recognition are both performed on the basis of this database. Gesture images can be obtained from the camera or directly from a local video file; after a gesture image is obtained, gesture segmentation, image binarization, feature extraction and similar operations are performed on it; finally, gesture recognition is carried out and the recognition result is returned for convenient observation of the process. The system software design flow is shown in Fig. 6. The system is developed with multiple threads: image preprocessing and gesture segmentation are completed in sub-thread 1, while dynamic gesture tracking and recognition are completed in sub-thread 3.
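A minimal Python sketch of this multithreaded flow follows; the stage functions are hypothetical stubs standing in for the preprocessing, segmentation and tracking/recognition modules described above.

```python
import queue
import threading
import time

def preprocess(frame):  return frame                 # stub: brightness/light compensation
def segment(frame):     return frame                 # stub: skin-color segmentation
def recognize(contour): print("gesture:", contour)   # stub: TLD tracking + HMM

frames, contours = queue.Queue(), queue.Queue()

def seg_thread():                                    # "sub-thread 1" in the text
    while True:
        contours.put(segment(preprocess(frames.get())))

def rec_thread():                                    # "sub-thread 3" in the text
    while True:
        recognize(contours.get())

threading.Thread(target=seg_thread, daemon=True).start()
threading.Thread(target=rec_thread, daemon=True).start()
frames.put("frame-0")                                # feed one dummy frame
time.sleep(0.1)                                      # let the pipeline drain
```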

Claims (3)

1. A gesture recognition method based on deep learning, characterized by comprising the following steps:
Step 1: train a binarized convolutional neural network using a gesture training set and test set;
Step 2: after acquiring an original gesture image, preprocess the original gesture image to remove the influence of illumination on the original image;
Step 3: using the color information reflected by skin color, segment the preprocessed original image on the basis of that color information and extract the gesture contour;
Step 4: judge whether the gesture contour extracted in step 3 is the start or stop of a dynamic gesture; if so, the gesture contours extracted from the subsequent series of images form a dynamic gesture, go to step 6; if not, the gesture contour is a static gesture, go to step 5;
Step 5: judge the gesture instruction corresponding to the gesture contour using the trained binarized convolutional neural network;
Step 6: locate the start and stop of the dynamic gesture corresponding to the series of gesture contours, track the gesture trajectory using the TLD algorithm, correct deviations during tracking using a Haar classifier, and then recognize the dynamic gesture using an HMM algorithm.
2. The gesture recognition method based on deep learning according to claim 1, characterized in that, in step 2, the preprocessing comprises brightness correction and light compensation;
during brightness correction, highlighted regions in the original gesture image are corrected with a modified exponential transform, dark regions in the original gesture image are corrected with a parameterized logarithmic transform, and other regions are left uncorrected;
light compensation is performed on the basis of a dynamic threshold: the original gesture image is transformed into the YCbCr color space using the total-reflection theory, and the set of points with the largest Y components in the YCbCr color space image is taken as the white reference points.
3. The gesture recognition method based on deep learning according to claim 1, characterized in that, in step 3, the original image is segmented using a skin color segmentation algorithm based on the YCbCr color space.
CN201810242638.4A 2018-03-22 2018-03-22 Gesture recognition method based on deep learning Active CN108537147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810242638.4A CN108537147B (en) 2018-03-22 2018-03-22 Gesture recognition method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810242638.4A CN108537147B (en) 2018-03-22 2018-03-22 Gesture recognition method based on deep learning

Publications (2)

Publication Number Publication Date
CN108537147A true CN108537147A (en) 2018-09-14
CN108537147B CN108537147B (en) 2021-12-10

Family

ID=63483626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810242638.4A Active CN108537147B (en) 2018-03-22 2018-03-22 Gesture recognition method based on deep learning

Country Status (1)

Country Link
CN (1) CN108537147B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508670A (en) * 2018-11-12 2019-03-22 东南大学 A kind of static gesture identification method based on infrared camera
CN109614922A (en) * 2018-12-07 2019-04-12 南京富士通南大软件技术有限公司 A kind of dynamic static gesture identification method and system
CN109634415A (en) * 2018-12-11 2019-04-16 哈尔滨拓博科技有限公司 It is a kind of for controlling the gesture identification control method of analog quantity
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110908581A (en) * 2019-11-20 2020-03-24 网易(杭州)网络有限公司 Gesture recognition method and device, computer storage medium and electronic equipment
CN111753764A (en) * 2020-06-29 2020-10-09 济南浪潮高新科技投资发展有限公司 Gesture recognition method of edge terminal based on attitude estimation
CN112183639A (en) * 2020-09-30 2021-01-05 四川大学 Mineral image identification and classification method
CN112270220A (en) * 2020-10-14 2021-01-26 西安工程大学 Sewing gesture recognition method based on deep learning
CN112784812A (en) * 2021-02-08 2021-05-11 安徽工程大学 Deep squatting action recognition method
CN113449573A (en) * 2020-03-27 2021-09-28 华为技术有限公司 Dynamic gesture recognition method and device
CN114049539A (en) * 2022-01-10 2022-02-15 杭州海康威视数字技术股份有限公司 Collaborative target identification method, system and device based on decorrelation binary network
CN114627561A (en) * 2022-05-16 2022-06-14 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and electronic equipment
US20230107097A1 (en) * 2021-10-06 2023-04-06 Fotonation Limited Method for identifying a gesture


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170220122A1 (en) * 2010-07-13 2017-08-03 Intel Corporation Efficient Gesture Processing
CN106502570A (en) * 2016-10-25 2017-03-15 科世达(上海)管理有限公司 A kind of method of gesture identification, device and onboard system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
田淑芳 et al., Remote Sensing Geology (2nd edition), Beijing: Geological Publishing House, 31 January 2014 *
胡骏飞 et al., "Research on gesture classification methods based on binarized convolutional neural networks", Journal of Hunan University of Technology *
范文兵 et al., "A gesture detection and recognition method based on skin color feature extraction", Modern Electronics Technique *
韦艳柳 et al., "Research on face detection algorithms using skin color information and geometric features", Wireless Internet Technology *
齐静 et al., "Research progress on robot visual gesture interaction technology", Robot *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508670A (en) * 2018-11-12 2019-03-22 东南大学 A kind of static gesture identification method based on infrared camera
CN109508670B (en) * 2018-11-12 2021-10-12 东南大学 Static gesture recognition method based on infrared camera
CN109614922A (en) * 2018-12-07 2019-04-12 南京富士通南大软件技术有限公司 A kind of dynamic static gesture identification method and system
CN109614922B (en) * 2018-12-07 2023-05-02 南京富士通南大软件技术有限公司 Dynamic and static gesture recognition method and system
CN109634415A (en) * 2018-12-11 2019-04-16 哈尔滨拓博科技有限公司 It is a kind of for controlling the gesture identification control method of analog quantity
CN109684959A (en) * 2018-12-14 2019-04-26 武汉大学 The recognition methods of video gesture based on Face Detection and deep learning and device
CN110908581B (en) * 2019-11-20 2021-04-23 网易(杭州)网络有限公司 Gesture recognition method and device, computer storage medium and electronic equipment
CN110908581A (en) * 2019-11-20 2020-03-24 网易(杭州)网络有限公司 Gesture recognition method and device, computer storage medium and electronic equipment
CN113449573A (en) * 2020-03-27 2021-09-28 华为技术有限公司 Dynamic gesture recognition method and device
CN111753764A (en) * 2020-06-29 2020-10-09 济南浪潮高新科技投资发展有限公司 Gesture recognition method of edge terminal based on attitude estimation
CN112183639A (en) * 2020-09-30 2021-01-05 四川大学 Mineral image identification and classification method
CN112270220A (en) * 2020-10-14 2021-01-26 西安工程大学 Sewing gesture recognition method based on deep learning
CN112784812A (en) * 2021-02-08 2021-05-11 安徽工程大学 Deep squatting action recognition method
US20230107097A1 (en) * 2021-10-06 2023-04-06 Fotonation Limited Method for identifying a gesture
CN114049539A (en) * 2022-01-10 2022-02-15 杭州海康威视数字技术股份有限公司 Collaborative target identification method, system and device based on decorrelation binary network
CN114627561A (en) * 2022-05-16 2022-06-14 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, readable storage medium and electronic equipment

Also Published As

Publication number Publication date
CN108537147B (en) 2021-12-10

Similar Documents

Publication Publication Date Title
CN108537147A (en) A kind of gesture identification method based on deep learning
CN109977918B (en) Target detection positioning optimization method based on unsupervised domain adaptation
US20200285896A1 (en) Method for person re-identification based on deep model with multi-loss fusion training strategy
Wang et al. Research on face recognition based on deep learning
CN104601964B (en) Pedestrian target tracking and system in non-overlapping across the video camera room of the ken
Zhang et al. Robust visual tracking based on online learning sparse representation
Ioannou et al. Emotion recognition through facial expression analysis based on a neurofuzzy network
CN109584248A (en) Infrared surface object instance dividing method based on Fusion Features and dense connection network
CN108629288B (en) Gesture recognition model training method, gesture recognition method and system
CN108268859A (en) A kind of facial expression recognizing method based on deep learning
CN106897670A (en) A kind of express delivery violence sorting recognition methods based on computer vision
CN108256421A (en) A kind of dynamic gesture sequence real-time identification method, system and device
CN111666843A (en) Pedestrian re-identification method based on global feature and local feature splicing
CN112949647B (en) Three-dimensional scene description method and device, electronic equipment and storage medium
CN103049751A (en) Improved weighting region matching high-altitude video pedestrian recognizing method
CN109543632A (en) A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features
CN106778852A (en) A kind of picture material recognition methods for correcting erroneous judgement
CN114758288A (en) Power distribution network engineering safety control detection method and device
CN111158491A (en) Gesture recognition man-machine interaction method applied to vehicle-mounted HUD
CN109344822A (en) A kind of scene text detection method based on shot and long term memory network
Zheng et al. Static Hand Gesture Recognition Based on Gaussian Mixture Model and Partial Differential Equation.
CN115690152A (en) Target tracking method based on attention mechanism
CN114821764A (en) Gesture image recognition method and system based on KCF tracking detection
CN111158457A (en) Vehicle-mounted HUD (head Up display) human-computer interaction system based on gesture recognition
Fan Research and realization of video target detection system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant