CN110390305A - Method and device for gesture recognition based on graph convolutional neural networks - Google Patents

Method and device for gesture recognition based on graph convolutional neural networks Download PDF

Info

Publication number
CN110390305A
CN110390305A CN201910676491.4A
Authority
CN
China
Prior art keywords
space
finger
gesture
joint point
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910676491.4A
Other languages
Chinese (zh)
Inventor
叶典
邱卫根
陈玉冰
刘畅
曾博
曹祖晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201910676491.4A priority Critical patent/CN110390305A/en
Publication of CN110390305A publication Critical patent/CN110390305A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm

Abstract

The present invention provides a method and device for gesture recognition based on graph convolutional neural networks. The method comprises: obtaining a spatiotemporal graph of hand joint points with a pose estimation algorithm and normalizing that graph to produce the data to be computed; feeding the data into a previously established spatiotemporal graph convolutional network gesture recognition model; and obtaining the recognition result. The model classifies the computed features with six spatiotemporal convolution units, three pooling layers and a support vector machine classifier, thereby improving the real-time performance of gesture recognition while improving its accuracy.

Description

Method and device for gesture recognition based on graph convolutional neural networks
Technical field
The present invention relates to the technical field of video image recognition, and in particular to a method and device for gesture recognition based on graph convolutional neural networks.
Background technique
Gesture has served as a means of human communication since antiquity and remains in use today. Its importance for conveying information has not faded with the passage of time or the advance of technology; on the contrary, gesture has become an increasingly important mode of interaction in the field of human-computer interaction. Gesture recognition is currently a research hotspot in computer applications and artificial intelligence. In fields such as robot control, sign language recognition, autonomous driving and motion detection, gestures offer the advantages of being convenient, efficient, semantically rich and easy to understand.
Traditional gesture recognition techniques fall into two broad classes: recognition based on visual sensors (ordinary cameras, depth cameras, etc.) and recognition based on wearable sensors (data gloves, etc.). Because wearable sensors are inconvenient and introduce recognition latency, the most popular current research direction is gesture recognition based on visual sensors.
In current visual-sensor gesture recognition, the limitations of template matching and probabilistic statistical models lead to low recognition accuracy and poor real-time performance.
Summary of the invention
In view of this, embodiments of the present invention provide a method and device for gesture recognition based on graph convolutional neural networks, for improving the real-time performance of gesture recognition while guaranteeing its accuracy.
To achieve the above object, embodiments of the present invention provide the following technical solutions:
A method of gesture recognition based on graph convolutional neural networks, comprising:
preprocessing a gesture data set with a pose estimation algorithm to obtain a spatiotemporal graph of hand joint points, wherein the pose estimation algorithm is based on convolutional neural networks and supervised learning and is developed on a deep learning framework;
normalizing the spatiotemporal graph of hand joint points to obtain data to be computed;
inputting the data to be computed into a spatiotemporal graph convolutional network gesture recognition model to obtain a recognition result, wherein the spatiotemporal graph convolutional network gesture recognition model is constructed with a graph convolutional neural network.
Optionally, preprocessing the gesture data set with the pose estimation algorithm to obtain the spatiotemporal graph of hand joint points comprises:
computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data, wherein the video frame sequence includes the set of finger joint point relations for each frame;
connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points;
connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points;
combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
Optionally, normalizing the spatiotemporal graph of hand joint points to obtain the data to be computed comprises:
normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
Optionally, the network structure of the spatiotemporal graph convolutional network gesture recognition model comprises:
six spatiotemporal convolution units, three pooling layers and a support vector machine classifier;
any two spatiotemporal convolution units and one pooling layer form a computing unit; the two spatiotemporal convolution units in the computing unit run in sequence and extract high-dimensional finger joint point information from the data to be computed, the latter spatiotemporal convolution unit processing the output of the former; the pooling layer downsamples the high-dimensional finger joint point information to obtain the information to be classified;
the support vector machine classifier classifies the information to be classified, obtains the probability of each gesture type corresponding to that information, and selects the gesture type with the highest probability to obtain the recognition result.
A device for gesture recognition based on graph convolutional neural networks, comprising:
a preprocessing unit for preprocessing a gesture data set with a pose estimation algorithm to obtain a spatiotemporal graph of hand joint points, wherein the pose estimation algorithm is based on convolutional neural networks and supervised learning and is developed on a deep learning framework;
a normalization unit for normalizing the spatiotemporal graph of hand joint points to obtain data to be computed;
a recognition unit for inputting the data to be computed into a spatiotemporal graph convolutional network gesture recognition model to obtain a recognition result, wherein the spatiotemporal graph convolutional network gesture recognition model is constructed with a graph convolutional neural network.
Optionally, the preprocessing unit comprises:
a pose estimation computing unit for computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data, wherein the video frame sequence includes the set of finger joint point relations for each frame;
a first connection unit for connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points;
a second connection unit for connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points;
a combining unit for combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
Optionally, the normalization unit comprises:
a normalization subunit for normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
Optionally, the network structure of the spatiotemporal graph convolutional network gesture recognition model comprises:
six spatiotemporal convolution units, three pooling layer units and a support vector machine classifier unit;
any two spatiotemporal convolution units and one pooling layer form a computing unit; the two spatiotemporal convolution units in the computing unit run in sequence and extract high-dimensional finger joint point information from the data to be computed, the latter spatiotemporal convolution unit processing the output of the former;
the pooling layer unit downsamples the high-dimensional finger joint point information to obtain the information to be classified;
the support vector machine classifier unit classifies the information to be classified, obtains the probability of each gesture type corresponding to that information, and selects the gesture type with the highest probability to obtain the recognition result.
Optionally, each spatiotemporal convolution unit comprises:
an attention model, a graph convolution model and a temporal convolution model;
wherein the attention model constrains the recognition range of the graph convolution model; the graph convolution model recognizes the data to be computed within its recognition range and obtains the set of spatial structures between the finger joint points; and the temporal convolution model computes over the set of spatial structures produced by the graph convolution model to obtain the high-dimensional finger joint point information.
As can be seen from the above solutions, in the method and device for gesture recognition based on graph convolutional neural networks provided by the present application, a spatiotemporal graph of hand joint points is obtained with a pose estimation algorithm and normalized to produce the data to be computed, so that the established spatiotemporal graph convolutional network gesture recognition model can compute on that data and finally output a recognition result. The model classifies the computed features with six spatiotemporal convolution units, three pooling layers and a support vector machine classifier, thereby improving the real-time performance of gesture recognition while improving its accuracy.
Detailed description of the invention
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow chart of a method of gesture recognition based on graph convolutional neural networks provided by an embodiment of the present invention;
Fig. 2 is a flow chart of a method of gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention;
Fig. 3 is a schematic diagram of a spatial graph of finger joint points provided by another embodiment of the present invention;
Fig. 4 is a schematic diagram of a temporal graph of finger joint points provided by another embodiment of the present invention;
Fig. 5 is a schematic diagram of a spatiotemporal graph of hand joint points provided by another embodiment of the present invention;
Fig. 6 is a flow chart of a method of gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention;
Fig. 7 is a schematic diagram of a device for gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention;
Fig. 8 is a schematic diagram of a device for gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a method of gesture recognition based on graph convolutional neural networks, as shown in Fig. 1, comprising:
S101, preprocessing a gesture data set with a pose estimation algorithm to obtain a spatiotemporal graph of hand joint points.
The gesture data set may be the publicly available large-sample data set The 20BN-JESTER, which covers a variety of gesture types; at present, the data set contains 148092 gesture videos in 27 classes of gesture actions, with a total size of 22.8 GB. Equally, other gesture data sets may be used according to the actual situation.
It should be noted that, since The 20BN-JESTER consists of ordinary red (R), green (G), blue (B) image videos, it cannot be used directly as input to the spatiotemporal graph convolutional network. The finger data in the RGB image videos of The 20BN-JESTER can therefore be obtained with the pose estimation algorithm.
It should also be noted that the pose estimation algorithm comes from the human pose recognition project of Carnegie Mellon University: an open-source library based on convolutional neural networks and supervised learning, developed on a deep learning framework, which can estimate human actions, facial expressions, finger movements and other poses.
Specifically, after the gesture data set is preprocessed with the pose estimation algorithm, the spatiotemporal graph of hand joint points is obtained; when the spatiotemporal graph convolutional network subsequently performs its computation, the data in the spatiotemporal graph can be fed in for calculation.
Optionally, in another embodiment of the present invention, an implementation of step S101, as shown in Fig. 2, comprises:
S201, computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data.
Specifically, the data obtained after the video data is processed by the pose estimation algorithm is a video frame sequence, which includes the set of finger joint point relations for each frame.
S202, connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points.
Specifically, as shown in Fig. 3, the circles represent finger joint points, and the lines between them represent the edges that the spatial graph of finger joint points forms in space.
S203, connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points.
Specifically, as shown in Fig. 4, the plain circles represent the finger joint points of the first frame and the shaded circles those of the second frame, the two sets of joint points being identical; the lines between them represent the edges that the temporal graph of finger joint points forms in time.
It should be noted that identical finger joint points may be connected between every two adjacent frames in the relation sets, i.e., the identical finger joint points of frame t and frame t+1 are connected; equally, the identical finger joint points of frame t and frame t+2 may also be connected.
It should also be noted that, when connecting identical finger joint points between two frames, the closer the two connected frames are, the more accurate the recognition result obtained when the spatiotemporal graph convolutional network subsequently performs its computation; conversely, the farther apart the two frames are, the less accurate the recognition result.
S204, combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
Specifically, combining steps S202 and S203 with Fig. 3 and Fig. 4, the spatiotemporal graph of hand joint points shown in Fig. 5 can be constructed; the dark circles are the finger joint points of the first frame, the light circles those of the second frame, and the identical joint points in the spatial graphs of the first and second frames are connected to construct the spatiotemporal graph.
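The construction of steps S202-S204 can be sketched as an edge-list builder. The 15-joint skeleton below is an assumption for illustration (the patent only states that the palm has 15 finger joint points and does not fix the edge topology); the temporal edges connect the same joint across frames at a configurable stride, as described for frames t/t+1 and t/t+2.

```python
# Hypothetical sketch: build the edge set of a hand-joint spatiotemporal graph.
NUM_JOINTS = 15

# Assumed intra-frame (spatial) edges: wrist (0) to the base of each finger,
# then links along each finger chain. Illustrative only.
SPATIAL_EDGES = [
    (0, 1), (1, 2), (2, 3),       # thumb chain
    (0, 4), (4, 5), (5, 6),       # index chain
    (0, 7), (7, 8), (8, 9),       # middle chain
    (0, 10), (10, 11), (11, 12),  # ring chain
    (0, 13), (13, 14),            # little finger (shortened to keep 15 nodes)
]

def build_spatiotemporal_edges(num_frames, stride=1):
    """Return (spatial, temporal) edge lists over num_frames frames.

    Node v of frame t is indexed as t * NUM_JOINTS + v.  Temporal edges
    connect the *same* joint in frame t and frame t+stride, as in S203;
    stride=1 links adjacent frames, stride=2 links frames t and t+2.
    """
    spatial = [(t * NUM_JOINTS + a, t * NUM_JOINTS + b)
               for t in range(num_frames) for a, b in SPATIAL_EDGES]
    temporal = [(t * NUM_JOINTS + v, (t + stride) * NUM_JOINTS + v)
                for t in range(num_frames - stride) for v in range(NUM_JOINTS)]
    return spatial, temporal
```

Per Fig. 5, the union of the two edge lists forms the spatiotemporal graph fed to the network.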
S102, normalizing the spatiotemporal graph of hand joint points to obtain the data to be computed.
Optionally, in another embodiment of the present invention, an implementation of step S102 comprises:
normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
Normalization is a way of simplifying computation: an expression with physical dimensions is transformed into a dimensionless scalar expression.
Specifically, because the finger joints vary greatly across frames and viewing angles, the position features of each joint in the spatiotemporal graph under different frames need to be normalized. This helps the algorithm converge and makes the subsequent computation more convenient and accurate.
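The normalization in S102 can be sketched as follows, under the assumption that it means rescaling each joint's (x, y) positions across frames to dimensionless values in [0, 1] (min-max normalization); the patent does not fix the exact formula, so this is illustrative only.

```python
# Minimal sketch of per-joint normalization (assumed min-max variant).
def normalize_joint_track(track):
    """track: list of (x, y) pixel positions of one joint across frames.
    Returns dimensionless positions scaled per coordinate to [0, 1]."""
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]

    def scale(vals):
        lo, hi = min(vals), max(vals)
        span = hi - lo
        if span == 0:              # joint did not move along this axis
            return [0.0 for _ in vals]
        return [(v - lo) / span for v in vals]

    return list(zip(scale(xs), scale(ys)))
```

Applying this to every joint of the spatiotemporal graph yields the data to be computed.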
S103, inputting the data to be computed into the spatiotemporal graph convolutional network gesture recognition model to obtain the recognition result.
The spatiotemporal graph convolutional network gesture recognition model is constructed in advance with a graph convolutional neural network and numerous gesture data sets.
Specifically, the parameters of the spatiotemporal graph convolutional network gesture recognition model can be trained with stochastic gradient descent (SGD): at each step of the descent, one training sample is selected at random for computation instead of scanning the entire training set, which accelerates iteration.
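The single-sample SGD described above can be illustrated on a toy least-squares problem; the model, loss and learning rate here are assumptions for demonstration, not taken from the patent.

```python
# Illustrative SGD: each iteration picks one random sample rather than
# scanning the whole training set.
import random

def sgd_fit(samples, lr=0.1, steps=500, seed=0):
    """Fit y = w * x by single-sample SGD on squared error."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(samples)      # one random sample per step
        grad = 2 * (w * x - y) * x      # d/dw of (w*x - y)^2
        w -= lr * grad
    return w

w = sgd_fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])  # true slope is 2
```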
During the implementation of this embodiment, when few training samples are available, cross-validation can also be used to train the model: the data set is partitioned by stratified sampling, k-1 subsets are used as the training set each time and the remaining subset as the test set, yielding k training/test splits; the k groups of data are then trained and tested k times, and the mean of the k test results is taken.
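The k-fold scheme above can be sketched as follows. Stratified sampling and the actual model training are omitted; `evaluate` is a hypothetical stand-in for one train/test run, and the round-robin partition is an illustrative simplification.

```python
# Minimal k-fold cross-validation: k splits, average of k test results.
def k_fold_splits(data, k):
    """Yield (train, test) lists for each of the k folds."""
    folds = [data[i::k] for i in range(k)]     # round-robin partition
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, test

def cross_validate(data, k, evaluate):
    """Average the metric returned by evaluate(train, test) over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_splits(data, k)]
    return sum(scores) / k
```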
Optionally, in another embodiment of the present invention, an implementation of step S103, as shown in Fig. 6, comprises:
six spatiotemporal convolution units, three pooling layers and a support vector machine (Support Vector Machine, SVM) classifier.
The six spatiotemporal convolution units of STGCN-H are denoted stgc_u1, stgc_u2, stgc_u3, stgc_u4, stgc_u5 and stgc_u6; pool1, pool2 and pool3 denote the three pooling layers.
Specifically, any two spatiotemporal convolution units and one pooling layer can form a computing unit; the two spatiotemporal convolution units in the computing unit run in sequence and extract high-dimensional finger joint point information from the data to be computed. The pooling layer compresses the amount of data and parameters and reduces overfitting, discarding redundant information while retaining the scale-invariant information that best expresses the finger features.
The latter spatiotemporal convolution unit processes the output of the former; the pooling layer downsamples the high-dimensional finger joint point information to obtain the information to be classified.
The SVM classifier classifies the information to be classified, obtains the probability of each gesture type corresponding to that information, and selects the gesture type with the highest probability to obtain the recognition result.
Specifically, the data set used by the SVM classifier carries class labels, with which the classifier classifies the information to be classified; the final output of the SVM classification is a tensor of probability information, and the gesture type with the highest probability is selected from the probabilities of the gesture types corresponding to the information to be classified, giving the final recognition result.
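The pipeline above can be sketched at the level of tensor shapes: three computing units, each being two spatiotemporal convolution units (stgc_u1..stgc_u6) followed by a pooling layer (pool1..pool3), then the SVM on the flattened features. The channel widths, frame count and temporal pooling factor below are assumptions for illustration, not specified by the patent.

```python
# Shape-level sketch of the assumed STGCN-H pipeline.
def stgc_unit_shape(shape, out_channels):
    """A conv unit lifts features to a higher dimension; T and V unchanged."""
    channels, frames, joints = shape
    return (out_channels, frames, joints)

def pool_shape(shape, factor=2):
    """Pooling downsamples along the temporal axis."""
    channels, frames, joints = shape
    return (channels, frames // factor, joints)

shape = (3, 32, 15)                       # (x, y, conf) x 32 frames x 15 joints
for width in (64, 128, 256):              # three computing units
    shape = stgc_unit_shape(shape, width) # e.g. stgc_u1
    shape = stgc_unit_shape(shape, width) # e.g. stgc_u2
    shape = pool_shape(shape)             # e.g. pool1
svm_input_dim = shape[0] * shape[1] * shape[2]  # features fed to the SVM
```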
Optionally, in another embodiment of the present invention, each spatiotemporal convolution unit comprises:
an attention model, a graph convolution model and a temporal convolution model.
The attention model constrains the recognition range of the graph convolution model; the graph convolution model recognizes the data to be computed within its recognition range and obtains the set of spatial structures between the finger joint points; the temporal convolution model computes over the set of spatial structures produced by the graph convolution model to obtain the high-dimensional finger joint point information.
It should be noted that the palm has 15 finger joint points in total. Because the fingers have many joints and the gesture recognition process may involve complex background images, both of which interfere with the performance of the graph convolution model, the attention model is needed to constrain the recognition range of the graph convolution model so that it focuses on the important finger joint point information in the image, improving recognition accuracy.
It should also be noted that the graph convolution model operates from a spatial perspective: its convolution kernel acts directly on the nodes in the image and their neighboring nodes, learning the features between finger joints directly from the palm joint point graph; the temporal convolution model learns the local features of finger joint point variation over time.
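One spatiotemporal convolution unit as described above can be sketched numerically: an attention mask constrains which joints the graph convolution attends to, the graph convolution mixes features along the spatial edges, and a temporal convolution mixes each joint's features across neighboring frames. All weights and the toy 3-joint chain here are assumptions for illustration.

```python
# Numerical sketch of attention + graph convolution + temporal convolution.
import numpy as np

def stgc_forward(x, adj, attention, w_graph, w_time):
    """x: (T, V, C) joint features; adj: (V, V) adjacency with self-loops;
    attention: (V, V) mask in [0, 1]; w_graph: (C, C'); w_time: odd-length
    1-D temporal kernel."""
    a = adj * attention                          # attention constrains the graph
    d = a.sum(axis=1, keepdims=True)             # degree normalization
    a = a / np.maximum(d, 1e-6)
    spatial = np.einsum('uv,tvc->tuc', a, x) @ w_graph   # graph convolution
    k = len(w_time) // 2
    padded = np.pad(spatial, ((k, k), (0, 0), (0, 0)))   # pad along time
    out = sum(w_time[i] * padded[i:i + len(spatial)] for i in range(len(w_time)))
    return out                                   # temporal convolution output

T, V, C = 4, 3, 2
x = np.ones((T, V, C))
adj = np.eye(V) + np.eye(V, k=1) + np.eye(V, k=-1)   # a 3-joint chain
attn = np.ones((V, V))
# Identity spatial weights and an identity temporal kernel leave the
# constant input unchanged, which makes the data flow easy to check.
out = stgc_forward(x, adj, attn, np.eye(C), np.array([0.0, 1.0, 0.0]))
```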
During the implementation of this embodiment, a residual mechanism can also be used in the spatiotemporal convolution units. Specifically, increasing network depth cannot be achieved by simply stacking layer upon layer: the vanishing gradient problem appears and deep networks become difficult to train, because as the gradient propagates back to the earlier layers, repeated multiplication may make it vanishingly small. As a result, as the network gets deeper, performance tends to saturate and then degrade rapidly. Adding a residual mechanism reduces the error rate of the spatiotemporal convolution units during computation.
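The residual mechanism mentioned above amounts to adding the unit's input to its output, giving the gradient a direct path back through the identity branch; the inner transform below is a placeholder, not the patent's actual unit.

```python
# Minimal residual connection: output = transform(input) + input.
def residual_unit(x, transform):
    """Return transform(x) + x elementwise (shapes assumed to match)."""
    y = transform(x)
    return [yi + xi for yi, xi in zip(y, x)]

# Even with a transform that collapses to zero, the unit still passes its
# input through unchanged -- the property that keeps deep stacks trainable.
out = residual_unit([1.0, 2.0, 3.0], lambda v: [0.0 for _ in v])
```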
Equally, during the implementation of this embodiment, random dropout can be added after each spatiotemporal convolution unit, reducing the interdependence of neurons. Specifically, on a given training set, some intermediate outputs may come to depend on only certain neurons, which causes overfitting to the training set. Dropout randomly deactivates some neurons so that they contribute nothing to the subsequently learned parameters, allowing more neurons to participate in the final output and thereby reducing overfitting.
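The random dropout described above can be sketched as follows: each neuron's output is zeroed with probability p during training, so later layers cannot rely on any single neuron. The inverted-scaling variant shown here keeps the expected activation unchanged; the rate p and the seed are illustrative choices, not taken from the patent.

```python
# Minimal inverted dropout: zero with probability p, scale survivors.
import random

def dropout(values, p, rng):
    """Zero each value with probability p; scale survivors by 1/(1-p)."""
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]

rng = random.Random(0)
out = dropout([1.0] * 1000, p=0.5, rng=rng)   # roughly half are zeroed
```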
As can be seen from the above solutions, in the method of gesture recognition based on graph convolutional neural networks provided by the present application, a spatiotemporal graph of hand joint points is obtained with a pose estimation algorithm and normalized to produce the data to be computed, so that the established spatiotemporal graph convolutional network gesture recognition model can compute on that data and finally output a recognition result. The model classifies the computed features with six spatiotemporal convolution units, three pooling layers and a support vector machine classifier, thereby improving the real-time performance of gesture recognition while improving its accuracy.
An embodiment of the present invention provides a device for gesture recognition based on graph convolutional neural networks, as shown in Fig. 7, comprising:
a preprocessing unit 701 for preprocessing the gesture data set with a pose estimation algorithm to obtain the spatiotemporal graph of hand joint points.
The pose estimation algorithm is based on convolutional neural networks and supervised learning and is developed on a deep learning framework.
Optionally, in another embodiment of the present invention, an implementation of the preprocessing unit 701, as shown in Fig. 8, comprises:
a pose estimation computing unit 801 for computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data.
The video frame sequence includes the set of finger joint point relations for each frame.
A first connection unit 802 for connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points;
a second connection unit 803 for connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points;
a combining unit 804 for combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
For the specific working process of the units disclosed in the above embodiment of the present invention, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 2, Fig. 3, Fig. 4 and Fig. 5, and details are not repeated here.
A normalization unit 702 for normalizing the spatiotemporal graph of hand joint points to obtain the data to be computed.
Optionally, in another embodiment of the present invention, an implementation of the normalization unit 702 comprises:
a normalization subunit for normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
A recognition unit 703, configured to input the data to be calculated into the space-time graph convolutional neural network gesture recognition model, obtaining the recognition result.
Wherein, the space-time graph convolutional neural network gesture recognition model is constructed in advance using a graph convolutional neural network and numerous gesture data sets.
Specifically, the parameters of the space-time graph convolutional neural network gesture recognition model can be trained by stochastic gradient descent (SGD). Stochastic gradient descent moves in the direction of fastest descent while repeatedly selecting a single data sample at random for each update, rather than scanning the whole training data set, thereby accelerating iteration.
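The single-random-sample update described here can be sketched as a generic SGD loop on a least-squares toy model (illustrative only, not the patent's actual training code):

```python
import numpy as np

def sgd_linear(X, y, lr=0.05, steps=5000, seed=0):
    """Plain stochastic gradient descent on a least-squares model.

    At each step one training sample is drawn at random and the
    weights move along the negative gradient for that sample only,
    instead of scanning the whole training set per update.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(X))         # one random sample per step
        grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5*(x.w - y)^2
        w -= lr * grad
    return w
```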
In the specific implementation of this embodiment, when the training data samples are few, cross validation can also be used to train the model: the data set is divided into k subsets by stratified sampling; each round then uses k-1 subsets as the training set and the remaining subset as the test set, thereby obtaining k training/test splits; k rounds of training and testing are performed on these k splits, and finally the mean of the k test results is taken.
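A sketch of this stratified k-fold procedure (the function names and the round-robin stratification scheme are illustrative assumptions):

```python
import numpy as np

def stratified_kfold_mean(X, y, k, train_eval_fn, seed=0):
    """k-fold cross-validation with stratified sampling.

    The data set is split into k subsets with per-class stratification;
    each round uses k-1 subsets for training and the remaining one for
    testing, and the mean of the k test results is returned.
    """
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(y):                 # stratify: spread each class
        idx = rng.permutation(np.where(y == cls)[0])
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    scores = []
    for f in range(k):
        test = np.array(folds[f])
        train = np.array([i for g in range(k) if g != f for i in folds[g]])
        scores.append(train_eval_fn(X[train], y[train], X[test], y[test]))
    return float(np.mean(scores))
```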
For the structural schematic diagram of the space-time graph convolutional neural network gesture recognition model disclosed in the above embodiment of the present invention and the corresponding working process, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 6; details are not repeated here.
For the specific working process of the units disclosed in the above embodiment of the present invention, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 1; details are not repeated here.
Optionally, in another embodiment of the present invention, an implementation of the network structure of the space-time graph convolutional neural network gesture recognition model includes:
Six space-time convolution units, three pooling layer units and one support vector machine classifier unit.
Wherein, every two space-time convolution units and one pooling layer form a computing unit.
The two space-time convolution units in a computing unit run in sequence to extract high-dimensional finger joint point information from the data to be calculated.
Wherein, the latter space-time convolution unit processes the processing result of the former space-time convolution unit.
The pooling layer unit is further configured to perform a down-sampling operation on the high-dimensional finger joint point information, obtaining the information to be classified.
The support vector machine classifier unit is configured to perform classification calculation on the information to be classified, obtaining the probability of each gesture type corresponding to the information to be classified; the gesture type with the largest probability among these probabilities is selected, yielding the recognition result.
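At the shape level, the pipeline of three computing units (two stacked space-time convolutions plus one pooling step each) followed by a classifier that selects the most probable gesture type can be sketched as follows (all function names are placeholders; the pooling shown here is a simple stride-2 down-sampling along time, an assumption for illustration):

```python
import numpy as np

def forward_sketch(x, conv_units, classify):
    """Shape-level sketch of the recognition network.

    Three computing units are applied in turn: each runs two space-time
    convolution units back to back (the second consumes the output of
    the first) and then a pooling step halves the temporal resolution.
    The classifier maps the pooled features to per-gesture scores; the
    gesture type with the largest probability is the result.
    """
    for conv_a, conv_b in conv_units:  # 3 units x 2 convs = 6 convs
        x = conv_b(conv_a(x))
        x = x[..., ::2]                # pooling: down-sample time axis
    probs = classify(x)
    probs = probs / probs.sum()        # normalize scores to probabilities
    return int(np.argmax(probs)), probs
```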
For the structural schematic diagram of the space-time graph convolutional neural network gesture recognition model disclosed in the above embodiment of the present invention and the corresponding working process, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 6; details are not repeated here.
Optionally, in another embodiment of the present invention, an implementation of the space-time convolution unit includes:
An attention model, a graph convolution model and a temporal convolution model;
Wherein, the attention model is used to constrain the recognition range of the graph convolution model; the graph convolution model is used to recognize the data to be calculated within its recognition range, obtaining the spatial structure set between the finger joint points; the temporal convolution model is used to calculate on the spatial structure set of the finger joint points obtained by the graph convolution model, obtaining the high-dimensional finger joint point information.
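A single-channel toy version of such a unit (illustrative only; the patent does not specify the attention, aggregation, or kernel forms): the attention matrix masks the adjacency before a degree-normalized graph convolution over joints, and a 1-D temporal convolution then filters each joint's sequence:

```python
import numpy as np

def st_conv_unit(x, adj, attention, w_graph, w_time):
    """One space-time convolution unit (illustrative, single channel).

    attention masks the adjacency so the graph convolution only looks
    within its constrained recognition range; the graph convolution
    aggregates each joint's neighbourhood spatially; the temporal
    convolution then filters each joint's sequence along time.

    x: (joints, frames); adj, attention: (joints, joints);
    w_time: odd-length 1-D kernel.
    """
    a = adj * attention                  # attention-constrained graph
    deg = a.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0
    spatial = (a / deg) @ x * w_graph    # graph convolution step
    pad = len(w_time) // 2
    out = np.empty_like(spatial)
    for j in range(spatial.shape[0]):    # temporal convolution per joint
        out[j] = np.convolve(np.pad(spatial[j], pad), w_time, mode="valid")
    return out
```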
In the specific implementation of this embodiment, a residual mechanism can also be used in the space-time convolution unit. Specifically, increasing the depth of the network cannot be achieved by simply stacking layer upon layer: a gradient vanishing problem appears and deep networks become difficult to train, because as gradients propagate back toward the front layers, the repeated multiplications can make them infinitesimally small. As a result, the deeper the network, the more its performance tends to saturate and then degrade rapidly. Adding a residual mechanism can reduce the error rate of the space-time convolution unit during calculation.
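The residual mechanism amounts to wrapping a unit with an identity shortcut (a generic sketch, not the patent's specific implementation): even when a layer's contribution has effectively vanished, the shortcut still carries the signal, which is why gradients survive deep stacks.

```python
import numpy as np

def residual_block(x, layer):
    """Residual wrapper: output = layer(x) + x.

    The identity shortcut gives gradients a direct path back to earlier
    layers, so stacking many space-time convolution units does not let
    repeated multiplications drive the signal (or its gradient) to zero
    the way a plain deep stack can.
    """
    return layer(x) + x
```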
Likewise, in the specific implementation of this embodiment, random dropout can also be added after each space-time convolution unit, thereby reducing the mutual dependence of different neurons. Specifically, on a given training set some intermediate outputs may come to rely only on certain neurons, which causes overfitting to the training set. Randomly dropping out some neurons deactivates them so that they contribute nothing to the subsequently learned parameters, allowing more neurons to participate in the final output and thereby reducing overfitting.
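Inverted dropout is one common way to realize the random deactivation described above (the patent does not fix the variant; the 1/(1-p) rescaling is the standard inverted-dropout convention, assumed here):

```python
import numpy as np

def dropout(x, p, rng, training=True):
    """Inverted dropout after a space-time convolution unit.

    During training each neuron is zeroed with probability p, so later
    layers cannot come to rely on any single neuron; the survivors are
    rescaled by 1/(1-p) so the expected activation is unchanged. At
    inference time the layer is the identity.
    """
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)
```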
As can be seen from the above scheme, in the device for gesture recognition based on a graph convolutional neural network provided by the present application, after the preprocessing unit 701 obtains the gesture joint-point space-time graph using the pose estimation algorithm, the normalization unit 702 normalizes the gesture joint-point space-time graph to obtain the data to be calculated, allowing the recognition unit 703 to calculate on the data to be calculated through the established space-time graph convolutional neural network gesture recognition model and finally obtain the recognition result. In the space-time graph convolutional neural network gesture recognition model, the calculated data are classified through six space-time convolution units, three pooling layers and one support vector machine classifier, thereby improving both the accuracy and the real-time performance of gesture recognition.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of gesture recognition based on a graph convolutional neural network, characterized by comprising:
preprocessing a gesture data set using a pose estimation algorithm to obtain a gesture joint-point space-time graph; wherein the pose estimation algorithm is developed on the basis of convolutional neural networks and supervised learning, with a deep learning algorithm as its framework;
normalizing the gesture joint-point space-time graph to obtain data to be calculated;
inputting the data to be calculated into a space-time graph convolutional neural network gesture recognition model to obtain a recognition result; wherein the space-time graph convolutional neural network gesture recognition model is constructed using a graph convolutional neural network.
2. The method according to claim 1, characterized in that the preprocessing of the gesture data set using the pose estimation algorithm to obtain the gesture joint-point space-time graph comprises:
calculating the video data in the gesture data set with the pose estimation algorithm to obtain a video frame sequence of the video data; wherein the video frame sequence includes a relationship set of the finger joint points of each frame;
connecting each finger joint point in the relationship set of each frame to obtain a spatial graph of the finger joint points;
connecting the same finger joint point across the frames in the relationship set of each frame to obtain a temporal graph of the finger joint points;
combining the temporal graph of the finger joint points with their spatial graph to construct the gesture joint-point space-time graph.
3. The method according to claim 2, characterized in that the normalizing of the gesture joint-point space-time graph to obtain the data to be calculated comprises:
normalizing the values of each finger joint point in the gesture joint-point space-time graph under the temporal graph and the spatial graph to obtain the data to be calculated for each finger joint point.
4. The method according to claim 1, characterized in that the network structure of the space-time graph convolutional neural network gesture recognition model comprises:
six space-time convolution units, three pooling layers and one support vector machine classifier;
every two space-time convolution units and one pooling layer form a computing unit; the two space-time convolution units in the computing unit run in sequence to extract high-dimensional finger joint point information from the data to be calculated; wherein the latter space-time convolution unit processes the processing result of the former space-time convolution unit; the pooling layer performs a down-sampling operation on the high-dimensional finger joint point information to obtain information to be classified;
the support vector machine classifier performs classification calculation on the information to be classified, obtaining the probability of each gesture type corresponding to the information to be classified, and finds the gesture type with the largest probability among the probabilities of the gesture types corresponding to the information to be classified, obtaining the recognition result.
5. The method according to claim 4, characterized in that each space-time convolution unit comprises: an attention model, a graph convolution model and a temporal convolution model;
wherein the attention model is used to constrain the recognition range of the graph convolution model; the graph convolution model is used to recognize the data to be calculated within its recognition range, obtaining a spatial structure set between the finger joint points; the temporal convolution model is used to calculate on the spatial structure set of the finger joint points obtained by the graph convolution model, obtaining the high-dimensional finger joint point information.
6. A device for gesture recognition based on a graph convolutional neural network, characterized by comprising:
a preprocessing unit, configured to preprocess a gesture data set using a pose estimation algorithm to obtain a gesture joint-point space-time graph; wherein the pose estimation algorithm is developed on the basis of convolutional neural networks and supervised learning, with a deep learning algorithm as its framework;
a normalization unit, configured to normalize the gesture joint-point space-time graph to obtain data to be calculated;
a recognition unit, configured to input the data to be calculated into a space-time graph convolutional neural network gesture recognition model to obtain a recognition result; wherein the space-time graph convolutional neural network gesture recognition model is constructed using a graph convolutional neural network.
7. The device according to claim 6, characterized in that the preprocessing unit comprises:
a pose estimation computing unit, configured to calculate the video data in the gesture data set with the pose estimation algorithm to obtain a video frame sequence of the video data; wherein the video frame sequence includes a relationship set of the finger joint points of each frame;
a first connection unit, configured to connect each finger joint point in the relationship set of each frame, obtaining a spatial graph of the finger joint points;
a second connection unit, configured to connect the same finger joint point across the frames in the relationship set of each frame, obtaining a temporal graph of the finger joint points;
a combining unit, configured to combine the temporal graph of the finger joint points with their spatial graph, constructing the gesture joint-point space-time graph.
8. The device according to claim 6, characterized in that the normalization unit comprises:
a normalization subunit, configured to normalize the values of each finger joint point in the gesture joint-point space-time graph under the temporal graph and the spatial graph, obtaining the data to be calculated for each finger joint point.
9. The device according to claim 6, characterized in that the network structure of the space-time graph convolutional neural network gesture recognition model comprises:
six space-time convolution units, three pooling layer units and one support vector machine classifier unit;
every two space-time convolution units and one pooling layer form a computing unit; the two space-time convolution units in the computing unit run in sequence to extract high-dimensional finger joint point information from the data to be calculated; wherein the latter space-time convolution unit processes the processing result of the former space-time convolution unit;
the pooling layer unit is further configured to perform a down-sampling operation on the high-dimensional finger joint point information, obtaining information to be classified;
the support vector machine classifier unit is configured to perform classification calculation on the information to be classified, obtaining the probability of each gesture type corresponding to the information to be classified, and to find the gesture type with the largest probability among the probabilities of the gesture types corresponding to the information to be classified, obtaining the recognition result.
10. The device according to claim 9, characterized in that each space-time convolution unit comprises:
an attention model, a graph convolution model and a temporal convolution model;
wherein the attention model is used to constrain the recognition range of the graph convolution model; the graph convolution model is used to recognize the data to be calculated within its recognition range, obtaining a spatial structure set between the finger joint points; the temporal convolution model is used to calculate on the spatial structure set of the finger joint points obtained by the graph convolution model, obtaining the high-dimensional finger joint point information.
CN201910676491.4A 2019-07-25 2019-07-25 Method and device for gesture recognition based on a graph convolutional neural network Pending CN110390305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910676491.4A CN110390305A (en) 2019-07-25 2019-07-25 Method and device for gesture recognition based on a graph convolutional neural network


Publications (1)

Publication Number Publication Date
CN110390305A true CN110390305A (en) 2019-10-29

Family

ID=68287397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910676491.4A Pending CN110390305A (en) Method and device for gesture recognition based on a graph convolutional neural network

Country Status (1)

Country Link
CN (1) CN110390305A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095862A (en) * 2015-07-10 2015-11-25 南开大学 Human gesture recognizing method based on depth convolution condition random field
CN108875708A (en) * 2018-07-18 2018-11-23 广东工业大学 Behavior analysis method, device, equipment, system and storage medium based on video
US20190095806A1 (en) * 2017-09-28 2019-03-28 Siemens Aktiengesellschaft SGCNN: Structural Graph Convolutional Neural Network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wan Xiaoyi: "Research on 3D Human Behavior Recognition Based on Spatio-Temporal Structural Relationships", China Master's Theses Full-text Database, Information Science and Technology *
Ma Jing: "Research and Implementation of Behavior Recognition Methods Based on Pose and Skeleton Information", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291713B (en) * 2020-02-27 2023-05-16 山东大学 Gesture recognition method and system based on skeleton
CN111291713A (en) * 2020-02-27 2020-06-16 山东大学 Gesture recognition method and system based on skeleton
CN111476181A (en) * 2020-04-13 2020-07-31 河北工业大学 Human skeleton action recognition method
CN111476181B (en) * 2020-04-13 2022-03-04 河北工业大学 Human skeleton action recognition method
WO2021218126A1 (en) * 2020-04-26 2021-11-04 武汉Tcl集团工业研究院有限公司 Gesture identification method, terminal device, and computer readable storage medium
CN111737909B (en) * 2020-06-10 2021-02-09 哈尔滨工业大学 Structural health monitoring data anomaly identification method based on space-time graph convolutional network
CN111737909A (en) * 2020-06-10 2020-10-02 哈尔滨工业大学 Structural health monitoring data anomaly identification method based on space-time graph convolutional network
CN112329525A (en) * 2020-09-27 2021-02-05 中国科学院软件研究所 Gesture recognition method and device based on space-time diagram convolutional neural network
CN112183314A (en) * 2020-09-27 2021-01-05 哈尔滨工业大学(深圳) Expression information acquisition device and expression identification method and system
CN112183314B (en) * 2020-09-27 2023-12-12 哈尔滨工业大学(深圳) Expression information acquisition device, expression recognition method and system
CN112148128A (en) * 2020-10-16 2020-12-29 哈尔滨工业大学 Real-time gesture recognition method and device and man-machine interaction system
CN112148128B (en) * 2020-10-16 2022-11-25 哈尔滨工业大学 Real-time gesture recognition method and device and man-machine interaction system
CN112543936A (en) * 2020-10-29 2021-03-23 香港应用科技研究院有限公司 Motion structure self-attention-seeking convolutional network for motion recognition
CN112543936B (en) * 2020-10-29 2021-09-28 香港应用科技研究院有限公司 Motion structure self-attention-drawing convolution network model for motion recognition
WO2022088176A1 (en) * 2020-10-29 2022-05-05 Hong Kong Applied Science and Technology Research Institute Company Limited Actional-structural self-attention graph convolutional network for action recognition
CN113673560A (en) * 2021-07-15 2021-11-19 华南理工大学 Human behavior identification method based on multi-stream three-dimensional adaptive graph convolution
CN113673560B (en) * 2021-07-15 2023-06-09 华南理工大学 Human behavior recognition method based on multi-flow three-dimensional self-adaptive graph convolution
CN114155604A (en) * 2021-12-03 2022-03-08 哈尔滨理工大学 Dynamic gesture recognition method based on 3D convolutional neural network
CN115546824A (en) * 2022-04-18 2022-12-30 荣耀终端有限公司 Taboo picture identification method, equipment and storage medium
CN115546824B (en) * 2022-04-18 2023-11-28 荣耀终端有限公司 Taboo picture identification method, apparatus and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191029