CN110390305A - Method and device for gesture recognition based on graph convolutional neural networks - Google Patents

Method and device for gesture recognition based on graph convolutional neural networks Download PDF

Info

Publication number
CN110390305A
CN110390305A CN201910676491.4A
Authority
CN
China
Prior art keywords
space
finger
gesture
joint point
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910676491.4A
Other languages
Chinese (zh)
Inventor
叶典
邱卫根
陈玉冰
刘畅
曾博
曹祖晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201910676491.4A priority Critical patent/CN110390305A/en
Publication of CN110390305A publication Critical patent/CN110390305A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm

Abstract

The present invention provides a method and device for gesture recognition based on graph convolutional neural networks. The method comprises: obtaining a spatiotemporal graph of hand joint points with a pose estimation algorithm and normalizing that graph to produce the data to be computed; feeding the data into a previously established spatiotemporal graph convolutional network gesture recognition model; and obtaining the recognition result. The model classifies the computed features with six spatiotemporal convolution units, three pooling layers and a support vector machine classifier, thereby improving the real-time performance of gesture recognition while improving its accuracy.

Description

Method and device for gesture recognition based on graph convolutional neural networks
Technical field
The present invention relates to the technical field of video image recognition, and in particular to a method and device for gesture recognition based on graph convolutional neural networks.
Background technique
Gesture has served as a means of human communication since antiquity and remains in use today. Its importance for conveying information has not faded with the passage of time or the advance of technology; on the contrary, gesture has become an increasingly important mode of interaction in the field of human-computer interaction. Gesture recognition is currently a research hotspot in computer applications and artificial intelligence. In fields such as robot control, sign language recognition, autonomous driving and motion detection, gestures offer the advantages of being convenient, efficient, semantically rich and easy to understand.
Traditional gesture recognition techniques fall into two broad classes: recognition based on visual sensors (ordinary cameras, depth cameras, etc.) and recognition based on wearable sensors (data gloves, etc.). Because wearable sensors are inconvenient and introduce recognition latency, the most popular current research direction is gesture recognition based on visual sensors.
In current visual-sensor gesture recognition, the limitations of template matching and probabilistic statistical models lead to low recognition accuracy and poor real-time performance.
Summary of the invention
In view of this, embodiments of the present invention provide a method and device for gesture recognition based on graph convolutional neural networks, for improving the real-time performance of gesture recognition while guaranteeing its accuracy.
To achieve the above object, embodiments of the present invention provide the following technical solutions:
A method of gesture recognition based on graph convolutional neural networks, comprising:
preprocessing a gesture data set with a pose estimation algorithm to obtain a spatiotemporal graph of hand joint points, wherein the pose estimation algorithm is based on convolutional neural networks and supervised learning and is developed on a deep learning framework;
normalizing the spatiotemporal graph of hand joint points to obtain data to be computed;
inputting the data to be computed into a spatiotemporal graph convolutional network gesture recognition model to obtain a recognition result, wherein the spatiotemporal graph convolutional network gesture recognition model is constructed with a graph convolutional neural network.
Optionally, preprocessing the gesture data set with the pose estimation algorithm to obtain the spatiotemporal graph of hand joint points comprises:
computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data, wherein the video frame sequence includes the set of finger joint point relations for each frame;
connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points;
connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points;
combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
Optionally, normalizing the spatiotemporal graph of hand joint points to obtain the data to be computed comprises:
normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
Optionally, the network structure of the spatiotemporal graph convolutional network gesture recognition model comprises:
six spatiotemporal convolution units, three pooling layers and a support vector machine classifier;
any two spatiotemporal convolution units and one pooling layer form a computing unit; the two spatiotemporal convolution units in the computing unit run in sequence and extract high-dimensional finger joint point information from the data to be computed, the latter spatiotemporal convolution unit processing the output of the former; the pooling layer downsamples the high-dimensional finger joint point information to obtain the information to be classified;
the support vector machine classifier classifies the information to be classified, obtains the probability of each gesture type corresponding to that information, and selects the gesture type with the highest probability to obtain the recognition result.
A device for gesture recognition based on graph convolutional neural networks, comprising:
a preprocessing unit for preprocessing a gesture data set with a pose estimation algorithm to obtain a spatiotemporal graph of hand joint points, wherein the pose estimation algorithm is based on convolutional neural networks and supervised learning and is developed on a deep learning framework;
a normalization unit for normalizing the spatiotemporal graph of hand joint points to obtain data to be computed;
a recognition unit for inputting the data to be computed into a spatiotemporal graph convolutional network gesture recognition model to obtain a recognition result, wherein the spatiotemporal graph convolutional network gesture recognition model is constructed with a graph convolutional neural network.
Optionally, the preprocessing unit comprises:
a pose estimation computing unit for computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data, wherein the video frame sequence includes the set of finger joint point relations for each frame;
a first connection unit for connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points;
a second connection unit for connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points;
a combining unit for combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
Optionally, the normalization unit comprises:
a normalization subunit for normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
Optionally, the network structure of the spatiotemporal graph convolutional network gesture recognition model comprises:
six spatiotemporal convolution units, three pooling layer units and a support vector machine classifier unit;
any two spatiotemporal convolution units and one pooling layer form a computing unit; the two spatiotemporal convolution units in the computing unit run in sequence and extract high-dimensional finger joint point information from the data to be computed, the latter spatiotemporal convolution unit processing the output of the former;
the pooling layer unit downsamples the high-dimensional finger joint point information to obtain the information to be classified;
the support vector machine classifier unit classifies the information to be classified, obtains the probability of each gesture type corresponding to that information, and selects the gesture type with the highest probability to obtain the recognition result.
Optionally, each spatiotemporal convolution unit comprises:
an attention model, a graph convolution model and a temporal convolution model;
wherein the attention model constrains the recognition range of the graph convolution model; the graph convolution model recognizes the data to be computed within its recognition range and obtains the set of spatial structures between the finger joint points; and the temporal convolution model computes over the set of spatial structures produced by the graph convolution model to obtain the high-dimensional finger joint point information.
As can be seen from the above solutions, in the method and device for gesture recognition based on graph convolutional neural networks provided by the present application, a spatiotemporal graph of hand joint points is obtained with a pose estimation algorithm and normalized to produce the data to be computed, so that the established spatiotemporal graph convolutional network gesture recognition model can compute on that data and finally output a recognition result. The model classifies the computed features with six spatiotemporal convolution units, three pooling layers and a support vector machine classifier, thereby improving the real-time performance of gesture recognition while improving its accuracy.
Detailed description of the invention
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow chart of a method of gesture recognition based on graph convolutional neural networks provided by an embodiment of the present invention;
Fig. 2 is a flow chart of a method of gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention;
Fig. 3 is a schematic diagram of a spatial graph of finger joint points provided by another embodiment of the present invention;
Fig. 4 is a schematic diagram of a temporal graph of finger joint points provided by another embodiment of the present invention;
Fig. 5 is a schematic diagram of a spatiotemporal graph of hand joint points provided by another embodiment of the present invention;
Fig. 6 is a flow chart of a method of gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention;
Fig. 7 is a schematic diagram of a device for gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention;
Fig. 8 is a schematic diagram of a device for gesture recognition based on graph convolutional neural networks provided by another embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a method of gesture recognition based on graph convolutional neural networks, as shown in Fig. 1, comprising:
S101, preprocessing a gesture data set with a pose estimation algorithm to obtain a spatiotemporal graph of hand joint points.
The gesture data set may be the publicly available large-sample data set The 20BN-JESTER, which covers a variety of gesture types; at present, the data set contains 148092 gesture videos in 27 classes of gesture actions, with a total size of 22.8 GB. Equally, other gesture data sets may be used according to the actual situation.
It should be noted that, since The 20BN-JESTER consists of ordinary red (R), green (G), blue (B) image videos, it cannot be used directly as input to the spatiotemporal graph convolutional network. The finger data in the RGB image videos of The 20BN-JESTER can therefore be obtained with the pose estimation algorithm.
It should also be noted that the pose estimation algorithm comes from the human pose recognition project of Carnegie Mellon University: an open-source library based on convolutional neural networks and supervised learning, developed on a deep learning framework, which can estimate human actions, facial expressions, finger movements and other poses.
Specifically, after the gesture data set is preprocessed with the pose estimation algorithm, the spatiotemporal graph of hand joint points is obtained; when the spatiotemporal graph convolutional network subsequently performs its computation, the data in the spatiotemporal graph can be fed in for calculation.
Optionally, in another embodiment of the present invention, an implementation of step S101, as shown in Fig. 2, comprises:
S201, computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data.
Specifically, the data obtained after the video data is processed by the pose estimation algorithm is a video frame sequence, which includes the set of finger joint point relations for each frame.
S202, connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points.
Specifically, as shown in Fig. 3, the circles represent finger joint points, and the lines between them represent the edges that the spatial graph of finger joint points forms in space.
S203, connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points.
Specifically, as shown in Fig. 4, the plain circles represent the finger joint points of the first frame and the shaded circles those of the second frame, the two sets of joint points being identical; the lines between them represent the edges that the temporal graph of finger joint points forms in time.
It should be noted that identical finger joint points may be connected between every two adjacent frames in the relation sets, i.e., the identical finger joint points of frame t and frame t+1 are connected; equally, the identical finger joint points of frame t and frame t+2 may also be connected.
It should also be noted that, when connecting identical finger joint points between two frames, the closer the two connected frames are, the more accurate the recognition result obtained when the spatiotemporal graph convolutional network subsequently performs its computation; conversely, the farther apart the two frames are, the less accurate the recognition result.
S204, combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
Specifically, combining steps S202 and S203 with Fig. 3 and Fig. 4, the spatiotemporal graph of hand joint points shown in Fig. 5 can be constructed; the dark circles are the finger joint points of the first frame, the light circles those of the second frame, and the identical joint points in the spatial graphs of the first and second frames are connected to construct the spatiotemporal graph.
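The construction of steps S202-S204 can be sketched as an edge-list builder. The 15-joint skeleton below is an assumption for illustration (the patent only states that the palm has 15 finger joint points and does not fix the edge topology); the temporal edges connect the same joint across frames at a configurable stride, as described for frames t/t+1 and t/t+2.

```python
# Hypothetical sketch: build the edge set of a hand-joint spatiotemporal graph.
NUM_JOINTS = 15

# Assumed intra-frame (spatial) edges: wrist (0) to the base of each finger,
# then links along each finger chain. Illustrative only.
SPATIAL_EDGES = [
    (0, 1), (1, 2), (2, 3),       # thumb chain
    (0, 4), (4, 5), (5, 6),       # index chain
    (0, 7), (7, 8), (8, 9),       # middle chain
    (0, 10), (10, 11), (11, 12),  # ring chain
    (0, 13), (13, 14),            # little finger (shortened to keep 15 nodes)
]

def build_spatiotemporal_edges(num_frames, stride=1):
    """Return (spatial, temporal) edge lists over num_frames frames.

    Node v of frame t is indexed as t * NUM_JOINTS + v.  Temporal edges
    connect the *same* joint in frame t and frame t+stride, as in S203;
    stride=1 links adjacent frames, stride=2 links frames t and t+2.
    """
    spatial = [(t * NUM_JOINTS + a, t * NUM_JOINTS + b)
               for t in range(num_frames) for a, b in SPATIAL_EDGES]
    temporal = [(t * NUM_JOINTS + v, (t + stride) * NUM_JOINTS + v)
                for t in range(num_frames - stride) for v in range(NUM_JOINTS)]
    return spatial, temporal
```

Per Fig. 5, the union of the two edge lists forms the spatiotemporal graph fed to the network.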
S102, normalizing the spatiotemporal graph of hand joint points to obtain the data to be computed.
Optionally, in another embodiment of the present invention, an implementation of step S102 comprises:
normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
Normalization is a way of simplifying computation: an expression with physical dimensions is transformed into a dimensionless scalar expression.
Specifically, because the finger joints vary greatly across frames and viewing angles, the position features of each joint in the spatiotemporal graph under different frames need to be normalized. This helps the algorithm converge and makes the subsequent computation more convenient and accurate.
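The normalization in S102 can be sketched as follows, under the assumption that it means rescaling each joint's (x, y) positions across frames to dimensionless values in [0, 1] (min-max normalization); the patent does not fix the exact formula, so this is illustrative only.

```python
# Minimal sketch of per-joint normalization (assumed min-max variant).
def normalize_joint_track(track):
    """track: list of (x, y) pixel positions of one joint across frames.
    Returns dimensionless positions scaled per coordinate to [0, 1]."""
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]

    def scale(vals):
        lo, hi = min(vals), max(vals)
        span = hi - lo
        if span == 0:              # joint did not move along this axis
            return [0.0 for _ in vals]
        return [(v - lo) / span for v in vals]

    return list(zip(scale(xs), scale(ys)))
```

Applying this to every joint of the spatiotemporal graph yields the data to be computed.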
S103, inputting the data to be computed into the spatiotemporal graph convolutional network gesture recognition model to obtain the recognition result.
The spatiotemporal graph convolutional network gesture recognition model is constructed in advance with a graph convolutional neural network and numerous gesture data sets.
Specifically, the parameters of the spatiotemporal graph convolutional network gesture recognition model can be trained with stochastic gradient descent (SGD): at each step of the descent, one training sample is selected at random for computation instead of scanning the entire training set, which accelerates iteration.
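The single-sample SGD described above can be illustrated on a toy least-squares problem; the model, loss and learning rate here are assumptions for demonstration, not taken from the patent.

```python
# Illustrative SGD: each iteration picks one random sample rather than
# scanning the whole training set.
import random

def sgd_fit(samples, lr=0.1, steps=500, seed=0):
    """Fit y = w * x by single-sample SGD on squared error."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(samples)      # one random sample per step
        grad = 2 * (w * x - y) * x      # d/dw of (w*x - y)^2
        w -= lr * grad
    return w

w = sgd_fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])  # true slope is 2
```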
During the implementation of this embodiment, when few training samples are available, cross-validation can also be used to train the model: the data set is partitioned by stratified sampling, k-1 subsets are used as the training set each time and the remaining subset as the test set, yielding k training/test splits; the k groups of data are then trained and tested k times, and the mean of the k test results is taken.
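The k-fold scheme above can be sketched as follows. Stratified sampling and the actual model training are omitted; `evaluate` is a hypothetical stand-in for one train/test run, and the round-robin partition is an illustrative simplification.

```python
# Minimal k-fold cross-validation: k splits, average of k test results.
def k_fold_splits(data, k):
    """Yield (train, test) lists for each of the k folds."""
    folds = [data[i::k] for i in range(k)]     # round-robin partition
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, test

def cross_validate(data, k, evaluate):
    """Average the metric returned by evaluate(train, test) over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_splits(data, k)]
    return sum(scores) / k
```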
Optionally, in another embodiment of the present invention, an implementation of step S103, as shown in Fig. 6, comprises:
six spatiotemporal convolution units, three pooling layers and a support vector machine (Support Vector Machine, SVM) classifier.
The six spatiotemporal convolution units of STGCN-H are denoted stgc_u1, stgc_u2, stgc_u3, stgc_u4, stgc_u5 and stgc_u6; pool1, pool2 and pool3 denote the three pooling layers.
Specifically, any two spatiotemporal convolution units and one pooling layer can form a computing unit; the two spatiotemporal convolution units in the computing unit run in sequence and extract high-dimensional finger joint point information from the data to be computed. The pooling layer compresses the amount of data and parameters and reduces overfitting, discarding redundant information while retaining the scale-invariant information that best expresses the finger features.
The latter spatiotemporal convolution unit processes the output of the former; the pooling layer downsamples the high-dimensional finger joint point information to obtain the information to be classified.
The SVM classifier classifies the information to be classified, obtains the probability of each gesture type corresponding to that information, and selects the gesture type with the highest probability to obtain the recognition result.
Specifically, the data set used by the SVM classifier carries class labels, with which the classifier classifies the information to be classified; the final output of the SVM classification is a tensor of probability information, and the gesture type with the highest probability is selected from the probabilities of the gesture types corresponding to the information to be classified, giving the final recognition result.
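The pipeline above can be sketched at the level of tensor shapes: three computing units, each being two spatiotemporal convolution units (stgc_u1..stgc_u6) followed by a pooling layer (pool1..pool3), then the SVM on the flattened features. The channel widths, frame count and temporal pooling factor below are assumptions for illustration, not specified by the patent.

```python
# Shape-level sketch of the assumed STGCN-H pipeline.
def stgc_unit_shape(shape, out_channels):
    """A conv unit lifts features to a higher dimension; T and V unchanged."""
    channels, frames, joints = shape
    return (out_channels, frames, joints)

def pool_shape(shape, factor=2):
    """Pooling downsamples along the temporal axis."""
    channels, frames, joints = shape
    return (channels, frames // factor, joints)

shape = (3, 32, 15)                       # (x, y, conf) x 32 frames x 15 joints
for width in (64, 128, 256):              # three computing units
    shape = stgc_unit_shape(shape, width) # e.g. stgc_u1
    shape = stgc_unit_shape(shape, width) # e.g. stgc_u2
    shape = pool_shape(shape)             # e.g. pool1
svm_input_dim = shape[0] * shape[1] * shape[2]  # features fed to the SVM
```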
Optionally, in another embodiment of the present invention, each spatiotemporal convolution unit comprises:
an attention model, a graph convolution model and a temporal convolution model.
The attention model constrains the recognition range of the graph convolution model; the graph convolution model recognizes the data to be computed within its recognition range and obtains the set of spatial structures between the finger joint points; the temporal convolution model computes over the set of spatial structures produced by the graph convolution model to obtain the high-dimensional finger joint point information.
It should be noted that the palm has 15 finger joint points in total. Because the fingers have many joints and the gesture recognition process may involve complex background images, both of which interfere with the performance of the graph convolution model, the attention model is needed to constrain the recognition range of the graph convolution model so that it focuses on the important finger joint point information in the image, improving recognition accuracy.
It should also be noted that the graph convolution model operates from a spatial perspective: its convolution kernel acts directly on the nodes in the image and their neighboring nodes, learning the features between finger joints directly from the palm joint point graph; the temporal convolution model learns the local features of finger joint point variation over time.
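One spatiotemporal convolution unit as described above can be sketched numerically: an attention mask constrains which joints the graph convolution attends to, the graph convolution mixes features along the spatial edges, and a temporal convolution mixes each joint's features across neighboring frames. All weights and the toy 3-joint chain here are assumptions for illustration.

```python
# Numerical sketch of attention + graph convolution + temporal convolution.
import numpy as np

def stgc_forward(x, adj, attention, w_graph, w_time):
    """x: (T, V, C) joint features; adj: (V, V) adjacency with self-loops;
    attention: (V, V) mask in [0, 1]; w_graph: (C, C'); w_time: odd-length
    1-D temporal kernel."""
    a = adj * attention                          # attention constrains the graph
    d = a.sum(axis=1, keepdims=True)             # degree normalization
    a = a / np.maximum(d, 1e-6)
    spatial = np.einsum('uv,tvc->tuc', a, x) @ w_graph   # graph convolution
    k = len(w_time) // 2
    padded = np.pad(spatial, ((k, k), (0, 0), (0, 0)))   # pad along time
    out = sum(w_time[i] * padded[i:i + len(spatial)] for i in range(len(w_time)))
    return out                                   # temporal convolution output

T, V, C = 4, 3, 2
x = np.ones((T, V, C))
adj = np.eye(V) + np.eye(V, k=1) + np.eye(V, k=-1)   # a 3-joint chain
attn = np.ones((V, V))
# Identity spatial weights and an identity temporal kernel leave the
# constant input unchanged, which makes the data flow easy to check.
out = stgc_forward(x, adj, attn, np.eye(C), np.array([0.0, 1.0, 0.0]))
```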
During the implementation of this embodiment, a residual mechanism can also be used in the spatiotemporal convolution units. Specifically, increasing network depth cannot be achieved by simply stacking layer upon layer: the vanishing gradient problem appears and deep networks become difficult to train, because as the gradient propagates back to the earlier layers, repeated multiplication may make it vanishingly small. As a result, as the network gets deeper, performance tends to saturate and then degrade rapidly. Adding a residual mechanism reduces the error rate of the spatiotemporal convolution units during computation.
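The residual mechanism mentioned above amounts to adding the unit's input to its output, giving the gradient a direct path back through the identity branch; the inner transform below is a placeholder, not the patent's actual unit.

```python
# Minimal residual connection: output = transform(input) + input.
def residual_unit(x, transform):
    """Return transform(x) + x elementwise (shapes assumed to match)."""
    y = transform(x)
    return [yi + xi for yi, xi in zip(y, x)]

# Even with a transform that collapses to zero, the unit still passes its
# input through unchanged -- the property that keeps deep stacks trainable.
out = residual_unit([1.0, 2.0, 3.0], lambda v: [0.0 for _ in v])
```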
Equally, during the implementation of this embodiment, random dropout can be added after each spatiotemporal convolution unit, reducing the interdependence of neurons. Specifically, on a given training set, some intermediate outputs may come to depend on only certain neurons, which causes overfitting to the training set. Dropout randomly deactivates some neurons so that they contribute nothing to the subsequently learned parameters, allowing more neurons to participate in the final output and thereby reducing overfitting.
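The random dropout described above can be sketched as follows: each neuron's output is zeroed with probability p during training, so later layers cannot rely on any single neuron. The inverted-scaling variant shown here keeps the expected activation unchanged; the rate p and the seed are illustrative choices, not taken from the patent.

```python
# Minimal inverted dropout: zero with probability p, scale survivors.
import random

def dropout(values, p, rng):
    """Zero each value with probability p; scale survivors by 1/(1-p)."""
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]

rng = random.Random(0)
out = dropout([1.0] * 1000, p=0.5, rng=rng)   # roughly half are zeroed
```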
As can be seen from the above solutions, in the method of gesture recognition based on graph convolutional neural networks provided by the present application, a spatiotemporal graph of hand joint points is obtained with a pose estimation algorithm and normalized to produce the data to be computed, so that the established spatiotemporal graph convolutional network gesture recognition model can compute on that data and finally output a recognition result. The model classifies the computed features with six spatiotemporal convolution units, three pooling layers and a support vector machine classifier, thereby improving the real-time performance of gesture recognition while improving its accuracy.
An embodiment of the present invention provides a device for gesture recognition based on graph convolutional neural networks, as shown in Fig. 7, comprising:
a preprocessing unit 701 for preprocessing the gesture data set with a pose estimation algorithm to obtain the spatiotemporal graph of hand joint points.
The pose estimation algorithm is based on convolutional neural networks and supervised learning and is developed on a deep learning framework.
Optionally, in another embodiment of the present invention, an implementation of the preprocessing unit 701, as shown in Fig. 8, comprises:
a pose estimation computing unit 801 for computing the video data in the gesture data set with the pose estimation algorithm to obtain the video frame sequence of the video data.
The video frame sequence includes the set of finger joint point relations for each frame.
A first connection unit 802 for connecting the finger joint points within the relation set of each frame to obtain the spatial graph of the finger joint points;
a second connection unit 803 for connecting identical finger joint points across frames in the relation sets to obtain the temporal graph of the finger joint points;
a combining unit 804 for combining the temporal graph and the spatial graph of the finger joint points to construct the spatiotemporal graph of hand joint points.
For the specific working process of the units disclosed in the above embodiment of the present invention, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 2, Fig. 3, Fig. 4 and Fig. 5, and details are not repeated here.
A normalization unit 702 for normalizing the spatiotemporal graph of hand joint points to obtain the data to be computed.
Optionally, in another embodiment of the present invention, an implementation of the normalization unit 702 comprises:
a normalization subunit for normalizing the values of each finger joint point under the temporal graph and the spatial graph of the spatiotemporal graph, to obtain the data to be computed for each finger joint point.
A recognition unit 703, configured to input the data to be calculated into the space-time graph convolutional neural network gesture recognition model, obtaining the recognition result.
Wherein, the space-time graph convolutional neural network gesture recognition model is constructed in advance using a graph convolutional neural network and numerous gesture data sets.
Specifically, the parameters of the space-time graph convolutional neural network gesture recognition model can be trained by stochastic gradient descent (SGD). Stochastic gradient descent moves in the direction of fastest descent while repeatedly selecting a single data sample at random for each update, rather than scanning the whole training data set, thereby accelerating iteration.
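The single-random-sample update described here can be sketched as a generic SGD loop on a least-squares toy model (illustrative only, not the patent's actual training code):

```python
import numpy as np

def sgd_linear(X, y, lr=0.05, steps=5000, seed=0):
    """Plain stochastic gradient descent on a least-squares model.

    At each step one training sample is drawn at random and the
    weights move along the negative gradient for that sample only,
    instead of scanning the whole training set per update.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(X))         # one random sample per step
        grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5*(x.w - y)^2
        w -= lr * grad
    return w
```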
In the specific implementation of this embodiment, when the training data samples are few, cross validation can also be used to train the model: the data set is divided into k subsets by stratified sampling; each round then uses k-1 subsets as the training set and the remaining subset as the test set, thereby obtaining k training/test splits; k rounds of training and testing are performed on these k splits, and finally the mean of the k test results is taken.
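A sketch of this stratified k-fold procedure (the function names and the round-robin stratification scheme are illustrative assumptions):

```python
import numpy as np

def stratified_kfold_mean(X, y, k, train_eval_fn, seed=0):
    """k-fold cross-validation with stratified sampling.

    The data set is split into k subsets with per-class stratification;
    each round uses k-1 subsets for training and the remaining one for
    testing, and the mean of the k test results is returned.
    """
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(y):                 # stratify: spread each class
        idx = rng.permutation(np.where(y == cls)[0])
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    scores = []
    for f in range(k):
        test = np.array(folds[f])
        train = np.array([i for g in range(k) if g != f for i in folds[g]])
        scores.append(train_eval_fn(X[train], y[train], X[test], y[test]))
    return float(np.mean(scores))
```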
For the structural schematic diagram of the space-time graph convolutional neural network gesture recognition model disclosed in the above embodiment of the present invention and the corresponding working process, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 6; details are not repeated here.
For the specific working process of the units disclosed in the above embodiment of the present invention, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 1; details are not repeated here.
Optionally, in another embodiment of the present invention, an implementation of the network structure of the space-time graph convolutional neural network gesture recognition model includes:
Six space-time convolution units, three pooling layer units and one support vector machine classifier unit.
Wherein, every two space-time convolution units and one pooling layer form a computing unit.
The two space-time convolution units in a computing unit run in sequence to extract high-dimensional finger joint point information from the data to be calculated.
Wherein, the latter space-time convolution unit processes the processing result of the former space-time convolution unit.
The pooling layer unit is further configured to perform a down-sampling operation on the high-dimensional finger joint point information, obtaining the information to be classified.
The support vector machine classifier unit is configured to perform classification calculation on the information to be classified, obtaining the probability of each gesture type corresponding to the information to be classified; the gesture type with the largest probability among these probabilities is selected, yielding the recognition result.
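At the shape level, the pipeline of three computing units (two stacked space-time convolutions plus one pooling step each) followed by a classifier that selects the most probable gesture type can be sketched as follows (all function names are placeholders; the pooling shown here is a simple stride-2 down-sampling along time, an assumption for illustration):

```python
import numpy as np

def forward_sketch(x, conv_units, classify):
    """Shape-level sketch of the recognition network.

    Three computing units are applied in turn: each runs two space-time
    convolution units back to back (the second consumes the output of
    the first) and then a pooling step halves the temporal resolution.
    The classifier maps the pooled features to per-gesture scores; the
    gesture type with the largest probability is the result.
    """
    for conv_a, conv_b in conv_units:  # 3 units x 2 convs = 6 convs
        x = conv_b(conv_a(x))
        x = x[..., ::2]                # pooling: down-sample time axis
    probs = classify(x)
    probs = probs / probs.sum()        # normalize scores to probabilities
    return int(np.argmax(probs)), probs
```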
For the structural schematic diagram of the space-time graph convolutional neural network gesture recognition model disclosed in the above embodiment of the present invention and the corresponding working process, reference can be made to the content of the corresponding method embodiments, as shown in Fig. 6; details are not repeated here.
Optionally, in another embodiment of the present invention, an implementation of the space-time convolution unit includes:
An attention model, a graph convolution model and a temporal convolution model;
Wherein, the attention model is used to constrain the recognition range of the graph convolution model; the graph convolution model is used to recognize the data to be calculated within its recognition range, obtaining the spatial structure set between the finger joint points; the temporal convolution model is used to calculate on the spatial structure set of the finger joint points obtained by the graph convolution model, obtaining the high-dimensional finger joint point information.
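A single-channel toy version of such a unit (illustrative only; the patent does not specify the attention, aggregation, or kernel forms): the attention matrix masks the adjacency before a degree-normalized graph convolution over joints, and a 1-D temporal convolution then filters each joint's sequence:

```python
import numpy as np

def st_conv_unit(x, adj, attention, w_graph, w_time):
    """One space-time convolution unit (illustrative, single channel).

    attention masks the adjacency so the graph convolution only looks
    within its constrained recognition range; the graph convolution
    aggregates each joint's neighbourhood spatially; the temporal
    convolution then filters each joint's sequence along time.

    x: (joints, frames); adj, attention: (joints, joints);
    w_time: odd-length 1-D kernel.
    """
    a = adj * attention                  # attention-constrained graph
    deg = a.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0
    spatial = (a / deg) @ x * w_graph    # graph convolution step
    pad = len(w_time) // 2
    out = np.empty_like(spatial)
    for j in range(spatial.shape[0]):    # temporal convolution per joint
        out[j] = np.convolve(np.pad(spatial[j], pad), w_time, mode="valid")
    return out
```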
In the specific implementation of this embodiment, a residual mechanism can also be used in the space-time convolution unit. Specifically, increasing the depth of the network cannot be achieved by simply stacking layer upon layer: a gradient vanishing problem appears and deep networks become difficult to train, because as gradients propagate back toward the front layers, the repeated multiplications can make them infinitesimally small. As a result, the deeper the network, the more its performance tends to saturate and then degrade rapidly. Adding a residual mechanism can reduce the error rate of the space-time convolution unit during calculation.
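The residual mechanism amounts to wrapping a unit with an identity shortcut (a generic sketch, not the patent's specific implementation): even when a layer's contribution has effectively vanished, the shortcut still carries the signal, which is why gradients survive deep stacks.

```python
import numpy as np

def residual_block(x, layer):
    """Residual wrapper: output = layer(x) + x.

    The identity shortcut gives gradients a direct path back to earlier
    layers, so stacking many space-time convolution units does not let
    repeated multiplications drive the signal (or its gradient) to zero
    the way a plain deep stack can.
    """
    return layer(x) + x
```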
Likewise, in the specific implementation of this embodiment, random dropout can also be added after each space-time convolution unit, thereby reducing the mutual dependence of different neurons. Specifically, on a given training set some intermediate outputs may come to rely only on certain neurons, which causes overfitting to the training set. Randomly dropping out some neurons deactivates them so that they contribute nothing to the subsequently learned parameters, allowing more neurons to participate in the final output and thereby reducing overfitting.
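Inverted dropout is one common way to realize the random deactivation described above (the patent does not fix the variant; the 1/(1-p) rescaling is the standard inverted-dropout convention, assumed here):

```python
import numpy as np

def dropout(x, p, rng, training=True):
    """Inverted dropout after a space-time convolution unit.

    During training each neuron is zeroed with probability p, so later
    layers cannot come to rely on any single neuron; the survivors are
    rescaled by 1/(1-p) so the expected activation is unchanged. At
    inference time the layer is the identity.
    """
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)
```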
As can be seen from the above scheme, in the device for gesture recognition based on a graph convolutional neural network provided by the present application, after the preprocessing unit 701 obtains the gesture joint-point space-time graph using the pose estimation algorithm, the normalization unit 702 normalizes the gesture joint-point space-time graph to obtain the data to be calculated, allowing the recognition unit 703 to calculate on the data to be calculated through the established space-time graph convolutional neural network gesture recognition model and finally obtain the recognition result. In the space-time graph convolutional neural network gesture recognition model, the calculated data are classified through six space-time convolution units, three pooling layers and one support vector machine classifier, thereby improving both the accuracy and the real-time performance of gesture recognition.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of gesture recognition based on a graph convolutional neural network, characterized by comprising:
preprocessing a gesture data set using a pose estimation algorithm to obtain a gesture joint-point space-time graph; wherein the pose estimation algorithm is developed on the basis of convolutional neural networks and supervised learning, with a deep learning algorithm as its framework;
normalizing the gesture joint-point space-time graph to obtain data to be calculated;
inputting the data to be calculated into a space-time graph convolutional neural network gesture recognition model to obtain a recognition result; wherein the space-time graph convolutional neural network gesture recognition model is constructed using a graph convolutional neural network.
2. The method according to claim 1, characterized in that the preprocessing of the gesture data set using the pose estimation algorithm to obtain the gesture joint-point space-time graph comprises:
calculating the video data in the gesture data set with the pose estimation algorithm to obtain a video frame sequence of the video data; wherein the video frame sequence includes a relationship set of the finger joint points of each frame;
connecting each finger joint point in the relationship set of each frame to obtain a spatial graph of the finger joint points;
connecting the same finger joint point across the frames in the relationship set of each frame to obtain a temporal graph of the finger joint points;
combining the temporal graph of the finger joint points with their spatial graph to construct the gesture joint-point space-time graph.
3. The method according to claim 2, characterized in that the normalizing of the gesture joint-point space-time graph to obtain the data to be calculated comprises:
normalizing the values of each finger joint point in the gesture joint-point space-time graph under the temporal graph and the spatial graph to obtain the data to be calculated for each finger joint point.
4. The method according to claim 1, characterized in that the network structure of the space-time graph convolutional neural network gesture recognition model comprises:
six space-time convolution units, three pooling layers and one support vector machine classifier;
every two space-time convolution units and one pooling layer form a computing unit; the two space-time convolution units in the computing unit run in sequence to extract high-dimensional finger joint point information from the data to be calculated; wherein the latter space-time convolution unit processes the processing result of the former space-time convolution unit; the pooling layer performs a down-sampling operation on the high-dimensional finger joint point information to obtain information to be classified;
the support vector machine classifier performs classification calculation on the information to be classified, obtaining the probability of each gesture type corresponding to the information to be classified, and finds the gesture type with the largest probability among the probabilities of the gesture types corresponding to the information to be classified, obtaining the recognition result.
5. The method according to claim 4, characterized in that each space-time convolution unit comprises: an attention model, a graph convolution model and a temporal convolution model;
wherein the attention model is used to constrain the recognition range of the graph convolution model; the graph convolution model is used to recognize the data to be calculated within its recognition range, obtaining a spatial structure set between the finger joint points; the temporal convolution model is used to calculate on the spatial structure set of the finger joint points obtained by the graph convolution model, obtaining the high-dimensional finger joint point information.
6. A device for gesture recognition based on a graph convolutional neural network, characterized by comprising:
a preprocessing unit, configured to preprocess a gesture data set using a pose estimation algorithm to obtain a gesture joint-point space-time graph; wherein the pose estimation algorithm is developed on the basis of convolutional neural networks and supervised learning, with a deep learning algorithm as its framework;
a normalization unit, configured to normalize the gesture joint-point space-time graph to obtain data to be calculated;
a recognition unit, configured to input the data to be calculated into a space-time graph convolutional neural network gesture recognition model to obtain a recognition result; wherein the space-time graph convolutional neural network gesture recognition model is constructed using a graph convolutional neural network.
7. The device according to claim 6, characterized in that the preprocessing unit comprises:
a pose estimation computing unit, configured to calculate the video data in the gesture data set with the pose estimation algorithm to obtain a video frame sequence of the video data; wherein the video frame sequence includes a relationship set of the finger joint points of each frame;
a first connection unit, configured to connect each finger joint point in the relationship set of each frame, obtaining a spatial graph of the finger joint points;
a second connection unit, configured to connect the same finger joint point across the frames in the relationship set of each frame, obtaining a temporal graph of the finger joint points;
a combining unit, configured to combine the temporal graph of the finger joint points with their spatial graph, constructing the gesture joint-point space-time graph.
8. The device according to claim 6, characterized in that the normalization unit comprises:
a normalization subunit, configured to normalize the values of each finger joint point in the gesture joint-point space-time graph under the temporal graph and the spatial graph, obtaining the data to be calculated for each finger joint point.
9. The device according to claim 6, characterized in that the network structure of the space-time graph convolutional neural network gesture recognition model comprises:
six space-time convolution units, three pooling layer units and one support vector machine classifier unit;
every two space-time convolution units and one pooling layer form a computing unit; the two space-time convolution units in the computing unit run in sequence to extract high-dimensional finger joint point information from the data to be calculated; wherein the latter space-time convolution unit processes the processing result of the former space-time convolution unit;
the pooling layer unit is further configured to perform a down-sampling operation on the high-dimensional finger joint point information, obtaining information to be classified;
the support vector machine classifier unit is configured to perform classification calculation on the information to be classified, obtaining the probability of each gesture type corresponding to the information to be classified, and to find the gesture type with the largest probability among the probabilities of the gesture types corresponding to the information to be classified, obtaining the recognition result.
10. The device according to claim 9, characterized in that each space-time convolution unit comprises:
an attention model, a graph convolution model and a temporal convolution model;
wherein the attention model is used to constrain the recognition range of the graph convolution model; the graph convolution model is used to recognize the data to be calculated within its recognition range, obtaining a spatial structure set between the finger joint points; the temporal convolution model is used to calculate on the spatial structure set of the finger joint points obtained by the graph convolution model, obtaining the high-dimensional finger joint point information.
CN201910676491.4A 2019-07-25 2019-07-25 Method and device for gesture recognition based on a graph convolutional neural network Pending CN110390305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910676491.4A CN110390305A (en) 2019-07-25 2019-07-25 Method and device for gesture recognition based on a graph convolutional neural network


Publications (1)

Publication Number Publication Date
CN110390305A true CN110390305A (en) 2019-10-29

Family

ID=68287397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910676491.4A Pending CN110390305A (en) Method and device for gesture recognition based on a graph convolutional neural network

Country Status (1)

Country Link
CN (1) CN110390305A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095862A (en) * 2015-07-10 2015-11-25 南开大学 Human gesture recognizing method based on depth convolution condition random field
CN108875708A (en) * 2018-07-18 2018-11-23 广东工业大学 Behavior analysis method, device, equipment, system and storage medium based on video
US20190095806A1 (en) * 2017-09-28 2019-03-28 Siemens Aktiengesellschaft SGCNN: Structural Graph Convolutional Neural Network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wan Xiaoyi: "Research on 3D Human Behavior Recognition Based on Spatio-Temporal Structural Relationships", China Master's Theses Full-text Database, Information Science and Technology *
Ma Jing: "Research and Implementation of Behavior Recognition Methods Based on Pose and Skeleton Information", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291713B (en) * 2020-02-27 2023-05-16 山东大学 Gesture recognition method and system based on skeleton
CN111291713A (en) * 2020-02-27 2020-06-16 山东大学 Gesture recognition method and system based on skeleton
CN111476181A (en) * 2020-04-13 2020-07-31 河北工业大学 Human skeleton action recognition method
CN111476181B (en) * 2020-04-13 2022-03-04 河北工业大学 Human skeleton action recognition method
WO2021218126A1 (en) * 2020-04-26 2021-11-04 武汉Tcl集团工业研究院有限公司 Gesture identification method, terminal device, and computer readable storage medium
CN111737909B (en) * 2020-06-10 2021-02-09 哈尔滨工业大学 Structural health monitoring data anomaly identification method based on space-time graph convolutional network
CN111737909A (en) * 2020-06-10 2020-10-02 哈尔滨工业大学 Structural health monitoring data anomaly identification method based on space-time graph convolutional network
CN112329525A (en) * 2020-09-27 2021-02-05 中国科学院软件研究所 Gesture recognition method and device based on space-time diagram convolutional neural network
CN112183314A (en) * 2020-09-27 2021-01-05 哈尔滨工业大学(深圳) Expression information acquisition device and expression identification method and system
CN112183314B (en) * 2020-09-27 2023-12-12 哈尔滨工业大学(深圳) Expression information acquisition device, expression recognition method and system
CN112148128A (en) * 2020-10-16 2020-12-29 哈尔滨工业大学 Real-time gesture recognition method and device and man-machine interaction system
CN112148128B (en) * 2020-10-16 2022-11-25 哈尔滨工业大学 Real-time gesture recognition method and device and man-machine interaction system
CN112543936A (en) * 2020-10-29 2021-03-23 香港应用科技研究院有限公司 Motion structure self-attention-seeking convolutional network for motion recognition
CN112543936B (en) * 2020-10-29 2021-09-28 香港应用科技研究院有限公司 Motion structure self-attention-drawing convolution network model for motion recognition
WO2022088176A1 (en) * 2020-10-29 2022-05-05 Hong Kong Applied Science and Technology Research Institute Company Limited Actional-structural self-attention graph convolutional network for action recognition
CN113673560A (en) * 2021-07-15 2021-11-19 华南理工大学 Human behavior identification method based on multi-stream three-dimensional adaptive graph convolution
CN113673560B (en) * 2021-07-15 2023-06-09 华南理工大学 Human behavior recognition method based on multi-flow three-dimensional self-adaptive graph convolution
CN114155604A (en) * 2021-12-03 2022-03-08 哈尔滨理工大学 Dynamic gesture recognition method based on 3D convolutional neural network
CN115546824A (en) * 2022-04-18 2022-12-30 荣耀终端有限公司 Taboo picture identification method, equipment and storage medium
CN115546824B (en) * 2022-04-18 2023-11-28 荣耀终端有限公司 Taboo picture identification method, apparatus and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191029