CN112270220A - Sewing gesture recognition method based on deep learning - Google Patents
Info
- Publication number
- CN112270220A (application CN202011096967.6A)
- Authority
- CN
- China
- Prior art keywords
- sewing
- gesture data
- formula
- information
- calculating
- Prior art date
- Legal status
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Sewing Machines And Sewing (AREA)
Abstract
The invention discloses a sewing gesture recognition method based on deep learning, which is implemented according to the following steps: step 1, data set collection and preprocessing; step 2, feeding the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training; step 3, taking the output of the GRU network as the input of a DNN neural network for further feature extraction, forming a GRU-DNN network for recognizing sewing gestures; and step 4, sending the features extracted in step 3 into an SVM classifier for action classification. The invention solves the problems that, in the prior art, a DNN cannot handle changes over a time sequence during behavior detection and the basic RNN structure suffers from vanishing gradients during detection, which makes the recognition inaccurate.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence, and relates to a sewing gesture recognition method based on deep learning.
Background
With rising labor costs and advances in computer technology, "man + machine + environment" systems have become an irreversible trend. Deep-learning techniques have achieved remarkable results in the field of behavior detection: they overcome the limitation of traditional hand-crafted feature methods, which can only recognize behavior in simple scenes, optimize the classification task more effectively, and extract feature information from data more efficiently.
Existing sewing gesture recognition mainly relies on recurrent neural networks, whose representative models are the RNN (recurrent neural network), the LSTM, and the GRU (gated recurrent unit). The RNN model can connect the current process with past states and has a certain memory capability. The LSTM and GRU models are structural variants of the RNN; compared with the RNN, the LSTM enables the recurrent network to memorize past information while selectively forgetting unimportant information. Compared with the LSTM structure, the GRU neural network can exploit all of the picture information during recognition, alleviates the vanishing-gradient problem on long sequences on the basis of the LSTM, and has a simpler structure and a better recognition effect than the LSTM structural model. The DNN (deep neural network), a feedforward artificial neural network, is also widely used in the field of behavior recognition; it can handle deep-level problems and extract deep features well. However, a DNN cannot handle changes over a time sequence during behavior detection, the basic RNN structure suffers from vanishing gradients during detection, and the RNN is weaker than the DNN at capturing deep-level image information.
Disclosure of Invention
The invention aims to provide a sewing gesture recognition method based on deep learning, solving the problems that, in the prior art, the DNN cannot handle changes over a time sequence during behavior detection and the basic RNN network structure suffers from vanishing gradients during detection, which makes the recognition inaccurate.
The invention adopts the technical scheme that a sewing gesture recognition method based on deep learning is implemented according to the following steps:
step 1, data set collection and preprocessing;
step 2, feeding the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training;
step 3, taking the output result of the GRU network as the input of the DNN neural network for further feature extraction to form the GRU-DNN network for identifying the sewing gesture;
and 4, sending the features extracted in the step 3 into an SVM classifier for action classification.
The present invention is also characterized in that,
the step 1 specifically comprises the following steps:
step 1.1, collecting sewing gesture data pictures, and carrying out color correction on the collected sewing gesture data pictures through a dynamic threshold method so as to eliminate the influence of illumination on color rendering;
step 1.2, adjusting the brightness of the sewing gesture data picture processed in the step 1.1 to be 0.6 to 1.5 times of the original brightness;
and step 1.3, leaving the brightness-adjusted sewing gesture data picture from step 1.2 unrotated or randomly rotating it by 90, 180 or 270 degrees, to obtain the preprocessed sewing gesture data pictures serving as the training set.
The step 1.1 specifically comprises the following steps:
step 1.1.1, dividing each sewing gesture data picture in a training set into a plurality of areas;
step 1.1.2, calculating Cb and Cr for the pixel points in each region, and the averages Mb and Mr of Cb and Cr over all pixel points in each region, where Cb denotes the color saturation of a pixel point and Cr denotes the hue of a pixel point;
Cb=-0.169×R-0.331×G+0.500×B (1)
Cr=0.500×R-0.419×G-0.081×B (2)
where R, G and B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the hue of the nth pixel point in the corresponding region, and N is the number of pixel points in the corresponding region;
step 1.1.3, separately calculating for each region the cumulative values Db and Dr of the absolute differences of the Cb and Cr components; together with the averages Mb and Mr of step 1.1.2, the calculation formulas are as follows:
Mb=(ΣCb(n))/N (3)
Mr=(ΣCr(n))/N (4)
Db=(Σ|Cb(n)-Mb|)/N (5)
Dr=(Σ|Cr(n)-Mr|)/N (6)
wherein the sums run over n=1…N, N is the number of pixel points in each region, Cb(n) is the color saturation of the nth pixel point in the corresponding region, and Cr(n) is the hue of the nth pixel point in the corresponding region;
step 1.1.4, examining the Db/Dr value for each pixel point; if the Db/Dr value is smaller than the Mb/Mr value of the corresponding region, the pixel points of the corresponding region are ignored;
step 1.1.5, for each sewing gesture data picture, after removing the pixel points ignored by the judgment of step 1.1.4, re-solving Mb, Mr, Db and Dr for each region according to formulas (3) to (6); then respectively summing the Mb, Mr, Db and Dr of all regions and taking the averages as the MB, MR, DB and DR values of the corresponding sewing gesture data picture, wherein MB is the average color saturation of the whole sewing gesture data picture, MR is the average hue of the whole sewing gesture data picture, DB is the cumulative absolute-difference value of the color saturation of the whole sewing gesture data picture, and DR is the cumulative absolute-difference value of the hue of the whole sewing gesture data picture;
step 1.1.6, if the pixel point in each region simultaneously satisfies the formulas (7) and (8), the pixel point is preliminarily determined as a white reference point:
|Cb(n)-(Mb+Db×sign(Mb))|<1.5×DB (7)
|Cr(n)-(1.5×Mr×sign(Mr))|<1.5×DR (8)
in the formula, Mb and Mr are the averages of the saturation and hue components (Cb and Cr) of the sewing gesture data picture, Db and Dr are the calculated cumulative values of the absolute differences of the saturation and hue components for each small region, sign is the sign (signum) function, DB is the cumulative absolute-difference value of the color saturation of the whole sewing gesture data picture, and DR is the cumulative absolute-difference value of the hue of the whole sewing gesture data picture;
step 1.1.7, sorting the preliminarily determined white reference points in each area by brightness and taking the brightest 10% as the finally determined white reference points;
step 1.1.8, calculating the averages Raver, Gaver and Baver of the brightness of all white reference points in each area:
Raver=(R1+R2+……+Rm)/m (9)
Gaver=(G1+G2+……+Gm)/m (10)
Baver=(B1+B2+……+Bm)/m (11)
wherein m is the number of finally determined white reference points in the corresponding area, R1, R2……Rm are the red-channel color components of the white reference points, G1, G2……Gm are the green-channel color components of the white reference points, and B1, B2……Bm are the blue-channel color components of the white reference points;
step 1.1.9, calculating the gain of each channel, wherein the calculation formula is as follows:
Rgain=Ymax/Raver (12)
Ggain=Ymax/Gaver (13)
Bgain=Ymax/Baver (14)
Y=0.299×R+0.587×G+0.114×B (15)
in the formula: y ismaxIs the maximum value of the Y component in the color space in the whole image, Raver、Gaver、BaverThe average value of the brightness of the white reference point is R, G, B, and the component values of red, green and blue of each pixel point in the collected sewing gesture data image are R, G, B;
step 1.1.10, calculate the final color for each channel:
R′=R×Rgain (16)
G′=G×Ggain (17)
B′=B×Bgain (18)
in the formula, R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, and R′, G′ and B′ are the red, green and blue components of the pixel points in the corrected sewing gesture data image.
The step 2 specifically comprises the following steps:
step 2.1, storing the red, green and blue channel color components of the corrected sewing gesture data picture obtained in step 1.1.10 in the computer in matrix form, then converting the three matrices into a column vector X, which is fed into the GRU network structure as the feature vector;
step 2.2, calculating the value of an update gate in the GRU network structure, specifically:
determining how much information from the previous moment is carried over to the current moment; the calculation formula is as follows:
Zt=σ(W×Xt+U×ht-1) (19)
in the formula, Xt is the t-th component of the input feature vector X, ht-1 is the stored information of step t-1, σ is the logistic sigmoid function, and W and U are weight matrices; the update gate adds the two parts of information and passes the sum through the sigmoid activation function, which compresses the result to between 0 and 1; the update gate controls the degree to which the state of the previous moment is brought into the current state, i.e., how much information from the previous moment is applied to the current moment;
step 2.3, calculating the reset gate, which determines how much past information needs to be forgotten; the calculation formula is as follows:
r(t)=σ(W×Xt+U×ht-1) (20)
in the formula: w and U are weight matrices, XtFor the t-th component of the input feature vector X, ht-1The information of the t-1 step is stored;
step 2.4, calculating the current memory content, and storing the current memory content in a reset gate, wherein the calculation formula is as follows:
h′t=tanh(Wxt+rt⊙Uht-1) (21)
in the formula, rt is the output value of the reset gate, Xt is the t-th component of the input sequence X, and ht-1 is the stored information of step t-1;
step 2.5, the final output of the gated recurrent unit is obtained by adding the information retained from the previous moment into the final memory to the information retained from the current memory content into the final memory; the calculation formula is as follows:
ht=Zt⊙ht-1+(1-Zt)⊙h′t (22)
in the formula, Zt is the result of the update gate, ht-1 is the saved information of step t-1, Zt⊙ht-1 denotes the information from the previous step that is retained in the final memory, h′t is the current memory content, and (1-Zt)⊙h′t denotes the information from the current memory content that is retained in the final memory; this completes the data training.
The step 3 specifically comprises the following steps:
step 3.1, taking the final memory information stored in the GRU network structure as the input of the DNN neural network, and then initializing the parameters, namely the weights W and the biases b;
step 3.2, calculating the activation function, whose calculation formula is as follows:
σ(z)=1/(1+e^(-z)) (23)
wherein z is the independent variable (the pre-activation input);
and 3.3, carrying out forward propagation to obtain an output result, wherein an output formula is as follows:
al=σ(Wl×al-1+bl) (24)
wherein l denotes the layer number, al-1 is the output of layer l-1 of the neural network, al is the output of layer l of the neural network, Wl is the weight matrix of layer l, and bl is the bias of layer l;
step 3.4, calculating the loss function, whose calculation formula is as follows:
J(W,b,x,y)=1/2×||al-y||² (25)
in the formula, al is the output of layer l of the neural network, x is the sequence output by the trained GRU neural network, and y is the true training sample output;
and step 3.5, performing back propagation, wherein the formulas for updating the parameters W and b of each layer are as follows:
Zl=Wl×al-1+bl (26)
wherein Zl is the pre-activation output of layer l; taking the partial derivative of the loss function with respect to Zl gives:
∂J/∂Zl=∂J/∂al⊙σ′(Zl) (27)
taking the partial derivative of the loss function with respect to Wl gives:
∂J/∂Wl=∂J/∂Zl×(al-1)^T (28)
taking the partial derivative of the loss function with respect to bl gives:
∂J/∂bl=∂J/∂Zl (29)
wherein al-1 is the output of layer l-1 of the neural network and bl is the bias of layer l;
solving formulas (24) to (29) jointly yields Wl and bl, realizing the continuous updating of Wl and bl.
And step 3.6, computing layer by layer from the input layer until the output layer is reached, obtaining the final feature extraction result.
The invention has the beneficial effects that:
the invention discloses a sewing gesture recognition method based on deep learning, which combines a GRU network structure and a DNN network structure for behavior detection according to strong relevance of the GRU network structure in time and space during detection and effectiveness of the DNN network structure in extracting deep features. And carrying out color correction on the input data by using a dynamic threshold value method so as to eliminate the influence of illumination on color rendering. The pictures are rotated by 90 degrees, 180 degrees and 270 degrees to enhance the robustness of each angle in the imaging process. The preprocessed data are trained by utilizing the GRU network structure, the output result is sent into the DNN network structure as the input data of the DNN network structure for further feature extraction, compared with a single DNN network structure, the GRU-DNN network structure fully utilizes information on a time sequence and can obtain information of a deeper image when behavior detection is carried out, and the recognition effect is more accurate compared with the single network structure.
Drawings
FIG. 1 is an overall flow chart of a sewing gesture recognition method based on deep learning according to the present invention;
FIG. 2 is a flow chart of color correction during data preprocessing in the sewing gesture recognition method based on deep learning according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a sewing gesture recognition method based on deep learning, the flow of which is shown in figure 1 and is implemented according to the following steps:
step 1, data set collection and preprocessing, which specifically comprises the following steps:
step 1.1, collecting sewing gesture data pictures and performing color correction on them with a dynamic threshold method to eliminate the influence of illumination on color rendering; color correction is needed mainly because there is a certain deviation between the acquired image and the real image, and the dynamic threshold algorithm eliminates the influence of illumination on color rendering, as shown in FIG. 2, specifically:
step 1.1.1, dividing each sewing gesture data picture in a training set into a plurality of areas;
step 1.1.2, calculating Cb and Cr for the pixel points in each region, and the averages Mb and Mr of Cb and Cr over all pixel points in each region, where Cb denotes the color saturation of a pixel point and Cr denotes the hue of a pixel point;
Cb=-0.169×R-0.331×G+0.500×B (1)
Cr=0.500×R-0.419×G-0.081×B (2)
where R, G and B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the hue of the nth pixel point in the corresponding region, and N is the number of pixel points in the corresponding region;
step 1.1.3, separately calculating for each region the cumulative values Db and Dr of the absolute differences of the Cb and Cr components; together with the averages Mb and Mr of step 1.1.2, the calculation formulas are as follows:
Mb=(ΣCb(n))/N (3)
Mr=(ΣCr(n))/N (4)
Db=(Σ|Cb(n)-Mb|)/N (5)
Dr=(Σ|Cr(n)-Mr|)/N (6)
wherein the sums run over n=1…N, N is the number of pixel points in each region, Cb(n) is the color saturation of the nth pixel point in the corresponding region, and Cr(n) is the hue of the nth pixel point in the corresponding region;
step 1.1.4, examining the Db/Dr value for each pixel point; if the Db/Dr value is smaller than the Mb/Mr value of the corresponding region, the pixel points of the corresponding region are ignored;
step 1.1.5, for each sewing gesture data picture, after removing the pixel points ignored by the judgment of step 1.1.4, re-solving Mb, Mr, Db and Dr for each region according to formulas (3) to (6); then respectively summing the Mb, Mr, Db and Dr of all regions and taking the averages as the MB, MR, DB and DR values of the corresponding sewing gesture data picture, wherein MB is the average color saturation of the whole sewing gesture data picture, MR is the average hue of the whole sewing gesture data picture, DB is the cumulative absolute-difference value of the color saturation of the whole sewing gesture data picture, and DR is the cumulative absolute-difference value of the hue of the whole sewing gesture data picture;
step 1.1.6, if the pixel point in each region simultaneously satisfies the formulas (7) and (8), the pixel point is preliminarily determined as a white reference point:
|Cb(n)-(Mb+Db×sign(Mb))|<1.5×DB (7)
|Cr(n)-(1.5×Mr×sign(Mr))|<1.5×DR (8)
in the formula, Mb and Mr are the averages of the saturation and hue components (Cb and Cr) of the sewing gesture data picture, Db and Dr are the calculated cumulative values of the absolute differences of the saturation and hue components for each small region, sign is the sign (signum) function, DB is the cumulative absolute-difference value of the color saturation of the whole sewing gesture data picture, and DR is the cumulative absolute-difference value of the hue of the whole sewing gesture data picture;
step 1.1.7, sorting the preliminarily determined white reference points in each area by brightness and taking the brightest 10% as the finally determined white reference points;
step 1.1.8, calculating the averages Raver, Gaver and Baver of the brightness of all white reference points in each area:
Raver=(R1+R2+……+Rm)/m (9)
Gaver=(G1+G2+……+Gm)/m (10)
Baver=(B1+B2+……+Bm)/m (11)
wherein m is the number of finally determined white reference points in the corresponding area, R1, R2……Rm are the red-channel color components of the white reference points, G1, G2……Gm are the green-channel color components of the white reference points, and B1, B2……Bm are the blue-channel color components of the white reference points;
step 1.1.9, calculating the gain of each channel, wherein the calculation formula is as follows:
Rgain=Ymax/Raver (12)
Ggain=Ymax/Gaver (13)
Bgain=Ymax/Baver (14)
Y=0.299×R+0.587×G+0.114×B (15)
in the formula: y ismaxIs the maximum value of the Y component in the color space in the whole image, Raver、Gaver、BaverThe average value of the brightness of the white reference point is R, G, B, and the component values of red, green and blue of each pixel point in the collected sewing gesture data image are R, G, B;
step 1.1.10, calculate the final color for each channel:
R′=R×Rgain (16)
G′=G×Ggain (17)
B′=B×Bgain (18)
in the formula, R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, and R′, G′ and B′ are the red, green and blue components of the pixel points in the corrected sewing gesture data image;
step 1.2, adjusting the brightness of the sewing gesture data picture processed in the step 1.1 to be 0.6 to 1.5 times of the original brightness;
step 1.3, leaving the brightness-adjusted sewing gesture data picture from step 1.2 unrotated or randomly rotating it by 90, 180 or 270 degrees to enhance robustness to different imaging angles, obtaining the preprocessed sewing gesture data pictures serving as the training set; a code sketch of this preprocessing step is given below;
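As a concrete illustration of step 1, the following Python/NumPy sketch implements the dynamic-threshold color correction of steps 1.1.1 to 1.1.10 together with the brightness adjustment of step 1.2 and the random rotation of step 1.3. It is only a minimal sketch under stated assumptions: the 4×4 region grid, the per-region thresholds used in the near-white test, the clipping to [0, 255] and all function names are choices of this example rather than requirements of the method.

```python
import numpy as np

def dynamic_threshold_white_balance(img, grid=(4, 4)):
    """White-balance an RGB image (H, W, 3), values in [0, 255], in the spirit of
    steps 1.1.1-1.1.10; the grid size and per-region thresholds are assumptions."""
    img = img.astype(np.float64)
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    Cb = -0.169 * R - 0.331 * G + 0.500 * B              # formula (1)
    Cr = 0.500 * R - 0.419 * G - 0.081 * B               # formula (2)
    Y = 0.299 * R + 0.587 * G + 0.114 * B                # formula (15)

    H, W = Y.shape
    hs, ws = H // grid[0], W // grid[1]
    cand = np.zeros((H, W), dtype=bool)                  # candidate white points
    for i in range(grid[0]):
        for j in range(grid[1]):
            sl = (slice(i * hs, (i + 1) * hs), slice(j * ws, (j + 1) * ws))
            cb, cr = Cb[sl], Cr[sl]
            Mb, Mr = cb.mean(), cr.mean()                # per-region means, formulas (3)-(4)
            Db = np.abs(cb - Mb).mean()                  # per-region deviations, formulas (5)-(6)
            Dr = np.abs(cr - Mr).mean()
            # near-white test in the spirit of formulas (7)-(8)
            cand[sl] = (np.abs(cb - (Mb + Db * np.sign(Mb))) < 1.5 * Db) & \
                       (np.abs(cr - (1.5 * Mr * np.sign(Mr))) < 1.5 * Dr)
    if not cand.any():
        return img                                       # nothing to correct against
    ref = cand & (Y >= np.percentile(Y[cand], 90))       # brightest 10 %, step 1.1.7
    Raver, Gaver, Baver = R[ref].mean(), G[ref].mean(), B[ref].mean()   # (9)-(11)
    gains = Y.max() / np.array([Raver, Gaver, Baver])    # formulas (12)-(14)
    return np.clip(img * gains, 0, 255)                  # formulas (16)-(18)

def augment(img, rng=np.random.default_rng()):
    """Brightness jitter of step 1.2 and random 0/90/180/270 degree rotation of step 1.3."""
    img = np.clip(img * rng.uniform(0.6, 1.5), 0, 255)
    return np.rot90(img, k=int(rng.integers(0, 4)))
```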
step 2, feeding the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training, which specifically comprises the following steps:
step 2.1, storing the red, green and blue channel color components of the corrected sewing gesture data picture obtained in step 1.1.10 in the computer in matrix form, then converting the three matrices into a column vector X, which is fed into the GRU network structure as the feature vector; for example:
suppose that the R′, G′ and B′ obtained in step 1.1.10 are each stored in the computer as a 3×3 matrix. The three matrices represent the preprocessed image, and the values in the matrices correspond to the red, green and blue intensity values of the image. For convenient feature extraction by the neural network, the 3 matrices are converted into 1 vector X; in this example the R′, G′ and B′ matrices are each of size 3×3, so the total dimension of the vector X is 3×3×3=27. In the field of artificial intelligence, each value fed into the neural network is called a feature, so the above example has 27 features; the 27-dimensional vector is also called a feature vector, and the neural network receives this feature vector as input for prediction;
the converted feature vectors are sent into a GRU network structure, and values of an update gate and a reset gate in the GRU network structure are calculated respectively;
step 2.2, calculating the value of an update gate in the GRU network structure, specifically:
determining how much information from the previous moment is carried over to the current moment; the calculation formula is as follows:
Zt=σ(W×Xt+U×ht-1) (19)
in the formula, Xt is the t-th component of the input feature vector X, ht-1 is the stored information of step t-1, σ is the logistic sigmoid function, and W and U are weight matrices; the update gate adds the two parts of information and passes the sum through the sigmoid activation function, which compresses the result to between 0 and 1; the update gate controls the degree to which the state of the previous moment is brought into the current state, i.e., how much information from the previous moment is applied to the current moment; the larger Zt is, the more information from the previous moment is brought in;
step 2.3, calculating the reset gate, which determines how much past information needs to be forgotten; the calculation formula is as follows:
r(t)=σ(W×Xt+U×ht-1) (20)
in the formula: w and U are weight matrices, XtFor the t-th component of the input feature vector X,ht-1the information of the t-1 step is stored;
step 2.4, calculating the current memory content, and storing the current memory content in a reset gate, wherein the calculation formula is as follows:
h′t=tanh(Wxt+rt⊙Uht-1) (21)
in the formula, rt is the output value of the reset gate, Xt is the t-th component of the input sequence X, and ht-1 is the stored information of step t-1;
step 2.5, the final output of the gated recurrent unit is obtained by adding the information retained from the previous moment into the final memory to the information retained from the current memory content into the final memory; the calculation formula is as follows:
ht=Zt⊙ht-1+(1-Zt)⊙h′t (22)
in the formula, Zt is the result of the update gate, ht-1 is the saved information of step t-1, Zt⊙ht-1 denotes the information from the previous step that is retained in the final memory, h′t is the current memory content, and (1-Zt)⊙h′t denotes the information from the current memory content that is retained in the final memory; this completes the data training (a sketch of this GRU computation is given below);
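To make step 2 concrete, the sketch below first flattens the corrected R′, G′ and B′ matrices into the feature vector X of step 2.1 and then applies one gated recurrent unit step per frame following formulas (19) to (22). The use of separate weight matrices per gate, the 27-dimensional input, the 16-dimensional hidden state and the random stand-in frames are assumptions of this sketch; the patent itself writes a single W and U for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def image_to_feature_vector(r, g, b):
    """Step 2.1: concatenate the corrected R', G', B' matrices into one vector X;
    three 3x3 matrices give a 27-dimensional feature vector."""
    return np.concatenate([r.ravel(), g.ravel(), b.ravel()])

def gru_step(x_t, h_prev, p):
    """One GRU step following formulas (19)-(22); p holds the gate weight matrices."""
    z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev)              # update gate, formula (19)
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev)              # reset gate, formula (20)
    h_cand = np.tanh(p["Wh"] @ x_t + r_t * (p["Uh"] @ h_prev))   # current memory, formula (21)
    return z_t * h_prev + (1.0 - z_t) * h_cand                   # final memory h_t, formula (22)

# usage: run a short sequence of frame feature vectors through the GRU
rng = np.random.default_rng(0)
dim_x, dim_h = 27, 16                                            # 27 features per frame (step 2.1 example)
params = {name: rng.normal(scale=0.1,
                           size=(dim_h, dim_x if name.startswith("W") else dim_h))
          for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
h = np.zeros(dim_h)
for frame in rng.normal(size=(5, dim_x)):                        # 5 stand-in RGB frame vectors
    h = gru_step(frame, h, params)                               # h is fed to the DNN in step 3
```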
step 3, taking the output result of the GRU network as the input of the DNN neural network for further feature extraction to form the GRU-DNN network for identifying the sewing gesture; the method specifically comprises the following steps:
step 3.1, taking the final memory information stored in the GRU network structure as the input of the DNN neural network, and then initializing the parameters, namely the weights W and the biases b;
step 3.2, calculating the activation function, whose calculation formula is as follows:
σ(z)=1/(1+e^(-z)) (23)
wherein z is the independent variable (the pre-activation input);
step 3.3, performing forward propagation: the forward propagation algorithm uses the weight coefficient matrices W, the bias vectors b and the input value vector X to carry out a series of linear operations and activation operations, computing layer by layer from the input layer until the output layer is reached, and the output formula of each layer is as follows:
al=σ(Wl×al-1+bl) (24)
wherein l denotes the layer number, al-1 is the output of layer l-1 of the neural network, al is the output of layer l of the neural network, Wl is the weight matrix of layer l, and bl is the bias of layer l;
step 3.4, calculating the loss function, whose calculation formula is as follows:
J(W,b,x,y)=1/2×||al-y||² (25)
in the formula, al is the output of layer l of the neural network, x is the sequence output by the trained GRU neural network, and y is the true training sample output;
and step 3.5, performing back propagation to continuously update the parameters W and b: the back propagation algorithm finds suitable linear coefficient matrices W and bias vectors b so that the outputs computed from all input training samples are equal to, or as close as possible to, the sample outputs; the formulas for updating the parameters W and b of each layer are as follows:
Zl=Wl×al-1+bl (26)
wherein Zl is the pre-activation output of layer l; taking the partial derivative of the loss function with respect to Zl gives:
∂J/∂Zl=∂J/∂al⊙σ′(Zl) (27)
taking the partial derivative of the loss function with respect to Wl gives:
∂J/∂Wl=∂J/∂Zl×(al-1)^T (28)
taking the partial derivative of the loss function with respect to bl gives:
∂J/∂bl=∂J/∂Zl (29)
wherein al-1 is the output of layer l-1 of the neural network and bl is the bias of layer l;
solving formulas (24) to (29) jointly yields Wl and bl, realizing the continuous updating of Wl and bl.
Step 3.6, computing layer by layer from the input layer until the output layer is reached; through this computation, the outputs computed from the training samples are made as close as possible to the true training sample outputs, and the outputs computed from the training samples at this point serve as the finally extracted features, as illustrated by the sketch below;
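The following sketch mirrors step 3 under stated assumptions: a small fully connected network whose forward pass follows formula (24), trained by back propagation and gradient descent consistent with formulas (26) to (29), with a quadratic loss and sigmoid activations assumed throughout. The layer sizes, learning rate and dummy target are illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DNNFeatureExtractor:
    """Minimal DNN for step 3: forward pass per formula (24), back propagation and
    gradient-descent updates of W and b consistent with formulas (26)-(29)."""

    def __init__(self, sizes, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = [rng.normal(scale=0.1, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(m) for m in sizes[1:]]
        self.lr = lr

    def forward(self, x):
        acts = [x]
        for W, b in zip(self.W, self.b):
            acts.append(sigmoid(W @ acts[-1] + b))   # Z^l = W^l a^(l-1) + b^l (26); a^l = sigma(Z^l) (24)
        return acts

    def train_step(self, x, y):
        acts = self.forward(x)
        # assumed quadratic loss J = 1/2 * ||a^L - y||^2; delta = dJ/dZ at the output layer
        delta = (acts[-1] - y) * acts[-1] * (1.0 - acts[-1])
        for l in range(len(self.W) - 1, -1, -1):
            dW = np.outer(delta, acts[l])            # dJ/dW^l = delta (a^(l-1))^T, formula (28)
            db = delta                               # dJ/db^l = delta, formula (29)
            if l > 0:                                # propagate delta to the previous layer
                delta = (self.W[l].T @ delta) * acts[l] * (1.0 - acts[l])
            self.W[l] -= self.lr * dW                # continuous update of W^l (step 3.5)
            self.b[l] -= self.lr * db                # continuous update of b^l (step 3.5)
        return acts[-1]                              # output-layer values = extracted features

# usage: map the 16-dimensional GRU state to an 8-dimensional feature vector
net = DNNFeatureExtractor(sizes=[16, 32, 8])
features = net.train_step(np.random.default_rng(1).normal(size=16), np.zeros(8))
```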
and 4, sending the features extracted in the step 3 into an SVM classifier for action classification.
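Step 4 can be realized, for example, with an off-the-shelf support vector machine such as scikit-learn's SVC. The sketch below is purely illustrative: the feature array, the label array, the assumed three gesture classes and the kernel/C hyper-parameters are placeholders, not values specified by the method.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X_feat = rng.normal(size=(40, 8))      # stand-in GRU-DNN feature vectors for 40 clips (step 3 output)
y_act = rng.integers(0, 3, size=40)    # stand-in labels for 3 assumed sewing-action classes

clf = SVC(kernel="rbf", C=1.0)         # assumed kernel and regularization strength
clf.fit(X_feat, y_act)                 # train the SVM classifier on the extracted features
pred = clf.predict(X_feat[:5])         # step 4: classify sewing gestures from their features
```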
The sewing gesture recognition method based on deep learning disclosed by the invention combines a GRU network structure and a DNN network structure for behavior detection, exploiting the strong temporal and spatial correlation captured by the GRU network structure during detection and the effectiveness of the DNN network structure at extracting deep features. Color correction is applied to the input data with a dynamic threshold method to eliminate the influence of illumination on color rendering. The pictures are rotated by 90, 180 and 270 degrees to enhance robustness to different imaging angles. The preprocessed data are trained with the GRU network structure, and the output is fed into the DNN network structure as its input for further feature extraction. Compared with a single DNN network structure, the GRU-DNN network structure makes full use of the information in the time sequence and obtains deeper image information during behavior detection, so its recognition is more accurate than that of a single network structure.
Claims (5)
1. A sewing gesture recognition method based on deep learning is characterized by comprising the following steps:
step 1, data set collection and preprocessing;
step 2, feeding the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training;
step 3, taking the output result of the GRU network as the input of the DNN neural network for further feature extraction to form the GRU-DNN network for identifying the sewing gesture;
and 4, sending the features extracted in the step 3 into an SVM classifier for action classification.
2. The sewing gesture recognition method based on deep learning according to claim 1, wherein the step 1 specifically comprises:
step 1.1, collecting sewing gesture data pictures, and carrying out color correction on the collected sewing gesture data pictures through a dynamic threshold method so as to eliminate the influence of illumination on color rendering;
step 1.2, adjusting the brightness of the sewing gesture data picture processed in the step 1.1 to be 0.6 to 1.5 times of the original brightness;
and step 1.3, leaving the brightness-adjusted sewing gesture data picture from step 1.2 unrotated or randomly rotating it by 90, 180 or 270 degrees, to obtain the preprocessed sewing gesture data pictures serving as the training set.
3. The sewing gesture recognition method based on deep learning according to claim 2, wherein the step 1.1 is specifically as follows:
step 1.1.1, dividing each sewing gesture data picture in a training set into a plurality of areas;
step 1.1.2, calculating Cb and Cr for the pixel points in each region, and the averages Mb and Mr of Cb and Cr over all pixel points in each region, where Cb denotes the color saturation of a pixel point and Cr denotes the hue of a pixel point;
Cb=-0.169×R-0.331×G+0.500×B (1)
Cr=0.500×R-0.419×G-0.081×B (2)
where R, G and B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the hue of the nth pixel point in the corresponding region, and N is the number of pixel points in the corresponding region;
step 1.1.3, separately calculating for each region the cumulative values Db and Dr of the absolute differences of the Cb and Cr components; together with the averages Mb and Mr of step 1.1.2, the calculation formulas are as follows:
Mb=(ΣCb(n))/N (3)
Mr=(ΣCr(n))/N (4)
Db=(Σ|Cb(n)-Mb|)/N (5)
Dr=(Σ|Cr(n)-Mr|)/N (6)
wherein the sums run over n=1…N, N is the number of pixel points in each region, Cb(n) is the color saturation of the nth pixel point in the corresponding region, and Cr(n) is the hue of the nth pixel point in the corresponding region;
step 1.1.4, examining the Db/Dr value for each pixel point; if the Db/Dr value is smaller than the Mb/Mr value of the corresponding region, the pixel points of the corresponding region are ignored;
step 1.1.5, for each sewing gesture data picture, after removing the pixel points ignored by the judgment of step 1.1.4, re-solving Mb, Mr, Db and Dr for each region according to formulas (3) to (6); then respectively summing the Mb, Mr, Db and Dr of all regions and taking the averages as the MB, MR, DB and DR values of the corresponding sewing gesture data picture, wherein MB is the average color saturation of the whole sewing gesture data picture, MR is the average hue of the whole sewing gesture data picture, DB is the cumulative absolute-difference value of the color saturation of the whole sewing gesture data picture, and DR is the cumulative absolute-difference value of the hue of the whole sewing gesture data picture;
step 1.1.6, if the pixel point in each region simultaneously satisfies the formulas (7) and (8), the pixel point is preliminarily determined as a white reference point:
|Cb(n)-(Mb+Db×sign(Mb))|<1.5×DB (7)
|Cr(n)-(1.5×Mr×sign(Mr))|<1.5×DR (8)
in the formula, Mb and Mr are the averages of the saturation and hue components (Cb and Cr) of the sewing gesture data picture, Db and Dr are the calculated cumulative values of the absolute differences of the saturation and hue components for each small region, sign is the sign (signum) function, DB is the cumulative absolute-difference value of the color saturation of the whole sewing gesture data picture, and DR is the cumulative absolute-difference value of the hue of the whole sewing gesture data picture;
step 1.1.7, sorting the preliminarily determined white reference points in each area by brightness and taking the brightest 10% as the finally determined white reference points;
step 1.1.8, calculating the averages Raver, Gaver and Baver of the brightness of all white reference points in each area:
Raver=(R1+R2+……+Rm)/m (9)
Gaver=(G1+G2+……+Gm)/m (10)
Baver=(B1+B2+……+Bm)/m (11)
wherein m is the number of finally determined white reference points in the corresponding area, R1, R2……Rm are the red-channel color components of the white reference points, G1, G2……Gm are the green-channel color components of the white reference points, and B1, B2……Bm are the blue-channel color components of the white reference points;
step 1.1.9, calculating the gain of each channel, wherein the calculation formula is as follows:
Rgain=Ymax/Raver (12)
Ggain=Ymax/Gaver (13)
Bgain=Ymax/Baver (14)
Y=0.299×R+0.587×G+0.114×B (15)
in the formula: y ismaxIs the maximum value of the Y component in the color space in the whole image, Raver、Gaver、BaverThe average value of the brightness of the white reference point is R, G, B, and the component values of red, green and blue of each pixel point in the collected sewing gesture data image are R, G, B;
step 1.1.10, calculate the final color for each channel:
R′=R×Rgain (16)
G′=G×Ggain (17)
B′=B×Bgain (18)
in the formula, R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, and R′, G′ and B′ are the red, green and blue components of the pixel points in the corrected sewing gesture data image.
4. The sewing gesture recognition method based on deep learning according to claim 3, wherein the step 2 specifically comprises:
step 2.1, storing the red, green and blue channel color components of the corrected sewing gesture data picture obtained in step 1.1.10 in the computer in matrix form, then converting the three matrices into a column vector X, which is fed into the GRU network structure as the feature vector;
step 2.2, calculating the value of an update gate in the GRU network structure, specifically:
determining how much information from the previous moment is carried over to the current moment; the calculation formula is as follows:
Zt=σ(W×Xt+U×ht-1) (19)
in the formula, Xt is the t-th component of the input feature vector X, ht-1 is the stored information of step t-1, σ is the logistic sigmoid function, and W and U are weight matrices; the update gate adds the two parts of information and passes the sum through the sigmoid activation function, which compresses the result to between 0 and 1; the update gate controls the degree to which the state of the previous moment is brought into the current state, i.e., how much information from the previous moment is applied to the current moment;
step 2.3, calculating the reset gate, which determines how much past information needs to be forgotten; the calculation formula is as follows:
r(t)=σ(W×Xt+U×ht-1) (20)
in the formula: w and U are weight matrices, XtFor the t-th component of the input feature vector X, ht-1The information of the t-1 step is stored;
step 2.4, calculating the current memory content, and storing the current memory content in a reset gate, wherein the calculation formula is as follows:
h′t=tanh(Wxt+rt⊙Uht-1) (21)
in the formula, rt is the output value of the reset gate, Xt is the t-th component of the input sequence X, and ht-1 is the stored information of step t-1;
step 2.5, the final output of the gated recurrent unit is obtained by adding the information retained from the previous moment into the final memory to the information retained from the current memory content into the final memory; the calculation formula is as follows:
ht=Zt⊙ht-1+(1-Zt)⊙h′t (22)
in the formula, Zt is the result of the update gate, ht-1 is the saved information of step t-1, Zt⊙ht-1 denotes the information from the previous step that is retained in the final memory, h′t is the current memory content, and (1-Zt)⊙h′t denotes the information from the current memory content that is retained in the final memory; this completes the data training.
5. The sewing gesture recognition method based on deep learning according to claim 4, wherein the step 3 specifically comprises:
step 3.1, taking the final memory information stored in the GRU network structure as the input of the DNN neural network, and then initializing the parameters, namely the weights W and the biases b;
step 3.2, calculating the activation function, whose calculation formula is as follows:
σ(z)=1/(1+e^(-z)) (23)
wherein z is the independent variable (the pre-activation input);
and 3.3, carrying out forward propagation to obtain an output result, wherein an output formula is as follows:
al=σ(Wl×al-1+bl) (24)
wherein l denotes the layer number, al-1 is the output of layer l-1 of the neural network, al is the output of layer l of the neural network, Wl is the weight matrix of layer l, and bl is the bias of layer l;
step 3.4, calculating the loss function, whose calculation formula is as follows:
J(W,b,x,y)=1/2×||al-y||² (25)
in the formula, al is the output of layer l of the neural network, x is the sequence output by the trained GRU neural network, and y is the true training sample output;
and step 3.5, performing back propagation, wherein the formulas for updating the parameters W and b of each layer are as follows:
Zl=Wl×al-1+bl (26)
wherein Zl is the pre-activation output of layer l; taking the partial derivative of the loss function with respect to Zl gives:
∂J/∂Zl=∂J/∂al⊙σ′(Zl) (27)
taking the partial derivative of the loss function with respect to Wl gives:
∂J/∂Wl=∂J/∂Zl×(al-1)^T (28)
taking the partial derivative of the loss function with respect to bl gives:
∂J/∂bl=∂J/∂Zl (29)
wherein al-1 is the output of layer l-1 of the neural network and bl is the bias of layer l;
solving formulas (24) to (29) jointly yields Wl and bl, realizing the continuous updating of Wl and bl.
And step 3.6, computing layer by layer from the input layer until the output layer is reached, obtaining the final feature extraction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011096967.6A CN112270220B (en) | 2020-10-14 | 2020-10-14 | Sewing gesture recognition method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011096967.6A CN112270220B (en) | 2020-10-14 | 2020-10-14 | Sewing gesture recognition method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112270220A true CN112270220A (en) | 2021-01-26 |
CN112270220B CN112270220B (en) | 2022-02-25 |
Family
ID=74337505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011096967.6A Active CN112270220B (en) | 2020-10-14 | 2020-10-14 | Sewing gesture recognition method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112270220B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230107097A1 (en) * | 2021-10-06 | 2023-04-06 | Fotonation Limited | Method for identifying a gesture |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103208126A (en) * | 2013-04-17 | 2013-07-17 | 同济大学 | Method for monitoring moving object in natural environment |
CN105427261A (en) * | 2015-11-27 | 2016-03-23 | 努比亚技术有限公司 | Method and apparatus for removing image color noise and mobile terminal |
CN105812762A (en) * | 2016-03-23 | 2016-07-27 | 武汉鸿瑞达信息技术有限公司 | Automatic white balance method for processing image color cast |
CN108052884A (en) * | 2017-12-01 | 2018-05-18 | 华南理工大学 | A kind of gesture identification method based on improvement residual error neutral net |
CN108205671A (en) * | 2016-12-16 | 2018-06-26 | 浙江宇视科技有限公司 | Image processing method and device |
CN108537147A (en) * | 2018-03-22 | 2018-09-14 | 东华大学 | A kind of gesture identification method based on deep learning |
US10134421B1 (en) * | 2016-08-04 | 2018-11-20 | Amazon Technologies, Inc. | Neural network based beam selection |
CN108846356A (en) * | 2018-06-11 | 2018-11-20 | 南京邮电大学 | A method of the palm of the hand tracing and positioning based on real-time gesture identification |
CN108965609A (en) * | 2018-08-31 | 2018-12-07 | 南京宽塔信息技术有限公司 | The recognition methods of mobile terminal application scenarios and device |
CN109378064A (en) * | 2018-10-29 | 2019-02-22 | 南京医基云医疗数据研究院有限公司 | Medical data processing method, device electronic equipment and computer-readable medium |
CN109584186A (en) * | 2018-12-25 | 2019-04-05 | 西北工业大学 | A kind of unmanned aerial vehicle onboard image defogging method and device |
CN110827218A (en) * | 2019-10-31 | 2020-02-21 | 西北工业大学 | Airborne image defogging method based on image HSV transmissivity weighted correction |
CN110852960A (en) * | 2019-10-25 | 2020-02-28 | 江苏荣策士科技发展有限公司 | Image enhancement device and method for removing fog |
CN110929769A (en) * | 2019-11-14 | 2020-03-27 | 保定赛瑞电力科技有限公司 | Reactor mechanical fault joint detection model, method and device based on vibration and sound |
- 2020-10-14: CN application CN202011096967.6A granted as CN112270220B (Active)
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103208126A (en) * | 2013-04-17 | 2013-07-17 | 同济大学 | Method for monitoring moving object in natural environment |
CN105427261A (en) * | 2015-11-27 | 2016-03-23 | 努比亚技术有限公司 | Method and apparatus for removing image color noise and mobile terminal |
CN105812762A (en) * | 2016-03-23 | 2016-07-27 | 武汉鸿瑞达信息技术有限公司 | Automatic white balance method for processing image color cast |
US10134421B1 (en) * | 2016-08-04 | 2018-11-20 | Amazon Technologies, Inc. | Neural network based beam selection |
CN108205671A (en) * | 2016-12-16 | 2018-06-26 | 浙江宇视科技有限公司 | Image processing method and device |
CN108052884A (en) * | 2017-12-01 | 2018-05-18 | 华南理工大学 | A kind of gesture identification method based on improvement residual error neutral net |
CN108537147A (en) * | 2018-03-22 | 2018-09-14 | 东华大学 | A kind of gesture identification method based on deep learning |
CN108846356A (en) * | 2018-06-11 | 2018-11-20 | 南京邮电大学 | A method of the palm of the hand tracing and positioning based on real-time gesture identification |
CN108965609A (en) * | 2018-08-31 | 2018-12-07 | 南京宽塔信息技术有限公司 | The recognition methods of mobile terminal application scenarios and device |
CN109378064A (en) * | 2018-10-29 | 2019-02-22 | 南京医基云医疗数据研究院有限公司 | Medical data processing method, device electronic equipment and computer-readable medium |
CN109584186A (en) * | 2018-12-25 | 2019-04-05 | 西北工业大学 | A kind of unmanned aerial vehicle onboard image defogging method and device |
CN110852960A (en) * | 2019-10-25 | 2020-02-28 | 江苏荣策士科技发展有限公司 | Image enhancement device and method for removing fog |
CN110827218A (en) * | 2019-10-31 | 2020-02-21 | 西北工业大学 | Airborne image defogging method based on image HSV transmissivity weighted correction |
CN110929769A (en) * | 2019-11-14 | 2020-03-27 | 保定赛瑞电力科技有限公司 | Reactor mechanical fault joint detection model, method and device based on vibration and sound |
Non-Patent Citations (2)
Title |
---|
GYEOWOON JUNG et al.: "DNN-GRU multiple layers for VAD in PC Game Café", 2018 IEEE International Conference on Consumer Electronics - Asia (ICCE-ASIA) *
WANG Xiaohua et al.: "Sewing gesture detection based on an improved POLO deep convolutional neural network", Journal of Textile Research *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230107097A1 (en) * | 2021-10-06 | 2023-04-06 | Fotonation Limited | Method for identifying a gesture |
US11983327B2 (en) * | 2021-10-06 | 2024-05-14 | Fotonation Limited | Method for identifying a gesture |
Also Published As
Publication number | Publication date |
---|---|
CN112270220B (en) | 2022-02-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021253939A1 (en) | Rough set-based neural network method for segmenting fundus retinal vascular image | |
CN108830157B (en) | Human behavior identification method based on attention mechanism and 3D convolutional neural network | |
CN107844795B (en) | Convolutional neural network feature extraction method based on principal component analysis | |
Zhang et al. | Plant disease recognition based on plant leaf image. | |
CN106204779B (en) | Check class attendance method based on plurality of human faces data collection strategy and deep learning | |
Varga et al. | Fully automatic image colorization based on Convolutional Neural Network | |
CN108009493B (en) | Human face anti-cheating recognition method based on motion enhancement | |
CN108268859A (en) | A kind of facial expression recognizing method based on deep learning | |
US20180130186A1 (en) | Hybrid machine learning systems | |
CN108388896A (en) | A kind of licence plate recognition method based on dynamic time sequence convolutional neural networks | |
CN109657612B (en) | Quality sorting system based on facial image features and application method thereof | |
Xu et al. | Recurrent convolutional neural network for video classification | |
CN109543632A (en) | A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features | |
CN106778785A (en) | Build the method for image characteristics extraction model and method, the device of image recognition | |
CN108509920A (en) | The face identification method of the multichannel combined feature selecting study of more patch based on CNN | |
CN107862680B (en) | Target tracking optimization method based on correlation filter | |
CN110969171A (en) | Image classification model, method and application based on improved convolutional neural network | |
CN109902613A (en) | A kind of human body feature extraction method based on transfer learning and image enhancement | |
CN112487981A (en) | MA-YOLO dynamic gesture rapid recognition method based on two-way segmentation | |
CN107516083A (en) | A kind of remote facial image Enhancement Method towards identification | |
CN112766021A (en) | Method for re-identifying pedestrians based on key point information and semantic segmentation information of pedestrians | |
Yang et al. | A Face Detection Method Based on Skin Color Model and Improved AdaBoost Algorithm. | |
Li et al. | A self-attention feature fusion model for rice pest detection | |
Gurrala et al. | A new segmentation method for plant disease diagnosis | |
CN105825234A (en) | Superpixel and background model fused foreground detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||