CN112270220A - Sewing gesture recognition method based on deep learning - Google Patents

Sewing gesture recognition method based on deep learning Download PDF

Info

Publication number
CN112270220A
Authority
CN
China
Prior art keywords
sewing
gesture data
formula
information
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011096967.6A
Other languages
Chinese (zh)
Other versions
CN112270220B (en)
Inventor
王晓华
杨思捷
王文杰
张蕾
苏泽斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Polytechnic University
Original Assignee
Xian Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Polytechnic University filed Critical Xian Polytechnic University
Priority to CN202011096967.6A priority Critical patent/CN112270220B/en
Publication of CN112270220A publication Critical patent/CN112270220A/en
Application granted granted Critical
Publication of CN112270220B publication Critical patent/CN112270220B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Sewing Machines And Sewing (AREA)

Abstract

The invention discloses a sewing gesture recognition method based on deep learning, which is implemented according to the following steps: step 1, data set collection and preprocessing; step 2, inputting the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training; step 3, taking the output of the GRU network as the input of a DNN neural network for further feature extraction, forming a GRU-DNN network for recognizing sewing gestures; and step 4, sending the features extracted in step 3 into an SVM classifier for action classification. The invention solves the problems that, in the prior art, a DNN cannot handle changes over a time sequence during behavior detection and the basic RNN structure suffers from vanishing gradients during detection, which makes the recognition results inaccurate.

Description

Sewing gesture recognition method based on deep learning
Technical Field
The invention belongs to the technical field of artificial intelligence, and relates to a sewing gesture recognition method based on deep learning.
Background
With rising labor costs and advances in computer technology, "human + machine + environment" systems have become an irreversible trend. Deep-learning techniques have achieved remarkable results in the field of behavior detection: they overcome the limitation of traditional hand-crafted feature methods, which can only recognize actions in simple scenes, optimize the classification task more effectively, and extract feature information from data more efficiently.
Existing sewing gesture recognition mainly relies on recurrent neural networks, whose main representative models are the RNN (recurrent neural network), LSTM, and GRU (gated recurrent unit) models. The RNN model connects the current process with past states and therefore has a certain memory capability. The LSTM and GRU models are structural variants of the RNN model; compared with the RNN, the LSTM enables the recurrent network to memorize past information while selectively forgetting unimportant information. Compared with the LSTM network structure, the GRU can use all of the picture information during recognition, alleviates the vanishing-gradient problem on long sequences, and has a simpler structure and a better recognition effect than the LSTM structure. The DNN (deep neural network), a feedforward artificial neural network, is also widely applied in the field of behavior recognition; it can handle deep-level problems and extract deeper features, but a DNN cannot handle changes over a time sequence when performing behavior detection, the basic RNN structure suffers from vanishing gradients during detection, and the RNN is weaker than the DNN at capturing deep-level image information.
Disclosure of Invention
The invention aims to provide a sewing gesture recognition method based on deep learning, which solves the problems that, in the prior art, a DNN cannot handle changes over a time sequence during behavior detection and the basic RNN network structure suffers from vanishing gradients during detection, making the recognition results inaccurate.
The invention adopts the technical scheme that a sewing gesture recognition method based on deep learning is implemented according to the following steps:
step 1, data set collection and preprocessing;
step 2, inputting the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training;
step 3, taking the output of the GRU network as the input of a DNN neural network for further feature extraction, forming a GRU-DNN network for recognizing sewing gestures;
and step 4, sending the features extracted in step 3 into an SVM classifier for action classification.
The present invention is also characterized in that,
the step 1 specifically comprises the following steps:
step 1.1, collecting sewing gesture data pictures, and carrying out color correction on the collected sewing gesture data pictures through a dynamic threshold method so as to eliminate the influence of illumination on color rendering;
step 1.2, adjusting the brightness of the sewing gesture data picture processed in the step 1.1 to be 0.6 to 1.5 times of the original brightness;
and step 1.3, randomly rotating the sewing gesture data picture with the brightness adjusted in step 1.2 by 0, 90, 180 or 270 degrees to obtain the preprocessed sewing gesture data pictures, which serve as the training set.
The step 1.1 specifically comprises the following steps:
step 1.1.1, dividing each sewing gesture data picture in a training set into a plurality of areas;
step 1.1.2, calculating Cb and Cr for the pixel points in each region, as well as the average values Mb and Mr of Cb and Cr over all pixel points in each region, where Cb denotes the color saturation of a pixel point and Cr denotes the tone of a pixel point;
Cb=-0.169×R-0.331×G+0.500×B (1)
Cr=0.500×R-0.419×G-0.081×B (2)
Mb=(1/N)×Σ(n=1 to N)Cb(n) (3)
Mr=(1/N)×Σ(n=1 to N)Cr(n) (4)
where R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the tone of the nth pixel point in the corresponding region, and N is the number of pixel points in the corresponding region;
step 1.1.3, separately calculating the cumulative values Db and Dr of the absolute differences of the Cb and Cr components for each region, with the following formulas:
Db=(1/N)×Σ(n=1 to N)|Cb(n)-Mb| (5)
Dr=(1/N)×Σ(n=1 to N)|Cr(n)-Mr| (6)
wherein N is the number of pixel points per region, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the tone of the nth pixel point in the corresponding region;
step 1.1.4, judging the Db/Dr value for each pixel point: if Db/Dr is smaller than the Mb/Mr value of the corresponding region, the pixel point is ignored in the corresponding region;
step 1.1.5, for each sewing gesture data picture, removing the pixel points ignored by the judgment of step 1.1.4, re-computing the Mb, Mr, Db, Dr of each region according to formulas (3) to (6), then summing the Mb, Mr, Db, Dr of all regions respectively and taking the averages as the MB, MR, DB, DR values of the corresponding sewing gesture data picture, where MB is the average color saturation of the whole sewing gesture data picture, MR is the average tone of the whole sewing gesture data picture, DB is the cumulative value of the absolute difference of the color saturation of the whole sewing gesture data picture, and DR is the cumulative value of the absolute difference of the tone of the whole sewing gesture data picture;
step 1.1.6, if the pixel point in each region simultaneously satisfies the formulas (7) and (8), the pixel point is preliminarily determined as a white reference point:
|Cb(n)-(Mb+Db×sign(Mb))|<1.5×DB (7)
|Cr(n)-(1.5×Mr×sign(Mr))|<1.5×DR (8)
in the formula, Mb and Mr are the average values of the color saturation and tone components of the sewing gesture data picture, Db and Dr are the calculated cumulative values of the absolute differences of the color saturation and tone components of each small region, sign is the sign function, DB is the cumulative value of the absolute difference of the color saturation of the whole sewing gesture data picture, and DR is the cumulative value of the absolute difference of the tone of the whole sewing gesture data picture;
step 1.1.7, sorting the preliminarily determined white reference points in each area according to the brightness of the white reference points, and taking the first 10 percent of the white reference points as the finally determined white reference points;
step 1.1.8, calculating the average brightness values Raver, Gaver and Baver of all white reference points in each area:
Raver=(R1+R2+…+Rm)/m (9)
Gaver=(G1+G2+…+Gm)/m (10)
Baver=(B1+B2+…+Bm)/m (11)
where m is the number of white reference points finally determined in the corresponding region, R1, R2, …, Rm are the red-channel color components of the determined white reference points, G1, G2, …, Gm are the green-channel color components of the determined white reference points, and B1, B2, …, Bm are the blue-channel color components of the determined white reference points;
step 1.1.9, calculating the gain of each channel, wherein the calculation formula is as follows:
Rgain=Ymax/Raver (12)
Ggain=Ymax/Gaver (13)
Bgain=Ymax/Baver (14)
Y=0.299×R+0.587×G+0.114×B (15)
in the formula, Ymax is the maximum value of the Y component of the color space over the whole image, Raver, Gaver and Baver are the average brightness values of the white reference points, and R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image;
step 1.1.10, calculate the final color for each channel:
R′=R×Rgain (16)
G′=G×Ggain (17)
B′=B×Bgain (18)
in the formula, R, G, B represents the red, green and blue component values of each pixel point in the collected sewing gesture data image, and R ', G ' and B ' represent the red, green and blue components of the pixel points in the corrected sewing gesture data image.
The step 2 specifically comprises the following steps:
step 2.1, the color components of the red channel, the green channel and the blue channel of the corrected sewing gesture data picture obtained in the step 1.1.10 are stored in a computer in a matrix form, and then the three matrices are converted into a column vector X which is used as a characteristic vector to be sent into a GRU network structure;
step 2.2, calculating the value of an update gate in the GRU network structure, specifically:
determining how much information is repeated from the previous time to the next time, and calculating the formula as follows:
Zt=σ×(W×Xt+U×ht-1) (19)
in the formula, Xt is the t-th component of the input feature vector X, ht-1 is the stored information of step t-1, σ is the logistic sigmoid function, and W and U are weight matrices; the update gate adds the two parts of information and puts them into a sigmoid activation function, which compresses the activation result to between 0 and 1; the update gate controls the degree to which the state of the previous moment is brought into the current state, i.e., how much information from the previous moment is applied to the current moment;
step 2.3, the calculation of the reset gate specifically comprises the following steps: determining how much information in the past needs to be forgotten, and calculating the formula as follows:
r(t)=σ×(W×Xt+U×ht-1) (20)
in the formula: w and U are weight matrices, XtFor the t-th component of the input feature vector X, ht-1The information of the t-1 step is stored;
step 2.4, calculating the current memory content, and storing the current memory content in a reset gate, wherein the calculation formula is as follows:
h′t=tanh(W×Xt+rt⊙U×ht-1) (21)
in the formula, rt is the output value of the reset gate, Xt is the t-th component of the input sequence X, and ht-1 is the stored information of step t-1;
step 2.5, the final output of the gated recurrent unit is obtained by adding the information retained from the previous moment into the final memory and the information retained from the current memory content into the final memory, and the calculation formula is as follows:
ht=Zt⊙ht-1+(1-Zt)⊙h′t (22)
in the formula, Zt is the output of the update gate, ht-1 is the saved information of step t-1, Zt⊙ht-1 denotes the information retained from the previous step into the final memory, h′t is the current memory content, and (1-Zt)⊙h′t denotes the information retained from the current memory content into the final memory; this completes the data training.
The step 3 specifically comprises the following steps:
step 3.1, taking the final memory information stored in the GRU network structure as the input of the DNN neural network, and then carrying out initialization parameters, namely the initialization of the weight w and the bias b;
step 3.2, calculating an activation function, wherein the calculation formula is as follows:
σ(z)=1/(1+e^(-z)) (23)
wherein z is the independent variable, z=0, ±1, ±2, …;
and 3.3, carrying out forward propagation to obtain an output result, wherein an output formula is as follows:
al=σ×(Wl×al-1+bl) (24)
where l denotes the layer index, al-1 is the output of layer l-1 of the neural network, al is the output of layer l of the neural network, Wl is the weight of layer l, and bl is the bias of layer l;
step 3.4, calculating a loss function, wherein the calculation formula is as follows:
J(W,b,x,y)=(1/2)×||al-y||² (25)
in the formula, al is the output of layer l of the neural network, x is the sequence output after GRU neural network training, and y is the real training sample output;
and step 3.5, performing back propagation, where the calculation formulas for updating the parameters W and b of each layer are as follows:
Zl=Wl×al-1+bl (26)
where Zl is the pre-activation output of layer l; taking the partial derivative of the loss function with respect to Zl gives:
∂J(W,b,x,y)/∂Zl=(al-y)⊙σ′(Zl) (27)
taking the partial derivative of the loss function with respect to Wl gives:
∂J(W,b,x,y)/∂Wl=(∂J(W,b,x,y)/∂Zl)×(al-1)^T (28)
taking the partial derivative of the loss function with respect to bl gives:
∂J(W,b,x,y)/∂bl=∂J(W,b,x,y)/∂Zl (29)
where al-1 refers to the output of layer l-1 of the neural network and bl is the bias of layer l;
formulas (24) to (29) are solved jointly to obtain Wl and bl, realizing the continuous updating of Wl and bl.
And step 3.6, calculating layer by layer from the input layer until the output layer is reached, obtaining the final feature extraction result.
The invention has the beneficial effects that:
the invention discloses a sewing gesture recognition method based on deep learning, which combines a GRU network structure and a DNN network structure for behavior detection according to strong relevance of the GRU network structure in time and space during detection and effectiveness of the DNN network structure in extracting deep features. And carrying out color correction on the input data by using a dynamic threshold value method so as to eliminate the influence of illumination on color rendering. The pictures are rotated by 90 degrees, 180 degrees and 270 degrees to enhance the robustness of each angle in the imaging process. The preprocessed data are trained by utilizing the GRU network structure, the output result is sent into the DNN network structure as the input data of the DNN network structure for further feature extraction, compared with a single DNN network structure, the GRU-DNN network structure fully utilizes information on a time sequence and can obtain information of a deeper image when behavior detection is carried out, and the recognition effect is more accurate compared with the single network structure.
Drawings
FIG. 1 is an overall flow chart of a sewing gesture recognition method based on deep learning according to the present invention;
FIG. 2 is a flow chart of color correction during data preprocessing in the sewing gesture recognition method based on deep learning according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a sewing gesture recognition method based on deep learning, the flow of which is shown in figure 1 and is implemented according to the following steps:
step 1, data set collection and preprocessing; the method specifically comprises the following steps:
step 1.1, collecting sewing gesture data pictures, and carrying out color correction on the collected sewing gesture data pictures through a dynamic threshold method so as to eliminate the influence of illumination on color rendering; the color correction is mainly because a certain deviation exists between the acquired image and the real image, and the influence of illumination on color rendering is eliminated by adopting a dynamic threshold algorithm, as shown in fig. 2, specifically:
step 1.1.1, dividing each sewing gesture data picture in a training set into a plurality of areas;
step 1.1.2, calculating Cb and Cr for the pixel points in each region, as well as the average values Mb and Mr of Cb and Cr over all pixel points in each region, where Cb denotes the color saturation of a pixel point and Cr denotes the tone of a pixel point;
Cb=-0.169×R-0.331×G+0.500×B (1)
Cr=0.500×R-0.419×G-0.081×B (2)
Mb=(1/N)×Σ(n=1 to N)Cb(n) (3)
Mr=(1/N)×Σ(n=1 to N)Cr(n) (4)
where R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the tone of the nth pixel point in the corresponding region, and N is the number of pixel points in the corresponding region;
step 1.1.3, separately calculating the cumulative values Db and Dr of the absolute differences of the Cb and Cr components for each region, with the following formulas:
Db=(1/N)×Σ(n=1 to N)|Cb(n)-Mb| (5)
Dr=(1/N)×Σ(n=1 to N)|Cr(n)-Mr| (6)
wherein N is the number of pixel points per region, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the tone of the nth pixel point in the corresponding region;
step 1.1.4, judging the Db/Dr value for each pixel point: if Db/Dr is smaller than the Mb/Mr value of the corresponding region, the pixel point is ignored in the corresponding region;
step 1.1.5, for each sewing gesture data picture, removing the pixel points ignored by the judgment of step 1.1.4, re-computing the Mb, Mr, Db, Dr of each region according to formulas (3) to (6), then summing the Mb, Mr, Db, Dr of all regions respectively and taking the averages as the MB, MR, DB, DR values of the corresponding sewing gesture data picture, where MB is the average color saturation of the whole sewing gesture data picture, MR is the average tone of the whole sewing gesture data picture, DB is the cumulative value of the absolute difference of the color saturation of the whole sewing gesture data picture, and DR is the cumulative value of the absolute difference of the tone of the whole sewing gesture data picture;
step 1.1.6, if the pixel point in each region simultaneously satisfies the formulas (7) and (8), the pixel point is preliminarily determined as a white reference point:
|Cb(n)-(Mb+Db×sign(Mb))|<1.5×DB (7)
|Cr(n)-(1.5×Mr×sign(Mr))|<1.5×DR (8)
in the formula, Mb and Mr are the average values of the color saturation and tone components of the sewing gesture data picture, Db and Dr are the calculated cumulative values of the absolute differences of the color saturation and tone components of each small region, sign is the sign function, DB is the cumulative value of the absolute difference of the color saturation of the whole sewing gesture data picture, and DR is the cumulative value of the absolute difference of the tone of the whole sewing gesture data picture;
step 1.1.7, sorting the preliminarily determined white reference points in each area according to the brightness of the white reference points, and taking the first 10 percent of the white reference points as the finally determined white reference points;
step 1.1.8, calculating the average brightness values Raver, Gaver and Baver of all white reference points in each area:
Raver=(R1+R2+…+Rm)/m (9)
Gaver=(G1+G2+…+Gm)/m (10)
Baver=(B1+B2+…+Bm)/m (11)
where m is the number of white reference points finally determined in the corresponding region, R1, R2, …, Rm are the red-channel color components of the determined white reference points, G1, G2, …, Gm are the green-channel color components of the determined white reference points, and B1, B2, …, Bm are the blue-channel color components of the determined white reference points;
step 1.1.9, calculating the gain of each channel, wherein the calculation formula is as follows:
Rgain=Ymax/Raver (12)
Ggain=Ymax/Gaver (13)
Bgain=Ymax/Baver (14)
Y=0.299×R+0.587×G+0.114×B (15)
in the formula, Ymax is the maximum value of the Y component of the color space over the whole image, Raver, Gaver and Baver are the average brightness values of the white reference points, and R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image;
step 1.1.10, calculate the final color for each channel:
R′=R×Rgain (16)
G′=G×Ggain (17)
B′=B×Bgain (18)
in the formula, R, G, B is the red, green and blue component values of each pixel point in the collected sewing gesture data image, and R ', G ' and B ' are the red, green and blue components of the pixel points in the corrected sewing gesture data image;
step 1.2, adjusting the brightness of the sewing gesture data picture processed in the step 1.1 to be 0.6 to 1.5 times of the original brightness;
step 1.3, the brightness-adjusted sewing gesture data picture from step 1.2 is randomly rotated by 0, 90, 180 or 270 degrees to enhance robustness to different imaging angles, yielding the preprocessed sewing gesture data pictures that are used as the training set (a minimal sketch of this preprocessing is given below);
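For illustration only, the following Python/NumPy sketch shows one possible reading of the preprocessing in step 1: the dynamic-threshold white balance of steps 1.1.1 to 1.1.10 followed by the brightness scaling and random rotation of steps 1.2 and 1.3. The 4×4 block partition, the use of region-level Db and Dr values in formulas (7) and (8), the way the brightest 10% of candidates are kept, and the function names are assumptions of this sketch, not details fixed by the method.

```python
import numpy as np

def dynamic_threshold_white_balance(img, blocks=(4, 4)):
    """Illustrative dynamic-threshold white balance (steps 1.1.1-1.1.10).

    img: H x W x 3 float array with R, G, B values in [0, 255].
    """
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    # Formulas (1), (2) and (15): chroma components and luma.
    Cb = -0.169 * R - 0.331 * G + 0.500 * B
    Cr = 0.500 * R - 0.419 * G - 0.081 * B
    Y = 0.299 * R + 0.587 * G + 0.114 * B

    H, W = Y.shape
    hs, ws = H // blocks[0], W // blocks[1]
    whites = []
    for i in range(blocks[0]):
        for j in range(blocks[1]):
            sl = np.s_[i * hs:(i + 1) * hs, j * ws:(j + 1) * ws]
            cb, cr, y = Cb[sl], Cr[sl], Y[sl]
            Mb, Mr = cb.mean(), cr.mean()                  # formulas (3)-(4)
            Db = np.abs(cb - Mb).mean()                    # formula (5)
            Dr = np.abs(cr - Mr).mean()                    # formula (6)
            # Candidate white points per formulas (7)-(8); region-level Db/Dr
            # are used here instead of picture-level DB/DR for brevity.
            cand = (np.abs(cb - (Mb + Db * np.sign(Mb))) < 1.5 * Db) & \
                   (np.abs(cr - (1.5 * Mr * np.sign(Mr))) < 1.5 * Dr)
            if cand.any():
                thr = np.quantile(y[cand], 0.9)            # brightest 10 %, step 1.1.7
                whites.append(img[sl][cand & (y >= thr)])
    if not whites:
        return img
    white = np.concatenate(whites, axis=0)
    Raver, Gaver, Baver = white.mean(axis=0)               # formulas (9)-(11)
    gains = Y.max() / np.array([Raver, Gaver, Baver])      # formulas (12)-(14)
    return np.clip(img * gains, 0, 255)                    # formulas (16)-(18)


def augment(img, rng=None):
    """Brightness scaling (step 1.2) and random 0/90/180/270-degree rotation (step 1.3)."""
    rng = rng or np.random.default_rng()
    img = np.clip(img * rng.uniform(0.6, 1.5), 0, 255)     # 0.6x to 1.5x brightness
    return np.rot90(img, k=int(rng.integers(0, 4)))        # k quarter-turns
```

In such a reading, each collected picture would be passed through dynamic_threshold_white_balance and then augment before being added to the training set.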
step 2, inputting the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training; the method specifically comprises the following steps:
step 2.1, the color components of the red channel, the green channel and the blue channel of the corrected sewing gesture data picture obtained in the step 1.1.10 are stored in a computer in a matrix form, and then the three matrices are converted into a column vector X which is used as a characteristic vector to be sent into a GRU network structure; for example:
it is assumed that R ', G ', and B ' obtained in step 1.1.10 are stored in the computer as follows:
(three example 3×3 matrices R′, G′ and B′ of red, green and blue pixel intensities)
the three matrices represent the preprocessed image in the computer, the values in the matrices correspond to the red, green and blue intensity values in the image, and for the convenience of feature extraction of the neural network, the 3 matrices are converted into 1 vector X, and the final result of X can be obtained in the above example:
(the 27-dimensional column vector X obtained by stacking the entries of R′, G′ and B′)
from the above, the R′, G′ and B′ matrices are each 3×3 in size, so the total dimension of the vector X is 3×3×3=27. In the field of artificial intelligence, each value fed into the neural network is called a feature, so the example above has 27 features; the 27-dimensional vector is also called a feature vector, and the neural network receives this feature vector as input to perform prediction (see the flattening sketch below);
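As a small illustration of this flattening (the matrix values below are placeholders, not data from the method):

```python
import numpy as np

# Corrected channel matrices from step 1.1.10 (example values are placeholders).
R_prime = np.arange(9, dtype=float).reshape(3, 3)
G_prime = np.arange(9, dtype=float).reshape(3, 3)
B_prime = np.arange(9, dtype=float).reshape(3, 3)

# Stack the three 3x3 matrices and flatten into one 27-dimensional feature vector X.
X = np.concatenate([R_prime.ravel(), G_prime.ravel(), B_prime.ravel()])
print(X.shape)  # (27,)
```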
the converted feature vectors are sent into a GRU network structure, and values of an update gate and a reset gate in the GRU network structure are calculated respectively;
step 2.2, calculating the value of an update gate in the GRU network structure, specifically:
determining how much information is repeated from the previous time to the next time, and calculating the formula as follows:
Zt=σ×(W×Xt+U×ht-1) (19)
in the formula, Xt is the t-th component of the input feature vector X, ht-1 is the stored information of step t-1, σ is the logistic sigmoid function, and W and U are weight matrices; the update gate adds the two parts of information and puts them into a sigmoid activation function, which compresses the activation result to between 0 and 1; the update gate controls the degree to which the state of the previous moment is brought into the current state, i.e., how much information from the previous moment is applied to the current moment, and the larger Zt is, the more information is brought in;
step 2.3, the calculation of the reset gate specifically comprises the following steps: determining how much information in the past needs to be forgotten, and calculating the formula as follows:
r(t)=σ×(W×Xt+U×ht-1) (20)
in the formula: w and U are weight matrices, XtFor the t-th component of the input feature vector X,ht-1the information of the t-1 step is stored;
step 2.4, calculating the current memory content, and storing the current memory content in a reset gate, wherein the calculation formula is as follows:
h′t=tanh(W×Xt+rt⊙U×ht-1) (21)
in the formula, rt is the output value of the reset gate, Xt is the t-th component of the input sequence X, and ht-1 is the stored information of step t-1;
step 2.5, the final output of the gated recurrent unit is obtained by adding the information retained from the previous moment into the final memory and the information retained from the current memory content into the final memory, and the calculation formula is as follows:
ht=Zt⊙ht-1+(1-Zt)⊙h′t (22)
in the formula, Zt is the output of the update gate, ht-1 is the saved information of step t-1, Zt⊙ht-1 denotes the information retained from the previous step into the final memory, h′t is the current memory content, and (1-Zt)⊙h′t denotes the information retained from the current memory content into the final memory; this completes the data training (a sketch of one GRU step is given below);
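The following NumPy sketch illustrates one GRU step according to formulas (19) to (22). Formulas (19) and (20) are written with the same symbols W and U for both gates; this sketch gives each gate its own weight matrices, as is common, and the dimensions (27-dimensional input, 16-dimensional hidden state) and the names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One gated recurrent unit step following formulas (19)-(22)."""
    z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev)              # update gate, formula (19)
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev)              # reset gate, formula (20)
    h_cand = np.tanh(p["Wh"] @ x_t + r_t * (p["Uh"] @ h_prev))   # current memory, formula (21)
    return z_t * h_prev + (1.0 - z_t) * h_cand                   # final output, formula (22)

# Example: 27-dimensional frame features, 16-dimensional hidden state.
rng = np.random.default_rng(0)
p = {name: 0.1 * rng.standard_normal((16, 27) if name.startswith("W") else (16, 16))
     for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
h = np.zeros(16)
for x_t in rng.standard_normal((5, 27)):      # five consecutive RGB frame vectors
    h = gru_step(x_t, h, p)                   # h is the memory passed on to the DNN
```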
step 3, taking the output of the GRU network as the input of the DNN neural network for further feature extraction, forming the GRU-DNN network for recognizing sewing gestures; the method specifically comprises the following steps:
step 3.1, taking the final memory information stored in the GRU network structure as the input of the DNN neural network, and then carrying out initialization parameters, namely the initialization of the weight w and the bias b;
step 3.2, calculating an activation function, wherein the calculation formula is as follows:
σ(z)=1/(1+e^(-z)) (23)
wherein z is the independent variable, z=0, ±1, ±2, …;
step 3.3, forward propagation: using the weight coefficient matrices W, the bias vectors b and the input vector X, a series of linear operations and activation operations is performed, proceeding layer by layer from the input layer until the output layer is reached, and the output formula of each layer is as follows:
al=σ×(Wl×al-1+bl) (24)
where l denotes the layer index, al-1 is the output of layer l-1 of the neural network, al is the output of layer l of the neural network, Wl is the weight of layer l, and bl is the bias of layer l;
step 3.4, calculating a loss function, wherein the calculation formula is as follows:
J(W,b,x,y)=(1/2)×||al-y||² (25)
in the formula, al is the output of layer l of the neural network, x is the sequence output after GRU neural network training, and y is the real training sample output;
and step 3.5, performing back propagation to continuously update the parameters W and b: the back-propagation algorithm finds suitable linear coefficient matrices W and bias vectors b so that the outputs computed from the input training samples are equal or very close to the sample outputs; the calculation formulas for updating the parameters W and b of each layer are as follows:
Zl=Wl×al-1+bl (26)
where Zl is the pre-activation output of layer l; taking the partial derivative of the loss function with respect to Zl gives:
∂J(W,b,x,y)/∂Zl=(al-y)⊙σ′(Zl) (27)
taking the partial derivative of the loss function with respect to Wl gives:
∂J(W,b,x,y)/∂Wl=(∂J(W,b,x,y)/∂Zl)×(al-1)^T (28)
taking the partial derivative of the loss function with respect to bl gives:
∂J(W,b,x,y)/∂bl=∂J(W,b,x,y)/∂Zl (29)
where al-1 refers to the output of layer l-1 of the neural network and bl is the bias of layer l;
formulas (24) to (29) are solved jointly to obtain Wl and bl, realizing the continuous updating of Wl and bl.
Step 3.6, starting from the input layer, calculating layer by layer until the output layer is reached; through this calculation the output computed from the training samples is made as close as possible to the real training sample output, and the output computed from the training samples at this point is used as the finally extracted feature output (a sketch of this forward and backward computation is given below);
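The following NumPy sketch illustrates the forward pass of formulas (23), (24) and (26), the quadratic loss of formula (25), and the gradient updates of formulas (27) to (29) for a small fully connected network; the layer sizes, the learning rate and the variable names are assumptions of this sketch.

```python
import numpy as np

def sigmoid(z):                                 # activation, formula (23)
    return 1.0 / (1.0 + np.exp(-z))

def dnn_train_step(x, y, Ws, bs, lr=0.01):
    """One forward pass (24)-(26) and one backward pass (27)-(29) with in-place updates."""
    activations, a = [x], x
    for W, b in zip(Ws, bs):
        z = W @ a + b                           # pre-activation, formula (26)
        a = sigmoid(z)                          # layer output, formula (24)
        activations.append(a)
    loss = 0.5 * np.sum((a - y) ** 2)           # quadratic loss, formula (25)

    delta = (a - y) * a * (1.0 - a)             # dJ/dZ at the output layer, formula (27)
    for l in reversed(range(len(Ws))):
        dW = np.outer(delta, activations[l])    # dJ/dW of layer l, formula (28)
        db = delta                              # dJ/db of layer l, formula (29)
        if l > 0:
            a_prev = activations[l]
            delta = (Ws[l].T @ delta) * a_prev * (1.0 - a_prev)   # propagate to layer l-1
        Ws[l] -= lr * dW                        # gradient-descent update of W
        bs[l] -= lr * db                        # gradient-descent update of b
    return loss

# Example: 16-dim GRU memory in, one hidden layer of 8 units, 4 action classes out.
rng = np.random.default_rng(0)
Ws = [0.1 * rng.standard_normal((8, 16)), 0.1 * rng.standard_normal((4, 8))]
bs = [np.zeros(8), np.zeros(4)]
features = rng.standard_normal(16)
loss = dnn_train_step(features, np.array([1.0, 0.0, 0.0, 0.0]), Ws, bs)
```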
and step 4, sending the features extracted in step 3 into an SVM classifier for action classification (a minimal classification sketch is given below).
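A minimal sketch of step 4 using the SVC classifier from scikit-learn as the SVM; the feature matrix, the labels and the RBF kernel are placeholders, since the method does not fix them here.

```python
import numpy as np
from sklearn.svm import SVC

# features: one row per sewing-gesture sample, taken from the DNN output of step 3.
# labels:   the action class of each sample (placeholder values for illustration).
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 4))
labels = rng.integers(0, 3, size=200)

clf = SVC(kernel="rbf")                 # kernel choice is an assumption
clf.fit(features, labels)
predicted_action = clf.predict(features[:1])
```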
The invention discloses a sewing gesture recognition method based on deep learning that combines a GRU network structure and a DNN network structure for behavior detection, exploiting the strong temporal and spatial correlation captured by the GRU structure during detection and the effectiveness of the DNN structure at extracting deep features. A dynamic threshold method is used to color-correct the input data and eliminate the influence of illumination on color rendering. The pictures are rotated by 90, 180 and 270 degrees to enhance robustness to the different angles that occur during imaging. The preprocessed data are trained with the GRU network structure, and its output is fed into the DNN network structure as input for further feature extraction. Compared with a single DNN network structure, the GRU-DNN network structure makes full use of information along the time sequence while also obtaining deeper image information during behavior detection, so the recognition is more accurate than with either single network structure.

Claims (5)

1. A sewing gesture recognition method based on deep learning is characterized by comprising the following steps:
step 1, data set collection and preprocessing;
step 2, inputting the pictures in the preprocessed data set into a GRU neural network as RGB picture frames for data training;
step 3, taking the output of the GRU network as the input of a DNN neural network for further feature extraction, forming a GRU-DNN network for recognizing sewing gestures;
and step 4, sending the features extracted in step 3 into an SVM classifier for action classification.
2. The sewing gesture recognition method based on deep learning according to claim 1, wherein the step 1 specifically comprises:
step 1.1, collecting sewing gesture data pictures, and carrying out color correction on the collected sewing gesture data pictures through a dynamic threshold method so as to eliminate the influence of illumination on color rendering;
step 1.2, adjusting the brightness of the sewing gesture data picture processed in the step 1.1 to be 0.6 to 1.5 times of the original brightness;
and step 1.3, randomly rotating the sewing gesture data picture with the brightness adjusted in step 1.2 by 0, 90, 180 or 270 degrees to obtain the preprocessed sewing gesture data pictures, which serve as the training set.
3. The sewing gesture recognition method based on deep learning according to claim 2, wherein the step 1.1 is specifically as follows:
step 1.1.1, dividing each sewing gesture data picture in a training set into a plurality of areas;
step 1.1.2, calculating Cb and Cr for the pixel points in each region, as well as the average values Mb and Mr of Cb and Cr over all pixel points in each region, where Cb denotes the color saturation of a pixel point and Cr denotes the tone of a pixel point;
Cb=-0.169×R-0.331×G+0.500×B (1)
Cr=0.500×R-0.419×G-0.081×B (2)
Mb=(1/N)×Σ(n=1 to N)Cb(n) (3)
Mr=(1/N)×Σ(n=1 to N)Cr(n) (4)
where R, G, B are the red, green and blue component values of each pixel point in the collected sewing gesture data image, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the tone of the nth pixel point in the corresponding region, and N is the number of pixel points in the corresponding region;
step 1.1.3, separately calculating the cumulative values Db and Dr of the absolute differences of the Cb and Cr components for each region, with the following formulas:
Db=(1/N)×Σ(n=1 to N)|Cb(n)-Mb| (5)
Dr=(1/N)×Σ(n=1 to N)|Cr(n)-Mr| (6)
wherein N is the number of pixel points per region, Cb(n) is the color saturation of the nth pixel point in the corresponding region, Cr(n) is the tone of the nth pixel point in the corresponding region;
step 1.1.4, judging the Db/Dr value for each pixel point: if Db/Dr is smaller than the Mb/Mr value of the corresponding region, the pixel point is ignored in the corresponding region;
step 1.1.5, for each sewing gesture data picture, removing the pixel points ignored by the judgment of step 1.1.4, re-computing the Mb, Mr, Db, Dr of each region according to formulas (3) to (6), then summing the Mb, Mr, Db, Dr of all regions respectively and taking the averages as the MB, MR, DB, DR values of the corresponding sewing gesture data picture, where MB is the average color saturation of the whole sewing gesture data picture, MR is the average tone of the whole sewing gesture data picture, DB is the cumulative value of the absolute difference of the color saturation of the whole sewing gesture data picture, and DR is the cumulative value of the absolute difference of the tone of the whole sewing gesture data picture;
step 1.1.6, if the pixel point in each region simultaneously satisfies the formulas (7) and (8), the pixel point is preliminarily determined as a white reference point:
|Cb(n)-(Mb+Db×sign(Mb))|<1.5×DB (7)
|Cr(n)-(1.5×Mr×sign(Mr))|<1.5×DR (8)
in the formula, Mb and Mr are the average values of the color saturation and tone components of the sewing gesture data picture, Db and Dr are the calculated cumulative values of the absolute differences of the color saturation and tone components of each small region, sign is the sign function, DB is the cumulative value of the absolute difference of the color saturation of the whole sewing gesture data picture, and DR is the cumulative value of the absolute difference of the tone of the whole sewing gesture data picture;
step 1.1.7, sorting the preliminarily determined white reference points in each area according to the brightness of the white reference points, and taking the first 10 percent of the white reference points as the finally determined white reference points;
step 1.1.8, calculating the average brightness values Raver, Gaver and Baver of all white reference points in each area:
Raver=(R1+R2+…+Rm)/m (9)
Gaver=(G1+G2+…+Gm)/m (10)
Baver=(B1+B2+…+Bm)/m (11)
where m is the number of white reference points finally determined in the corresponding region, R1, R2, …, Rm are the red-channel color components of the determined white reference points, G1, G2, …, Gm are the green-channel color components of the determined white reference points, and B1, B2, …, Bm are the blue-channel color components of the determined white reference points;
step 1.1.9, calculating the gain of each channel, wherein the calculation formula is as follows:
Rgain=Ymax/Raver (12)
Ggain=Ymax/Gaver (13)
Bgain=Ymax/Baver (14)
Y=0.299×R+0.587×G+0.114×B (15)
in the formula: y ismaxIs the maximum value of the Y component in the color space in the whole image, Raver、Gaver、BaverThe average value of the brightness of the white reference point is R, G, B, and the component values of red, green and blue of each pixel point in the collected sewing gesture data image are R, G, B;
step 1.1.10, calculate the final color for each channel:
R′=R×Rgain (16)
G′=G×Ggain (17)
B′=B×Bgain (18)
in the formula, R, G, B represents the red, green and blue component values of each pixel point in the collected sewing gesture data image, and R ', G ' and B ' represent the red, green and blue components of the pixel points in the corrected sewing gesture data image.
4. The sewing gesture recognition method based on deep learning according to claim 3, wherein the step 2 specifically comprises:
step 2.1, the color components of the red channel, the green channel and the blue channel of the corrected sewing gesture data picture obtained in the step 1.1.10 are stored in a computer in a matrix form, and then the three matrices are converted into a column vector X which is used as a characteristic vector to be sent into a GRU network structure;
step 2.2, calculating the value of an update gate in the GRU network structure, specifically:
determining how much information is repeated from the previous time to the next time, and calculating the formula as follows:
Zt=σ×(W×Xt+U×ht-1) (19)
in the formula, Xt is the t-th component of the input feature vector X, ht-1 is the stored information of step t-1, σ is the logistic sigmoid function, and W and U are weight matrices; the update gate adds the two parts of information and puts them into a sigmoid activation function, which compresses the activation result to between 0 and 1; the update gate controls the degree to which the state of the previous moment is brought into the current state, i.e., how much information from the previous moment is applied to the current moment;
step 2.3, the calculation of the reset gate specifically comprises the following steps: determining how much information in the past needs to be forgotten, and calculating the formula as follows:
r(t)=σ×(W×Xt+U×ht-1) (20)
in the formula: w and U are weight matrices, XtFor the t-th component of the input feature vector X, ht-1The information of the t-1 step is stored;
step 2.4, calculating the current memory content, and storing the current memory content in a reset gate, wherein the calculation formula is as follows:
h′t=tanh(W×Xt+rt⊙U×ht-1) (21)
in the formula, rt is the output value of the reset gate, Xt is the t-th component of the input sequence X, and ht-1 is the stored information of step t-1;
step 2.5, the final output of the gated recurrent unit is obtained by adding the information retained from the previous moment into the final memory and the information retained from the current memory content into the final memory, and the calculation formula is as follows:
ht=Zt⊙ht-1+(1-Zt)⊙h′t (22)
in the formula, Zt is the output of the update gate, ht-1 is the saved information of step t-1, Zt⊙ht-1 denotes the information retained from the previous step into the final memory, h′t is the current memory content, and (1-Zt)⊙h′t denotes the information retained from the current memory content into the final memory; this completes the data training.
5. The sewing gesture recognition method based on deep learning according to claim 4, wherein the step 3 specifically comprises:
step 3.1, taking the final memory information stored in the GRU network structure as the input of the DNN neural network, and then carrying out initialization parameters, namely the initialization of the weight w and the bias b;
step 3.2, calculating an activation function, wherein the calculation formula is as follows:
σ(z)=1/(1+e^(-z)) (23)
wherein z is the independent variable, z=0, ±1, ±2, …;
and 3.3, carrying out forward propagation to obtain an output result, wherein an output formula is as follows:
al=σ×(Wl×al-1+bl) (24)
where l denotes the layer index, al-1 is the output of layer l-1 of the neural network, al is the output of layer l of the neural network, Wl is the weight of layer l, and bl is the bias of layer l;
step 3.4, calculating a loss function, wherein the calculation formula is as follows:
J(W,b,x,y)=(1/2)×||al-y||² (25)
in the formula, al is the output of layer l of the neural network, x is the sequence output after GRU neural network training, and y is the real training sample output;
and step 3.5, performing back propagation, where the calculation formulas for updating the parameters W and b of each layer are as follows:
Zl=Wl×al-1+bl (26)
where Zl is the pre-activation output of layer l; taking the partial derivative of the loss function with respect to Zl gives:
∂J(W,b,x,y)/∂Zl=(al-y)⊙σ′(Zl) (27)
taking the partial derivative of the loss function with respect to Wl gives:
∂J(W,b,x,y)/∂Wl=(∂J(W,b,x,y)/∂Zl)×(al-1)^T (28)
taking the partial derivative of the loss function with respect to bl gives:
∂J(W,b,x,y)/∂bl=∂J(W,b,x,y)/∂Zl (29)
where al-1 refers to the output of layer l-1 of the neural network and bl is the bias of layer l;
formulas (24) to (29) are solved jointly to obtain Wl and bl, realizing the continuous updating of Wl and bl.
And step 3.6, calculating layer by layer from the input layer until the output layer is reached, obtaining the final feature extraction result.
CN202011096967.6A 2020-10-14 2020-10-14 Sewing gesture recognition method based on deep learning Active CN112270220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011096967.6A CN112270220B (en) 2020-10-14 2020-10-14 Sewing gesture recognition method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011096967.6A CN112270220B (en) 2020-10-14 2020-10-14 Sewing gesture recognition method based on deep learning

Publications (2)

Publication Number Publication Date
CN112270220A true CN112270220A (en) 2021-01-26
CN112270220B CN112270220B (en) 2022-02-25

Family

ID=74337505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011096967.6A Active CN112270220B (en) 2020-10-14 2020-10-14 Sewing gesture recognition method based on deep learning

Country Status (1)

Country Link
CN (1) CN112270220B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230107097A1 (en) * 2021-10-06 2023-04-06 Fotonation Limited Method for identifying a gesture

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103208126A (en) * 2013-04-17 2013-07-17 同济大学 Method for monitoring moving object in natural environment
CN105427261A (en) * 2015-11-27 2016-03-23 努比亚技术有限公司 Method and apparatus for removing image color noise and mobile terminal
CN105812762A (en) * 2016-03-23 2016-07-27 武汉鸿瑞达信息技术有限公司 Automatic white balance method for processing image color cast
CN108052884A (en) * 2017-12-01 2018-05-18 华南理工大学 A kind of gesture identification method based on improvement residual error neutral net
CN108205671A (en) * 2016-12-16 2018-06-26 浙江宇视科技有限公司 Image processing method and device
CN108537147A (en) * 2018-03-22 2018-09-14 东华大学 A kind of gesture identification method based on deep learning
US10134421B1 (en) * 2016-08-04 2018-11-20 Amazon Technologies, Inc. Neural network based beam selection
CN108846356A (en) * 2018-06-11 2018-11-20 南京邮电大学 A method of the palm of the hand tracing and positioning based on real-time gesture identification
CN108965609A (en) * 2018-08-31 2018-12-07 南京宽塔信息技术有限公司 The recognition methods of mobile terminal application scenarios and device
CN109378064A (en) * 2018-10-29 2019-02-22 南京医基云医疗数据研究院有限公司 Medical data processing method, device electronic equipment and computer-readable medium
CN109584186A (en) * 2018-12-25 2019-04-05 西北工业大学 A kind of unmanned aerial vehicle onboard image defogging method and device
CN110827218A (en) * 2019-10-31 2020-02-21 西北工业大学 Airborne image defogging method based on image HSV transmissivity weighted correction
CN110852960A (en) * 2019-10-25 2020-02-28 江苏荣策士科技发展有限公司 Image enhancement device and method for removing fog
CN110929769A (en) * 2019-11-14 2020-03-27 保定赛瑞电力科技有限公司 Reactor mechanical fault joint detection model, method and device based on vibration and sound

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103208126A (en) * 2013-04-17 2013-07-17 同济大学 Method for monitoring moving object in natural environment
CN105427261A (en) * 2015-11-27 2016-03-23 努比亚技术有限公司 Method and apparatus for removing image color noise and mobile terminal
CN105812762A (en) * 2016-03-23 2016-07-27 武汉鸿瑞达信息技术有限公司 Automatic white balance method for processing image color cast
US10134421B1 (en) * 2016-08-04 2018-11-20 Amazon Technologies, Inc. Neural network based beam selection
CN108205671A (en) * 2016-12-16 2018-06-26 浙江宇视科技有限公司 Image processing method and device
CN108052884A (en) * 2017-12-01 2018-05-18 华南理工大学 A kind of gesture identification method based on improvement residual error neutral net
CN108537147A (en) * 2018-03-22 2018-09-14 东华大学 A kind of gesture identification method based on deep learning
CN108846356A (en) * 2018-06-11 2018-11-20 南京邮电大学 A method of the palm of the hand tracing and positioning based on real-time gesture identification
CN108965609A (en) * 2018-08-31 2018-12-07 南京宽塔信息技术有限公司 The recognition methods of mobile terminal application scenarios and device
CN109378064A (en) * 2018-10-29 2019-02-22 南京医基云医疗数据研究院有限公司 Medical data processing method, device electronic equipment and computer-readable medium
CN109584186A (en) * 2018-12-25 2019-04-05 西北工业大学 A kind of unmanned aerial vehicle onboard image defogging method and device
CN110852960A (en) * 2019-10-25 2020-02-28 江苏荣策士科技发展有限公司 Image enhancement device and method for removing fog
CN110827218A (en) * 2019-10-31 2020-02-21 西北工业大学 Airborne image defogging method based on image HSV transmissivity weighted correction
CN110929769A (en) * 2019-11-14 2020-03-27 保定赛瑞电力科技有限公司 Reactor mechanical fault joint detection model, method and device based on vibration and sound

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GYEOWOON JUNG et al.: "DNN-GRU multiple layers for VAD in PC Game Café", 《2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - ASIA (ICCE-ASIA)》 *
王晓华 等 (et al.): "基于改进POLO深度卷积神经网络的缝纫手势检测" (Sewing gesture detection based on an improved POLO deep convolutional neural network), 《纺织学报》 (Journal of Textile Research) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230107097A1 (en) * 2021-10-06 2023-04-06 Fotonation Limited Method for identifying a gesture
US11983327B2 (en) * 2021-10-06 2024-05-14 Fotonation Limited Method for identifying a gesture

Also Published As

Publication number Publication date
CN112270220B (en) 2022-02-25

Similar Documents

Publication Publication Date Title
WO2021253939A1 (en) Rough set-based neural network method for segmenting fundus retinal vascular image
CN108830157B (en) Human behavior identification method based on attention mechanism and 3D convolutional neural network
CN107844795B (en) Convolutional neural network feature extraction method based on principal component analysis
Zhang et al. Plant disease recognition based on plant leaf image.
CN106204779B (en) Check class attendance method based on plurality of human faces data collection strategy and deep learning
Varga et al. Fully automatic image colorization based on Convolutional Neural Network
CN108009493B (en) Human face anti-cheating recognition method based on motion enhancement
CN108268859A (en) A kind of facial expression recognizing method based on deep learning
US20180130186A1 (en) Hybrid machine learning systems
CN108388896A (en) A kind of licence plate recognition method based on dynamic time sequence convolutional neural networks
CN109657612B (en) Quality sorting system based on facial image features and application method thereof
Xu et al. Recurrent convolutional neural network for video classification
CN109543632A (en) A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features
CN106778785A (en) Build the method for image characteristics extraction model and method, the device of image recognition
CN108509920A (en) The face identification method of the multichannel combined feature selecting study of more patch based on CNN
CN107862680B (en) Target tracking optimization method based on correlation filter
CN110969171A (en) Image classification model, method and application based on improved convolutional neural network
CN109902613A (en) A kind of human body feature extraction method based on transfer learning and image enhancement
CN112487981A (en) MA-YOLO dynamic gesture rapid recognition method based on two-way segmentation
CN107516083A (en) A kind of remote facial image Enhancement Method towards identification
CN112766021A (en) Method for re-identifying pedestrians based on key point information and semantic segmentation information of pedestrians
Yang et al. A Face Detection Method Based on Skin Color Model and Improved AdaBoost Algorithm.
Li et al. A self-attention feature fusion model for rice pest detection
Gurrala et al. A new segmentation method for plant disease diagnosis
CN105825234A (en) Superpixel and background model fused foreground detection method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant