CN111444820B - Gesture recognition method based on imaging radar

Info

Publication number: CN111444820B
Authority: CN (China)
Prior art keywords: neural network, layer, matrix, neurons, convolution kernel
Legal status: Active
Application number: CN202010215230.5A
Other languages: Chinese (zh)
Other versions: CN111444820A
Inventors: 张雷, 张博, 吴沫君
Current assignee: Tsinghua University
Original assignee: Tsinghua University
Application filed by Tsinghua University; priority to CN202010215230.5A
Publication of CN111444820A; application granted; publication of CN111444820B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a gesture recognition method based on an imaging radar, belonging to the field of human-computer interaction. The method uses an imaging radar as the hardware carrier and combines a self-encoding technique with a recurrent neural network to recognize the user's dynamic gestures with high accuracy. The method can be applied to different imaging radars. Compared with gesture recognition methods based on camera equipment, the module implemented by this method is lighter, and recognition of the gesture is not affected by the light intensity in the environment; because no camera video needs to be recorded, user privacy is not leaked. The method can be applied in many scenarios such as smart home appliance control and smart car cockpit control.

Description

Gesture recognition method based on imaging radar
Technical Field
The invention relates to a gesture recognition method based on an imaging radar, and belongs to the technical field of man-machine interaction.
Background
In recent years, gesture recognition has been a research hotspot in human-computer interaction. Traditional gesture recognition methods perform recognition on images collected by a camera. Camera images preserve hand information clearly, but they are large and contain much data that is useless for gesture recognition. Processing camera images in real time not only demands a high hardware computing speed, but the recognition results are also affected by ambient light, so such methods cannot achieve high recognition accuracy in many situations. Moreover, a camera captures images of the user, which easily raises privacy concerns. Some methods implement gesture recognition using ultrasonic radar and machine learning, but because the resolution of ultrasonic radar is particularly low, it is difficult to achieve a high gesture recognition rate.
Improving recognition accuracy is an important goal of gesture recognition. The resolution of an imaging radar is higher than that of ultrasonic radar, its imaging result is not affected by ambient light, and its detection range can penetrate object obstacles. When an imaging radar is used for hand recognition, the signal data volume is small and the requirement on hardware computing speed is low. If an effective method for processing imaging radar data can be found, high-precision gesture recognition can be realized, which has wide application prospects.
Disclosure of Invention
The invention aims to provide a gesture recognition method based on an imaging radar, so as to solve the problems in the prior art that the dynamic gesture recognition rate is low and the recognition result is strongly affected by ambient light, and to improve the accuracy of gesture recognition.
The invention provides a gesture recognition method based on an imaging radar, which comprises the following steps:
(1) collecting radar images of the dynamic gestures to be recognized and forming a matrix A from the collected images, wherein A is an (N*S*T)*(W*H) matrix; the matrix A contains N dynamic gestures, each dynamic gesture has S sample sequences, each sample sequence consists of T radar images, and each radar image contains W*H pixels, where W is the width of the radar image and H is the height of the radar image;
(2) collecting radar images of arbitrary gestures to form a matrix B, wherein B is an M*(W*H) matrix; the matrix B contains M radar images, each radar image contains W*H pixels, and W and H are the same as in step (1), W being the width of the radar image and H the height of the radar image;
(3) constructing a self-encoding-decoding neural network E, specifically comprising the following steps:
(3-1) the first layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W1 is an Lw1*Pw1*Qw1 matrix, where Lw1 is the number of channels of the convolution kernel, Pw1 is the width of the convolution kernel, and Qw1 is the height of the convolution kernel;
(3-2) the second layer of the self-encoding-decoding neural network E is a pooling neural network whose convolution kernel weight W2 is a 2*2 matrix;
(3-3) the third layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W3 is an Lw3*Pw3*Qw3 matrix, where Lw3 is the number of channels of the convolution kernel, Pw3 is the width of the convolution kernel, and Qw3 is the height of the convolution kernel;
(3-4) the fourth layer of the self-encoding-decoding neural network E is a pooling neural network whose convolution kernel W4 is a 2*2 matrix;
(3-5) the fifth layer of the self-encoding-decoding neural network E is a fully connected neural network with F5 neurons, whose neuron weight is W5;
(3-6) the sixth layer of the self-encoding-decoding neural network E is a fully connected neural network with F6 neurons, whose neuron weight is W6;
(3-7) the seventh layer of the self-encoding-decoding neural network E is an upsampling neural network whose convolution kernel W7 is a 2*2 matrix;
(3-8) the eighth layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W8 is an Lw8*Pw8*Qw8 matrix, where Lw8 is the number of channels of the convolution kernel, Pw8 is the width of the convolution kernel, and Qw8 is the height of the convolution kernel;
(3-9) the ninth layer of the self-encoding-decoding neural network E is an upsampling neural network whose convolution kernel W9 is a 2*2 matrix;
(3-10) the tenth layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W10 is an Lw10*Pw10*Qw10 matrix, where Lw10 is the number of channels of the convolution kernel, Pw10 is the width of the convolution kernel, and Qw10 is the height of the convolution kernel;
(3-11) the self-encoding-decoding neural network E is obtained according to steps (3-1) to (3-10);
(4) inputting the matrix B of arbitrary-gesture radar images collected in step (2) into the self-encoding-decoding neural network E of step (3-11), whose output is the self-encoded-decoded radar image E(B);
(5) with the reconstruction error ||B - E(B)||^2 as the loss function, training the self-encoding-decoding neural network E of step (3-11) by a gradient descent method to obtain the trained matrices W1', W2', ..., W10';
(6) constructing a feature extraction neural network C using the matrices W1', W2', ..., W5' trained in step (5), specifically comprising the following steps:
(6-1) the first layer of the feature extraction neural network C is a convolutional neural network with convolution kernel W1';
(6-2) the second layer of the feature extraction neural network C is a pooling neural network with convolution kernel W2';
(6-3) the third layer of the feature extraction neural network C is a convolutional neural network with convolution kernel W3';
(6-4) the fourth layer of the feature extraction neural network C is a pooling neural network with convolution kernel W4';
(6-5) the fifth layer of the feature extraction neural network C is a fully connected neural network with F5 neurons, whose neuron weight is W5';
(6-6) the feature extraction neural network C is obtained according to steps (6-1) to (6-5);
(7) inputting the matrix A of step (1) into the feature extraction neural network C of step (6-6) to obtain a feature matrix CM, wherein CM is an (N*S*T)*F5 matrix containing the features of the N dynamic gestures of step (1); that is, each dynamic gesture has S sample sequences, each sample sequence has T radar images (T being the number of radar images per sequence in step (1)), and F5 features are extracted from each radar image, F5 being the number of neurons in the fifth layer of the feature extraction neural network C;
(8) constructing a recurrent neural network RN, specifically comprising the following steps:
(8-1) the first layer of the recurrent neural network RN is a long short-term memory (LSTM) neural network layer with R neurons, whose neuron weight is Wifco;
(8-2) the second layer of the recurrent neural network RN is a SoftMax classification layer with N+1 neurons, where N is the number of dynamic gesture types in step (1), and its neuron weight is Ws;
(8-3) the recurrent neural network RN is obtained according to steps (8-1) and (8-2);
(9) inputting the feature matrix CM of step (7) into the recurrent neural network RN of step (8), which outputs a predicted dynamic gesture classification result RE;
(10) with maximizing the accuracy of the dynamic gesture classification result RE of step (9) as the training objective, training the weights Wifco and Ws of the recurrent neural network RN of step (8) by a gradient descent method to obtain the trained matrices Wifco' and Ws';
(11) using the weights W1', W2', ..., W5' trained in step (5) and the weights Wifco' and Ws' trained in step (10), constructing a gesture recognition neural network GR, specifically comprising the following steps:
(11-1) the first layer of the gesture recognition neural network GR is a convolutional neural network with convolution kernel W1';
(11-2) the second layer of the gesture recognition neural network GR is a pooling neural network with convolution kernel W2';
(11-3) the third layer of the gesture recognition neural network GR is a convolutional neural network with convolution kernel W3';
(11-4) the fourth layer of the gesture recognition neural network GR is a pooling neural network with convolution kernel W4';
(11-5) the fifth layer of the gesture recognition neural network GR is a fully connected neural network with F5 neurons, i.e. the number of neurons of the feature extraction neural network C in step (6), and its neuron weight is W5';
(11-6) the sixth layer of the gesture recognition neural network GR is a long short-term memory (LSTM) neural network layer with R neurons, i.e. the number of neurons of the recurrent neural network RN in step (8), and its neuron weight is Wifco';
(11-7) the seventh layer of the gesture recognition neural network GR is a SoftMax classification layer with N+1 neurons, whose neuron weight is Ws', i.e. the matrix Ws' trained in step (10);
(11-8) the gesture recognition neural network GR is obtained according to steps (11-1) to (11-7);
(12) acquiring imaging radar images of the target to be recognized in real time, wherein every T consecutive images form a sequence I; the sequence I serves as the real-time input of the gesture recognition neural network GR of step (11), and the output of the gesture recognition neural network GR is the gesture of the target to be recognized, thereby realizing gesture recognition based on the imaging radar.
The imaging radar-based gesture recognition method provided by the invention has the following advantages:
In the gesture recognition method based on an imaging radar, the imaging radar serves as the hardware carrier, and a self-encoding technique is combined with a recurrent neural network to recognize the user's dynamic gestures with high accuracy. The method can be applied to different imaging radars. Compared with gesture recognition methods based on camera equipment, the module implemented by this method is lighter, and recognition of the gesture is not affected by the light intensity in the environment; because no camera video needs to be recorded, user privacy is not leaked. The method can be applied in many scenarios such as smart home appliance control and smart car cockpit control.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention.
Detailed Description
The flow chart of the imaging radar-based gesture recognition method provided by the invention is shown in FIG. 1; the method comprises the following steps:
(1) collecting radar images of the dynamic gestures to be recognized and forming a matrix A from the collected images, wherein A is an (N*S*T)*(W*H) matrix; the matrix A contains N dynamic gestures, each dynamic gesture has S sample sequences, each sample sequence consists of T radar images, and each radar image contains W*H pixels, where W is the width of the radar image and H is the height of the radar image;
(2) collecting radar images of arbitrary gestures to form a matrix B, wherein B is an M*(W*H) matrix; the matrix B contains M radar images, each radar image contains W*H pixels, and W and H are the same as in step (1), W being the width of the radar image and H the height of the radar image;
(3) constructing a self-encoding-decoding neural network E, specifically comprising the following steps:
(3-1) the first layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W1 is an Lw1*Pw1*Qw1 matrix, where Lw1 is the number of channels of the convolution kernel, Pw1 is the width of the convolution kernel, and Qw1 is the height of the convolution kernel;
(3-2) the second layer of the self-encoding-decoding neural network E is a pooling neural network whose convolution kernel weight W2 is a 2*2 matrix;
(3-3) the third layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W3 is an Lw3*Pw3*Qw3 matrix, where Lw3 is the number of channels of the convolution kernel, Pw3 is the width of the convolution kernel, and Qw3 is the height of the convolution kernel;
(3-4) the fourth layer of the self-encoding-decoding neural network E is a pooling neural network whose convolution kernel W4 is a 2*2 matrix;
(3-5) the fifth layer of the self-encoding-decoding neural network E is a fully connected neural network with F5 neurons, whose neuron weight is W5;
(3-6) the sixth layer of the self-encoding-decoding neural network E is a fully connected neural network with F6 neurons, whose neuron weight is W6;
(3-7) the seventh layer of the self-encoding-decoding neural network E is an upsampling neural network whose convolution kernel W7 is a 2*2 matrix;
(3-8) the eighth layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W8 is an Lw8*Pw8*Qw8 matrix, where Lw8 is the number of channels of the convolution kernel, Pw8 is the width of the convolution kernel, and Qw8 is the height of the convolution kernel;
(3-9) the ninth layer of the self-encoding-decoding neural network E is an upsampling neural network whose convolution kernel W9 is a 2*2 matrix;
(3-10) the tenth layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W10 is an Lw10*Pw10*Qw10 matrix, where Lw10 is the number of channels of the convolution kernel, Pw10 is the width of the convolution kernel, and Qw10 is the height of the convolution kernel;
(3-11) the self-encoding-decoding neural network E is obtained according to steps (3-1) to (3-10);
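By way of a non-limiting illustration, the encoder-decoder structure of steps (3-1) to (3-10) could be realized, for example, in Python with the PyTorch library; the framework, kernel sizes, channel counts, activation functions and image size used below are assumptions for illustration only and are not prescribed by the method.

```python
import torch
import torch.nn as nn

class AutoEncoderE(nn.Module):
    """Illustrative sketch of the self-encoding-decoding network E (layers (3-1)-(3-10))."""
    def __init__(self, W=32, H=32, Lw1=8, Lw3=16, F5=64):
        super().__init__()
        flat = (W // 4) * (H // 4) * Lw3                                # size after two 2*2 poolings
        # Encoder: conv (W1) -> pool (W2) -> conv (W3) -> pool (W4) -> fully connected (W5)
        self.encoder = nn.Sequential(
            nn.Conv2d(1, Lw1, kernel_size=3, padding=1), nn.ReLU(),    # layer 1: kernel W1
            nn.MaxPool2d(2),                                            # layer 2: kernel W2 (2*2)
            nn.Conv2d(Lw1, Lw3, kernel_size=3, padding=1), nn.ReLU(),  # layer 3: kernel W3
            nn.MaxPool2d(2),                                            # layer 4: kernel W4 (2*2)
            nn.Flatten(),
            nn.Linear(flat, F5), nn.ReLU(),                             # layer 5: F5 neurons, weight W5
        )
        # Decoder: fully connected (W6) -> upsample (W7) -> conv (W8) -> upsample (W9) -> conv (W10)
        self.decoder_fc = nn.Linear(F5, flat)                           # layer 6: F6 neurons, weight W6
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2),                                # layer 7: kernel W7 (2*2)
            nn.Conv2d(Lw3, Lw1, kernel_size=3, padding=1), nn.ReLU(),  # layer 8: kernel W8
            nn.Upsample(scale_factor=2),                                # layer 9: kernel W9 (2*2)
            nn.Conv2d(Lw1, 1, kernel_size=3, padding=1),                # layer 10: kernel W10
        )
        self.Lw3, self.W, self.H = Lw3, W, H

    def forward(self, x):                      # x: (batch, 1, H, W) radar images
        z = self.encoder(x)                    # F5-dimensional code per image
        y = self.decoder_fc(z).view(-1, self.Lw3, self.H // 4, self.W // 4)
        return self.decoder(y)                 # reconstructed radar image E(B)
```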
(4) inputting the matrix B of arbitrary-gesture radar images collected in step (2) into the self-encoding-decoding neural network E of step (3-11), whose output is the self-encoded-decoded radar image E(B);
(5) with the reconstruction error ||B - E(B)||^2 as the loss function, training the self-encoding-decoding neural network E of step (3-11) by a gradient descent method to obtain the trained matrices W1', W2', ..., W10';
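A minimal sketch of the training of step (5) is given below, assuming PyTorch and a plain SGD optimizer (the optimizer, learning rate and epoch count are illustrative assumptions); the loss is the reconstruction error ||B - E(B)||^2 between the input images and the decoded output.

```python
import torch

def train_autoencoder(model, B, epochs=50, lr=1e-3):
    """Train E on the arbitrary-gesture radar images B (tensor of shape (M, 1, H, W))."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # plain gradient descent
    loss_fn = torch.nn.MSELoss(reduction="sum")              # squared reconstruction error
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(B), B)                          # ||B - E(B)||^2
        loss.backward()
        optimizer.step()
    return model                                             # weights W1', ..., W10' are now trained
```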
(6) constructing a feature extraction neural network C using the matrices W1', W2', ..., W5' trained in step (5), specifically comprising the following steps:
(6-1) the first layer of the feature extraction neural network C is a convolutional neural network with convolution kernel W1';
(6-2) the second layer of the feature extraction neural network C is a pooling neural network with convolution kernel W2';
(6-3) the third layer of the feature extraction neural network C is a convolutional neural network with convolution kernel W3';
(6-4) the fourth layer of the feature extraction neural network C is a pooling neural network with convolution kernel W4';
(6-5) the fifth layer of the feature extraction neural network C is a fully connected neural network with F5 neurons, whose neuron weight is W5';
(6-6) the feature extraction neural network C is obtained according to steps (6-1) to (6-5);
(7) inputting the matrix A of step (1) into the feature extraction neural network C of step (6-6) to obtain a feature matrix CM, wherein CM is an (N*S*T)*F5 matrix containing the features of the N dynamic gestures of step (1); that is, each dynamic gesture has S sample sequences, each sample sequence has T radar images (T being the number of radar images per sequence in step (1)), and F5 features are extracted from each radar image, F5 being the number of neurons in the fifth layer of the feature extraction neural network C;
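Steps (6) and (7) can be illustrated by reusing the trained encoder half of E as the feature extraction network C and passing every radar image of A through it; the sketch below assumes the AutoEncoderE class from the earlier illustration and is not a prescribed implementation.

```python
import torch

def extract_features(trained_E, A):
    """Apply the feature extraction network C (the trained encoder of E) to the matrix A.

    A is a tensor of shape (N*S*T, 1, H, W); the result CM has shape (N*S*T, F5),
    i.e. F5 features per radar image, following the notation of steps (1) and (7).
    """
    C = trained_E.encoder            # layers holding the trained kernels W1'-W5'
    with torch.no_grad():
        CM = C(A)
    return CM
```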
(8) constructing a recurrent neural network RN, specifically comprising the following steps:
(8-1) the first layer of the recurrent neural network RN is a long short-term memory (LSTM) neural network layer with R neurons, whose neuron weight is Wifco;
(8-2) the second layer of the recurrent neural network RN is a SoftMax classification layer with N+1 neurons, where N is the number of dynamic gesture types in step (1), and its neuron weight is Ws;
(8-3) the recurrent neural network RN is obtained according to steps (8-1) and (8-2);
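An illustrative sketch of the recurrent network RN of step (8) is shown below, again assuming PyTorch; the value of R and the use of the last time step of the LSTM output for classification are assumptions not fixed by the method.

```python
import torch
import torch.nn as nn

class RecurrentRN(nn.Module):
    """Sketch of RN: an LSTM layer (weights Wifco) followed by a SoftMax layer (weights Ws)."""
    def __init__(self, F5=64, R=128, N=8):
        super().__init__()
        self.lstm = nn.LSTM(input_size=F5, hidden_size=R, batch_first=True)  # layer 1: R neurons
        self.classifier = nn.Linear(R, N + 1)                                 # layer 2: N+1 neurons

    def forward(self, cm_seq):                      # cm_seq: (batch, T, F5) feature sequences
        out, _ = self.lstm(cm_seq)
        logits = self.classifier(out[:, -1, :])     # classify from the last time step
        return torch.softmax(logits, dim=-1)        # predicted gesture distribution RE
```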
(9) inputting the feature matrix CM of step (7) into the recurrent neural network RN of step (8), which outputs a predicted dynamic gesture classification result RE;
(10) with maximizing the accuracy of the dynamic gesture classification result RE of step (9) as the training objective, training the weights Wifco and Ws of the recurrent neural network RN of step (8) by a gradient descent method to obtain the trained matrices Wifco' and Ws';
(11) using the weights W1', W2', ..., W5' trained in step (5) and the weights Wifco' and Ws' trained in step (10), constructing a gesture recognition neural network GR, specifically comprising the following steps:
(11-1) the first layer of the gesture recognition neural network GR is a convolutional neural network with convolution kernel W1';
(11-2) the second layer of the gesture recognition neural network GR is a pooling neural network with convolution kernel W2';
(11-3) the third layer of the gesture recognition neural network GR is a convolutional neural network with convolution kernel W3';
(11-4) the fourth layer of the gesture recognition neural network GR is a pooling neural network with convolution kernel W4';
(11-5) the fifth layer of the gesture recognition neural network GR is a fully connected neural network with F5 neurons, i.e. the number of neurons of the feature extraction neural network C in step (6), and its neuron weight is W5';
(11-6) the sixth layer of the gesture recognition neural network GR is a long short-term memory (LSTM) neural network layer with R neurons, i.e. the number of neurons of the recurrent neural network RN in step (8), and its neuron weight is Wifco';
(11-7) the seventh layer of the gesture recognition neural network GR is a SoftMax classification layer with N+1 neurons, whose neuron weight is Ws', i.e. the matrix Ws' trained in step (10);
(11-8) the gesture recognition neural network GR is obtained according to steps (11-1) to (11-7);
(12) acquiring imaging radar images of the target to be recognized in real time, wherein every T consecutive images form a sequence I; the sequence I serves as the real-time input of the gesture recognition neural network GR of step (11), and the output of the gesture recognition neural network GR is the gesture of the target to be recognized, thereby realizing gesture recognition based on the imaging radar.
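The real-time recognition of steps (11) and (12) can be sketched as follows, assuming the encoder C and recurrent network RN from the earlier illustrations; the buffering strategy for the latest T radar frames is an illustrative assumption rather than part of the method.

```python
import torch

def recognize_gesture(encoder_C, recurrent_RN, image_buffer, T):
    """Classify the latest T radar frames; image_buffer holds (1, H, W) tensors."""
    if len(image_buffer) < T:
        return None                                  # not enough frames collected yet
    I = torch.stack(image_buffer[-T:])               # sequence I of the last T images
    with torch.no_grad():
        feats = encoder_C(I).unsqueeze(0)            # (1, T, F5) feature sequence
        probs = recurrent_RN(feats)                  # SoftMax output over the N+1 classes
    return int(probs.argmax(dim=-1))                 # index of the recognized gesture
```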

Claims (1)

1. A gesture recognition method based on an imaging radar, characterized by comprising the following steps:
(1) collecting radar images of the dynamic gestures to be recognized and forming a matrix A from the collected images, wherein A is an (N*S*T)*(W*H) matrix; the matrix A contains N dynamic gestures, each dynamic gesture has S sample sequences, each sample sequence consists of T radar images, and each radar image contains W*H pixels, where W is the width of the radar image and H is the height of the radar image;
(2) collecting radar images of arbitrary gestures to form a matrix B, wherein B is an M*(W*H) matrix; the matrix B contains M radar images, each radar image contains W*H pixels, and W and H are the same as in step (1), W being the width of the radar image and H the height of the radar image;
(3) constructing a self-encoding-decoding neural network E, specifically comprising the following steps:
(3-1) the first layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W1 is an Lw1*Pw1*Qw1 matrix, where Lw1 is the number of channels of the convolution kernel, Pw1 is the width of the convolution kernel, and Qw1 is the height of the convolution kernel;
(3-2) the second layer of the self-encoding-decoding neural network E is a pooling neural network whose convolution kernel weight W2 is a 2*2 matrix;
(3-3) the third layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W3 is an Lw3*Pw3*Qw3 matrix, where Lw3 is the number of channels of the convolution kernel, Pw3 is the width of the convolution kernel, and Qw3 is the height of the convolution kernel;
(3-4) the fourth layer of the self-encoding-decoding neural network E is a pooling neural network whose convolution kernel W4 is a 2*2 matrix;
(3-5) the fifth layer of the self-encoding-decoding neural network E is a fully connected neural network with F5 neurons, whose neuron weight is W5;
(3-6) the sixth layer of the self-encoding-decoding neural network E is a fully connected neural network with F6 neurons, whose neuron weight is W6;
(3-7) the seventh layer of the self-encoding-decoding neural network E is an upsampling neural network whose convolution kernel W7 is a 2*2 matrix;
(3-8) the eighth layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W8 is an Lw8*Pw8*Qw8 matrix, where Lw8 is the number of channels of the convolution kernel, Pw8 is the width of the convolution kernel, and Qw8 is the height of the convolution kernel;
(3-9) the ninth layer of the self-encoding-decoding neural network E is an upsampling neural network whose convolution kernel W9 is a 2*2 matrix;
(3-10) the tenth layer of the self-encoding-decoding neural network E is a convolutional neural network whose convolution kernel weight W10 is an Lw10*Pw10*Qw10 matrix, where Lw10 is the number of channels of the convolution kernel, Pw10 is the width of the convolution kernel, and Qw10 is the height of the convolution kernel;
(3-11) the self-encoding-decoding neural network E is obtained according to steps (3-1) to (3-10);
(4) inputting the matrix B of arbitrary-gesture radar images collected in step (2) into the self-encoding-decoding neural network E of step (3-11), whose output is the self-encoded-decoded radar image E(B);
(5) with the reconstruction error ||B - E(B)||^2 as the loss function, training the self-encoding-decoding neural network E of step (3-11) by a gradient descent method to obtain the trained matrices W1', W2', ..., W10';
(6) constructing a feature extraction neural network C using the matrices W1', W2', ..., W5' trained in step (5), specifically comprising the following steps:
(6-1) the first layer of the feature extraction neural network C is a convolutional neural network with convolution kernel W1';
(6-2) the second layer of the feature extraction neural network C is a pooling neural network with convolution kernel W2';
(6-3) the third layer of the feature extraction neural network C is a convolutional neural network with convolution kernel W3';
(6-4) the fourth layer of the feature extraction neural network C is a pooling neural network with convolution kernel W4';
(6-5) the fifth layer of the feature extraction neural network C is a fully connected neural network with F5 neurons, whose neuron weight is W5';
(6-6) the feature extraction neural network C is obtained according to steps (6-1) to (6-5);
(7) inputting the matrix A of step (1) into the feature extraction neural network C of step (6-6) to obtain a feature matrix CM, wherein CM is an (N*S*T)*F5 matrix containing the features of the N dynamic gestures of step (1); that is, each dynamic gesture has S sample sequences, each sample sequence has T radar images (T being the number of radar images per sequence in step (1)), and F5 features are extracted from each radar image, F5 being the number of neurons in the fifth layer of the feature extraction neural network C;
(8) constructing a recurrent neural network RN, specifically comprising the following steps:
(8-1) the first layer of the recurrent neural network RN is a long short-term memory (LSTM) neural network layer with R neurons, whose neuron weight is Wifco;
(8-2) the second layer of the recurrent neural network RN is a SoftMax classification layer with N+1 neurons, where N is the number of dynamic gesture types in step (1), and its neuron weight is Ws;
(8-3) the recurrent neural network RN is obtained according to steps (8-1) and (8-2);
(9) inputting the feature matrix CM of step (7) into the recurrent neural network RN of step (8), which outputs a predicted dynamic gesture classification result RE;
(10) with maximizing the accuracy of the dynamic gesture classification result RE of step (9) as the training objective, training the weights Wifco and Ws of the recurrent neural network RN of step (8) by a gradient descent method to obtain the trained matrices Wifco' and Ws';
(11) using the weights W1', W2', ..., W5' trained in step (5) and the weights Wifco' and Ws' trained in step (10), constructing a gesture recognition neural network GR, specifically comprising the following steps:
(11-1) the first layer of the gesture recognition neural network GR is a convolutional neural network with convolution kernel W1';
(11-2) the second layer of the gesture recognition neural network GR is a pooling neural network with convolution kernel W2';
(11-3) the third layer of the gesture recognition neural network GR is a convolutional neural network with convolution kernel W3';
(11-4) the fourth layer of the gesture recognition neural network GR is a pooling neural network with convolution kernel W4';
(11-5) the fifth layer of the gesture recognition neural network GR is a fully connected neural network with F5 neurons, i.e. the number of neurons of the feature extraction neural network C in step (6), and its neuron weight is W5';
(11-6) the sixth layer of the gesture recognition neural network GR is a long short-term memory (LSTM) neural network layer with R neurons, i.e. the number of neurons of the recurrent neural network RN in step (8), and its neuron weight is Wifco';
(11-7) the seventh layer of the gesture recognition neural network GR is a SoftMax classification layer with N+1 neurons, whose neuron weight is Ws', i.e. the matrix Ws' trained in step (10);
(11-8) the gesture recognition neural network GR is obtained according to steps (11-1) to (11-7);
(12) acquiring imaging radar images of the target to be recognized in real time, wherein every T consecutive images form a sequence I; the sequence I serves as the real-time input of the gesture recognition neural network GR of step (11), and the output of the gesture recognition neural network GR is the gesture of the target to be recognized, thereby realizing gesture recognition based on the imaging radar.
CN202010215230.5A 2020-03-24 2020-03-24 Gesture recognition method based on imaging radar Active CN111444820B (en)

Priority Applications (1)

CN202010215230.5A (CN111444820B): priority date 2020-03-24; filing date 2020-03-24; title: Gesture recognition method based on imaging radar

Applications Claiming Priority (1)

CN202010215230.5A (CN111444820B): priority date 2020-03-24; filing date 2020-03-24; title: Gesture recognition method based on imaging radar

Publications (2)

CN111444820A: published 2020-07-24
CN111444820B: published 2021-06-04

Family

ID=71629507

Family Applications (1)

CN202010215230.5A (CN111444820B): Active; priority date 2020-03-24; filing date 2020-03-24

Country Status (1)

CN: CN111444820B

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3467707B1 (en) * 2017-10-07 2024-03-13 Tata Consultancy Services Limited System and method for deep learning based hand gesture recognition in first person view
EP3727145B1 (en) * 2017-12-22 2024-01-24 ResMed Sensor Technologies Limited Apparatus, system, and method for physiological sensing in vehicles
CN108509910B (en) * 2018-04-02 2021-09-28 重庆邮电大学 Deep learning gesture recognition method based on FMCW radar signals
CN110569823B (en) * 2019-09-18 2023-04-18 西安工业大学 Sign language identification and skeleton generation method based on RNN

Also Published As

Publication number Publication date
CN111444820A (en) 2020-07-24

Similar Documents

Publication Publication Date Title
CN108257158B (en) Target prediction and tracking method based on recurrent neural network
CN110348288B (en) Gesture recognition method based on 77GHz millimeter wave radar signal
CN108519812B (en) Three-dimensional micro Doppler gesture recognition method based on convolutional neural network
CN111461037B (en) End-to-end gesture recognition method based on FMCW radar
CN111476058B (en) Gesture recognition method based on millimeter wave radar
CN111695457A (en) Human body posture estimation method based on weak supervision mechanism
CN111813222B (en) Terahertz radar-based fine dynamic gesture recognition method
CN111157988A (en) Gesture radar signal processing method based on RDTM and ATM fusion
Tang et al. Human activity recognition based on mixed CNN with radar multi-spectrogram
CN116824629A (en) High-robustness gesture recognition method based on millimeter wave radar
Yu et al. The multi-level classification and regression network for visual tracking via residual channel attention
CN116206359A (en) Human gait recognition method based on millimeter wave radar and dynamic sampling neural network
CN113537120B (en) Complex convolution neural network target identification method based on complex coordinate attention
Li et al. Supervised domain adaptation for few-shot radar-based human activity recognition
CN112801928B (en) Attention mechanism-based millimeter wave radar and visual sensor fusion method
Zhang et al. Riddle: Real-time interacting with hand description via millimeter-wave sensor
CN111444820B (en) Gesture recognition method based on imaging radar
Qin et al. Dense sampling and detail enhancement network: Improved small object detection based on dense sampling and detail enhancement
CN116311353A (en) Intensive pedestrian multi-target tracking method based on feature fusion, computer equipment and storage medium
CN114067359B (en) Pedestrian detection method integrating human body key points and visible part attention characteristics
CN115713672A (en) Target detection method based on two-way parallel attention mechanism
CN115294656A (en) FMCW radar-based hand key point tracking method
Luo et al. EdgeActNet: Edge Intelligence-enabled Human Activity Recognition using Radar Point Cloud
CN116563313B (en) Remote sensing image soybean planting region segmentation method based on gating and attention fusion
Yan et al. Object Detection Method Based On Improved SSD Algorithm For Smart Grid

Legal Events

Date Code Title Description
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant