CN116524537A - Human body posture recognition method based on CNN and LSTM combination - Google Patents

Human body posture recognition method based on CNN and LSTM combination

Info

Publication number: CN116524537A
Application number: CN202310465077.5A
Authority: CN (China)
Prior art keywords: time, distance, channel, cnn, image
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 武其松, 孟德馨, 赵涤燹
Assignee: Southeast University; Network Communication and Security Zijinshan Laboratory (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Southeast University and Network Communication and Security Zijinshan Laboratory

Classifications

    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045 Combinations of networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Learning methods
    • G06V10/40 Extraction of image or video features
    • G06V10/764 Recognition using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/806 Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V10/82 Recognition using neural networks
    • Y02T10/40 Engine management systems


Abstract

The invention discloses a human body posture recognition method based on the combination of CNN and LSTM. First, the intermediate frequency signal sample data required for training and testing are collected. Second, a distance-dimension Fourier transform is applied to the intermediate frequency signal data to obtain a time-distance image; the target's distance-unit data are summed along the distance dimension to obtain a one-dimensional distance spectrum peak, on which a short-time Fourier transform is performed to obtain a time-frequency image; each image is then labeled with its category. Next, a three-channel deep learning neural network model is established, each channel combining a CNN network and an LSTM network: the first and second channels take the time-frequency image as input and extract features with convolution kernels of different sizes in the convolution layer, while the third channel takes the time-distance image as input. The data are fed into the model for training. By combining millimeter wave radar with CNN and LSTM networks, fusing multiple types of feature images, and fully exploiting time-series feature information, the method improves the accuracy of human body posture recognition.

Description

Human body posture recognition method based on CNN and LSTM combination
Technical Field
The invention relates to a human body posture recognition method based on the combination of CNN and LSTM, and in particular to a target posture recognition method combining millimeter wave radar, a convolutional neural network (CNN) and a long short-term memory network (LSTM).
Background
In a modern society of rapid technological development, target detection and motion recognition and classification have become important research directions. Monitoring and protecting elderly people with limited mobility in daily life, analyzing and judging road conditions in autonomous driving, and detecting criminals in counter-terrorism operations all fall within this field of research. Existing target recognition and action classification technology takes three main forms: first, camera-based methods that use recorded video and the complete picture for recognition; second, recognition of human actions through wearable sensor devices; and third, detection using non-contact sensors such as radar, vision sensors and infrared sensors. Besides the invasive privacy problem created by camera monitoring itself, transmitting the data stream to terminal equipment also raises the risk of privacy disclosure, while wearable devices cause considerable inconvenience in daily life. Radar is non-wearable, unaffected by illumination and atmospheric conditions, and capable of through-wall detection without privacy concerns, so it has gradually become a popular choice for indoor monitoring and recognition.
In addition, while traditional machine learning with manually extracted features is simple and effective for simple, specific tasks, deep learning, which developed from the traditional neural network, offers strong learning ability, good portability and high coverage: given a large amount of data, it can achieve recognition accuracy that even exceeds human performance, and the same network structure can be trained on different data. Deep learning has been widely applied in fields such as speech recognition, machine translation, image recognition and autonomous driving. Camera-based target detection generally relies on deep learning methods built on convolutional neural networks. Deep learning uses multi-layer networks to better capture image detail, segments and extracts target figures in the image, and obtains high-accuracy parameters by training on a training set and evaluating on a test set, so that a neural network can be built and images passed into the network can be classified.
With the development of signal processing technology, the behavior of a monitored target can be determined by analyzing the electromagnetic waves received by the radar antenna. The detection system contains background noise produced by reflections from static objects, which can interfere with identifying target objects. After background noise is removed from the echo data received by the radar, distance and Doppler frequency analysis of the time-varying signal is carried out through the distance Fourier transform and the short-time Fourier transform, overcoming the difficulty of obtaining distance and frequency directly from the time-domain signal. Because a human body produces signals of different frequencies under different actions, the resulting time-frequency and time-distance images differ, and the different actions can therefore be detected by a deep learning method.
Disclosure of Invention
To address these problems, the present invention provides a human body posture recognition method based on the combination of CNN and LSTM, which aims to solve the problem of low recognition accuracy while protecting privacy, and offers elderly people and related groups a way to recognize human action postures using millimeter wave radar.
The aim of the invention can be achieved by adopting the following technical scheme, which comprises the following steps:
step 1, obtaining intermediate frequency signals required by training and testing through a millimeter wave radar, and obtaining the intermediate frequency signals after mixing the transmitting signals with echo signals received by an receiving antenna.
The millimeter wave radar in the method is placed within the movement range of the identified object. After the electromagnetic wave signal is emitted through the transmitting antenna, the signal reflected by the monitored target and the background environment is received by the receiving antenna, realizing non-contact monitoring. During detection, signals need to be acquired from a plurality of different targets, and each target performs a plurality of different postures.
Step 2, performing distance dimension Fourier transform on the intermediate frequency signal to obtain a time-distance image, and summing the distance unit data of the target along the distance dimension to obtain a one-dimensional distance spectrum peak; and carrying out short-time Fourier transform on the one-dimensional distance spectrum peak to obtain a time-frequency image, and labeling the two images with labels of corresponding types.
Step 3. Establish an improved three-channel deep learning neural network model, combining CNN and LSTM networks in each of the three channels: the first and second channels take the time-frequency image as input and extract features in the convolution layer with convolution kernels of different sizes, while the third channel takes the time-distance image as input. Input the data into the model as required, train it, obtain the optimal model parameters, and save them.
In step 1, the intermediate frequency signal required for training and testing is obtained by the millimeter wave radar by mixing the transmitted signal with the echo signal received by the receiving antenna, as follows:
step 1.1, a linear frequency modulation continuous wave signal, also called chirp signal, transmitted by a millimeter wave radar transmitting antenna at the time t is:
wherein A is tx To transmit signal amplitude, f c Is carrier frequency, B is bandwidth, T c Is a sweep frequency period.
Step 1.2. The echo signal received by the receiving antenna is the transmitted signal delayed by the time τ_d:

    s_{rx}(t) = A_{rx} \exp\left[ j 2\pi \left( f_c (t - \tau_d) + \frac{B}{2 T_c} (t - \tau_d)^2 \right) \right]

The delay time τ_d can be expressed as:

    \tau_d = \frac{2d}{c}

where d is the distance between the target and the radar and c is the speed of light.
The echo signal and the transmitted signal are then mixed to obtain the intermediate frequency signal, where mixing refers to conjugate multiplication of the echo signal and the transmitted signal. The intermediate frequency signal has the form:

    s_{IF}(t) = A_{IF} \exp\left[ j \left( 2\pi f_b t + \phi_b \right) \right]

where A_IF is the amplitude of the intermediate frequency signal, f_b = (B / T_c) τ_d is the frequency of the intermediate frequency signal, and φ_b = 2π f_c τ_d is the phase of the intermediate frequency signal.
In practice, signal processing is usually done in the digital domain, which requires sampling the signal. For multi-cycle chirp signals, f_b and φ_b are related to the time interval between chirps. Assume the sampling rate of the millimeter wave radar system is f_s; the discrete sampling form of the intermediate frequency signal is then:

    s_{IF}(m, n) = A_{IF} \exp\left[ j \left( 2\pi f_b \frac{m}{f_s} + \phi_b(n) \right) \right]

where m is the fast-time sampling point, characterizing the distance-dimension information of the signal, and n is the slow-time sampling point, characterizing the Doppler information of the signal.
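As a rough, self-contained sketch of this discrete signal model, the following snippet simulates the sampled IF signal for one frame. All concrete values are illustrative assumptions taken from the embodiment given later (B = 1.6 GHz, T_c = 40.96 μs, f_s = 6.25 MHz, 256 fast-time points, 100 chirps), together with a single static target at an assumed 1.5 m:

```python
import numpy as np

# Illustrative FMCW parameters (from the embodiment) and an assumed target
c = 3e8                          # speed of light (m/s)
B, Tc, fs = 1.6e9, 40.96e-6, 6.25e6
M, N, d = 256, 100, 1.5          # fast-time points, chirps, target distance (m)

tau = 2 * d / c                  # round-trip delay tau_d = 2d/c
fb = (B / Tc) * tau              # beat frequency f_b = (B/Tc) * tau_d

m = np.arange(M)
# Discrete IF samples for one chirp: s_IF[m] = exp(j 2*pi f_b m / f_s)
s_if = np.exp(1j * 2 * np.pi * fb * m / fs)
# A static target gives the same tone on every chirp (slow-time phase omitted)
frame = np.tile(s_if, (N, 1))    # N chirps (slow time) x M samples (fast time)
```

For a target at 1.5 m with these parameters the beat frequency works out to about 390.6 kHz, well inside the 6.25 MHz sampling rate.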
In step 2, a distance-dimension Fourier transform is performed on the intermediate frequency signal to obtain a time-distance image, the target's distance-unit data are summed along the distance dimension to obtain a one-dimensional distance spectrum peak, a short-time Fourier transform is performed on the one-dimensional distance spectrum peak to obtain a time-frequency image, and the two kinds of images are labeled with their corresponding categories, as follows:
step 2.1, performing distance dimension Fourier transform on the sampled intermediate frequency signal to obtain a time-distance image:
wherein, the liquid crystal display device comprises a liquid crystal display device,representing fourier transform of the fast time dimension, and k represents the distance dimension sampling point after fourier transform of the fast time dimension.
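A minimal numpy sketch of this fast-time FFT step. The single-target IF tone is regenerated here so the snippet stands on its own (parameters follow the embodiment; the target distance of 1.5 m is an illustrative assumption); each chirp becomes a range profile whose peak falls in the distance unit of size c/2B nearest the target:

```python
import numpy as np

# Same illustrative single-target IF frame as in the signal-model sketch
c, B, Tc, fs = 3e8, 1.6e9, 40.96e-6, 6.25e6
M, N, d = 256, 100, 1.5
fb = (B / Tc) * (2 * d / c)
frame = np.tile(np.exp(1j * 2 * np.pi * fb * np.arange(M) / fs), (N, 1))

# Distance-dimension FFT over fast time: rows = slow time n, cols = distance units k
time_distance = np.abs(np.fft.fft(frame, axis=1))

peak_bin = int(np.argmax(time_distance[0]))  # distance unit containing the target
d_est = peak_bin * c / (2 * B)               # distance-unit size is c / (2B)
```

With B = 1.6 GHz the distance-unit size is c/2B ≈ 9.4 cm, so the 1.5 m target lands in unit 16.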
Step 2.2. Sum the target's distance-unit data to obtain the one-dimensional distance spectrum peak, and perform a short-time Fourier transform to obtain the time-frequency image:

    D(p, l) = \mathrm{STFT}\left\{ \sum_{k=k_0}^{k_1} R(k, n) \right\}

where STFT denotes the short-time Fourier transform, p denotes the time-dimension sampling point after the short-time Fourier transform, l denotes the Doppler-dimension sampling point after the short-time Fourier transform, k_0 denotes the starting distance unit crossed by the motion trajectory of the target to be detected, and k_1 denotes the distance unit at which the motion trajectory of the target to be detected ends.
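The STFT step can be sketched with `scipy.signal.stft`. The chirped test tone below is only a stand-in for the summed slow-time signal, and the sampling rate and window length are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.signal import stft

fs = 1000.0                          # illustrative slow-time sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)
# A tone whose frequency rises over time, standing in for micro-Doppler content
x = np.sin(2 * np.pi * (100 * t + 100 * t ** 2))

# f: Doppler-dimension bins l, p: time-dimension bins, Zxx: complex STFT
f, p, Zxx = stft(x, fs=fs, nperseg=128)
tf_image = np.abs(Zxx)               # the time-frequency image
```

The magnitude `tf_image` is what gets saved as the time-frequency image and fed to the network; the window length (`nperseg`) trades time resolution against Doppler resolution.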
Step 2.3. Label the time-distance images and time-frequency images corresponding to the various actions with their respective labels, and store them in different folders by class.
In step 3, an improved three-channel deep learning neural network model is established, and the postures are classified by combining CNN and LSTM, as follows:
and 3.1, constructing a three-channel network, and taking the time-distance image generated in the step 2 as the input of the first channel and the second channel and the time-distance image as the input of the third channel.
Step 3.2. In the first and third channels, feature extraction is first performed in the convolution layer with a convolution kernel of size a×a, using zero padding to ensure that image edge information is not lost; an average pooling layer is then used to reduce the amount of parameter computation, and the feature map is converted into sequence data and fed into the LSTM network. In the second channel, features are extracted in the convolution layer with a convolution kernel of size b×b, where b is larger than a, because the time-frequency images transformed from different posture signals have clearly different peaks, and different convolution kernel sizes extract features of different granularity.
Step 3.3. Fuse the feature maps of the three channels using the concatenate() method in the keras library, apply a nonlinear operation to the feature maps with the ReLU function, and classify through the fully connected layer with the softmax function. The softmax function converts a set of numbers into a probability distribution, mapping each original value to a probability between 0 and 1:

    \mathrm{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}

where x_i is the original value to be converted, n is the number of categories, and j indexes the categories.
Step 3.4. Compile the model, using the categorical cross-entropy function as the loss function to compute the error between the true labels and the predicted outputs and obtain the model accuracy. The categorical cross-entropy function is:

    L = -\sum_{i=1}^{n} y_i \log(\hat{y}_i)

where y_i is the true label of category i and ŷ_i is the predicted probability of category i.
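The softmax and cross-entropy formulas above can be checked with a minimal numpy sketch (the logits and one-hot label are purely illustrative):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))        # subtract the max for numerical stability
    return e / e.sum()

def categorical_cross_entropy(y_true, y_pred):
    # y_true is a one-hot label, y_pred a probability distribution over classes
    return -np.sum(y_true * np.log(y_pred))

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)              # sums to 1; largest logit -> largest probability
y_true = np.array([1.0, 0.0, 0.0])
loss = categorical_cross_entropy(y_true, probs)
```

When the predicted probability of the true class approaches 1 the loss approaches 0, which is what drives training toward correct classifications.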
further, compared with the prior art, the human body posture recognition method based on the combination of CNN and LSTM has the following advantages that:
1) The data set of the human body gesture recognition method based on the combination of millimeter wave radar and deep learning provided by the invention comes from a plurality of targets, and has higher diversity and universality;
2) The human body gesture recognition method based on the combination of millimeter wave radar and deep learning can effectively remove low-frequency clutter generated by static objects in the environment;
3) The human body gesture recognition method based on the millimeter wave radar and the deep learning combination can effectively extract the characteristics in the time-frequency image and the time-distance image obtained after the signals are transformed;
4) The human body posture recognition method based on the combination of millimeter wave radar and deep learning provided by the invention can be used for recognizing the human body posture with higher accuracy by fusing multiple aspects of information.
Drawings
FIG. 1 is a flow chart of the method of the present disclosure;
FIG. 2 is a deep learning neural network diagram of the present invention;
FIGS. 3.1 (a) and 3.1 (b) are the time-frequency image and time-distance image, respectively, of the basketball-playing posture;
figs. 3.2 (a) and 3.2 (b) are the time-frequency image and time-distance image, respectively, of the boxing posture;
figs. 3.3 (a) and 3.3 (b) are the time-frequency image and time-distance image, respectively, of the dancing posture;
figs. 3.4 (a) and 3.4 (b) are the time-frequency image and time-distance image, respectively, of the jumping-in-place posture;
figs. 3.5 (a) and 3.5 (b) are the time-frequency image and time-distance image, respectively, of the running posture;
fig. 4 (a) and fig. 4 (b) are training set and validation set results, respectively.
Detailed Description
To make the objects, embodiments and advantages of the present invention clearer, the embodiments are described in more detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1-3.5 (b), the invention provides a human body posture recognition method based on combination of CNN and LSTM, comprising the steps of:
step 1, obtaining intermediate frequency signals required by training and testing through a millimeter wave radar, and obtaining the intermediate frequency signals after mixing the transmitting signals with echo signals received by an receiving antenna.
In this embodiment, five actions are designed: running, boxing, playing basketball, dancing and jumping in place. The millimeter wave radar is placed at a distance of 1.5 meters from the identified object; after the electromagnetic wave signal is emitted through the transmitting antenna, the signal reflected by the monitored target and the background environment is received by the receiving antenna, realizing non-contact monitoring.
The linear frequency-modulated continuous wave signal, also called a chirp signal, transmitted by the millimeter wave radar transmitting antenna at time t is:

    s_{tx}(t) = A_{tx} \exp\left[ j 2\pi \left( f_c t + \frac{B}{2 T_c} t^2 \right) \right]

where A_tx is the transmit signal amplitude, f_c is the carrier frequency, B is the bandwidth, and T_c is the sweep period. In this embodiment the carrier frequency f_c is 77 GHz, the bandwidth B = 1.6 GHz, and the sweep period T_c = 40.96 μs.
The echo signal received by the receiving antenna is the transmitted signal delayed by the time τ_d:

    s_{rx}(t) = A_{rx} \exp\left[ j 2\pi \left( f_c (t - \tau_d) + \frac{B}{2 T_c} (t - \tau_d)^2 \right) \right]

The delay time τ_d can be expressed as:

    \tau_d = \frac{2d}{c}

where d is the distance between the target and the radar and c is the speed of light.
The echo signal and the transmitted signal are then mixed to obtain the intermediate frequency signal, where mixing refers to conjugate multiplication of the echo signal and the transmitted signal. The intermediate frequency signal has the form:

    s_{IF}(t) = A_{IF} \exp\left[ j \left( 2\pi f_b t + \phi_b \right) \right]

where A_IF is the amplitude of the intermediate frequency signal, f_b = (B / T_c) τ_d is the frequency of the intermediate frequency signal, and φ_b = 2π f_c τ_d is the phase of the intermediate frequency signal.
In practice, signal processing is usually done in the digital domain, which requires sampling the signal. For multi-cycle chirp signals, f_b and φ_b are related to the time interval between chirps. In this embodiment, the sampling rate of the millimeter wave radar system is f_s = 6.25 MHz, and the discrete sampling form of the intermediate frequency signal is:

    s_{IF}(m, n) = A_{IF} \exp\left[ j \left( 2\pi f_b \frac{m}{f_s} + \phi_b(n) \right) \right]

where m is the fast-time sampling point, characterizing the distance-dimension information of the signal, and n is the slow-time sampling point, characterizing the Doppler information of the signal. In this embodiment, the number of fast-time sampling points is 256 and the number of slow-time sampling points is 100.
Step 2, performing distance dimension Fourier transform on the intermediate frequency signal to obtain a time-distance image, and summing the distance unit data of the target along the distance dimension to obtain a one-dimensional distance spectrum peak; and carrying out short-time Fourier transform on the one-dimensional distance spectrum peak to obtain a time-frequency image, and labeling the two images with labels of corresponding types.
Step 2.1. Perform a distance-dimension Fourier transform on the sampled intermediate frequency signal to obtain the time-distance image:

    R(k, n) = \sum_{m=0}^{M-1} s_{IF}(m, n) \, e^{-j 2\pi k m / M}

where the sum implements the Fourier transform over the fast-time dimension, M is the number of fast-time sampling points, and k is the distance-dimension sampling point after the fast-time Fourier transform.
Step 2.2. Sum the target's distance-unit data to obtain the one-dimensional distance spectrum peak, and perform a short-time Fourier transform to obtain the time-frequency image:

    D(p, l) = \mathrm{STFT}\left\{ \sum_{k=k_0}^{k_1} R(k, n) \right\}

where STFT denotes the short-time Fourier transform, p denotes the time-dimension sampling point after the short-time Fourier transform, l denotes the Doppler-dimension sampling point after the short-time Fourier transform, k_0 denotes the starting distance unit crossed by the motion trajectory of the target to be detected, and k_1 denotes the distance unit at which the motion trajectory of the target to be detected ends.
Step 2.3. Label the time-distance images and time-frequency images corresponding to the various actions with their respective labels, and store them in different folders by class.
Step 3. Establish an improved three-channel deep learning neural network model, combining CNN and LSTM networks in each of the three channels: the first and second channels take the time-frequency image as input and extract features in the convolution layer with convolution kernels of different sizes, while the third channel takes the time-distance image as input. Input the data into the model as required, train it, obtain the optimal model parameters, and save them.
In step 3, an improved three-channel deep learning neural network model is established, and the postures are classified by combining CNN and LSTM, as follows:
and 3.1, constructing a three-channel network, and taking the time-distance image generated in the step 2 as the input of the first channel and the second channel and the time-distance image as the input of the third channel.
Step 3.2. In the first and third channels, feature extraction is first performed in the convolution layer with a convolution kernel of size a×a, using zero padding to ensure that image edge information is not lost; an average pooling layer is then used to reduce the amount of parameter computation, and the feature map is converted into sequence data and fed into the LSTM network. In the second channel, features are extracted in the convolution layer with a convolution kernel of size b×b, where b is larger than a, because the time-frequency images transformed from different posture signals have clearly different peaks, and different convolution kernel sizes extract features of different granularity.
Step 3.3. Fuse the feature maps of the three channels using the concatenate() method in the keras library, apply a nonlinear operation to the feature maps with the ReLU function, and classify through the fully connected layer with the softmax function. The softmax function converts a set of numbers into a probability distribution, mapping each original value to a probability between 0 and 1:

    \mathrm{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}

where x_i is the original value to be converted, n is the number of categories, and j indexes the categories.
Step 3.4. Compile the model, using the categorical cross-entropy function as the loss function to compute the error between the true labels and the predicted outputs and obtain the model accuracy. The categorical cross-entropy function is:

    L = -\sum_{i=1}^{n} y_i \log(\hat{y}_i)

where y_i is the true label of category i and ŷ_i is the predicted probability of category i.
in this embodiment, a convolution kernel with a size of 3×3 may be adopted to perform feature extraction in the first channel, and a convolution kernel with a size of 5×5 may be adopted to extract features in the second channel, where the specific structure of the neural network is shown in fig. 2. The data set is used for 600 pieces, wherein each type of image is 100 pieces, and is divided into 480 pieces of training set and 120 pieces of verification set according to the proportion. The training runs were 20 runs, the number of samples was 5, and examples of time-frequency images and time-distance images for five different poses are shown in fig. 3. The network training results and test results are shown in fig. 4.
Table 1 shows the network structure in a specific embodiment.

Table 2 shows the comparison results of different methods:

    Method name                       Classification accuracy
    Single-class feature input CNN    91.24%
    Single-class feature input LSTM   89.9%
    The method of the invention       93.94%
In conclusion, the method uses the reflected signals received by the radar; after time-frequency and time-distance images are obtained through Fourier transforms, the human body postures are recognized and classified with the combined deep learning neural network, which effectively fuses features, exploits time-series information, and improves the accuracy of human body posture recognition by millimeter wave radar.
The foregoing is illustrative of the methods and structures of the present invention, and modifications and substitutions of the specific embodiments described herein will be apparent to those of ordinary skill in the art without departing from the invention or the scope of the appended claims.

Claims (8)

1. The human body posture recognition method based on the combination of CNN and LSTM is characterized by comprising the following steps:
collecting radar reflection signals of different targets, wherein each target realizes a plurality of different postures;
obtaining a time-distance image and a time-frequency image according to the reflected signals, and respectively labeling the time-distance image and the time-frequency image corresponding to various gestures with corresponding labels;
establishing a deep learning neural network model comprising a feature extraction layer, a feature fusion layer and a full-connection layer, wherein the feature extraction layer comprises three channels, each channel comprises a CNN network and an LSTM network, the input of a first channel and a second channel is a time-frequency image, the input of a third channel is a time-distance image, the feature images output by the channels are input into the feature fusion layer for feature fusion, and the classification result is output at the full-connection layer;
training the deep learning neural network model;
and inputting the image to be identified into a trained deep learning neural network model, and outputting the corresponding human body posture type.
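The three-channel feature extraction and fusion of claim 1 can be illustrated with a minimal numpy sketch. The hand-rolled convolution and LSTM cell, the image sizes, kernel sizes (3×3 and 5×5, following claims 6-7), hidden width, and number of posture classes are all illustrative assumptions, not the patent's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, k):
    """Valid 2-D convolution with ReLU: the CNN stand-in for one channel."""
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return np.maximum(out, 0.0)

def lstm_last_hidden(seq, Wx, Wh, b, hidden):
    """Minimal LSTM over a sequence of row vectors; returns final hidden state."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in seq:
        z = x @ Wx + h @ Wh + b
        i, f, o, g = np.split(z, 4)
        sig = lambda v: 1.0 / (1.0 + np.exp(-v))
        c = sig(f) * c + sig(i) * np.tanh(g)
        h = sig(o) * np.tanh(c)
    return h

def channel(img, kernel, Wx, Wh, b, hidden):
    """CNN feature map, then each row is fed to the LSTM as one time step."""
    fmap = conv2d(img, kernel)
    return lstm_last_hidden(fmap, Wx, Wh, b, hidden)

hidden = 8
tf_img = rng.standard_normal((16, 16))  # time-frequency image (channels 1 and 2)
td_img = rng.standard_normal((16, 16))  # time-distance image (channel 3)

feats = []
for img, ksize in ((tf_img, 3), (tf_img, 5), (td_img, 3)):  # claim 7: b > a
    k = rng.standard_normal((ksize, ksize))
    width = img.shape[1] - ksize + 1
    Wx = rng.standard_normal((width, 4 * hidden)) * 0.1
    Wh = rng.standard_normal((hidden, 4 * hidden)) * 0.1
    feats.append(channel(img, k, Wx, Wh, np.zeros(4 * hidden), hidden))

fused = np.concatenate(feats)                 # feature-fusion layer
W_fc = rng.standard_normal((fused.size, 4)) * 0.1  # 4 posture classes, assumed
logits = np.maximum(fused, 0.0) @ W_fc        # ReLU then fully connected layer
probs = np.exp(logits - logits.max())
probs /= probs.sum()                          # softmax classification
print(probs.shape)
```

In a practical implementation these stand-ins would be replaced by trained convolution and LSTM layers; the sketch only shows how the three per-channel feature vectors are concatenated before the fully connected classifier.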
2. The human body posture recognition method based on the combination of CNN and LSTM according to claim 1, wherein the reflected signal is an intermediate frequency signal obtained by mixing a transmitting signal with an echo signal received by a receiving antenna.
3. The human body posture recognition method based on the combination of CNN and LSTM according to claim 2, wherein the discrete sampling form of the intermediate frequency signal is:
s_IF(m, n) = A_IF · exp(j(2π f_b m / f_s + φ(n)))
wherein A_IF is the amplitude of the intermediate frequency signal, f_b is the frequency of the intermediate frequency signal, φ(n) is the phase of the intermediate frequency signal, m is the fast-time sampling point, representing the distance-dimension information of the signal; n is the slow-time sampling point, representing the Doppler information of the signal, and f_s is the radar system sampling rate.
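This IF-signal model can be simulated with a few lines of numpy. The exponential form and all numerical parameters (sampling rate, beat frequency, Doppler shift, chirp interval, matrix sizes) are illustrative assumptions rather than values from the patent:

```python
import numpy as np

# Assumed discrete IF-signal model (a common FMCW form):
#   s[m, n] = A_IF * exp(j * (2*pi*f_b*m/f_s + phi(n)))
# m: fast-time sample (range information), n: slow-time sample (Doppler).
A_IF = 1.0
f_s = 2e6        # radar system sampling rate, hypothetical value
f_b = 100e3      # beat (IF) frequency, hypothetical value
f_d = 50.0       # Doppler shift across chirps, hypothetical value
T_c = 1e-3       # chirp repetition interval, hypothetical value
M, N = 256, 64   # fast-time samples per chirp, number of chirps

m = np.arange(M)[:, None]          # fast-time axis (column vector)
n = np.arange(N)[None, :]          # slow-time axis (row vector)
phi = 2 * np.pi * f_d * n * T_c    # slow-time phase carries the Doppler
s_if = A_IF * np.exp(1j * (2 * np.pi * f_b * m / f_s + phi))
print(s_if.shape)
```

The resulting M×N matrix (fast time × slow time) is the raw input from which the time-distance and time-frequency images of claims 4 and 5 are computed.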
4. The human body posture recognition method based on the combination of CNN and LSTM according to claim 1, wherein the reflected signal is subjected to distance dimension Fourier transform to obtain a time-distance image.
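The range-dimension Fourier transform of claim 4 can be sketched with numpy. The target position (range bin 20) and matrix sizes are illustrative:

```python
import numpy as np

# Fast-time x slow-time IF matrix with one stationary target whose beat
# frequency falls exactly in range bin 20 (illustrative parameters).
M, N = 128, 32
m = np.arange(M)[:, None]
s_if = np.exp(2j * np.pi * 20 * m / M) * np.ones((1, N))

# Claim 4: FFT along the distance (fast-time) dimension gives one range
# spectrum per chirp; stacking the chirps yields the time-distance image.
time_distance = np.abs(np.fft.fft(s_if, axis=0))  # shape: (range bins, time)
peak_bin = int(np.argmax(time_distance[:, 0]))
print(time_distance.shape, peak_bin)
```

A moving target would trace a curve through this image as its range bin drifts over slow time.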
5. The human body posture recognition method based on the combination of CNN and LSTM according to claim 1, wherein the distance unit data of the target is summed along the distance dimension to obtain a one-dimensional distance spectrum peak; and carrying out short-time Fourier transform on the one-dimensional distance spectrum peak to obtain a time-frequency image.
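Claim 5's pipeline (summing the target's range cells along the distance dimension, then applying a short-time Fourier transform) can be sketched as follows, with a synthetic tone standing in for the target's micro-Doppler; the sampling rate, tone frequency, and window length are illustrative assumptions:

```python
import numpy as np
from scipy.signal import stft

# Slow-time signal: a few range cells containing the target, here modeled as
# a single 125 Hz micro-Doppler tone (illustrative parameters).
fs_slow = 1000.0                 # chirp rate = slow-time sampling rate
t = np.arange(2048) / fs_slow
range_cells = np.stack([np.exp(2j * np.pi * 125.0 * t) for _ in range(4)])
one_d = range_cells.sum(axis=0)  # sum along the distance dimension

# Claim 5: the STFT of the 1-D sequence gives the time-frequency image.
f, _, Z = stft(one_d, fs=fs_slow, nperseg=128, return_onesided=False)
tf_image = np.abs(Z)
peak_freq = float(f[np.argmax(tf_image[:, tf_image.shape[1] // 2])])
print(tf_image.shape, peak_freq)
```

For a real posture, the tone frequency varies over time, so the resulting time-frequency image shows the characteristic micro-Doppler signature that the network classifies.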
6. The human body posture recognition method based on the combination of CNN and LSTM according to claim 1, wherein the convolution layers of the first channel and the second channel use convolution kernels of different sizes.
7. The method of claim 1, wherein the convolution layer of the first channel and the third channel uses a convolution kernel of a×a, and the convolution layer of the second channel uses a convolution kernel of b×b, wherein b > a.
8. The human body posture recognition method based on the combination of CNN and LSTM according to claim 1, wherein the three-channel feature maps are fused by using the concatenate() method in the keras library, the fused features are passed through a ReLU nonlinear activation, and then classified by a softmax function through a fully connected layer.
CN202310465077.5A 2023-04-26 2023-04-26 Human body posture recognition method based on CNN and LSTM combination Pending CN116524537A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310465077.5A CN116524537A (en) 2023-04-26 2023-04-26 Human body posture recognition method based on CNN and LSTM combination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310465077.5A CN116524537A (en) 2023-04-26 2023-04-26 Human body posture recognition method based on CNN and LSTM combination

Publications (1)

Publication Number Publication Date
CN116524537A true CN116524537A (en) 2023-08-01

Family

ID=87397031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310465077.5A Pending CN116524537A (en) 2023-04-26 2023-04-26 Human body posture recognition method based on CNN and LSTM combination

Country Status (1)

Country Link
CN (1) CN116524537A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117310646A (en) * 2023-11-27 2023-12-29 南昌大学 Lightweight human body posture recognition method and system based on indoor millimeter wave radar
CN117310646B (en) * 2023-11-27 2024-03-22 南昌大学 Lightweight human body posture recognition method and system based on indoor millimeter wave radar

Similar Documents

Publication Publication Date Title
CN107862705B (en) Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics
Angelov et al. Practical classification of different moving targets using automotive radar and deep neural networks
CN111399642B (en) Gesture recognition method and device, mobile terminal and storage medium
CN101223456B (en) Computer implemented method for identifying a moving object by using a statistical classifier
CN107358250B (en) Body gait recognition methods and system based on the fusion of two waveband radar micro-doppler
Liu et al. Deep learning and recognition of radar jamming based on CNN
Ahmed et al. Radar-based air-writing gesture recognition using a novel multistream CNN approach
CN111461037B (en) End-to-end gesture recognition method based on FMCW radar
CN110348288A (en) A kind of gesture identification method based on 77GHz MMW RADAR SIGNAL USING
CN111175718B (en) Automatic target recognition method and system for ground radar combining time-frequency domains
CN108680796A (en) Electromagnetic information leakage detecting system and method for computer display
WO2023029390A1 (en) Millimeter wave radar-based gesture detection and recognition method
CN113837131B (en) Multi-scale feature fusion gesture recognition method based on FMCW millimeter wave radar
Yang et al. Human motion serialization recognition with through-the-wall radar
US20230039196A1 (en) Small unmanned aerial systems detection and classification using multi-modal deep neural networks
CN113537417B (en) Target identification method and device based on radar, electronic equipment and storage medium
CN116524537A (en) Human body posture recognition method based on CNN and LSTM combination
Hendy et al. Deep learning approaches for air-writing using single UWB radar
CN115508821A (en) Multisource fuses unmanned aerial vehicle intelligent detection system
CN114781463A (en) Cross-scene robust indoor tumble wireless detection method and related equipment
CN111965620B (en) Gait feature extraction and identification method based on time-frequency analysis and deep neural network
CN114581958A (en) Static human body posture estimation method based on CSI signal arrival angle estimation
Martinez et al. Deep learning-based segmentation for the extraction of micro-doppler signatures
Tang et al. SAR deception jamming target recognition based on the shadow feature
CN114692679A (en) Meta-learning gesture recognition method based on frequency modulated continuous wave

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination