CN116184394A - Millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion


Info

Publication number: CN116184394A
Application number: CN202310018139.8A
Authority: CN (China)
Prior art keywords: radar, spectrogram, gesture recognition, signals, resolution
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 陈川 (Chen Chuan), 范孝冬 (Fan Xiaodong), 贾勇 (Jia Yong), 张葛祥 (Zhang Gexiang), 杨强 (Yang Qiang)
Current and original assignee: Chengdu University of Technology
Application filed by Chengdu University of Technology
Priority to CN202310018139.8A, published as CN116184394A

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 13/00: Systems using the reflection or reradiation of radio waves, e.g. radar systems; analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S 13/88: Radar or analogous systems specially adapted for specific applications
    • G01S 7/00: Details of systems according to groups G01S 13/00, G01S 15/00, G01S 17/00
    • G01S 7/02: Details of systems according to group G01S 13/00
    • G01S 7/41: using analysis of echo signal for target characterisation; target signature; target cross-section
    • G01S 7/411: Identification of targets based on measurements of radar reflectivity
    • G01S 7/417: involving the use of neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

In the millimeter wave radar gesture recognition method and system based on multi-domain spectrograms and multi-resolution fusion, radar echo signals of human gestures are processed by short-time Fourier transform, pulse compression, two-dimensional fast Fourier transform, and the minimum variance distortionless response beamforming algorithm to generate four types of radar spectrograms that carry different physical meanings and complementary characteristics. A gesture recognition network built from a two-dimensional convolutional network connected in series with a multi-resolution fusion module then recognizes human gesture motions. Because the network draws on radar feature spectrograms from four different domains, the method achieves a more comprehensive and sufficient multi-domain radar feature expression under limited data; the two-dimensional convolutional network provides strong feature extraction capability, the multi-resolution fusion module acquires multi-resolution feature tensors, and the resulting human gesture motion recognition rate is high.

Description

Millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion
Technical Field
The invention relates to the technical field of radar signal processing, in particular to a millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion.
Background
Gesture recognition is an important research direction in radar signal processing and its applications, and is widely used in smart home, intelligent driving, and smart healthcare. The global gesture recognition market was reported to reach 89.957 billion yuan (RMB) in 2021 and is expected to reach 305.106 billion yuan by 2027, a compound annual growth rate of 22.9%. Research on gesture recognition technology therefore has considerable economic value.
As a non-contact interaction mode, gesture recognition can control smart electrical appliances without any physical touch. Compared with a conventional keyboard or touch-screen interface, contact-based input inevitably wears the device and carries a risk of disease transmission. With gesture recognition, instructions can be entered without directly touching the device, reducing equipment wear while also limiting the spread of viruses and bacteria and lowering the risk of infection. Research on gesture recognition technology can therefore effectively extend equipment service life, reduce the risk of users contracting germs through contact, greatly improve the user experience, and protect users' physical and mental health.
Current gesture recognition techniques fall into two categories: contact and non-contact. Contact-based gesture recognition is mainly performed with wearable devices: attitude sensors such as accelerometers and gyroscopes installed in the device monitor the posture of the user's hand in real time, and recognition is performed on the collected posture data. This approach is immune to environmental influence and offers high recognition accuracy, but it requires the user to wear the sensing device at all times, limiting usage scenarios and portability and greatly degrading the user experience. Non-contact gesture recognition mainly relies on visual sensors, ultrasound, Wi-Fi, and similar equipment. Vision-based recognition is constrained by lighting conditions, unsuitable for extreme environments, and risks exposing private personal data. Ultrasonic sensors have limited detection range and cannot support long-distance gesture recognition. Wi-Fi suffers severe interference from environmental clutter, giving low recognition accuracy. Consequently, research in recent years has focused on gesture recognition based on non-visual radio-frequency sensing.
Therefore, a gesture recognition method with high recognition efficiency is required.
Disclosure of Invention
In view of the above, the present invention aims to provide a millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion, which uses radar signals to realize gesture recognition.
In order to achieve the above purpose, the present invention provides the following technical solutions:
the millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion provided by the invention comprises the following steps:
(1) Determining a target to be detected in a radar detection area;
(2) Collecting radar echo signals returned by targets, generating multiple types of radar spectrograms which contain different physical meanings and have complementary characteristics aiming at the same gesture of the same target, and constructing a data set according to different gestures of different targets;
(3) Designing a gesture recognition network that takes the multi-domain spectrograms as parallel inputs and performs spatio-temporal feature extraction and complementary multi-domain spectrogram feature fusion through a two-dimensional convolutional network connected in series with a multi-resolution fusion module; training and testing the network with the data set, and designing a feature extraction and gesture recognition module based on the network;
(4) Classifying the target gestures according to the input fusion characteristics in a characteristic extraction and gesture recognition module, and judging gesture actions of the target in real time; if the preset gesture is detected, the step (5) is entered; if the preset gesture is not detected, returning to the step (4); the preset gestures comprise 5 gestures including waving a hand leftwards, waving a hand rightwards, waving a hand upwards, waving a hand downwards and pushing the hand forwards.
(5) And sending the identified gesture information.
Further, the radar spectrograms in step (2) include frequency, distance, speed, and horizontal angle spectrograms.
Further, the step (2) specifically comprises:
(21) Preprocessing an original echo signal acquired by a radar;
(22) Performing short-time Fourier transform on the preprocessed signals, and stacking in a slow time dimension to obtain time-frequency characteristic expression of echo signals;
(23) Pulse compression processing is carried out on the preprocessed signals, stacking is carried out on the signals in a slow time dimension, and distance characteristic expression is obtained in a multi-receiving channel accumulation mode;
(24) Performing two-dimensional fast Fourier transform on the preprocessed signals, and sequentially stacking the signals in a slow time dimension to obtain a speed characteristic expression;
(25) And processing the preprocessed signals by using a minimum variance undistorted response beam forming algorithm, and sequentially stacking in a slow time dimension to obtain the horizontal angle characteristic expression.
Further, the step (21) specifically comprises:
Static clutter suppression is applied to the original echo signal. Because the phase of a moving target differs across time while static clutter does not, the current echo signal is cancelled against the echo signal delayed by the interval τ to obtain the target echo signal:

$$S[t]=S_R[t]-S_R[t-\tau]$$

where t is the slow-time index within the frequency modulation period and $S_R[t]$ is the radar echo signal at time t.
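The cancellation $S[t]=S_R[t]-S_R[t-\tau]$ can be sketched as a two-pulse canceller in NumPy. This is a minimal illustration on synthetic slow-time data; the signal values and pulse count are invented for demonstration, not taken from the patent.

```python
import numpy as np

def mti_cancel(echo: np.ndarray, tau: int = 1) -> np.ndarray:
    """Two-pulse canceller: S[t] = S_R[t] - S_R[t - tau].
    Static returns are identical from pulse to pulse and cancel exactly;
    a moving target's changing phase survives the subtraction."""
    return echo[tau:] - echo[:-tau]

# Synthetic slow-time sequence: constant clutter plus a target whose
# phase rotates a quarter turn per pulse (illustrative values).
pulses = np.arange(8)
clutter = np.full(8, 3.0 + 0j)
target = np.exp(1j * 0.5 * np.pi * pulses)
residue = mti_cancel(clutter + target)
```

The clutter term vanishes identically, while the moving-target term keeps a constant nonzero magnitude in the residue.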
Further, the step (22) specifically comprises:
The preprocessed signal is first segmented; each segment is windowed and a short-time Fourier transform is applied; the per-segment results are then stacked along the slow-time dimension to obtain the frequency (time-frequency) feature expression of the echo signal:

$$\mathrm{STFT}(t,f)=\sum_{\tau=t-\varepsilon}^{t+\varepsilon}S(\tau)\,W(\tau-t)\,e^{-j2\pi f\tau}$$

where S(t) is the target echo signal, W(·) is a window function, STFT(t, f) is the resulting short-time Fourier transform, and ε is the slice length around the analysis time point t.
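The segment-window-FFT-stack procedure of this step can be sketched with NumPy. This is a hedged sketch: the window length, hop size, and test chirp below are illustrative assumptions, not parameters from the patent.

```python
import numpy as np

def stft(signal: np.ndarray, win_len: int = 32, hop: int = 16) -> np.ndarray:
    """Short-time Fourier transform: window each segment, FFT it, and
    stack the spectra along slow time to form a time-frequency map."""
    window = np.hamming(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.fft(frames, axis=1)  # shape: (n_frames, win_len)

# A chirp-like tone whose instantaneous frequency rises over time, so the
# spectral peak should drift upward from the first frame to the last.
fs = 1000.0
t = np.arange(0, 1, 1 / fs)
chirp_like = np.sin(2 * np.pi * (50 + 100 * t) * t)
tf_map = np.abs(stft(chirp_like))
```

Stacking the per-segment spectra row by row is exactly the "stack in the slow time dimension" operation the step describes.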
Further, the step (23) specifically comprises:
The preprocessed signal is first split into its constituent complete chirp signals. Pulse compression is then applied to each chirp to obtain its single-chirp range profile; the range profiles of all chirps are stacked in time order along the slow-time dimension, and the distance feature expression of the echo signal is obtained by accumulating over multiple receive channels:

$$R_m(k,m)=\sum_{i=0}^{N_{ADC}-1}W_i\,S(N_{ADC}-i,\,m)\,e^{-j2\pi ki/N_{ADC}}$$

where $R_m(k,m)$ is the amplitude of the m-th chirp signal at point k; $W_i$ is a window function (a Hamming window); $N_{ADC}$ is the number of sampling points; and $S(N_{ADC}-i,\,m)$ is the value of the $(N_{ADC}-i)$-th sampling point of the m-th chirp signal.

Second, when the target is far from the radar, the echo signal energy may be low, so that the difference between the feature information and the background is inconspicuous; multi-channel data accumulation is therefore used, implemented as follows:

$$R_m=R_{m1}\odot R_{m2}\odot\cdots\odot R_{mN_{RX}}$$

where $\odot$ denotes the Hadamard (element-wise) product and $R_{mC}$ is the distance spectrum obtained from the C-th receive channel;

and finally the distance spectra of the different receive channels are combined to form the final distance feature expression.
Further, the step (24) specifically comprises:
The original signal is first windowed and a fast Fourier transform is applied along the fast-time dimension to obtain the range profile; the range profile is then windowed and a fast Fourier transform is applied along the slow-time dimension to obtain the velocity spectrum; finally, the velocity spectra are stacked along the slow-time dimension in time order to obtain the velocity feature expression:

$$V_m(k,s)=\sum_{i=0}^{N_C-1}W_i\,R_m(k,\,m-i)\,e^{-j2\pi si/N_C}$$

where $V_m(k,s)$ is the amplitude at point s of the k-th row of the distance spectrum, $R_m(k,\,m-i)$ is the amplitude in row k, column m−i of the frame, and $N_C$ is the number of chirp signals contained in each frame.
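The two windowed FFTs of this step (fast time for range, slow time for Doppler) can be sketched as a range-Doppler map in NumPy. A minimal sketch: the target's range and Doppler bins and the frame dimensions are illustrative assumptions.

```python
import numpy as np

def range_doppler(adc: np.ndarray) -> np.ndarray:
    """Two-dimensional FFT: a fast-time FFT per chirp gives range, then a
    slow-time FFT across chirps gives velocity (Doppler). `adc` has shape
    (n_chirps, n_adc); Hamming windows are applied on both axes."""
    n_chirps, n_adc = adc.shape
    ranged = np.fft.fft(adc * np.hamming(n_adc), axis=1)
    doppler = np.fft.fft(ranged * np.hamming(n_chirps)[:, None], axis=0)
    # Shift so zero Doppler sits at the center row of the map.
    return np.abs(np.fft.fftshift(doppler, axes=0))

# Synthetic target at range bin 10 with Doppler bin +4 (approaching).
n_chirps, n_adc = 32, 64
m = np.arange(n_chirps)[:, None]
i = np.arange(n_adc)[None, :]
adc = np.exp(2j * np.pi * (10 * i / n_adc + 4 * m / n_chirps))
rd_map = range_doppler(adc)
```

After the shift, Doppler bin +4 of the 32-point slow-time FFT lands at row 20, so the map peaks at (row 20, range bin 10).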
Further, the step (25) specifically comprises:
The original signal is first windowed and fast-Fourier-transformed along the fast-time dimension to obtain the range profile, which is then processed with the minimum variance distortionless response (MVDR) beamforming algorithm to obtain the distortionless output of the target azimuth signal:

$$A_{(\theta,m)}=\mathbf{a}(\theta)\,\mathbf{R}_{xm}\,\mathbf{a}^{H}(\theta)$$

where $A_{(\theta,m)}$ is the echo signal energy in direction θ for the m-th chirp signal, n is the number of array elements, and $\mathbf{R}_{xm}$ is the autocorrelation matrix of the distance spectrum of the m-th chirp signal, namely:

$$\mathbf{R}_{xm}=\mathrm{E}\!\left[\mathbf{x}_m\mathbf{x}_m^{H}\right]$$

and $\mathbf{a}(\theta)$ is the steering vector, namely:

$$\mathbf{a}(\theta)=\left[1,\;e^{-j2\pi d\sin\theta/\lambda},\;\ldots,\;e^{-j2\pi(n-1)d\sin\theta/\lambda}\right]^{T}$$
finally, stacking the horizontal angle spectrums in the slow time dimension according to the time sequence to obtain the horizontal angle characteristic expression.
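The angle estimation of this step can be sketched as a Capon (MVDR) spectrum over a simulated uniform linear array. This is a hedged sketch under stated assumptions: the array geometry, diagonal-loading factor, angle grid, and source scenario are invented for illustration, and the standard inverse-covariance Capon form used here may differ in normalisation from the patent's exact formulation.

```python
import numpy as np

def mvdr_spectrum(x: np.ndarray, n_elem: int, d_over_lambda: float = 0.5,
                  angles: np.ndarray = np.linspace(-90.0, 90.0, 181)):
    """Capon/MVDR angle spectrum for a uniform linear array.
    `x` has shape (n_elem, n_snapshots)."""
    R = x @ x.conj().T / x.shape[1]                       # sample covariance
    R = R + 1e-6 * np.trace(R).real / n_elem * np.eye(n_elem)  # diagonal loading
    Rinv = np.linalg.inv(R)
    p = np.empty(len(angles))
    for k, theta in enumerate(angles):
        a = np.exp(-2j * np.pi * d_over_lambda * np.arange(n_elem)
                   * np.sin(np.deg2rad(theta)))           # steering vector
        p[k] = 1.0 / np.real(a.conj() @ Rinv @ a)         # Capon power
    return angles, p

# One source at +20 degrees seen by a 4-element half-wavelength ULA.
rng = np.random.default_rng(0)
n_elem, n_snap = 4, 200
steer = np.exp(-2j * np.pi * 0.5 * np.arange(n_elem) * np.sin(np.deg2rad(20.0)))
s = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
x = steer[:, None] * s[None, :] + 0.05 * (
    rng.standard_normal((n_elem, n_snap)) + 1j * rng.standard_normal((n_elem, n_snap)))
angles, p = mvdr_spectrum(x, n_elem)
```

Running such a spectrum on the per-chirp range profiles, and stacking the results over slow time, yields the time-angle map the step describes.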
Further, the step (3) specifically comprises:
First, the multi-class feature spectrograms are uniformly reshaped into tensors of the same size and mapped into fixed-size feature tensors by a linear projection network. The resulting feature tensors are fed into a two-dimensional convolutional network structure to abstract the spectrogram features. The abstracted features are then passed into the multi-resolution fusion module to generate multi-resolution feature tensors; the multi-resolution feature tensors of the multi-domain spectrograms are fused to obtain complementary multi-domain fusion features, which are finally classified by logistic regression.
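The multi-resolution fusion idea can be sketched in NumPy as pooling each domain's feature map at several resolutions and concatenating the results across the four domains. This is only a minimal stand-in for the patent's module: in the actual network the maps come from a 2-D convolutional backbone, and the pooling scales, map size, and fusion-by-concatenation here are illustrative assumptions.

```python
import numpy as np

def avg_pool2d(x: np.ndarray, k: int) -> np.ndarray:
    """Non-overlapping k x k average pooling on an (H, W) map,
    assuming H and W are divisible by k."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def multi_resolution_features(fmap: np.ndarray, scales=(1, 2, 4)) -> np.ndarray:
    """Pool one spectrogram's feature map at several resolutions and
    concatenate the flattened results into one multi-resolution tensor."""
    return np.concatenate([avg_pool2d(fmap, s).ravel() for s in scales])

def fuse_domains(domain_maps) -> np.ndarray:
    """Concatenate the multi-resolution tensors of the four domain
    spectrograms (time-frequency, distance, speed, angle) into one
    complementary fusion feature for the classifier."""
    return np.concatenate([multi_resolution_features(f) for f in domain_maps])

rng = np.random.default_rng(1)
four_domains = [rng.standard_normal((8, 8)) for _ in range(4)]
fused = fuse_domains(four_domains)
# Per domain: 64 + 16 + 4 = 84 features; four domains give 336.
```

A logistic-regression head over `fused` would then play the role of the classifier in the final step.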
The millimeter wave radar gesture recognition system based on multi-domain spectrogram and multi-resolution fusion provided by the invention comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the method when executing the program.
The invention has the beneficial effects that:
according to the millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion, radar detection targets in a detection area are obtained, radar echo signals returned by the targets are collected, meanwhile, the radar echo signals are uploaded to a gesture action characterization module of the system, four types of radar spectrograms which contain different physical meanings and have complementary characteristics are generated aiming at the same gesture of the same target, the radar spectrograms comprise frequency spectrograms, distance spectrograms, speed spectrograms and horizontal angle spectrograms, and a data set is constructed according to different gestures of different targets; establishing a gesture recognition network, training and testing the network by utilizing a data set, and extracting and gesture recognition modules based on the network design characteristics; classifying the target gestures according to the input multi-resolution fusion characteristics in a characteristic extraction and gesture recognition module, and judging the current gesture actions of the target in real time; and the wireless data transmission module is used for sending gesture information to the upper computer, and the upper computer controls the external equipment to realize corresponding functions according to different gesture information so as to realize real-time gesture recognition.
In the method, four types of radar spectrograms containing different physical meanings and complementary features are generated by applying short-time Fourier transform, pulse compression, two-dimensional fast Fourier transform, and minimum variance distortionless response beamforming to the signals, and gesture recognition is realized by a network that performs spatio-temporal feature extraction and complementary multi-domain spectrogram feature fusion through a two-dimensional convolutional network in series with a multi-resolution fusion module.
The recognition network exploits the layer-by-layer abstraction of a CNN. It uniformly reshapes the feature spectrograms into tensors of the same size, maps them into fixed-size feature tensors with a linear projection network, feeds the resulting tensors into a two-dimensional convolutional structure to abstract the spectrogram features, and then passes the abstract features to the multi-resolution fusion module, whose fused multi-resolution feature tensors drive the classifier that recognizes the gesture motion. The combination of the multi-resolution fusion module and the two-dimensional convolutional module supports multi-resolution fusion of any single class or of different classes of feature expressions, greatly strengthening the network's feature fusion and generalization capability. The method therefore achieves a more comprehensive and sufficient multi-domain radar feature expression under limited data: the two-dimensional convolutional network offers strong feature extraction, the multi-resolution fusion module obtains multi-resolution feature tensors, and the human gesture motion recognition rate is high.
Using millimeter wave radar for gesture recognition protects user privacy, is unaffected by ambient light, and is not limited by the usage scenario. The multi-feature complementary representation generated from the radar multi-domain spectrograms achieves a more comprehensive and sufficient gesture feature expression under limited data. Multi-receive-channel accumulation improves the imaging quality of the radar spectrograms and thereby the gesture recognition rate. A composite neural network consisting of parallel two-dimensional convolutional networks and a multi-resolution module realizes multi-resolution fusion of the multi-domain radar spectrogram features: features are extracted from the different domains via multi-channel parallel input, and gesture features at different resolutions are fused by the multi-resolution fusion module, yielding better recognition capability than a plain two-dimensional convolutional network and effectively improving the network's generalization capability and recognition accuracy.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
In order to make the objects, technical solutions and advantageous effects of the present invention more clear, the present invention provides the following drawings for description:
FIG. 1 is a flow chart of a millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion;
FIG. 2 is a schematic diagram of a millimeter wave radar installation;
FIG. 3 is a schematic diagram of a radar multi-domain spectrogram characterization technique;
FIG. 4 is a diagram of a millimeter wave radar gesture recognition network based on multi-domain spectrogram and multi-resolution fusion;
FIG. 5 is a confusion matrix for testing the network, wherein waving left, waving right, waving up, waving down and pushing forward are denoted by A1, A2, A3, A4 and A5, respectively;
FIG. 6 is a block diagram of a non-line-of-sight gesture recognition system.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and specific examples, which are not intended to limit the invention, so that those skilled in the art may better understand the invention and practice it.
Example 1
As an environmental data acquisition sensor, radar effectively avoids invading user privacy and offers unique advantages such as strong penetration and high range, velocity, and angle resolution; it is particularly well suited to detecting and tracking human hand motion across wide-ranging application scenarios and has broad application prospects. Millimeter-wave-radar-based human gesture recognition comprises three main steps: feature expression, feature extraction, and classification. First, the millimeter wave radar echo signals are processed to generate radar images containing multi-dimensional feature information; next, the gesture feature information in those images is extracted manually or automatically by a neural network; finally, a classifier assigns gestures according to the feature information, recognizing and classifying the gesture actions.
In radar human gesture recognition, time-frequency spectrograms, distance spectrograms, speed spectrograms, and horizontal angle spectrograms are generated by methods such as time-frequency analysis, pulse compression, two-dimensional Fourier transform, and minimum variance distortionless beamforming, completing the multi-dimensional expression of gesture features. The time-frequency spectrogram can be regarded as a power spectrum sequence that changes over time, reflecting the Doppler features of the human gesture target and the positional parameters of its scattering centers. Different signal processing methods produce feature spectrograms with different physical meanings; for the same gesture, spectrograms in different domains differ in geometric detail and semantic representation, yet are highly complementary in feature expression. Existing research mostly uses a single class of spectrogram for feature expression. To fully mine and exploit the complementary features among radar spectrograms in different domains, the method uses the radar multi-domain spectrograms to express gesture features more fully and in more dimensions, thereby improving the recognition accuracy of human gesture actions.
Feature extraction in radar human gesture recognition is performed either manually or automatically by a neural network. Manual feature extraction marks effective information in the raw data through hand-designed extraction methods and uses it as the basis for gesture judgment; in practice, however, it demands extensive domain expertise from practitioners, makes it difficult to capture effective discriminative information from the raw radar spectrograms, and suffers from low efficiency and high operational complexity.
This embodiment realizes human gesture recognition with radar signal processing and deep learning. STFT, pulse compression, 2D-FFT, MVDR, and related methods extract the frequency, distance, speed, and horizontal angle characteristics of a target's different gestures, generating radar feature spectrograms in four different domains. Feature extraction, multi-resolution feature fusion, multi-domain feature fusion, and gesture classification are performed by a gesture recognition network in which spatio-temporal feature extraction and complementary multi-domain spectrogram feature fusion are realized by a two-dimensional convolutional network connected in series with a multi-resolution fusion module, finally achieving human gesture recognition. The method protects user privacy, is unaffected by ambient light, is not limited by the usage scenario, expresses features fully, and attains a good gesture recognition rate.
As shown in fig. 1, the present embodiment provides a millimeter wave radar non-line-of-sight human gesture recognition method based on multi-domain feature fusion, which includes the following steps:
(1) Installing millimeter wave radar in a detection area, wherein the radar coverage area is provided with a determined number of targets;
(2) The millimeter wave radar sensor monitors targets in the detection area in real time; the radar echo signals returned by the targets are collected and uploaded to the system's gesture action characterization module. For the same target and the same gesture, four types of radar spectrograms containing different physical meanings and complementary characteristics are generated, namely a combination of the time-frequency, time-distance, time-speed, and time-angle spectrograms, and a data set is constructed from different gestures of different targets.
This embodiment combines the four spectrograms (time-frequency, time-distance, time-speed, and time-angle), whose feature information lies in the time dimension and can represent the target's frequency, distance, speed, and angle at different moments. This overcomes the shortcoming of features without a time dimension, which cannot identify the target's specific distance and angle at each time point when those values coincide. In particular, with a range-angle spectrum alone, the final images of easily confused gestures can be very similar and thus indistinguishable, yet those gestures exhibit different characteristics in the time dimension. Likewise, range-Doppler alone only reveals that a target exists at a given distance, not how it moves; by stacking in the time dimension, it can be found that the target is moving forward, approaching the radar.
The four types of radar spectrograms carry independent features as well as complementary ones: when a single spectrogram cannot distinguish certain gestures, spectrograms from the other radar domains supplement the gesture features so that the expression is fuller and more comprehensive. For example, waving left and waving right look very similar in the distance spectrogram; adding the horizontal angle features as a further basis for judgment greatly improves the recognition rate of such confusable gestures, because the two gestures appear completely opposite in the horizontal angle map. Through fusion in the channel dimension, the features express the gesture more fully and comprehensively, improving gesture recognition accuracy.
(3) A gesture recognition network is constructed that takes the multi-domain spectrograms as parallel inputs and performs spatio-temporal feature extraction and complementary multi-domain spectrogram feature fusion through a two-dimensional convolutional network connected in series with a multi-resolution fusion module; the network is trained and tested with the data set, and a feature extraction and gesture recognition module is built on the gesture recognition network;
(4) Classifying the target gestures according to the input fusion characteristics in a characteristic extraction and gesture recognition module, and judging gesture actions of the target in real time; if the preset gesture is detected, the step (5) is entered; if the preset gesture is not detected, returning to the step (4); the preset gestures comprise 5 gestures including waving a hand leftwards, waving a hand rightwards, waving a hand upwards, waving a hand downwards and pushing the hand forwards.
(5) The wireless data transmission module sends the gesture information to the upper computer, and the upper computer controls the external equipment to realize the corresponding function according to the different gesture information, so as to realize real-time gesture recognition.
As shown in fig. 2, in the step (1), in order to achieve the best measuring effect, the millimeter wave radar is mounted on the wall at a height of 1.0 m to 2.0 m, with an included angle of about −25° to 25° between the radar boresight and the vertical direction; in this embodiment, an IWR6843ISK FMCW millimeter wave radar by Texas Instruments is preferably mounted 2.0 m from the ground and tilted downwards by 25°.
As shown in fig. 3, the step (2) specifically includes:
(21) Preprocessing an original echo signal acquired by a radar;
(22) Performing short-time Fourier transform on the preprocessed signals, and stacking in a slow time dimension to obtain time-frequency characteristic expression of echo signals;
(23) Pulse compression processing is carried out on the preprocessed signals, stacking is carried out on the signals in a slow time dimension, and distance characteristic expression is obtained in a multi-receiving channel accumulation mode;
(24) Performing two-dimensional fast Fourier transform on the preprocessed signals, and sequentially stacking the signals in the slow time dimension to obtain the speed characteristic expression;
(25) Processing the preprocessed signals by using a minimum variance undistorted response beam forming algorithm, and sequentially stacking in the slow time dimension to obtain the horizontal angle characteristic expression.
The step (21) specifically comprises the following steps:
static clutter suppression is carried out on the original echo signal by adopting a moving target indication (MTI) algorithm: since the phases of a moving target in the echo signals differ at different times, the current echo signal is cancelled against the echo signal delayed by the interval τ, so that the target echo signal is obtained; the calculation formula is as follows:
S[t] = S_R[t] − S_R[t − τ]
wherein t represents the slow time index within the frequency modulation period, and S_R[t] is the radar echo signal at time t;
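The two-pulse cancellation of step (21) can be sketched in NumPy as follows; the array shapes and the toy static/moving returns are illustrative assumptions, not part of the disclosure:

```python
import numpy as np

def mti_cancel(echo, lag=1):
    """Two-pulse canceller along slow time: S[t] = S_R[t] - S_R[t - lag].

    echo: complex array of shape (n_chirps, n_samples); rows are slow time.
    Static clutter has identical phase chirp-to-chirp and cancels; a moving
    target changes phase over the lag and survives.
    """
    return echo[lag:] - echo[:-lag]

# Toy check: a constant (static) return cancels, a phase-rotating one does not.
static = np.ones((8, 4), dtype=complex)
moving = np.exp(1j * np.pi / 4 * np.arange(8))[:, None] * np.ones((8, 4))
print(np.allclose(mti_cancel(static), 0))      # static clutter removed
print(not np.allclose(mti_cancel(moving), 0))  # moving target preserved
```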
the step (22) specifically comprises the following steps:
firstly segmenting the preprocessed signals, windowing each segment of signals, performing short-time Fourier transform on the windowed data, and finally stacking the processed results of each segment of signals on a slow time dimension to obtain the frequency characteristic expression of the echo signals, wherein the frequency characteristic expression can be obtained through the following calculation formula:
STFT(t, f) = ∫_{t−ε}^{t+ε} S(τ) W(τ − t) e^{−j2πfτ} dτ
where S (t) is the target echo signal, W (·) is a window function, STFT (t, f) is the obtained short-time fourier transform result, and epsilon represents the slice length around the analysis time point t.
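The segment, window, transform, and stack procedure of step (22) can be sketched as below; the chirp-like test signal, segment length, and hop size are assumed values, not taken from the text:

```python
import numpy as np

def stft_map(sig, nperseg=128, hop=32):
    """Segment the signal, window each segment (Hamming here), FFT each
    segment, and stack the results along slow time, as in step (22)."""
    win = np.hamming(nperseg)
    n_seg = (len(sig) - nperseg) // hop + 1
    cols = [np.fft.fft(win * sig[k * hop:k * hop + nperseg])
            for k in range(n_seg)]
    return np.abs(np.stack(cols, axis=1))   # (frequency bins, time segments)

# Hypothetical micro-Doppler-like return whose frequency drifts over time.
fs = 1000.0
t = np.arange(0, 1.0, 1 / fs)
sig = np.exp(1j * 2 * np.pi * (50 * t + 50 * t ** 2))
tf = stft_map(sig)
print(tf.shape)   # -> (128, 28)
```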
The step (23) specifically comprises the following steps:
firstly splitting the preprocessed signals, namely splitting the echo signals into a plurality of complete chirp signals, then carrying out pulse compression processing on each chirp signal to obtain distance images of single chirp signals, and finally stacking the distance images of each chirp signal in a slow time dimension according to time to obtain the distance characteristic expression of the echo signals, wherein the calculation method comprises the following steps:
R_m(k, m) = |Σ_{i=1}^{N_ADC} W_i · S(N_ADC − i, m) · e^{−j2πki/N_ADC}|
wherein R_m(k, m) represents the amplitude of the m-th chirp signal at point k; W_i represents a window function, the window function adopted by the method being a Hamming window; N_ADC represents the number of sampling points; and S(N_ADC − i, m) represents the value of the (N_ADC − i)-th sampling point of the m-th chirp signal.
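A minimal sketch of the pulse compression of a single chirp via a Hamming-windowed fast-time FFT; the beat-tone test signal and N_ADC = 256 are assumptions for illustration:

```python
import numpy as np

def range_profile(chirp_samples):
    """Pulse compression of one FMCW chirp via a windowed FFT over fast time.

    chirp_samples: complex ADC samples of a single chirp (length N_ADC).
    Returns the magnitude range profile; range profiles of successive chirps
    are then stacked along slow time to form the distance spectrogram.
    """
    n = len(chirp_samples)
    win = np.hamming(n)                      # Hamming window, as in the text
    return np.abs(np.fft.fft(win * chirp_samples))

# A beat tone at FFT bin 12 should peak at range bin 12.
n_adc = 256
i = np.arange(n_adc)
beat = np.exp(1j * 2 * np.pi * 12 * i / n_adc)
profile = range_profile(beat)
print(int(np.argmax(profile)))   # -> 12
```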
Secondly, if the target is far away from the radar, the echo signal energy may be low, so that the difference between the characteristic information and the background is not obvious. In order to enhance the target characteristic information, the data of the multiple receive channels are combined; the specific implementation is as follows:
R_m = R_m1 ⊙ R_m2 ⊙ … ⊙ R_mC
wherein ⊙ represents the Hadamard product, R_mC represents the distance spectrum obtained from the C-th receive channel, and the product is taken over all of the receive channels.
Finally, the distance spectra of these different receive channels are accumulated to form a final distance feature representation.
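Because the text mentions both a Hadamard product and accumulation across receive channels, the sketch below shows both readings of the multi-channel enhancement; the tensor shapes and random spectra are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-channel range spectra: (n_channels, n_chirps, n_range_bins)
n_ch, n_chirps, n_bins = 4, 32, 64
spectra = np.abs(rng.standard_normal((n_ch, n_chirps, n_bins))) + 1e-3

# Element-wise (Hadamard) combination across receive channels, which
# reinforces bins where every channel sees the target:
hadamard = np.prod(spectra, axis=0)

# Plain accumulation (sum) across channels, the other reading of the text:
accumulated = np.sum(spectra, axis=0)
print(hadamard.shape, accumulated.shape)
```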
The step (24) specifically comprises the following steps:
firstly, windowing an original signal, performing fast Fourier transform on the original signal in a fast time dimension to obtain a range profile, then, windowing the range profile, performing fast Fourier transform on the range profile in a slow time dimension to obtain a velocity spectrum, and finally, stacking the velocity spectrums in the slow time dimension according to a time sequence to obtain a velocity characteristic expression, wherein the calculation method comprises the following steps:
V_m(k, s) = |Σ_{i=1}^{N_C} W_i · R_m(k, m − i) · e^{−j2πsi/N_C}|
wherein V_m(k, s) represents the amplitude at point s of the k-th row of the distance spectrum, R_m(k, m − i) represents the amplitude of the k-th row, (m − i)-th column of the f-th frame, and N_C is the number of chirp signals contained in each frame.
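The two-dimensional FFT of step (24) can be sketched as a windowed range FFT followed by a windowed Doppler FFT; the frame size and the synthetic target at range bin 20 and Doppler bin 8 are assumptions for illustration:

```python
import numpy as np

def range_doppler(frame):
    """Two-dimensional FFT: fast-time FFT gives range, slow-time FFT gives
    velocity, with Hamming windows in both dimensions.

    frame: complex array (n_chirps, n_adc) of one radar frame.
    """
    n_chirps, n_adc = frame.shape
    win_fast = np.hamming(n_adc)
    win_slow = np.hamming(n_chirps)
    rp = np.fft.fft(frame * win_fast[None, :], axis=1)   # range profiles
    rd = np.fft.fft(rp * win_slow[:, None], axis=0)      # velocity spectrum
    return np.fft.fftshift(np.abs(rd), axes=0)           # center zero Doppler

n_chirps, n_adc = 64, 128
c = np.arange(n_chirps)[:, None]
s = np.arange(n_adc)[None, :]
# Synthetic target at range bin 20, Doppler bin +8.
frame = np.exp(1j * 2 * np.pi * (20 * s / n_adc + 8 * c / n_chirps))
rd = range_doppler(frame)
dop, rng_bin = np.unravel_index(np.argmax(rd), rd.shape)
print(dop - n_chirps // 2, rng_bin)   # -> 8 20
```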
The step (25) specifically comprises the following steps:
firstly, windowing an original signal, obtaining a range profile after fast Fourier transformation in a fast time dimension, and processing the range profile by using a minimum variance undistorted response beam forming algorithm to obtain undistorted output of a target azimuth signal, wherein the calculation method comprises the following steps:
A(θ, m) = a(θ) R_xm a′(θ)
wherein A(θ, m) represents the echo signal energy at angle θ in the m-th chirp signal, n is the number of array elements, and R_xm represents the autocorrelation matrix of the distance spectrum of the m-th chirp signal, that is:
R_xm = E[x_m x_m^H]
where x_m is the distance-spectrum snapshot vector of the m-th chirp signal over the n array elements and (·)^H denotes the conjugate transpose;
a (θ) is a steering vector, namely:
a(θ) = [1, e^{−j2πd sin θ/λ}, …, e^{−j2π(n−1)d sin θ/λ}]^T
where d is the array element spacing and λ is the radar wavelength.
finally, stacking the horizontal angle spectrograms in the slow time dimension according to the time sequence to obtain the horizontal angle characteristic expression.
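A sketch of minimum variance distortionless response (MVDR, Capon) angle estimation for step (25); the textbook spectrum P(θ) = 1 / (a^H R⁻¹ a) is used here, and the half-wavelength array, diagonal loading, and single source at +20° are standard toy assumptions, not values from the text:

```python
import numpy as np

def mvdr_spectrum(x, n_angles=181, d=0.5):
    """Capon/MVDR spatial spectrum P(theta) = 1 / (a^H R^-1 a).

    x: snapshots, shape (n_elements, n_snapshots), from a uniform linear
    array with element spacing d in wavelengths.
    """
    n_el, n_snap = x.shape
    R = x @ x.conj().T / n_snap                            # sample autocorrelation
    R += 1e-3 * np.trace(R).real / n_el * np.eye(n_el)     # diagonal loading
    Rinv = np.linalg.inv(R)
    thetas = np.linspace(-90, 90, n_angles)
    p = np.empty(n_angles)
    for k, th in enumerate(thetas):
        a = np.exp(-1j * 2 * np.pi * d * np.arange(n_el)
                   * np.sin(np.radians(th)))
        p[k] = 1.0 / np.real(a.conj() @ Rinv @ a)
    return thetas, p

# One source at +20 degrees on a 4-element half-wavelength ULA, light noise.
rng = np.random.default_rng(1)
n_el, n_snap = 4, 200
a_src = np.exp(-1j * np.pi * np.arange(n_el) * np.sin(np.radians(20)))
sig = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
noise = 0.05 * (rng.standard_normal((n_el, n_snap))
                + 1j * rng.standard_normal((n_el, n_snap)))
x = a_src[:, None] * sig[None, :] + noise
thetas, p = mvdr_spectrum(x)
print(thetas[np.argmax(p)])   # peaks near +20 degrees
```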
The step (3) specifically comprises the following steps:
as shown in fig. 4, fig. 4 is a millimeter wave radar gesture recognition network structure based on multi-domain spectrogram and multi-resolution fusion; the gesture recognition network comprises a linear projection layer, a feature abstraction layer, a multi-resolution feature fusion layer, a semantic information extraction layer, a complementary feature fusion layer and a classifier;
the linear projection layer is used for mapping the multi-domain feature spectrogram into a feature tensor with a fixed size;
the characteristic abstraction layer is used for carrying out characteristic abstraction on the radar multi-domain spectrogram;
the multi-resolution feature fusion layer is used for fusing feature tensors under different resolutions;
the semantic information extraction layer is used for acquiring deeper semantic information, reducing the size of the feature map and improving the running speed of the system;
the complementary feature fusion layer is used for fusing radar feature tensors of different domains;
the classifier is used for classifying the fusion characteristics to obtain classification results;
the gesture recognition network provided by the embodiment adopts a parallel two-dimensional convolution network module and a multi-resolution fusion mechanism to perform feature extraction and fusion, and aims at a radar multi-domain feature spectrogram, wherein the multi-domain feature spectrogram comprises a time-frequency spectrogram, a time-distance spectrogram, a time-speed spectrogram and a time-angle spectrogram, the multi-domain feature spectrogram is mapped into a feature tensor with fixed size by using a linear projection layer, the obtained H multiplied by W multiplied by C feature tensor is input into a feature abstraction layer to realize feature abstraction of the radar feature spectrogram, the multi-resolution feature fusion layer performs multi-resolution fusion on the feature tensor under different resolutions to obtain multi-resolution fusion features, and then the multi-resolution features are input into a semantic information extraction layer.
In this embodiment, deeper semantic information is obtained through alternating convolution and normalization; this operation also reduces the size of the feature map, which increases the running speed of the system and ensures the real-time performance of the recognition system. Finally, the complementary feature fusion layer performs cross fusion of each domain's features with the features of the other domains, that is, the features of the different domains are stacked in the channel dimension, so that the fused features contain different physical meanings and express the gesture features more comprehensively.
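The channel-dimension stacking performed by the complementary feature fusion layer can be sketched as follows; the H×W×C branch-output shapes are illustrative assumptions, not values from the text:

```python
import numpy as np

# Hypothetical per-domain feature tensors (H x W x C each) produced by the
# four parallel branches; the shapes are illustrative only.
H, W, C = 7, 7, 64
tf_feat  = np.random.rand(H, W, C)   # time-frequency branch
rng_feat = np.random.rand(H, W, C)   # time-distance branch
vel_feat = np.random.rand(H, W, C)   # time-velocity branch
ang_feat = np.random.rand(H, W, C)   # time-angle branch

# Complementary-feature fusion: stack along the channel dimension so the
# fused tensor carries all four physical meanings at once.
fused = np.concatenate([tf_feat, rng_feat, vel_feat, ang_feat], axis=-1)
print(fused.shape)   # -> (7, 7, 256)
```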
And inputting the fusion characteristics into a logistic regression model for classification, and finally realizing human gesture recognition.
In the network structure of fig. 4, the linear projection module performs dimension reduction of the high-dimensional features, the two-dimensional convolution network of the feature abstraction module is responsible for feature abstraction of the radar multi-domain spectrograms, and the multi-resolution feature fusion layer fuses the feature tensors at different resolutions. Compared with the semantic features of a single-resolution feature tensor, the feature tensor processed by the multi-resolution feature fusion module contains not only the spatial texture features of the gestures but also deeper semantic features, so the recognition rate of easily confused gestures can be effectively improved.
The complementary feature fusion layer is used for cross-fusing the multi-domain spectrogram features, and has the functions of fusing radar feature tensors of different domains, and finally classifying the fused features through a classifier to obtain a classification result, so that human gesture recognition is realized.
In the prior art, a single type of spectrogram is often used for gesture recognition; however, this causes recognition confusion when classifying confusable gestures whose characteristics are very similar. For example, when a hand is waved leftwards and when a hand is waved rightwards, the characteristics of the two types of gestures in the distance spectrogram are very similar. If the horizontal angle characteristics are added as a further basis for recognition, the recognition rate of such confusable gestures can be greatly improved, because the characteristics of the two types of gestures in the horizontal angle spectrogram are completely opposite.
As shown in fig. 5, fig. 5 is the confusion matrix obtained by testing the network. 23040 radar spectrograms were used as training data for the network, covering 5 gestures: waving the hand leftwards, waving the hand rightwards, waving the hand upwards, waving the hand downwards and pushing the hand forwards. Then 5760 radar spectrograms were used to test the network and obtain the confusion matrix, wherein the ordinate represents the network recognition result and the abscissa represents the actual gesture category; A1, A2, A3, A4 and A5 respectively represent waving leftwards, waving rightwards, waving upwards, waving downwards and pushing forwards. It can be seen from the confusion matrix that the recognition rate of each gesture exceeds 90%, and the overall recognition rate reaches 96.0%.
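The per-gesture and overall recognition rates read from a confusion matrix like fig. 5 can be computed as below; the tiny label vectors are stand-ins for the actual 5760-sample test set:

```python
import numpy as np

# A1..A5: wave left, wave right, wave up, wave down, push forward.
classes = ["A1", "A2", "A3", "A4", "A5"]
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
y_pred = np.array([0, 0, 1, 0, 2, 2, 3, 3, 4, 4])   # one A2 confused as A1

cm = np.zeros((5, 5), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[p, t] += 1     # ordinate: predicted class, abscissa: actual (as in fig. 5)

per_class = cm.diagonal() / cm.sum(axis=0)   # recognition rate per gesture
overall = cm.diagonal().sum() / cm.sum()     # overall recognition rate
print(per_class, overall)                    # A2 drops to 0.5, overall 0.9
```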
Example 2
The millimeter wave radar gesture recognition system based on multi-domain spectrogram and multi-resolution fusion provided by the embodiment comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the method when executing the program.
As shown in fig. 6, fig. 6 is a block diagram of a non-line-of-sight human gesture recognition system, which comprises a millimeter wave radar data acquisition module, a gesture recognition network, a wireless data transmission module and an upper computer platform;
the millimeter wave radar data acquisition module is used for acquiring original radar echo data of a target;
the gesture recognition network is used for fusing multiple types of radar spectrograms to obtain fusion characteristics and classifying target gestures according to the fusion characteristics; the gesture recognition network comprises a gesture characterization module and a feature extraction and gesture recognition module;
the gesture characterization module is used for forming a radar multi-domain spectrogram of the target gesture;
the feature extraction and gesture recognition module is used for extracting and fusing multi-domain spectrogram features and multi-resolution features of the target and classifying the target gestures so as to realize human gesture recognition;
the wireless data transmission module is used for transmitting data and instructions between the radar module and the upper computer platform for communication;
the upper computer platform is used for displaying gesture recognition results and controlling external equipment to achieve corresponding functions when corresponding gesture actions occur, so that users can conveniently achieve intelligent control of the equipment without contact.
The gesture characterization module in this embodiment generates, for the same gesture of the same target, four types of radar spectrograms which contain different physical meanings and have complementary features, namely a time-frequency spectrogram, a time-distance spectrogram, a time-speed spectrogram and a time-angle spectrogram, and constructs a data set according to different gestures of different targets. The four radar spectrograms each have independent characteristics and also contain complementary characteristics: when a single spectrogram cannot distinguish certain types of gestures, the radar spectrograms of the other radar domains are used to supplement the gesture characteristics, so that the gesture characteristic expression is completed more fully and comprehensively and the gesture recognition precision is improved.
The feature extraction and gesture recognition module of this embodiment comprises a linear projection layer, a feature abstraction layer, a multi-resolution feature fusion layer, a semantic information extraction layer, a complementary feature fusion layer and a classifier. Parallel two-dimensional convolution network modules and a multi-resolution fusion mechanism perform feature extraction and fusion on the radar multi-domain feature spectrograms: the linear projection layer maps each multi-domain feature spectrogram into a feature tensor of fixed size; the obtained H×W×C feature tensor is input into the feature abstraction layer to realize feature abstraction of the radar feature spectrogram; the multi-resolution feature fusion layer performs multi-resolution fusion on the feature tensors at different resolutions to obtain multi-resolution fusion features, which are then input into the semantic information extraction layer.
The above-described embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present invention, and are intended to be within the scope of the present invention. The protection scope of the invention is subject to the claims.

Claims (10)

1. A millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion, characterized by comprising the following steps:
(1) Determining a target to be detected in a radar detection area;
(2) Collecting radar echo signals returned by targets, generating multiple types of radar spectrograms aiming at the same target and the same gesture, wherein the radar spectrograms have different physical meanings and complementary characteristics, and constructing a data set according to different gestures of different targets;
(3) The method comprises the steps of constructing a gesture recognition network, wherein the gesture recognition network is a composite neural network formed by a parallel two-dimensional convolution network and a multi-resolution module and is used for fusing multiple radar spectrograms to obtain fusion characteristics and classifying target gestures according to the fusion characteristics, and training and testing the gesture recognition network by utilizing a data set;
(4) Inputting the fusion characteristics into a gesture recognition network to classify target gestures and judging gesture actions of the targets; if the preset gesture is detected, the step (5) is entered; if the preset gesture is not detected, returning to the step (4); the preset gestures comprise left hand waving, right hand waving, upward hand waving, downward hand waving and forward pushing gestures;
(5) And sending the identified gesture information.
2. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 1, characterized in that: the radar spectrogram in the step (2) comprises a time-frequency spectrogram, a time-distance spectrogram, a time-speed spectrogram and a time-angle spectrogram.
3. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 1, characterized in that the step (2) comprises the following steps:
(21) Preprocessing an original echo signal acquired by a radar;
(22) Performing short-time Fourier transform on the preprocessed signals, and stacking in a slow time dimension to obtain time-frequency characteristic expression of echo signals;
(23) Pulse compression processing is carried out on the preprocessed signals, stacking is carried out on the signals in a slow time dimension, and distance characteristic expression is obtained in a multi-receiving channel accumulation mode;
(24) Performing two-dimensional fast Fourier transform on the preprocessed signals, and sequentially stacking the signals in a slow time dimension to obtain a speed characteristic expression;
(25) And processing the preprocessed signals by using a minimum variance undistorted response beam forming algorithm, and sequentially stacking in a slow time dimension to obtain the horizontal angle characteristic expression.
4. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 3, characterized in that the step (21) is specifically as follows:
static clutter suppression is carried out on an original echo signal, phases of a moving target in the echo signal are different in different time, the current echo signal and the echo signal with the interval time tau are canceled, and a target echo signal is obtained, wherein the calculation formula is as follows:
S[t] = S_R[t] − S_R[t − τ]
wherein t represents the slow time index within the frequency modulation period, and S_R[t] is the radar echo signal at time t.
5. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 3, characterized in that the step (22) is specifically as follows:
firstly segmenting the preprocessed signals, windowing each segment of signals, performing short-time Fourier transform on the windowed data, and finally stacking the processed results of each segment of signals on a slow time dimension to obtain the frequency characteristic expression of echo signals, wherein the frequency characteristic expression is obtained through the following calculation formula:
STFT(t, f) = ∫_{t−ε}^{t+ε} S(τ) W(τ − t) e^{−j2πfτ} dτ
where S (t) is the target echo signal, W (·) is a window function, STFT (t, f) is the obtained short-time fourier transform result, and epsilon represents the slice length around the analysis time point t.
6. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 3, characterized in that the step (23) is specifically as follows:
firstly splitting the preprocessed signals, namely splitting the echo signals into a plurality of complete chirp signals, then carrying out pulse compression processing on each chirp signal to obtain distance images of single chirp signals, finally stacking the distance images of each chirp signal in a slow time dimension according to time, and obtaining the distance characteristic expression of the echo signals in a multi-receiving-channel accumulation mode, wherein the calculation method comprises the following steps:
R_m(k, m) = |Σ_{i=1}^{N_ADC} W_i · S(N_ADC − i, m) · e^{−j2πki/N_ADC}|
wherein R_m(k, m) represents the amplitude of the m-th chirp signal at point k; W_i represents a window function, the window function being a Hamming window; N_ADC represents the number of sampling points; and S(N_ADC − i, m) represents the value of the (N_ADC − i)-th sampling point of the m-th chirp signal;
secondly, if the target is far away from the radar, the condition of low energy of the echo signal can occur, so that the difference between the characteristic information and the background is not obvious, and the method for accumulating the multi-channel data is utilized, wherein the specific implementation method is as follows:
R_m = R_m1 ⊙ R_m2 ⊙ … ⊙ R_mC
wherein ⊙ represents the Hadamard product, R_mC represents the distance spectrum obtained from the C-th receive channel, and the product is taken over all of the receive channels;
and finally accumulating the distance spectrums of the different receiving channels to form a final distance characteristic expression.
7. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 3, characterized in that the step (24) is specifically as follows:
firstly, windowing an original signal, performing fast Fourier transform on the original signal in a fast time dimension to obtain a range profile, then, windowing the range profile, performing fast Fourier transform on the range profile in a slow time dimension to obtain a velocity spectrum, and finally, stacking the velocity spectrums in the slow time dimension according to a time sequence to obtain a velocity characteristic expression, wherein the calculation method comprises the following steps:
V_m(k, s) = |Σ_{i=1}^{N_C} W_i · R_m(k, m − i) · e^{−j2πsi/N_C}|
wherein V_m(k, s) represents the amplitude at point s of the k-th row of the distance spectrum, R_m(k, m − i) represents the amplitude of the k-th row, (m − i)-th column of the f-th frame, and N_C is the number of chirp signals contained in each frame.
8. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 3, characterized in that the step (25) is specifically as follows:
firstly, windowing an original signal, obtaining a range profile after fast Fourier transformation in a fast time dimension, and processing the range profile by using a minimum variance undistorted response beam forming algorithm to obtain undistorted output of a target azimuth signal, wherein the calculation method comprises the following steps:
A(θ, m) = a(θ) R_xm a′(θ)
wherein A(θ, m) represents the echo signal energy at angle θ in the m-th chirp signal, n is the number of array elements, and R_xm represents the autocorrelation matrix of the distance spectrum of the m-th chirp signal, that is:
R_xm = E[x_m x_m^H]
where x_m is the distance-spectrum snapshot vector of the m-th chirp signal over the n array elements and (·)^H denotes the conjugate transpose;
a (θ) is a steering vector, namely:
a(θ) = [1, e^{−j2πd sin θ/λ}, …, e^{−j2π(n−1)d sin θ/λ}]^T
where d is the array element spacing and λ is the radar wavelength;
finally, stacking the horizontal angle spectrums in the slow time dimension according to the time sequence to obtain the horizontal angle characteristic expression.
9. The millimeter wave radar gesture recognition method based on multi-domain spectrogram and multi-resolution fusion of claim 1, characterized in that the step (3) is specifically as follows:
firstly, uniformly remolding multi-class feature spectrograms into tensors with the same size, mapping the tensors into feature tensors with the fixed size by using a linear projection network, inputting the obtained feature tensors into a two-dimensional convolution network structure to realize abstraction of spectrogram features, then inputting the abstracted features into a multi-resolution fusion module to generate multi-resolution feature tensors, fusing the multi-resolution feature tensors of the multi-domain spectrograms to obtain fusion features complementary to the multi-domain spectrograms, and finally classifying the features by using logistic regression.
10. Millimeter wave radar gesture recognition system based on a multi-domain spectrogram and multi-resolution fusion, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of the preceding claims 1 to 9 when executing the program.
CN202310018139.8A 2023-01-06 2023-01-06 Millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion Pending CN116184394A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310018139.8A CN116184394A (en) 2023-01-06 2023-01-06 Millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion


Publications (1)

Publication Number Publication Date
CN116184394A true CN116184394A (en) 2023-05-30

Family

ID=86451687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310018139.8A Pending CN116184394A (en) 2023-01-06 2023-01-06 Millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion

Country Status (1)

Country Link
CN (1) CN116184394A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116482680A (en) * 2023-06-19 2023-07-25 精华隆智慧感知科技(深圳)股份有限公司 Body interference identification method, device, system and storage medium
CN116482680B (en) * 2023-06-19 2023-08-25 精华隆智慧感知科技(深圳)股份有限公司 Body interference identification method, device, system and storage medium
CN117671777A (en) * 2023-10-17 2024-03-08 广州易而达科技股份有限公司 Gesture recognition method, device, equipment and storage medium based on radar
CN117671777B (en) * 2023-10-17 2024-05-14 广州易而达科技股份有限公司 Gesture recognition method, device, equipment and storage medium based on radar
CN117129947A (en) * 2023-10-26 2023-11-28 成都金支点科技有限公司 Planar transformation method radar signal identification method based on mininet
CN117129947B (en) * 2023-10-26 2023-12-26 成都金支点科技有限公司 Planar transformation method radar signal identification method based on mininet

Similar Documents

Publication Publication Date Title
CN116184394A (en) Millimeter wave radar gesture recognition method and system based on multi-domain spectrogram and multi-resolution fusion
CN110007366B (en) Life searching method and system based on multi-sensor fusion
CN111178331B (en) Radar image recognition system, method, apparatus, and computer-readable storage medium
CN113408328B (en) Gesture segmentation and recognition algorithm based on millimeter wave radar
CN109901130B (en) Rotor unmanned aerial vehicle detection and identification method based on Radon transformation and improved 2DPCA
CN111427031A (en) Identity and gesture recognition method based on radar signals
Li et al. Human behavior recognition using range-velocity-time points
CN113313040A (en) Human body posture identification method based on FMCW radar signal
CN115063884B (en) Millimeter wave radar head action recognition method based on multi-domain fusion deep learning
WO2023029390A1 (en) Millimeter wave radar-based gesture detection and recognition method
Kim et al. Radar-based human activity recognition combining range–time–Doppler maps and range-distributed-convolutional neural networks
CN113064483A (en) Gesture recognition method and related device
Janakaraj et al. STAR: Simultaneous tracking and recognition through millimeter waves and deep learning
CN115877376A (en) Millimeter wave radar gesture recognition method and recognition system based on multi-head self-attention mechanism
Seo et al. Underwater moving target classification using multilayer processing of active sonar system
Lee et al. Digit recognition in air-writing using single millimeter-wave band radar system
Xie et al. Lightweight midrange arm-gesture recognition system from MmWave radar point clouds
Biswas et al. Complex sincnet for more interpretable radar based activity recognition
CN116561700A (en) Indoor human body posture recognition method based on millimeter wave radar
CN115982620A (en) Millimeter wave radar human body falling behavior identification method and system based on multi-class three-dimensional features and Transformer
CN114511873B (en) Static gesture recognition method and device based on millimeter wave radar imaging
Wang et al. Human behavior recognition based on multi-dimensional feature learning of millimeter-wave radar
CN115754956A (en) Millimeter wave radar gesture recognition method based on envelope data time sequence
Wang et al. Rammar: RAM assisted mask R-CNN for FMCW sensor based HGD system
Zheng et al. Hand gesture recognition based-on three-branch CNN with fine-tuning using MIMO radar

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination