CN115345906A - Human body posture tracking method based on millimeter wave radar - Google Patents
- Publication number
- CN115345906A (application CN202211007781.8A)
- Authority
- CN
- China
- Prior art keywords
- network
- human body
- information
- body posture
- millimeter wave
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/05—Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves
- A61B5/0507—Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves using microwaves or terahertz waves
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Measuring devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor or mobility of a limb
- A61B5/1116—Determining posture transitions
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Measuring devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor or mobility of a limb
- A61B5/1121—Determining geometric values, e.g. centre of rotation or angular range of movement
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Measuring devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor or mobility of a limb
- A61B5/1126—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor or mobility of a limb using a particular sensing technique
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/725—Details of waveform analysis using specific filters therefor, e.g. Kalman or adaptive filters
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7253—Details of waveform analysis characterised by using transforms
- A61B5/7257—Details of waveform analysis characterised by using transforms using Fourier transforms
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10044—Radar image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Abstract
A human body posture tracking method based on millimeter-wave radar processes radio-frequency (RF) information into horizontal and vertical heat maps that serve as the input of a neural network, and outputs human body postures using a bottom-up method. Visual and RF information in the scene are acquired synchronously. With HRNet as the teacher network and an N-joint human body model, the posture of the human body is estimated and the coordinates of the human skeleton key points in the pixel coordinate system are calculated to supervise learning from the RF information. A Kalman filter predicts the human body posture under occlusion from the posture information inferred from historical RF frames, which effectively addresses the loss of posture information under occlusion. The Hungarian algorithm performs data association: by calculating the similarity of postures between adjacent frames, posture information is associated across frames; a track set is established for each successfully associated object, and posture information successfully matched in subsequent frames is added to the corresponding track set, thereby tracking the target.
Description
Technical Field
The invention lies at the intersection of wireless sensing and computer vision, and in particular relates to a human body posture tracking method based on millimeter-wave radar.
Background
Vision-based human pose tracking is the task of performing multi-person human pose estimation (HPE) and assigning a unique instance ID to each object in every frame. Such a system generally comprises image acquisition equipment and background processing equipment: one or more cameras capture video of people's daily activities and transmit it over a network to the background processing system, which separates the foreground from each video frame, extracts static and dynamic features of the target human body, and performs posture estimation and tracking with deep learning methods according to changes in the feature values.
Vision-based HPE has made some progress, but its application is still limited. In dimly lit scenes the camera images degrade, and mutual occlusion between human bodies further reduces detection accuracy and tracking performance substantially. Cameras are also intrusive and raise privacy concerns. For example, posture-based fall detection and behavior recognition in sensitive environments involve serious privacy problems when visual methods are used.
Disclosure of Invention
The main technical problem solved by the invention is to provide a human body posture tracking method based on millimeter-wave radar. The method overcomes limiting factors such as dim ambient light, electromagnetic waves, and limited camera line of sight while preserving data privacy, thereby achieving efficient human posture tracking.
For posture estimation, a human skeleton model is adopted to express the key points and features extracted from the input data. It effectively represents the posture data and motion behavior of a human body in the two-dimensional pixel plane, and its computation cost and model construction cost are low. The invention provides a cross-supervised learning method in which visual information assists the radar in learning the human body information contained in the RF information, and a bottom-up method is used to estimate the human body posture. After training is completed, the radar RF network can estimate the human body posture with RF information as its only input. The invention uses a Kalman filter for motion prediction and compensation, so that the key-point data of an occluded object can be predicted effectively when the human body is occluded. To associate the Kalman filter predictions with the newly arrived posture measurements output by the network, the method formulates an assignment problem and solves it with the Hungarian algorithm.
A human body posture tracking method using millimeter-wave radar based on cross-supervised learning comprises the following steps:
Step 1, data acquisition. The invention uses a monocular camera and an RXX 6843 millimeter-wave radar with three transmitting and four receiving antennas to synchronously acquire image data I and raw radar RF data R. The image acquisition frame rate and the radar data acquisition frame rate are both set to 20 FPS.
Step 2, teacher network selection. The invention uses HRNet as the teacher network. HRNet starts from a high-resolution subnetwork as its first stage, gradually adds high-to-low-resolution subnetworks to form further stages, and connects the multi-resolution subnetworks in parallel, so that a high-resolution representation is maintained throughout the network. The network extracts 14 human skeleton key points from the image data I and obtains the pixel coordinates of each key point, thereby estimating the human posture. The output of the teacher network is a five-dimensional tensor of the form (N, C, F, V, M): N represents the number of network input pictures; C represents the channels, namely the horizontal and vertical coordinates in the pixel coordinate system and a confidence score; F represents the number of frames; V represents the number of key points; and M represents the number of people.
Step 3, radar data processing. The acquired radar data has the form (S, L, X, F): S is the number of sampling points, L is the number of chirps transmitted per second, X is the number of channels, corresponding to the transmitting and receiving antennas of the radar, and F is the number of frames of acquired RF data. Dimension transformation and Fourier transformation are applied to the acquired data to obtain a horizontal heat map H and a vertical heat map V.
Step 4, training the student network S. The horizontal and vertical heat maps (H, V) are input into the two branch encoding networks ($E_h$, $E_v$) of the student network. ($E_v$, $E_h$) form the RF encoding network: each is a 10-layer 9 × 5 × 5 convolutional network, with batch normalization and a ReLU activation function at the end of each layer. The image data I is input into the teacher network T to complete vision-based human posture estimation and obtain the pixel coordinates of the 14 human key points, and the result T(I) is used as the label for training the student network S. The learning goal of the student network is to minimize the difference between the key-point pixel coordinates it predicts and those predicted by the teacher network. The student network finally outputs the human posture information; each frame contains the posture data of the different people present, and the output has the same form as that of the teacher network.
Step 5, person tracking. The invention uses a Kalman filter for motion prediction: from the posture data generated in the previous step, it predicts the position where each target is likely to appear in the next frame. The invention creates and maintains tracks (Tracks) from the object detections of each frame, that is, it creates a track set for each object. When an object appears in the first frame, or an object appears that does not match any existing object, a new track is created.
Step 6, posture matching. The invention uses the Hungarian algorithm to associate predicted objects with detected objects: the object positions predicted for the next frame from the current frame are matched with the human body positions detected in the next frame by the student network S.
However, matching may fail during this process. The failure cases and the way each is handled are as follows:
(1) A new target appears. A newly detected target (Detections) cannot be associated with any target in the existing track sets (Tracks), because the Kalman filter has no historical information about the new target and cannot make a prediction for it, so matching fails. In this case a new track set is created and the new target is added to it.
(2) An existing target disappears. When a target leaves the scene, its track set is marked as inactive; if no detection is added to it for multiple consecutive frames, the track set is deleted.
(3) Noise. Because of noise in the scene or false detections by the detection algorithm, a detection result may not correspond to any Kalman filter prediction. Such cases can be filtered out by checking whether the detection matches a track set over multiple consecutive frames.
The beneficial effects of the invention are as follows:
(1) With traditional vision-based human posture tracking algorithms, the data collected by the camera degrades severely in dimly lit scenes, and blurred or occluded human bodies further reduce detection accuracy and tracking performance substantially. Cameras also raise privacy concerns; for example, using a camera to detect whether an elderly person has fallen in a bathroom is a serious invasion of privacy. Because the invention is based on millimeter-wave radar, it is not affected by adverse factors such as line of sight and light intensity, and because meaningful information cannot be read directly from radar data, the user's privacy is protected.
(2) The method takes RF data as input. RF information cannot be labeled directly with human postures; learning by cross-supervision solves this labeling problem.
(3) By using a Kalman filter, the method effectively handles the case in which human bodies occlude each other and the detection model cannot predict normally, and by using the Hungarian algorithm it effectively associates the posture data across frames.
Drawings
Fig. 1 is a diagram of the teacher-student neural network in an embodiment of the present invention.
Fig. 2 is a flow diagram of the Kalman filter operation in an embodiment of the present invention.
Fig. 3 is a flow chart of the teacher-student network operation in an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is explained in further detail below with reference to the drawings.
The invention relates to a millimeter-wave radar human body posture tracking method based on cross-supervised learning. First, RGB image data I is acquired, and the millimeter-wave radar synchronously acquires radar echo information R. The image data is processed with a human posture estimation method to extract the coordinates of the human key points, which assist the learning of the student network. The student network takes RF information as input and outputs human posture key-point data, yielding an RF human posture estimation model. Tracking uses a Kalman filter for motion prediction and update. The specific process is as follows.
The design of the cross-supervised student network is shown in Fig. 1.
The student network estimates the human body posture from RF information as follows:
Step 1, data acquisition. A monocular camera and an RXX 6843 millimeter-wave radar with three transmitting and four receiving antennas are used together to synchronously acquire image data I and raw radar RF data R. The image acquisition frame rate and the radar data acquisition frame rate are both set to 20 FPS.
Step 2, teacher network selection. HRNet is selected as the teacher network. HRNet starts from a high-resolution subnetwork as its first stage, gradually adds high-to-low-resolution subnetworks to form further stages, and connects the multi-resolution subnetworks in parallel, so that a high-resolution representation is maintained throughout the network. The invention inputs the image I into the teacher network HRNet to obtain a human posture estimation result $T_i$, a five-dimensional tensor of the form (N, C, F, V, M). N represents the batch size of the network input; C represents the channels, namely the horizontal and vertical coordinates in the pixel coordinate system and a confidence score; F represents the number of frames; V represents the number of key points; and M represents the number of people. $T_i$ is used as label information to supervise the learning of the student RF network, which yields the result S(R).
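As a rough illustration of the (N, C, F, V, M) layout described in step 2, the following sketch builds a dummy pose tensor with NumPy and reads one key point back from it. The sizes N = 1, F = 500, and M = 2 and the helper name `keypoint` are illustrative assumptions; only C = 3 (x, y, confidence) and V = 14 key points follow from the text.

```python
import numpy as np

# Assumed sizes for illustration: 1 batch item, 500 frames, 2 people.
N, C, F, V, M = 1, 3, 500, 14, 2
pose = np.zeros((N, C, F, V, M), dtype=np.float32)

def keypoint(pose, n, f, v, m):
    """Return (x, y, confidence) of key point v of person m in frame f."""
    x, y, conf = pose[n, :, f, v, m]
    return float(x), float(y), float(conf)

# Write one key point (hypothetical values), then read it back.
pose[0, :, 10, 3, 1] = (120.0, 64.0, 0.9)
kp = keypoint(pose, n=0, f=10, v=3, m=1)
```

Keeping the channel axis C second means the x, y, and confidence planes of a whole sequence can each be sliced out with a single index.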
Step 3, radar data processing. The acquired radar data has the form (S, L, X, F): S is the number of sampling points, L is the number of chirps transmitted per second, X is the number of channels of the radar data, corresponding to the transmitting and receiving antennas of the radar, and F is the number of frames of acquired RF data. Dimension transformation and Fourier transformation are applied to the acquired data to obtain a horizontal heat map H and a vertical heat map V.
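The text does not spell out the transforms used in step 3. One common way to turn a radar data cube into range-angle heat maps is a range FFT over the samples followed by an angle FFT across an antenna row; the sketch below follows that approach for a single frame. The function name `heatmap`, the choice of which virtual channels form the horizontal and vertical rows, and the 64 angle bins are all assumptions, and the frame is assumed to be already arranged as (samples, chirps, virtual channels).

```python
import numpy as np

def heatmap(frame, channels, n_angle=64):
    """Range-angle heat map for one radar frame.

    frame: complex array of shape (S, L, X) = (samples, chirps, channels).
    channels: indices of the virtual channels forming one antenna row
              (hypothetical; depends on the radar's antenna layout).
    """
    # Range FFT over the fast-time samples of every chirp.
    rng = np.fft.fft(frame, axis=0)
    # Angle FFT across the selected channels, zero-padded to n_angle bins.
    ang = np.fft.fft(rng[:, :, channels], n=n_angle, axis=2)
    ang = np.fft.fftshift(ang, axes=2)  # put zero angle in the middle bin
    # Sum magnitudes over chirps to get an (S, n_angle) map.
    return np.abs(ang).sum(axis=1)

# Synthetic frame: a single reflector at range bin 10, straight ahead.
S, L, X = 64, 16, 12
tone = np.exp(2j * np.pi * 10 * np.arange(S) / S)
frame = np.tile(tone[:, None, None], (1, L, X))

H = heatmap(frame, channels=list(range(0, 8)))   # assumed horizontal row
V = heatmap(frame, channels=list(range(8, 12)))  # assumed vertical row
```

In the synthetic frame both maps peak at range bin 10 and at the central angle bin, which is a quick sanity check that the axes are oriented as intended.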
Step 4, training the student network. The horizontal and vertical heat maps are input in parallel into the RF encoding network and encoded separately. After encoding, the horizontal and vertical information is concatenated along the channel dimension, and the aggregated information is input into the decoding network D. As shown in Fig. 1, the RF encoding network E is a 10-layer 9 × 5 × 5 convolutional network; each layer is followed by batch normalization and a ReLU activation function. The decoding network D is a 4-layer 3 × 6 × 6 convolutional neural network, again with a ReLU activation function after each layer. For both the vertical and the horizontal encoding networks, 500 frames of heat-map information (25 seconds at 20 FPS) are input each time.
The posture estimation method adopted by the invention handles the key-point association problem with a bottom-up analysis. For example, for limb c of person k, the two key points associated with it are $j_1$ and $j_2$ (if limb c is the upper arm, $j_1$ and $j_2$ are the shoulder and elbow key points, respectively; if limb c is the forearm, they are the elbow and wrist key points, respectively). For key points $j_1$ and $j_2$, the corresponding labels are denoted $X_{j_1,k}$ and $X_{j_2,k}$.
For person k, if a point p lies on the limb, its value is the unit vector (of length 1) pointing from one key point to the other; otherwise its value is the zero vector. The unit vector is calculated as follows:
$$v = \frac{X_{j_2,k} - X_{j_1,k}}{\lVert X_{j_2,k} - X_{j_1,k} \rVert_2}$$
Mathematically, a point p lies on limb c of person k if it satisfies two constraints: its distance from the start key point along the direction of the limb (the direction of the line connecting the two associated key points) does not go beyond the end key point, and its distance in the direction perpendicular to the limb does not exceed the width of the limb.
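$$0 \le v \cdot (p - X_{j_1,k}) \le l_{c,k} \quad \text{and} \quad \lvert v_\perp \cdot (p - X_{j_1,k}) \rvert \le \sigma_l$$
Here $v_\perp$ is a vector perpendicular to $v$, $l_{c,k}$ is the length of the limb, and $\sigma_l$ is the width of the limb.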
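The two constraints can be checked numerically. The following is a minimal sketch, assuming 2-D pixel coordinates and distinct key points; the function name `limb_field` and the sample points are hypothetical. It returns the unit vector v for points on the limb and the zero vector elsewhere.

```python
import numpy as np

def limb_field(x1, x2, points, sigma):
    """Unit vector v for points on the limb x1 -> x2, zero vector elsewhere.

    Assumes x1 != x2 so that the limb length is non-zero.
    """
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    length = np.linalg.norm(x2 - x1)          # l_{c,k}
    v = (x2 - x1) / length                    # unit vector along the limb
    v_perp = np.array([-v[1], v[0]])          # perpendicular to v
    d = np.asarray(points, float) - x1        # p - X_{j1,k} for every p
    along = d @ v                             # distance along the limb
    across = np.abs(d @ v_perp)               # distance across the limb
    on_limb = (along >= 0) & (along <= length) & (across <= sigma)
    return np.where(on_limb[:, None], v, 0.0)

# Limb from (0, 0) to (10, 0) with width 2: only the first point lies on it.
field = limb_field((0, 0), (10, 0), [(5, 1), (5, 3), (12, 0)], sigma=2.0)
```

The second sample point fails the width constraint and the third lies beyond the end key point, so both receive the zero vector.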
The training objective is to minimize the difference between the teacher network's label data and the student network's predictions. The invention defines the loss as the sum of the binary cross-entropy losses over all key points, calculated as follows:
S(R) denotes the prediction of the student network S with the RF information R as input, and T(I) denotes the output of the teacher network T with the visual information I as input. The indexed terms of S(R) and T(I) in the loss denote their values at pixel coordinate position (i, j). After the training iterations are completed, a network model is obtained that estimates the human body posture with the RF signal as its only input.
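The loss formula itself is not reproduced in the text, so the following is only a sketch of a standard per-pixel binary cross-entropy summed over key points, not necessarily the exact expression used by the invention. It assumes the student and teacher outputs are represented as per-key-point maps with values in [0, 1]; the function name `bce_loss` and the clipping constant `eps` are assumptions.

```python
import numpy as np

def bce_loss(student, teacher, eps=1e-7):
    """Sum of binary cross-entropy over key points and pixels (i, j).

    student, teacher: arrays of shape (V, height, width) with values in
    [0, 1]; the teacher map acts as the label.
    """
    s = np.clip(student, eps, 1.0 - eps)   # avoid log(0)
    per_pixel = -(teacher * np.log(s) + (1.0 - teacher) * np.log(1.0 - s))
    return float(per_pixel.sum())

# Toy label: every one of the 14 key points sits at pixel (1, 2).
teacher = np.zeros((14, 4, 4))
teacher[:, 1, 2] = 1.0
good = np.where(teacher == 1.0, 0.9, 0.1)  # prediction close to the label
bad = np.full_like(teacher, 0.5)           # uninformative prediction
```

A prediction close to the teacher label gives a smaller loss than an uninformative one, which is the behavior the training objective relies on.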
Step 5, posture tracking. Because human bodies may occlude each other while moving, the tracked object may be lost, so a Kalman filter is used for motion prediction and update. For the posture estimation output (N, C, F, V, M), a frame-number threshold Q is set for the Kalman filter. When the accumulated number of frames F of posture information reaches this threshold, the Kalman filter predicts, from the accumulated posture pixel-coordinate information, the position where each person is likely to appear in the next frame. For each newly appearing object, the Kalman filter creates a new track set, adds the object's information to it, and calculates the predicted pixel-coordinate information.
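The text does not specify the state model of the Kalman filter. A minimal sketch, assuming a constant-velocity model for a single key point with time measured in frames, is shown below; the class name `KeypointKalman` and the noise values `q_var` and `r_var` are assumptions. The last line shows the occlusion case: with no detection available, the prediction alone supplies the key-point position.

```python
import numpy as np

class KeypointKalman:
    """Constant-velocity Kalman filter for one key point (x, y) in pixels."""

    def __init__(self, xy, dt=1.0, q_var=1.0, r_var=4.0):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])   # state: x, y, vx, vy
        self.P = np.eye(4) * 100.0                    # state covariance
        self.F = np.eye(4)                            # motion model
        self.F[0, 2] = self.F[1, 3] = dt              # dt in frames
        self.H = np.eye(2, 4)                         # only x, y are observed
        self.Q = np.eye(4) * q_var                    # process noise (assumed)
        self.R = np.eye(2) * r_var                    # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x    # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

kf = KeypointKalman((100.0, 50.0))
for t in range(1, 41):                 # target moves 2 px per frame to the right
    kf.predict()
    kf.update((100.0 + 2.0 * t, 50.0))
occluded = kf.predict()                # no detection: rely on the prediction
```

In a full tracker one such filter (or one joint state vector) would be kept for every key point of every track, and the accumulated frames up to the threshold Q would play the role of the 40 warm-up frames here.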
Step 6, data association. The Hungarian algorithm is used for the data association problem. The goal is to associate each detected person with the targets already being tracked so that the total distance loss of the assignment is minimized. For each track K, a frame counter is kept that counts the number of frames since the last successful association. The counter is incremented during Kalman filter prediction and reset to 0 when the track is successfully associated. A track whose counter exceeds the set maximum number of frames $A_{max}$ is considered to have left the scene and is deleted from the track set. For each detection that cannot be associated with an existing track, a new track hypothesis is initiated.
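The association step can be sketched as a linear assignment over a cost matrix. The example below uses SciPy's `linear_sum_assignment`, which solves the same minimum-cost assignment problem that the Hungarian algorithm solves. The cost (mean key-point distance in pixels), the gating threshold `max_dist`, and the function name `associate` are assumptions, since the text only speaks of pose similarity and distance loss.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(predicted, detected, max_dist=50.0):
    """Match predicted poses to detected poses by mean key-point distance.

    predicted: array (T, V, 2), detected: array (D, V, 2), pixel coordinates.
    Returns (matches, unmatched_tracks, unmatched_detections).
    """
    # cost[t, d] = mean Euclidean distance over the V key points.
    diff = predicted[:, None, :, :] - detected[None, :, :, :]
    cost = np.linalg.norm(diff, axis=-1).mean(axis=-1)
    rows, cols = linear_sum_assignment(cost)
    # Discard assignments that are too far apart to be the same person.
    matches = [(int(t), int(d)) for t, d in zip(rows, cols)
               if cost[t, d] <= max_dist]
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_t = [t for t in range(len(predicted)) if t not in matched_t]
    unmatched_d = [d for d in range(len(detected)) if d not in matched_d]
    return matches, unmatched_t, unmatched_d

rng = np.random.default_rng(0)
base = rng.uniform(0, 400, size=(2, 14, 2))
predicted = base
# Detections arrive in swapped order, slightly shifted, plus one new person.
detected = np.stack([base[1] + 1.0, base[0] - 1.0, base[0] + 300.0])
matches, lost, new = associate(predicted, detected)
```

Unmatched detections are the candidates for new tracks, and unmatched tracks are the ones whose frame counter keeps growing.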
The cases in which matching may fail, and the corresponding handling, are as follows:
(1) A new target appears. A newly detected target (Detections) cannot be associated with any target in the existing track sets (Tracks), because the Kalman filter has no historical information about the new target and cannot make a prediction for it, so matching fails. In this case a new track set is established and the new target is added to it.
(2) An existing target disappears. After a target leaves the scene, its track set is marked as inactive; if no detection is added to it for multiple consecutive frames, the track set is deleted.
(3) Noise. Because of noise in the scene or false detections by the detection algorithm, a detection result may not correspond to any Kalman filter prediction. Such cases can be filtered out by checking whether the detection matches a track set over multiple consecutive frames.
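The track bookkeeping for cases (1) and (2) can be sketched in plain Python. The class `Track`, the function `step`, and the small `a_max` value are hypothetical; the noise filtering of case (3), which would require confirming a new track over several frames, is left out for brevity.

```python
class Track:
    """One person's track set with a frames-since-last-match counter."""

    def __init__(self, track_id, pose):
        self.id = track_id
        self.poses = [pose]
        self.counter = 0        # frames since the last successful association

def step(tracks, matches, unmatched_dets, detections, next_id, a_max=30):
    """Apply one frame of association results to the track list."""
    for track in tracks:
        track.counter += 1                      # incremented at prediction time
    for t_idx, d_idx in matches:
        tracks[t_idx].poses.append(detections[d_idx])
        tracks[t_idx].counter = 0               # reset on successful association
    # Tracks unmatched for more than a_max frames are treated as having left.
    alive = [t for t in tracks if t.counter <= a_max]
    # Every unmatched detection starts a new track hypothesis.
    for d_idx in unmatched_dets:
        alive.append(Track(next_id, detections[d_idx]))
        next_id += 1
    return alive, next_id

tracks, next_id = [], 0
# Frame 1: person a appears. Frame 2: a is matched and person b appears.
tracks, next_id = step(tracks, [], [0], ["pose_a0"], next_id, a_max=2)
tracks, next_id = step(tracks, [(0, 0)], [1], ["pose_a1", "pose_b0"], next_id, a_max=2)
for _ in range(3):   # person a disappears; person b keeps matching
    tracks, next_id = step(tracks, [(1, 0)], [], ["pose_b"], next_id, a_max=2)
```

After three frames without a match, the track of person a exceeds `a_max` and is dropped, while the track of person b continues to grow.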
Compared with the prior art, the posture tracking method based on millimeter-wave radar can track human posture in dimly lit scenes, with limited line of sight, and under smoke interference, and can effectively protect user privacy. The cross-supervision method effectively solves the problem that radar RF data cannot be labeled with skeleton information. The Kalman filter effectively handles the case in which human bodies occlude each other and the detection model cannot predict normally, and the Hungarian algorithm effectively associates the posture data across frames.
The above description is only a preferred embodiment of the present invention, and the scope of the present invention is not limited to this embodiment; equivalent modifications or changes made by those skilled in the art according to the disclosure of the present invention fall within the scope of protection set forth in the appended claims.
Claims (10)
1. A human body posture tracking method based on a millimeter wave radar, characterized in that the method comprises the following steps:
step 1, data acquisition; acquiring image data I and raw radar radio frequency data R;
step 2, teacher network selection; selecting the HRNet network as the teacher network, and inputting the image data I into the teacher network HRNet to obtain a human body posture estimation result $T_i$; $T_i$ is five-dimensional tensor information of the form (N, C, F, V, M), where N represents the batch size of the network input, C represents the channels, namely the abscissa and ordinate in the pixel coordinate system and the confidence score, F represents the number of frames, V represents the number of key points, and M represents the number of people; $T_i$ is used as label information to supervise the learning of the student radio frequency network, which produces the result S(R);
step 3, radar data processing; the acquired radar data has the form (S, L, X, F), where S represents the sampling points, L represents the number of chirps transmitted per second, X represents the number of radar data channels, corresponding to the transmitting and receiving antennas of the radar, and F represents the number of frames of acquired data; performing dimension transformation and Fourier transformation on the acquired data to obtain a horizontal heat map H and a vertical heat map V;
step 4, student network training; inputting the horizontal and vertical heat maps in parallel into a radio frequency encoding network E, which encodes each of them separately; after encoding, concatenating the horizontal and vertical heat map information along the channel dimension, and inputting the aggregated information into a decoding network D; training the student network with the human body posture estimation result output by the teacher network as the label;
step 5, posture tracking; using a Kalman filter to predict and update the motion;
step 6, data association; using the Hungarian algorithm to create associations between each person's detections and the targets being tracked, minimizing the combined distance loss, and outputting the final tracking result.
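The data association of step 6 can be illustrated with a minimal sketch. It assumes that the distance loss is the mean Euclidean distance between corresponding key points, which the claim does not specify, and it replaces the Hungarian algorithm with an exhaustive search over permutations that finds the same minimum-cost assignment for a handful of people. The names `mean_keypoint_distance` and `associate` are hypothetical.

```python
from itertools import permutations
from math import hypot, inf


def mean_keypoint_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding (x, y) key points."""
    return sum(hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in zip(pose_a, pose_b)) / len(pose_a)


def associate(predicted, detected):
    """Return (pairs, total_cost) for the minimum-cost one-to-one matching.

    pairs is a list of (track index, detection index).
    Exhaustive search stands in for the Hungarian algorithm here; it assumes
    len(predicted) <= len(detected) and only a few people in the scene.
    """
    cost = [[mean_keypoint_distance(p, d) for d in detected] for p in predicted]
    best_cost, best_perm = inf, ()
    for perm in permutations(range(len(detected)), len(predicted)):
        total = sum(cost[t][d] for t, d in enumerate(perm))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return list(enumerate(best_perm)), best_cost


# Two predicted poses (two key points each) and two detections in swapped order.
predicted = [[(0, 0), (0, 10)], [(100, 0), (100, 10)]]
detected = [[(101, 1), (101, 11)], [(1, 0), (1, 10)]]
pairs, total = associate(predicted, detected)
```

For larger scenes a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual choice, since the exhaustive search grows factorially with the number of people.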
2. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 1, a monocular camera and an RXX 6843 millimeter wave radar with three transmitting and four receiving antennas are used together to synchronously acquire the image data I and the raw radar radio frequency data R.
3. The millimeter wave radar-based human body posture tracking method according to claim 2, wherein: the image acquisition frame rate and the radar data acquisition frame rate are both set to 20 FPS.
4. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 2, HRNet starts from a high-resolution subnetwork as the first stage, gradually adds high-to-low resolution subnetworks to form further stages, and connects the multi-resolution subnetworks in parallel, so that a high-resolution representation is maintained throughout the network.
5. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 4, the radio frequency encoding network E is a 10-layer convolutional network with 9 × 5 × 5 convolutions; the output of each layer is batch normalized, and a ReLU activation function is applied at the end of each layer.
6. The millimeter wave radar-based human body posture tracking method according to claim 5, wherein: in step 4, the radio frequency encoding network E receives 500 frames of heat map information per input, i.e. 25 seconds of data.
7. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 4, the decoding network D is a 4-layer convolutional neural network with 3 × 6 × 6 convolutions, with a ReLU activation function after each layer.
8. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 4, for limb c of person k, the two key points associated with limb c are $j_1$ and $j_2$; for joint points $j_1$ and $j_2$, the corresponding labels are set as $X_{j_1,k}$ and $X_{j_2,k}$;
for person k, if the predicted point lies on the limb, its label is the unit vector along the line connecting the two key points; otherwise it is the zero vector; the unit vector is calculated as follows:
$$v = \frac{X_{j_2,k} - X_{j_1,k}}{\lVert X_{j_2,k} - X_{j_1,k} \rVert_2}$$
a point p on limb c of person k is described mathematically by two constraints: along the limb direction, i.e. the direction of the line connecting the two associated key points, the distance from the start key point to p cannot extend beyond the end key point; and in the direction perpendicular to the limb, the distance cannot exceed the width of the limb;
$v_\perp$ is the normal vector and $\sigma_l$ is the width of the limb;
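The limb label described above can be sketched in a few lines. The on-limb test below is an assumed reading of the two constraints, namely 0 ≤ v · (p − X_j1) ≤ limb length and |v⊥ · (p − X_j1)| ≤ σ_l, because the explicit inequalities are not reproduced in this text. The function name `limb_label_vector` and the use of 2D pixel coordinates are illustrative assumptions.

```python
from math import hypot


def limb_label_vector(p, x_j1, x_j2, sigma_l):
    """Return the label vector at point p for the limb running x_j1 -> x_j2.

    Assumed on-limb test (not given explicitly in the text):
        0 <= v . (p - x_j1) <= limb length   and   |v_perp . (p - x_j1)| <= sigma_l
    """
    dx, dy = x_j2[0] - x_j1[0], x_j2[1] - x_j1[1]
    length = hypot(dx, dy)
    if length == 0.0:
        return (0.0, 0.0)  # degenerate limb: both key points coincide
    v = (dx / length, dy / length)   # unit vector along the limb
    v_perp = (-v[1], v[0])           # unit normal to the limb
    rx, ry = p[0] - x_j1[0], p[1] - x_j1[1]
    along = v[0] * rx + v[1] * ry               # distance along the limb
    across = v_perp[0] * rx + v_perp[1] * ry    # signed distance across the limb
    on_limb = 0.0 <= along <= length and abs(across) <= sigma_l
    return v if on_limb else (0.0, 0.0)


# Horizontal limb from (0, 0) to (10, 0) with an assumed width of 2 pixels.
inside = limb_label_vector((5, 1), (0, 0), (10, 0), 2.0)
too_wide = limb_label_vector((5, 3), (0, 0), (10, 0), 2.0)
past_end = limb_label_vector((12, 0), (0, 0), (10, 0), 2.0)
```

Points that pass both checks receive the unit vector v; every other point receives the zero vector, matching the rule stated in the claim.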
the training objective is to minimize the difference between the teacher network label data and the student network prediction result; the loss is defined as the sum of the binary cross-entropy losses of all key points;
S(R) represents the prediction result of the student network with the radio frequency information R as input, and T(I) represents the output of the teacher network with the visual image information I as input; (i, j) represents a pixel coordinate position; the above steps are iterated to obtain a network model that estimates human body posture with the radio frequency signal as input.
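As a hedged illustration of this loss, the sketch below sums the per-pixel binary cross-entropy between teacher and student confidence maps over all key points. The nested-list layout `[keypoint][i][j]`, the function name `bce_loss`, and the clamping constant `eps` are assumptions made for the example; the exact formula of the original filing is not reproduced in this text.

```python
from math import log


def bce_loss(teacher, student, eps=1e-7):
    """Sum of per-pixel binary cross-entropy over all key-point maps.

    teacher, student: nested lists indexed [keypoint][i][j], values in [0, 1].
    teacher plays the role of T(I), student the role of S(R).
    """
    total = 0.0
    for t_map, s_map in zip(teacher, student):
        for t_row, s_row in zip(t_map, s_map):
            for t, s in zip(t_row, s_row):
                s = min(max(s, eps), 1.0 - eps)  # clamp to keep log() finite
                total -= t * log(s) + (1.0 - t) * log(1.0 - s)
    return total


# One key point, a 1 x 2 confidence map.
teacher_maps = [[[1.0, 0.0]]]
uncertain = bce_loss(teacher_maps, [[[0.5, 0.5]]])
confident = bce_loss(teacher_maps, [[[0.9, 0.1]]])
```

A student prediction closer to the teacher label yields a smaller loss, which is the quantity the training loop would minimize.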
9. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 5, for the output (N, C, F, V, M) of the posture estimation, a frame number threshold Q is set for the Kalman filter; when the accumulated number of frames F of posture information reaches the threshold Q, the Kalman filter predicts the position where each person may appear in the next frame according to the accumulated posture pixel coordinate information; for each newly appearing target, the Kalman filter creates a new track set, adds the information of the target to the track set, and calculates the predicted pixel coordinate information.
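The prediction step can be sketched with a constant-velocity Kalman filter applied to a single pixel coordinate of a single key point. The constant-velocity motion model, the noise variances, the initial covariance, and the names `AxisKalman` and `predict_next` are assumptions chosen for illustration; the claim itself fixes only the frame threshold Q and the predict/update behaviour.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one pixel coordinate of one key point."""

    def __init__(self, pos, process_var=1e-2, meas_var=1.0):
        self.pos, self.vel = float(pos), 0.0
        # 2 x 2 state covariance, stored element by element (assumed initial values).
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 10.0
        self.process_var, self.meas_var = process_var, meas_var

    def predict(self, dt=1.0):
        """Propagate state and covariance one frame ahead: P <- F P F^T + Q."""
        self.pos += self.vel * dt
        p00 = (self.p00 + dt * (self.p01 + self.p10)
               + dt * dt * self.p11 + self.process_var)
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.process_var
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z):
        """Correct the state with a measured position z."""
        innovation = z - self.pos
        s = self.p00 + self.meas_var
        k0, k1 = self.p00 / s, self.p10 / s      # Kalman gain
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        p00, p01 = self.p00, self.p01
        self.p00, self.p01 = (1 - k0) * p00, (1 - k0) * p01
        self.p10, self.p11 = self.p10 - k1 * p00, self.p11 - k1 * p01


def predict_next(history, frame_threshold):
    """Predict the next-frame coordinate once frame_threshold frames are available."""
    if len(history) < frame_threshold:
        return None  # not enough accumulated frames yet
    kf = AxisKalman(history[0])
    for z in history[1:]:
        kf.predict()
        kf.update(z)
    return kf.predict()


# A key point moving 2 pixels per frame; the threshold Q is assumed to be 5.
too_early = predict_next([0.0, 2.0], 5)
next_x = predict_next([0.0, 2.0, 4.0, 6.0, 8.0], 5)
```

A full tracker would run one such filter per axis and per key point, or a single filter with a joint state vector; the scalar form is used here only to keep the arithmetic visible.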
10. The millimeter wave radar-based human body posture tracking method according to claim 1, wherein: in step 6, a frame counter is set for each track K to count the number of frames since the last successful association; the counter is incremented during the prediction phase of the Kalman filter and is reset to 0 when the track is successfully associated; a track whose counter exceeds the set maximum number of frames $A_{max}$ is considered to have left the scene and is deleted from the track set; for each detection that cannot be associated with an existing track, a new track hypothesis is initiated.
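The track bookkeeping in this claim can be sketched as follows. The `Track` class, the `step_tracks` function, and the dictionary-based track set are hypothetical names and structures introduced only for illustration; the association result `matches` is assumed to come from the data association step.

```python
class Track:
    def __init__(self, track_id, pose):
        self.track_id = track_id
        self.pose = pose
        self.counter = 0  # frames since the last successful association


def step_tracks(tracks, matches, detections, a_max, next_id):
    """Apply one frame of track bookkeeping and return the next unused track id.

    tracks: dict mapping track id -> Track (modified in place)
    matches: dict mapping track id -> index into detections
    detections: list of poses detected in the current frame
    a_max: maximum number of frames a track may stay unassociated
    """
    for track in tracks.values():
        track.counter += 1            # incremented during the prediction phase
    for track_id, det_idx in matches.items():
        tracks[track_id].pose = detections[det_idx]
        tracks[track_id].counter = 0  # reset on successful association
    for track_id in [t for t, tr in tracks.items() if tr.counter > a_max]:
        del tracks[track_id]          # considered to have left the scene
    matched = set(matches.values())
    for idx, pose in enumerate(detections):
        if idx not in matched:        # unassociated detection: new track hypothesis
            tracks[next_id] = Track(next_id, pose)
            next_id += 1
    return next_id


# One person appears, is missed for a frame, is re-associated, then leaves.
tracks = {}
next_id = step_tracks(tracks, {}, ["pose_a"], 2, 0)
next_id = step_tracks(tracks, {}, [], 2, next_id)
counter_after_miss = tracks[0].counter
next_id = step_tracks(tracks, {0: 0}, ["pose_b"], 2, next_id)
counter_after_match = tracks[0].counter
for _ in range(3):
    next_id = step_tracks(tracks, {}, [], 2, next_id)
```

With `a_max` set to 2, the track survives two unassociated frames and is deleted on the third.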
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211007781.8A CN115345906A (en) | 2022-08-22 | 2022-08-22 | Human body posture tracking method based on millimeter wave radar |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211007781.8A CN115345906A (en) | 2022-08-22 | 2022-08-22 | Human body posture tracking method based on millimeter wave radar |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115345906A (en) | 2022-11-15 |
Family
ID=83954435
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211007781.8A Pending CN115345906A (en) | 2022-08-22 | 2022-08-22 | Human body posture tracking method based on millimeter wave radar |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115345906A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115798053A (en) * | 2023-01-31 | 2023-03-14 | 中国科学技术大学 | Training method of human body posture estimation model, and human body posture estimation method and device |
CN116013548A (en) * | 2022-12-08 | 2023-04-25 | 广州视声健康科技有限公司 | Intelligent ward monitoring method and device based on computer vision |
CN116602663A (en) * | 2023-06-02 | 2023-08-18 | 深圳市震有智联科技有限公司 | Intelligent monitoring method and system based on millimeter wave radar |
CN118068318A (en) * | 2024-04-17 | 2024-05-24 | 德心智能科技(常州)有限公司 | Multimode sensing method and system based on millimeter wave radar and environment sensor |
CN118592943A (en) * | 2024-08-07 | 2024-09-06 | 宁波星巡智能科技有限公司 | Human fall detection method, device and equipment based on key point sequence analysis |
CN118806269A (en) * | 2024-09-10 | 2024-10-22 | 河北锐景能源科技有限公司 | A human fall detection method based on millimeter wave radar |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116013548A (en) * | 2022-12-08 | 2023-04-25 | 广州视声健康科技有限公司 | Intelligent ward monitoring method and device based on computer vision |
CN116013548B (en) * | 2022-12-08 | 2024-04-09 | 广州视声健康科技有限公司 | Intelligent ward monitoring method and device based on computer vision |
CN115798053A (en) * | 2023-01-31 | 2023-03-14 | 中国科学技术大学 | Training method of human body posture estimation model, and human body posture estimation method and device |
CN116602663A (en) * | 2023-06-02 | 2023-08-18 | 深圳市震有智联科技有限公司 | Intelligent monitoring method and system based on millimeter wave radar |
CN116602663B (en) * | 2023-06-02 | 2023-12-15 | 深圳市震有智联科技有限公司 | Intelligent monitoring method and system based on millimeter wave radar |
CN118068318A (en) * | 2024-04-17 | 2024-05-24 | 德心智能科技(常州)有限公司 | Multimode sensing method and system based on millimeter wave radar and environment sensor |
CN118592943A (en) * | 2024-08-07 | 2024-09-06 | 宁波星巡智能科技有限公司 | Human fall detection method, device and equipment based on key point sequence analysis |
CN118592943B (en) * | 2024-08-07 | 2024-11-12 | 宁波星巡智能科技有限公司 | Human fall detection method, device and equipment based on key point sequence analysis |
CN118806269A (en) * | 2024-09-10 | 2024-10-22 | 河北锐景能源科技有限公司 | A human fall detection method based on millimeter wave radar |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115345906A (en) | Human body posture tracking method based on millimeter wave radar | |
CN109460702B (en) | Passenger abnormal behavior identification method based on human body skeleton sequence | |
Madden et al. | Tracking people across disjoint camera views by an illumination-tolerant appearance representation | |
Li et al. | A generic approach to simultaneous tracking and verification in video | |
US20210065384A1 (en) | Target tracking method, device, system and non-transitory computer readable storage medium | |
Chen et al. | Human action recognition using star skeleton | |
Ahmed et al. | Vision based hand gesture recognition using dynamic time warping for Indian sign language | |
CN110378259A (en) | A kind of multiple target Activity recognition method and system towards monitor video | |
Chen et al. | Object tracking across non-overlapping views by learning inter-camera transfer models | |
US20100067741A1 (en) | Real-time tracking of non-rigid objects in image sequences for which the background may be changing | |
CN114187665B (en) | Multi-person gait recognition method based on human skeleton heat map | |
CN111161320A (en) | Target tracking method, target tracking device and computer readable medium | |
JP2017191501A (en) | Information processing apparatus, information processing method, and program | |
CN113608663B (en) | Fingertip tracking method based on deep learning and K-curvature method | |
Satta et al. | Real-time Appearance-based Person Re-identification Over Multiple KinectTM Cameras. | |
CN112989889B (en) | Gait recognition method based on gesture guidance | |
CN107833239B (en) | Optimization matching target tracking method based on weighting model constraint | |
Zhao et al. | An approach based on mean shift and kalman filter for target tracking under occlusion | |
Bouaynaya et al. | A complete system for head tracking using motion-based particle filter and randomly perturbed active contour | |
Sun et al. | Device-free human localization using panoramic camera and indoor map | |
Dou et al. | Robust visual tracking based on joint multi-feature histogram by integrating particle filter and mean shift | |
Zafar et al. | Human silhouette extraction on FPGAs for infrared night vision military surveillance | |
KR102614895B1 (en) | Real-time object tracking system and method in moving camera video | |
Chowdhury et al. | Human detection and localization in secure access control by analysing facial features | |
Singh et al. | Autonomous Multiple Gesture Recognition system for disabled people |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||