CN108520199B - Human body action open-set recognition method based on radar images and a generative adversarial model

Human body action open-set recognition method based on radar images and a generative adversarial model

Publication number
CN108520199B
CN108520199B CN201810177104.8A CN201810177104A CN108520199B CN 108520199 B CN108520199 B CN 108520199B CN 201810177104 A CN201810177104 A CN 201810177104A CN 108520199 B CN108520199 B CN 108520199B
Authority
CN
China
Prior art keywords
human body
discriminator
model
radar
time
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810177104.8A
Other languages
Chinese (zh)
Other versions
CN108520199A (en)
Inventor
汪清
郎玥
侯春萍
杨阳
管岱
黄丹阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88 Radar or analogous systems specially adapted for specific applications
    • G01S13/89 Radar or analogous systems specially adapted for specific applications for mapping or imaging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention relates to the fields of radar technology and human action recognition. It aims to provide an open-set human action recognition method based on radar images and a generative adversarial model that directly determines whether the action in an input image belongs to a known or an unknown class and outputs its class information, achieving end-to-end open-set recognition of human actions. To this end, the method exploits the fact that radar micro-Doppler images reflect the micro-motions of the human body, and uses the discriminator of the generative adversarial model as the open-set recognizer. The invention is mainly applicable to radar technology and human action recognition.

Description

Human body action open-set recognition method based on radar images and a generative adversarial model
Technical Field
The invention relates to the fields of radar technology and human action recognition, and in particular to an open-set action recognition method based on a generative adversarial model.
Background
In recent decades, human action recognition has attracted wide attention in many fields. Growing demand in entertainment, medical monitoring, security, emergency rescue, and other areas makes it a topic with broad application prospects. Human action recognition initially relied mainly on data from visual sensors and achieved many results with the help of computer vision. Depth sensors then advanced the field further: sensors such as Kinect give researchers a simple way to obtain depth information. However, these sensors are highly susceptible to environmental factors such as lighting, occlusion, and weather, which are hard to avoid in practice, so their robustness across application scenarios is limited.
Radar is largely unaffected by environmental factors such as weather and can operate in all weather, so it has become a common new sensor for human action recognition. The micro-Doppler effect in radar echoes refers to the small Doppler shifts caused by micro-motions of parts of the target (for example hands, feet, and limbs) during its sustained movement. These features appear in the spectrogram once the radar signal is visualized. Existing research based on micro-Doppler images includes manually extracting features (such as torso frequency, total signal bandwidth, and period) from a radar image and then classifying with a Support Vector Machine (SVM), k-Nearest Neighbor (kNN), or a similar classifier. Compared with methods that require manual feature extraction, a Convolutional Neural Network (CNN) has good nonlinear mapping capability and can extract implicit image features on its own, and is therefore widely used.
Current human action recognition is mostly based on closed-set data, that is, the test set comes from the same source and contains the same classes as the training set. In a real environment, however, human actions are highly varied, and it is clearly difficult to construct a data set that labels every action. Even when an action class can be defined for a fixed scene, different people perform the same action very differently. Action recognition in a real environment should therefore be treated as an open-set recognition problem, in which the data contain both known and unknown classes, and solving it requires a model that can automatically distinguish unknown classes from known ones.
Several open-set recognition methods have been proposed. W. Scheirer et al. proposed the 1-vs-Set machine, one of the pioneering studies in open-set recognition. The Compact Abating Probability (CAP) model was then proposed and combined with statistical Extreme Value Theory (EVT) to form the Weibull-calibrated support vector machine (W-SVM), which experiments showed to improve recognition over the 1-vs-Set machine. Abhijit Bendale and Terrance Boult extended the Nearest Class Mean (NCM) algorithm into the Nearest Non-Outlier (NNO) algorithm, which balances recognition accuracy against the degree of diversity. Open-set recognition with deep learning has also become an emerging direction: Abhijit Bendale and Terrance Boult proposed a new layer structure, the OpenMax layer, but it depends on a pre-trained model, which limits its usability on other recognition tasks.
In summary, existing open-set methods depend on a well-chosen probability threshold and therefore lack robustness on other tasks. In addition, given the drawbacks of visual and depth sensors and the advantages of radar, open-set recognition of human actions from radar micro-Doppler images remains a problem to be solved.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention aims to provide an open-set human action recognition method based on radar images and a generative adversarial model that directly determines whether the action in an input image belongs to a known or an unknown class and outputs its class information, achieving end-to-end open-set recognition of human actions. To this end, the method exploits the fact that radar micro-Doppler images reflect the micro-motions of the human body, and uses the discriminator of the generative adversarial model as the open-set recognizer.
The specific steps are as follows:
Step one: transmit and receive human echo signals with an ultra-wideband radar module; after acquisition, preprocess the data by applying short-time Fourier transform, noise cancellation, and other operations to the echo signal, and determine the useful signal interval;
Step two: further suppress noise by setting a threshold, displaying only the points in the radar micro-Doppler image whose echo intensity exceeds the threshold;
Step three: label the collected data and determine the training set, validation set, and test set;
Step four: build a generative adversarial network (GAN) model using the densely connected network (DenseNet) structure, and apply a softmax function at the output of the model's discriminator to map the output probability of the original GAN to a probability for each category;
Step five: train the generative adversarial model of step four with the training set determined in step three; after training, use the discriminator weights to classify the test set data and verify the model's open-set recognition performance.
Further, the ultra-wideband radar used in step one is a PulsON 440 radar module with an operating frequency of 3.1 GHz to 4.8 GHz. Two directional antennas receive the human echo signals during data acquisition, and data are acquired in an indoor environment. Seven typical human actions are collected: walking, boxing, crawling, sneaking, standing, forward standing jump, and running. Each subject repeats each action three times, and each acquisition lasts 7 seconds.
The short-time Fourier transform treats a non-stationary process as a series of short-time stationary signals: the signal is windowed, and a Fourier transform is applied to the samples inside each window to obtain the time-varying spectrum of the signal. The short-time Fourier transform is:
$$G_f(\omega, \tau) = \int_{-\infty}^{\infty} f(t)\, g(t - \tau)\, e^{-j\omega t}\, dt$$
where $\tau$ is the time position of the window, $\omega$ is the angular frequency, $t$ is time, $j$ is the imaginary unit, $e$ is the natural constant, $g(t)$ is the window function, $f(t)$ is the acquired human echo signal, and $G_f(\cdot)$ is the transformed time-varying spectrum.
Noise cancellation uses the mean background cancellation method: the column vector of mean echo intensities is subtracted from the whole echo signal.
The useful signal interval is determined by locating the interval that contains human motion in the time-distance image of the signal and then setting appropriate start and end times for the time-frequency transform.
Specifically, the intensity threshold in step two, below which points are not displayed, is selected manually, and noise is filtered with a segmented threshold method.
Specifically, in step three the radar micro-Doppler images are labeled: the seven actions walking, boxing, crawling, sneaking, standing, forward standing jump, and running are labeled in order with the numbers 0 to 6, and the images generated in steps one and two are then divided into a training set, a validation set, and a test set in a 4:2:1 ratio.
Specifically, the densely connected network in step four has two parts, the connection block and the transition layer:
Each connection block consists of two convolution layers and a concatenation layer; the block concatenates the features of all preceding layers as the input to the current layer. Each convolution layer is followed by Batch Normalization (BN) and a Rectified Linear Unit (ReLU) or a Leaky Rectified Linear Unit (Leaky ReLU). ReLU and Leaky ReLU are defined as:
$$\mathrm{ReLU}(p) = \max(0, p)$$
$$\mathrm{LeakyReLU}(p) = \begin{cases} p, & p > 0 \\ a\,p, & p \le 0 \end{cases}$$
where $p$ is the input to the unit and $a$ is a small positive slope coefficient;
the transition layer is the part between two connection blocks. In the generator of the generative adversarial model, the transition layer consists of a convolution layer and a deconvolution (transposed convolution) layer; in the discriminator, it consists of a convolution layer and an average pooling layer.
The generative adversarial model in step four consists of a generator and a discriminator. The generator takes random samples from a latent space as input, and its output should imitate the real samples in the training set as closely as possible. The input to the discriminator is either a real sample or the output of the generator, and its goal is to distinguish the generator's output from real samples as well as possible, while the generator tries to fool the discriminator. The two networks compete and continually adjust the weights of every layer until the discriminator can no longer tell whether the generator's output is real. The objective function V(D, G) of the generative adversarial network is:
$$\min_G \max_D V(D, G) = E_{x \sim p_{data}(x)}[\log D(x)] + E_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
where G denotes the generator, D denotes the discriminator, x denotes an input sample, z denotes the input random variable, min(·) denotes minimization, max(·) denotes maximization, log(·) is the base-10 logarithm, E(·) denotes expectation, $p_{data}(x)$ denotes the distribution of the real samples, and $p_z(z)$ denotes the random distribution from which z is drawn. The output of the discriminator uses a softmax function, which maps an arbitrary K-dimensional real vector to another K-dimensional real vector whose elements all lie in (0, 1). The softmax function has the form:
$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \dots, K$$
where $z_j$ denotes the j-th element, $z_k$ denotes the k-th element, $e$ is the natural constant, and $\sigma(z)_j$ is the softmax value of the j-th element;
in this way, the output of the discriminator can be read as the probability of the input image for each action category, and the category with the highest probability is the one the discriminator assigns to the input image.
During network training, Adaptive Moment Estimation (Adam) is used to optimize the network weights, and a gradient penalty strategy is also adopted, that is, the following penalty term is added to the objective function:
$$\lambda\, E_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$$
where $\lambda = 10$, $\hat{x} = \alpha \tilde{x} + (1 - \alpha) x$, $\alpha$ is a random variable between 0 and 1, $\tilde{x}$ denotes a fake sample produced by the generator, $x$ denotes a real sample, $\nabla$ denotes the gradient, and E(·) denotes expectation. The objective function V(D, G) then becomes:
Figure BDA0001587629520000046
The model is evaluated with the F-measure versus openness curve, a common metric in open-set recognition, where the F-measure is defined as:
$$F\text{-measure} = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
where
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
TP is the number of positive samples the model predicts as positive, TN is the number of negative samples the model predicts as negative, FP is the number of negative samples the model predicts as positive, and FN is the number of positive samples the model predicts as negative.
The invention has the characteristics and beneficial effects that:
The invention recognizes human actions from radar micro-Doppler images, which avoids the susceptibility of other sensors to environmental influences and captures micro-motions well. By exploiting the properties of the generative adversarial model, the invention solves the end-to-end recognition of unknown actions, that is, the open-set recognition of human actions; the algorithm has low complexity and practical application value.
Description of the drawings:
FIG. 1 is a block diagram of a human body motion open set identification method based on radar micro Doppler images;
FIG. 2 is a time-distance image of a radar echo signal;
FIG. 3 is an example of micro-Doppler images for various motions;
FIG. 4 is a schematic diagram of the generator structure in the generative adversarial model;
FIG. 5 is a schematic diagram of the discriminator structure in the generative adversarial model;
FIG. 6 is a graph comparing the results of the experiment of the present invention with those of other methods.
Detailed Description
To solve the above problems, the invention provides an open-set human action recognition method based on radar images and a generative adversarial model.
To this end, the method comprises the following steps:
Step one: transmit and receive human echo signals with an ultra-wideband radar module; after acquisition, preprocess the data by applying short-time Fourier transform, noise cancellation, and other operations to the echo signal, and determine the useful signal interval.
Step two: further suppress noise by setting a threshold, displaying only the points in the radar micro-Doppler image whose echo intensity exceeds the threshold.
Step three: label the collected data and determine the training set, validation set, and test set.
Step four: build a generative adversarial network (GAN) model using the densely connected network (DenseNet) structure, and apply a softmax function at the output of the model's discriminator to map the output probability of the original GAN to a probability for each category.
Step five: train the generative adversarial model of step four with the training set determined in step three; after training, use the discriminator weights to classify the test set data and verify the model's open-set recognition performance.
Specifically, the ultra-wideband radar used in step one is a PulsON 440 radar module with an operating frequency of 3.1 GHz to 4.8 GHz, and two directional antennas receive the human echo signals during data acquisition. Data are collected in an indoor environment. Seven typical human actions are collected in the experiment: walking, boxing, crawling, sneaking, standing, forward standing jump, and running. Each subject repeats each action three times, and each acquisition lasts about 7 seconds.
The short-time Fourier transform windows the signal along the time dimension and applies a Fourier transform to the samples inside each window to obtain the time-varying spectrum of the signal. The short-time Fourier transform can be written as:
$$G_f(\omega, \tau) = \int_{-\infty}^{\infty} f(t)\, g(t - \tau)\, e^{-j\omega t}\, dt$$
where $\tau$ is the time position of the window, $\omega$ is the angular frequency, $t$ is time, $j$ is the imaginary unit, $e$ is the natural constant, $g(t)$ is the window function, $f(t)$ is the acquired human echo signal, and $G_f(\cdot)$ is the transformed time-varying spectrum.
The noise cancellation adopts a mean background cancellation method, namely, the column vector of the echo intensity average value is subtracted from the whole echo signal.
The method for determining the useful signal interval is to determine the interval with human body motion through the time-distance image of the signal and then reasonably set the time starting point and the time ending point for time-frequency conversion.
Specifically, the intensity threshold in step two, below which points are not displayed, is selected manually. During data acquisition, some human movements go from far to near, so the echo intensity tends to increase gradually. A single uniform threshold would either filter too little noise from the short-range echoes or filter too much from the long-range echoes, losing detail in the human echo signal. The invention therefore takes the influence of distance into account and filters noise with a segmented threshold method.
Specifically, in step three the radar micro-Doppler images are labeled: the seven actions walking, boxing, crawling, sneaking, standing, forward standing jump, and running are labeled in order with the numbers 0 to 6, and the images generated in steps one and two are then divided into a training set, a validation set, and a test set in a 4:2:1 ratio.
Specifically, the densely connected network in step four has two parts, the connection block and the transition layer, which are described below.
Each connection block consists of two convolution layers and a concatenation layer; the block concatenates the features of all preceding layers as the input to the current layer. Each convolution layer is followed by Batch Normalization (BN) and a Rectified Linear Unit (ReLU) or a Leaky Rectified Linear Unit (Leaky ReLU). ReLU and Leaky ReLU are defined as:
$$\mathrm{ReLU}(p) = \max(0, p)$$
$$\mathrm{LeakyReLU}(p) = \begin{cases} p, & p > 0 \\ a\,p, & p \le 0 \end{cases}$$
where $p$ is the input to the unit and $a$ is a small positive slope coefficient.
The transition layer is the part between two connection blocks. In the generator of the generative adversarial model, the transition layer consists of a convolution layer and a deconvolution (transposed convolution) layer; in the discriminator, it consists of a convolution layer and an average pooling layer.
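The dense connectivity of a connection block can be illustrated with a minimal NumPy sketch. Several things here are assumptions made only for illustration and do not come from the patent: the convolutions are 1x1 (pure channel mixing), batch normalization is computed per channel on a single sample without learned scale or shift, the Leaky ReLU slope is 0.2, and the names `dense_block`, `conv1x1`, and `growth` are hypothetical.

```python
import numpy as np

def leaky_relu(p, slope=0.2):
    # The slope value is an assumption; the text does not state it.
    return np.where(p > 0, p, slope * p)

def conv1x1(x, weight):
    """1x1 convolution: mixes channels at each pixel.

    x has shape (C_in, H, W); weight has shape (C_out, C_in).
    """
    return np.einsum("oc,chw->ohw", weight, x)

def batch_norm(x, eps=1e-5):
    """Simplified per-channel normalization over spatial positions."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def dense_block(x, w1, w2):
    """Two conv layers; each layer sees the concatenation of all earlier features."""
    h1 = leaky_relu(batch_norm(conv1x1(x, w1)))
    in2 = np.concatenate([x, h1], axis=0)   # block input + first layer output
    h2 = leaky_relu(batch_norm(conv1x1(in2, w2)))
    return np.concatenate([x, h1, h2], axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))              # toy 3-channel feature map
growth = 4                                  # channels added per layer (hypothetical)
w1 = rng.normal(size=(growth, 3))
w2 = rng.normal(size=(growth, 3 + growth))
out = dense_block(x, w1, w2)
```

The output keeps the original input channels and appends the features of both layers, so the channel count grows from 3 to 3 + 4 + 4 in this toy setting.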
The generative adversarial model in step four consists of a generator and a discriminator. The generator takes random samples from the latent space as input, and its output should imitate the real samples in the training set as closely as possible. The input to the discriminator is either a real sample or the output of the generator, and its goal is to distinguish the generator's output from real samples as well as possible. The generator, in turn, tries to fool the discriminator. The two networks compete and continually adjust the weights of every layer until the discriminator can no longer tell whether the generator's output is real. The objective function V(D, G) of the generative adversarial network can be expressed as:
$$\min_G \max_D V(D, G) = E_{x \sim p_{data}(x)}[\log D(x)] + E_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
where G denotes the generator, D denotes the discriminator, x denotes an input sample, z denotes the input random variable, min(·) denotes minimization, max(·) denotes maximization, log(·) is the base-10 logarithm, E(·) denotes expectation, $p_{data}(x)$ denotes the distribution of the real samples, and $p_z(z)$ denotes the random distribution from which z is drawn. In addition, the output of the discriminator uses a softmax function, which maps an arbitrary K-dimensional real vector to another K-dimensional real vector whose elements all lie in (0, 1). The softmax function has the form:
$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \dots, K$$
where $z_j$ denotes the j-th element, $z_k$ denotes the k-th element, $e$ is the natural constant, and $\sigma(z)_j$ is the softmax value of the j-th element.
In this way, the output of the discriminator can be understood as the probability of the input image in each motion category, and the one with the highest probability is the category of the input image determined by the discriminator.
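As a small illustration of reading the discriminator output as class probabilities, the following sketch applies softmax to a vector of scores and picks the most probable category. The score values and the choice of K = 4 are made up for the example; subtracting the maximum before exponentiating is a standard numerical safeguard, not something the patent specifies.

```python
import numpy as np

def softmax(z):
    """Map a K-dimensional real vector to probabilities that sum to 1."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max()        # leaves the result unchanged, avoids overflow
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw discriminator scores for K = 4 categories.
scores = np.array([2.0, 0.5, -1.0, 0.0])
probs = softmax(scores)
predicted = int(probs.argmax())  # index of the category with highest probability
```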
During the network training in step five, a first-order optimization algorithm, Adaptive Moment Estimation (Adam), is used to optimize the network weights; Adam controls the step size of each update and dynamically adjusts the learning rate of each weight. In addition, to prevent vanishing gradients, the invention adopts a gradient penalty strategy, that is, the following penalty term is added to the objective function:
$$\lambda\, E_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$$
where $\lambda = 10$, $\hat{x} = \alpha \tilde{x} + (1 - \alpha) x$, $\alpha$ is a random variable between 0 and 1, $\tilde{x}$ denotes a fake sample produced by the generator, $x$ denotes a real sample, $\nabla$ denotes the gradient, and E(·) denotes expectation. The objective function V(D, G) of the present invention then becomes:
Figure BDA00015876295200000610
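The gradient penalty can be sketched numerically as follows. This is only an illustration under stated assumptions: the discriminator is replaced by a toy function D(x) = tanh(w · x) whose input gradient is known in closed form, the penalty is taken to be the commonly used two-sided form that pushes the gradient norm toward 1, and α is drawn uniformly from [0, 1]. The names `critic`, `critic_grad`, and `gradient_penalty` are hypothetical.

```python
import numpy as np

def critic(x, w):
    """Toy scalar discriminator D(x) = tanh(w . x), a stand-in for the real network."""
    return np.tanh(x @ w)

def critic_grad(x, w):
    """Analytic gradient of D with respect to each input row of x."""
    return (1.0 - np.tanh(x @ w) ** 2)[:, None] * w[None, :]

def gradient_penalty(real, fake, w, rng, lam=10.0):
    """lam * mean((||grad D(x_hat)||_2 - 1)^2) on random interpolates."""
    alpha = rng.uniform(0.0, 1.0, size=(real.shape[0], 1))
    x_hat = alpha * fake + (1.0 - alpha) * real   # point between fake and real
    grad_norm = np.linalg.norm(critic_grad(x_hat, w), axis=1)
    return lam * np.mean((grad_norm - 1.0) ** 2)

rng = np.random.default_rng(0)
real = rng.normal(size=(16, 5))   # stand-in for real samples
fake = rng.normal(size=(16, 5))   # stand-in for generator outputs
w = rng.normal(size=5)
penalty = gradient_penalty(real, fake, w, rng)
```

In an actual training loop the gradient would come from automatic differentiation of the discriminator network rather than from a closed-form expression.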
The invention evaluates the model with the F-measure versus openness curve, a common metric in open-set recognition, where the F-measure is defined as:
$$F\text{-measure} = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
where
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
TP is the number of positive samples the model predicts as positive, TN is the number of negative samples the model predicts as negative, FP is the number of negative samples the model predicts as positive, and FN is the number of positive samples the model predicts as negative.
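A short worked example of the F-measure, using made-up counts (80 true positives, 20 false positives, 10 false negatives) purely for illustration. Returning 0 when a denominator is zero is a convention chosen for this sketch.

```python
def f_measure(tp, fp, fn):
    """F-measure from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: precision = 80/100, recall = 80/90.
score = f_measure(tp=80, fp=20, fn=10)
```

Algebraically the same value equals 2TP / (2TP + FP + FN), which gives 160/190 for these counts.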
The invention provides an open-set human action recognition method based on radar images and a generative adversarial model, which comprises five steps as shown in FIG. 1. The invention is further explained below with reference to the figures and examples.
First, collect human action data.
The invention uses an ultra-wideband PulsON 440 radar module to transmit and receive human echo signals. The radar operating frequency is 3.1 GHz to 4.8 GHz, the sampling frequency is 16 GHz, the pulse repetition frequency (PRF) is 368 Hz, and the coherent pulse integration interval (CPI) is 0.2 s. During data acquisition, two directional antennas at a height of about 1.2 m receive the human echo signals.
In the experiment, four subjects perform seven typical human actions along the radar line of sight: walking, boxing, crawling, sneaking, standing, forward standing jump, and running. Each subject repeats each action three times, and each acquisition lasts about 7 seconds. After the raw echo data are obtained, the corresponding time-distance image is produced (as shown in FIG. 2), the interval containing human motion is determined from this image, and appropriate start and end times for the time-frequency transform are then set. Next, mean background cancellation is applied. The echo signal is treated as a two-dimensional matrix, and the mean of each row is computed and denoted $m_i$, where i is the row index, giving the column vector of mean echo intensities $M' = [m_1, m_2, \dots, m_n]^T$. This vector is extended to an n × n mean matrix:
$$M = \begin{bmatrix} m_1 & m_1 & \cdots & m_1 \\ m_2 & m_2 & \cdots & m_2 \\ \vdots & \vdots & \ddots & \vdots \\ m_n & m_n & \cdots & m_n \end{bmatrix}$$
Subtracting the mean matrix from the original data matrix element by element gives the signal matrix after cancellation.
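Mean background cancellation as described above can be sketched in a few lines of NumPy. The interpretation of rows as range bins and columns as successive pulses is an assumption made for the example, and the function name `mean_background_cancel` is hypothetical; broadcasting plays the role of the explicit mean matrix.

```python
import numpy as np

def mean_background_cancel(echo):
    """Subtract each row's mean from that row of a 2-D echo matrix."""
    echo = np.asarray(echo, dtype=float)
    row_means = echo.mean(axis=1, keepdims=True)  # column vector [m_1, ..., m_n]^T
    return echo - row_means                       # broadcasting repeats m_i along row i

# Toy data: a constant background per row plus a zero-mean moving component.
background = np.array([[5.0], [2.0], [-1.0]])
motion = np.array([[0.0, 1.0, -1.0, 0.0],
                   [0.5, -0.5, 0.5, -0.5],
                   [0.0, 0.0, 2.0, -2.0]])
cancelled = mean_background_cancel(background + motion)
```

Because each row of the toy motion component has zero mean, the cancellation removes exactly the constant background and returns the motion component.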
A Short-Time Fourier Transform (STFT) is then applied to the signal matrix. The STFT treats a non-stationary process as a superposition of short-time stationary signals: the signal is windowed along the time dimension, and a Fourier transform is applied to the samples inside each window to obtain the time-varying spectrum of the signal. The short-time Fourier transform can be written as:
$$G_f(\omega, \tau) = \int_{-\infty}^{\infty} f(t)\, g(t - \tau)\, e^{-j\omega t}\, dt$$
where $\tau$ is the time position of the window, $\omega$ is the angular frequency, $t$ is time, $j$ is the imaginary unit, $e$ is the natural constant, $g(t)$ is the window function, $f(t)$ is the acquired human echo signal, and $G_f(\cdot)$ is the transformed time-varying spectrum. The invention uses a time window of 0.1 second, an overlap ratio of 0.9 between adjacent windows, and 1024 Fourier transform points per window.
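The STFT settings quoted above (0.1 s window, 0.9 overlap, 1024 FFT points) can be tried on a synthetic signal with the NumPy sketch below. The Hann window, the use of the 368 Hz pulse repetition frequency as the sample rate, and the 50 Hz test tone are assumptions made for illustration; the function name `stft_spectrogram` is hypothetical.

```python
import numpy as np

def stft_spectrogram(signal, fs, win_sec=0.1, overlap=0.9, nfft=1024):
    """Magnitude spectrogram from a sliding windowed FFT."""
    signal = np.asarray(signal)
    win_len = int(round(win_sec * fs))
    hop = max(1, int(round(win_len * (1.0 - overlap))))
    window = np.hanning(win_len)                  # window choice is an assumption
    starts = range(0, len(signal) - win_len + 1, hop)
    frames = np.stack([signal[s:s + win_len] * window for s in starts])
    # Zero-pad each frame to nfft points and centre zero frequency.
    spectrum = np.fft.fftshift(np.fft.fft(frames, n=nfft, axis=1), axes=1)
    freqs = np.fft.fftshift(np.fft.fftfreq(nfft, d=1.0 / fs))
    times = (np.array(list(starts)) + win_len / 2) / fs
    return freqs, times, np.abs(spectrum).T       # shape: (nfft, n_frames)

fs = 368.0                                        # assumed slow-time sample rate in Hz
t = np.arange(0, 7.0, 1.0 / fs)                   # 7 s acquisition
tone = np.exp(2j * np.pi * 50.0 * t)              # synthetic 50 Hz Doppler tone
freqs, times, spec = stft_spectrogram(tone, fs)
peak_hz = freqs[spec[:, 0].argmax()]              # strongest frequency in the first frame
```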
Second, set the display intensity threshold.
And further eliminating noise interference by using a method for setting a threshold value, and only displaying points with the echo intensity larger than the threshold value in the radar micro Doppler image.
The method needs to set the undisplayed intensity threshold value in a manual selection mode when the radar micro Doppler image is generated. The movement range of the testee is within the range of 1.2 meters to 5.4 meters away from the radar in the data acquisition process, and the echo intensity inevitably tends to be gradually increased because some human body movement is from far to near. If a uniform threshold value is adopted, the problems that the noise of the short-distance echo signal is not sufficiently filtered, or the noise of the long-distance echo signal is excessively filtered, so that the detail information of the human body echo signal is lost can be caused.
Therefore, the invention takes the influence of distance into account and filters out noise with a segmented threshold. Let Max denote the maximum signal intensity of a given segment; the minimum intensity to be displayed in each distance range is shown in Table 1.
TABLE 1
Distance       Minimum intensity value
1.2-2 m        Max-90
2-3.2 m        Max-80
3.2-4.5 m      Max-70
4.5-5.4 m      Max-60
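The segmented threshold of Table 1 can be sketched as follows. This is only an illustration: the handling of range boundaries, the use of NaN to mark points that are not displayed, and the assumption that the function receives the sub-image of one distance segment are choices made here, not details given in the text.

```python
import numpy as np

# (lower bound in m, upper bound in m, offset below Max), copied from Table 1.
SEGMENTS = [(1.2, 2.0, 90.0), (2.0, 3.2, 80.0), (3.2, 4.5, 70.0), (4.5, 5.4, 60.0)]

def min_display_intensity(distance_m, segment_max):
    """Minimum intensity displayed at a given distance, per Table 1."""
    for low, high, offset in SEGMENTS:
        last = high == SEGMENTS[-1][1]
        # Assumption: each range includes its lower bound; the last range
        # also includes its upper bound of 5.4 m.
        if low <= distance_m < high or (last and distance_m == high):
            return segment_max - offset
    raise ValueError(f"distance {distance_m} m is outside the 1.2-5.4 m range")

def apply_threshold(segment_image, distance_m):
    """Keep only points stronger than the segment threshold (others become NaN)."""
    segment_max = float(np.max(segment_image))
    floor = min_display_intensity(distance_m, segment_max)
    return np.where(segment_image > floor, segment_image, np.nan)


example = apply_threshold(np.array([[100.0, 35.0, 45.0]]), distance_m=5.0)
```

In the example, the segment maximum is 100 and the distance falls in the 4.5-5.4 m range, so only values above 40 are kept.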
Thirdly, construct the data set.
After the first two steps, 700 images are obtained for each action; a schematic radar micro-Doppler image of each action is shown in FIG. 3. The generated radar micro-Doppler images are labeled, with the numbers 0 to 6 representing the seven actions of walking, boxing, ground crawling, sneaking, standing, forward standing jump, and running, respectively. The images of each action are then divided into a training set, a validation set, and a test set in a ratio of 4:2:1. Thus, for each action, a training set of 400 images, a validation set of 200 images, and a test set of 100 images is obtained.
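The labeling and the 4:2:1 split described above can be sketched in a few lines. The random shuffle and the fixed seed are assumptions; the text does not say how images are assigned to the three sets.

```python
import random

# Labels 0 to 6 in the order given in the text.
ACTION_LABELS = {
    "walking": 0, "boxing": 1, "ground crawling": 2, "sneaking": 3,
    "standing": 4, "forward standing jump": 5, "running": 6,
}

def split_indices(n_images=700, ratio=(4, 2, 1), seed=0):
    """Split image indices of one action into training, validation and test sets."""
    total = sum(ratio)
    n_train = n_images * ratio[0] // total
    n_val = n_images * ratio[1] // total
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)   # assumed: random assignment
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices()
```

With 700 images per action this yields 400, 200, and 100 indices, matching the set sizes stated in the text.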
For the open set identification problem, known classes and unknown classes are defined: a class that appears in the test set but is not contained in the training set is called an "unknown class" and is denoted U; a class that appears in both the test set and the training set is called a "known class" and is denoted K. The invention verifies the effect of the model on data sets with different degrees of openness.
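The definitions of K and U amount to simple set operations on class labels. In the sketch below, the choice of which four actions appear in the training set is hypothetical and only serves to show the partition.

```python
def partition_classes(train_classes, test_classes):
    """Return (known, unknown) class lists following the definitions of K and U."""
    train, test = set(train_classes), set(test_classes)
    known = sorted(train & test)      # K: present in both training and test sets
    unknown = sorted(test - train)    # U: present only in the test set
    return known, unknown


# Hypothetical configuration: actions 0-3 are trained on, all seven are tested.
known, unknown = partition_classes(train_classes=[0, 1, 2, 3], test_classes=range(7))
```

Varying the number of classes held out of the training set is one way to obtain data sets with different degrees of openness.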
Fourthly, construct the generative adversarial model.
The generative adversarial model adopted by the invention consists of a generator (Generator) and a discriminator (Discriminator). The generator takes as input a variable z randomly sampled from a latent space, and its output should imitate the real samples in the training set as closely as possible. The input of the discriminator is either a real sample or the output of the generator network, and its purpose is to distinguish the output of the generator from the real samples as well as possible, while the generator tries to fool the discriminator as much as possible. The two networks compete with each other and continuously adjust the weights of each network layer; the final goal is that the discriminator can no longer judge whether the output of the generator is real. The objective function V(D, G) of the generative adversarial model can be expressed as follows:
$$\min_G \max_D V(D, G) = E_{x \sim P_{data}(x)}[\log D(x)] + E_{z \sim P_z(z)}[\log(1 - D(G(z)))]$$
where G denotes the generator, D denotes the discriminator, x denotes an input sample, z denotes the input random variable, min(·) denotes the minimization operation, max(·) denotes the maximization operation, log(·) is the logarithm operation with base 10, E(·) denotes the expectation, x ~ P_data(x) denotes that x obeys the data distribution of the real samples, and z ~ P_z(z) denotes that z obeys a random distribution.
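To make the roles of the two expectation terms concrete, the following sketch estimates V(D, G) from batches of discriminator outputs. It assumes the standard two-term form of the objective, with D(x) the discriminator output on real samples and D(G(z)) its output on generated samples, and uses the base-10 logarithm stated in the text; the numeric values are made up for illustration.

```python
import math

def gan_value(d_real, d_fake):
    """Batch estimate of V(D, G) from discriminator outputs in (0, 1).

    d_real: outputs D(x) on real samples.
    d_fake: outputs D(G(z)) on generated samples.
    """
    real_term = sum(math.log10(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log10(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


undecided = gan_value([0.5, 0.5], [0.5, 0.5])   # discriminator cannot tell
confident = gan_value([0.9, 0.9], [0.1, 0.1])   # discriminator separates well
```

A discriminator that separates real from generated samples drives the value up, while a generator that fools the discriminator pulls it back down, which is the competition described above.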
Since the discriminator is itself a binary classifier that judges whether an input image is "real" or "fake", it can also judge whether an input image is "known" or "unknown" in the open set identification problem. In addition, so that the discriminator can simultaneously act as a classifier, its output layer is changed to a softmax function, and the output becomes the probability of the input image belonging to each class. In essence, the softmax function compresses (maps) an arbitrary K-dimensional real vector into another K-dimensional real vector in which every element takes a value in (0, 1). The softmax function has the form:
$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \ldots, K$$
where z_j denotes the j-th element, z_k denotes the k-th element, e is the natural constant, and σ(z)_j denotes the softmax value of the j-th element.
In this way, the output of the discriminator can be interpreted as the probability of the input image belonging to each action category, and the category with the highest probability is the category that the discriminator assigns to the input image.
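The softmax mapping and the highest-probability decision can be sketched as follows. The input scores are invented for illustration, and subtracting the maximum score is a common numerical-stability step that does not change the result.

```python
import math

def softmax(scores):
    """Map K real scores to K probabilities in (0, 1) that sum to 1."""
    shift = max(scores)                       # avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores):
    """Return (index of the most probable class, list of class probabilities)."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical discriminator outputs for the seven action classes.
label, probs = predict_class([0.1, 2.0, -1.0, 0.3, 0.0, 1.2, -0.5])
```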
Both the generator and the discriminator adopt a densely connected network structure, which comprises connection blocks (Dense Block) and transition layers (Transition Layer); the two components are described below.
Each connection block consists of two convolutional layers and a connection operation layer, so that the features of all preceding layers are concatenated as the input of the current layer. Each convolutional layer is followed by a batch normalization (BN) operation and a rectified linear unit (ReLU) or a leaky rectified linear unit (Leaky ReLU). The expressions for ReLU and Leaky ReLU are as follows:
Figure BDA0001587629520000092
Figure BDA0001587629520000093
where p is the input to the cell.
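For reference, the two activation functions can be sketched in their commonly used forms. The negative-side slope of 0.2 for Leaky ReLU is an assumed value; the text does not state the slope used in the invention.

```python
def relu(p):
    """Rectified linear unit: passes positive inputs, outputs 0 otherwise."""
    return max(0.0, p)

def leaky_relu(p, slope=0.2):
    """Leaky ReLU: like ReLU, but negative inputs are scaled by a small slope.

    slope=0.2 is an assumption for illustration only.
    """
    return p if p > 0 else slope * p


outputs = [(relu(p), leaky_relu(p)) for p in (-1.0, 0.0, 3.0)]
```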
The transition layer is the part between two connection blocks. In the generator of the generative adversarial model, the transition layer consists of a convolutional layer and a deconvolution layer; in the discriminator, the transition layer consists of a convolutional layer and a mean pooling layer. FIG. 4 and FIG. 5 show the structures of the generator and the discriminator, respectively. The specific parameters of the model are listed in Table 2, where "n × n deconv" denotes a deconvolution layer with a kernel size of n × n, "n × n conv" denotes a convolutional layer with a kernel size of n × n, "Padding" denotes the number of pixels padded around the image, and "Pooling" denotes a mean pooling operation.
TABLE 2 Specific parameters of each layer of the generative adversarial model
Figure BDA0001587629520000101
Fifthly, train and test the model.
To address the vanishing-gradient problem that commonly arises when training a generative adversarial model, the invention adopts a gradient penalty strategy during training; that is, a penalty term is added to the objective function:
Figure BDA0001587629520000102
where λ = 10,
Figure BDA0001587629520000103
α is a random variable between
Figure BDA0001587629520000104
and 1,
Figure BDA0001587629520000105
denotes the fake samples generated by the generator, x denotes the real samples,
Figure BDA0001587629520000106
denotes the gradient, and E(·) denotes the expectation.
The objective function V(D, G) of the invention is then:
Figure BDA0001587629520000107
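A gradient penalty of this kind can be illustrated with a small NumPy sketch. It assumes the widely used formulation in which the penalty is λ times the mean of (‖∇D(x̂)‖₂ − 1)², evaluated at points x̂ interpolated between real and fake samples with a random α in [0, 1]; the exact expressions of the invention are given in the figures and may differ. The gradient is approximated here by finite differences purely for illustration, whereas a deep learning framework would use automatic differentiation, and the linear "discriminators" are toy stand-ins.

```python
import numpy as np

def numerical_gradient(d, x, eps=1e-5):
    """Central-difference gradient of a scalar function d at point x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step.flat[i] = eps
        grad.flat[i] = (d(x + step) - d(x - step)) / (2 * eps)
    return grad

def gradient_penalty(d, real, fake, lam=10.0, rng=None):
    """lam * mean((||grad D(x_hat)||_2 - 1)^2) over interpolated samples."""
    rng = np.random.default_rng(0) if rng is None else rng
    penalties = []
    for x, x_fake in zip(real, fake):
        alpha = rng.uniform(0.0, 1.0)                  # random mixing weight
        x_hat = alpha * x + (1.0 - alpha) * x_fake     # assumed interpolation
        norm = np.linalg.norm(numerical_gradient(d, x_hat))
        penalties.append((norm - 1.0) ** 2)
    return lam * float(np.mean(penalties))


real = np.ones((4, 2))
fake = np.zeros((4, 2))
w = np.array([0.6, 0.8])                               # unit-norm weights
gp_unit = gradient_penalty(lambda x: float(w @ x), real, fake)
gp_double = gradient_penalty(lambda x: float(2 * w @ x), real, fake)
```

A toy discriminator whose gradient norm is 1 incurs almost no penalty, while one with gradient norm 2 incurs a penalty of about λ = 10.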
During network training, the optimizer adjusts the network weights with a first-order optimization algorithm, Adaptive Moment Estimation (Adam). The Adam method adapts the step size and dynamically adjusts the learning rate of each weight during updating.
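The per-weight step-size adaptation of Adam can be seen in a scalar sketch of its update rule. The hyperparameter values below are the defaults commonly used with Adam and are assumptions; the text does not state the values used for training, and the quadratic toy problem is only for illustration.

```python
import math

def adam_step(weight, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single weight; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    weight = weight - lr * m_hat / (math.sqrt(v_hat) + eps)
    return weight, m, v


# Toy problem: minimise (w - 3)^2, whose gradient is 2 * (w - 3).
first_w, _, _ = adam_step(0.0, 2.0 * (0.0 - 3.0), 0.0, 0.0, t=1)
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2.0 * (w - 3.0), m, v, t, lr=0.05)
```

Because the update divides by the running root-mean-square of the gradient, the very first step has a magnitude close to the learning rate regardless of how large the gradient is.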
The invention evaluates the model with the "F-measure-Openness curve" commonly used in open set identification. The F-measure is defined as follows:
$$F\text{-measure} = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
where
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
TP denotes the number of positive samples predicted as positive by the model, TN denotes the number of negative samples predicted as negative, FP denotes the number of negative samples predicted as positive, and FN denotes the number of positive samples predicted as negative. The F-measure takes values between 0 and 1, and a larger value indicates a better open set identification algorithm.
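The F-measure can be computed directly from the confusion counts. The sketch below uses the standard definitions of precision and recall; returning 0 when a denominator is zero is a convention chosen here, and the counts in the example are invented.

```python
def f_measure(tp, fp, fn):
    """F-measure from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 correct detections, 20 false alarms, 10 misses.
score = f_measure(tp=80, fp=20, fn=10)
```

Here precision is 0.8 and recall is 8/9, giving an F-measure of 16/19, or about 0.842.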
Openness expresses how open the data set is in an open set identification problem, and is defined as:
$$Openness = 1 - \sqrt{\frac{2 \times N_{TA}}{N_{TG} + N_{TE}}}$$
where N_TA denotes the number of classes in the training set, N_TG denotes the number of classes to be identified, and N_TE denotes the number of classes in the test set.
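As a worked example, openness can be evaluated for a few class configurations. The sketch assumes the definition commonly used in the open set recognition literature, 1 − sqrt(2·N_TA / (N_TG + N_TE)), with the symbols as defined above; the class counts are hypothetical.

```python
import math

def openness(n_train, n_target, n_test):
    """Openness from the numbers of training, target and test classes."""
    return 1.0 - math.sqrt(2.0 * n_train / (n_target + n_test))


closed_set = openness(n_train=7, n_target=7, n_test=7)   # every class is known
open_set = openness(n_train=4, n_target=4, n_test=7)     # three unknown classes
```

When all seven actions are seen in training, openness is 0; training on four of the seven raises it to about 0.147, so each point on the F-measure-Openness curve corresponds to one such configuration.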
The experimental results, shown in FIG. 6, indicate that the method improves performance by about ten percent compared with other common open set identification algorithms.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (7)

1. A human body action open set identification method based on radar images and a generated countermeasure model is characterized in that micro Doppler images of radar can reflect the characteristics of human body micromotion, meanwhile, a discriminator in the generated countermeasure model is used as an open set identifier, the known or unknown action types of input images are directly distinguished, and the type information of the input images is output, so that end-to-end open set identification of human body actions is realized; the method comprises the following specific steps:
step one: transmitting and receiving human body echo signals with an ultra-wideband radar module, preprocessing the data after acquisition, and performing short-time Fourier transform and noise cancellation on the echo signals to determine the useful signal interval;
step two: further eliminating noise interference by using a method for setting a threshold value, and only displaying points with the echo intensity greater than the threshold value in the radar micro Doppler image;
step three: calibrating the collected data and determining a training set, a verification set and a test set;
step four: constructing a generative adversarial model GAN (Generative Adversarial Network) using the structure of the densely connected network DenseNet (Densely Connected Convolutional Network), and using a softmax function at the output end of the discriminator of the model to map the output probability of the original generative adversarial network to the probability of each class;
step five: training the generative adversarial model constructed in step four with the training set data determined in step three, and, after training is finished, using the weights of the discriminator to test the test set data and verify the open set identification effect of the model;
wherein the densely connected network in step four comprises two parts, a "connection block" and a "transition layer", specifically: each connection block consists of two convolutional layers and a connection operation layer, the connection block concatenates the features of all preceding layers as the input of the current layer, and each convolutional layer is followed by a batch normalization operation BN (Batch Normalization) and a rectified linear unit ReLU (Rectified Linear Unit) or a leaky rectified linear unit Leaky ReLU; the expressions of ReLU and Leaky ReLU are respectively as follows:
Figure FDA0003366251350000011
Figure FDA0003366251350000012
wherein p is the input to the cell;
the transition layer is the part between two connection blocks; in the generator of the generative adversarial model, the transition layer consists of a convolutional layer and a deconvolution layer; in the discriminator, the transition layer consists of a convolutional layer and a mean pooling layer.
2. The method for recognizing the human body motion open set based on the radar image and the generated countermeasure model according to claim 1, wherein the ultra-wideband radar used in step one is a PulsON 440 radar module with an operating frequency of 3.1 GHz to 4.8 GHz; two directional antennas are used for receiving the human body echo signals during data acquisition; the data are acquired in an indoor environment; and seven typical human body actions are acquired, namely: walking, boxing, ground crawling, sneaking, standing, forward standing jump, and running, wherein each action is repeated three times by each subject, and each acquisition lasts 7 seconds.
3. The method of claim 1, wherein the short-time Fourier transform treats a non-stationary process as a superposition of a series of short-time stationary signals, the short-time property being realized by windowing in the time dimension, and a Fourier transform is performed on the signal inside each window to obtain the time-varying spectrum of the signal; the formula of the short-time Fourier transform is as follows:
Figure FDA0003366251350000021
wherein τ is the length of the time window, ω is the angular frequency, t is the time, j is the imaginary unit, e is the natural constant, G(t) is the window function, f(t) is the collected human body echo signal, and G_f(·) is the transformed time-varying spectrum.
4. The method according to claim 1, wherein the noise cancellation is a mean background cancellation method, that is, the whole echo signal is subtracted by the echo intensity average value column vector; the method for determining the useful signal interval is to determine the interval with human body motion through the time-distance image of the signal and then reasonably set the time starting point and the time ending point for time-frequency conversion.
5. The method as claimed in claim 1, wherein the intensity threshold value set in step two is selected manually, and the noise is filtered by using a segmented threshold value.
6. The method for recognizing the action open set of the human body based on the radar image and the generated countermeasure model according to claim 1, wherein specifically, the calibration for generating the radar micro-Doppler image in the third step is sequentially marked with numbers from "0" to "6" for seven actions of walking, boxing, ground crawling, sneaking, standing, forward standing jump and running, and then the images generated in the first step and the second step are divided into a training set, a verification set and a test set according to a ratio of 4:2: 1.
7. The method as claimed in claim 1, wherein the generative adversarial model in step four consists of a generator and a discriminator; the generator takes random samples from the latent space as input, and its output should imitate the real samples in the training set as closely as possible; the input of the discriminator is a real sample or the output of the generator network, and its purpose is to distinguish the output of the generator from the real samples as well as possible, while the generator deceives the discriminator as much as possible; the two networks compete with each other and continuously adjust the weights of each network layer, the final goal being that the discriminator cannot judge whether the output of the generator is real; the objective function V(D, G) of the generative adversarial network is expressed as follows:
$$\min_G \max_D V(D, G) = E_{x \sim P_{data}(x)}[\log D(x)] + E_{z \sim P_z(z)}[\log(1 - D(G(z)))]$$
wherein G denotes the generator, D denotes the discriminator, x denotes an input sample, z denotes the input random variable, min(·) denotes the minimization operation, max(·) denotes the maximization operation, log(·) is the logarithm operation with base 10, E(·) denotes the expectation, x ~ P_data(x) denotes that x obeys the data distribution of the real samples, and z ~ P_z(z) denotes that z obeys a random distribution; the output part of the discriminator uses a softmax function, which in essence compresses any K-dimensional real vector into another K-dimensional real vector in which each element takes a value in (0, 1); the softmax function has the form:
$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \ldots, K$$
wherein z_j denotes the j-th element, z_k denotes the k-th element, e is the natural constant, and σ(z)_j denotes the softmax value of the j-th element;
in this way, the output of the discriminator can be interpreted as the probability of the input image belonging to each action category, and the category with the highest probability is the category of the input image judged by the discriminator;
in the network training process, adaptive moment estimation Adam (Adaptive Moment Estimation) is adopted to optimize the network weights, and a gradient penalty strategy is also adopted, that is, a penalty term is added to the objective function:
Figure FDA0003366251350000031
wherein λ = 10,
Figure FDA0003366251350000032
α is a random variable between
Figure FDA0003366251350000033
and 1,
Figure FDA0003366251350000034
denotes the fake samples generated by the generator, x denotes the real samples,
Figure FDA0003366251350000035
denotes the gradient, and E(·) denotes the expectation; the objective function V(D, G) is then:
Figure FDA0003366251350000036
the effect of the model is evaluated with the "F-measure-Openness curve", an index commonly used in open set identification, wherein the F-measure is defined as follows:
$$F\text{-measure} = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
wherein
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
TP denotes the number of positive samples predicted as positive by the model, TN denotes the number of negative samples predicted as negative, FP denotes the number of negative samples predicted as positive, and FN denotes the number of positive samples predicted as negative.
CN201810177104.8A 2018-03-04 2018-03-04 Human body action open set identification method based on radar image and generation countermeasure model Active CN108520199B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810177104.8A CN108520199B (en) 2018-03-04 2018-03-04 Human body action open set identification method based on radar image and generation countermeasure model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810177104.8A CN108520199B (en) 2018-03-04 2018-03-04 Human body action open set identification method based on radar image and generation countermeasure model

Publications (2)

Publication Number Publication Date
CN108520199A CN108520199A (en) 2018-09-11
CN108520199B true CN108520199B (en) 2022-04-08

Family

ID=63433468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810177104.8A Active CN108520199B (en) 2018-03-04 2018-03-04 Human body action open set identification method based on radar image and generation countermeasure model

Country Status (1)

Country Link
CN (1) CN108520199B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109472757B (en) * 2018-11-15 2020-06-09 央视国际网络无锡有限公司 Image channel logo removing method based on generation of antagonistic neural network
CN109918994B (en) * 2019-01-09 2023-09-15 天津大学 Commercial Wi-Fi-based violent behavior detection method
CN111507361B (en) * 2019-01-30 2023-11-21 富士通株式会社 Action recognition device, method and system based on microwave radar
CN109871805B (en) * 2019-02-20 2020-10-27 中国电子科技集团公司第三十六研究所 Electromagnetic signal open set identification method
CN110084108A (en) * 2019-03-19 2019-08-02 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Pedestrian re-identification system and method based on GAN neural network
CN109948532A (en) * 2019-03-19 2019-06-28 桂林电子科技大学 ULTRA-WIDEBAND RADAR human motion recognition method based on depth convolutional neural networks
CN110109090B (en) * 2019-03-28 2021-03-12 北京邮电大学 Unknown environment multi-target detection method and device based on microwave radar
CN111461267B (en) * 2019-03-29 2023-04-18 太原理工大学 Gesture recognition method based on RFID technology
CN110052000A (en) * 2019-04-12 2019-07-26 漳州泰里斯体育器材有限公司 A kind of identifying processing method and system of combat sports state
CN110033043B (en) * 2019-04-16 2020-11-10 杭州电子科技大学 Radar one-dimensional range profile rejection method based on condition generation type countermeasure network
CN110096976A (en) * 2019-04-18 2019-08-06 中国人民解放军国防科技大学 Human behavior micro-Doppler classification method based on sparse migration network
CN110390650B (en) * 2019-07-23 2022-02-11 中南大学 OCT image denoising method based on dense connection and generation countermeasure network
CN110532909B (en) * 2019-08-16 2023-04-14 成都电科慧安科技有限公司 Human behavior identification method based on three-dimensional UWB positioning
CN111239739A (en) * 2020-01-10 2020-06-05 上海眼控科技股份有限公司 Weather radar echo map prediction method and device, computer equipment and storage medium
CN111796272B (en) * 2020-06-08 2022-09-16 桂林电子科技大学 Real-time gesture recognition method and computer equipment for through-wall radar human body image sequence
CN111914919A (en) * 2020-07-24 2020-11-10 天津大学 Open set radiation source individual identification method based on deep learning
CN112364689A (en) * 2020-10-09 2021-02-12 天津大学 Human body action and identity multi-task identification method based on CNN and radar image
CN112200123B (en) * 2020-10-24 2022-04-05 中国人民解放军国防科技大学 Hyperspectral open set classification method combining dense connection network and sample distribution
CN112560596B (en) * 2020-12-01 2023-09-19 中国航天科工集团第二研究院 Radar interference category identification method and system
CN112560778B (en) * 2020-12-25 2022-05-27 万里云医疗信息科技(北京)有限公司 DR image body part identification method, device, equipment and readable storage medium
CN113296087B (en) * 2021-05-25 2023-09-22 沈阳航空航天大学 Frequency modulation continuous wave radar human body action recognition method based on data enhancement
CN113378718A (en) * 2021-06-10 2021-09-10 中国石油大学(华东) Action identification method based on generation of countermeasure network in WiFi environment
CN113537374B (en) * 2021-07-26 2023-09-08 百度在线网络技术(北京)有限公司 Method for generating countermeasure sample
CN117115596B (en) * 2023-10-25 2024-02-02 腾讯科技(深圳)有限公司 Training method, device, equipment and medium of object action classification model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295684A (en) * 2016-08-02 2017-01-04 清华大学 A kind of the most continuous based on micro-Doppler feature/discontinuous gesture recognition methods
CN107169435A (en) * 2017-05-10 2017-09-15 天津大学 A kind of convolutional neural networks human action sorting technique based on radar simulation image
CN107506799A (en) * 2017-09-01 2017-12-22 北京大学 A kind of opener classification based on deep neural network is excavated and extended method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8379940B2 (en) * 2009-06-02 2013-02-19 George Mason Intellectual Properties, Inc. Robust human authentication using holistic anthropometric and appearance-based features and boosting

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
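Micro-Doppler-based human activity classification using the mote-scale BumbleBee radar; Çağlıyan B et al.; IEEE Geoscience and Remote Sensing Letters; 20150729; Vol. 12, No. 10; pp. 2135-2139 *
Radar target maneuver detection algorithm based on high-resolution one-dimensional Doppler profiles; Zhu Yilong et al.; Acta Automatica Sinica; 20110831; pp. 3-16 *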

Also Published As

Publication number Publication date
CN108520199A (en) 2018-09-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant