Disclosure of Invention
Existing WiFi-based fall detection methods adapt poorly to environmental changes, which reduces detection accuracy. To address this defect, the invention provides a fall detection method that makes the features of the generated samples as independent of the environment as possible, thereby reducing the influence of environmental changes on detection.
The technical scheme adopted by the invention is a fall detection method based on sample generation and feature separation, comprising the following steps:
1) Fall detection environment deployment: deploying a WiFi transmitter and a WiFi receiver in a detection environment, and covering the whole detection area with WiFi signals;
2) A CSI data acquisition step: setting action types, wherein the action types comprise c actions, namely falling actions and non-falling actions; setting environment types, wherein each user repeatedly executes c actions in each environment, and each action is executed for multiple times; collecting Channel State Information (CSI) data in a detection area when actions are executed each time, and extracting a subcarrier amplitude value from the CSI data corresponding to one execution action as an action sample;
3) Data division: taking an action sample and corresponding action labels and environment labels as source domain data, and forming a source domain data set by using the source domain data corresponding to all actions executed in all environments;
4) Using the action labels of the source domain data set, selecting the action samples of the source domain data labelled as falls, adding random Gaussian noise to them, and reconstructing them with an autoencoder to generate fall action virtual samples, t × N fall action virtual samples being generated in total; and generating, from the fall action virtual samples, virtual fall data in the same format as the source domain data;
5) Inputting the source domain data and the virtual fall data into a feature extractor for feature separation training, and then inputting the separated features output by the feature extractor into a fall/non-fall binary classifier and a multi-domain classifier, respectively, for detection training, specifically:
5-1) using the feature extractor to perform separated extraction of action and environment information on the source domain data and the virtual fall data to obtain a feature vector; the feature extractor is trained with the goal of separating the environment-related domain information into the lower half of the feature vector and storing the action information in the upper half of the feature vector;
5-2) dividing the feature vector into an upper half and a lower half; feeding the upper half of the feature vector layer into the fall/non-fall binary classifier for training, and feeding the lower half into the multi-domain classifier for training; the fall/non-fall binary classifier receives the upper half of the feature vector layer and is trained with the goal of retaining the information related to fall and non-fall actions and making the binary fall/non-fall decision; the multi-domain classifier receives the lower half of the feature vector layer and is trained with the goal of retaining the environment-related domain information;
6) A detection step: inputting the subcarrier amplitudes of the CSI data to be classified into the trained feature extractor, and inputting the upper half of the feature vector output by the feature extractor into the trained fall/non-fall binary classifier to complete fall detection.
Different actions performed by a person in the detection area affect the WiFi signal differently, so the invention extracts the amplitude of each subcarrier from the CSI data of the WiFi signal for fall detection. To address the shortage of fall samples and the dependence of fall detection models on the environment, Gaussian noise is added to the fall action samples in the source domain data, and an autoencoder reconstructs them to obtain virtual fall action samples. A feature extractor then extracts feature vectors from the source domain data and the virtual fall data. The neurons of the feature vector layer produced by the feature extractor are divided into two parts: the upper half is fed into the fall/non-fall binary classifier for training, and the lower half is fed into the domain classifier for training. During training, information related to fall and non-fall actions is gradually retained in the upper half and information related to the environment is gradually retained in the lower half, so that the fall/non-fall binary classifier can better distinguish fall actions from non-fall actions.
The beneficial effect of the invention is that reconstructing fall samples alleviates the problem of the model overfitting to non-fall data because fall samples are insufficient. To further improve environment-independent fall detection accuracy, the invention performs feature separation on the feature vector obtained by the feature extractor and feeds the two separated parts into the fall/non-fall binary classifier and the domain classifier, respectively, for training, thereby enhancing the generalization of fall detection across environments.
Detailed Description
The method comprises the following specific steps:
1) Fall detection environment deployment: the invention must be carried out in an environment covered by WiFi; a WiFi transmitter and a WiFi receiver are deployed in the detection environment, as shown in FIG. 1;
2) A CSI data acquisition step: setting the action types, wherein the c actions must include fall actions and non-fall actions; specifically, the c actions include 1 fall action c_1 and c-1 non-fall actions c_i, i = 2, …, c, or the numbers of fall and non-fall actions are configured as required; setting the environment types, wherein t environments are set; in each environment, each user repeatedly executes the c actions, each action being executed N times, and Channel State Information (CSI) data in the detection area is collected each time an action is executed; extracting subcarrier amplitudes from the collected CSI data as action samples;
3) Data division: taking the CSI data collected in the t different environments as a source domain data set, wherein each element of the data set comprises: the source domain CSI data whose action class is c_i and whose environment class is t_l, t_l denoting the l-th environment class; a class label for the action class c_i, wherein the fall action c_1 has one class label and the non-fall actions c_i (i ≠ 1) all share another class label; and a domain label for the environment class t_l;
4) Using the action labels in the source domain data set, generating fall action virtual samples from the action samples of the source domain data labelled as falls, t × N fall action virtual samples being generated in total, so that the virtual samples together with the real fall action samples match the number of non-fall action samples and the samples are balanced during the subsequent binary classification training; generating, from the fall action virtual samples, virtual fall data in the same format as the source domain data; the generation of one fall action virtual sample comprises the following steps:
4-1) from the source domain data set X_S, taking the subset whose action label is fall, and selecting one source domain datum from this subset; each action sample in the source domain data is a combination of the data acquired by multiple transmit-receive antenna pairs, where a_i denotes antenna pair i; the action sample of the selected source domain datum is divided into n groups of data acquired by different antenna pairs, where n is the number of antenna pairs;
4-2) using a combined autoencoder to reconstruct the n groups of data acquired by different antenna pairs, first generating n corresponding groups of intermediate layer vectors; the number of autoencoders in the combined autoencoder is n, matching the number of antenna pairs;
each generating encoder corresponds to the group of data acquired by one antenna pair, and the n generating encoders perform feature extraction on the n groups of data acquired by different antenna pairs, respectively, to obtain n groups of intermediate layer vectors, where b is the random Gaussian noise added to the input data and the function f denotes the generating encoder; in this embodiment each generating encoder is a neural network with 4 fully-connected layers, but a different number of fully-connected layers or another network structure may also be used;
4-3) each generating decoder corresponds to one group of intermediate layer vectors, and the n generating decoders reconstruct the n different groups of intermediate layer vectors, respectively, to obtain n groups of reconstructed virtual data corresponding to the n groups of data acquired by different antenna pairs; the generating decoder mirrors the structure of the generating encoder and is a 4-layer fully-connected neural network; the function g denotes the generating decoder;
4-4) splicing the n groups of reconstructed virtual data, which correspond to the n groups of data acquired by different antenna pairs, into one complete fall action virtual sample similar to the original action sample;
5) Feature separation, comprising the steps of:
5-1) using the feature extractor to extract action features and environment features from the source domain data x_S and the virtual fall data to obtain a feature vector layer; the feature extractor consists of 5 convolution blocks and one fully-connected layer, each convolution block comprising 1 convolution layer and 1 pooling layer, and the 5 convolution blocks using convolution kernels of 200, 150, 100, 20 and 10, respectively; the feature vector layer has dimension (m × 2, 1), where the value of m is adjusted according to the number of antenna pairs; the feature vector layer is divided into two parts, an upper half of dimension (m, 1) and a lower half of dimension (m, 1);
5-2) feeding the upper half of the feature vector layer into the fall/non-fall binary classifier for training and the lower half into the multi-domain classifier for training, wherein the fall/non-fall binary classifier is a neural network of 4 fully-connected layers and the multi-domain classifier is a neural network of 4 fully-connected layers; during training, the upper half of the feature vector layer gradually retains the information related to fall and non-fall actions, and the lower half of the feature vector layer gradually retains the information related to the environment;
the feature extractor, the fall/non-fall binary classifier and the multi-domain classifier are not limited to the structures described above, as long as their structures allow them to meet the following training targets:
the feature extractor receives the source domain data and the virtual fall data and outputs a feature vector, and is trained with the goal of separating the environment-related domain information into the lower half of the feature vector and storing the action information in the upper half of the feature vector;
the fall/non-fall binary classifier receives the upper half of the feature vector layer and is trained with the goal of retaining the information related to fall and non-fall actions and making the binary fall/non-fall decision;
the multi-domain classifier receives the lower half of the feature vector layer and is trained with the goal of retaining the environment-related domain information;
6) Inputting the amplitudes of the CSI data to be classified into the trained feature extractor, which extracts the feature vector layer; separating the feature vector layer into its upper half and lower half; and feeding the upper half into the trained fall/non-fall binary classifier to complete fall detection.
Experimental verification
A WiFi transmitter and a WiFi receiver are deployed in the detection environment. The transmitter is an ordinary commercial router and the receiver is equipped with an Intel WiFi Link 5300 wireless network card; the transmitter and the receiver each have 3 antennas, forming 9 links. CSI is acquired from the Intel WiFi Link 5300 wireless network card using the CSI Tool package; each antenna pair yields 30 groups of subcarrier information, so each data packet contains 270 groups of subcarrier information in total.
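The link and subcarrier counts stated above can be checked with a short sketch. The constant names and the (transmit, receive) ordering of the links are assumptions made for illustration; the text only gives the counts.

```python
from itertools import product

N_TX = 3                   # transmitter antennas (from the text)
N_RX = 3                   # receiver antennas (from the text)
SUBCARRIERS_PER_LINK = 30  # subcarrier groups reported per antenna pair

# Enumerate every transmit-receive antenna pair (link).
# The (tx, rx) ordering is an assumption, not taken from the source.
links = list(product(range(N_TX), range(N_RX)))

# Each packet carries one value per subcarrier per link.
values_per_packet = len(links) * SUBCARRIERS_PER_LINK

print(len(links))         # 9
print(values_per_packet)  # 270
```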
The specific implementation steps are as follows:
Step 1: A WiFi transmitter and WiFi receiver pair is deployed in the detection area such that the WiFi signal covers the whole detection area; the experimental environment is shown schematically in FIG. 1.
Step 2: CSI action data are collected in the detection area. Several time periods are selected as different environments, and each user performs the relevant actions in each environment; the actions include standing-posture falls, sitting-posture falls, squatting, standing, walking, jumping and the like, and each action is repeated multiple times. The sampling frequency is set to 100 Hz.
Step 3: The action data from the different environments are taken as source domains to form the source domain data set. Each user performs each action N times in each environment.
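A minimal sketch of how one entry of the source domain data set might be assembled from a recording is given here. The class name `SourceDomainEntry`, the action name strings and the helper `build_source_domain` are hypothetical; the text only states that each action sample carries an action label (fall or non-fall) and an environment label.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical action names, based on the actions listed in Step 2.
FALL_ACTIONS = {"standing_fall", "sitting_fall"}


@dataclass
class SourceDomainEntry:
    sample: List[List[float]]  # amplitude matrix: packets x subcarriers
    action_label: int          # 1 = fall, 0 = non-fall
    domain_label: int          # index of the environment (time period)


def build_source_domain(
    recordings: List[Tuple[List[List[float]], str, int]]
) -> List[SourceDomainEntry]:
    """Attach a binary action label and a domain label to every recording."""
    return [
        SourceDomainEntry(sample, 1 if action in FALL_ACTIONS else 0, env)
        for sample, action, env in recordings
    ]


# Toy recordings with placeholder 1 x 1 samples (not real CSI data).
toy_recordings = [
    ([[0.0]], "standing_fall", 0),
    ([[0.0]], "walking", 0),
    ([[0.0]], "sitting_fall", 1),
    ([[0.0]], "jumping", 1),
]
source_domain = build_source_domain(toy_recordings)
print([(e.action_label, e.domain_label) for e in source_domain])
```

All fall variants share one action label and all non-fall actions share the other, which matches the two-class labelling described for the source domain data.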
The CSI of each subcarrier can be described as a complex value H_i. Taking the absolute value of H_i gives the amplitude data set X_a = {|H_i|}. Variance-threshold clipping is applied to the extracted amplitudes so that each action sample has size 300 × 270, and the sampled data can be represented as:
x = [x_1, x_2, ..., x_300]
where 300 is the number of data packets per action and x_i = [h_1, h_2, ..., h_270] is one data packet containing 270 subcarrier amplitudes.
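The amplitude extraction and clipping can be sketched as follows. The amplitude step follows the text directly. The clipping rule is an assumption: the text names variance-threshold clipping but does not specify it, so `clip_by_variance`, its `window` parameter and its `threshold` value are hypothetical.

```python
from statistics import pvariance


def amplitude_matrix(csi_packets):
    """Convert complex CSI values H_i to amplitudes |H_i|, packet by packet."""
    return [[abs(h) for h in packet] for packet in csi_packets]


def clip_by_variance(amp, threshold, window=10, length=300):
    """Hypothetical clipping rule (not specified in the source).

    Start the sample at the first window whose per-packet mean amplitude
    has a variance above `threshold`, then keep `length` packets.
    """
    means = [sum(row) / len(row) for row in amp]
    start = 0
    for k in range(len(means) - window + 1):
        if pvariance(means[k:k + window]) > threshold:
            start = k
            break
    # Make sure a full-length sample still fits after the start index.
    start = min(start, max(0, len(amp) - length))
    return amp[start:start + length]


# Synthetic stream: 50 quiet packets, then 350 packets with activity.
quiet = [[complex(1, 0)] * 270 for _ in range(50)]
active = [
    [complex(3, 4) if k % 2 == 0 else complex(1, 0)] * 270
    for k in range(350)
]
amp = amplitude_matrix(quiet + active)
sample = clip_by_variance(amp, threshold=1.0)
print(len(sample), len(sample[0]))  # 300 270
```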
Step 4: A combined autoencoder is used to generate virtual samples of fall actions so that the number of fall samples in the training set matches the number of non-fall samples, as shown in FIG. 2:
Step 4-1: From the source domain data set x_S, select the data set whose action class is fall. Each sample in this source domain data is a combination of the data acquired by multiple transmit-receive antenna pairs, where a_i denotes antenna pair i. Each sample is divided into n groups of data acquired by different antenna pairs, where n is the number of antenna pairs; in this test n = 9, i.e. there are 9 different antenna pairs;
Step 4-2: The combined autoencoder reconstructs the n groups of data acquired by different antenna pairs, first generating n corresponding groups of intermediate layer vectors; the number of autoencoders in the combined autoencoder is n, matching the number of antenna pairs;
each generating encoder corresponds to the data of one antenna pair, and the n generating encoders perform feature extraction on the n groups of data acquired by different antenna pairs, respectively, to obtain n groups of intermediate layer vectors. Each generating encoder is a neural network with 4 fully-connected layers; its input dimension is 300 × 30, where 30 is the number of subcarriers in each signal link, b is random noise of dimension 300 × 30, and the output dimension is 64 × 30; in this embodiment the number n of antenna pairs is 9, so 9 groups of intermediate layer vectors are generated in total; the function f denotes the generating encoder;
Step 4-3: Each generating decoder corresponds to one group of intermediate layer vectors, and the n generating decoders reconstruct the n different groups of intermediate layer vectors, respectively, to obtain n groups of reconstructed virtual data corresponding to the n groups of data acquired by different antenna pairs.
The generating decoder mirrors the structure of the generating encoder; each is a 4-layer fully-connected neural network, and the decoder has input dimension 64 × 30 and output dimension 300 × 30. In this embodiment the number n of antenna pairs is 9, so 9 groups of virtual data are generated in total, each corresponding to a different antenna pair; the function g denotes the generating decoder.
Each encoder-decoder pair is trained with the mean square error (MSE) as its loss function; the inputs to the MSE are the fall action samples acquired by the corresponding antenna pair and the corresponding fall action virtual samples. The weights of the encoder and decoder are trained so that the reconstructed data for each antenna pair approach the source domain fall data of that antenna pair.
Step 4-4: The n groups of reconstructed virtual data, corresponding to the n groups of data acquired by different antenna pairs, are spliced into one complete virtual fall action sample similar to the original fall action sample. The transmitter and receiver used by the invention each have 3 antennas, so 9 signal links are formed in total, i.e. n = 9.
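The data flow of Step 4 (split by antenna pair, add Gaussian noise, reconstruct, splice) can be sketched without a neural network library. This is a sketch under stated assumptions: each link is assumed to occupy 30 contiguous columns of the 270-column sample, the noise level `sigma` is hypothetical, and the trained encoder-decoder pair g(f(·)) is replaced by a placeholder identity function.

```python
import random

N_LINKS = 9
SUBCARRIERS_PER_LINK = 30


def split_by_link(sample):
    """Split a packets x 270 sample into 9 groups of packets x 30.

    Assumes each link occupies 30 contiguous columns (not stated in the text).
    """
    w = SUBCARRIERS_PER_LINK
    return [[row[k * w:(k + 1) * w] for row in sample] for k in range(N_LINKS)]


def add_gaussian_noise(group, sigma, rng):
    """Add zero-mean Gaussian noise b to every amplitude in one group."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in group]


def mse(a, b):
    """Mean square error between two equally shaped matrices."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / sum(len(r) for r in a)


def splice(groups):
    """Concatenate the per-link groups column-wise back into packets x 270."""
    return [sum((g[r] for g in groups), []) for r in range(len(groups[0]))]


def generate_virtual_sample(sample, autoencoders, sigma=0.05, seed=0):
    """Noise each link's data, reconstruct it per link, and splice the result."""
    rng = random.Random(seed)
    groups = split_by_link(sample)
    virtual = [
        ae(add_gaussian_noise(g, sigma, rng))
        for ae, g in zip(autoencoders, groups)
    ]
    return splice(virtual)


def identity_autoencoder(group):
    """Placeholder standing in for a trained encoder-decoder pair g(f(.))."""
    return group


sample = [[float(c) for c in range(270)] for _ in range(300)]
virtual = generate_virtual_sample(sample, [identity_autoencoder] * N_LINKS)
print(len(virtual), len(virtual[0]))
```

In a real implementation each placeholder would be one of the n trained autoencoders, and `mse` would be evaluated between each link's real fall data and its reconstruction during training.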
Step 5: The feature extractor performs feature separation on the source domain data x_S and the virtual fall data, separating the action information from the environment information, and the action features and environment features are fed into their respective classifiers for training;
Step 5-1: The feature extractor consists of 5 convolution blocks (Conv1d) and one fully-connected layer, each convolution block comprising 1 convolution layer and 1 pooling layer; the input dimension of the feature extractor is 300 × 270 × 1 and the output dimension is (m × 2) × 1. The input x is either source domain data x_S or virtual fall data, and the output is the feature vector layer. The feature vector layer has dimension (m × 2, 1), where the value of m is adjusted according to the number of antenna pairs; in this embodiment m is 1025. The feature vector layer is divided into two parts, an upper half of dimension (m, 1) and a lower half of dimension (m, 1).
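The division of the feature vector layer into two equal halves can be illustrated with a short sketch. The function name `split_features` is hypothetical, and the vector is filled with placeholder values rather than real extractor outputs.

```python
def split_features(z):
    """Split a feature vector of length 2m into upper and lower halves."""
    if len(z) % 2 != 0:
        raise ValueError("feature vector length must be even (m x 2)")
    m = len(z) // 2
    return z[:m], z[m:]


M = 1025  # value of m given for this embodiment
z = [float(i) for i in range(2 * M)]  # placeholder feature vector
upper, lower = split_features(z)

# upper -> fall/non-fall binary classifier, lower -> multi-domain classifier
print(len(upper), len(lower))  # 1025 1025
```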
Step 5-2: The upper half of the feature vector layer is fed into the fall/non-fall binary classifier, a neural network of 2 fully-connected layers, for training. The classifier outputs a probability distribution, which is combined with the true action labels to compute the cross-entropy loss L_1:
L_1 = -\frac{1}{M} \sum_{i=1}^{M} [ y_i \log p_i + (1 - y_i) \log(1 - p_i) ]
where M is the number of samples in each training batch, y_i is the label of sample i (1 for a fall action, 0 for a non-fall action), and p_i is the predicted probability that sample i is a fall.
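As a worked illustration of the binary cross-entropy loss L_1, the following sketch computes it for a small batch. The function name and the clamping constant `eps` are assumptions added for numerical safety; the label convention (1 = fall, 0 = non-fall) follows the text.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy over a batch of M samples.

    labels: y_i in {0, 1}; probs: predicted fall probability p_i.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# An uninformative classifier (p = 0.5 everywhere) gives a loss of log 2.
print(binary_cross_entropy([1, 0], [0.5, 0.5]))
```

Confident correct predictions lower the loss, which is what training the upper half of the feature vector together with the binary classifier is meant to achieve.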
Step 5-3: The lower half of the feature vector layer is fed into the multi-domain classifier, a neural network of 2 fully-connected layers, for training. The classifier outputs a probability distribution, which is combined with the true domain labels to compute the cross-entropy loss L_2:
L_2 = -\frac{1}{M} \sum_{i=1}^{M} \sum_{d=1}^{K} z_{id} \log p_{id}
where M is the number of samples in each training batch, K is the number of domain classes, z_id is the true domain indicator of sample i (z_id = 1 if the true domain class of sample i equals d, otherwise z_id = 0), and p_id is the predicted probability that sample i belongs to domain class d. The per-class classification performance of the multi-domain classifier is not the focus, and the multi-domain classifier is not used for detection after training is finished. Its role is to help the binary classifier complete training and obtain accurate binary detection results.
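The multi-class cross-entropy L_2 can be illustrated in the same way. The sketch assumes the usual form averaged over the batch, with a one-hot domain indicator for each sample; the function names and the clamping constant `eps` are hypothetical.

```python
import math


def one_hot(d, k):
    """Indicator vector: 1 at the true domain d, 0 elsewhere."""
    return [1 if j == d else 0 for j in range(k)]


def domain_cross_entropy(domain_labels, probs, eps=1e-12):
    """Mean categorical cross-entropy over a batch.

    domain_labels: true domain index per sample;
    probs: per-sample list of K predicted domain probabilities.
    """
    total = 0.0
    for d, p in zip(domain_labels, probs):
        z = one_hot(d, len(p))
        total += sum(zi * math.log(max(pi, eps)) for zi, pi in zip(z, p))
    return -total / len(domain_labels)


K = 12  # number of source environments used in the verification experiment
uniform = [[1.0 / K] * K for _ in range(3)]
# A classifier that cannot tell the domains apart gives a loss of log K.
print(domain_cross_entropy([0, 5, 11], uniform))
```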
Step 5-4: The final objective function L is
L = L_1 + L_2
Step 6: After the training of the neural networks is completed, the amplitudes of the CSI data to be classified from a new environment are input directly into the trained feature extractor. The feature extractor extracts the action information and the environment information in the new-environment data into a feature vector, in which the environment-related information is separated into the lower half of the feature vector layer and the required information related to fall and non-fall actions is stored in the upper half. The upper half of the feature vector, which retains the fall-related information, is input into the trained fall/non-fall binary classifier to complete fall detection.
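The detection path of Step 6 can be sketched with stand-in callables. `detect_fall`, the toy extractor and classifier, and the 0.5 decision threshold are all hypothetical; in practice the callables would be the trained feature extractor and the trained fall/non-fall binary classifier.

```python
def detect_fall(sample, feature_extractor, fall_classifier, threshold=0.5):
    """Run the detection path: extract features, keep the upper half, classify.

    The lower (environment) half of the feature vector is not used here,
    and the multi-domain classifier plays no part in detection.
    """
    z = feature_extractor(sample)
    m = len(z) // 2
    upper = z[:m]  # action-related half
    return fall_classifier(upper) >= threshold


def toy_classifier(upper):
    """Stand-in for the trained binary classifier (returns a pseudo-probability)."""
    return sum(upper) / len(upper)


# Stand-in extractors: same environment half, different action half.
fall_like = lambda sample: [0.9, 0.8, 5.0, 5.0]
non_fall_like = lambda sample: [0.1, 0.2, 5.0, 5.0]

print(detect_fall(None, fall_like, toy_classifier))      # True
print(detect_fall(None, non_fall_like, toy_classifier))  # False
```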
In the verification experiment, data from 12 different environments were used as source domain data, and data from 2 new environments were input into the trained model. The average fall detection accuracy in the new environments exceeded 87%, the precision exceeded 86%, and the recall exceeded 88%.
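For reference, the three reported metrics can be computed from binary predictions as sketched here. The label lists are toy values chosen for illustration only and are not the experimental data; the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall), with fall (1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy labels (1 = fall, 0 = non-fall); not the experimental results.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75)
```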