CN115457732B

CN115457732B - Fall detection method based on sample generation and feature separation

Info

Publication number: CN115457732B
Application number: CN202211018371.3A
Authority: CN
Inventors: 罗悦; 周瑞; 张子若; 程宇; 张宏旺; 王佳昊
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2022-08-24
Filing date: 2022-08-24
Publication date: 2023-09-01
Anticipated expiration: 2042-08-24
Also published as: CN115457732A

Abstract

The invention provides a fall detection method based on sample generation and feature separation, which extracts the amplitude of each subcarrier from the CSI data of WiFi signals to perform fall detection. To address the insufficient number of fall samples and the dependence of the fall detection model on the environment, fall action samples in the source domain data are reconstructed with added noise to obtain virtual fall action samples. A feature extractor then extracts feature vectors from the source domain data and the virtual fall data. The neurons of the feature vector layer produced by the feature extractor are divided into two parts: the upper part is used to train a fall/non-fall detection binary classifier, and the lower part is used to train a domain classifier. During training the upper part gradually retains information related to fall and non-fall actions, and the lower part gradually retains environment-related information, so that the binary classifier distinguishes fall from non-fall actions better and the generalization of fall detection across environments is enhanced.

Description

Fall detection method based on sample generation and feature separation

Technical Field

The invention relates to a behavior detection technology based on WiFi signals, in particular to a falling action detection technology.

Background

With the development of the Internet of Things and artificial intelligence, motion recognition based on wireless signals has become possible. Falls are a major cause of injury among the elderly, and indoor fall detection has become an urgent need. Existing fall detection techniques are mostly based on wearable devices or video monitoring. However, wearable-device-based fall detection requires the user to wear a device, whereas video-based fall detection depends heavily on indoor lighting and raises privacy concerns. These limitations make it difficult to deploy fall detection systems widely in home environments.

To improve on current fall detection methods and overcome the shortcomings of existing detection technologies, the invention provides a fall detection method based on WiFi signals. Compared with conventional video detection, WiFi signals can penetrate obstacles, work normally under weak light, no light, and occlusion, and involve little privacy intrusion. Compared with wearable devices, WiFi fall detection is contactless: the user does not need to carry any device, which greatly improves convenience. Because WiFi is popular and widely deployed, the cost of WiFi-based fall detection is extremely low. Using WiFi for indoor fall detection therefore has high potential value and good prospects.

In the current fall detection method based on WiFi, the wireless network card and related tools acquire channel state information (Channel State Information, CSI) to perform fall motion recognition. The propagation of WiFi in a room consists of superposition of multiple paths, including a direct path, an obstacle reflection path, and the like. The CSI has good stability and can well reflect the channel state. When the human body performs different actions, the transmission of the WiFi signals is affected differently, and further the CSI is changed, so that the falling action is detected accordingly. CSI information of daily actions (including falling actions) is collected, a classification model capable of identifying the falling actions is trained by using a machine learning algorithm, and real-time CSI data is input into the model for classification, so that whether falling occurs can be detected.

Fall detection is a binary classification problem and typically uses supervised learning. To ensure fall detection accuracy, a large amount of labeled data for different actions (including fall actions) must be collected to train the model, and collecting and labeling such data is costly and labor-intensive. Moreover, because environmental changes have a large and hard-to-predict influence on WiFi signals, the prediction accuracy of a model trained by supervised learning drops sharply when the environment changes, so such a model is not suitable for practical application.

Disclosure of Invention

Existing WiFi-based fall detection methods adapt poorly to environmental changes, which reduces detection accuracy. The technical problem to be solved by the invention is to provide a fall detection method that makes the features extracted from the samples as independent of the environment as possible, thereby reducing the influence of environmental changes on detection.

The technical scheme adopted by the invention for solving the problems is that the falling detection method based on sample generation and feature separation comprises the following steps:

1) Fall detection environment deployment: a WiFi transmitter and a WiFi receiver are deployed in a detection environment, and WiFi signals cover the whole detection area;

2) CSI data acquisition: set the action types, c actions in total, including fall actions and non-fall actions; set the environment types; in each environment, each user repeatedly performs the c actions, each action several times; collect the channel state information (CSI) data in the detection area while each action is performed, and extract the subcarrier amplitudes from the CSI data corresponding to one execution of an action as one action sample;

3) Dividing data: taking an action sample, a corresponding action tag and an environment tag as source domain data, and constructing a source domain data set by using the source domain data corresponding to all actions executed in all environments;

4) For the action sample of each source domain datum in the source domain data set whose action label is fall, add random Gaussian noise and reconstruct the action sample with an autoencoder to generate one virtual fall action sample; generate t × N virtual fall action samples in total, and use them to generate virtual fall data in the same format as the source domain data;

5) The source domain data and the virtual fall data are input into a feature extractor for feature separation training, and the separated features output by the feature extractor are then input into a fall/non-fall detection binary classifier and a multi-domain classifier, respectively, for detection training, specifically:

5-1) the feature extractor performs action and environment separation and extraction on the source domain data and the virtual fall data to obtain a feature vector $F$; the feature extractor is trained with the goal of separating the environment-related domain information into the lower half of the feature vector $F$ and storing the action information in the upper half of the feature vector $F$;

5-2) the feature vector $F$ is divided into an upper half $F_{\mathrm{up}}$ and a lower half $F_{\mathrm{low}}$; the upper half $F_{\mathrm{up}}$ is used to train the fall/non-fall detection binary classifier, and the lower half $F_{\mathrm{low}}$ is used to train the multi-domain classifier; the fall/non-fall detection binary classifier receives the upper half $F_{\mathrm{up}}$ of the feature vector layer and is trained with the goal of retaining information related to fall and non-fall actions and deciding between the fall and non-fall classes; the multi-domain classifier receives the lower half $F_{\mathrm{low}}$ of the feature vector layer and is trained with the goal of retaining the environment-related domain information;

6) Detection: input the subcarrier amplitudes of the CSI data to be classified into the trained feature extractor, and input the upper half of the feature vector output by the feature extractor into the trained fall/non-fall detection binary classifier to complete fall detection.

The method performs fall detection by extracting the amplitude of each subcarrier from the CSI data of the WiFi signal, based on the different effects that a subject's actions have on the WiFi signal in the detection area. To address the insufficient number of fall samples and the dependence of the fall detection model on the environment, Gaussian noise is added to the fall action samples in the source domain data and the samples are reconstructed by an autoencoder to obtain virtual fall action samples. A feature extractor then extracts feature vectors from the source domain data and the virtual fall data. The neurons of the feature vector layer produced by the feature extractor are divided into two parts: the upper part is used to train the fall/non-fall detection binary classifier and the lower part is used to train the domain classifier. During training the upper part gradually retains information related to fall and non-fall actions and the lower part gradually retains environment-related information, so that the binary classifier distinguishes fall from non-fall actions better.

The beneficial effects of the method are as follows. Reconstructing fall-category samples solves the problem that the model overfits the non-fall-category data because fall-category samples are too few. To further improve environment-independent fall detection accuracy, the method separates the feature vector obtained by the feature extractor into two parts and uses them to train the fall/non-fall detection binary classifier and the domain classifier, respectively, which enhances the generalization of fall detection across environments.

Drawings

Fig. 1 is a schematic diagram of an experimental scenario.

Fig. 2 is an overall framework diagram.

Detailed Description

The method comprises the following specific steps:

1) Fall detection environment deployment: the invention needs to be carried out in a WiFi covered environment, and a WiFi transmitter and a WiFi receiver are arranged in a detection environment, as shown in figure 1;

2) CSI data acquisition: set the action types, c actions in total including fall and non-fall actions; specifically, 1 fall action $c_1$ and c-1 non-fall actions $c_i$, $i = 2, \dots, c$, or another allocation of fall and non-fall actions configured as required; set the environment types, t environments in total; in each environment, each user repeatedly performs the c actions, each action N times, and the channel state information (CSI) data in the detection area is collected while each action is performed; subcarrier amplitudes are extracted from the collected CSI data as action samples;

3) Data division: the CSI data collected in the t different environments is taken as the source domain data set $X^S = \{(x^S_{c_i,t_l},\ y_{c_i},\ d_{t_l})\}$, where $x^S_{c_i,t_l}$ denotes source domain CSI data whose action class is $c_i$ and whose environment class is $t_l$; $t_l$ denotes the $l$-th environment class; $y_{c_i}$ denotes the category label of action class $c_i$, where the fall action $c_1$ has one category label and all non-fall actions $c_i$ ($i \neq 1$) share one category label; $d_{t_l}$ denotes the domain label of environment class $t_l$;

4) For the action sample of each source domain datum in the source domain data set whose action label is fall, one virtual fall action sample is generated; t × N virtual fall action samples are generated in total, so that the quantity of fall-category samples is brought in line with that of non-fall-category action samples, which facilitates sample balance in the subsequent classification training; virtual fall data in the same format as the source domain data is then generated from the virtual fall action samples; generating one virtual fall action sample comprises the following steps:

4-1) From the source domain data set $X^S$, take the subset $X^S_{c_1}$ whose action label is fall and select one source domain datum $x^S_{c_1}$ from it. Each sample of the source domain data is a combination of the data $x^{a_i}_{c_1}$ collected by multiple transceiver antenna pairs, where $a_i$ denotes antenna pair $i$. The source domain datum $x^S_{c_1}$ is divided into n groups of data collected by different antenna pairs:

$$x^S_{c_1} = \left[x^{a_1}_{c_1},\ x^{a_2}_{c_1},\ \dots,\ x^{a_n}_{c_1}\right]$$

where n denotes the number of antenna pairs;

4-2) A combined autoencoder is used to reconstruct the n groups of data $x^{a_i}_{c_1}$ collected by the different antenna pairs, generating n groups of corresponding intermediate layer vectors $h^{a_i}$; the combined autoencoder contains n autoencoders, matching the number of antenna pairs;

each generating encoder corresponds to the group of data collected by one antenna pair, and the n generating encoders perform feature extraction on the n groups of data $x^{a_i}_{c_1}$ respectively to obtain the n groups of intermediate layer vectors:

$$h^{a_i} = f\left(x^{a_i}_{c_1} + b\right),\quad i = 1, \dots, n$$

where b is random Gaussian noise and the function f denotes the generating encoder; in this embodiment each generating encoder is a neural network with 4 fully connected layers, although other numbers of fully connected layers or other network structures may also be used;

4-3) each generating decoder corresponds to one group of intermediate layer vectors, and the n generating decoders reconstruct the n groups of intermediate layer vectors respectively to obtain the n groups of reconstructed virtual data $\hat{x}^{a_i}_{c_1}$ corresponding to the n groups of data $x^{a_i}_{c_1}$ collected by the different antenna pairs:

$$\hat{x}^{a_i}_{c_1} = g\left(h^{a_i}\right),\quad i = 1, \dots, n$$

where the generating decoder is a 4-layer fully connected neural network whose structure mirrors that of the generating encoder, and the function g denotes the generating decoder;

4-4) the n groups of reconstructed virtual data $\hat{x}^{a_i}_{c_1}$ are spliced into a complete virtual fall action sample $\hat{x}^S_{c_1}$ that is similar to the original action sample $x^S_{c_1}$:

$$\hat{x}^S_{c_1} = \left[\hat{x}^{a_1}_{c_1},\ \hat{x}^{a_2}_{c_1},\ \dots,\ \hat{x}^{a_n}_{c_1}\right]$$

5) Feature separation comprises the following steps:

5-1) A feature extractor is applied to the source domain data $x^S$ and the virtual fall data $\hat{x}^S_{c_1}$ to extract action features and environment features, yielding a feature vector layer $F$. The feature extractor consists of 5 convolution blocks and one fully connected layer; each convolution block contains 1 convolution layer and 1 pooling layer, and the 5 convolution blocks use convolution kernels of 200, 150, 100, 20 and 10 respectively. The dimension of the feature vector layer $F$ is (m × 2, 1), where the value of m needs to be adjusted dynamically according to the number of antenna pairs. The feature vector layer $F$ is divided into two parts, $F_{\mathrm{up}}$ and $F_{\mathrm{low}}$, where the dimension of $F_{\mathrm{up}}$ is (m, 1) and the dimension of $F_{\mathrm{low}}$ is (m, 1);

5-2) The upper half $F_{\mathrm{up}}$ of the feature vector layer $F$ is used to train the fall/non-fall detection binary classifier, and the lower half $F_{\mathrm{low}}$ is used to train the multi-domain classifier; the fall/non-fall detection binary classifier is a 4-layer fully connected neural network, and the multi-domain classifier is a 4-layer fully connected neural network. During training, the upper half $F_{\mathrm{up}}$ gradually retains information related to fall and non-fall actions, and the lower half $F_{\mathrm{low}}$ gradually retains environment-related information;

The feature extractor, the fall/non-fall detection binary classifier and the multi-domain classifier are not limited to the structures described above, as long as the structure supports the following training objectives:

the feature extractor receives the source domain data and the virtual fall data and outputs the feature vector $F$; it is trained with the goal of separating the environment-related domain information into the lower half of the feature vector $F$ and storing the action information in the upper half of the feature vector $F$;

the fall/non-fall detection binary classifier receives the upper half $F_{\mathrm{up}}$ of the feature vector layer $F$ and is trained with the goal of retaining information related to fall and non-fall actions and deciding between the fall and non-fall classes;

the multi-domain classifier receives the lower half $F_{\mathrm{low}}$ of the feature vector layer $F$ and is trained with the goal of retaining the environment-related domain information;

6) The amplitudes of the CSI data to be classified are input into the trained feature extractor, which outputs the feature vector layer $F$. $F$ is separated into two parts, an upper part $F_{\mathrm{up}}$ and a lower part $F_{\mathrm{low}}$, and the upper part $F_{\mathrm{up}}$ is fed into the trained fall/non-fall detection binary classifier to perform fall detection.

Experiment verification

A WiFi transmitter and a WiFi receiver are deployed in the detection environment. The transmitter is an ordinary commercial router, and the receiver is equipped with an Intel WiFi Link 5300 wireless network card. The transmitter and the receiver each have 3 antennas, forming 9 links. CSI is obtained from the Intel WiFi Link 5300 wireless network card using the CSI Tool; 30 groups of subcarrier information are available for each antenna pair, so each data packet yields 270 groups of subcarrier information.

The specific implementation steps are as follows:

step 1: a pair of WiFi transmitters and WiFi receivers are disposed within the detection area, and the WiFi signals are required to cover the entire detection area, and the experimental environment is schematically shown in fig. 1.

Step 2: CSI action data is collected within the detection area. Several time periods are selected as different environments, and in each environment every user performs the relevant actions, including standing, falling, sitting, squatting, standing, walking, jumping and the like; each type of action is repeated several times. The sampling frequency is set to 100 Hz.

Step 3: The action data from the different environments is taken as the source domain to form the source domain data set $X^S$. Each user repeats each action N times in each environment.

A CSI subcarrier can be described in complex form as

$$H_i = |H_i|\, e^{j\angle H_i}$$

Taking the absolute value of $H_i$ gives the amplitude data set $X^a = \{|H_i|\}$. Variance-threshold interception is applied to the extracted amplitudes so that each action sample has size 300 × 270; the sampled data can be expressed as:

$$x = [x_1, x_2, \dots, x_{300}]$$

where 300 is the number of data packets per action and $x_i = [h_1, h_2, \dots, h_{270}]$ is a data packet containing 270 subcarrier amplitudes.
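The amplitude extraction described above can be sketched as follows. This is a minimal illustration only, assuming each packet arrives as a flat sequence of 270 complex CSI values (9 antenna pairs × 30 subcarriers); the function names `packet_amplitudes` and `build_action_sample` are hypothetical, and the variance-threshold interception that selects the 300 packets is not shown.

```python
from typing import List, Sequence

PACKETS_PER_SAMPLE = 300      # packets kept per action (from the text)
SUBCARRIERS_PER_PACKET = 270  # 9 antenna pairs x 30 subcarriers (from the text)


def packet_amplitudes(packet: Sequence[complex]) -> List[float]:
    """Return the amplitude |H_i| of every complex subcarrier value in one packet."""
    return [abs(h) for h in packet]


def build_action_sample(packets: Sequence[Sequence[complex]]) -> List[List[float]]:
    """Turn 300 packets of 270 complex CSI values into a 300 x 270 amplitude matrix."""
    if len(packets) != PACKETS_PER_SAMPLE:
        raise ValueError("expected 300 packets per action sample")
    sample = []
    for packet in packets:
        if len(packet) != SUBCARRIERS_PER_PACKET:
            raise ValueError("expected 270 subcarrier values per packet")
        sample.append(packet_amplitudes(packet))
    return sample
```

Only the amplitude is kept here, matching the text; the phase of each subcarrier is discarded.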

Step 4: generating virtual samples of the fall actions with a combined self-encoder such that the fall category sample data volume in the training set is consistent with the non-fall category sample data volume, as shown in fig. 2:

Step 4-1: From the source domain data set $x^S$, select the data set $x^S_{c_1}$ whose action type is fall. Each sample of the source domain data $x^S_{c_1}$ is a combination of the data $x^{a_i}_{c_1}$ collected by multiple transceiver antenna pairs, where $a_i$ denotes antenna pair $i$. The source domain data $x^S_{c_1}$ is divided into n groups of data collected by different antenna pairs:

$$x^S_{c_1} = \left[x^{a_1}_{c_1},\ x^{a_2}_{c_1},\ \dots,\ x^{a_n}_{c_1}\right]$$

where n denotes the number of antenna pairs; in this test n is 9, that is, there are 9 different antenna pairs. In other words, each sample of the source domain data $x^S_{c_1}$ is divided into several groups of data, each collected by a different antenna pair;
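The per-antenna-pair split can be sketched as below. This is a hedged illustration: it assumes that the 270 columns of a sample are ordered so that the 30 subcarriers of each antenna pair are contiguous, which the text does not state, and `split_by_antenna_pair` is a hypothetical helper name.

```python
from typing import List

Matrix = List[List[float]]


def split_by_antenna_pair(sample: Matrix, n_pairs: int = 9) -> List[Matrix]:
    """Split a (packets x subcarriers) sample into n_pairs column groups.

    Assumption: columns belonging to one antenna pair are stored contiguously.
    """
    width = len(sample[0])
    if width % n_pairs != 0:
        raise ValueError("subcarrier count must be divisible by the number of antenna pairs")
    per_pair = width // n_pairs
    return [
        [row[k * per_pair:(k + 1) * per_pair] for row in sample]
        for k in range(n_pairs)
    ]
```

With a 300 × 270 sample and n = 9, each returned group has size 300 × 30, which matches the encoder input dimension given in the text.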

Step 4-2: The combined autoencoder reconstructs the n groups of data $x^{a_i}_{c_1}$ collected by the different antenna pairs, generating n groups of corresponding intermediate layer vectors $h^{a_i}$; the combined autoencoder contains n autoencoders, matching the number of different antenna pairs;

each generating encoder corresponds to the data of one antenna pair, and the n generating encoders perform feature extraction on the n groups of data $x^{a_i}_{c_1}$ respectively to obtain the n groups of intermediate layer vectors:

$$h^{a_i} = f\left(x^{a_i}_{c_1} + b\right),\quad i = 1, \dots, n$$

Each generating encoder is a neural network with 4 fully connected layers. The input dimension is 300 × 30, where 30 is the number of subcarriers in each signal link; b is random noise of dimension 300 × 30; the output dimension is 64 × 30. In the embodiment the number of antenna pairs n is 9, so 9 groups of intermediate layer vectors are generated in total. The function f denotes the generating encoder;

Step 4-3: Each generating decoder corresponds to one group of intermediate layer vectors, and the n generating decoders reconstruct the n groups of intermediate layer vectors respectively to obtain the n groups of reconstructed virtual data $\hat{x}^{a_i}_{c_1}$ corresponding to the n groups of data $x^{a_i}_{c_1}$ collected by the different antenna pairs:

$$\hat{x}^{a_i}_{c_1} = g\left(h^{a_i}\right),\quad i = 1, \dots, n$$

where $h^{a_i}$ is the intermediate layer vector of antenna pair $a_i$. The generating decoder mirrors the structure of the generating encoder and is likewise a 4-layer fully connected neural network, with input dimension 64 × 30 and output dimension 300 × 30. In the embodiment of the invention the number of antenna pairs n is 9, so 9 groups of virtual data are generated in total, each corresponding to a different antenna pair. The function g denotes the generating decoder;

Each pair of generating encoder and generating decoder is trained with the mean square error (MSE) as the loss function. The fall action sample $x^{a_i}_{c_1}$ collected by each antenna pair and the corresponding virtual fall action sample $\hat{x}^{a_i}_{c_1}$ are the inputs to the MSE, and the loss function can be expressed as

$$L_{\mathrm{MSE}}^{a_i} = \operatorname{mean}\left[\left(x^{a_i}_{c_1} - \hat{x}^{a_i}_{c_1}\right)^2\right]$$

where the mean is taken over all elements of the sample. The weights of the encoder and decoder are trained so that the reconstructed data $\hat{x}^{a_i}_{c_1}$ of each antenna pair approaches the source domain fall-category data $x^{a_i}_{c_1}$ of the corresponding antenna pair.
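The two numerical ingredients of this training step, adding random Gaussian noise to the input and scoring the reconstruction with MSE, can be sketched without any neural network library. The noise standard deviation `sigma` is an assumption (the text does not give it), the helper names are hypothetical, and the encoder and decoder networks themselves are omitted.

```python
import random
from typing import List

Matrix = List[List[float]]


def add_gaussian_noise(x: Matrix, sigma: float, rng: random.Random) -> Matrix:
    """Return x + b, where b is zero-mean Gaussian noise with the same shape as x."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in x]


def mse(x: Matrix, x_hat: Matrix) -> float:
    """Mean squared error averaged over all elements of two equally shaped matrices."""
    total = 0.0
    count = 0
    for row, row_hat in zip(x, x_hat):
        for v, v_hat in zip(row, row_hat):
            total += (v - v_hat) ** 2
            count += 1
    return total / count
```

In a training loop the noisy matrix would be passed through the encoder and decoder, and `mse` would compare the decoder output with the clean original, so the pair learns to produce samples close to, but not identical with, the real fall data.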

Step 4-4: data to be acquired by n sets of different antenna pairsThe n groups of reconstructed virtual data are spliced into a complete sample of the original falling action and the complete sample of the original falling action>Similar virtual fall motion samples

The transceivers adopted by the invention are all 3 antennas, so that 9 signal links are formed in total, namely n=9.
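Splicing the per-antenna-pair reconstructions back into one full-size virtual sample can be sketched as a column-wise concatenation. This assumes the groups are joined in antenna-pair order, and `splice_antenna_pairs` is a hypothetical name.

```python
from typing import List

Matrix = List[List[float]]


def splice_antenna_pairs(groups: List[Matrix]) -> Matrix:
    """Concatenate n (packets x subcarriers_per_pair) groups column-wise into one sample."""
    n_packets = len(groups[0])
    if any(len(group) != n_packets for group in groups):
        raise ValueError("all groups must contain the same number of packets")
    return [
        [value for group in groups for value in group[p]]
        for p in range(n_packets)
    ]
```

For n = 9 groups of size 300 × 30, the result is a 300 × 270 matrix in the same format as a real action sample.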

Step 5: A feature extractor performs feature separation on the source domain data $x^S$ and the virtual fall data $\hat{x}^S_{c_1}$ to separate the action information from the environment information, and the action features and environment features are then used for classifier training, respectively;

Step 5-1: The feature extractor consists of 5 Conv1d convolution blocks and one fully connected layer; each convolution block contains 1 convolution layer and 1 pooling layer. The input dimension of the feature extractor is 300 × 270 × 1 and the output dimension is (m × 2) × 1:

$$F = G(x)$$

where $G$ denotes the feature extractor, x is the source domain data $x^S$ or the virtual fall data $\hat{x}^S_{c_1}$, and $F$ is the feature vector layer, whose dimension is (m × 2, 1). In the embodiment of the invention the value of m is 1025. The feature vector layer $F$ is divided into two parts, an upper half $F_{\mathrm{up}}$ and a lower half $F_{\mathrm{low}}$, where the dimension of $F_{\mathrm{up}}$ is (m, 1) and the dimension of $F_{\mathrm{low}}$ is (m, 1):

$$F = \left[F_{\mathrm{up}};\ F_{\mathrm{low}}\right]$$
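The division of the feature vector layer into an upper and a lower half can be sketched as a simple slice. This assumes that "upper half" means the first m entries of the 2m-long vector, which the text implies but does not spell out; `split_feature_vector` is a hypothetical name.

```python
from typing import List, Tuple


def split_feature_vector(features: List[float], m: int) -> Tuple[List[float], List[float]]:
    """Split a 2m-long feature vector into (upper half, lower half), each of length m."""
    if len(features) != 2 * m:
        raise ValueError("feature vector must have length 2 * m")
    upper = features[:m]   # routed to the fall/non-fall classifier
    lower = features[m:]   # routed to the multi-domain classifier
    return upper, lower
```

With m = 1025 as in the embodiment, the extractor output has 2050 entries and each classifier receives 1025 of them.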

Step 5-2: The upper half of the feature vector layer is fed into the fall/non-fall detection binary classifier, a 2-layer fully connected neural network, for training. The classifier outputs a probability distribution, which is combined with the true action labels to compute the cross-entropy loss $L_1$:

$$L_1 = -\frac{1}{M}\sum_{i=1}^{M}\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right]$$

where M is the number of samples in each training batch, $y_i$ is the label of sample i (1 for a fall action, 0 for a non-fall action), and $p_i$ is the predicted probability that sample i is a fall.
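The binary cross-entropy over one training batch can be written out directly from the definitions of M, the labels and the predicted fall probabilities. The sketch below is illustrative; the clipping constant `eps` is an added numerical safeguard, not part of the described method.

```python
import math
from typing import Sequence


def binary_cross_entropy(labels: Sequence[int], probs: Sequence[float], eps: float = 1e-12) -> float:
    """Average binary cross-entropy; labels are 1 (fall) or 0 (non-fall)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)
```

A classifier that outputs 0.5 for every sample scores log 2 (about 0.693), and the loss falls toward 0 as the predicted probabilities move toward the true labels.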

Step 5-3: The lower half of the feature vector layer is fed into the multi-domain classifier, a 2-layer fully connected neural network, for training. The multi-domain classifier outputs a probability distribution, which is combined with the true domain labels to compute the cross-entropy loss $L_2$:

$$L_2 = -\frac{1}{M}\sum_{i=1}^{M}\sum_{d=1}^{K} z_{id}\log p_{id}$$

where M is the number of samples in each training batch; K is the number of domain categories; $z_{id}$ is the true domain indicator of the sample (0 or 1), equal to 1 if the true domain category of sample i is d and 0 otherwise; and $p_{id}$ is the predicted probability that sample i belongs to domain category d. The per-category classification performance of the multi-domain classifier is not the focus, and the multi-domain classifier is not used for detection after training. Its role is to assist the training of the binary classifier so that accurate binary detection results are obtained.
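The multi-domain cross-entropy can be sketched in the same way. Because the true domain indicator is one-hot, only the predicted probability of the true domain contributes for each sample; the sketch therefore takes the true domain as an integer index, which is an implementation convenience rather than something stated in the text.

```python
import math
from typing import Sequence


def domain_cross_entropy(domain_ids: Sequence[int], probs: Sequence[Sequence[float]], eps: float = 1e-12) -> float:
    """Average categorical cross-entropy over a batch.

    domain_ids[i] is the true domain index of sample i (0 .. K-1);
    probs[i] is the predicted probability distribution over the K domains.
    """
    total = 0.0
    for d_true, p_row in zip(domain_ids, probs):
        # The one-hot indicator selects only the true-domain term of the inner sum.
        total += math.log(max(p_row[d_true], eps))
    return -total / len(domain_ids)
```

The joint objective given in Step 5-4 is then the sum of this loss and the binary cross-entropy loss of the fall/non-fall classifier.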

Step 5-4: the final objective function L is

$$L = L_1 + L_2$$

Step 6: after training of the fully-connected neural network is completed, for the CSI data in the new environment to be classified, directly inputting the amplitude value into a feature extractor after training is completed, wherein the feature extractor can extract the falling action information and the environment information in the new environment data into feature vectors, gradually separate the information related to the environment into the lower half part of a feature vector layer, and store the needed falling and non-falling action related information in the upper half part of the feature vector layer; and inputting the characteristic vector of the upper part of the information related to the reserved falling motion into a two-classifier for detecting falling and non-falling after training is completed, so as to complete falling detection.

In the verification experiment, data from 12 different environments was used as source domain data, and data from 2 new environments was input into the trained model. In the new environments the average fall detection accuracy was above 87%, the precision above 86%, and the recall above 88%.
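For reference, the three reported metrics can be computed from true and predicted labels as sketched below, with fall treated as the positive class. The function name is hypothetical and any labels passed to it in an example are made up for illustration; they are not the experimental data.

```python
from typing import Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) with label 1 = fall as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
```

Recall is the share of real falls that are detected, while precision is the share of raised fall alarms that are real, so the two together describe missed falls and false alarms respectively.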

Claims

1. A fall detection method based on sample generation and feature separation, comprising the steps of:

3) Data division: an action sample together with its action label and environment label is taken as source domain data, and the source domain data of all actions performed in all environments form the source domain data set; specifically: the CSI data collected in t different environments is taken as the source domain data set $X^S = \{(x^S_{c_i,t_l},\ y_{c_i},\ d_{t_l})\}$, where $x^S_{c_i,t_l}$ denotes source domain CSI data whose action class is $c_i$ and whose environment class is $t_l$; $t_l$ denotes the $l$-th environment class; $y_{c_i}$ denotes the category label of action class $c_i$, where the fall action $c_1$ has one category label and all non-fall actions $c_i$ ($i \neq 1$) share one category label; $d_{t_l}$ denotes the domain label of environment class $t_l$;

4) Noise reconstruction is performed on the action sample of each source domain datum in the source domain data set whose action label is fall to generate one virtual fall action sample; t × N virtual fall action samples are generated in total, and the virtual fall action samples are used to generate virtual fall data in the same format as the source domain data;

the method comprises the following steps:

4-1) from the source domain data set $X^S$, take the subset $X^S_{c_1}$ whose action label is fall and select one source domain datum $x^S_{c_1}$ from it; each sample of the source domain data is a combination of the data $x^{a_i}_{c_1}$ collected by multiple transceiver antenna pairs, where $a_i$ denotes antenna pair $i$; the source domain datum $x^S_{c_1}$ is divided into n groups of data collected by different antenna pairs:

$$x^S_{c_1} = \left[x^{a_1}_{c_1},\ x^{a_2}_{c_1},\ \dots,\ x^{a_n}_{c_1}\right]$$

where n denotes the number of antenna pairs;

4-2) a combined autoencoder is used to reconstruct the n groups of data $x^{a_i}_{c_1}$ collected by the different antenna pairs, generating n groups of corresponding intermediate layer vectors $h^{a_i}$; the combined autoencoder contains n autoencoders, matching the number of antenna pairs;

each generating encoder corresponds to the group of data collected by one antenna pair, and the n generating encoders perform feature extraction on the n groups of data $x^{a_i}_{c_1}$ respectively to obtain the n groups of intermediate layer vectors:

$$h^{a_i} = f\left(x^{a_i}_{c_1} + b\right),\quad i = 1, \dots, n$$

where b is random Gaussian noise and the function f denotes the generating encoder;

4-3) each generating decoder corresponds to one group of intermediate layer vectors, and the n generating decoders reconstruct the n groups of intermediate layer vectors respectively to obtain the n groups of reconstructed virtual data $\hat{x}^{a_i}_{c_1}$ corresponding to the n groups of data $x^{a_i}_{c_1}$ collected by the different antenna pairs:

$$\hat{x}^{a_i}_{c_1} = g\left(h^{a_i}\right),\quad i = 1, \dots, n$$

where the function g denotes the generating decoder;

5-1) a feature extractor performs action and environment separation and extraction on the source domain data and the virtual fall data to obtain a feature vector $F$; the feature extractor is trained with the goal of separating the environment-related domain information into the lower half of the feature vector $F$ and storing the action information in the upper half of the feature vector $F$;

5-2) the feature vector $F$ is divided into an upper half $F_{\mathrm{up}}$ and a lower half $F_{\mathrm{low}}$; the upper half $F_{\mathrm{up}}$ is used to train the fall/non-fall detection binary classifier, and the lower half $F_{\mathrm{low}}$ is used to train the multi-domain classifier; the fall/non-fall detection binary classifier receives the upper half $F_{\mathrm{up}}$ of the feature vector layer and is trained with the goal of retaining information related to fall and non-fall actions and deciding between the fall and non-fall classes; the multi-domain classifier receives the lower half $F_{\mathrm{low}}$ of the feature vector layer and is trained with the goal of retaining the environment-related domain information;

5) The detection step comprises: inputting the subcarrier amplitudes of the CSI data to be classified into the trained feature extractor, and inputting the upper half of the feature vector output by the feature extractor into the trained fall/non-fall detection binary classifier to complete fall detection.

2. A method as claimed in claim 1, wherein the c actions set in step 2) comprise in particular 1 fall action and c-1 non-fall actions; setting environment types, namely t environments; within each environment, each user repeatedly performs c actions, each action being performed N times.

3. The method of claim 1, wherein the generating decoder is a 4-layer fully connected neural network whose structure mirrors that of the generating encoder.

4. The method of claim 1, wherein the feature extractor consists of 5 convolution blocks and one fully connected layer, each convolution block comprising 1 convolution layer and 1 pooling layer; the dimension of the feature vector $F$ is (m × 2, 1), where the value of the dimension parameter m is adjusted dynamically according to the number of antenna pairs, the dimension of the upper half $F_{\mathrm{up}}$ is (m, 1), and the dimension of the lower half $F_{\mathrm{low}}$ is (m, 1).

5. The method of claim 1, wherein the fall/non-fall detection binary classifier is a 2-layer fully connected neural network, and the domain classifier is a 2-layer fully connected neural network.