CN116340849B

CN116340849B - A non-contact cross-domain human activity recognition method based on metric learning

Info

Publication number: CN116340849B
Application number: CN202310556403.3A
Authority: CN
Inventors: 毛一敏; 肖甫; 郭政鑫; 桂林卿; 盛碧云; 李延超; 蔡惠
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2023-05-17
Filing date: 2023-05-17
Publication date: 2023-08-15
Anticipated expiration: 2043-05-17
Also published as: CN116340849A

Abstract

A non-contact cross-domain human activity recognition method based on metric learning, in which an autoencoder is used to augment the collected activity data and metric learning is then used to recognize new activity categories that are absent from the training set. The method comprises the following steps: collecting the wireless signal data produced while a person is active in an indoor environment and extracting the raw CSI data from it; preprocessing the raw CSI data, including interpolation to unify the data length and denoising; augmenting the data of known activity types with an autoencoder to expand the data set; and training a feature extraction network on the expanded data set, feeding the activity data to be recognized and the support set into the feature extraction network to obtain the corresponding features, and using metric learning to compare the features of the data to be recognized with those of the support set one by one to determine the activity type. The method achieves high recognition accuracy for untrained activity types and improves generalization and robustness.

Description

A non-contact cross-domain human activity recognition method based on metric learning

Technical Field

The invention relates to the field of activity recognition, and in particular to a non-contact cross-domain human activity recognition method based on metric learning.

Background

Wi-Fi-based Human Activity Recognition (HAR) aims to recognize human activities from Wi-Fi signals. Because it requires neither special wearable devices nor additional deployment cost, Wi-Fi-based human activity recognition has attracted growing attention for intelligent applications such as human-computer interaction, intelligent monitoring, and intrusion detection. However, collecting Channel State Information (CSI) data takes considerable manpower and material resources, so public Wi-Fi sensing datasets are scarce, and Wi-Fi sensing lacks enough labeled data to train well-performing machine learning models. In addition, wireless signals are affected by the physical environment and by human behavior: during transmission, different human states (position, posture, or movement) and different activity types may cause similar signal fluctuations, which makes activity recognition more complicated. In practical applications in particular, the system may need to recognize activity types never seen during training, and recognizing unknown activity types is a major challenge.

Traditional Wi-Fi-based recognition usually adopts supervised methods, which can recognize only predefined activities and need large amounts of data to learn sufficient prior knowledge; their complicated training process raises the cost of model training and substantially increases the overall overhead. These methods first collect a large number of labeled samples and then train a conventional machine learning model to recognize activity types. However, when new activity types that did not appear during training must be recognized, the categories of the new data differ from those in the training set, so the new activity data must be recognized across categories, and the recognition performance of such models degrades severely. To address this, some recent work collects data for the new activity types and retrains the model, but retraining adds to the training overhead.

In computer vision, metric learning is widely used to recognize unseen activities and scenes. Metric learning is usually formulated as an optimization problem over an objective function that measures the similarity of data: the data are mapped from the original vector space to a latent space to obtain activity-related features. The present invention targets the problems described above: Wi-Fi CSI datasets are scarce, and real scenarios may require recognizing activities of unknown type.

Summary of the Invention

The purpose of the present invention is to provide a non-contact cross-domain human activity recognition method based on metric learning. An autoencoder expands the activity data set to enrich it and improve recognition accuracy; a metric-learning-based method then compares the similarity between data samples to recognize new activity categories that do not appear in the training set.

A non-contact cross-domain human activity recognition method based on metric learning comprises the following steps:

Step 1: collect the wireless signal data corresponding to human activity in an indoor environment and extract the raw CSI signal from it;

Step 2: preprocess the raw CSI data collected in step 1, including interpolation to unify the data length and denoising to remove noise introduced by the hardware;

Step 3: use an autoencoder to augment the data processed in step 2, generating activity-specific data to expand the data set;

Step 4: build an activity recognition model using a metric-learning-based method; train on the data generated in step 3 together with the original data to learn the similarity between samples and obtain an activity-related feature extraction model; and compare the features of the data to be recognized with those of the support set to recognize activities of unknown type.

Beneficial effects of the present invention:

(1) Metric learning is used to compare the features of the unknown-type activity data to be recognized with the features of the support set one by one, which enables recognition of activities of unknown type.

(2) The invention solves the problem of recognizing unknown activity types in wireless sensing. It achieves high recognition accuracy for untrained activity types without additional training cost, which improves the robustness of the activity recognition network.

(3) Inspired by the core idea of metric learning, the network is trained under the criterion that similar samples are pulled closer together and dissimilar samples are pushed farther apart, so unknown activity types can be recognized without a complicated training strategy.

Brief Description of the Drawings

Fig. 1 is a schematic flowchart of the method for recognizing activities of unknown type based on channel state information in an embodiment of the present invention.

Fig. 2 is a schematic diagram of the structure of the unknown activity recognition network in an embodiment of the present invention.

Fig. 3 is a schematic diagram of the cross-domain activity recognition results in the laboratory scenario in an embodiment of the present invention.

Fig. 4 is a schematic diagram of the cross-domain activity recognition results in the meeting room scenario in an embodiment of the present invention.

Fig. 5 is a schematic diagram of the cross-domain activity recognition results in the bedroom scenario in an embodiment of the present invention.

Fig. 6 is a schematic diagram of the cross-domain activity recognition results in the corridor scenario in an embodiment of the present invention.

Fig. 7 is a schematic diagram of the effect of the data augmentation method on activity recognition accuracy in an embodiment of the present invention.

Detailed Description

The technical solution of the present invention is described in further detail below with reference to the accompanying drawings.

As shown in Fig. 1, the present invention provides a non-contact cross-domain human activity recognition method based on metric learning, which proceeds as follows:

Step 1: A computer equipped with an Intel 5300 network card and three antennas serves as the receiver, and a computer equipped with an Intel 5300 network card and one antenna serves as the transmitter. The devices are deployed in four everyday scenarios (laboratory, conference room, office, bedroom). The transmitter and receiver are 3 m apart in a straight line and 1 m above the ground. The transceivers exchange data packets while volunteers perform activities, collecting activity-related data. The Linux 802.11 CSI tools are used to extract the raw CSI data for human activity recognition from the collected activity data.

Step 2: CSI (Channel State Information) characterizes the signal propagation space. Different activities cause different multipath propagation of the signal, and this property allows Wi-Fi to be used for human activity recognition.

The wireless signal sent by the transmitter is affected during transmission by the physical environment and by human factors (position, posture, movement), and propagates along multiple paths through direct transmission, reflection, and scattering, producing the multipath effect. When a person moves, the signal at the receiver changes, and the received signal reflects the multipath changes caused by that movement. CSI contains the amplitude and phase of each subcarrier, characterizes how the signal propagates through space, and reflects changes in the signal. The signal received at the receiver can be described as:

$$Y = HX + N$$

where $H$ is the channel matrix, which represents the CSI; $Y$ and $X$ are the received and transmitted signal vectors, respectively; and $N$ is additive white Gaussian noise.

Because a radio-frequency signal travels from transmitter to receiver along multiple paths in an indoor environment, the CSI is the superposition of the signals on all paths. The channel matrix can therefore be expressed as:

$$H = \sum_{l=1}^{L} a_l \, e^{-j 2\pi d_l / \lambda}$$

Here $L$ is the number of paths, $a_l$ is the complex attenuation of the $l$-th path, $d_l$ is the propagation length of the $l$-th path, and $\lambda$ is the wavelength. It follows from this model that when a person moves indoors, the signal reflected by the human body travels a different path length, which changes the channel state matrix.
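The effect of a changing path length on the channel can be illustrated numerically. The following is a minimal sketch, assuming the common multipath form in which each path contributes a complex attenuation multiplied by a phase term set by its length. The function name `channel_response`, the attenuation values, and the path lengths are illustrative assumptions and are not taken from the invention.

```python
import cmath
import math

SPEED_OF_LIGHT = 3.0e8  # m/s (approximate)


def channel_response(paths, wavelength):
    """Sum the contribution a * exp(-j*2*pi*d/wavelength) of every path.

    paths: iterable of (complex_attenuation, path_length_in_metres).
    """
    return sum(a * cmath.exp(-2j * math.pi * d / wavelength) for a, d in paths)


# 5.8 GHz carrier, the signal frequency stated for the experiments.
wavelength = SPEED_OF_LIGHT / 5.8e9

# Hypothetical scene: a 3 m direct path plus one reflection off a person.
h_before = channel_response([(1.0, 3.0), (0.4, 4.00)], wavelength)
# The person moves, lengthening the reflected path by 1 cm.
h_after = channel_response([(1.0, 3.0), (0.4, 4.01)], wavelength)

print(abs(h_before), abs(h_after))
```

Because the wavelength at 5.8 GHz is only about 5 cm, even a centimetre-scale change in the reflected path shifts its phase noticeably, which is why body movement is visible in the CSI amplitude.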

Step 2-1: Data interpolation. Because of device hardware and channel effects, packets are lost during data transmission. Cubic spline interpolation over the timestamps is used to fill in the missing CSI data so that all CSI sequences have the same length. MATLAB provides interpolation functions directly, and this is a commonly used data processing method.
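The interpolation step can be sketched in a few lines. The example below is a minimal pure-Python sketch, assuming a natural cubic spline (zero second derivative at both ends); the text does not state which boundary condition is used, and the helper names `natural_cubic_spline` and `fill_lost_packets` as well as the sample values are hypothetical. In practice a library routine such as MATLAB's `interp1` with the `'spline'` method would normally be used instead.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two points.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Right-hand side of the tridiagonal system for the second-derivative terms.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep (natural boundary: zero second derivative at both ends).
    l, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution for the polynomial coefficients of each segment.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        # Locate the segment containing x, clamped to the first/last segment.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


def fill_lost_packets(timestamps, values, period):
    """Resample one CSI stream onto a uniform time grid with spacing `period`."""
    spline = natural_cubic_spline(timestamps, values)
    count = round((timestamps[-1] - timestamps[0]) / period) + 1
    grid = [timestamps[0] + k * period for k in range(count)]
    return grid, [spline(t) for t in grid]


# 200 Hz sampling (5 ms period); the packet at t = 0.015 s was lost.
t = [0.000, 0.005, 0.010, 0.020, 0.025]
amp = [1.00, 1.20, 1.35, 1.30, 1.10]
grid, filled = fill_lost_packets(t, amp, period=0.005)
print(len(filled))
```

Resampling every recording onto the same uniform grid is what makes the sequence lengths equal, so that the later network stages can take fixed-size inputs.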

Step 2-2: Data denoising. For everyday human activities, the CSI amplitude fluctuates between 30 and 60 Hz. To remove as much hardware-induced noise as possible without distorting the signal, the present invention applies a Butterworth low-pass filter to the CSI data from step 2-1 to remove high-frequency environmental noise, and a Hampel filter to remove outliers from the CSI data, yielding the preprocessed data.
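The outlier-removal part of this step can be sketched as follows. This is a minimal pure-Python Hampel filter and covers only that part; the Butterworth low-pass stage is not shown. The window half-width and the threshold of three scaled deviations are assumed values, since the text does not specify them, and the function name `hampel_filter` is hypothetical. The constant 1.4826 is the usual factor that scales the median absolute deviation to a standard-deviation estimate for Gaussian data.

```python
from statistics import median


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace points that deviate strongly from the local median.

    For each sample, the median and the median absolute deviation (MAD)
    of the surrounding window are computed; a sample farther than
    n_sigmas * 1.4826 * MAD from the median is replaced by the median.
    """
    scale = 1.4826
    filtered = list(values)
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        if abs(values[i] - med) > n_sigmas * scale * mad:
            filtered[i] = med
    return filtered


# One CSI amplitude stream with a single spike at index 4.
stream = [1.0, 1.1, 0.9, 1.0, 9.0, 1.0, 1.1, 0.9, 1.0]
cleaned = hampel_filter(stream)
print(cleaned)
```

A median-based test is used here because a single large spike barely moves the median, whereas it would distort a mean-based threshold.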

Step 3: To generate more activity-specific data from the collected data, a multi-autoencoder module is designed in which each submodule is dedicated to one activity type. The overall framework of the data augmentation module is shown in Fig. 2(a): for the $i$-th activity type, the $i$-th autoencoder is used to generate the corresponding activity data. Each autoencoder has an encoder-decoder structure. First, the encoder compresses the data, extracts the feature information related to its activity, and encodes the input as a normal distribution $N(0, \sigma)$. Let the probability of generating the data $x$ be $P(x)$; given the input $x$, the encoder outputs the latent vector $z$ with probability $P(z \mid x)$. The extracted features and Gaussian random noise are then fed into the decoder to reconstruct the data, where the decoder is described as $P(x \mid z)$.

Finally, a loss function is introduced in the training phase. The goal is to train the encoder so that the probability $P(z \mid x)$ of outputting the latent vector $z$ approximates the probability $P(z)$ of generating $z$, where $z$ follows the normal distribution $N(0, \sigma)$. The encoder loss is defined as the Kullback–Leibler divergence

$$L_{encoder} = D_{KL}\big(N(\mu_i, \sigma_i),\ N(0, \sigma)\big)$$

where $\mu_i$ and $\sigma_i$ are the expectation and variance, respectively, of the normal distribution for category $i$.
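For normal distributions the Kullback–Leibler divergence has a closed form, so the encoder loss can be computed directly from the predicted mean and variance. The sketch below assumes that each latent dimension is an independent one-dimensional Gaussian and that the per-dimension terms are summed; neither detail is stated in the text. The function names and the default prior variance of 1.0 are hypothetical.

```python
import math


def kl_normal(mu_p, var_p, mu_q, var_q):
    """KL divergence D_KL(N(mu_p, var_p) || N(mu_q, var_q)) for 1-D Gaussians."""
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def encoder_loss(mus, variances, prior_var=1.0):
    """Sum the per-dimension KL terms against the zero-mean prior N(0, prior_var)."""
    return sum(kl_normal(m, v, 0.0, prior_var) for m, v in zip(mus, variances))


# A two-dimensional latent code: one dimension matches the prior, one is shifted.
loss = encoder_loss(mus=[0.0, 1.0], variances=[1.0, 1.0])
print(loss)
```

The loss is zero only when the encoder's distribution equals the prior, and it grows as the predicted mean drifts from zero or the variance departs from the prior variance.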

The output of the encoder, mixed with noise $n$, is fed into the decoder to reconstruct the input $x$. Taking the real data as the ground truth, the reconstruction loss $L_{decoder}$ is minimized.

The overall loss function of the data augmentation module is then defined as:

$$L_{aug} = \min(L_{encoder}, L_{decoder})$$

The data augmentation module can generate a large amount of synthetic data that is similar to, but different from, the training data, and this data is used to expand the training set.

Step 4: Unknown activity types cannot be mapped directly to known labels. Inspired by metric learning, a simple classification problem can be turned into a comparison problem by learning how similar two samples are, that is, by comparing the similarity between samples. On this basis the present invention proposes an activity recognition module.

Step 4-1: Divide the data set.

Based on step 3, the augmented, expanded data set is obtained and used as the training set. The training set is then labeled in the form of triplets $(x, x_+, x_-)$, where the positive sample $x_+$ has the same class label as the anchor (baseline) sample $x$, and the negative sample $x_-$ has a different activity class label. The training data set can therefore be expressed as $train = \{(x_1, x_{1+}, x_{1-}), (x_2, x_{2+}, x_{2-}), \ldots, (x_i, x_{i+}, x_{i-})\}$, where $(x_i, x_{i+}, x_{i-})$ corresponds to the $i$-th sample.
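The triplet labeling described above can be sketched as follows. This is a minimal illustration that assumes one randomly drawn positive and one randomly drawn negative per anchor; the text does not say how positives and negatives are chosen, and the function name `build_triplets` and the toy sample identifiers are hypothetical.

```python
import random


def build_triplets(samples, labels, rng=None):
    """Form one (anchor, positive, negative) triplet per sample where possible."""
    rng = rng or random.Random(0)
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)
    triplets = []
    for idx, label in enumerate(labels):
        positives = [j for j in by_label[label] if j != idx]
        negatives = [j for j, other in enumerate(labels) if other != label]
        if not positives or not negatives:
            continue  # a triplet needs a same-class and a different-class sample
        triplets.append(
            (samples[idx], samples[rng.choice(positives)], samples[rng.choice(negatives)])
        )
    return triplets


# Toy data: sample names stand in for preprocessed CSI recordings.
samples = ["sit_1", "sit_2", "walk_1", "walk_2"]
labels = ["sit", "sit", "walk", "walk"]
triplets = build_triplets(samples, labels)
print(triplets)
```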

For activity data of unknown type, a group of samples (one sample per category) is randomly selected from the unknown-type activity data set to form the support set. The support set is described as $S = \{(x_i, y_i)\}_{i=1}^{k}$, where $y_i$ is the label of the randomly drawn sample $x_i$, and $k$ is the number of unknown activity types.

Step 4-2: Train the feature extractor to extract activity-related features.

The overall framework of the feature extraction module is shown in Fig. 2(b). Specifically, the activity recognition method consists of three modules that use the same feed-forward network and share the same parameters. It takes three inputs and produces two intermediate output values, each representing the L2 distance between two of the inputs. To output a comparison operator from the model, a Softmax function is applied to these outputs to produce a ratio measure. The feed-forward network consists mainly of six blocks, CNN→LSTM→CNN→LSTM→CNN→Linear. Each CNN block, composed of BatchNorm→CNN→Maxpool, learns deep features of the CSI signal; each LSTM block, composed of BatchNorm→LSTM→Dropout, learns the temporal correlation of the CSI signal; and the Linear block, composed of Linear→Relu→Linear, maps the extracted data into the feature space.

The purpose of training is to make the distance between data of the same category as small as possible and the distance between data of different categories as large as possible. The loss is therefore defined as:

$$L(x, x_+, x_-) = \max(d_+ - d_- + margin,\ 0)$$

where

$$d_+ = \lVert E(x) - E(x_+) \rVert_2, \qquad d_- = \lVert E(x) - E(x_-) \rVert_2$$

In these formulas, the function $E(\cdot)$ denotes the feature extractor, and the goal is to drive the loss of the feature extraction network toward 0. In addition, $margin = 1$; $d_-$ is the Euclidean distance between $x_-$ and $x$, and $d_+$ is the Euclidean distance between $x_+$ and $x$.
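The training criterion can be illustrated with a short sketch. It assumes the standard triplet margin form max(d+ − d− + margin, 0), which matches the quantities defined in the text (the Euclidean distances d+ and d− and margin = 1), and it takes the feature vectors E(x), E(x+), E(x−) as already computed. The function name `triplet_loss` and the toy vectors are hypothetical.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss on already-extracted feature vectors.

    d_pos is the Euclidean distance anchor-positive, d_neg anchor-negative.
    The loss is zero once the negative is farther than the positive by
    at least `margin`.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Well-separated triplet: positive coincides with the anchor, negative is far away.
easy = triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
# Badly embedded triplet: the positive is far away and the negative coincides.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
print(easy, hard)
```

Only triplets that violate the margin contribute a gradient, so training concentrates on pairs of activities that the extractor still confuses.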

The three feed-forward networks are updated simultaneously using the backpropagation algorithm. Training on the training set thus yields a feature extractor that extracts the activity-related features from CSI data.

Step 4-3: Activity recognition.

As shown in Fig. 2, the support set $S$ and the target sample data are input to the feature extraction network to obtain the corresponding activity features. The recognition process is shown in Fig. 2(c): the similarity between the feature of the target sample and the features of all samples in the support set is measured, and the label of the most similar support-set sample is chosen as the label $y_t$ of the target sample:

$$y_t = y_{i^*}, \qquad i^* = \arg\max_{i:\ (x_i, y_i) \in S} d_k(x_i, x_t)$$

where $y_i$ is the label of a sample in the support set, $S$ is the support set, and $i$ indexes the $i$-th item in the support set. The cosine similarity $d_k$ between the support-set sample $x_i$ and the target sample $x_t$ is

$$d_k(x_i, x_t) = \frac{E(x_i) \cdot E(x_t)}{\lVert E(x_i) \rVert_2 \, \lVert E(x_t) \rVert_2}$$
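The recognition rule, assigning the label of the most similar support-set sample, can be sketched as below. The sketch operates on feature vectors that are assumed to have already been produced by the trained feature extractor; the function names and the two-dimensional toy features are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(target_feature, support_features):
    """Return the label of the support sample most similar to the target.

    support_features: list of (feature_vector, label), one per unknown class.
    """
    best = max(support_features, key=lambda item: cosine_similarity(item[0], target_feature))
    return best[1]


# Toy support set with one feature vector per unseen activity.
support = [([1.0, 0.0], "wave"), ([0.0, 1.0], "turn")]
print(classify([0.9, 0.1], support), classify([0.2, 0.8], support))
```

Because the decision only compares features, new activity types can be added by supplying one labeled support sample each, without retraining the extractor.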

The steps above complete the recognition of activities of unknown type.

To evaluate the robustness of the method in different environments, it was implemented in four indoor scenarios (laboratory, conference room, bedroom, and living room) with a variety of complex wireless environments. In each scenario, two laptops (Think-pad X200) served as the transceivers; both were equipped with Intel 5300 cards and had the Linux 802.11n CSI Tool installed for collecting CSI data. The transmitter has Nt = 1 antenna and the receiver has Nr = 3 antennas. Collecting the human activity signals yields CSI data for 1 × 3 × 30 = 90 subcarriers, with 30 subcarriers per transmit-receive antenna pair. In this method, the signal frequency is 5.8 GHz, the bandwidth is 20 MHz, and the sampling frequency is 200 Hz.

To evaluate how well the method adapts to different people, seven volunteers of different heights and weights were recruited in each experimental scenario to perform eight activities (standing up, standing, sitting down, sitting down, bending over, walking, waving, and turning around). The data for each scenario is divided into a training set (four activity types), a test set (four unseen activity types), and a support set (only one sample of each activity).

Accuracy is used to evaluate the performance of the system.

The recognition results of the present invention are as follows:

1. Overall recognition accuracy.

The proposed method is validated in the four scenarios. As the table below shows, its recognition accuracy is clearly higher than that of several existing networks. Specifically, the activity recognition method of the present invention is significantly more accurate than CNN and MatNet; in addition, compared with the transfer learning method, its accuracy in the four scenarios is higher by 9.31, 5.25, 10.22, and 21.33 percentage points, respectively. This is mainly because the proposed method mines the relationships between data and recognizes activities by comparing the similarity between the data and the support-set data, which separates samples more effectively.

Table 1. Recognition accuracy (%) in the four scenarios

Model      Laboratory   Meeting room   Bedroom   Parlor
CNN        26.15        23.48          23.79     21.74
MatNet     55.35        52.87          53.10     52.61
Transfer   63.19        68.85          70.28     56.67
WIUS       72.50        74.10          80.50     78.00
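The improvements quoted in the text can be checked against the table with a few lines of arithmetic. The sketch below assumes that the WIUS row is the proposed method, since its values match the accuracies reported for it, and that the Transfer row is the transfer learning baseline; the variable names are illustrative.

```python
# Accuracy (%) per scenario, copied from Table 1.
scenes = ["Laboratory", "Meeting room", "Bedroom", "Parlor"]
accuracy = {
    "CNN": [26.15, 23.48, 23.79, 21.74],
    "MatNet": [55.35, 52.87, 53.10, 52.61],
    "Transfer": [63.19, 68.85, 70.28, 56.67],
    "WIUS": [72.50, 74.10, 80.50, 78.00],
}

# Gain of WIUS over the transfer learning baseline, in percentage points.
gains = [round(w - t, 2) for w, t in zip(accuracy["WIUS"], accuracy["Transfer"])]

for scene, gain in zip(scenes, gains):
    print(f"{scene}: +{gain:.2f} points")
```

The differences are absolute gaps between two accuracies, so they are best read as percentage points rather than relative improvements.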

2. Recognition accuracy for new activity categories that did not appear in the training set.

Confusion matrices of the activity classification are plotted to show in detail how effective the activity recognition method of the present invention is for the different activity categories. As shown in Figs. 3 to 6, the recognition accuracy for new activity categories reaches 72.5%, 74.1%, 80.5%, and 78%, respectively. The experimental results show that the unseen "standing up" activity is recognized correctly with higher probability than the other activities. The reason is that a person who stands up first accelerates rapidly and then slows to zero velocity. In addition, standing up is a continuous movement toward the transceiver pair, which distinguishes it well from the other activities. By contrast, "waving" and "turning around" both involve moving away from the transceiver link to the farthest point and then moving back toward the transceiver pair, so each is easily misclassified as the other.

3. Effect of the data augmentation method on activity recognition accuracy.

To verify the effectiveness of the proposed data augmentation method, an ablation analysis compares cross-domain activity recognition with and without data augmentation. As shown in Fig. 7, the recognition accuracy for unknown activities improves substantially after the data set is augmented: across the four scenarios, the average recognition accuracy for unknown activities increases by 10.95%. In the bedroom and the corridor in particular, expanding the data set gives the method enough data to train the model, which improves the model's feature extraction ability.

The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to this embodiment. Any equivalent modification or variation made by a person of ordinary skill in the art on the basis of the disclosure of the present invention shall fall within the scope of protection set out in the claims.

Claims

1. A non-contact cross-domain human activity recognition method based on metric learning, characterized in that: the method comprises the steps of:

Step 1, collect the wireless signal data corresponding to people's activities in the indoor environment, and extract the original signal of CSI from it;

Step 2, performing data preprocessing on the original CSI data collected in step 1, data interpolation with a uniform length, and data denoising to remove noise signals caused by hardware devices;

Step 3, use the autoencoder to perform data enhancement on the data processed in step 2, and generate data related to corresponding activities to expand the data set;

Step 4, use a metric-learning-based method to build an activity recognition model, train on the data generated in step 3 together with the original data to learn the similarity between samples and obtain an activity-related feature extraction model, and compare the features of the data to be recognized with those of the support set to recognize activities of unknown type;

In step 4, the activity recognition module includes a feature extraction network and an activity classification module;

Step 4-1: Divide the dataset;

Based on step 3, the augmented, expanded data set is obtained and used as the training set; the training data set is expressed as $train = \{(x_1, x_{1+}, x_{1-}), (x_2, x_{2+}, x_{2-}), \ldots, (x_i, x_{i+}, x_{i-})\}$, where the positive sample $x_+$ has the same class label as the anchor (baseline) sample $x$, and the negative sample $x_-$ has a different activity class label;

From the activity data set of unknown type, randomly select a group of data to generate a support set; the support set is described as $S = \{(x_i, y_i)\}_{i=1}^{k}$, where $y_i$ is the label corresponding to the data sample $x_i$, and $k$ is the number of activities of unknown type;

Step 4-2: Train the feature extractor to extract activity-related features;

The feature extraction module consists of three modules that use the same feed-forward network and share the same parameters; the feature extraction module has three inputs and two corresponding intermediate output values, each representing the L2 distance between two inputs; to output a comparison operator from the model, a Softmax function is applied to the outputs to create a ratio; the feed-forward network is mainly composed of six blocks, CNN→LSTM→CNN→LSTM→CNN→Linear; the CNN block, composed of BatchNorm→CNN→Maxpool, learns the deep features of the CSI signal; the LSTM block, composed of BatchNorm→LSTM→Dropout, learns the correlation of the CSI signal in the time domain; the Linear block, composed of Linear→Relu→Linear, maps the extracted data into the feature space;

The purpose of training is to make the distance between data of the same category as small as possible and the distance between data of different categories as large as possible, so the loss is defined as:

$$L(x, x_+, x_-) = \max(d_+ - d_- + margin,\ 0)$$

where

$$d_+ = \lVert E(x) - E(x_+) \rVert_2, \qquad d_- = \lVert E(x) - E(x_-) \rVert_2$$

In the formulas, the function $E(\cdot)$ refers to the feature extractor; the goal is to drive the loss of the feature extraction network toward 0; $margin = 1$; $d_-$ is the Euclidean distance between $x_-$ and $x$, and $d_+$ is the Euclidean distance between $x_+$ and $x$;

Step 4-3: activity recognition;

Input the support set $S$ and the target sample data to the feature extraction network to obtain the corresponding activity features; measure the similarity between the feature of the target sample and the features of all samples in the support set, and select the label of the most similar support-set sample as the label $y_t$ of the target sample:

$$y_t = y_{i^*}, \qquad i^* = \arg\max_{i:\ (x_i, y_i) \in S} d_k(x_i, x_t)$$

where $y_i$ refers to the label corresponding to a sample in the support set, $S$ refers to the support set, and $i$ refers to the $i$-th data in the support set; the cosine similarity $d_k$ between the support-set sample $x_i$ and the target sample $x_t$ is:

$$d_k(x_i, x_t) = \frac{E(x_i) \cdot E(x_t)}{\lVert E(x_i) \rVert_2 \, \lVert E(x_t) \rVert_2}$$

2. A non-contact cross-domain human activity recognition method based on metric learning according to claim 1, characterized in that: in step 1, use Linux 802.11 CSI tools to extract the original signal of CSI therefrom.

3. The non-contact cross-domain human activity recognition method based on metric learning according to claim 1, characterized in that: in step 2, the data preprocessing method is as follows:

Step 2-1: Data interpolation: for packet loss that occurs during data transmission, use cubic spline interpolation over the timestamps to fill in the CSI data;

Step 2-2: Data denoising: use a Butterworth low-pass filter to denoise the CSI data from step 2-1 and remove high-frequency noise in the environment, and use a Hampel filter to remove outliers in the CSI data, to obtain the preprocessed data.

4. The non-contact cross-domain human activity recognition method based on metric learning according to claim 1, characterized in that: in step 3, the specific method of data augmentation is as follows:

Use an autoencoder to generate the relevant activity data, where each autoencoder has an encoder-decoder structure; first, the data is compressed by the encoder, the feature information related to its activity is extracted, and the input is encoded as a normal distribution $N(0, \sigma)$; let the probability of generating the data $x$ be $P(x)$; given the input $x$, the encoder outputs the latent vector $z$ with probability $P(z \mid x)$; the extracted features and Gaussian random noise are then fed into the decoder to reconstruct the data;

Finally, a loss function is introduced in the training phase; the goal is to train the encoder so that the probability $P(z \mid x)$ of outputting the latent vector $z$ approximates the probability $P(z)$ of generating $z$, where $z$ follows the normal distribution $N(0, \sigma)$; the encoder loss is defined as the Kullback–Leibler divergence, where $\mu_i$ and $\sigma_i$ represent the expectation and variance, respectively, of the normal distribution of category $i$;

$$L_{encoder} = D_{KL}\big(N(\mu_i, \sigma_i),\ N(0, \sigma)\big);$$

The output of the encoder, mixed with noise $n$, is fed into the decoder to reconstruct the input $x$; taking the real data as the ground truth, the reconstruction loss $L_{decoder}$ is minimized;

Then, the overall loss function of the data augmentation module is defined as:

$$L_{aug} = \min(L_{encoder}, L_{decoder}).$$