WO2023226186A1 - Neural network training method, human motion recognition method and device, and storage medium - Google Patents

Neural network training method, human motion recognition method and device, and storage medium

Info

Publication number
WO2023226186A1
WO2023226186A1 (PCT/CN2022/108857)
Authority
WO
WIPO (PCT)
Prior art keywords
training
graph
neural network
data
input
Prior art date
Application number
PCT/CN2022/108857
Other languages
English (en)
French (fr)
Inventor
颜延
廖天正
赵金津
任旭超
赵瑞麒
马良
王磊
刘语诗
熊璟
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2023226186A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02Preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02Preprocessing
    • G06F2218/04Denoising
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12Classification; Matching
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of neural network technology, and in particular to a neural network training method, human movement recognition method and equipment, and storage media.
  • This application mainly provides a neural network training method, human motion recognition method and equipment, and storage media to solve the problem that traditional manual feature methods are time-consuming and the extracted features lack incremental and unsupervised learning capabilities and generalization capabilities.
  • the neural network training method includes:
  • each training graph data is the training graph data of a time slice in the training data
  • the training graph data is input into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence;
  • the weight matrix of the final graph neural network is obtained to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers.
  • inputting the training graph data into a graph neural network for training includes:
  • each training graph data is input into the first graph convolution layer of the graph neural network to obtain the first output of the first graph convolution layer; the first output is input into the graph convolution layer following the first graph convolution layer, so that the first output is used as the input of that next graph convolution layer for training, until the training of all graph convolution layers of the graph neural network is completed.
  • inputting the first output into the next graph convolution layer after the first graph convolution layer, so that the first output is used as the input of the next graph convolution layer for training, includes:
  • the first output and the training graph data are superimposed to obtain fused data;
  • the fused data is input into the next graph convolution layer, so that the fused data is used as the input of the next graph convolution layer for training.
  • the first output is generated by calculation from the training graph data and the training weights of the first graph convolution layer;
  • the first output is converted into the input of the next graph convolutional layer through an activation function.
  • inputting the training graph data into a graph neural network for training includes:
  • the spatial features and the diagonal matrix are used to form the output of the graph neural network for training.
  • using the spatial features and the diagonal matrix to form the output of the graph neural network for training includes:
  • the updated spatial features and the diagonal matrix are used to form the output of the graph neural network for training.
  • updating the spatial characteristics of each node feature based on the preset convolution kernel receptive field includes:
  • each node characteristic is input into the Chebyshev polynomial recursive equation, and the updated spatial characteristics of each node characteristic are recursively obtained.
  • the graph neural network is further connected with at least one fully connected layer after the several graph convolution layers, and the at least one fully connected layer is used for training classification tasks.
  • the weight matrix of the final graph neural network is obtained based on the training results.
  • the neural network training method further includes:
  • the migration neural network is retrained.
  • the human movement recognition method includes:
  • Preprocess the human body motion data to obtain human body motion graph data
  • the graph neural network is trained by the above-mentioned neural network training method.
  • a terminal device which includes a memory and a processor coupled to the memory;
  • the memory is used to store program data
  • the processor is used to execute the program data to implement the above-mentioned neural network training method and/or human movement recognition method.
  • the computer storage medium is used to store program data;
  • when the program data is executed by the computer, it is used to implement the above neural network training method and/or human motion recognition method.
  • the neural network training method includes: obtaining a training data set, and preprocessing the training data in the training data set to obtain several training graph data, wherein each training graph data is the training graph data of one time slice in the training data; inputting the training graph data into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence; and, based on the training results, obtaining the weight matrix of the final graph neural network to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers.
  • Figure 1 is a schematic flow chart of an embodiment of a neural network training method provided by this application.
  • Figure 2 is a schematic diagram of the framework of the graph neural network provided by this application.
  • Figure 3 is a schematic diagram of the main flow of the neural network training method provided by this application.
  • Figure 4 is a schematic framework diagram of the learning process of transfer learning provided by this application.
  • Figure 5 is a schematic flow chart of an embodiment of the human movement recognition method provided by this application.
  • Figure 6 is a schematic structural diagram of an embodiment of a terminal device provided by this application.
  • Figure 7 is a schematic structural diagram of an embodiment of a computer storage medium provided by this application.
  • the human body's daily exercise behavior is closely related to people's health indicators and energy balance. For example, individual energy consumption can be calculated by monitoring exercise behaviors such as running and walking, which has positive significance in terms of personal healthy exercise and body energy balance. In addition, through the recognition of abnormal human movement behaviors (such as falls, etc.), timely rescue can be effectively provided to individuals in dangerous situations.
  • In the early days, human activity recognition (HAR) based on machine vision was a popular direction: it captures images or video streams and uses image/video processing technology to detect human behavior, and good results have been achieved in the field of video-based HAR.
  • However, this approach is limited by the effects of complex scenes and the uncertainty of actions, and the privacy issues raised by cameras must be considered, so it is only suitable for certain specific scenes.
  • By contrast, wearable sensors are less susceptible to environmental interference, the signals they collect are more continuous and accurate, and they can be used in a wider range of scenarios.
  • This application proposes a solution to the HAR problem for sensors from the perspective of building a graph.
  • During activity, the limbs of the human body cooperate and act together; the data collected from sensors worn at different positions on the body is mapped into a graph based on the correlations between the sensors, a graph neural network based on graph theory is used for modeling, and the graph network learns the action information contained in the graph and the correlations between sensors to classify actions.
  • To this end, this application constructs a complete HAR framework, selects a graph neural network to model human movement, and confirms that the GNN (Graph Neural Network) has strong transfer learning and multi-angle learning capabilities in the HAR field, which effectively makes up for the inability of traditional deep learning to capture graph-structured data relationships in non-Euclidean space, and proposes new ideas for modeling sensor-based human motion graph-structured data.
  • Figure 1 is a schematic flow chart of an embodiment of a neural network training method provided by this application.
  • Figure 2 is a schematic framework diagram of a graph neural network provided by this application.
  • Figure 3 is a neural network provided by this application. Schematic diagram of the main flow of the training method.
  • the neural network training method may specifically include the following steps:
  • Step S11 Obtain a training data set, and preprocess the training data in the training data set to obtain several training graph data, where each training graph data is the training graph data of a time slice in the training data.
  • the data sets used in this application can be the MHEALTH data set and the PAMAP2 data set. In other embodiments, other data sets can also be used, without limitation here.
  • This dataset includes data from 10 participants in an out-of-laboratory setting. Each subject wore wearable sensors attached to the chest, right wrist, and left ankle. Physical activities such as standing, sitting, lying, walking, climbing stairs, bending forward at the waist, raising forearms, bending knees, riding a bicycle, jogging, running, and jumping forward were all involved in the experiment. The sampling rate of recorded data is 50Hz. Then there are 12 activity categories in the MHEALTH dataset, with a total of 21 channels of sensing signals. In this method, the sensory information of the user's body is captured through the chest sensor, while the other two are from the back sensor.
  • This dataset includes data obtained from nine participants aged 24 to 30 years. Participants wore an IMU (Inertial Measurement Unit) on the user's dominant side wrist, ankle, and chest. The activities performed by each person include lying, sitting, standing, walking, running, cycling, brisk walking, going up stairs, going down stairs, and skipping rope. Each IMU contains two 3D acceleration sensors, a gyroscope sensor, and a magnetometer sensor, with a sampling frequency of 100Hz. Each IMU contains nine-axis sensor information, with a total of 27 channel sensing signals. In this method, this application only requires information from 3 sensors in the data set, including the right waist, left ankle and back, to maintain the consistency of sensor locations.
  • Before inputting the above training-set data into the graph neural network for training, the terminal device needs to preprocess the training-set data and convert it into graph data.
  • the specific preprocessing process is as follows:
  • the terminal equipment performs noise filtering and normalization on the training data collected by all sensors according to the time series and then resamples to 50Hz.
  • the training data is divided into windows using a sliding window with a fixed length of 128 and an overlap rate of 50%. In other embodiments, sliding windows of different lengths can also be used, which will not be described again here.
  • Based on the sampling frequency of each data set (each window of the 50 Hz MHEALTH data set lasts 2.56 seconds, and each window of the 100 Hz PAMAP2 data set lasts 1.28 seconds), the terminal device obtains 5361 activity time-series segments from the MHEALTH data set and 11784 activity time-series segments from the PAMAP2 data set.
  • the terminal device regards each activity time series segment as a training sample and establishes graph data for each training sample as the input of the GNN network.
  • a sensor channel will be regarded as a node.
  • the Pearson correlation coefficient is used to calculate the correlation between each node to obtain the correlation coefficient matrix.
  • Two nodes whose correlation coefficient is greater than 0.2 are regarded as highly correlated and are connected by an edge, and the data of length 128 is embedded into the node of the corresponding sensor channel, forming one graph data based on one time slice.
  • the length of the graph data is determined by the length of the sliding window.
  • the terminal device first performs the relevant preprocessing on the human-body sensor data to filter out unnecessary noise and interference; the data is then divided into windows, and a graph is built for each time-series segment as the input to the GNN network.
  • Step S12 Input the training graph data into the graph neural network for training, where the graph neural network includes several graph convolution layers connected in sequence.
  • A graph convolutional network (GCN) differs from traditional deep learning in that it is a deep learning model that operates in non-Euclidean space, where it shows advantages that other deep models cannot match; for example, graph-based models surpass other deep models in video-based human action recognition. In sensor-based human activity recognition, the sensors have a latent graph-structure relationship, so this application uses the GCN network as the neural network to be trained.
  • this application proposes a new ResGCNN framework, including parameter sharing using the same residual graph network structure as the training weights.
  • the graph neural network of this application includes a number of graph convolution layers (ChebNet Layers) connected in sequence, and the output of each graph convolution layer and the previous graph convolution layer is used as input.
  • the terminal device can also connect a fully connected layer after several sequentially connected graph convolution layers, using the graph convolution layer for feature extraction, and using the fully connected layer for classification tasks.
  • the terminal device constructs a 16-layer ResChebNet model based on sensor-based human movement recognition.
  • the ResGCNN framework includes four ResChebNet blocks and two additional fully connected (FC) layers.
  • For sensor-based human movement recognition, compared with traditional deep models (CNN, LSTM, DEEP-LSTM, etc.), the multi-layer ResChebNet model shown in Figure 2 effectively learns the non-Euclidean structural relationships among the sensors; introducing the residual structure and PairNorm graph normalization solves the over-smoothing and vanishing-gradient problems, and a local residual structure is also introduced to fully learn local structure awareness, so that the graph-structure relationships of sensor-based human movement are learned more fully, making the results more accurate and the generalization ability stronger.
  • a training graph data G which consists of N vertices and edges formed by N vertices, such that an edge between any two vertices I and J represents their similarity.
  • the adjacency matrix A of the graph data is a sparse matrix indexed by I and J: if I and J are connected by an edge, the corresponding entry is 1, otherwise it is 0.
  • each node in the graph data has an F-dimensional feature vector, and X ⁇ R N ⁇ F represents the feature matrix of all N nodes. Among them, the dimension of the node's feature vector is determined by the length of the graph data.
  • the L-layer graph convolutional neural network (GCN) consists of L-layer graph convolution, such as the 16-layer graph convolution shown in Figure 2. Each convolutional layer uses the output of each node of the previous layer to construct the input of each node of the current convolutional layer. Its expression is as follows:
  • the terminal device can also use graph theory and convolution theorem to generalize the traditional Fourier transform to the Fourier transform on the graph, and its formula as follows:
  • U is the eigenvector matrix decomposed by the Laplacian matrix L, that is, the Laplacian operator
  • f is the node feature of the input graph data
  • h is the topological-space feature extracted by the trainable, parameter-shared convolution kernel.
  • the core of the convolution operation of the GCN network is a trainable and parameter-shared convolution kernel.
  • GCN replaces the diagonal elements above with the learnable parameter θ, which is then adjusted through backpropagation for training. Therefore, the training formula of the GCN network can be expressed as:
  • the topological space is then propagated to the next layer through the activation function ⁇ .
  • the Laplacian matrix needs to be eigendecomposed, and matrix multiplication must be computed during each forward propagation.
  • the time complexity is O(n 2 ) , very time-consuming.
  • the number of convolution kernels of the graph neural network is n.
  • node feature updates are slow.
  • the representation vectors of node features tend to be consistent and the nodes are difficult to distinguish.
  • the weight parameter is ⁇ k .
  • the node connected to the intermediate node k-hop can be obtained. That is, whether the element in L k is 0 indicates whether the node in the graph data can reach another node after k hops.
  • k represents the size of the receptive field of the convolution kernel
  • the feature representation of the central node is updated by aggregating the adjacent nodes within k-hop of each central node
  • the parameter ⁇ k is the weight of the k-th neighbor.
  • the final formula does not require matrix decomposition; instead, it transforms (reconstructs) the Laplacian matrix L, which significantly reduces the amount of computation. In general, k < n.
  • the convolution kernel parameters of the GCN network are reduced from n to k. From the original global convolution to the current local convolution, the nodes k-hop away from the central node are regarded as adjacent nodes, and the computational complexity is reduced through iterative definition.
  • Step S13 Based on the training results, obtain the weight matrix of the final graph neural network and complete the neural network training.
  • the weight matrix of the graph neural network consists of the final weights of several graph convolution layers.
  • the process of neural network training is the process of adjusting parameters.
  • the more layers of the neural network the more parameters (weights and biases) that can be adjusted, which means the greater the degree of freedom of adjustment, and thus the better the approximation effect.
  • Deep neural networks have always been a hot topic, and graph convolutional networks (GCN) are no exception.
  • Various past experiments and analyses from different perspectives have shown that, as the number of GCN layers increases, the node representations become more global and at the same time smoother; each convolution layer effectively pushes the node representations toward being identical. Dense parts of the graph lose their distinctiveness, while sparse parts gain relatively little information. This is the over-smoothing phenomenon.
  • This application uses ChebNet (a Chebyshev polynomial approximation of the graph convolution kernel) and introduces structures such as PairNorm normalization, which constrains the sum of the distances between the feature vectors of all node pairs to a constant, so that the feature vectors of distant nodes also remain relatively far apart.
  • transfer learning is a very important deep learning strategy. It reuses the knowledge gained from solving one problem by applying it to a different but related problem, that is, it transfers knowledge from the source domain to the target domain, which has a large positive impact on many fields that are difficult to improve due to insufficient training data. The learning process of transfer learning is shown in Figure 4.
  • Deep transfer learning is divided into four categories: instance-based deep transfer learning, mapping-based deep transfer learning, network-based deep transfer learning, and adversarial-based deep transfer learning.
  • This application uses parameter-based deep transfer learning. Because the sensors used in the experiments are of the same type and the collected data are of the same type, if their input dimensions are the same, the residual networks constructed are also the same, which makes it very suitable to use parameter-based transfer learning to optimize the learning efficiency of the residual GNN network.
  • ResGCNN deep transfer learning consists of three main stages, including:
  • a single position sensor (9 channels) data or three position sensors (27 channels) will be selected from the PAMAP2 data set and input into the ResGCNN network for learning and classification, while retaining the parameters learned by the structure in the residual network part.
  • the other three data sets are input into the network for classification testing. It should be noted that their number of sensors must be the same (that is, the number of channels is the same) to ensure that they have the same input dimension.
  • the previously trained PAMAP2 residual network parameters will be directly transferred to the new training and its parameters will be locked, so for the new training, the iteratively optimized parameters are only the final fully connected layer part. In order to prove the transfer learning ability of the ResGCNN network in small samples, this application will take 30% of the original new sample set for testing.
  • the terminal device uses the model of the target data set sample to adaptively optimize the fully connected layer in the target model.
  • the last part of ResGCNN uses the Softmax layer as the HAR classifier.
  • the data sets are each input into the network for training, so that the weights of every layer are continuously optimized.
  • the terminal device performs transfer learning on ResGCNN using the pre-trained blocks in the ResGCNN structure executed on the source domain as feature extractors in the target domain.
  • classification accuracy, recall, F1 score and confusion matrix are used to illustrate the completed results.
  • the model's predictions are compared to the ground truth labels to calculate the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
  • the overall accuracy ACC is equal to (TP + TN) / (TP + TN + FP + FN).
  • F1-Score is a balanced combination of precision and recall, and its calculation formula is F1 = 2 × Precision × Recall / (Precision + Recall).
  • the terminal device obtains a training data set and preprocesses the training data in the training data set to obtain several training graph data, where each training graph data is the training graph data of one time slice in the training data; the training graph data is input into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence; based on the training results, the weight matrix of the final graph neural network is obtained to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers.
  • FIG. 5 is a schematic flowchart of an embodiment of the human movement recognition method provided by the present application.
  • the human movement recognition method may specifically include the following steps:
  • Step S21 Use wearable sensors to obtain the user's human body movement data.
  • the terminal device obtains the user's human body movement data through wearable sensors on the user's body.
  • Step S22: Preprocess the human body motion data to obtain human body motion graph data.
  • Refer to step S11 of the above embodiment for the specific data preprocessing process of step S22, which will not be repeated here.
  • Step S23: Input the human body motion graph data into the pre-trained graph neural network, and obtain the graph neural network's prediction of the user's human motion based on the human body motion graph data.
  • the pre-trained graph neural network can specifically be the graph neural network trained in the above embodiment, and the training process will not be described again here.
  • Step S24 Obtain the user's motion status based on the prediction information.
  • this application proposes a solution to solve the HAR problem for sensors from the perspective of building a graph.
  • This method uses the correlation of sensors worn at different positions on the human body to map the data collected by the human body, and uses a graph neural network modeling based on graph theory. Classify actions by learning the action information contained in the graph and the relationship between sensors through the graph network.
  • This application proves that it is feasible to use graph neural networks for sensor-based human movement recognition and proposes a data preprocessing method that converts the information collected by sensors into a graph structure; on the data sets of this method, it achieves results close to or better than traditional deep models (CNN, RNN, LSTM, DEEP-LSTM), and it also proposes new ideas for using graph neural networks in sensor-based human motion recognition.
  • For sensor-based human motion recognition, a multi-modal fusion method based on the graph network model is proposed, and a multi-layer residual graph neural network with high generalization is built and trained on multiple public data sets as well as the authors' own data set, achieving very good classification results; the transferability of the graph neural network model in transfer learning is also demonstrated by training and validating on multiple data sets with very good results, contributing trained parameters of a multi-layer residual graph neural network model with high generalization on human motion recognition data sets.
  • the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process;
  • the specific execution order of each step should be determined by its function and possible internal logic.
  • FIG. 6 is a schematic structural diagram of an embodiment of a terminal device provided by this application.
  • the terminal device 500 in the embodiment of the present application includes a processor 51, a memory 52, an input and output device 53, and a bus 54.
  • the processor 51, the memory 52, and the input and output device 53 are respectively connected to the bus 54.
  • the memory 52 stores program data.
  • the processor 51 is used to execute the program data to implement the neural network training method and/or human movement recognition method of the above embodiments.
  • the processor 51 may also be called a CPU (Central Processing Unit).
  • the processor 51 may be an integrated circuit chip with signal processing capabilities.
  • the processor 51 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general processor may be a microprocessor or the processor 51 may be any conventional processor or the like.
  • FIG. 7 is a schematic structural diagram of an embodiment of the computer storage medium provided by this application.
  • the computer storage medium 600 stores program data 61.
  • when the program data 61 is executed by the processor, it is used to implement the neural network training method and/or human movement recognition method of the above embodiments.
  • When the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application is essentially or contributes to the existing technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , including several instructions to cause a computer device (which can be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the method described in each embodiment of the application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses a neural network training method, a human motion recognition method and device, and a storage medium. The neural network training method includes: obtaining a training data set and preprocessing the training data in the training data set to obtain several training graph data, wherein each training graph data is the training graph data of one time slice in the training data; inputting the training graph data into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence; and, based on the training results, obtaining the weight matrix of the final graph neural network to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers. In this way, this application preprocesses the training data set to obtain training graph data that satisfies the input of the graph neural network, thereby improving the training efficiency and training accuracy of the neural network.

Description

Neural network training method, human motion recognition method and device, and storage medium
Technical Field
This application relates to the field of neural network technology, and in particular to a neural network training method, a human motion recognition method and device, and a storage medium.
Background Art
As a typical pattern recognition problem, sensor-based HAR (human activity recognition) has long been addressed with many traditional machine learning algorithms, including decision trees, random forests, support vector machines, Bayesian networks, Markov models, and so on. Under strictly controlled environments and with limited input, traditional algorithms have achieved good classification results, but traditional hand-crafted feature methods are time-consuming, and the extracted features lack incremental and unsupervised learning capability and generalization capability.
Summary of the Invention
This application mainly provides a neural network training method, a human motion recognition method and device, and a storage medium, to solve the problem that traditional hand-crafted feature methods are time-consuming and that the extracted features lack incremental and unsupervised learning capability and generalization capability.
To solve the above technical problem, one technical solution adopted by this application is to provide a neural network training method, which includes:
obtaining a training data set, and preprocessing the training data in the training data set to obtain several training graph data, wherein each training graph data is the training graph data of one time slice in the training data;
inputting the training graph data into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence;
based on the training results, obtaining the weight matrix of the final graph neural network to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers.
According to an embodiment provided by this application, inputting the training graph data into a graph neural network for training includes:
inputting each training graph data into the first graph convolution layer of the graph neural network, and obtaining the first output of the first graph convolution layer;
inputting the first output into the graph convolution layer following the first graph convolution layer, so that the first output is used as the input of that next graph convolution layer for training, until the training of all graph convolution layers of the graph neural network is completed.
According to an embodiment provided by this application, inputting the first output into the next graph convolution layer after the first graph convolution layer, so that the first output is used as the input of the next graph convolution layer for training, includes:
superimposing the first output and the training graph data to obtain fused data;
inputting the fused data into the next graph convolution layer, so that the fused data is used as the input of the next graph convolution layer for training.
According to an embodiment provided by this application, the first output is generated by calculation from the training graph data and the training weights of the first graph convolution layer;
the first output is converted into the input of the next graph convolution layer through an activation function.
According to an embodiment provided by this application, inputting the training graph data into a graph neural network for training includes:
extracting spatial features from the node features of the training graph data using the Laplacian operator;
constructing a diagonal matrix using the training weights of the graph neural network as its diagonal elements;
using the spatial features and the diagonal matrix to form the output of the graph neural network for training.
According to an embodiment provided by this application, using the spatial features and the diagonal matrix to form the output of the graph neural network for training includes:
obtaining the spatial feature of each node feature;
updating the spatial feature of each node feature based on a preset convolution-kernel receptive field;
using the updated spatial features and the diagonal matrix to form the output of the graph neural network for training.
According to an embodiment provided by this application, updating the spatial feature of each node feature based on the preset convolution-kernel receptive field includes:
setting a Chebyshev polynomial recursion according to the preset convolution-kernel receptive field;
inputting the spatial feature of each node feature into the Chebyshev polynomial recursion, and recursively obtaining the updated spatial feature of each node feature.
According to an embodiment provided by this application, the graph neural network is further connected, after the several graph convolution layers, with at least one fully connected layer, and the at least one fully connected layer is used for training the classification task.
According to an embodiment provided by this application, after obtaining the weight matrix of the final graph neural network based on the training results and completing the neural network training, the neural network training method further includes:
migrating the graph neural network that has completed training to another neural network as part of that network's structure, thereby forming a transfer neural network;
retraining the transfer neural network.
To solve the above technical problem, another technical solution adopted by this application is to provide a human motion recognition method, which includes:
obtaining the user's human body motion data using wearable sensors;
preprocessing the human body motion data to obtain human body motion graph data;
inputting the human body motion graph data into a pre-trained graph neural network, and obtaining the graph neural network's prediction of the user's human motion based on the human body motion graph data;
obtaining the user's motion state based on the prediction;
wherein the graph neural network is trained by the above neural network training method.
To solve the above technical problem, another technical solution adopted by this application is to provide a terminal device, which includes a memory and a processor coupled to the memory;
wherein the memory is used to store program data, and the processor is used to execute the program data to implement the above neural network training method and/or human motion recognition method.
To solve the above technical problem, another technical solution adopted by this application is to provide a computer storage medium, which is used to store program data; when the program data is executed by a computer, it is used to implement the above neural network training method and/or human motion recognition method.
This application provides a neural network training method, a human motion recognition method and device, and a storage medium. The neural network training method includes: obtaining a training data set and preprocessing the training data in the training data set to obtain several training graph data, wherein each training graph data is the training graph data of one time slice in the training data; inputting the training graph data into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence; and, based on the training results, obtaining the weight matrix of the final graph neural network to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers. In this way, this application preprocesses the training data set to obtain training graph data that satisfies the input of the graph neural network, thereby improving the training efficiency and training accuracy of the neural network.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort, wherein:
Figure 1 is a schematic flowchart of an embodiment of the neural network training method provided by this application;
Figure 2 is a schematic framework diagram of the graph neural network provided by this application;
Figure 3 is a schematic diagram of the main flow of the neural network training method provided by this application;
Figure 4 is a schematic framework diagram of the learning process of transfer learning provided by this application;
Figure 5 is a schematic flowchart of an embodiment of the human motion recognition method provided by this application;
Figure 6 is a schematic structural diagram of an embodiment of the terminal device provided by this application;
Figure 7 is a schematic structural diagram of an embodiment of the computer storage medium provided by this application.
Detailed Description of the Embodiments
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.
It should be noted that, if directional indications (such as up, down, left, right, front, rear, ...) are involved in the embodiments of this application, they are only used to explain the relative positional relationships, movements, etc. between components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indication changes accordingly.
In addition, if descriptions involving "first", "second", etc. appear in the embodiments of this application, they are used only for descriptive purposes and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Therefore, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. Moreover, the technical solutions of the various embodiments can be combined with each other, but only on the basis that they can be realized by those of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination does not exist and is not within the scope of protection claimed by this application.
The human body's daily movement behavior is closely related to people's health indicators and energy balance. For example, an individual's energy consumption can be calculated by monitoring movement behaviors such as running and walking, which has positive significance for personal healthy exercise and bodily energy balance. In addition, recognizing abnormal human movement behaviors (such as falls) makes it possible to provide timely rescue to individuals in dangerous situations.
In the early days, human activity recognition (HAR) using machine vision was a popular direction: it captures images or video streams and uses image/video processing technology to detect human behavior, and good results have been achieved in the field of video-based HAR. However, this approach is limited by the effects of complex scenes and the uncertainty of actions, and the privacy issues raised by cameras must be considered, so it is only suitable for certain specific scenes. By contrast, wearable sensors are less susceptible to environmental interference, the signals they collect are more continuous and accurate, and they can be used in a wider range of scenarios.
Over the past decade, sensor technology has made extraordinary progress in computing power, size, accuracy, manufacturing cost, and other areas. These advances have enabled most sensors to be integrated into smartphones and other portable devices, making these devices smarter and more practical. The wearable sensors commonly used for HAR are accelerometers, magnetometers, gyroscopes, and integrated inertial measurement units (IMU).
Research based on deep learning has gradually achieved excellent results and taken a dominant position in the field of human movement recognition. Features are extracted automatically through multi-layer neural networks, significantly reducing feature preprocessing, and deep learning structures have been shown to perform well in unsupervised learning and reinforcement learning.
This application proposes a solution to the sensor-based HAR problem from the perspective of building a graph. During activity, the limbs of the human body cooperate and act together; the data collected from sensors worn at different positions on the body is mapped into a graph based on the correlations between the sensors, a graph neural network based on graph theory is used for modeling, and the graph network learns the action information contained in the graph and the correlations between sensors to classify actions.
To this end, this application constructs a complete HAR framework, selects a graph neural network to model human movement, and confirms that the GNN (Graph Neural Network) has strong transfer learning and multi-angle learning capabilities in the HAR field, effectively making up for the inability of traditional deep learning to capture graph-structured data relationships in non-Euclidean space, and proposes new ideas for modeling sensor-based human motion graph-structured data.
On the above technical basis, this application provides a specific training method for a graph neural network. Please refer to Figures 1 to 3: Figure 1 is a schematic flowchart of an embodiment of the neural network training method provided by this application, Figure 2 is a schematic framework diagram of the graph neural network provided by this application, and Figure 3 is a schematic diagram of the main flow of the neural network training method provided by this application.
As shown in Figure 1, the neural network training method of the embodiment of this application may specifically include the following steps:
Step S11: Obtain a training data set, and preprocess the training data in the training data set to obtain several training graph data, where each training graph data is the training graph data of one time slice in the training data.
In the embodiment of this application, the data sets used can be the MHEALTH data set and the PAMAP2 data set; in other embodiments, other data sets can also be used, which is not limited here. The data of these two data sets are described below:
MHEALTH data set
This data set includes data from 10 participants in an out-of-laboratory environment. Each subject wore wearable sensors attached to the chest, right wrist, and left ankle. Physical activities such as standing, sitting, lying, walking, climbing stairs, bending forward at the waist, raising the forearms, bending the knees, cycling, jogging, running, and jumping forward were all involved in the experiment. The sampling rate of the recorded data is 50 Hz. The MHEALTH data set therefore contains 12 activity categories, with a total of 21 channels of sensing signals. In this method, the sensing information of the user's body is captured through the chest sensor, while the other two come from the back sensors.
PAMAP2 data set
This data set includes data obtained from nine participants aged 24 to 30. Participants wore an IMU (Inertial Measurement Unit) on the dominant-side wrist, the ankle, and the chest. The activities performed by each person include ten actions: lying, sitting, standing, walking, running, cycling, brisk walking, going up stairs, going down stairs, and rope skipping. Each IMU contains two 3D acceleration sensors, a gyroscope, and a magnetometer, with a sampling frequency of 100 Hz. Each IMU provides nine-axis sensor information, for a total of 27 channels of sensing signals. In this method, this application only requires the information from 3 sensors in the data set, located at the right waist, left ankle, and back, to maintain consistency of sensor locations.
Before inputting the data of the above training sets into the graph neural network for training, the terminal device needs to preprocess the training-set data and convert it into graph data. The specific preprocessing process is as follows:
First, the terminal device performs noise filtering and normalization on the training data collected by all sensors according to the time series and then resamples it to 50 Hz. Next, the training data is divided into windows using a sliding window with a fixed length of 128 and an overlap rate of 50%; in other embodiments, sliding windows of different lengths can also be used, which will not be repeated here.
Based on the sampling frequency of each data set (each window of the 50 Hz MHEALTH data set lasts 2.56 seconds, and each window of the 100 Hz PAMAP2 data set lasts 1.28 seconds), the terminal device obtains 5361 activity time-series segments from the MHEALTH data set and 11784 activity time-series segments from the PAMAP2 data set.
The terminal device treats each activity time-series segment as one training sample and builds graph data for each training sample as the input to the GNN network. Specifically, one sensor channel is regarded as one node; the Pearson correlation coefficient is used to compute the correlation between each pair of nodes, yielding a correlation coefficient matrix; two nodes whose correlation coefficient is greater than 0.2 are regarded as highly correlated and are connected by an edge; and the data of length 128 is embedded into the node of the corresponding sensor channel, forming one graph data based on one time slice. The length of the graph data is determined by the length of the sliding window.
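The following minimal Python sketch illustrates the windowing and graph-building step just described; the array shapes and helper names are assumptions for illustration and are not part of the application.

    import numpy as np

    WIN_LEN = 128       # sliding-window length used in this application
    OVERLAP = 0.5       # 50% overlap between consecutive windows
    CORR_THRESH = 0.2   # Pearson-correlation threshold for adding an edge

    def sliding_windows(signals):
        # signals: (T, C) array of filtered, normalized, resampled sensor channels
        step = int(WIN_LEN * (1.0 - OVERLAP))
        return [signals[s:s + WIN_LEN]
                for s in range(0, signals.shape[0] - WIN_LEN + 1, step)]

    def window_to_graph(window):
        # One time slice -> one graph: nodes are sensor channels, node features are
        # the 128 raw samples, and edges connect channel pairs whose Pearson
        # correlation exceeds the threshold.
        corr = np.corrcoef(window.T)                    # (C, C) correlation matrix
        adj = (corr > CORR_THRESH).astype(np.float32)   # threshold -> adjacency
        np.fill_diagonal(adj, 0.0)                      # drop self-correlations
        features = window.T.astype(np.float32)          # (C, 128) node feature matrix
        return adj, features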
As shown in Figure 3, the terminal device first performs the relevant preprocessing on the human-body sensor data to filter out unnecessary noise and interference; the data is then divided into windows, and a graph is built for each time-series segment as the input to the GNN network.
Step S12: Input the training graph data into the graph neural network for training, where the graph neural network includes several graph convolution layers connected in sequence.
In the embodiment of this application, a graph convolutional network (Graph Convolutional Network, GCN) differs from traditional deep learning in that it is a deep learning model that operates in non-Euclidean space, where it shows advantages that other deep models cannot match; for example, graph-based models surpass other deep models in video-based human action recognition. In sensor-based human activity recognition, the sensors have a latent graph-structure relationship, so this application uses the GCN network as the neural network to be trained.
For this purpose, this application proposes a new ResGCNN framework, in which parameter sharing uses a residual graph network structure with the same training weights. Specifically, as shown in Figure 2, the graph neural network of this application includes several graph convolution layers (ChebNet Layers) connected in sequence, and each graph convolution layer takes the output of the previous graph convolution layer as its input. In addition, the terminal device can also connect fully connected layers after the several sequentially connected graph convolution layers, using the graph convolution layers for feature extraction and the fully connected layers for the classification task.
In the embodiment of this application, the terminal device constructs a 16-layer ResChebNet model for sensor-based human movement recognition. To solve problems such as over-smoothing and vanishing gradients, the ResGCNN framework includes four ResChebNet blocks and two additional fully connected (Fully Connected, FC) layers. It also involves an intra-block residual structure, which adds the input of the four blocks to the output of the last block as the final output of the ResChebNet block.
For sensor-based human movement recognition, compared with traditional deep models (CNN, LSTM, DEEP-LSTM, etc.), the multi-layer ResChebNet model shown in Figure 2 effectively learns the non-Euclidean graph-structure relationships among the sensors; introducing the residual structure and PairNorm graph normalization solves the over-smoothing and vanishing-gradient problems, and a local residual structure is also introduced to fully learn local structure awareness, so that the graph-structure relationships of sensor-based human movement are learned more fully, making the results more accurate and the generalization ability stronger.
Based on the ResChebNet model shown in Figure 2, assume a training graph data G consisting of N vertices and the edges formed between the N vertices, such that an edge between any two vertices I and J represents their similarity. The adjacency matrix A of the graph data is a sparse matrix indexed by I and J: if I and J are connected by an edge, the corresponding entry is 1, otherwise it is 0.
In addition, each node in the graph data has an F-dimensional feature vector, and X ∈ R^{N×F} denotes the feature matrix of all N nodes, where the dimension of a node's feature vector is determined by the length of the graph data. An L-layer graph convolutional network (GCN) consists of L graph convolution layers, such as the 16 layers shown in Figure 2. Each convolution layer uses the output of every node of the previous layer to construct the input of every node of the current layer, in the following form:
Z^{(l+1)} = A′ X^{(l)} W^{(l)},  X^{(l+1)} = σ(Z^{(l+1)})
where X^{(l)} ∈ R^{N×F_l} is the input of the N nodes at the l-th graph convolution layer, with X^{(0)} = X; A′ = D^{-1/2}(A + I_N)D^{-1/2} is the normalized adjacency matrix; σ(·) is the activation function, for which ReLU is usually chosen; and D is the degree matrix, computed as
D_ii = Σ_j (A + I_N)_ij
Here W^{(l)} ∈ R^{F_l×F_{l+1}} is a learnable weight matrix, i.e., the matrix that transforms the features for the downstream learning task.
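A minimal numpy sketch of this single-layer propagation rule is given below; the symmetric normalization with added self-loops is a common convention and is an assumption here, since the exact normalization is not spelled out in the text.

    import numpy as np

    def normalize_adjacency(adj):
        # A' = D^{-1/2}(A + I)D^{-1/2}, the symmetric normalization assumed above
        a_hat = adj + np.eye(adj.shape[0])
        d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
        return d_inv_sqrt @ a_hat @ d_inv_sqrt

    def gcn_layer(x, adj_norm, weight):
        # One propagation step: Z^(l+1) = A'X^(l)W^(l), X^(l+1) = ReLU(Z^(l+1))
        z = adj_norm @ x @ weight     # aggregate neighbor features, then transform
        return np.maximum(z, 0.0)     # ReLU activation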
Furthermore, in the feature extraction and feature transformation of each convolution layer, the terminal device can also use graph theory and the convolution theorem to generalize the traditional Fourier transform to the Fourier transform on the graph, with the following formula:
(f ★ h)_G = U((U^T h) ⊙ (U^T f))
where U is the eigenvector matrix obtained from the eigendecomposition of the Laplacian matrix L, i.e., the Laplacian operator; f is the node feature of the input graph data; h is the trainable, parameter-shared convolution kernel used to extract topological-space features; and ⊙ denotes the element-wise product.
The core of the convolution operation of the GCN network is a trainable, parameter-shared convolution kernel. GCN writes the filter U^T h above as a diagonal matrix diag(ĥ(λ_1), …, ĥ(λ_N)) and replaces its diagonal elements ĥ(λ_i) with the learnable parameter θ, which is then adjusted through backpropagation for training. The training formula of the GCN network can therefore be expressed as:
Y = σ(U g(θ) U^T x)
where x is the representation vector of each node feature in the graph data and Y is the output of each node feature after convolution by the GCN network; every node feature in the graph data is convolved by the convolution kernel to extract the corresponding topological space, which is then propagated to the next layer through the activation function σ.
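As an illustration of this spectral formulation (and of why it is costly), the sketch below filters node signals directly in the Fourier basis of the Laplacian; the function and variable names are hypothetical.

    import numpy as np

    def spectral_filter(x, laplacian, theta):
        # Y = sigma(U g(theta) U^T x): eigendecompose L once, filter in the graph
        # Fourier basis, transform back. theta holds one learnable value per eigenvalue.
        eigvals, u = np.linalg.eigh(laplacian)   # L = U diag(lambda) U^T
        x_hat = u.T @ x                          # graph Fourier transform of the signal
        filtered = theta[:, None] * x_hat        # apply the diagonal kernel g(theta)
        return np.maximum(u @ filtered, 0.0)     # inverse transform + activation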
Furthermore, the GCN network has drawbacks: the Laplacian matrix must be eigendecomposed, and matrix multiplication must be computed in every forward pass; when the graph data is large, the time complexity is O(n²), which is very time-consuming. Here the number of convolution kernels of the graph neural network is n; when n is large, node features update slowly. A multi-layer GCN network also suffers from an over-smoothing problem: the representation vectors of node features tend to become identical, making the nodes difficult to distinguish.
Therefore, this application approximates the convolution kernel with a k-order Chebyshev polynomial and substitutes it into the graph Fourier transform above, which can be expressed as:
Y = σ( Σ_{k=0}^{K} θ_k T_k(L̃) x )
where T_k denotes the k-th Chebyshev polynomial, L̃ denotes the rescaled Laplacian, and the weight parameters are θ_k. For the k-th power of the matrix, the nodes connected to the central node within k hops can be obtained; that is, whether an element of L^k is 0 indicates whether a node in the graph data can reach another node after k hops. Here k represents the size of the receptive field of the convolution kernel: the feature representation of each central node is updated by aggregating its neighboring nodes within k hops, and the parameter θ_k is the weight of the k-th-hop neighborhood. The final formula does not require matrix decomposition; instead it transforms (reconstructs) the Laplacian matrix L, which significantly reduces the amount of computation. In general, k < n.
The recursive definition of the above Chebyshev polynomials is:
T_0(x) = 1,  T_1(x) = x,  T_k(x) = 2x·T_{k-1}(x) − T_{k-2}(x)
The number of convolution kernel parameters of the GCN network is thus reduced from n to k; the original global convolution becomes a local convolution, in which the nodes within k hops of the central node are regarded as its neighbors, and the recursive definition reduces the computational complexity.
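The K-order Chebyshev graph convolution can be sketched as follows; using one weight matrix per polynomial order and a precomputed rescaled Laplacian are the usual ChebNet conventions and are assumptions here.

    import numpy as np

    def cheb_conv(x, l_tilde, weights):
        # x: (N, F_in) node features; l_tilde: rescaled Laplacian (assumed 2L/lambda_max - I);
        # weights: list of K+1 matrices of shape (F_in, F_out), one per polynomial order.
        t_prev, t_curr = x, l_tilde @ x                  # T_0(L)x = x and T_1(L)x = Lx
        out = t_prev @ weights[0]
        if len(weights) > 1:
            out = out + t_curr @ weights[1]
        for k in range(2, len(weights)):
            t_next = 2.0 * (l_tilde @ t_curr) - t_prev   # T_k = 2x*T_{k-1} - T_{k-2}
            out = out + t_next @ weights[k]
            t_prev, t_curr = t_curr, t_next
        return out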
Step S13: Based on the training results, obtain the weight matrix of the final graph neural network and complete the neural network training, where the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers.
In the embodiment of this application, the process of training a neural network is the process of adjusting parameters. The more layers a neural network has, the more parameters (weights and biases) can be adjusted, which means a greater degree of freedom of adjustment and thus a better approximation effect. Deep neural networks have always been a hot topic, and graph convolutional networks (GCN) are no exception. Various past experiments and analyses from different perspectives (such as the dynamical-systems viewpoint) have shown that, as the number of GCN layers increases, the node representations become more global and at the same time smoother; each convolution layer effectively pushes the node representations toward being identical. Dense parts of the graph lose their distinctiveness, while sparse parts gain relatively little information. This is the over-smoothing phenomenon.
Because of the over-smoothing of deep GCNs, this application introduces the ResChebNet model shown in Figure 2, expressed as:
Z^{(l+1)} = Σ_{k=0}^{K} T_k(L̃) X^{(l)} θ_k^{(l)}
X^{(l+1)} = σ(Z^{(l+1)}) + X^{(l)}
This application uses ChebNet (a Chebyshev polynomial approximation of the graph convolution kernel) and introduces structures such as PairNorm normalization, which constrains the sum of the distances between the feature vectors of all node pairs to a constant, so that the feature vectors of distant nodes also remain relatively far apart.
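One residual ChebNet layer with PairNorm-style normalization can be sketched as follows, reusing the cheb_conv helper above; the exact PairNorm variant and scale shown here are assumptions.

    import numpy as np

    def pair_norm(x, scale=1.0):
        # Center the node features, then rescale so the mean squared row norm is
        # constant, which keeps the total pairwise feature distance roughly fixed.
        x_centered = x - x.mean(axis=0, keepdims=True)
        rms = np.sqrt((x_centered ** 2).sum(axis=1).mean() + 1e-6)
        return scale * x_centered / rms

    def res_cheb_layer(x, l_tilde, weights):
        # X^(l+1) = sigma(Z^(l+1)) + X^(l); requires the output feature dimension
        # to match the input dimension so the residual addition is valid.
        z = cheb_conv(x, l_tilde, weights)
        return pair_norm(np.maximum(z, 0.0)) + x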
Furthermore, transfer learning is a very important deep learning strategy. It reuses the knowledge gained from solving one problem by applying it to a different but related problem, that is, it transfers knowledge from the source domain to the target domain, which has a large positive impact on many fields that are difficult to improve due to insufficient training data. The learning process of transfer learning is shown in Figure 4.
Deep transfer learning is divided into four categories: instance-based, mapping-based, network-based, and adversarial-based deep transfer learning. This application uses parameter-based deep transfer learning. Because the sensors used in the experiments are of the same type and the collected data are of the same type, if their input dimensions are the same, the residual networks constructed are also the same, which makes it very suitable to use parameter-based transfer learning to optimize the learning efficiency of the residual GNN network.
This application considers deep transfer learning between different data sets with different sensor settings or activity types. ResGCNN deep transfer learning consists of three main stages:
1) Training the network on the source domain with a large-scale training data set.
2) Transferring part of the network pre-trained on the source domain into a new network designed for the target domain.
3) Updating the transferred sub-network with a fine-tuning strategy for the new training task.
First, data from a single position sensor (9 channels) or from three position sensors (27 channels) is selected from the PAMAP2 data set and input into the ResGCNN network for learning and classification, while the parameters learned by the residual-network part of the structure are retained.
Next, the other three data sets are each input into the network for classification testing; note that their numbers of sensors (i.e., numbers of channels) must be the same to ensure that they have the same input dimension. A residual network structure identical to that of the PAMAP2 model is built, and fully connected layers are modified or added according to the classification needs of the different data sets. When training on a new data set, the previously trained PAMAP2 residual-network parameters are transferred directly into the new training and locked, so for the new training the only parameters optimized iteratively are those of the final fully connected layers. To demonstrate the transfer learning ability of the ResGCNN network on small samples, this application takes 30% of the new sample set for testing.
As shown in Figure 3, the terminal device uses samples from the target data set to adaptively optimize the fully connected layers in the target model; the last part of ResGCNN uses a Softmax layer as the HAR classifier, and the data sets are each input into the network for training so that the weights of every layer are continuously optimized. Finally, the terminal device performs transfer learning on ResGCNN by using the pre-trained blocks of the ResGCNN structure trained on the source domain as feature extractors in the target domain.
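A minimal PyTorch-style sketch of the parameter-locking step is shown below; the classifier-naming convention and the optimizer settings are assumptions, not values given in the application.

    import torch

    def prepare_transfer(model, classifier_prefix="fc", lr=1e-3):
        # Parameter-based transfer: lock every pretrained parameter except the new
        # fully connected classifier head, then optimize only the unlocked part.
        for name, param in model.named_parameters():
            param.requires_grad = name.startswith(classifier_prefix)
        trainable = [p for p in model.parameters() if p.requires_grad]
        return torch.optim.Adam(trainable, lr=lr)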
Furthermore, for the classification task, classification accuracy, recall, F1 score, and the confusion matrix are used to report the results. For each activity category in the data set, the model's predictions are compared with the ground-truth labels to count the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The overall accuracy ACC is:
ACC = (TP + TN) / (TP + TN + FP + FN)
and the precision and recall of a typical class can be calculated by the following formulas:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
The F1-score is the balanced combination of precision and recall, calculated as:
F1 = 2 × Precision × Recall / (Precision + Recall)
The averages over these activity labels are used to evaluate each experiment. In addition, the confusion matrix provides a visualization of model performance.
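These metric definitions translate directly into code; the sketch below computes them per class from the confusion-matrix counts defined above.

    def classification_metrics(tp, tn, fp, fn):
        # Per-class metrics computed exactly from the TP/TN/FP/FN counts.
        acc = (tp + tn) / (tp + tn + fp + fn)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        return acc, precision, recall, f1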
In the embodiment of this application, the terminal device obtains a training data set and preprocesses the training data in the training data set to obtain several training graph data, wherein each training graph data is the training graph data of one time slice in the training data; the training graph data is input into a graph neural network for training, wherein the graph neural network includes several graph convolution layers connected in sequence; and, based on the training results, the weight matrix of the final graph neural network is obtained to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers. In this way, this application preprocesses the training data set to obtain training graph data that satisfies the input of the graph neural network, thereby improving the training efficiency and training accuracy of the neural network.
Please continue to refer to Figure 5, which is a schematic flowchart of an embodiment of the human motion recognition method provided by this application.
As shown in Figure 5, the human motion recognition method of the embodiment of this application may specifically include the following steps:
Step S21: Obtain the user's human body motion data using wearable sensors.
In the embodiment of this application, the terminal device obtains the user's human body motion data through wearable sensors on the user's body.
Step S22: Preprocess the human body motion data to obtain human body motion graph data.
In the embodiment of this application, refer to step S11 of the above embodiment for the specific data preprocessing process of step S22, which will not be repeated here.
Step S23: Input the human body motion graph data into the pre-trained graph neural network, and obtain the graph neural network's prediction of the user's human motion based on the human body motion graph data.
In the embodiment of this application, the pre-trained graph neural network can specifically be the graph neural network trained in the above embodiment, and its training process will not be repeated here.
Step S24: Obtain the user's motion state based on the prediction.
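Putting steps S21 to S24 together, a hypothetical inference loop (reusing the preprocessing helpers sketched earlier) could look like the following; `model` is a placeholder callable returning class scores for a given adjacency matrix and feature matrix.

    import numpy as np

    def predict_motion(model, motion_signals):
        # Window the wearable-sensor stream, build one graph per time slice, and
        # let the pretrained graph neural network predict an activity per slice.
        predictions = []
        for window in sliding_windows(motion_signals):
            adj, features = window_to_graph(window)   # preprocessing as in step S11/S22
            scores = model(adj, features)             # graph neural network prediction
            predictions.append(int(np.argmax(scores)))
        return predictions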
In the embodiment of this application, a solution to the sensor-based HAR problem is proposed from the perspective of building a graph. During activity, the limbs of the human body cooperate and act together. This method maps the data collected from sensors worn at different positions on the body into a graph based on the correlations between the sensors, uses a graph neural network based on graph theory for modeling, and classifies actions by learning, through the graph network, the action information contained in the graph and the relationships between sensors. It constructs a complete HAR framework, selects a graph neural network to model human movement, and confirms that the GNN network has strong transfer learning and multi-angle learning capabilities in the HAR field, effectively making up for the inability of traditional deep learning to capture graph-structured data relationships in non-Euclidean space, and proposes new ideas for modeling sensor-based human motion graph-structured data.
This application demonstrates that using graph neural networks for sensor-based human movement recognition is feasible and proposes a data preprocessing method that converts the information collected by sensors into a graph structure; on the data sets of this method, it achieves results close to or better than traditional deep models (CNN, RNN, LSTM, DEEP-LSTM), and it also proposes new ideas for using graph neural networks in sensor-based human motion recognition. For sensor-based human motion recognition, a multi-modal fusion method based on the graph network model is proposed, and a multi-layer residual graph neural network with high generalization is built and trained on multiple public data sets as well as the authors' own data set, achieving very good classification results. The transferability of the graph neural network model in transfer learning is also demonstrated by training and validating on multiple data sets with very good results, contributing trained parameters of a multi-layer residual graph neural network model with high generalization on human motion recognition data sets.
Those skilled in the art can understand that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
Please continue to refer to Figure 6, which is a schematic structural diagram of an embodiment of the terminal device provided by this application. The terminal device 500 of the embodiment of this application includes a processor 51, a memory 52, an input/output device 53, and a bus 54.
The processor 51, the memory 52, and the input/output device 53 are each connected to the bus 54; the memory 52 stores program data, and the processor 51 is used to execute the program data to implement the neural network training method and/or human motion recognition method described in the above embodiments.
In the embodiment of this application, the processor 51 may also be called a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor 51 may be any conventional processor, etc.
This application also provides a computer storage medium. Please continue to refer to Figure 7, which is a schematic structural diagram of an embodiment of the computer storage medium provided by this application. The computer storage medium 600 stores program data 61, and when the program data 61 is executed by a processor, it is used to implement the neural network training method and/or human motion recognition method of the above embodiments.
When the embodiments of this application are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.
The above are only embodiments of this application and do not therefore limit the patent scope of this application. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (12)

  1. A neural network training method, characterized in that the neural network training method comprises:
    obtaining a training data set, and preprocessing the training data in the training data set to obtain several training graph data, wherein each training graph data is the training graph data of one time slice in the training data;
    inputting the training graph data into a graph neural network for training, wherein the graph neural network comprises several graph convolution layers connected in sequence;
    based on the training results, obtaining the weight matrix of the final graph neural network to complete the neural network training, wherein the weight matrix of the graph neural network is composed of the final weights of the several graph convolution layers.
  2. The neural network training method according to claim 1, characterized in that
    inputting the training graph data into a graph neural network for training comprises:
    inputting each training graph data into the first graph convolution layer of the graph neural network, and obtaining the first output of the first graph convolution layer;
    inputting the first output into the graph convolution layer following the first graph convolution layer, so that the first output is used as the input of that next graph convolution layer for training, until the training of all graph convolution layers of the graph neural network is completed.
  3. The neural network training method according to claim 2, characterized in that
    inputting the first output into the next graph convolution layer after the first graph convolution layer, so that the first output is used as the input of the next graph convolution layer for training, comprises:
    superimposing the first output and the training graph data to obtain fused data;
    inputting the fused data into the next graph convolution layer, so that the fused data is used as the input of the next graph convolution layer for training.
  4. The neural network training method according to claim 2, characterized in that
    the first output is generated by calculation from the training graph data and the training weights of the first graph convolution layer;
    the first output is converted into the input of the next graph convolution layer through an activation function.
  5. The neural network training method according to claim 1, characterized in that
    inputting the training graph data into a graph neural network for training comprises:
    extracting spatial features from the node features of the training graph data using the Laplacian operator;
    constructing a diagonal matrix using the training weights of the graph neural network as its diagonal elements;
    using the spatial features and the diagonal matrix to form the output of the graph neural network for training.
  6. The neural network training method according to claim 5, characterized in that
    using the spatial features and the diagonal matrix to form the output of the graph neural network for training comprises:
    obtaining the spatial feature of each node feature;
    updating the spatial feature of each node feature based on a preset convolution-kernel receptive field;
    using the updated spatial features and the diagonal matrix to form the output of the graph neural network for training.
  7. The neural network training method according to claim 6, characterized in that
    updating the spatial feature of each node feature based on the preset convolution-kernel receptive field comprises:
    setting a Chebyshev polynomial recursion according to the preset convolution-kernel receptive field;
    inputting the spatial feature of each node feature into the Chebyshev polynomial recursion, and recursively obtaining the updated spatial feature of each node feature.
  8. The neural network training method according to claim 1, characterized in that
    the graph neural network is further connected, after the several graph convolution layers, with at least one fully connected layer, and the at least one fully connected layer is used for training the classification task.
  9. The neural network training method according to claim 1, characterized in that
    after obtaining the weight matrix of the final graph neural network based on the training results and completing the neural network training, the neural network training method further comprises:
    migrating the graph neural network that has completed training to another neural network as part of that network's structure, thereby forming a transfer neural network;
    retraining the transfer neural network.
  10. A human motion recognition method, characterized in that the human motion recognition method comprises:
    obtaining the user's human body motion data using wearable sensors;
    preprocessing the human body motion data to obtain human body motion graph data;
    inputting the human body motion graph data into a pre-trained graph neural network, and obtaining the graph neural network's prediction of the user's human motion based on the human body motion graph data;
    obtaining the user's motion state based on the prediction;
    wherein the graph neural network is trained by the neural network training method according to any one of claims 1 to 9.
  11. A terminal device, characterized in that the terminal device comprises a memory and a processor coupled to the memory;
    wherein the memory is used to store program data, and the processor is used to execute the program data to implement the neural network training method according to any one of claims 1 to 9 and/or the human motion recognition method according to claim 10.
  12. A computer storage medium, characterized in that the computer storage medium is used to store program data, and when the program data is executed by a computer, it is used to implement the neural network training method according to any one of claims 1 to 9 and/or the human motion recognition method according to claim 10.
PCT/CN2022/108857 2022-05-26 2022-07-29 Neural network training method, human motion recognition method and device, and storage medium WO2023226186A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210585190.2A CN114943324B (zh) 2022-05-26 2022-05-26 Neural network training method, human motion recognition method and device, and storage medium
CN202210585190.2 2022-05-26

Publications (1)

Publication Number Publication Date
WO2023226186A1 true WO2023226186A1 (zh) 2023-11-30

Family

ID=82908434

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/108857 WO2023226186A1 (zh) 2022-05-26 2022-07-29 Neural network training method, human motion recognition method and device, and storage medium

Country Status (2)

Country Link
CN (1) CN114943324B (zh)
WO (1) WO2023226186A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117953588A (zh) * 2024-03-26 2024-04-30 南昌航空大学 一种融合场景信息的羽毛球运动员动作智能识别方法

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471910A (zh) * 2022-09-06 2022-12-13 中国科学院深圳先进技术研究院 基于fpga实现的运动活动识别模型的模型训练方法及其设备
CN115907001B (zh) * 2022-11-11 2023-07-04 中南大学 基于知识蒸馏的联邦图学习方法及自动驾驶方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304795A (zh) * 2018-01-29 2018-07-20 清华大学 基于深度强化学习的人体骨架行为识别方法及装置
CN110222653A (zh) * 2019-06-11 2019-09-10 中国矿业大学(北京) 一种基于图卷积神经网络的骨架数据行为识别方法
CN112183315A (zh) * 2020-09-27 2021-01-05 哈尔滨工业大学(深圳) 动作识别模型训练方法和动作识别方法及装置
US20210012181A1 (en) * 2019-01-03 2021-01-14 Boe Technology Group Co., Ltd. Computer-implemented method of training convolutional neural network, convolutional neural network, computer-implemented method using convolutional neural network, apparatus for training convolutional neural network, and computer-program product

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171134A (zh) * 2017-12-20 2018-06-15 中车工业研究院有限公司 一种操作动作辨识方法及装置
US10416755B1 (en) * 2018-06-01 2019-09-17 Finch Technologies Ltd. Motion predictions of overlapping kinematic chains of a skeleton model used to control a computer system
CN109215036A (zh) * 2018-08-01 2019-01-15 浙江深眸科技有限公司 基于卷积神经网络的人体分割方法
CN110334573B (zh) * 2019-04-09 2022-04-29 北京航空航天大学 一种基于密集连接卷积神经网络的人体运动状态判别方法
CN110929029A (zh) * 2019-11-04 2020-03-27 中国科学院信息工程研究所 一种基于图卷积神经网络的文本分类方法及系统
CN113326930B (zh) * 2020-02-29 2024-05-03 华为技术有限公司 数据处理方法、神经网络的训练方法及相关装置、设备
KR102196962B1 (ko) * 2020-03-05 2020-12-31 강윤 매트릭스 압력 센서를 이용한 인체의 움직임 인식 및 이를 통한 인체 동작 예측 시스템
CN112633482B (zh) * 2020-12-30 2023-11-28 广州大学华软软件学院 一种高效宽度图卷积神经网络模型系统及训练方法
CN112767553A (zh) * 2021-02-02 2021-05-07 华北电力大学 一种自适应群体服装动画建模方法
CN113240714B (zh) * 2021-05-17 2023-10-17 浙江工商大学 一种基于情境感知网络的人体运动意图预测方法
CN113642379B (zh) * 2021-05-18 2024-03-01 北京航空航天大学 基于注意力机制融合多流图的人体姿态预测方法及系统
CN113255798A (zh) * 2021-06-02 2021-08-13 苏州浪潮智能科技有限公司 一种分类模型训练方法、装置、设备及介质
CN113705772A (zh) * 2021-07-21 2021-11-26 浪潮(北京)电子信息产业有限公司 一种模型训练方法、装置、设备及可读存储介质
CN113642432A (zh) * 2021-07-30 2021-11-12 南京师范大学 基于协方差矩阵变换的卷积神经网络用于人体姿态识别方法
CN114330670A (zh) * 2022-01-04 2022-04-12 京东科技信息技术有限公司 图神经网络训练方法、装置、设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304795A (zh) * 2018-01-29 2018-07-20 清华大学 基于深度强化学习的人体骨架行为识别方法及装置
US20210012181A1 (en) * 2019-01-03 2021-01-14 Boe Technology Group Co., Ltd. Computer-implemented method of training convolutional neural network, convolutional neural network, computer-implemented method using convolutional neural network, apparatus for training convolutional neural network, and computer-program product
CN110222653A (zh) * 2019-06-11 2019-09-10 中国矿业大学(北京) 一种基于图卷积神经网络的骨架数据行为识别方法
CN112183315A (zh) * 2020-09-27 2021-01-05 哈尔滨工业大学(深圳) 动作识别模型训练方法和动作识别方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Master's Thesis", 26 March 2021, NANJING NORMAL UNIVERSITY, CN, article WANG, ZHENYU: "Research on Human Activity Recognition Algorithm Based on Deep Neural Network", pages: 1 - 68, XP009550840, DOI: 10.27245/d.cnki.gnjsu.2021.000765 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117953588A (zh) * 2024-03-26 2024-04-30 南昌航空大学 一种融合场景信息的羽毛球运动员动作智能识别方法

Also Published As

Publication number Publication date
CN114943324A (zh) 2022-08-26
CN114943324B (zh) 2023-10-13

Similar Documents

Publication Publication Date Title
WO2023226186A1 (zh) 神经网络训练方法、人体运动识别方法及设备、存储介质
Ha et al. Convolutional neural networks for human activity recognition using multiple accelerometer and gyroscope sensors
CN107529650B (zh) 闭环检测方法、装置及计算机设备
CN111539941B (zh) 帕金森病腿部灵活性任务评估方法及系统、存储介质及终端
WO2019227479A1 (zh) 人脸旋转图像的生成方法及装置
CN111612243A (zh) 交通速度预测方法、系统及存储介质
Gao et al. A canonical polyadic deep convolutional computation model for big data feature learning in Internet of Things
CN106570522B (zh) 物体识别模型的建立方法及物体识别方法
CN106909938B (zh) 基于深度学习网络的视角无关性行为识别方法
CN105184767B (zh) 一种运动人体姿态相似性度量方法
CN111160294B (zh) 基于图卷积网络的步态识别方法
Amsaprabhaa Multimodal spatiotemporal skeletal kinematic gait feature fusion for vision-based fall detection
Tahir et al. Hrnn4f: Hybrid deep random neural network for multi-channel fall activity detection
JP6900576B2 (ja) 移動状況認識モデル学習装置、移動状況認識装置、方法、及びプログラム
JP2018010626A (ja) 情報処理装置、情報処理方法
CN113158861A (zh) 一种基于原型对比学习的运动分析方法
CN113688765A (zh) 一种基于注意力机制的自适应图卷积网络的动作识别方法
Cao et al. QMEDNet: A quaternion-based multi-order differential encoder–decoder model for 3D human motion prediction
CN108009512A (zh) 一种基于卷积神经网络特征学习的人物再识别方法
Sekaran et al. Smartphone-based human activity recognition using lightweight multiheaded temporal convolutional network
CN110580456A (zh) 基于相干约束图长短时记忆网络的群体活动识别方法
Li et al. Multi-convLSTM neural network for sensor-based human activity recognition
Bhattacharjee et al. A comparative study of supervised learning techniques for human activity monitoring using smart sensors
WO2023142886A1 (zh) 表情迁移方法、模型训练方法和装置
Ishwarya et al. Performance-enhanced real-time lifestyle tracking model based on human activity recognition (PERT-HAR) model through smartphones

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22943377

Country of ref document: EP

Kind code of ref document: A1