CN114638285A - Multi-mode identification method for mobile phone inertial sensor data - Google Patents


Info

Publication number
CN114638285A
CN114638285A
Authority
CN
China
Prior art keywords
task
data
mobile phone
model
size
Prior art date
Legal status
Granted
Application number
CN202210179112.2A
Other languages
Chinese (zh)
Other versions
CN114638285B (en)
Inventor
张沪寅
苏今腾
郭迟
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202210179112.2A
Publication of CN114638285A
Application granted
Publication of CN114638285B
Legal status: Active

Classifications

    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/044: Neural networks; recurrent networks, e.g. Hopfield networks
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/08: Neural networks; learning methods
    • Y02D30/70: Reducing energy consumption in wireless communication networks


Abstract

The invention provides a multi-mode identification method for mobile phone inertial sensor data. Training sample data are collected while the carrier of the inertial sensor uses a smartphone; the raw sensor data are cut with a sliding window into a number of samples, and a label is set for each sample. A single-task deep neural network model is constructed and trained for each task, where the number of modes of the corresponding task is, respectively, the number of moving directions, the number of pedestrian identities, and the number of mobile phone carrying modes. A multi-task deep network model is then constructed, initialized by transfer learning from the single-task models, and trained. Test data are collected and input into the multi-task model for calculation; the model outputs a probability prediction for each task, the maximum value of each prediction is detected, and if it is larger than the corresponding threshold, the corresponding label is returned as the prediction result for that task. The invention can effectively mine the information hidden in the sensor data and accurately identify the moving direction of the mobile phone carrier, verify the carrier's identity, and recognize the mobile phone carrying mode.

Description

Multi-mode identification method for mobile phone inertial sensor data
Technical Field
The invention discloses a multi-mode identification method for mobile phone inertial sensor data (comprising motion direction identification, pedestrian identity verification, and mobile phone carrying mode identification), together with a method for transfer learning among sensor-data classification tasks, and belongs to the technical field of artificial intelligence.
Background
With the development of embedded technology, sensors and wearable devices have gained more and more functions and are now applied very widely. Beyond navigation and positioning, data from inertial sensors can help identify specific patterns. When a pedestrian walks wearing an inertial sensor, the moving direction can be recognized and the pedestrian's identity can be verified; and if the pedestrian carries a smartphone with an inertial sensor, the way the phone is carried can also be identified.
Motion direction recognition refers to recognizing the moving direction of a pedestrian (forward, backward, left, and right). It belongs to human behavior recognition technology, which now brings growing benefits in scientific research, the production economy, and everyday services, has attracted wide attention from scientists and scholars, and is mainly divided into two types: image-based and inertial-sensor-based. Identity verification uses inherent physiological or behavioral characteristics of the human body to authenticate personal identity; however, most existing biometric identification technologies recognize the iris, fingerprint, or face, are mostly image-based, and run slowly. Human behavior recognition and identity verification based on inertial sensors have good application prospects because the sensors are low in cost, consume little energy, produce small data volumes that are easy to compute on, and are not easily affected by the environment. Mobile phone carrying mode identification refers to identifying the way a human body carries a mobile phone, generally including pocket, swing, holding and locking; this technology is of great value for estimating the relative angle between the mobile phone coordinate system and the human body coordinate system, and can be applied to navigation and positioning based on mobile phone inertial sensors.
At present, scholars have applied various traditional machine learning methods to the above three classification tasks and achieved good results. However, these machine learning methods cannot, to a certain extent, effectively mine informative features, resulting in low accuracy, so some researchers have begun to use deep learning methods to build neural networks for classification tasks on inertial sensor data; the effect of these methods is greatly improved compared with traditional machine learning. However, whether with machine learning or deep learning, the model may become trapped in a local optimum during training, limiting the effect of the final model; how to avoid local optima has therefore become a difficulty in the field of artificial intelligence.
Disclosure of Invention
The invention aims to provide an effective deep learning method for multi-class recognition of inertial sensor data. Based on transfer learning, the model avoids local optima during training; the final model can effectively mine hidden information from the inertial sensor data of a smartphone and complete motion direction identification, identity verification, and carrying mode identification for the mobile phone carrier.
To achieve the above object, the technical solution proposed by the present invention is a multi-mode identification method for mobile phone inertial sensor data, comprising the following steps:
step 1, collecting training sample data when an inertial sensor carrier uses a smart phone;
step 2, cutting the raw sensor data with a sliding window to generate a number of samples, where each sample contains several frames of data and adjacent samples overlap; a label is set for each sample as the samples are generated;
step 3, constructing one single-task deep neural network model for each classification task, three models in total, where each model comprises three convolutional layers, two LSTM units, an attention mechanism module, and a fully connected layer; the number of neurons in the output layer of the fully connected layer equals the number of modes of the corresponding task, which for the three models is, respectively, the number of moving directions, the number of pedestrian identities, and the number of mobile phone carrying modes;
step 4, inputting the samples generated in step 2, with the labels of the corresponding task, into the deep network model of that task in step 3, and training the models until convergence;
step 5, constructing a multi-task deep network model comprising three parallel convolutional branches, two temporal convolutional layers, and three parallel decoding layers, where each decoding layer comprises two LSTM units, an attention mechanism module, and a fully connected layer; the model performs transfer learning from the single-task network models of step 4 and is trained on the samples of step 2 until convergence;
step 6, collecting test data while a user uses a smartphone with an inertial sensor;
step 7, inputting the test data into the multi-task model trained in step 5 for calculation; the model outputs three vectors corresponding to the probability predictions of the tasks; the maximum value in each probability prediction vector is detected, and if it is greater than the corresponding threshold, the corresponding label is returned as the prediction result for that task; otherwise −1 is returned, indicating that the data belong to an illegal type for that task.
Moreover, when the multi-task model performs transfer learning from the single-task models in step 5, the convolutional layers of the three single-task models (for moving direction identification, identity verification, and mobile phone carrying mode identification) are migrated into the multi-task model to form three parallel convolutional branches, so that the multi-task model can mine more features and mitigate local optima.
In step 2, a sliding window of length 128 and step 64 is used to cut the raw sensor data; a single generated sample contains 128 frames of data, and each frame contains 6 floating-point numbers, corresponding respectively to the x-, y-, and z-axis data of the accelerometer and the x-, y-, and z-axis data of the gyroscope. Each sample is given three labels, whose contents are the identity number of the data collector, the moving direction, and the mobile phone carrying mode corresponding to the sample.
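The slicing described here (length 128, step 64, hence 50% overlap) can be sketched in a few lines of NumPy; the function and variable names below are illustrative, not from the patent.

```python
import numpy as np

def sliding_windows(stream, window=128, step=64):
    """Cut a (T, 6) sensor stream into overlapping (window, 6) samples.

    window=128 and step=64 reproduce the 50% overlap described in step 2.
    """
    starts = range(0, stream.shape[0] - window + 1, step)
    return np.stack([stream[s:s + window] for s in starts])

# 10 s of 50 Hz data: 500 frames of [acc x, y, z, gyro x, y, z]
stream = np.arange(500 * 6, dtype=np.float32).reshape(500, 6)
samples = sliding_windows(stream)
print(samples.shape)  # (6, 128, 6)
# adjacent samples share half a window (50% overlap)
print(np.array_equal(samples[0][64:], samples[1][:64]))  # True
```

A real pipeline would attach the three per-sample labels at the same point, since each window inherits the collector, direction, and carrying mode of the recording it was cut from.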
Moreover, in the single-task deep neural network model in step 3, the first convolutional layer contains 64 one-dimensional convolution kernels of length 25, the second and third convolutional layers each contain 64 one-dimensional convolution kernels of length 21, and the number of hidden-layer neurons in the two LSTM units is 128. The processing procedure is as follows.
A sample of size (128, 6) is input into the first convolutional layer to obtain a feature map FM1 of size (104, 6, 64); FM1 is input into the second convolutional layer to obtain a feature map FM2 of size (84, 6, 64); FM2 is input into the third convolutional layer to obtain a feature map FM3 of size (64, 6, 64). FM3 is then reshaped into a two-dimensional matrix of size (64, 6 × 64), i.e. 64 vectors of length 384, which are input into the first LSTM unit to produce 64 outputs, each a vector of length 128. These 64 vectors are input into the second LSTM unit, which again produces 64 vectors of length 128, i.e. a two-dimensional matrix of size (64, 128), denoted h_lstm. h_lstm is input into the attention mechanism module for a score-weighted summation, calculated as follows:

s_i = v^T tanh(W h_i + b)

α_i = exp(s_i) / Σ_{j=1}^{N} exp(s_j)

h_attention = Σ_{i=1}^{N} α_i h_i

where h_i is the i-th vector of h_lstm, s_i is its attention score, α_i its normalized weight, v is a column vector of length 80, W is a two-dimensional matrix of size (80, 128), b is a column vector of length 80, N is the number of vectors in h_lstm, and tanh is the hyperbolic tangent function.
The output of the attention mechanism module is a vector h_attention of length 128, which is input into the fully connected layer and passed through a softmax transformation to finally obtain a vector representing the recognition result, each value in the vector corresponding to the predicted probability of one pedestrian identity.
Furthermore, the multi-task network model in step 5 processes the data as follows.
After entering the network, a sample of size (128, 6) is copied into three copies, which respectively enter the three parallel convolutional branches Conv_direction, Conv_id, and Conv_pose, yielding three feature maps of size (64, 6, 64); these three feature maps are stacked along the third dimension to obtain a feature map of size (64, 6, 192), denoted FM_all.
FM_all sequentially passes through two temporal convolutional layers with dilation coefficients 2 and 4 and 96 and 48 convolution kernels respectively, finally yielding a feature map FM_tcn of size (64, 6, 48). FM_tcn is then reshaped into a two-dimensional matrix of size (64, 6 × 48) and simultaneously input into three parallel decoders. Data entering a decoder first pass through a two-layer LSTM unit with 64 neurons per layer; each LSTM layer outputs 64 vectors of length 64, forming a two-dimensional matrix of size (64, 64). This matrix is input into an attention module, and the output of each attention module enters a fully connected layer to obtain a probability distribution vector.
the multi-task model outputs three probability distribution vectors which correspond to the prediction results of moving direction identification, identity authentication and mobile phone carrying mode identification.
The invention uses a deep learning method to complete moving direction identification, pedestrian identity verification, and mobile phone carrying mode identification, and, based on the idea of transfer learning, migrates the single-task models into a multi-task model, thereby avoiding local optima during training to a certain extent. Compared with the prior art, the technique offers high recognition accuracy, robustness to the environment, and low cost. It can also be used to assist indoor positioning technology and improve positioning precision.
The scheme of the invention is simple and convenient to implement and highly practical; it solves the problems of low practicality and inconvenient application in related technologies, can improve the user experience, and has important market value.
Drawings
FIG. 1 is a diagram of a single-tasking neural network architecture according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a sliding window cut-to-produce sample according to an embodiment of the present invention;
FIG. 3 is a diagram of a multitasking neural network architecture according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is specifically described below with reference to the accompanying drawings and examples.
The invention provides a deep neural network method that can simultaneously complete motion direction identification, pedestrian identity verification, and mobile phone carrying mode identification based on a mobile phone inertial sensor, together with a training method based on transfer learning to avoid local optima. The core of the technique is a deep neural network that, after learning and training, can effectively mine the information hidden in sensor data and accurately identify the moving direction and identity of a pedestrian and the carrying mode of the mobile phone.
The technical core of the invention is two deep neural network models (a single-task network model and a multi-task network model) and a training method in which the multi-task model learns by transfer from the single-task models.
1) Single task network model
Referring to fig. 1, the model consists of three convolutional layers, two LSTM units, an attention mechanism module, and a fully connected layer. The first convolutional layer contains 64 one-dimensional convolution kernels of length 25, the second and third convolutional layers each contain 64 one-dimensional convolution kernels of length 21, the number of hidden-layer neurons in the two LSTM units is 128, and the number of neurons in the output layer of the fully connected layer equals the number of classes. After a sample of size (128, 6) is input into the first convolutional layer, a feature map FM1 of size (104, 6, 64) is obtained; FM1 is input into the second convolutional layer to obtain a feature map FM2 of size (84, 6, 64); FM2 is input into the third convolutional layer to obtain a feature map FM3 of size (64, 6, 64). FM3 is then reshaped into a two-dimensional matrix of size (64, 6 × 64), i.e. 64 vectors of length 384, which are input into the first LSTM unit to produce 64 outputs, each a vector of length 128. These 64 vectors are input into the second LSTM unit, which again produces 64 vectors of length 128, i.e. a two-dimensional matrix of size (64, 128), denoted h_lstm. h_lstm is input into the attention mechanism module for a score-weighted summation, calculated as follows:

s_i = v^T tanh(W h_i + b)

α_i = exp(s_i) / Σ_{j=1}^{N} exp(s_j)

h_attention = Σ_{i=1}^{N} α_i h_i

where h_i is the i-th vector of h_lstm, s_i is its attention score, α_i its normalized weight, v is a column vector of length 80, W is a two-dimensional matrix of size (80, 128), b is a column vector of length 80 (v, W, and b are learnable network parameters), N is the number of vectors in h_lstm, and tanh is the hyperbolic tangent function.
The output of the attention mechanism module is a vector h_attention of length 128, which is input into the fully connected layer and passed through a softmax transformation to finally obtain a vector representing the recognition result, each value in the vector corresponding to the predicted probability of one category.
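The feature-map time sizes quoted in this architecture (128 → 104 → 84 → 64) follow from the standard output-length formula for an unpadded one-dimensional convolution; a quick arithmetic check:

```python
def conv1d_out_len(length, kernel, stride=1, dilation=1, padding=0):
    """Output length of a 1-D convolution ('valid' convention by default)."""
    return (length + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# Three unpadded convolutions along the 128-frame time axis,
# with kernel lengths 25, 21, 21 as stated in the text:
lengths = [128]
for kernel in (25, 21, 21):
    lengths.append(conv1d_out_len(lengths[-1], kernel))
print(lengths)  # [128, 104, 84, 64] -> time sizes of the input, FM1, FM2, FM3
```

This confirms the sizes (104, 6, 64), (84, 6, 64), and (64, 6, 64) are internally consistent with kernel lengths 25, 21, and 21.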
2) Multitasking model
Referring to fig. 3, the multi-task model is formed by splitting and merging three different single-task models. After the three single-task models (the motion direction identification model, the identity verification model, and the mobile phone carrying mode identification model) have been trained, their convolutional layer parts are denoted, in order, Conv_direction, Conv_id, and Conv_pose. The invention migrates these three convolutional layer parts into the multi-task model. An input sample simultaneously enters Conv_direction, Conv_id, and Conv_pose, yielding three feature maps of size (64, 6, 64), denoted FM_direction, FM_id, and FM_pose, which are stacked along the third dimension to obtain a feature map of size (64, 6, 192), denoted FM_all.
Then, based on the idea of dilated convolution, the invention applies temporal convolution to reduce the spatial dimension of FM_all. FM_all sequentially passes through two temporal convolutional layers with dilation coefficients 2 and 4 and 96 and 48 convolution kernels respectively, finally yielding a feature map FM_tcn of size (64, 6, 48). FM_tcn is then reshaped into a two-dimensional matrix of size (64, 6 × 48) and simultaneously input into three parallel decoders. Data entering a decoder first pass through a two-layer LSTM unit (64 neurons per layer); each LSTM layer outputs 64 vectors of length 64, i.e. a two-dimensional matrix of size (64, 64).
The three two-dimensional matrices are respectively input into attention modules (with the same structure as the attention module in the single-task model), and the output of each attention module enters the fully connected layer to obtain a probability distribution vector representing the recognition result.
The multi-task model outputs three vectors which respectively correspond to the prediction results of motion direction identification, identity authentication and mobile phone carrying mode identification.
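The channel-wise stacking of the three transferred branches, and the shapes through the temporal convolution stage, can be checked with a small NumPy sketch. It assumes (as the quoted sizes imply) that the temporal convolutions use same-style padding so the 64-step axis is preserved while the kernel counts set the channel depth.

```python
import numpy as np

# Each parallel branch yields a (64, 6, 64) feature map; the multi-task
# model stacks the three maps along the channel (third) dimension.
fm_direction = np.zeros((64, 6, 64))
fm_id = np.zeros((64, 6, 64))
fm_pose = np.zeros((64, 6, 64))
fm_all = np.concatenate([fm_direction, fm_id, fm_pose], axis=2)
print(fm_all.shape)  # (64, 6, 192)

# The two temporal convolutional layers change only the channel depth
# (192 -> 96 -> 48), giving FM_tcn of size (64, 6, 48), which is then
# reshaped to (64, 6 * 48) = (64, 288) before entering the decoders.
fm_tcn = np.zeros((64, 6, 48))
print(fm_tcn.reshape(64, -1).shape)  # (64, 288)
```

The stacking is why the transfer helps: each branch keeps the features its single-task model learned, and the shared temporal layers fuse them for all three decoders.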
When a multi-task model is used for multi-classification (motion direction identification, identity authentication and mobile phone carrying mode identification), the method mainly comprises the following steps.
The first step is as follows: collecting training sample data when an inertial sensor carrier uses a smartphone: in specific implementation, a mobile phone carrier can use the smart phone to collect data for neural network learning. During collection, the collection frequency is set to be 50 Hz. The collected data includes four movement directions (forward, backward, left and right), four mobile phone carrying modes (pocket, swing, holding, and sitting), and a plurality of collectors.
The second step: the raw sensor data are cut with a sliding window to generate a number of samples, where each sample contains n frames of data and adjacent samples overlap by P%; a label is made for each sample as it is generated. Concretely, the raw sensor data are sliced with a sliding window of length 128 and step 64, generating samples of 128 frames each, with 50% data overlap between adjacent samples. While the samples are generated, each sample is labeled with the corresponding moving direction, mobile phone carrying mode, and collector number.
Referring to fig. 2, in the embodiment, it is preferable to set the generated single sample to have 128 frames of data, each frame of data has 6 floating point numbers, which correspond to the x, y, and z axis data of the accelerometer and the x, y, and z axis data of the gyroscope, respectively. When the label is made, each sample has three labels, and the content of the label is the identity number, the moving direction and the mobile phone carrying mode of the data acquirer corresponding to the sample.
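A single labeled sample as described here can be represented as a (128, 6) array plus three task labels; the key names below are illustrative, not from the patent.

```python
import numpy as np

# One training sample: 128 frames x 6 floats (acc x/y/z + gyro x/y/z),
# plus the three labels set during slicing.
sample = np.zeros((128, 6), dtype=np.float32)
labels = {
    "identity": 3,    # data collector's identity number
    "direction": 0,   # moving direction label
    "pose": 2,        # mobile phone carrying mode label
}
print(sample.shape)   # (128, 6)
print(len(labels))    # 3
```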
The third step: constructing a single-task deep neural network model, wherein each classified task is constructed by one model, and the three models are total, each model comprises three convolution layers, two LSTM units, an attention mechanism module and a full connection layer, wherein the first convolution layer contains 64 one-dimensional convolution kernels with the length of 25, the second convolution layer and the third convolution layer respectively contain 64 one-dimensional convolution kernels with the length of 21, the number of neurons in hidden layers in the two LSTM units is 128, the number of neurons in output layers of the full connection layer is equal to the number of modes of corresponding tasks, and the number of modes of corresponding tasks of the three models is respectively the number of moving directions, the number of pedestrians and the number of mobile phone carrying modes;
the specific single-task deep neural network model implementation is seen in section 1) above.
The fourth step: and inputting the sample set generated in the third step into each single task model for learning and training, and setting appropriate training parameters (learning rate, training round number and the like) to train the model until the model converges.
The fifth step: constructing a multi-task deep network model, wherein the model comprises three parallel multi-convolution layers, two layers of time sequence convolution layers and three parallel decoding layers, and each decoding layer comprises two LSTM units, an attention mechanism module and a full connection layer; enabling the model to perform transfer learning on the single task network model in the fourth step and training based on the samples in the second step until convergence;
and loading the multi-task model into the convolution layer parameters of each single-task model, reading the sample set, and training until convergence. And storing the trained multi-task model to the rear end of the server, so that the server can calculate the received sensor data in real time, and finish moving direction identification, identity authentication and mobile phone carrying mode identification.
And a sixth step: collecting test data when a user uses a smartphone with an inertial sensor: in specific implementation, a test user can acquire data through a WeChat applet on the smart phone, and each group of data (with the period of 2.56s) is automatically sent to the back end of the server.
The seventh step: the test data are input into the multi-task model trained in the fifth step for calculation; the model outputs three vectors corresponding to the probability predictions of the tasks. The maximum value in each probability prediction vector is detected; if it is greater than the corresponding threshold, the corresponding label is returned as the prediction result for that task; otherwise −1 is returned, indicating that the data belong to an illegal type for that task. In the specific implementation, the multi-task model on the server processes the data and returns the recognition result of each task to the mobile phone terminal.
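The thresholded decision rule of this step is simple to sketch; the per-task threshold values below are illustrative, since the patent does not specify them.

```python
import numpy as np

def predict_with_threshold(prob, threshold):
    """Return the arg-max label if its probability exceeds the task's
    threshold, otherwise -1 (illegal type for this task)."""
    best = int(np.argmax(prob))
    return best if prob[best] > threshold else -1

# Illustrative per-task probability vectors with a 0.5 threshold:
direction_probs = np.array([0.05, 0.85, 0.06, 0.04])  # confident prediction
identity_probs = np.array([0.34, 0.33, 0.33])         # ambiguous prediction
print(predict_with_threshold(direction_probs, 0.5))   # 1
print(predict_with_threshold(identity_probs, 0.5))    # -1
```

The −1 branch is what lets the identity task reject users not seen in training rather than forcing a match.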
In specific implementation, a person skilled in the art can implement the automatic operation process by using a computer software technology, and a system device for implementing the method, such as a computer-readable storage medium storing a corresponding computer program according to the technical solution of the present invention and a computer device including a corresponding computer program for operating the computer program, should also be within the scope of the present invention.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (5)

1. A multi-mode identification method for mobile phone inertial sensor data is characterized by comprising the following steps:
step 1, collecting training sample data when an inertial sensor carrier uses a smart phone;
step 2, cutting the raw sensor data with a sliding window to generate a number of samples, wherein each sample contains several frames of data and adjacent samples overlap, and a label is set for each sample as the samples are generated;
step 3, constructing one single-task deep neural network model for each classification task, three models in total, wherein each model comprises three convolutional layers, two LSTM units, an attention mechanism module, and a fully connected layer; the number of neurons in the output layer of the fully connected layer equals the number of modes of the corresponding task, which for the three models is, respectively, the number of moving directions, the number of pedestrian identities, and the number of mobile phone carrying modes;
step 4, inputting the samples generated in step 2, with the labels of the corresponding task, into the deep network model of that task in step 3, and training the models until convergence;
step 5, constructing a multi-task deep network model comprising three parallel convolutional branches, two temporal convolutional layers, and three parallel decoding layers, wherein each decoding layer comprises two LSTM units, an attention mechanism module, and a fully connected layer; the model performs transfer learning from the single-task network models of step 4 and is trained on the samples of step 2 until convergence;
step 6, collecting test data when a user uses the smart phone with the inertial sensor;
step 7, inputting the test data into the multi-task model trained in step 5 for calculation, wherein the model outputs three vectors respectively corresponding to the probability prediction of each task; detecting the maximum value in each probability prediction vector, and if the maximum value is greater than the corresponding threshold, returning the corresponding label as the prediction result of the task; otherwise returning −1, which indicates that the data is of an illegal type in the task.
2. The multi-mode identification method for mobile phone inertial sensor data according to claim 1, characterized in that: in step 5, when the multi-task model performs transfer learning from the single-task models, the multi-convolution layers of the three single-task models for moving direction identification, identity authentication and mobile phone carrying mode identification are migrated into the multi-task model to form the three parallel multi-convolution layers, so that the multi-task model can extract more features and is less prone to falling into local optima.
3. The multi-mode identification method for mobile phone inertial sensor data according to claim 1 or 2, characterized in that: in step 2, a sliding window with a length of 128 and a step length of 64 is adopted to cut the original sensor data; each generated sample contains 128 frames of data, and each frame contains 6 floating-point numbers corresponding respectively to the x-, y- and z-axis data of the accelerometer and the x-, y- and z-axis data of the gyroscope; each sample is assigned three labels, whose contents are the identity number of the data collector, the moving direction and the mobile phone carrying mode corresponding to the sample.
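The segmentation of claim 3 (window length 128, step 64, hence a 64-frame overlap between adjacent samples) can be sketched with NumPy; the function name and the dummy zero input are illustrative:

```python
import numpy as np

def sliding_window_samples(data, window=128, step=64):
    """Cut raw (T, 6) sensor data into overlapping (window, 6) samples.

    Each row holds one frame: accelerometer x/y/z and gyroscope x/y/z.
    Adjacent samples overlap by window - step frames.
    """
    samples = []
    for start in range(0, len(data) - window + 1, step):
        samples.append(data[start:start + window])
    return np.stack(samples)

raw = np.zeros((640, 6))      # e.g. 640 frames of 6-channel IMU data
samples = sliding_window_samples(raw)
print(samples.shape)          # (9, 128, 6): window starts 0, 64, ..., 512
```

In practice each of the 9 windows would also be paired with its three labels (identity, moving direction, carrying mode) at generation time, as the claim states.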
4. The multi-mode identification method for mobile phone inertial sensor data according to claim 3, characterized in that: in the single-task deep neural network model of step 3, the first convolutional layer contains 64 one-dimensional convolution kernels of length 25, the second and third convolutional layers each contain 64 one-dimensional convolution kernels of length 21, and the number of neurons in the hidden layer of each of the two LSTM units is 128; the processing procedure is as follows,
after a sample of size (128,6) is input into the first convolutional layer, a feature map FM_1 of size (104,6,64) is obtained; FM_1 is input into the second convolutional layer to obtain a feature map FM_2 of size (84,6,64), and FM_2 is input into the third convolutional layer to obtain a feature map FM_3 of size (64,6,64); FM_3 is then reduced to a two-dimensional matrix of size (64, 6×64), i.e. 64 vectors of length 384, and input into the first LSTM unit, which generates 64 outputs, each an output vector of length 128; these 64 vectors are then input into the second LSTM unit, generating 64 vectors of length 128, i.e. a two-dimensional matrix of size (64,128), denoted h_lstm; h_lstm is input into the attention mechanism module for score-weighted summation, calculated as follows:
$$u_i = \tanh(W h_i + b)$$
$$\alpha_i = \frac{\exp(V^\top u_i)}{\sum_{j=1}^{N}\exp(V^\top u_j)}$$
$$h_{attention} = \sum_{i=1}^{N} \alpha_i h_i$$
wherein h_i is the i-th vector of h_lstm and α_i is its attention weight, V is a column vector of length 80, W is a two-dimensional matrix of size (80,128), b is a column vector of length 80, N is the number of vectors in h_lstm, and tanh is the hyperbolic tangent function;
the output of the attention mechanism module is a vector h_attention of length 128; h_attention is then input into the fully connected layer and transformed by softmax, finally obtaining a vector representing the recognition result, wherein each value in the vector corresponds to the prediction probability of one pedestrian identity.
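The formula images in this record are not legible, but the stated dimensions (W of size (80,128), b and V of length 80, h_lstm of size (64,128), attention output of length 128) are consistent with a standard additive attention, which the NumPy sketch below checks with random tensors; the random weights are placeholders, not trained parameters. Note also that the feature-map lengths in claim 4 follow valid 1-D convolution arithmetic: 128−25+1=104, 104−21+1=84, 84−21+1=64.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes from claim 4: h_lstm (64,128); W (80,128); b, V of length 80
h_lstm = rng.standard_normal((64, 128))
W = rng.standard_normal((80, 128))
b = rng.standard_normal(80)
V = rng.standard_normal(80)

u = np.tanh(h_lstm @ W.T + b)                  # (64, 80): u_i = tanh(W h_i + b)
scores = u @ V                                 # (64,): V^T u_i
alpha = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
h_attention = alpha @ h_lstm                   # (128,): weighted sum of the h_i

print(h_attention.shape)   # (128,), matching the length-128 output in the claim
```

The weights α_i sum to 1, so h_attention is a convex combination of the 64 LSTM output vectors.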
5. The multi-mode identification method for mobile phone inertial sensor data according to claim 4, characterized in that: the processing of the data by the multi-task network model in step 5 is as follows,
after a sample of size (128,6) enters the network, it is copied into three copies, which enter the three parallel multi-convolution layers Conv_direction, Conv_id and Conv_pose, obtaining three feature maps of size (64,6,64); these three feature maps are stacked along the third dimension to obtain a feature map of size (64,6,192), denoted FM_all;
FM_all sequentially enters the two temporal convolutional layers, whose dilation coefficients are 2 and 4 and whose numbers of convolution kernels are 96 and 48 respectively, finally obtaining a feature map FM_tcn of size (64,6,48); FM_tcn is then reshaped into a two-dimensional matrix of size (64, 6×48) and input simultaneously into the three parallel decoders; data entering a decoder first passes through a two-layer LSTM unit with 64 neurons per layer, each LSTM layer outputting 64 vectors of length 64 to form a two-dimensional matrix of size (64,64); this two-dimensional matrix is input into an attention module, and the output of each attention module then enters a fully connected layer to obtain a probability distribution vector;
the multi-task model outputs three probability distribution vectors, corresponding respectively to the prediction results of moving direction identification, identity authentication and mobile phone carrying mode identification.
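The tensor shapes through the multi-task model of claim 5 can be traced with dummy arrays; only the shape bookkeeping from the claim is reproduced here, the actual convolution and LSTM arithmetic being stubbed out with zero tensors:

```python
import numpy as np

# One input sample, per claims 3 and 5: 128 frames x 6 IMU channels
sample = np.zeros((128, 6))

# Three parallel multi-convolution branches each map (128,6) -> (64,6,64)
fm_direction = np.zeros((64, 6, 64))
fm_id = np.zeros((64, 6, 64))
fm_pose = np.zeros((64, 6, 64))

# Stack along the third dimension -> FM_all of size (64, 6, 192)
fm_all = np.concatenate([fm_direction, fm_id, fm_pose], axis=2)

# Two temporal conv layers (dilations 2, 4; 96 then 48 kernels) -> (64,6,48)
fm_tcn = np.zeros((64, 6, 48))

# Reshape to (64, 6*48) before feeding the three parallel decoders
decoder_in = fm_tcn.reshape(64, 6 * 48)
print(fm_all.shape, decoder_in.shape)   # (64, 6, 192) (64, 288)
```

Each decoder then reduces its (64, 288) input through the two 64-unit LSTM layers and an attention module to a single probability vector, giving the three task outputs.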
CN202210179112.2A 2022-02-25 2022-02-25 Multi-mode identification method for mobile phone inertial sensor data Active CN114638285B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210179112.2A CN114638285B (en) 2022-02-25 2022-02-25 Multi-mode identification method for mobile phone inertial sensor data


Publications (2)

Publication Number Publication Date
CN114638285A true CN114638285A (en) 2022-06-17
CN114638285B CN114638285B (en) 2024-04-19

Family

ID=81947129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210179112.2A Active CN114638285B (en) 2022-02-25 2022-02-25 Multi-mode identification method for mobile phone inertial sensor data

Country Status (1)

Country Link
CN (1) CN114638285B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740052A (en) * 2018-12-26 2019-05-10 武汉大学 The construction method and device of network behavior prediction model, network behavior prediction technique
CN110929243A (en) * 2019-11-22 2020-03-27 武汉大学 Pedestrian identity recognition method based on mobile phone inertial sensor
CN111079547A (en) * 2019-11-22 2020-04-28 武汉大学 Pedestrian moving direction identification method based on mobile phone inertial sensor
US20210281918A1 (en) * 2019-04-23 2021-09-09 Tencent Technology (Shenzhen) Company Limited Video recommendation method and device, computer device and storage medium

Non-Patent Citations (1)

Title
KUANG Xiaohua; HE Jun; HU Zhaohua; ZHOU Yuan: "Comparison of deep feature learning methods for human activity recognition", Application Research of Computers, no. 09, 28 August 2017 (2017-08-28) *


Similar Documents

Publication Publication Date Title
CN110287880A (en) A kind of attitude robust face identification method based on deep learning
CN109146921B (en) Pedestrian target tracking method based on deep learning
CN108629288B (en) Gesture recognition model training method, gesture recognition method and system
CN111325664B (en) Style migration method and device, storage medium and electronic equipment
CN110427867A (en) Human facial expression recognition method and system based on residual error attention mechanism
CN106909938B (en) Visual angle independence behavior identification method based on deep learning network
CN111079547B (en) Pedestrian moving direction identification method based on mobile phone inertial sensor
Su et al. HDL: Hierarchical deep learning model based human activity recognition using smartphone sensors
Jing et al. Spatiotemporal neural networks for action recognition based on joint loss
CN110443309A (en) A kind of electromyography signal gesture identification method of combination cross-module state association relation model
CN109508686A (en) A kind of Human bodys' response method based on the study of stratification proper subspace
Ying et al. Processor free time forecasting based on convolutional neural network
CN111291713B (en) Gesture recognition method and system based on skeleton
Savio et al. Image processing for face recognition using HAAR, HOG, and SVM algorithms
Liu et al. Lightweight monocular depth estimation on edge devices
Li et al. Multimodal gesture recognition using densely connected convolution and blstm
Liang et al. A lightweight method for face expression recognition based on improved MobileNetV3
CN113743247A (en) Gesture recognition method based on Reders model
CN116823868A (en) Melanin tumor image segmentation method
Li et al. [Retracted] Human Motion Representation and Motion Pattern Recognition Based on Complex Fuzzy Theory
Song et al. Track foreign object debris detection based on improved YOLOv4 model
CN114638285A (en) Multi-mode identification method for mobile phone inertial sensor data
CN114140524B (en) Closed loop detection system and method for multi-scale feature fusion
Wu et al. Infrared target detection based on deep learning
CN114492732A (en) Lightweight model distillation method for automatic driving visual inspection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant