CN115273236A - Multi-mode human gait emotion recognition method
- Publication number: CN115273236A (application CN202210903599.4A)
- Authority: CN (China)
- Prior art keywords: emotion, gait, electroencephalogram, data, inertial
- Legal status: Pending (the status listed is an assumption, not a legal conclusion)
Classifications
- G06V40/25 — Recognition of walking or running movements, e.g. gait recognition
- G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
- G06V10/764 — Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
- G06V10/82 — Image or video recognition using neural networks
- G06V40/70 — Multimodal biometrics, e.g. combining information from different biometric modalities
Abstract
The invention discloses a multi-modal human gait emotion recognition method in the field of emotion recognition. A head-mounted VR (virtual reality) device stimulates the tested person to generate emotion, and gait emotion data are collected with inertial sensor nodes worn on the legs and electroencephalogram sensor nodes worn on the head of the tested person. The gait emotion data are segmented with a sliding window method to obtain gait emotion samples. The inertial gait emotion samples are converted into a frequency-domain representation by the fast Fourier transform, and the electroencephalogram gait emotion samples are converted into a time-frequency-domain representation by the continuous wavelet transform, to obtain gait emotion images. Model training and parameter optimization on the training data yield a convolutional neural network model based on a channel attention mechanism. The inertial gait emotion feature matrix, the electroencephalogram gait emotion feature matrix and the fused gait emotion feature matrix are each taken as input to a fully connected classifier, a corresponding decision-level fusion mechanism is established, and human gait emotion recognition is carried out. The invention can effectively overcome the influence of a single modality on the gait emotion recognition effect and effectively improve the accuracy of human gait emotion recognition.
Description
Technical Field
The invention relates to the technical field of emotion recognition, and in particular to a multi-modal human gait emotion recognition method.
Background
With progress in technology, the economy and artificial intelligence, human-computer interaction is becoming more and more frequent. Beyond simply operating a computer with a mouse and keyboard, it is increasingly desirable that computers understand and handle human emotions, since most of the information people communicate is affective. Emotion recognition is a key part of affective computing: its goal is to enable a computer to receive and process signals representing human emotions, infer emotional states, and realize human-centered human-machine interaction. Gait is the posture and motion pattern of the human body during normal walking, and shows obvious individual difference and uniqueness. Gait recognition is a prerequisite for gait emotion recognition. There is evidence that human emotion is expressed to some extent through walking: changes in emotion lead to changes in gait motion characteristics. Because gait capture is non-invasive and widely acceptable, gait-based emotion recognition is effective.
Emotional expression is itself a multi-faceted process, and a single modality is not sufficient to study it. With advances in multi-source heterogeneous information fusion, the complementarity of multi-modal information can make up for the deficiencies of any single modality and is more favorable for emotion recognition, so multimodal emotion recognition has become a popular research direction. The combined optimization of multi-source information is called data fusion and can be divided into input-level fusion, feature-level fusion and decision-level fusion. With the rapid development of deep neural networks, deep learning shows superior performance in emotion recognition: it can learn the most effective features directly from input data without manual feature extraction.
Disclosure of Invention
To overcome the defects of the prior art, namely that a single input modality cannot suit all human gait emotion recognition tasks and that any single modality carries a certain classification error, the embodiments of the invention provide a multi-modal human gait emotion recognition method.
In order to achieve the purpose, the invention provides the following technical scheme: a multi-modal human gait emotion recognition method comprises the following specific recognition steps:
step S1: stimulating the tested person to generate e emotions through the head-mounted VR equipment, and collecting inertial gait emotion data and electroencephalogram gait emotion data of v tested persons by using n inertial sensor nodes worn on the legs of the tested person and m electroencephalogram sensor nodes worn on the head of the tested person;
step S2: respectively carrying out segmentation processing on the inertial gait emotion data and the electroencephalogram gait emotion data acquired by each sensor node according to fixed intervals by using a sliding window method so as to obtain all inertial gait emotion samples and electroencephalogram gait emotion samples;
and step S3: converting the inertial gait emotion samples obtained in step S2 into a frequency-domain representation by the fast Fourier transform to obtain inertial gait emotion images; converting the electroencephalogram gait emotion samples obtained in step S2 into a time-frequency-domain representation by the continuous wavelet transform to obtain electroencephalogram gait emotion images;
and step S4: respectively dividing the inertial gait emotion images and the electroencephalogram gait emotion images obtained in step S3 into training data and test data, and performing model training and parameter optimization with the training data to obtain a convolutional neural network model based on a channel attention mechanism;
step S5: extracting features from the inertial gait emotion images obtained in step S3 with the channel-attention-based convolutional neural network model obtained in step S4 to obtain an inertial gait emotion feature matrix, and extracting features from the electroencephalogram gait emotion images obtained in step S3 to obtain an electroencephalogram gait emotion feature matrix;
step S6: performing feature fusion processing on the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix obtained in the step S5 to obtain a fusion gait emotion feature matrix;
step S7: and (4) respectively taking the inertia gait emotion feature matrix and the electroencephalogram gait emotion feature matrix obtained in the step (5) and the fusion gait emotion feature matrix obtained in the step (6) as the input of a full-connection classifier to obtain corresponding prediction labels, establishing a corresponding decision layer fusion mechanism, and carrying out human gait emotion recognition.
In a preferred embodiment, the step S2 specifically includes:
S2.1: respectively dividing the inertial gait emotion data and electroencephalogram gait emotion data acquired by the n inertial sensor nodes and m electroencephalogram sensor nodes worn by each tested person into a plurality of inertial gait emotion data fragments and electroencephalogram gait emotion data fragments of the same length by the sliding window method, each gait emotion data fragment serving as one sample, so as to obtain all inertial gait emotion samples and electroencephalogram gait emotion samples;
S2.2: the parameters of the sliding window method are the window size and the sliding step. With window size len and a sliding step of mu × len between two adjacent windows (0 < mu <= 1), T windows are obtained, where T = f(l/(mu × len)) - 1, l denotes the total length of the gait emotion data, and f() denotes the floor function, taking the nearest integer not greater than its argument;
S2.3: the inertial gait emotion data segment in the t-th (t = 1, 2, …, T) window collected by the k-th (k = 1, 2, …, n) inertial sensor node consists of x-axis, y-axis and z-axis acceleration data, x-axis, y-axis and z-axis angular velocity data, and x-axis, y-axis and z-axis magnetic field data, and can be expressed as a len × 9 matrix, denoted here I(k, t); the electroencephalogram gait emotion data segment in the t-th window acquired by the j-th (j = 1, 2, …, m) electroencephalogram sensor node is denoted E(j, t).
In a preferred embodiment, the step S3 specifically includes:
S3.1: converting the inertial gait emotion samples from a time-domain representation to a frequency-domain representation by the fast Fourier transform to obtain frequency-domain inertial gait emotion samples of size len × 9; splicing the time-domain and frequency-domain inertial gait emotion samples along the time dimension to obtain time-frequency inertial gait emotion samples of size 2len × 9; and mapping the sample data values to gray values to form a gray-value matrix, so as to obtain the inertial gait emotion images;
S3.2: converting the electroencephalogram gait emotion samples from a time-domain representation to a time-frequency-domain representation by the continuous wavelet transform, an N-layer wavelet decomposition yielding M = 2^N frequency bands, to obtain time-frequency-domain electroencephalogram gait emotion samples of size len × M; and mapping the sample data values to gray values to form a gray-value matrix, so as to obtain the electroencephalogram gait emotion images.
In a preferred embodiment, the step S5 specifically includes:
S5.1: the convolutional neural network mainly comprises convolutional layers, pooling layers and fully connected layers. To capture the important correlation features among the multiple channel signals of the inertial sensors hidden in the inertial gait emotion image, convolution kernels of several different sizes are used, and the stride and pooling parameters of the kernels are adjusted to obtain inter-channel correlation feature matrices over different numbers of channels, which are spliced along the channel dimension; the same kernel parameters and network structure are used to extract the important correlation features among the multiple frequency bands of the wavelet signals of each electroencephalogram sensor in the electroencephalogram gait emotion image;
S5.2: the feature matrix F obtained after the convolutional network can be expressed as F ∈ R^(c×h×w), where c denotes the number of channels and h, w the spatial dimensions. A channel attention mechanism yields a c-dimensional vector Atn in which each value belongs to [0, 1], and multiplying the feature matrix by these weights channel-wise, F × Atn, gives the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix.
In a preferred embodiment, the step S6 specifically includes: the sizes of the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix are 1 × d1 and 1 × d2 respectively (d1, d2 denoting the respective feature lengths); the two feature matrices are connected in series and feature-normalized to form a new high-dimensional composite feature representation of size 1 × (d1 + d2), so as to obtain the fusion gait emotion feature matrix.
In a preferred embodiment, the step S7 specifically includes:
S7.1: obtaining, through P-fold cross-validation, the F1 scores of the e emotions for each of the o classification models as contribution rates;
S7.2: establishing the evaluation matrix of the decision-level fusion model from the contribution rates:
R = [r(i, h)], i = 1, 2, …, e; h = 1, 2, …, o
wherein R denotes the evaluation matrix and r(i, h) denotes the contribution rate of the i-th emotion to the h-th classification model;
S7.3: the contrast intensity sigma(h) and the conflict index A(h) of the h-th (h = 1, 2, …, o) classification model are expressed by the standard deviation and the correlation coefficient respectively:
sigma(h) = sqrt((1/e) × sum_i (r(i, h) - rbar(h))^2),  A(h) = sum_j (1 - rho(j, h))
wherein rbar(h) denotes the mean of the e emotion contribution rates of the h-th classification model and rho(j, h) denotes the correlation coefficient between the j-th and h-th classification models;
the information quantity of the h-th classification model is obtained as:
C(h) = sigma(h) × A(h)
wherein C(h) denotes the information quantity of the h-th classification model;
the weight of the h-th classification model is obtained as:
w(h) = C(h) / (C(1) + C(2) + … + C(o))
wherein w(h) denotes the weight of the h-th classification model;
the fused output for the i-th emotion is obtained as:
S(i) = sum_h w(h) × p(i, h)
wherein p(i, h) denotes the output of the h-th classification model for the i-th emotion, and the test sample is classified into the emotion with the largest S(i);
s7.4: and establishing a corresponding decision layer fusion mechanism to carry out human gait emotion recognition.
The invention provides a multi-modal human gait emotion recognition method. A head-mounted VR device stimulates the tested person to generate emotion; gait emotion data are collected with inertial sensor nodes and electroencephalogram sensor nodes, converted into gait emotion images, and used to construct a human gait emotion data set. For emotion recognition, a convolutional neural network model based on channel attention is provided. The model combines a channel attention mechanism with a convolutional neural network to automatically extract deep features from the gait emotion images; the end-to-end network structure reduces engineering complexity, and a lightweight design of the network structure and parameters reduces model complexity and improves algorithm performance. At the fusion level, the classification decisions on the inertial gait emotion feature matrix, the electroencephalogram gait emotion feature matrix and the fused gait emotion feature matrix are fused, with the output weight of each model obtained from the evaluation matrix. The proposed algorithm can effectively overcome the influence of any single modality on the recognition effect in gait emotion recognition, and can effectively improve the robustness of the system and the recognition accuracy of the human gait emotion recognition method.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
FIG. 2 is a schematic diagram of a convolutional neural network model structure according to the present invention.
Detailed description of the preferred embodiments
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a multi-modal human gait emotion recognition method as shown in figure 1, which comprises the following specific recognition steps:
step S1: the method comprises the steps of stimulating a tested person to generate e emotions through a head-mounted VR device, and collecting inertial gait emotion data and electroencephalogram gait emotion data of v tested persons by utilizing n inertial sensor nodes worn on the leg of the tested person and m electroencephalogram sensor nodes worn on the head of the tested person.
Specifically, the tested person wears the head-mounted VR device and watches a video suited to them so as to generate a specific emotion; after entering the emotional state, walking begins. The n inertial sensor nodes fixed on the legs and the m electroencephalogram sensor nodes fixed on the head of the tested person acquire inertial gait emotion data and electroencephalogram gait emotion data during walking. Each inertial sensor node consists of an xyz three-axis accelerometer, an xyz three-axis gyroscope and an xyz three-axis magnetometer, acquiring xyz three-axis acceleration, angular velocity and magnetic field data respectively. For the k-th (k = 1, 2, …, n) inertial sensor node, the acquired three-axis acceleration, angular velocity and magnetic field data may be denoted here a(k), w(k) and m(k) respectively; for the j-th (j = 1, 2, …, m) electroencephalogram sensor node, the acquired electroencephalogram data may be denoted s(j). Together, the electroencephalogram gait emotion data collected by the electroencephalogram sensor nodes and the inertial gait emotion data collected by the inertial sensor nodes form the gait emotion data.
Step S2: and (3) respectively carrying out segmentation processing on the inertia gait emotion data and the electroencephalogram gait emotion data acquired by each sensor node according to fixed intervals by using a sliding window method so as to obtain all inertia gait emotion samples and electroencephalogram gait emotion samples.
The method comprises the steps of respectively dividing inertial gait emotion data and electroencephalogram gait emotion data acquired by n inertial sensor nodes and m electroencephalogram sensor nodes worn by each tested person into a plurality of inertial gait emotion data fragments and electroencephalogram gait emotion data fragments with the same length by using a sliding window method, and taking each gait emotion data fragment as a sample to obtain all inertial gait emotion samples and electroencephalogram gait emotion samples.
The parameters in the sliding window method comprise window size and sliding step length, the window size is len, the sliding step length between two adjacent windows is mu, so as to obtain T windows, wherein T = f (l/mu len) -1,l represents the total length of gait emotion data, f () represents a downward integer taking function, and the nearest integer less than a calculation result is taken;
the pieces of inertial gait emotion data in the T (T =1,2, …, T) window collected for the k (k =1,2, …, n) th inertial sensor node are measured by the x-axis acceleration dataY-axis acceleration dataZ-axis acceleration dataX-axis angular velocity dataY-axis angular velocity dataZ-axis angular velocity dataX-axis magnetic field dataY-axis magnetic field dataZ-axis magnetic field dataComposition, can be expressed asThe electroencephalogram gait emotion data segment in the T (T =1,2, …, T) window acquired by the j (j =1,2, …, m) th electroencephalogram sensor node is represented as;
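The sliding-window segmentation above can be made concrete with a short pure-Python sketch (the helper name and array shapes are illustrative; the patent publishes no reference code):

```python
def sliding_windows(data, win_len, mu):
    """Split a 1-D gait signal into overlapping windows.

    data    : list of samples from one sensor channel
    win_len : window size len
    mu      : sliding step as a fraction of the window size
    """
    step = int(win_len * mu)
    windows = []
    start = 0
    while start + win_len <= len(data):
        windows.append(data[start:start + win_len])
        start += step
    return windows

signal = list(range(500))                  # e.g. 10 s of one 50 Hz channel
wins = sliding_windows(signal, 50, 0.5)    # 50% overlap between windows
print(len(wins))                           # 19 = floor(500 / 25) - 1
```

Each window becomes one gait emotion sample; with mu = 0.5 the count matches the T = f(l/(mu × len)) - 1 formula.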
And step S3: converting the inertial gait emotion samples obtained in the step S2 into frequency domain representation by a fast Fourier transform method to obtain inertial gait emotion images; converting the electroencephalogram gait emotion sample obtained in the step S2 into time-frequency domain representation through a wavelet continuous transformation method to obtain an electroencephalogram gait emotion image, and specifically comprising the following steps:
Converting the inertial gait emotion samples from a time-domain to a frequency-domain representation by the fast Fourier transform gives frequency-domain inertial gait emotion samples of size len × 9, using the spectrum of a signal to reflect its properties. The time-domain and frequency-domain inertial gait emotion samples are spliced along the time dimension, and a logarithmic operation increases the data differences, giving time-frequency inertial gait emotion samples of size 2len × 9. The data are normalized and mapped to [0, 255], and the sample data values correspond to gray values in an image, forming a gray-value matrix so as to obtain the inertial gait emotion images;
Converting the electroencephalogram gait emotion samples from a time-domain to a time-frequency-domain representation by the continuous wavelet transform localizes the signal in both time and frequency. An N-layer wavelet decomposition yields M = 2^N frequency bands, giving time-frequency-domain electroencephalogram gait emotion samples of size len × M. The data are normalized and mapped to [0, 255], and the sample data values correspond to gray values in an image, forming a gray-value matrix so as to obtain the electroencephalogram gait emotion images;
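The spectrum-to-gray-image mapping used for the inertial samples can be sketched in pure Python (a naive DFT stands in for the FFT here; function names are illustrative):

```python
import cmath
import math

def dft_magnitude(x):
    """Naive O(n^2) DFT of a real signal; returns the magnitude spectrum.

    (A real implementation would use an FFT; this form is for clarity.)
    """
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

def to_gray(values):
    """Min-max normalize values to integer gray levels in [0, 255]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [round(255 * (v - lo) / span) for v in values]

# 1 s of a 5 Hz tone sampled at 50 Hz, as one inertial channel might look
sig = [math.sin(2 * math.pi * 5 * t / 50) for t in range(50)]
gray = to_gray(dft_magnitude(sig))
print(gray[5])  # 255: the spectral peak maps to the brightest gray level
```

Stacking one such gray row per sensor channel yields the gray-value matrix described above.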
and step S4: respectively dividing the inertial gait emotion images and the electroencephalogram gait emotion images obtained in step S3 into training data and test data, and performing model training and parameter optimization with the training data to obtain a convolutional neural network model based on a channel attention mechanism;
Specifically, the inertial gait emotion images and the electroencephalogram gait emotion images obtained in step S3 are each divided into training data and test data. A B-fold cross-validation method is used to optimize the network parameters: the training data are divided evenly into B parts, of which B - 1 parts are used for model training and parameter optimization and the remaining part is used to verify the experimental effect; each part serves once as validation data, for B training rounds in total, and the parameters from the round with the highest accuracy are taken as the optimized parameters, yielding a convolutional neural network model based on a channel attention mechanism that can accurately recognize each human gait emotion. The trained model is tested with the test data, using accuracy, recall and F1 score as model evaluation factors;
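The B-fold procedure amounts to index bookkeeping, as in this sketch (helper names are illustrative; the patent does not fix B):

```python
def b_fold_indices(n_samples, b):
    """Partition sample indices into b roughly equal folds."""
    folds = [[] for _ in range(b)]
    for i in range(n_samples):
        folds[i % b].append(i)
    return folds

def train_val_splits(n_samples, b):
    """Yield (train, validation) index lists for each of the b rounds."""
    folds = b_fold_indices(n_samples, b)
    for h in range(b):
        val = folds[h]
        train = [i for f in folds[:h] + folds[h + 1:] for i in f]
        yield train, val

splits = list(train_val_splits(10, 5))
print(len(splits))                           # 5 rounds
print(sorted(splits[0][0] + splits[0][1]))   # every sample appears exactly once
```

Each round trains on B - 1 folds and validates on the held-out fold; the round with the best validation accuracy supplies the final parameters.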
step S5: extracting features from the inertial gait emotion images obtained in step S3 with the channel-attention-based convolutional neural network model obtained in step S4 to obtain the inertial gait emotion feature matrix, and extracting features from the electroencephalogram gait emotion images obtained in step S3 to obtain the electroencephalogram gait emotion feature matrix. The specific steps include:
The channel-attention-based convolutional neural network model shown in FIG. 2 comprises a convolutional neural network and an attention mechanism. The convolutional neural network mainly comprises convolutional layers, pooling layers and fully connected layers. To capture the important correlation features among the multiple channel signals of the inertial sensors hidden in the inertial gait emotion image, convolution kernels of different sizes (1 × 3, 5 × 5, 9 × 9 and 13 × 13) are used in parallel, and the stride and pooling parameters of the kernels are adjusted to obtain correlation feature matrices over a single channel, 5 channels, 9 channels and 13 channels; these are spliced along the channel dimension to form an Inception-style structure, after which two further convolutional layers optimize and reduce the features. The same kernel parameters and network structure are used to extract the correlation features among the single band, 5 bands, 9 bands and 13 bands of the wavelet signals of each electroencephalogram sensor in the electroencephalogram gait emotion image;
S5.2: the feature matrix F obtained after the convolutional network can be expressed as F ∈ R^(c×h×w), where c denotes the number of channels and h, w the spatial dimensions. A channel attention mechanism yields a c-dimensional vector Atn in which each value belongs to [0, 1], giving an adaptive weight for each channel; multiplying the feature matrix by these weights channel-wise, F × Atn, gives the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix.
Step S6: performing feature fusion processing on the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix obtained in the step S5 to obtain a fusion gait emotion feature matrix;
Specifically, the sizes of the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix after the flattening layer are 1 × d1 and 1 × d2 respectively (d1, d2 denoting the feature lengths). The two feature vectors are connected in series and normalized by min-max normalization, x'(i) = (x(i) - x_min) / (x_max - x_min), where x'(i) is the normalized value and x(i) is the i-th value of the original data, forming a new high-dimensional composite feature representation of size 1 × (d1 + d2), so as to obtain the fusion gait emotion feature matrix.
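A minimal sketch of this serial fusion with min-max normalization (the toy feature vectors below stand in for the real flattened d1- and d2-dimensional features):

```python
def min_max(vec):
    """Map a feature vector to [0, 1] by min-max normalization."""
    lo, hi = min(vec), max(vec)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in vec]

def fuse(inertial_feats, eeg_feats):
    """Normalize each modality, then concatenate into one 1 x (d1+d2) vector."""
    return min_max(inertial_feats) + min_max(eeg_feats)

fused = fuse([2.0, 4.0, 6.0], [10.0, 30.0])
print(fused)       # [0.0, 0.5, 1.0, 0.0, 1.0]
print(len(fused))  # 5 = d1 + d2
```

Normalizing per modality before concatenation keeps one modality's scale from dominating the composite feature.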
Step S7: respectively taking the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix obtained in the step S5 and the fusion gait emotion feature matrix obtained in the step S6 as the input of a full-connection classifier to obtain corresponding prediction labels, establishing a corresponding decision layer fusion mechanism, and carrying out human gait emotion recognition, wherein the specific steps comprise:
The F1 scores of the e emotions for each of the o classification models are obtained as contribution rates through P-fold cross-validation.
The evaluation matrix of the decision-level fusion model is established from the contribution rates:
R = [r(i, h)], i = 1, 2, …, e; h = 1, 2, …, o
wherein R denotes the evaluation matrix and r(i, h) denotes the contribution rate of the i-th emotion to the h-th classification model.
The contrast intensity sigma(h) and the conflict index A(h) of the h-th (h = 1, 2, …, o) classification model are expressed by the standard deviation and the correlation coefficient respectively:
sigma(h) = sqrt((1/e) × sum_i (r(i, h) - rbar(h))^2),  A(h) = sum_j (1 - rho(j, h))
wherein rbar(h) denotes the mean of the e emotion contribution rates of the h-th classification model and rho(j, h) denotes the correlation coefficient between the j-th and h-th classification models.
The information quantity of the h-th classification model is obtained as:
C(h) = sigma(h) × A(h)
wherein C(h) denotes the information quantity of the h-th classification model.
The weight of the h-th classification model is obtained as:
w(h) = C(h) / (C(1) + C(2) + … + C(o))
wherein w(h) denotes the weight of the h-th classification model.
The fused output for the i-th emotion is obtained as:
S(i) = sum_h w(h) × p(i, h)
wherein p(i, h) denotes the output of the h-th classification model for the i-th emotion, and the test sample is classified into the emotion with the largest S(i).
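The contrast-intensity and conflict-index weighting described above follows the CRITIC pattern (criteria importance through inter-criteria correlation); a pure-Python sketch with hypothetical contribution rates:

```python
import math

def critic_weights(R):
    """CRITIC-style weights for o classifiers from an e x o evaluation
    matrix R of per-emotion F1 'contribution rates'."""
    e, o = len(R), len(R[0])
    cols = [[R[i][h] for i in range(e)] for h in range(o)]

    def std(c):
        m = sum(c) / len(c)
        return math.sqrt(sum((x - m) ** 2 for x in c) / len(c))

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        da = math.sqrt(sum((x - ma) ** 2 for x in a))
        db = math.sqrt(sum((y - mb) ** 2 for y in b))
        return cov / (da * db) if da and db else 0.0

    sigma = [std(c) for c in cols]                        # contrast intensity
    conflict = [sum(1 - corr(cols[j], cols[h]) for j in range(o))
                for h in range(o)]                        # conflict index
    info = [s * a for s, a in zip(sigma, conflict)]       # information quantity
    total = sum(info)
    return [c / total for c in info]                      # normalized weights

R = [[0.90, 0.80, 0.95],   # hypothetical F1 rates: neutral
     [0.85, 0.70, 0.92],   # happy
     [0.70, 0.70, 0.90]]   # fear
w = critic_weights(R)
print(abs(sum(w) - 1.0) < 1e-12)  # True: weights are normalized
```

A model whose per-emotion scores vary more (high contrast) and disagree more with the other models (high conflict) receives a larger fusion weight.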
and establishing a corresponding decision layer fusion mechanism to carry out human gait emotion recognition.
The implementation mode specifically comprises the following steps:
for example, a gait emotion motion is captured and close range gait emotion data acquisition is achieved using a brain electrical sensor and an inertial sensor encapsulating an accelerometer, a gyroscope and a magnetometer, and a gait emotion data acquisition platform is constructed using these sensors. Two inertial sensors and one EEG sensor are respectively fixed on the thigh and the head of a measured person and are connected to an upper computer through a wireless Bluetooth technology. 16 healthy volunteers were recruited in total for constructing gait emotion data sets, with equal male and female proportions. By adopting the method of generating emotion through external media stimulation, each volunteer watches the video in the VR equipment five minutes in advance, starts to walk according to own habits when immersed in the video, and continuously watches the video in the walking process so as to collect three gait emotion data of neutrality, happiness and fear.
Walking is a cyclic, repetitive process, so the whole time series of gait emotion data is divided into several sub-sequences, which increases the number of gait emotion samples and enables finer-grained human gait emotion recognition. Data segmentation is performed with a sliding window method. Because the choice of window size affects the overall recognition performance of the system, the number of gait emotion data points collected within one second is used as the window size, and adjacent windows overlap by 50%, ensuring that data in adjacent windows remain correlated and no information is lost. In this way all inertial gait emotion samples and electroencephalogram gait emotion samples are obtained.
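The segmentation described above, a one-second window with 50% overlap between adjacent windows, can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def sliding_windows(data, win, overlap=0.5):
    # data: (length, channels) gait time series; win: samples per window
    step = int(win * (1 - overlap))                     # 50% overlap -> step = win/2
    starts = range(0, len(data) - win + 1, step)
    return np.stack([data[s:s + win] for s in starts])  # (n_windows, win, channels)
```

For a 50 Hz inertial stream, `win = 50`; a series of 100 samples then yields 3 windows, which agrees with the window-count formula given later in claim 2.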
The inertial gait emotion samples are expressed with both a time-domain and a frequency-domain representation. The time-domain representation maps the inertial gait emotion sample to pixels of a grayscale image: the inertial gait emotion data are normalized and mapped to the range [0, 255], with each data value corresponding to one gray pixel in the image; since the sampling frequency of the inertial sensor is 50 Hz, this yields a time-domain gait inertial emotion image of size 9 × 50. For the frequency-domain representation, a fast Fourier transform is applied to the time-domain inertial gait emotion image along the time dimension so that spectral information reflects the data properties, and a logarithm operation enhances the differences in the data, yielding a frequency-domain gait inertial emotion image. The time-domain and frequency-domain images are concatenated along the time dimension to obtain the inertial gait emotion image.
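A minimal sketch of the time-plus-frequency image construction described above, assuming per-sample normalization and a (time × channel) layout (the patent states the image as 9 × 50, so the orientation here is an assumption):

```python
import numpy as np

def inertial_emotion_image(sample):
    # sample: (50, 9) one-second window from a 50 Hz, 9-channel inertial sensor
    lo, hi = sample.min(), sample.max()
    gray = (sample - lo) / (hi - lo) * 255.0    # map data values to [0, 255]
    spec = np.abs(np.fft.fft(gray, axis=0))     # FFT along the time dimension
    spec = np.log1p(spec)                       # logarithm to enhance differences
    return np.concatenate([gray, spec], axis=0) # time + frequency image, (100, 9)
```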
Electroencephalogram signals are typically unstable and non-stationary, and cannot be adequately represented in the frequency domain or the time domain alone. A time-frequency representation combines the characteristics and advantages of the time-domain and frequency-domain representations and better reflects electroencephalogram gait emotion data. The electroencephalogram signals are decomposed and represented with a wavelet transform, and N-layer wavelet decomposition yields M frequency bands. Since the sampling frequency of the electroencephalogram sensor is 512 Hz, each 1 × 512 signal is converted by the continuous wavelet transform into the required 64 × 512 matrix, obtaining the electroencephalogram gait emotion image.
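The 1 × 512 to 64 × 512 conversion can be illustrated with a naive complex-Morlet continuous wavelet transform written directly in NumPy; the patent does not specify the mother wavelet, so the Morlet wavelet and the scale range are assumptions:

```python
import numpy as np

def morlet_cwt(signal, n_scales=64, w0=5.0):
    # Turn a length-512 EEG trace into an (n_scales, 512) scalogram magnitude.
    n = len(signal)
    out = np.empty((n_scales, n))
    for i, scale in enumerate(np.arange(1, n_scales + 1)):
        m = int(min(10 * scale, n))                 # finite support for the wavelet
        t = (np.arange(m) - m // 2) / scale
        wavelet = np.exp(1j * w0 * t) * np.exp(-t ** 2 / 2) / np.sqrt(scale)
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out
```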
The convolutional neural network model based on the attention mechanism comprises a feature extraction module and an attention module. Plain convolutional layers are used as the backbone for extracting deep features, which reduces the complexity of the network structure. After the multi-dimensional feature matrix is obtained, a channel attention mechanism is used to focus on the salient parts and find the most critical components of the complex data.
The network model for feature extraction from the inertial gait emotion image and the electroencephalogram gait emotion image is shown in Fig. 2. In order to detect important features of the hidden correlation patterns among the 9 channel signals of the sensor and among the 75 frequency bands in the input gait emotion images, the initial structure uses 128 convolution kernels with a stride of 1 and sizes of 1 × 3, 5 × 5, 9 × 9 and 13 × 13 to detect features across the different channels of the inertial gait emotion image and the different frequency bands of the electroencephalogram gait emotion image. Two convolutional layers are then used to refine the features. Each convolutional layer performs a 2D convolution, followed by a batch normalization layer, a ReLU activation function and a max-pooling downsampling layer. After the second max-pooling downsampling layer, the feature matrix is flattened into a one-dimensional feature vector by a flattening layer and connected to a fully connected layer, and the probability distribution over the human gait emotion categories is obtained through a softmax layer.
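The multi-kernel idea of the initial structure can be sketched as follows. Square kernels of sizes 3, 5, 9 and 13 with zero-padded 'same' convolution stand in for the 128 learned kernels per size; the patent lists the first size as 1 × 3, and its padding scheme is not specified, so both are assumptions here:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_same(img, kernel):
    # zero-padded 'same' 2D cross-correlation of a single-channel image
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh - 1 - kh // 2), (kw // 2, kw - 1 - kw // 2)))
    windows = sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)

def multi_kernel_features(img, sizes=(3, 5, 9, 13)):
    # one random kernel per size stands in for the learned kernels of each size
    rng = np.random.default_rng(0)
    maps = [conv2d_same(img, rng.normal(size=(s, s))) for s in sizes]
    return np.stack(maps)   # feature maps stacked along the channel dimension
```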
The feature matrix F obtained after convolution has c channels. A channel attention mechanism produces a c-dimensional vector Atn in which each value belongs to [0,1], giving an adaptive weight for each channel; multiplying the weights by the feature matrix as F × Atn yields the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix.
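A minimal squeeze-and-excitation-style sketch of the F × Atn reweighting; the patent does not specify how Atn is computed, so a global average pool followed by a sigmoid is assumed here:

```python
import numpy as np

def channel_attention(F):
    # F: (c, h, w) feature matrix; returns per-channel weights in [0, 1]
    # and the reweighted features F * Atn
    squeeze = F.mean(axis=(1, 2))             # global average pool per channel
    atn = 1.0 / (1.0 + np.exp(-squeeze))      # sigmoid keeps each weight in [0, 1]
    return atn, F * atn[:, None, None]        # broadcast weights over h and w
```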
The sizes of the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix are 64 × 324 and 64 × 8 × 11, respectively. The two features are concatenated in series in a flattening layer to obtain a new feature representation called the fusion gait emotion feature matrix. The two concatenated features must share the same numerical scale so that the fused features are balanced, so min-max normalization is applied to the fusion gait emotion feature matrix. Since different features have different dimensions, each value is mapped to [0,1] by the linear transformation x_i' = (x_i − x_min)/(x_max − x_min) to eliminate the influence of dimensionality, where x_i' is the normalized data and x_i is the i-th value in the original data.
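The serial concatenation and min-max normalization just described can be sketched as:

```python
import numpy as np

def fuse_features(inertial_feat, eeg_feat):
    # flatten each feature matrix and concatenate them in series
    fused = np.concatenate([inertial_feat.ravel(), eeg_feat.ravel()])
    lo, hi = fused.min(), fused.max()
    return (fused - lo) / (hi - lo)   # min-max map to [0, 1]
```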
The obtained inertial gait emotion images and electroencephalogram gait emotion images are each divided into training data and test data, and network parameters are optimized with 8-fold cross-validation. Specifically, the training data are divided evenly into 8 parts; 7 parts are used for model training and parameter optimization while the remaining part is used to verify the experimental effect. Each part serves once as the validation data, for 8 training rounds in total, and the parameters from the round with the highest accuracy are taken as the optimized parameters, yielding a channel-attention-based convolutional neural network model that can accurately recognize each human gait emotion. The decision layer fusion mechanism provided by the invention is then used to classify the test data, with accuracy, recall and F1 score as the model evaluation metrics.
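The 8-fold split can be sketched as follows (illustrative only; selecting the round with the highest accuracy is omitted):

```python
import numpy as np

def kfold_indices(n, k=8, seed=0):
    # yield (train_idx, val_idx) pairs for k-fold cross-validation over n samples
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val
```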
An experiment is then carried out; the recognition results for the three human gait emotions obtained by the method of the embodiment of the invention are compared under different evaluation indexes, as shown in Table 1:
Table 1: Recognition results of the three human gait emotions under different evaluation indexes
As can be seen from Table 1, the method provided by the embodiment of the present invention is effective.
The invention discloses a multi-modal human gait emotion recognition method, and particularly relates to the field of emotion recognition. A head-mounted VR device stimulates the tested person to generate emotion, and gait emotion data are collected by inertial sensor nodes worn on the legs and electroencephalogram sensor nodes worn on the head of the tested person; the gait emotion data are segmented with a sliding window method to obtain gait emotion samples; the inertial gait emotion samples are converted into a frequency-domain representation by a fast Fourier transform method, and the electroencephalogram gait emotion samples are converted into a time-frequency representation by a continuous wavelet transform method, to obtain gait emotion images; model training and parameter optimization are performed with training data to obtain a convolutional neural network model based on a channel attention mechanism; and the inertial gait emotion feature matrix, the electroencephalogram gait emotion feature matrix and the fusion gait emotion feature matrix are respectively taken as the input of a fully connected classifier, and a corresponding decision layer fusion mechanism is established to carry out human gait emotion recognition. The invention can effectively overcome the influence of a single modality on the gait emotion recognition effect and effectively improve the accuracy of human gait emotion recognition.
Finally, it should be noted that: the above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that are within the spirit and principle of the present invention are intended to be included in the scope of the present invention.
Claims (6)
1. A multi-modal human gait emotion recognition method, characterized in that the specific recognition steps are as follows:
step S1: stimulating a tested person to generate e emotions through a head-mounted VR device, and collecting inertia gait emotion data and electroencephalogram gait emotion data of v tested persons by using n inertia sensor nodes worn on legs of the tested person and m electroencephalogram sensor nodes worn on the head of the tested person;
step S2: respectively carrying out segmentation processing on the inertial gait emotion data and the electroencephalogram gait emotion data acquired by each sensor node according to fixed intervals by using a sliding window method so as to obtain all inertial gait emotion samples and electroencephalogram gait emotion samples;
and step S3: converting the inertial gait emotion samples obtained in the step S2 into a frequency-domain representation by a fast Fourier transform method to obtain an inertial gait emotion image; converting the electroencephalogram gait emotion samples obtained in the step S2 into a time-frequency representation by a continuous wavelet transform method to obtain an electroencephalogram gait emotion image;
and step S4: respectively dividing the inertial gait emotion image and the electroencephalogram gait emotion image obtained in the step S3 into training data and test data, and performing model training and parameter optimization by using the training data to obtain a convolutional neural network model based on a channel attention mechanism;
step S5: extracting features of the inertial gait emotion image obtained in the step S3 through the channel-attention-based convolutional neural network model obtained in the step S4 to obtain an inertial gait emotion feature matrix, and extracting features of the electroencephalogram gait emotion image obtained in the step S3 to obtain an electroencephalogram gait emotion feature matrix;
step S6: performing feature fusion processing on the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix obtained in the step S5 to obtain a fusion gait emotion feature matrix;
step S7: respectively taking the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix obtained in the step S5 and the fusion gait emotion feature matrix obtained in the step S6 as the input of a fully connected classifier to obtain corresponding prediction labels, and establishing a corresponding decision layer fusion mechanism to carry out human gait emotion recognition.
2. The method for recognizing the multi-modal human gait emotion according to claim 1, characterized in that: the step S2 specifically comprises the following steps:
s2.1: respectively dividing inertial gait emotion data and electroencephalogram gait emotion data acquired by n inertial sensor nodes and m electroencephalogram sensor nodes worn by each tested person into a plurality of inertial gait emotion data fragments and electroencephalogram gait emotion data fragments with the same length by using a sliding window method, wherein each gait emotion data fragment is used as a sample to obtain all inertial gait emotion samples and electroencephalogram gait emotion samples;
s2.2: the parameters of the sliding window method comprise the window size and the sliding step; the window size is len and the sliding step between two adjacent windows is μ·len, so that T windows are obtained, where T = f(l/(μ·len)) − 1, l represents the total length of the gait emotion data, and f(·) denotes the floor function, which takes the greatest integer not exceeding its argument;
s2.3: the inertial gait emotion data segment in the t-th (t = 1,2,…,T) window collected by the k-th (k = 1,2,…,n) inertial sensor node is composed of x-axis, y-axis and z-axis acceleration data, x-axis, y-axis and z-axis angular velocity data, and x-axis, y-axis and z-axis magnetic field data; the electroencephalogram gait emotion data segment in the t-th (t = 1,2,…,T) window acquired by the j-th (j = 1,2,…,m) electroencephalogram sensor node is represented correspondingly.
3. The method for recognizing the multi-modal human gait emotion according to claim 1, characterized in that: the step S3 specifically comprises the following steps:
s3.1: converting the time-domain representation into a frequency-domain representation by a fast Fourier transform method to obtain a frequency-domain inertial gait emotion sample of size len × 9; concatenating the inertial gait emotion sample and the frequency-domain gait inertial emotion sample along the time dimension to obtain a time-frequency inertial gait emotion sample of size 2len × 9; and mapping the sample data values to gray values in the image to form a gray-value matrix, thereby obtaining the inertial gait emotion image;
s3.2: converting the time-domain representation of the electroencephalogram gait emotion sample into a time-frequency representation by a continuous wavelet transform method, N-layer wavelet decomposition yielding M frequency bands; and mapping the sample data values to gray values in the image to form a gray-value matrix, obtaining a time-frequency-domain electroencephalogram gait emotion sample of size len × M and thereby the electroencephalogram gait emotion image.
4. The method for recognizing human gait emotion in multiple modes according to claim 1, wherein: the step S5 specifically includes:
s5.1: the convolutional neural network mainly comprises convolutional layers, pooling layers and fully connected layers; in order to detect the important correlation features among the multiple channel signals of the inertial sensor hidden in the inertial gait emotion image, several convolution kernels of different sizes are used, and the stride and pooling parameters of the convolution kernels are adjusted to obtain correlation feature matrices among the different channels, which are concatenated along the channel dimension; the same convolution kernel parameters and network structure are used to extract the important correlation features among the multiple wavelet frequency bands of each electroencephalogram sensor in the electroencephalogram gait emotion image;
s5.2: the feature matrix F obtained after convolution has c channels; a c-dimensional vector Atn can be obtained using a channel attention mechanism, where each value belongs to [0,1]; and multiplying the weights by the feature matrix as F × Atn yields the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix.
5. The method for recognizing human gait emotion in multiple modes according to claim 1, wherein the step S6 specifically comprises: the inertial gait emotion feature matrix and the electroencephalogram gait emotion feature matrix are connected in series and subjected to feature normalization to form a new high-dimensional composite feature representation, thereby obtaining the fusion gait emotion feature matrix.
6. The method for recognizing human gait emotion in multiple modes according to claim 1, wherein: the step S7 specifically includes:
s7.1: F1 scores of the e emotions for the o classification models are obtained as contribution rates through P rounds of cross-validation;
s7.2: and establishing an evaluation matrix of a decision layer fusion model according to the contribution rate:
wherein, R represents the evaluation matrix, whose entries represent the contribution rate of the e-th emotion to the o-th classification model;
s7.3: the contrast strength and the conflict index of the h-th (h = 1,2,…,o) classification model are expressed in terms of the standard deviation and the correlation coefficient, respectively:
wherein, representing the contrast strength of the h-th classification model, representing the contribution rate of the e-th emotion to the h-th classification model, representing the mean value of the e emotion contribution rates in the h-th classification model, representing the conflict index of the h-th classification model, and representing the correlation coefficient between the j-th and the h-th classification models;
and the information content of the e-th emotion for the h-th classification model is obtained using the following formula:
wherein, representing the information content of the e-th emotion for the h-th classification model;
the weight of the e-th emotion for the h-th classification model is obtained according to the following formula:
wherein, representing the weight of the e-th emotion for the h-th classification model;
the output result of the e-th emotion for the h-th classification model can be obtained through the following formula:
wherein, representing that the test sample is classified into the e-th emotion;
s7.4: and establishing a corresponding decision layer fusion mechanism to carry out human gait emotion recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210903599.4A CN115273236A (en) | 2022-07-29 | 2022-07-29 | Multi-mode human gait emotion recognition method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115273236A true CN115273236A (en) | 2022-11-01 |
Family
ID=83770992
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210903599.4A Pending CN115273236A (en) | 2022-07-29 | 2022-07-29 | Multi-mode human gait emotion recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115273236A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117520826A (en) * | 2024-01-03 | 2024-02-06 | 武汉纺织大学 | Multi-mode emotion recognition method and system based on wearable equipment |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117520826A (en) * | 2024-01-03 | 2024-02-06 | 武汉纺织大学 | Multi-mode emotion recognition method and system based on wearable equipment |
CN117520826B (en) * | 2024-01-03 | 2024-04-05 | 武汉纺织大学 | Multi-mode emotion recognition method and system based on wearable equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110020623B (en) | Human body activity recognition system and method based on conditional variation self-encoder | |
CN108615010B (en) | Facial expression recognition method based on parallel convolution neural network feature map fusion | |
CN106682616B (en) | Method for recognizing neonatal pain expression based on two-channel feature deep learning | |
CN108491077B (en) | Surface electromyographic signal gesture recognition method based on multi-stream divide-and-conquer convolutional neural network | |
CN110472649B (en) | Electroencephalogram emotion classification method and system based on multi-scale analysis and integrated tree model | |
CN111134666A (en) | Emotion recognition method of multi-channel electroencephalogram data and electronic device | |
Karnati et al. | LieNet: a deep convolution neural network framework for detecting deception | |
CN112294341B (en) | Sleep electroencephalogram spindle wave identification method and system based on light convolutional neural network | |
CN111407243B (en) | Pulse signal pressure identification method based on deep learning | |
CN113729707A (en) | FECNN-LSTM-based emotion recognition method based on multi-mode fusion of eye movement and PPG | |
CN110399846A (en) | A kind of gesture identification method based on multichannel electromyography signal correlation | |
CN111920420B (en) | Patient behavior multi-modal analysis and prediction system based on statistical learning | |
CN112200016A (en) | Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost | |
Xu et al. | Intelligent emotion detection method based on deep learning in medical and health data | |
CN115238796A (en) | Motor imagery electroencephalogram signal classification method based on parallel DAMSCN-LSTM | |
CN110210380B (en) | Analysis method for generating character based on expression recognition and psychological test | |
CN116129405A (en) | Method for identifying anger emotion of driver based on multi-mode hybrid fusion | |
CN115273236A (en) | Multi-mode human gait emotion recognition method | |
Sahu et al. | Modeling feature representations for affective speech using generative adversarial networks | |
Kusumawati et al. | Vgg-16 And Vgg-19 Architecture Models In Lie Detection Using Image Processing | |
Meena et al. | Seq2Dense U-Net: Analysing Sequential Inertial Sensor data for Human Activity Recognition using Dense Segmentation Model | |
CN117198468A (en) | Intervention scheme intelligent management system based on behavior recognition and data analysis | |
Jien et al. | Age-based facial recognition using convoluted neural network deep learning algorithm | |
KR20210023170A (en) | Deep learning based emotional recognition system and methods using PPG signals | |
Peng | Research on Emotion Recognition Based on Deep Learning for Mental Health |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||