CN114170671A

CN114170671A - Massage manipulation identification method based on deep learning

Info

Publication number: CN114170671A
Application number: CN202111086878.8A
Authority: CN
Inventors: 雷静桃; 朱盛鼎; 陈冬冬
Original assignee: University of Shanghai for Science and Technology
Current assignee: University of Shanghai for Science and Technology
Priority date: 2021-09-16
Filing date: 2021-09-16
Publication date: 2022-03-11
Anticipated expiration: 2041-09-16

Abstract

The invention discloses a massage manipulation recognition method based on deep learning. A flexible tactile sensor collects the distribution and magnitude of the force applied by the hands during massage, and a neural network extracts the manipulation features, so that the massage manipulation is recognized. A variational autoencoder is used for data augmentation; key frames of the input data are extracted by a frame difference method, which removes redundant input frames; the spatial-domain and time-domain features of the group of massage dot-matrix heat maps are extracted and trained through a two-dimensional convolutional neural network and a recurrent neural network; and a frame attention mechanism is introduced after the convolutional neural network to improve the accuracy with which the network recognizes the massage manipulation. The method expands the original sensor data without increasing the cost of data acquisition; extracting key frames of the image group reduces overfitting and improves the generalization capability of the network; a neural network extracts the time-domain information among the video frames to obtain the time-domain features of the massage manipulation; and the frame attention mechanism effectively improves the recognition accuracy.

Description

Massage manipulation identification method based on deep learning

Technical Field

The invention belongs to the field of artificial intelligence, and particularly relates to a massage manipulation identification method based on deep learning.

Background

Massage is a method of acting on specific parts of the body surface by hand to regulate the physiological and pathological state of the body for the purpose of physical therapy. Traditional manual massage demands considerable physical effort from the masseur and entails high training costs.

With the development of robotics, robots can replace manual labor in massage services. To realize robot massage, a system is first needed that can understand the massage manipulations of a professional masseur and study their characteristics, providing a reference for the robot to reproduce them.

At present, most manipulation recognition work at home and abroad uses visual sensors to acquire and label manipulation images, and trains a two-dimensional convolutional neural network on single-frame image data to classify the manipulation. However, a visual sensor can capture only the motion of the manipulation, not its force information; a two-dimensional convolutional neural network can learn and extract only the spatial-domain information of the manipulation, not the temporal information that is closely tied to it; and conventional sensor data collection is costly. A new and more effective method for recognizing massage manipulations is therefore needed.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a method for recognizing the massage manipulations of a doctor: data are collected and labeled through a flexible dot-matrix tactile sensor integrated on a glove, and are then processed and used for training by a neural network, so that the massage manipulation is recognized.

In order to achieve the purpose of the invention, the conception of the invention is as follows:

A massage manipulation recognition method based on deep learning, in which a model capable of generating target data from latent variables is constructed through a variational autoencoder (VAE), and additional target data are obtained by interpolating the latent variables without adding manually collected data; key frames of the collected massage manipulation images are extracted by a frame difference method; spatial-domain and time-domain features of the massage-force dot-matrix maps are extracted by combining a two-dimensional convolutional neural network (2D CNN) with a recurrent neural network (RNN); and a frame attention mechanism is introduced to assign weights to the frames of the massage-force dot-matrix maps, the weights being trained by the neural network to optimize model performance.

According to the inventive concept, the invention adopts the following technical scheme:

A massage manipulation recognition method based on deep learning comprises the following operation steps:

Step one: acquiring data corresponding to the massage action through a flexible distributed tactile sensor;

Step two: visualizing the sensor data as a massage dot-matrix heat map through a host computer;

Step three: constructing, through a variational autoencoder (VAE), a model that generates target data from latent variables, so as to expand the originally acquired data;

Step four: extracting key frames of the massage dot-matrix heat maps by a frame difference method;

Step five: extracting the spatial features of each input frame of the massage-force dot-matrix map by using a two-dimensional convolutional neural network;

Step six: introducing a frame attention mechanism after the convolutional neural network, and assigning weights along the video-frame dimension of the data;

Step seven: extracting the time-domain features across the frames of the massage-force dot-matrix maps by using a recurrent neural network;

Step eight: feeding the output of the recurrent neural network into a linear layer to reduce the dimensionality of the data, and training the network to recognize the massage manipulation.
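As an illustration of how the data dimensions could be handled in step five, the sketch below folds the frame dimension into the batch dimension so that a 2D encoder sees one image at a time, then unfolds it again. It is a minimal NumPy sketch under stated assumptions, not the patented implementation: `encode_frames`, `toy_encoder`, the 16 x 16 image size and the embedding size of 32 are all hypothetical, and the linear projection merely stands in for a real 2D CNN.

```python
import numpy as np

def encode_frames(clips, encoder):
    """clips: (batch, frames, channels, height, width) -> (batch, frames, embed_dim)."""
    b, t, c, h, w = clips.shape
    flat = clips.reshape(b * t, c, h, w)   # treat every frame as an independent image
    features = encoder(flat)               # (batch * frames, embed_dim)
    return features.reshape(b, t, -1)      # restore the frame dimension

rng = np.random.default_rng(0)
embed_dim = 32                             # illustrative embedding size
projection = rng.normal(scale=0.01, size=(3 * 16 * 16, embed_dim))

def toy_encoder(images):
    # Hypothetical stand-in for a 2D CNN: flatten each image and project it linearly.
    return images.reshape(images.shape[0], -1) @ projection

clips = rng.normal(size=(2, 20, 3, 16, 16))    # 2 samples, 20 key frames each
features = encode_frames(clips, toy_encoder)   # (2, 20, 32)
```

Because the encoder processes frames independently, any temporal relationship between frames is left to the later recurrent stage.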

Preferably, the massage manipulation recognition method based on deep learning is divided into three modules: a data acquisition module, a data processing module and a deep learning module.

Preferably, in step one, the data acquisition module acquires the magnitude and distribution of the force applied by the doctor's hand during massage through the tactile sensor and a data acquisition system, thereby realizing massage data acquisition.

Preferably, in step two, the data processing module receives the sensor data in real time through the host computer, visualizes the data as massage dot-matrix heat maps and stores them, thereby realizing data visualization.

Preferably, in step two, the data corresponding to the collected massage actions are visualized: each sensing unit of the tactile sensor is drawn in the host computer through Matlab according to its actual position on the glove, the force on each sensing unit is represented as a heat map, and a force going from small to large corresponds to a color going from cold to warm.

Preferably, in step three, the data processing module uses the variational autoencoder (VAE) to obtain the distribution of the real sample data indirectly by introducing latent variables. The VAE constructs a model that generates target data from latent variables, the latent variables corresponding one-to-one to the target data; interpolating the latent variables and decoding them into generated samples expands the massage dot-matrix heat maps without additional data acquisition, which avoids much of the cost of acquiring new data and realizes data augmentation.

Preferably, in step four, the data processing module extracts N key frames from the group of massage-force dot-matrix images by the frame difference method and arranges them in time order as the input of the neural network, thereby realizing key-frame extraction for the image group.

Preferably, in step four, for the 120 frames of dot-matrix heat maps corresponding to one massage manipulation, the frame difference method computes the pixel-wise mean of every 10 frames, compares the mean image with each of those 10 frames, selects the frame whose pixels differ most from the mean image, and retains both that frame and the mean image; 20 key frames are finally obtained and arranged in time order as the input of the neural network, thereby realizing key-frame extraction.

Preferably, in step five, the spatial-domain features of the massage manipulation are extracted through a two-dimensional convolutional neural network.

Further preferably, in step five, the deep learning module uses the two-dimensional convolutional neural network to extract the spatial features of each input frame of the massage-force dot-matrix map. The original input of the neural network has dimensions (batch size, frames, channels, image size x, image size y); after the spatial features are extracted by the two-dimensional convolutional neural network, the data have dimensions (batch size, frames, CNN embed dim). The spatial-domain features of the massage manipulation are thereby extracted.

Preferably, in step six, the deep learning module introduces a frame attention mechanism after the convolutional neural network. The data dimensions are converted into (1, 1, frames) through a global pooling layer, a linear layer and a normalization layer, so that every dimension other than the video-frame dimension becomes 1; the weight of each frame is trained through the back-propagation algorithm of the neural network and is then multiplied onto the frames dimension of the data, the data dimensions being restored to (batch size, frames, CNN embed dim), so that each video frame is given a weight in the range 0 to 1.

Preferably, in step seven, the time-domain features of the massage manipulation are extracted through a recurrent neural network.

Further preferably, in step seven, the deep learning module feeds the data with dimensions (batch size, frames, CNN embed dim) into a Long Short-Term Memory (LSTM) neural network to learn the time-domain features of the data, thereby extracting the time-domain features of the massage manipulation.

Preferably, in step eight, the deep learning module feeds the output of the last hidden layer of the recurrent neural network into a linear layer, which reduces the data to dimensions (batch size, N Categories) and yields the manipulation recognition result; the loss between the recognition result and the true label of the data is calculated with a cross-entropy function; the weight parameters of the network are updated by the back-propagation algorithm in every training round; and after n rounds of training, a neural network capable of recognizing the massage manipulation is obtained.
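The recognition head described for step eight (a linear layer producing one score per category, followed by a cross-entropy loss) could look roughly like the following. This is a minimal NumPy sketch under stated assumptions: the sizes, the random `last_hidden` array standing in for the recurrent network's final hidden output, the untrained `weight` and `bias`, and the `labels` are all hypothetical, and the softmax step is the usual way to turn scores into probabilities before the cross-entropy.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability assigned to the true class.
    rows = np.arange(labels.shape[0])
    return float(-np.log(probs[rows, labels] + 1e-12).mean())

rng = np.random.default_rng(0)
batch_size, hidden_dim, n_categories = 4, 16, 5            # illustrative sizes

last_hidden = rng.normal(size=(batch_size, hidden_dim))    # stand-in for the last RNN hidden output
weight = rng.normal(scale=0.1, size=(hidden_dim, n_categories))
bias = np.zeros(n_categories)

logits = last_hidden @ weight + bias    # (batch size, N Categories)
probs = softmax(logits)                 # each row sums to 1
labels = np.array([0, 3, 1, 4])         # hypothetical true manipulation labels
loss = cross_entropy(probs, labels)
predicted = probs.argmax(axis=1)        # index of the recognized manipulation per sample
```

In training, the gradient of `loss` with respect to every weight would be obtained by back-propagation and applied once per round; that part is omitted here.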

The massage manipulation recognition method based on deep learning uses a flexible tactile sensor with distributed sensing units, which can be integrated on a glove, to acquire the distribution and magnitude of the force applied by the hands of a professional masseur; obtains additional target data by interpolating latent variables through a variational autoencoder without adding manually collected data; and extracts key frames of the data by a frame difference method. A novel network combining a two-dimensional convolutional neural network with a recurrent neural network extracts the spatial-domain and time-domain information of the massage dot-matrix heat maps, and a frame attention mechanism introduced into the network trains the weights of the video frames to improve recognition accuracy. With this method, the massage manipulation data of a professional masseur can be collected effectively and used to train the neural network, so that the massage manipulation is recognized and a reference is provided for a robot to reproduce the masseur's manipulations. The method has the advantages of low data acquisition cost, short network training time, high recognition accuracy and strong generalization capability.

Compared with the prior art, the invention has the following prominent substantive features and notable advantages:

1. Through a variational autoencoder (VAE), the invention expands the original database without acquiring more data, realizing data augmentation and greatly reducing the cost of acquiring new data;

2. The invention extracts key frames of the massage-force dot-matrix maps by a frame difference method, which removes redundant frames, reduces the amount of training data for the neural network and improves the generalization capability of the network;

3. By combining a two-dimensional convolutional neural network with a recurrent neural network, the invention can effectively extract and train the spatial-domain and time-domain features of the group of massage-force dot-matrix maps;

4. By introducing a frame attention mechanism into the neural network, the invention assigns a weight to each frame of the group of massage-force dot-matrix maps, which effectively improves the recognition accuracy of the manipulation.

Drawings

Fig. 1 is a flowchart of a massage manipulation recognition method based on deep learning according to a preferred embodiment of the present invention.

Fig. 2 is a schematic diagram of the massage dot-matrix heat map of the data glove according to a preferred embodiment of the present invention.

Fig. 3 is a schematic diagram of a Variational auto-encoder (VAE) according to a preferred embodiment of the present invention.

Fig. 4 is a schematic diagram of data processing of the tactile sensor according to the preferred embodiment of the present invention.

Fig. 5 is a flowchart of the recognition method based on deep learning according to the preferred embodiment of the present invention.

Detailed Description

Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.

Example one:

Referring to Fig. 1, a massage manipulation recognition method based on deep learning includes the following steps:

In this massage manipulation recognition method based on deep learning, data are collected and labeled through the flexible dot-matrix tactile sensor integrated on the glove, and are processed and used for training by the neural network, so that the massage manipulation is recognized.

Example two:

This embodiment is substantially the same as the first embodiment, with the following particular features:

In this embodiment, in step two, the data corresponding to the collected massage actions are visualized: each sensing unit of the tactile sensor is drawn in the host computer through Matlab according to its actual position on the glove, the force on each sensing unit is represented as a heat map, and a force going from small to large corresponds to a color going from cold to warm.

In step three, the variational autoencoder (VAE) obtains the distribution of the real sample data indirectly by introducing latent variables, interpolates the latent variables and decodes them into generated samples, and expands the massage dot-matrix heat maps without additional data acquisition, thereby realizing data augmentation.

In step four, for the 120 frames of dot-matrix heat maps corresponding to one massage manipulation, the frame difference method computes the pixel-wise mean of every 10 frames, compares the mean image with each of those 10 frames, selects the frame whose pixels differ most from the mean image, and retains both that frame and the mean image; 20 key frames are finally obtained and arranged in time order as the input of the neural network, thereby realizing key-frame extraction.

In step five, the spatial-domain features of the massage manipulation are extracted through a two-dimensional convolutional neural network.

In step six, the frame attention mechanism converts the data dimensions into (1, 1, frames) through a global pooling layer, a linear layer and a normalization layer, so that every dimension other than the video-frame dimension becomes 1; the weight of each video frame is trained through the back-propagation algorithm of the neural network and is then multiplied onto the frames dimension of the data, the data dimensions being restored to (batch size, frames, CNN embed dim), so that the video-frame dimension frames is given weights in the range 0 to 1.

In step seven, the time-domain features of the massage manipulation are extracted through a recurrent neural network.

In step eight, the output of the last hidden layer of the recurrent neural network is fed into a linear layer, which reduces the data to dimensions (batch size, N Categories) and yields the manipulation recognition result; the loss between the recognition result and the true label of the data is calculated with a cross-entropy function; the weight parameters of the network are updated by the back-propagation algorithm in every training round; and after n rounds of training, a neural network capable of recognizing the massage manipulation is obtained, realizing massage manipulation recognition.

The massage manipulation recognition method based on deep learning of this embodiment uses a flexible tactile sensor with distributed sensing units to acquire the distribution and magnitude of the force applied by the hands of a professional masseur; obtains additional target data by interpolating latent variables through a variational autoencoder without adding manually collected data; and extracts key frames of the data by a frame difference method. A novel network combining a two-dimensional convolutional neural network with a recurrent neural network extracts the spatial-domain and time-domain information of the massage dot-matrix heat maps, and a frame attention mechanism introduced into the network trains the weights of the video frames to improve recognition accuracy. With this method, the massage manipulation data of a professional masseur can be collected effectively and used to train the neural network, so that the massage manipulation is recognized and a reference is provided for a robot to reproduce the masseur's manipulations. The method has the advantages of low data acquisition cost, short network training time, high recognition accuracy and strong generalization capability.

Example three:

In this embodiment, as shown in the flowchart of Fig. 1, the system is divided into three major modules: a data acquisition module, a data processing module and a deep learning module. The method specifically comprises the following steps:

Step one: in the data acquisition module, the magnitude and distribution of the force applied by the doctor's hands during massage are acquired through a tactile sensor and a data acquisition system. Massage data acquisition is thereby realized.

Step two: in the data processing module, the host computer receives the data in real time through Matlab, converts the data from ASCII format into integer format, visualizes them as massage dot-matrix heat maps and stores them; one massage action corresponds to a number of massage dot-matrix heat maps arranged in time order. Data visualization is thereby realized.

Step three: in the data processing module, based on the originally acquired massage dot-matrix heat maps, a model that can generate target data from latent variables is constructed through a variational autoencoder (VAE), the latent variables corresponding one-to-one to the target data; the latent variables are interpolated and decoded into generated samples to expand the originally acquired data, which avoids much of the cost of acquiring new data. Different conditions are added to the VAE for different massage manipulations, turning it into a conditional variational autoencoder (CVAE), and the data of the different manipulations are expanded through the CVAE. Data augmentation is thereby realized.

Step four: in the data processing module, one massage manipulation corresponds to about 120 frames of dot-matrix heat maps. In time order, the pixel-wise mean of every 10 frames is computed, the frame whose pixels differ most from the mean image is selected from those 10 frames, and that frame and the mean image are retained; about 20 key frames are finally obtained and arranged in time order as the input of the neural network. Key-frame extraction for the image group is thereby realized.

Step five: in the deep learning module, a two-dimensional convolutional neural network extracts the spatial features of each input frame of the massage-force dot-matrix map. The two-dimensional convolutional neural network adopts a ResNet-152 model pre-trained on the ILSVRC-2012-CLS data set; the last linear layer of the original ResNet-152 model is removed and two linear layers are added, so that the three dimensions of each picture (color channels, horizontal pixels and vertical pixels) are reduced to an embedding of dimension 512. The original input of the neural network has dimensions (batch size, frames, channels, image size x, image size y), and after the spatial features are extracted by the two-dimensional convolutional neural network the data have dimensions (batch size, frames, CNN embed dim). The spatial-domain features of the massage manipulation are thereby extracted.

Step six: in the deep learning module, a frame attention mechanism is introduced after the convolutional neural network. The data with dimensions (batch size, frames, CNN embed dim) are converted by dimension transposition into (batch size, CNN embed dim); a global pooling layer reduces the data dimensions to (1, 1, frames); a linear layer changes them to (1, 1, frames/r); a ReLU activation function introduces nonlinearity into the network; a second linear layer restores the data dimensions to (1, 1, frames); and a Sigmoid function normalizes the values along frames to between 0 and 1, the data dimensions remaining (1, 1, frames). These values are multiplied onto the original data along the corresponding dimension, giving each video frame of the original data a weight between 0 and 1; after a further dimension transposition, data with dimensions restored to (batch size, frames, CNN embed dim) are finally obtained.

Step seven: in the deep learning module, the data whose video-frame dimension has been weighted, of dimensions (batch size, frames, CNN embed dim), are fed into a Long Short-Term Memory (LSTM) recurrent neural network to learn the time-domain features of the data. Through its gate structure, the LSTM can store and access information over long periods of time; the input of the LSTM at each time step consists of the cell state C_{t-1} from the previous time step and the sequential input x_t of the current time step, and the output is the cell state C_t and the hidden state h_t of the current time step. The time-domain features of the massage manipulation are thereby extracted.

Step eight: in the deep learning module, the hidden-layer output h_n of the Long Short-Term Memory (LSTM) recurrent neural network at the last time step is fed into a linear layer, which reduces the data to dimensions (batch size, N Categories); the network uses a cross-entropy loss function and the Adam optimizer, and is trained. The massage manipulation is thereby recognized.
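The frame attention mechanism of step six resembles a squeeze-and-excitation block applied along the frame dimension, and can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the patented implementation: the function name `frame_attention`, the reduction ratio `r = 4`, the sizes and the random, untrained weights are hypothetical, and the sketch keeps one weight vector per sample rather than collapsing the batch dimension to the (1, 1, frames) shape given in the text.

```python
import numpy as np

def frame_attention(x, w1, b1, w2, b2):
    """x: (batch, frames, embed_dim) -> same shape, each frame scaled by a weight in (0, 1)."""
    pooled = x.mean(axis=2)                      # global average pooling over the embedding: (batch, frames)
    hidden = np.maximum(pooled @ w1 + b1, 0.0)   # linear frames -> frames // r, then ReLU
    scores = hidden @ w2 + b2                    # linear frames // r -> frames
    weights = 1.0 / (1.0 + np.exp(-scores))      # Sigmoid: one weight per frame
    return x * weights[:, :, None], weights      # broadcast the weights over the embedding

rng = np.random.default_rng(0)
batch, frames, embed_dim, r = 2, 20, 8, 4        # illustrative sizes; r is the reduction ratio

x = rng.normal(size=(batch, frames, embed_dim))
w1 = rng.normal(scale=0.1, size=(frames, frames // r))
b1 = np.zeros(frames // r)
w2 = rng.normal(scale=0.1, size=(frames // r, frames))
b2 = np.zeros(frames)

weighted, weights = frame_attention(x, w1, b1, w2, b2)   # weighted: (2, 20, 8)
```

In a trained network the two linear layers would be learned jointly with the rest of the model, so that informative frames receive weights closer to 1.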

Further, the massage dot-matrix heat map of step two is shown in Fig. 2: the magnitude and distribution of the doctor's massage force are visualized by the host computer as a dot-matrix heat map, in which the force on each sensing unit going from small to large corresponds to a color going from cold to warm.
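The conversion from ASCII sensor readings to a cold-to-warm heat map could be sketched as below. This is only an illustration in Python rather than Matlab, and every specific is an assumption: the comma-separated line format, the regular 4 x 4 grid (the actual sensing units follow their positions on the glove), the maximum reading of 255, and the simple blue-to-red color ramp are hypothetical, as are the function names.

```python
import numpy as np

def parse_ascii_frame(line, rows, cols):
    # Convert an ASCII line such as "12,0,255,..." into an integer grid.
    values = [int(token) for token in line.strip().split(",")]
    if len(values) != rows * cols:
        raise ValueError("unexpected number of sensing units")
    return np.array(values, dtype=np.int64).reshape(rows, cols)

def to_cold_warm_rgb(grid, max_value):
    # Linear ramp from blue (cold, small force) to red (warm, large force).
    level = np.clip(grid / max_value, 0.0, 1.0)
    rgb = np.stack([level, np.zeros_like(level), 1.0 - level], axis=-1)
    return (rgb * 255).astype(np.uint8)

# Hypothetical reading of 16 sensing units for one time step.
line = "0,64,128,255,10,20,30,40,50,60,70,80,90,100,110,120"
grid = parse_ascii_frame(line, rows=4, cols=4)      # integer force values, shape (4, 4)
image = to_cold_warm_rgb(grid, max_value=255)       # RGB heat map, shape (4, 4, 3)
```

Stacking one such image per time step gives the time-ordered group of heat maps that corresponds to a single massage action.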

Further, a schematic diagram of the variational autoencoder (VAE) of step three is shown in Fig. 3. The real samples collected by the sensor are {X₁, X₂, ..., X_n}. A latent variable Z is introduced, and for a given real sample X_k it is assumed that there exists a distribution p(Z | X_k) specific to X_k; it is further assumed that this distribution is an independent multivariate normal distribution. Each normal distribution has two parameters, the mean μ and the variance σ², so two neural networks are constructed, μ_k = f₁(X_k) and a second network f₂(X_k), to compute the mean and the variance, respectively. A latent code Z_k is sampled from the distribution p(Z | X_k) and passed through a generator built from a neural network to obtain a generated sample, and the neural network of the generator is trained by minimizing the difference between the generated sample and X_k. The distribution of the original data can thus be obtained indirectly from the distribution of the latent variable Z; interpolating the latent variables and decoding them into generated samples expands the massage dot-matrix heat maps without additional data acquisition, thereby realizing data augmentation.
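The sampling and interpolation steps of the VAE can be illustrated with a small sketch. It is written under stated assumptions and is not the patented implementation: the encoder outputs are random placeholders, the variance is parameterized as a log-variance (a common convention, not stated in the text), and `decode` is an untrained random linear map with a sigmoid that merely stands in for a trained generator. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_pixels = 8, 16 * 16          # illustrative sizes

def sample_latent(mu, log_var, rng):
    # Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def interpolate(z_a, z_b, n_steps):
    # Evenly spaced points on the segment between two latent codes.
    alphas = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - alphas) * z_a + alphas * z_b

# Stand-in for a trained generator: a fixed random linear map plus sigmoid.
decoder_weight = rng.normal(scale=0.1, size=(latent_dim, n_pixels))

def decode(z):
    return 1.0 / (1.0 + np.exp(-(z @ decoder_weight)))

# Hypothetical encoder outputs for two real samples of the same manipulation.
mu_a, log_var_a = rng.normal(size=latent_dim), np.full(latent_dim, -2.0)
mu_b, log_var_b = rng.normal(size=latent_dim), np.full(latent_dim, -2.0)

z_a = sample_latent(mu_a, log_var_a, rng)
z_b = sample_latent(mu_b, log_var_b, rng)
z_path = interpolate(z_a, z_b, n_steps=5)   # (5, latent_dim)
generated = decode(z_path)                  # (5, n_pixels), pixel values in (0, 1)
```

The intermediate rows of `generated` are the new samples: they come from latent codes that lie between two encoded real samples, so no additional sensor recordings are needed to produce them.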

Further, Fig. 4 is a schematic diagram of the sensor data processing performed by the data processing module of the system. Each group of acquired massage dot-matrix heat maps corresponding to a different manipulation is expanded by a conditional variational autoencoder (CVAE); this increases the number of samples of each massage manipulation while leaving the number of video frames per sample unchanged. The frame difference method is then applied to the expanded sensor data, reducing the roughly 120 frames of massage dot-matrix heat maps of each sample to about 20 frames, which serve as the input of the convolutional neural network.
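A literal reading of the frame difference method (group the frames in tens, keep the group mean and the frame that differs most from it) could be sketched as follows. The sketch is hedged in several ways: the function name `extract_key_frames` is hypothetical, the difference measure (sum of absolute pixel differences) and the placement of the mean image before the selected frame are assumptions, and the input is random data. With 120 input frames this literal reading returns 24 frames, whereas the text reports about 20 from about 120, so the exact grouping or retention rule used in practice may differ.

```python
import numpy as np

def extract_key_frames(frames, group_size=10):
    """frames: (n_frames, height, width) in time order -> (2 * n_groups, height, width)."""
    key_frames = []
    n_groups = frames.shape[0] // group_size     # a trailing partial group is dropped
    for g in range(n_groups):
        group = frames[g * group_size:(g + 1) * group_size].astype(np.float64)
        mean_image = group.mean(axis=0)          # pixel-wise mean of the group
        # Sum of absolute pixel differences between each frame and the group mean.
        diffs = np.abs(group - mean_image).sum(axis=(1, 2))
        most_different = group[int(diffs.argmax())]
        key_frames.extend([mean_image, most_different])
    return np.stack(key_frames)

rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(120, 16, 16))   # hypothetical heat-map frames
keys = extract_key_frames(frames)                   # 12 groups, 2 frames kept per group
```

Groups are processed in time order, so the returned key frames remain chronologically arranged at the group level.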

Further, Fig. 5 is a flowchart of the deep learning method of the deep learning module of the system. The preprocessed data are first taken as the input of the neural network, and the number of training iterations is set. The spatial-domain features of the massage data are extracted by a two-dimensional convolutional neural network, the time-domain features by a long short-term memory recurrent neural network, and a frame attention mechanism is introduced between the two networks. The two-dimensional convolutional neural network adopts a ResNet-152 model pre-trained on the ILSVRC-2012-CLS data set; part of the weight parameters of the pre-trained ResNet-152 model can be frozen during training to reduce the computational cost. The two-dimensional convolutional neural network reduces the dimensionality of each of the roughly 20 frames of massage dot-matrix heat maps of each massage manipulation sample, extracts and encodes their features, and arranges the encoded frames in time order; after encoding, the data have dimensions (batch size, frames, CNN embed dim). The encoded data then pass through the frame attention mechanism, which comprises a global pooling layer, two linear layers and two activation functions, so that the video-frame dimension frames obtains weights in the range 0 to 1 while the data dimensions remain (batch size, frames, CNN embed dim). The data are fed into the Long Short-Term Memory (LSTM) recurrent neural network in the order of the frames dimension to learn the time-domain features, and a linear layer after the recurrent neural network reduces the data to dimensions (batch size, N Categories). The output is then normalized to between 0 and 1 by a softmax layer, the loss between the recognition result and the true label is calculated with a cross-entropy loss function, and the weight parameters of the network are updated by the back-propagation algorithm in every training round. After n rounds of training, a neural network capable of recognizing the massage manipulation is obtained.
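The way the recurrent stage consumes the encoded frames one time step at a time can be sketched with a plain NumPy LSTM cell. This is the standard LSTM formulation, in which the previous hidden state is also an input alongside the previous cell state and the current frame; the sizes and the random, untrained parameters `W`, `U` and `b` are hypothetical, and the sketch is not the patented network.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (input_dim, 4 * hidden), U: (hidden, 4 * hidden), b: (4 * hidden,)."""
    gates = x_t @ W + h_prev @ U + b
    i, f, g, o = np.split(gates, 4, axis=1)                  # input, forget, candidate, output
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)      # keep part of the old state, add new content
    h_t = sigmoid(o) * np.tanh(c_t)                          # expose a gated view of the cell state
    return h_t, c_t

rng = np.random.default_rng(0)
batch, n_frames, embed_dim, hidden = 2, 20, 32, 16           # illustrative sizes

x = rng.normal(size=(batch, n_frames, embed_dim))            # stand-in for the weighted frame encodings
W = rng.normal(scale=0.1, size=(embed_dim, 4 * hidden))
U = rng.normal(scale=0.1, size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)

h = np.zeros((batch, hidden))
c = np.zeros((batch, hidden))
for t in range(n_frames):                                    # iterate in the order of the frames dimension
    h, c = lstm_step(x[:, t], h, c, W, U, b)
h_n = h                                                      # hidden output at the last time step
```

Only `h_n` is passed on to the final linear layer, so the whole frame sequence is summarized in a single vector per sample.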

The massage manipulation recognition method in the embodiment has the following remarkable advantages:

1. Through a variational autoencoder (VAE), this embodiment expands the original database without acquiring more data, realizing data augmentation and greatly reducing the cost of acquiring new data;

2. This embodiment extracts key frames of the massage-force dot-matrix maps by a frame difference method, which removes redundant frames, reduces the amount of training data for the neural network and improves the generalization capability of the network;

3. By combining a two-dimensional convolutional neural network with a recurrent neural network, this embodiment can effectively extract and train the spatial-domain and time-domain features of the group of massage-force dot-matrix maps;

4. By introducing a frame attention mechanism into the neural network, this embodiment assigns a weight to each frame of the group of massage-force dot-matrix maps, which effectively improves the recognition accuracy of the manipulation.

In summary, in the massage manipulation recognition method based on deep learning of the above embodiments of the present invention, a flexible tactile sensor collects the distribution and magnitude of the force applied by the hands of a professional masseur during massage, and a neural network explores the characteristics of the masseur's manipulations, so that the massage manipulations of the professional masseur are recognized. A variational autoencoder is used for data augmentation; key frames of the input data are extracted by a frame difference method, which removes redundant input frames; the spatial-domain and time-domain features of the group of massage dot-matrix heat maps are extracted and trained in sequence by combining a two-dimensional convolutional neural network with a recurrent neural network; and a frame attention mechanism introduced after the convolutional neural network further improves the accuracy with which the network recognizes the massage manipulation. The method of the embodiments expands the original sensor data without increasing the cost of data acquisition; extracting key frames of the image group reduces overfitting and improves the generalization capability of the network; the neural network further extracts the time-domain information among the video frames to obtain the important time-domain features of the massage manipulation; and the frame attention mechanism effectively improves the recognition accuracy.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Various changes in form and detail may be made without departing from the spirit of the present invention, and all such changes and modifications that fall within the true scope of the present invention are to be interpreted in accordance with the principles of the present invention.

Claims

1. A massage manipulation recognition method based on deep learning is characterized by comprising the following operation steps:

2. The massage manipulation recognition method based on deep learning of claim 1, wherein in the second step, the massage data corresponding to the collected massage actions are visualized: each sensing unit of the touch sensor is displayed on the host computer through Matlab according to its actual position on the glove, the magnitude of the force on each sensing unit is represented by a heat map, and the color of the sensing unit changes from cold to warm as the force on it increases from small to large.

3. The massage manipulation recognition method based on deep learning of claim 1, wherein in the third step, the variational autoencoder (VAE) indirectly obtains the distribution of the real sample data by introducing latent variables, interpolates the latent variables, and decodes the interpolated latent variables into generated samples, thereby expanding the massage dot-matrix heat map set without acquiring additional data and realizing data augmentation.

4. The method for identifying massage manipulations based on deep learning of claim 1, wherein in the fourth step, the frame difference method takes the 120 frames of dot-matrix heat maps corresponding to one massage manipulation, averages the pixels of every 10 frames to obtain a mean picture, compares the mean picture with each of the 10 frames, selects the frame with the largest pixel difference from the mean picture, and retains that frame together with the mean picture; 20 key frames are finally obtained and arranged in time order as the input of the neural network, thereby realizing the extraction of data key frames.

5. The method of claim 1, wherein in the fifth step, spatial domain features of the massage technique are extracted by a two-dimensional convolutional neural network.

6. The massage manipulation recognition method based on deep learning of claim 1, wherein in the sixth step, the frame attention mechanism passes the data through a global pooling layer, a linear layer, and a normalization layer, converting the data dimension into (1, 1, frames), that is, reducing every dimension except the video-frame dimension to 1; the weights of the video frames are trained through the back-propagation algorithm of the neural network; and the weights are multiplied into the frame dimension of the data, restoring the data dimension to (batch size, frames, CNN embedded dim), so that each video frame in the frames dimension is given a weight in the range of 0 to 1.

7. The method of claim 1, wherein in the seventh step, the time domain features of the massage technique are extracted by a recurrent neural network.

8. The deep learning-based massage manipulation recognition method of claim 1, wherein in the seventh step, the last hidden-layer output of the recurrent neural network is connected to a linear layer that reduces the dimension of the data to (batch size, N Categories) to obtain a manipulation recognition result; the loss between the recognition result and the true label of the data is calculated using a cross-entropy function; each weight parameter of the network is updated by the back-propagation algorithm in every training round; and after n rounds of training, a neural network capable of recognizing massage manipulations is finally obtained, realizing massage manipulation recognition.
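As an informal illustration of the final classification step, the NumPy sketch below maps a last hidden state to class scores of shape (batch size, N Categories), computes the cross-entropy loss against integer labels, and applies one gradient update. It is a simplified sketch under stated assumptions: only the linear layer is updated (a full network would propagate the gradient back through the recurrent and convolutional layers), plain gradient descent stands in for whatever optimizer is actually used, and the function name `classify_and_loss`, the sizes, the labels, and the learning rate are hypothetical.

```python
import numpy as np


def classify_and_loss(last_hidden, w, b, labels):
    """Linear classification head followed by mean cross-entropy loss.

    last_hidden: (batch, hidden_dim) final hidden state of the RNN.
    w: (n_categories, hidden_dim) and b: (n_categories,) linear layer.
    labels: (batch,) integer class indices.
    """
    logits = last_hidden @ w.T + b                      # (batch, n_categories)
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(labels.shape[0])
    loss = -log_probs[rows, labels].mean()
    # Gradient of the mean loss w.r.t. the logits: (softmax - one_hot) / batch.
    grad_logits = np.exp(log_probs)
    grad_logits[rows, labels] -= 1.0
    grad_logits /= labels.shape[0]
    return logits, loss, grad_logits


# Hypothetical sizes: 4 samples, 16 hidden units, 5 manipulation categories.
rng = np.random.default_rng(0)
batch_size, hidden_dim, n_categories = 4, 16, 5
last_hidden = rng.normal(size=(batch_size, hidden_dim))
labels = np.array([0, 2, 1, 4])
w = np.zeros((n_categories, hidden_dim))
b = np.zeros(n_categories)

logits, loss_before, grad_logits = classify_and_loss(last_hidden, w, b, labels)

# One gradient-descent step on the linear layer only.
learning_rate = 0.01
w -= learning_rate * (grad_logits.T @ last_hidden)
b -= learning_rate * grad_logits.sum(axis=0)
_, loss_after, _ = classify_and_loss(last_hidden, w, b, labels)
print(logits.shape, loss_after < loss_before)
```

With zero-initialized weights every class is equally likely, so the initial loss equals the natural logarithm of the number of categories; repeating the update over many rounds corresponds to the n rounds of training described above.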