CN114707539A - Hand joint angle estimation method, hand joint angle estimation device, storage medium and equipment
Hand joint angle estimation method, hand joint angle estimation device, storage medium and equipment
- Publication number
- CN114707539A (application number CN202210246659.XA)
- Authority
- CN
- China
- Prior art keywords
- joint angle
- convolution
- layer
- angle estimation
- head attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
Abstract
The invention discloses a hand joint angle estimation method, an estimation device, a storage medium, and a device. The hand joint angle estimation method comprises: acquiring a real-time surface electromyographic signal generated while the hand of an object to be measured moves; and inputting the surface electromyographic signal into a pre-trained hand joint angle estimation model to obtain an estimated joint angle of the object to be measured during hand motion, wherein the hand joint angle estimation model comprises a multi-scale convolution module and a multi-layer multi-head attention module connected in sequence. The multi-scale convolution module extracts joint-angle-related features from the electromyographic time series more effectively, so the method achieves higher regression accuracy than existing Gaussian process regression joint angle estimation algorithms. In addition, the multi-layer multi-head attention mechanism learns the overall temporal features of the electromyographic sequence, avoiding the long-range dependence problem of traditional recurrent neural networks, while parallel computation further improves running efficiency.
Description
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a hand joint angle estimation method, an estimation device, a computer readable storage medium and computer equipment.
Background
Manipulators have been widely studied and applied in search and rescue, industry, and prosthetics for decades. The surface electromyographic signal precedes motion generation and is convenient to collect, making it an ideal physiological signal for extracting human motion intention; it has numerous applications in rehabilitation, human-computer interaction, and related fields, and is widely used worldwide for dexterous manipulator control. In the past, myoelectric control usually relied on discrete action recognition to drive a manipulator. However, the hand is one of the most distinctive and dexterous organs of the human body, and multi-degree-of-freedom real-time continuous control is the future direction for manipulators because it provides more natural control that matches human intuition. To realize continuous real-time control strategies, much recent research establishes, via algorithms, a mapping between electromyographic signals and finger joint angles. In addition, considering both application to amputees and the difficulty of collecting hand electromyography, researchers commonly build regression algorithms on arm electromyography to estimate joint angles during hand movement; because the arm muscles driving the hand are active during motion, hand joint angles can be estimated from arm electromyographic signals. Current approaches to real-time joint angle estimation from electromyographic signals fall mainly into two categories: model-based methods and data-based methods. Model-based methods most commonly use physiological models such as the Hill model and the Huxley model; they can explain how human motion is generated, and their parameters express properties of the human musculoskeletal system, such as muscle fiber length and tendon length. Data-based methods are currently the most commonly used supervised methods, implemented with machine learning and deep learning; they directly establish a regression from skin-surface electromyographic signals to continuous motion quantities and are simple and reliable. At present, regression algorithms such as Gaussian process regression and long short-term memory networks have been widely applied to the task of estimating hand joint angles in real time during motion.
Although model-based joint angle estimation methods are highly interpretable, their models contain many parameters that are difficult to measure directly, and they are currently used only for motion estimation of a few low-degree-of-freedom joints, whereas hand movement involves the coordination of many joints and therefore requires an estimation method that supports more degrees of freedom. Data-based methods can often estimate comparatively more degrees of freedom. The Gaussian process regression algorithms applied in the prior art can map skin-surface electromyographic signals to multi-degree-of-freedom finger joint angles, but during motion the joint angle at the current moment may be correlated with the electromyographic signals before the current moment, and Gaussian process regression does not take such correlations into account. Long short-term memory networks can extract temporal features from electromyographic signals, and most existing deep learning networks for electromyography-based hand joint angle estimation add a long short-term memory network to learn the overall temporal features of the electromyographic sequence. However, the long short-term memory network is a recurrent neural network structure, and when the sequence is long, part of the input sequence information is easily lost due to the long-range dependence problem of recurrent neural networks.
Disclosure of Invention
(I) technical problems to be solved by the invention
The technical problem solved by the invention is: how to effectively extract joint-angle-related features and the overall temporal features from electromyographic signals so as to improve regression accuracy while avoiding the long-range dependence problem of traditional recurrent neural networks.
(II) the technical scheme adopted by the invention
A hand joint angle estimation method, the hand joint angle estimation method comprising:
acquiring a real-time surface electromyographic signal generated when a hand of an object to be detected moves;
and inputting the surface electromyographic signals into a pre-trained hand joint angle estimation model to obtain an estimated joint angle of the object to be detected during hand motion, wherein the hand joint angle estimation model comprises a multi-scale convolution module and a multi-layer multi-head attention module which are sequentially connected.
Preferably, the method for pre-training the hand joint angle estimation model comprises the following steps:
acquiring joint angle measurement values and corresponding surface electromyogram signal data when a human hand moves;
inputting the surface electromyogram signal data serving as a training sample into the hand joint angle estimation model, sequentially processing the surface electromyogram signal data by the multi-scale convolution module and the multi-layer multi-head attention module, and outputting a predicted value of the hand joint angle;
and calculating error loss according to the hand joint angle predicted value and the joint angle measured value, and updating network parameters of the multi-scale convolution module and the multi-layer multi-head attention module according to the error loss.
Preferably, the hand joint angle estimation method further comprises:
and performing feature extraction on the surface electromyographic signal data to obtain a root mean square eigenvector corresponding to the surface electromyographic signal data, and taking the root mean square eigenvector as a training sample.
Preferably, the multi-scale convolution module includes two multi-scale convolution sub-networks and a pooling layer connecting the two multi-scale convolution sub-networks, and the network structures of the two multi-scale convolution sub-networks are the same, where the multi-scale convolution sub-networks include three parallel branches and a feature concatenation layer, each branch includes two convolution layers, and the method for processing the training sample by the multi-scale convolution module includes:
respectively inputting the training sample into three branches of a first multi-scale convolution sub-network for convolution feature extraction to obtain three first convolution features, and splicing the three first convolution features by using the feature splicing layer to form a first convolution fusion feature;
and after the sequence length compression is carried out on the first convolution fusion feature by using the pooling layer, the first convolution fusion feature is respectively input into three branches of a second multi-scale convolution sub-network, convolution feature extraction is carried out to obtain three second convolution features, and then the three second convolution features are spliced by using a feature splicing layer of the second multi-scale convolution sub-network to form a second convolution fusion feature, wherein the second convolution fusion feature is used as input data of the multi-layer multi-head attention module.
Preferably, the hand joint angle estimation method further comprises:
and performing position coding on the second convolution fusion features to serve as input data of the multi-layer multi-head attention module.
Preferably, the multi-layer multi-head attention module includes a plurality of layers of multi-head attention networks and a first fully-connected layer, which are sequentially connected, each layer of the multi-head attention network includes three single-head attention networks with the same structure and arranged in parallel, a feature splicing layer, and a second fully-connected layer, wherein the method for processing the second convolution fusion feature by using the multi-layer multi-head attention module includes:
the second convolution fusion features after position coding are used as input data of a first-layer multi-head attention network and are respectively input into the three single-head attention networks, and three single-head attention output features are obtained after first linear mapping operation is carried out;
splicing the three single-head attention output features along the dimension of a feature channel by using a feature splicing layer of the first layer of the multi-head attention network to obtain an attention fusion feature;
performing a second linear mapping operation on the attention fusion features by using a second full-connection layer to obtain multi-head attention output features, taking the multi-head attention output features as input data of the multi-head attention network of the next layer, and repeating the steps until the multi-head attention output features are output by the multi-head attention network of the last layer;
and inputting the multi-head attention output characteristics output by the multi-head attention network of the last layer into the first full-connection layer to obtain a hand joint angle predicted value sequence.
Preferably, the method for performing a first linear mapping operation on the position-coded second convolution fusion feature by using the single-headed attention network includes:
respectively calculating a query feature vector sequence Q, a key feature vector sequence K, and a value feature vector sequence V according to the following formulas:
Q = C W_Q, K = C W_K, V = C W_V
wherein W_Q, W_K and W_V are linear mapping operations determined by training, d_q, d_k and d_v are constants, n represents the number of signal channels of the electromyographic signal, and C ∈ R^(n×l) represents the position-encoded second convolution fusion feature;
and obtaining the single-head attention output feature H according to the following formula:
H = softmax(Q K^T / √d_k) V.
the application also discloses hand joint angle estimation device, hand joint angle estimation device includes:
the signal acquisition unit is used for acquiring real-time surface electromyographic signals generated when the hand of the object to be detected moves;
and the pre-trained hand joint angle estimation model is used for predicting the estimated joint angle of the object to be measured during hand motion according to the collected real-time surface electromyographic signals, and comprises a multi-scale convolution module and a multi-layer multi-head attention module which are sequentially connected.
The application also discloses a computer readable storage medium, which stores a hand joint angle estimation program, and the hand joint angle estimation program realizes the hand joint angle estimation method when being executed by a processor.
The application also discloses a computer device, which comprises a computer readable storage medium, a processor and a hand joint angle estimation program stored in the computer readable storage medium, wherein the hand joint angle estimation program realizes the hand joint angle estimation method when being executed by the processor.
(III) advantageous effects
The invention discloses a hand joint angle estimation method, an estimation device, a storage medium and equipment, which have the following technical effects compared with the prior art:
the hand joint angle estimation model used in the finger joint angle estimation method can better extract the characteristics related to the joint angle from the electromyographic signal time sequence through the multi-scale convolution module, so that higher regression precision can be realized compared with the existing Gaussian process regression joint angle estimation algorithm. Meanwhile, the whole time sequence characteristic of the electromyographic sequence is learned through a multi-layer multi-head attention mechanism, the problem of long-distance dependence caused by traditional cyclic neural learning is avoided, and meanwhile, the running efficiency is further improved through parallel computing.
Drawings
FIG. 1 is a flowchart illustrating a hand joint angle estimation method according to a first embodiment of the present invention;
FIG. 2 is a flowchart illustrating a training process of a hand joint angle estimation model according to a first embodiment of the present invention;
FIG. 3 is a network structure diagram of a hand joint angle estimation model according to a first embodiment of the present invention;
fig. 4 is a comparison graph of joint angle estimation values obtained by the hand joint angle estimation model according to the first embodiment of the present invention and actual joint angles;
FIG. 5 is a schematic block diagram of a hand joint angle estimation apparatus according to a second embodiment of the present invention;
fig. 6 is a schematic diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
Before describing the embodiments of the present application in detail, the inventive concept is briefly described. In the prior art, data-driven deep learning models are mainly based on Gaussian process regression or long short-term memory networks: the former does not fully consider the correlation between joint angles and electromyographic signals, so its regression accuracy is limited, while the latter easily loses part of the input sequence information on long sequences due to the long-range dependence problem. Therefore, in the hand joint angle estimation method provided by this embodiment, the hand joint angle estimation model is constructed from a multi-scale convolution module and a multi-layer multi-head attention module and is trained so that it fully learns the correlation between joint angles and surface electromyographic signals as well as the overall temporal features of the surface electromyographic signals, which helps improve prediction accuracy and avoids the long-range dependence problem of traditional recurrent neural networks.
Specifically, as shown in fig. 1, the hand joint angle estimation method of the first embodiment includes the following steps:
step S10: acquiring a real-time surface electromyographic signal generated when a hand of an object to be detected moves;
step S20: and inputting the surface electromyographic signals into a hand joint angle estimation model which is trained in advance, and obtaining an estimated joint angle of the object to be detected during hand motion.
The hand joint angle estimation model comprises a multi-scale convolution module and a multi-layer multi-head attention module which are sequentially connected, and as shown in fig. 2, the method for pre-training the hand joint angle estimation model comprises the following steps:
step S101: acquiring joint angle measurement values and corresponding surface electromyogram signal data when a human hand moves;
step S102: inputting the surface electromyogram signal data serving as a training sample into the hand joint angle estimation model, sequentially processing the surface electromyogram signal data by the multi-scale convolution module and the multi-layer multi-head attention module, and outputting a predicted value of the hand joint angle;
step S103: and calculating error loss according to the hand joint angle predicted value and the joint angle measured value, and updating network parameters of the multi-scale convolution module and the multi-layer multi-head attention module according to the error loss.
Exemplarily, in step S101, the surface electromyographic signals are acquired using the differential electrodes of a Delsys electromyography acquisition system as the sensors. The acquired signals come from the extensor digitorum, flexor digitorum, biceps brachii, triceps brachii, and a ring of forearm muscles 2-6 cm from the elbow. The electromyographic sampling frequency is 2000 Hz, and a 5-450 Hz Butterworth filter is used to band-pass filter the signals for baseline correction and noise removal. Hand joint angle data are collected with a CyberGlove II data glove at a sampling frequency of 20 Hz. The joint angles are first resampled to 2000 Hz so that the surface electromyographic signal and the joint angle sequence are synchronized in time, and the raw joint angle signal is then smoothed with a 2 Hz zero-phase low-pass filter to remove step jitter so that it better resembles a natural human motion curve. The maximum and minimum values of the collected electromyographic data and joint angle data are also recorded for normalization of the training and testing data.
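The signal conditioning described above can be illustrated with a short sketch. The following minimal Python example uses NumPy and SciPy; the filter orders, the zero-phase band-pass filtering, and the linear interpolation used for resampling are assumptions, since the text only specifies the 5-450 Hz Butterworth band-pass, the 2000 Hz/20 Hz sampling rates, and the 2 Hz zero-phase low-pass.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS_EMG = 2000    # sEMG sampling rate in Hz, as stated above
FS_GLOVE = 20    # CyberGlove II sampling rate in Hz

def bandpass_emg(emg, low=5.0, high=450.0, fs=FS_EMG, order=4):
    """Band-pass filter each sEMG channel with a 5-450 Hz Butterworth filter.
    emg: array of shape (num_samples, num_channels). Zero-phase filtering
    (filtfilt) and the filter order are assumptions."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, emg, axis=0)

def upsample_angles(angles, fs_in=FS_GLOVE, fs_out=FS_EMG):
    """Resample joint angles to the sEMG rate so the two sequences align in time.
    angles: array of shape (num_samples, num_joints). Linear interpolation is
    an assumption; the text only says the angles are resampled to 2000 Hz."""
    t_in = np.arange(angles.shape[0]) / fs_in
    t_out = np.arange(int(t_in[-1] * fs_out) + 1) / fs_out
    return np.stack([np.interp(t_out, t_in, angles[:, j])
                     for j in range(angles.shape[1])], axis=1)

def smooth_angles(angles, cutoff=2.0, fs=FS_EMG, order=2):
    """2 Hz zero-phase low-pass filter to remove step jitter, as described above."""
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, angles, axis=0)
```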
Further, in this embodiment, feature extraction is performed on the surface electromyographic signal data to obtain the corresponding root mean square (RMS) feature vectors, which are used as the input to the hand joint angle estimation model. Specifically, a sliding window of 100 sampling points is used; after the RMS feature of the current window is computed, the window slides backward by 1 sampling point, which produces a feature sequence as long as the original surface electromyographic signal with one feature per time point and maximizes the amount of data. Finally, a sliding window with a length of 400 sampling points and a step of 100 sampling points is used to generate RMS feature vector sequences and joint angle sequences; the joint angle and RMS feature data within each sliding window form one data sample, and the dimension of the joint angle vector equals the number of estimated joint angles. In this example, 60% of the data samples were used as training data and 40% as test data.
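A minimal sketch of the sliding-window RMS feature extraction and sample windowing described above; the zero padding at the start of the signal (so the feature sequence has the same length as the raw signal) and the helper names are assumptions.

```python
import numpy as np

def rms_feature_sequence(emg, window=100):
    """Sliding-window RMS per channel with stride 1, producing a feature sequence
    as long as the input signal (zero padding at the start is an assumption).
    emg: array of shape (num_samples, num_channels)."""
    num_samples, num_channels = emg.shape
    padded = np.vstack([np.zeros((window - 1, num_channels)), emg])
    return np.stack(
        [np.sqrt(np.mean(padded[i:i + window] ** 2, axis=0)) for i in range(num_samples)],
        axis=0)

def make_samples(rms_seq, angles, win_len=400, step=100):
    """Cut the synchronized RMS feature sequence and joint angle sequence into
    sample windows of 400 time steps with a step of 100, as described above."""
    xs, ys = [], []
    for start in range(0, rms_seq.shape[0] - win_len + 1, step):
        xs.append(rms_seq[start:start + win_len])
        ys.append(angles[start:start + win_len])
    return np.stack(xs), np.stack(ys)
```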
Exemplarily, in step S102, as shown in fig. 3, the multi-scale convolution module includes two multi-scale convolution sub-networks and a pooling layer connecting them; the first and second dotted boxes represent the first and second multi-scale convolution sub-networks, respectively. The two sub-networks have the same structure: each consists of three parallel branches and a feature splicing layer, and each branch contains two convolution layers. The first convolution layer of each branch is a one-dimensional convolution with kernel size 1; the second is also a one-dimensional convolution, but its kernel size differs between branches, ranging from 3 to 11 and selected according to experimental conditions. The feature splicing layer concatenates the sequences produced by the three branches along the feature dimension, and the dimension of the spliced output features ranges from 128 to 512. The pooling layer compresses the sequence length of the output features of the first multi-scale convolution sub-network; in this embodiment it halves the sequence length. The output features of the second multi-scale convolution sub-network have a dimension of 32-128. Specifically, in step S102, the method for processing the training samples with the multi-scale convolution module includes the following steps:
step S1021: and respectively inputting the training sample into three branches of a first multi-scale convolution sub-network for convolution feature extraction to obtain three first convolution features, and splicing the three first convolution features by using a feature splicing layer to form a first convolution fusion feature.
Step S1022: and performing sequence length compression on the first convolution fusion features by using a pooling layer, then inputting the compressed first convolution fusion features into three branches of a second multi-scale convolution sub-network respectively, performing convolution feature extraction to obtain three second convolution features, and splicing the three second convolution features by using a feature splicing layer of the second multi-scale convolution sub-network to form second convolution fusion features, wherein the second convolution fusion features are used as input data of the multi-layer multi-head attention module.
As a preferred embodiment, zero padding is used in the convolution operations of both multi-scale convolution sub-networks so that the input and output sequences have equal length, and an exponential linear unit (ELU) activation function is added after each sub-network to increase the nonlinearity of the deep learning network.
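For illustration, the following PyTorch sketch implements the multi-scale convolution module as described above. The kernel sizes (3, 7, 11), the branch widths, and the use of max pooling are assumptions chosen within the ranges stated in the text; the zero padding and ELU activation follow the preferred embodiment.

```python
import torch
import torch.nn as nn

class MultiScaleConvSubnet(nn.Module):
    """One multi-scale convolution sub-network: three parallel branches, each a
    kernel-size-1 conv followed by a larger conv, concatenated along the feature
    dimension and passed through an ELU activation."""
    def __init__(self, in_ch, branch_ch, kernel_sizes=(3, 7, 11)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch, branch_ch, kernel_size=1),
                # zero padding keeps the output sequence as long as the input
                nn.Conv1d(branch_ch, branch_ch, kernel_size=k, padding=k // 2),
            )
            for k in kernel_sizes
        ])
        self.act = nn.ELU()

    def forward(self, x):                          # x: (batch, in_ch, seq_len)
        return self.act(torch.cat([b(x) for b in self.branches], dim=1))

class MultiScaleConvModule(nn.Module):
    """Two multi-scale sub-networks joined by a pooling layer that halves the
    sequence length (max pooling is an assumption; the text only says pooling)."""
    def __init__(self, in_ch, mid_branch_ch=64, out_branch_ch=32):
        super().__init__()
        self.subnet1 = MultiScaleConvSubnet(in_ch, mid_branch_ch)              # concat dim: 3 * 64 = 192
        self.pool = nn.MaxPool1d(kernel_size=2)
        self.subnet2 = MultiScaleConvSubnet(3 * mid_branch_ch, out_branch_ch)  # concat dim: 3 * 32 = 96

    def forward(self, x):
        return self.subnet2(self.pool(self.subnet1(x)))
```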
Further, so that the attention mechanism can distinguish the temporal order of the electromyographic signals, this embodiment additionally applies position encoding to the second convolution fusion feature. Specifically, the second convolution fusion feature is added element-wise to an absolute position encoding before being fed into the multi-layer multi-head attention module. The position encoding used in this embodiment is the sinusoidal absolute position encoding, a common position encoding scheme whose details are not repeated here.
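A minimal sketch of the standard sinusoidal absolute position encoding is given below; the text only names the encoding, so the exact formulation shown here is the commonly used one and the helper name is illustrative.

```python
import math
import torch

def sinusoidal_position_encoding(seq_len, dim):
    """Standard sinusoidal absolute position encoding of shape (seq_len, dim);
    dim is assumed even. The encoding is added element-wise to the second
    convolution fusion feature before the attention module."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / dim))
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe
```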
Specifically, in step S102, the multi-layer multi-head attention module includes a plurality of layers of multi-head attention networks and a first full-connection layer, where the multi-head attention network is shown as a third dotted line frame in fig. 3, and each layer of multi-head attention network includes three single-head attention networks, a feature splicing layer, and a second full-connection layer, which have the same structure and are arranged in parallel. The number of layers of the multi-head attention network ranges from 2 to 4, and the specific number of layers is selected according to actual conditions.
The method for processing the second convolution fusion feature by using the multilayer multi-head attention module comprises the following steps:
step S1023: and taking the second convolution fusion features subjected to position coding as input data of the first-layer multi-head attention network, respectively inputting the input data into the three single-head attention networks, and performing first linear mapping operation to obtain three single-head attention output features.
Specifically, the query feature vector sequence Q, the key feature vector sequence K, and the value feature vector sequence V are calculated according to the following formulas:
Q = C W_Q, K = C W_K, V = C W_V
where W_Q, W_K and W_V are linear mapping operations (weight matrices) determined by training; d_q, d_k and d_v are constants, generally set between 32 and 128 and adjusted according to actual conditions; C ∈ R^(n×l) denotes the position-encoded second convolution fusion feature; and n denotes the number of signal channels of the electromyographic signal.
The single-head attention output feature H is obtained according to the following formula:
H = softmax(Q K^T / √d_k) V
where d_k is a manually set parameter used to scale the values and prevent numerical instability caused by excessively large results.
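A minimal PyTorch sketch of one single-head attention network as formulated above; the feature dimensions d_k = d_v = 64 are assumptions within the stated 32-128 range.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleHeadAttention(nn.Module):
    """One single-head attention network: learned linear maps to Q, K and V,
    followed by scaled dot-product attention H = softmax(Q K^T / sqrt(d_k)) V."""
    def __init__(self, in_dim, d_k=64, d_v=64):
        super().__init__()
        self.w_q = nn.Linear(in_dim, d_k, bias=False)   # W_Q
        self.w_k = nn.Linear(in_dim, d_k, bias=False)   # W_K
        self.w_v = nn.Linear(in_dim, d_v, bias=False)   # W_V
        self.d_k = d_k

    def forward(self, c):                               # c: (batch, seq_len, in_dim)
        q, k, v = self.w_q(c), self.w_k(c), self.w_v(c)
        scores = torch.matmul(q, k.transpose(-2, -1)) / (self.d_k ** 0.5)
        return torch.matmul(F.softmax(scores, dim=-1), v)   # H: (batch, seq_len, d_v)
```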
Step S1024: and splicing the three single-head attention output features H along the dimension of the feature channel by using the feature splicing layer of the first-layer multi-head attention network to obtain the attention fusion feature.
Step S1025: and performing a second linear mapping operation on the attention fusion features by using a second full-connection layer to obtain multi-head attention output features, taking the multi-head attention output features as input data of the multi-head attention network of the next layer, and repeating the steps until the multi-head attention output features are output by the multi-head attention network of the last layer. It should be noted that, the repetition of the above steps described herein means the repetition of steps S1023 to S1025.
The operation of the feature splicing layer and the second fully connected layer of each multi-head attention layer can be expressed as follows:
Z = Concat(H_1, H_2, H_3) W_O
where Z denotes the multi-head attention output feature, H_1, H_2 and H_3 denote the three single-head attention output features, Concat denotes the feature splicing operation, and W_O denotes the second linear mapping operation determined by training.
Step S1026: and inputting the multi-head attention output characteristics output by the multi-head attention network of the last layer into the first full-connection layer to obtain a predicted value sequence of the hand joint angle.
After processing by the multiple layers of multi-head attention networks, the multi-head attention output features of the last layer are processed by the first fully connected layer to generate a sequence of hand joint angle predicted values. Illustratively, the last vector of this sequence is taken as the hand joint angle predicted value.
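Continuing the sketches above (and reusing the SingleHeadAttention, MultiScaleConvModule, and sinusoidal_position_encoding definitions from them), the following assembles the multi-head attention layers and the overall estimation model; the number of attention layers, the feature dimension, and the number of output joints are assumptions chosen within the ranges and description given in the text.

```python
import torch
import torch.nn as nn

class MultiHeadAttentionLayer(nn.Module):
    """One multi-head attention layer: three parallel single-head attention
    networks, channel-wise concatenation, then the second linear mapping W_O."""
    def __init__(self, in_dim, d_k=64, d_v=64, num_heads=3):
        super().__init__()
        self.heads = nn.ModuleList(
            [SingleHeadAttention(in_dim, d_k, d_v) for _ in range(num_heads)])
        self.w_o = nn.Linear(num_heads * d_v, in_dim)   # W_O

    def forward(self, x):                               # x: (batch, seq_len, in_dim)
        z = torch.cat([h(x) for h in self.heads], dim=-1)
        return self.w_o(z)

class HandJointAngleModel(nn.Module):
    """Overall sketch: multi-scale convolution -> position encoding -> stacked
    multi-head attention layers -> first fully connected layer; the last vector
    of the predicted sequence is returned as the joint angle estimate."""
    def __init__(self, num_emg_channels, num_joints=10, attn_layers=3, feat_dim=96):
        super().__init__()
        self.conv = MultiScaleConvModule(num_emg_channels, out_branch_ch=feat_dim // 3)
        self.attn = nn.ModuleList(
            [MultiHeadAttentionLayer(feat_dim) for _ in range(attn_layers)])
        self.fc = nn.Linear(feat_dim, num_joints)       # first fully connected layer

    def forward(self, emg):                             # emg: (batch, channels, seq_len)
        x = self.conv(emg).transpose(1, 2)              # (batch, seq_len', feat_dim)
        x = x + sinusoidal_position_encoding(x.shape[1], x.shape[2]).to(x.device)
        for layer in self.attn:
            x = layer(x)
        angles = self.fc(x)                             # (batch, seq_len', num_joints)
        return angles[:, -1, :]                         # last vector as the prediction
```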
In step S103, the mean square error loss is calculated from the hand joint angle predicted values and the joint angle measured values, and this loss is used to update the network parameters of the multi-scale convolution module and the multi-layer multi-head attention module through the adaptive moment estimation (Adam) algorithm; the related calculation and update procedures are prior art and are not described in detail here.
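A minimal training-loop sketch for the step above (MSE loss minimized with Adam); the learning rate, epoch count, and the convention that each batch pairs an RMS feature window with the measured joint angles at its final time step are assumptions.

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=100, lr=1e-3, device="cpu"):
    """Mean square error loss with the Adam (adaptive moment estimation) optimizer.
    Each batch is assumed to pair an RMS feature window of shape (batch, channels, 400)
    with the measured joint angles at the final time step, matching the model output."""
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for rms_batch, angle_batch in loader:
            rms_batch = rms_batch.to(device).float()
            angle_batch = angle_batch.to(device).float()
            loss = criterion(model(rms_batch), angle_batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```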
After training, a hand joint angle estimation model with optimal parameters is obtained. To evaluate the performance of the model, the test data set is input into the hand joint angle estimation model to obtain a continuously estimated joint angle curve, which is compared with the actual joint angle curve obtained from the joint angle sensor. Further, the Pearson correlation coefficient (CC) and the root mean square error (RMSE) are used as evaluation criteria. The Pearson correlation coefficient (CC) is calculated as follows:
CC = Σ(θ_est − θ̄_est)(θ_real − θ̄_real) / √( Σ(θ_est − θ̄_est)² · Σ(θ_real − θ̄_real)² )
where θ_est, θ̄_est, θ_real and θ̄_real respectively denote the joint angle predicted value, the mean of the joint angle predicted values, the actual joint angle value, and the mean of the actual joint angle values. The root mean square error (RMSE) is calculated as follows:
RMSE = √( (1/N) Σ(θ_est − θ_real)² )
where N denotes the number of samples in the sequence.
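The two evaluation metrics can be computed directly from the predicted and measured joint angle sequences, for example with NumPy (function names are illustrative):

```python
import numpy as np

def pearson_cc(pred, real):
    """Pearson correlation coefficient between predicted and measured angles
    for one joint (1-D arrays over time)."""
    pred_c, real_c = pred - pred.mean(), real - real.mean()
    return (pred_c * real_c).sum() / np.sqrt((pred_c ** 2).sum() * (real_c ** 2).sum())

def rmse(pred, real):
    """Root mean square error between predicted and measured angles."""
    return np.sqrt(np.mean((pred - real) ** 2))
```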
in the experimental process of the embodiment, 15 testees are used for training and testing the hand joint angle estimation model according to joint angle data and myoelectric data under the condition that the testees complete the gripping actions of objects with different sizes, 60% of data of each tester is used for model training, the other 40% of data is used for model testing, and two regression model performance indexes of RMSE and CC of proximal interphalangeal joints and metacarpophalangeal joints of hands are tested. As shown in table 1, the test results show that both indexes are greater than two joint angle regression algorithms (algorithm based on long and short term memory network, algorithm based on sparse pseudo input gaussian process) commonly used at present. This shows that the hand joint angle estimation model adopted in the present embodiment can better implement the joint angle estimation task. The estimated effect of this embodiment on partial joints is shown in fig. 4. Fig. 4 shows the effect of estimating the finger joint angle at each time using the hand joint angle estimation method of the present embodiment, in which the solid line represents the true value of the finger joint angle, and the dotted line represents the predicted value of the method of the present embodiment. The experimental results show that the predicted value (dotted line) of the finger joint angle by the method of the present embodiment is similar to the actual measurement value (solid line) without distortion. For the near-end joints of five fingers which move frequently, the method predicts accurate movement trend and can match the real movement mode of the finger joints. When the joint angle is greatly changed, the prediction of the algorithm is still stable, and no distortion phenomenon occurs.
| Evaluation index | Model of this embodiment | Long short-term memory network | Sparse pseudo-input Gaussian process |
|---|---|---|---|
| CC | 0.87±0.01 | 0.79±0.01 | 0.75±0.01 |
| RMSE | 9.65±0.55 | 11.67±0.69 | 12.07±0.75 |
TABLE 1 Regression performance of three different algorithms on the same hand multi-joint angle estimation task
In summary, in the finger joint angle estimation method provided by this embodiment, the multi-scale convolution module extracts joint-angle-related features from the electromyographic time series more effectively, so the method achieves higher regression accuracy than the existing Gaussian process regression joint angle estimation algorithm. Meanwhile, the overall temporal features of the electromyographic sequence are learned through the multi-layer multi-head attention mechanism, which avoids the long-range dependence problem of traditional recurrent neural networks, while parallel computation further improves running efficiency.
As shown in fig. 5, the second embodiment further discloses a hand joint angle estimation device, which includes a signal acquisition unit 100 and a hand joint angle estimation model 200 trained in advance. The signal acquisition unit 100 is used for acquiring real-time surface electromyographic signals generated during hand movement of an object to be detected; the pre-trained hand joint angle estimation model 200 is used for predicting an estimated joint angle of the object to be measured during hand motion according to the collected real-time surface electromyographic signals, wherein the hand joint angle estimation model 200 comprises a multi-scale convolution module and a multi-layer multi-head attention module which are sequentially connected. For a specific working process of the signal acquisition unit 100 and the hand joint angle estimation model 200, reference may be made to the description of the first embodiment, which is not described herein again.
The third embodiment also discloses a computer readable storage medium, the hand joint angle estimation program is stored in the computer readable storage medium, and the hand joint angle estimation program is executed by the processor to realize the hand joint angle estimation method.
Further, the fourth embodiment also discloses a computer device, which comprises, on a hardware level, as shown in fig. 6, a processor 12, an internal bus 13, a network interface 14, and a computer-readable storage medium 11. The processor 12 reads a corresponding computer program from the computer-readable storage medium and then runs, forming a request processing apparatus on a logical level. Of course, besides software implementation, the one or more embodiments in this specification do not exclude other implementations, such as logic devices or combinations of software and hardware, and so on, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices. The computer-readable storage medium 11 stores thereon a hand joint angle estimation program that realizes the hand joint angle estimation method described above when executed by a processor.
Computer-readable storage media, including both non-transitory and non-transitory, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer-readable storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents, and that such changes and modifications are intended to be within the scope of the invention.
Claims (10)
1. A hand joint angle estimation method, comprising:
acquiring a real-time surface electromyographic signal generated when a hand of an object to be detected moves;
and inputting the surface electromyographic signals into a pre-trained hand joint angle estimation model to obtain an estimated joint angle of the object to be detected during hand motion, wherein the hand joint angle estimation model comprises a multi-scale convolution module and a multi-layer multi-head attention module which are sequentially connected.
2. The hand joint angle estimation method of claim 1, wherein the method of pre-training the hand joint angle estimation model comprises:
acquiring joint angle measurement values and corresponding surface electromyogram signal data when a human hand moves;
inputting the surface electromyogram signal data serving as a training sample into the hand joint angle estimation model, sequentially processing the surface electromyogram signal data by the multi-scale convolution module and the multi-layer multi-head attention module, and outputting a predicted value of the hand joint angle;
and calculating error loss according to the hand joint angle predicted value and the joint angle measured value, and updating network parameters of the multi-scale convolution module and the multi-layer multi-head attention module according to the error loss.
3. A hand joint angle estimation method according to claim 2, further comprising:
and performing feature extraction on the surface electromyographic signal data to obtain a root mean square eigenvector corresponding to the surface electromyographic signal data, and taking the root mean square eigenvector as a training sample.
4. The hand joint angle estimation method of claim 3, wherein the multi-scale convolution module comprises two multi-scale convolution sub-networks and a pooling layer connecting the two multi-scale convolution sub-networks, the two multi-scale convolution sub-networks having the same network structure, wherein the multi-scale convolution sub-networks comprise three parallel branches and a feature splicing layer, each branch comprises two convolution layers, and the method for processing the training samples by the multi-scale convolution module comprises:
respectively inputting the training sample into three branches of a first multi-scale convolution sub-network for convolution feature extraction to obtain three first convolution features, and splicing the three first convolution features by using the feature splicing layer to form a first convolution fusion feature;
and after the sequence length compression is carried out on the first convolution fusion feature by utilizing the pooling layer, the first convolution fusion feature is respectively input into three branches of a second multi-scale convolution sub-network, convolution feature extraction is carried out to obtain three second convolution features, then the three second convolution features are spliced by utilizing a feature splicing layer of the second multi-scale convolution sub-network to form a second convolution fusion feature, and the second convolution fusion feature is used as input data of the multi-layer multi-head attention module.
5. A hand joint angle estimation method according to claim 4, further comprising:
and performing position coding on the second convolution fusion features to serve as input data of the multi-layer multi-head attention module.
6. The hand joint angle estimation method of claim 5, wherein the multi-layer multi-head attention module comprises a plurality of layers of multi-head attention networks and a first fully-connected layer, the multi-head attention network comprises three single-head attention networks with the same structure and arranged in parallel, a feature splicing layer and a second fully-connected layer, and the method for processing the second convolution fusion feature by using the multi-layer multi-head attention module comprises:
the second convolution fusion features subjected to position coding are used as input data of a first-layer multi-head attention network and are respectively input into the three single-head attention networks, and the first linear mapping operation is carried out to obtain three single-head attention output features;
splicing the three single-head attention output features along the dimension of a feature channel by using a feature splicing layer of the first layer of the multi-head attention network to obtain an attention fusion feature;
performing a second linear mapping operation on the attention fusion features by using a second full-connection layer to obtain multi-head attention output features, taking the multi-head attention output features as input data of the multi-head attention network of the next layer, and repeating the steps until the multi-head attention output features are output by the multi-head attention network of the last layer;
and inputting the multi-head attention output characteristics output by the multi-head attention network of the last layer into the first full-connection layer to obtain a hand joint angle predicted value sequence.
7. The hand joint angle estimation method according to claim 6, wherein the method of performing a first linear mapping operation on the second convolution fusion feature after position coding by using the single-headed attention network comprises:
respectively calculating a query feature vector sequence Q, a key feature vector sequence K, and a value feature vector sequence V according to the following formulas:
Q = C W_Q, K = C W_K, V = C W_V
wherein W_Q, W_K and W_V are linear mapping operations determined by training, d_q, d_k and d_v are constants, n represents the number of signal channels of the electromyographic signal, and C ∈ R^(n×l) represents the position-encoded second convolution fusion feature;
and obtaining the single-head attention output feature H according to the following formula:
H = softmax(Q K^T / √d_k) V.
8. a hand joint angle estimation device, characterized by comprising:
the signal acquisition unit is used for acquiring real-time surface electromyographic signals generated when the hand of the object to be detected moves;
the pre-trained hand joint angle estimation model is used for predicting to obtain an estimated joint angle of an object to be measured during hand motion according to the collected real-time surface electromyographic signals, and comprises a multi-scale convolution module and a multi-layer multi-head attention module which are sequentially connected.
9. A computer-readable storage medium storing a hand joint angle estimation program which, when executed by a processor, implements the hand joint angle estimation method of any one of claims 1 to 7.
10. A computer device comprising a computer readable storage medium, a processor, and a hand joint angle estimation program stored in the computer readable storage medium, the hand joint angle estimation program when executed by the processor implementing the hand joint angle estimation method of any one of claims 1 to 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210168380 | 2022-02-23 | ||
CN2022101683804 | 2022-02-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114707539A true CN114707539A (en) | 2022-07-05 |
Family
ID=82168495
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210246659.XA Pending CN114707539A (en) | 2022-02-23 | 2022-03-14 | Hand joint angle estimation method, hand joint angle estimation device, storage medium and equipment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114707539A (en) |
WO (1) | WO2023159674A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024103371A1 (en) * | 2022-11-18 | 2024-05-23 | 中国科学院深圳先进技术研究院 | Hand joint angle estimation method |
WO2024113241A1 (en) * | 2022-11-30 | 2024-06-06 | 中国科学院深圳先进技术研究院 | Cross-user hand joint angle estimation method and system, and computer device |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108874149B (en) * | 2018-07-28 | 2020-06-02 | 华中科技大学 | Method for continuously estimating human body joint angle based on surface electromyogram signal |
CN110188598B (en) * | 2019-04-13 | 2022-07-05 | 大连理工大学 | Real-time hand posture estimation method based on MobileNet-v2 |
US11403486B2 (en) * | 2019-11-13 | 2022-08-02 | Huawei Technologies Co., Ltd. | Methods and systems for training convolutional neural network using built-in attention |
CN113627401A (en) * | 2021-10-12 | 2021-11-09 | 四川大学 | Myoelectric gesture recognition method of feature pyramid network fused with double-attention machine system |
- 2022-03-07 WO PCT/CN2022/079574 patent/WO2023159674A1/en unknown
- 2022-03-14 CN CN202210246659.XA patent/CN114707539A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024103371A1 (en) * | 2022-11-18 | 2024-05-23 | 中国科学院深圳先进技术研究院 | Hand joint angle estimation method |
WO2024113241A1 (en) * | 2022-11-30 | 2024-06-06 | 中国科学院深圳先进技术研究院 | Cross-user hand joint angle estimation method and system, and computer device |
Also Published As
Publication number | Publication date |
---|---|
WO2023159674A1 (en) | 2023-08-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |