CN112434679A

CN112434679A - Rehabilitation exercise evaluation method and device, equipment and storage medium

Info

Publication number: CN112434679A
Application number: CN202110107879.XA
Authority: CN
Inventors: 汤有才; 乔元风; 李哲; 李恩耀
Original assignee: Xuanwei Beijing Biotechnology Co ltd
Current assignee: Xuanwei Beijing Biotechnology Co ltd
Priority date: 2021-01-27
Filing date: 2021-01-27
Publication date: 2021-03-02
Anticipated expiration: 2041-01-27
Also published as: CN112434679B

Abstract

Disclosed are a rehabilitation exercise evaluation method, a rehabilitation exercise evaluation device, equipment and a storage medium. In an embodiment of the present application, the method for evaluating rehabilitation exercise may include: acquiring continuous multi-frame rehabilitation motion images of an evaluation object; extracting two-dimensional bone node information frame by frame from the continuous multi-frame rehabilitation motion images; predicting three-dimensional bone node information of each frame of rehabilitation motion image through a pre-established bone node tree structure based on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion images; and evaluating the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of a single frame of rehabilitation motion image. The method and the device can evaluate various rehabilitation motion postures of the evaluation object efficiently and accurately at a lower hardware cost, and are of particular value for judging the motion quality and motion errors of the evaluation object and for understanding the initial standing habit of the evaluation object.

Description

Rehabilitation exercise evaluation method and device, equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for evaluating rehabilitation exercise, a device, and a storage medium.

Background

At present, patients needing rehabilitation training generally have to go to a rehabilitation hospital for on-site training. The effect of rehabilitation training is ensured either by following the standard actions displayed by an online rehabilitation platform or by performing designated rehabilitation actions under the guidance of a professional doctor (such as a rehabilitation therapist). Whether a training action is standard is judged by the subjective impression of the professional doctor; that is, the professional doctor needs to participate in the rehabilitation training, so the training is restricted in time and place.

Chinese patent CN107982898A discloses a rehabilitation exercise training system and method which uses multiple sensor devices to perform motion recognition on a patient. The training system comprises a first attitude sensor, a second attitude sensor, a third attitude sensor and a controller; the first attitude sensor, the second attitude sensor and the third attitude sensor respectively acquire a first quaternion, a second quaternion and a third quaternion of the human body attitude and send them to the controller; the controller respectively converts the first quaternion, the second quaternion and the third quaternion into Euler angle information of different rotation sequences under an inertial navigation coordinate system, and judges the current body position of the human body according to the Euler angle information. That invention measures and quantifies the rehabilitation action made by the patient, evaluates whether the training action meets the standard according to the measured values, and is of particular value for judging the action quality and action errors made by the patient and for understanding the initial standing habit of the patient. A professional doctor (a rehabilitation therapist) is not required to participate in the training, the training is not restricted by time and place, and remote rehabilitation training and guidance are realized.

Chinese patent application CN111444879A discloses a joint strain autonomous rehabilitation action recognition method and system mainly comprising a skeleton network for detecting human body posture. The method comprises the following steps: preprocessing the collected human body posture image and video, and labeling human body bones, sitting and standing postures and autonomous rehabilitation movement to obtain a training data set; constructing a human body posture estimation neural network, training the human body posture estimation network by using a training data set to identify labeled human body skeleton points, and performing joint connection on the identified human body skeleton points to obtain human body skeleton sequence characteristics; constructing a posture classification neural network, inputting human body skeleton sequence characteristics extracted from the sitting posture and the autonomous rehabilitation motion video, and training the posture classification neural network to obtain an action recognition result; adjusting the network depth and the characteristic quantity of the human body posture estimation neural network and the posture classification neural network; and collecting the rehabilitation training action video of the patient in real time, and inputting the rehabilitation training action video into the trained human body posture estimation neural network and the posture classification neural network to obtain the action recognition result of the patient.

Chinese patent application CN201711080208.9 discloses a measurement system and method for rehabilitation exercise parameters, wherein the measurement system comprises an attitude sensor and a controller. The rehabilitation exercise parameters include: a primary angle; the attitude sensor is used for acquiring an initial quaternion and a current quaternion of the attitude of the training part and sending the initial quaternion and the current quaternion to the controller; the controller is used for converting the initial quaternion into at least two pieces of initial Euler angle information of different rotation sequences under an inertial navigation coordinate system, converting the current quaternion into at least two pieces of current Euler angle information of different rotation sequences under the inertial navigation coordinate system, and calculating the main angle according to the at least two pieces of initial Euler angle information and the at least two pieces of current Euler angle information. On the basis of the attitude quaternion, the Euler angles in multiple rotation sequences are fused to calculate the main angle, so that singular points are avoided, and the motion angle of the training part of the patient can be accurately represented.

The first and last related technologies require a plurality of sensor devices to identify the movement of the patient, and have the problems of complex operation and single rehabilitation exercise training. Although the second related technology includes steps such as a skeleton network for detecting the posture of a human body, the output result is two-dimensional skeleton node information, the process of the rehabilitation movement of the patient cannot be comprehensively and accurately evaluated, human-computer interaction is lacked, and the patient cannot be guided to do the rehabilitation movement.

Disclosure of Invention

In order to partially or fully solve the above technical problems, it is desirable to provide a new rehabilitation exercise assessment method capable of efficiently and accurately assessing various rehabilitation exercise postures of an assessment subject with low hardware cost.

In one aspect of the present application, there is provided a rehabilitation exercise evaluation method, including:

acquiring continuous multi-frame rehabilitation motion images of an evaluation object;

extracting two-dimensional bone node information frame by frame aiming at the continuous multi-frame rehabilitation motion image;

predicting three-dimensional bone node information of each frame of rehabilitation motion image through a pre-established bone node tree structure based on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion image;

and evaluating the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of the rehabilitation motion image of the single frame.

In some examples, the method of assessing rehabilitative motion further comprises:

before predicting the three-dimensional bone node information of each frame of rehabilitation motion image, performing time dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion images by using a pre-constructed two-dimensional posture adjustment model to obtain a time sequence, wherein the time sequence comprises the optimized two-dimensional bone node information which is sorted according to the frame number or time.

In some examples, the two-dimensional pose adjustment model comprises a plurality of cascaded basic units, and each intermediate basic unit comprises one or more hidden layers. The input of each hidden layer comprises the hidden state of the previous hidden layer at time t, the hidden state of the previous hidden layer at time t+1, the input vector at time t and the input vector at time t+1; the hidden state at time t+1 is output after a feed-forward linear transformation followed by the nonlinear processing of an activation function.
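The hidden-layer computation described above can be illustrated with a short NumPy sketch. It is a minimal sketch under stated assumptions, not the patented model: the function name `trellis_layer`, the weight shapes, the zero padding at the first step and the `tanh` activation are all illustrative choices that the source does not specify.

```python
import numpy as np

def trellis_layer(x, z_prev, w1, w2, activation=np.tanh):
    """One hidden layer that mixes steps t and t+1 (illustrative sketch).

    x:      (T, d_in) input vectors, e.g. flattened 2D bone node coordinates
    z_prev: (T, d_h)  hidden states of the previous hidden layer
    w1, w2: (d_in + d_h, d_h) weights applied to step t and step t+1
    Returns the (T, d_h) hidden states of this layer.
    """
    feats = np.concatenate([x, z_prev], axis=1)            # [input ; hidden] per step
    # Assumption: pad one zero step so the first output also has a "step t".
    padded = np.vstack([np.zeros((1, feats.shape[1])), feats])
    # Feed-forward linear transformation of step t (padded[:-1]) and step t+1 (padded[1:]).
    pre_activation = padded[:-1] @ w1 + padded[1:] @ w2
    return activation(pre_activation)                      # nonlinearity
```

Because each output step only reads the current and the preceding step, changing the last frame leaves all earlier hidden states unchanged, which is what makes the layer usable on a frame-ordered time sequence.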

In some examples, before performing time-dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation moving images by using a pre-constructed two-dimensional posture adjustment model to obtain a time sequence, the method further includes: fitting based on the time sequence output by the two-dimensional attitude adjustment model to obtain an estimated value sequence of a preset physical parameter of the motion capture system; and constructing a loss function of the two-dimensional posture adjustment model based on the estimated value sequence of the preset physical parameters and the real values of the preset physical parameters so as to train the two-dimensional posture adjustment model by using the loss function.

In some examples, the predetermined physical parameters of the motion capture system include a focal length and focal coordinates of a camera in the motion capture system in a coordinate direction.

In some examples, fitting based on the time series of two-dimensional pose adjustment model outputs to obtain estimates of physical parameters of a motion capture system includes: predicting the three-dimensional bone node information of each frame of the rehabilitation motion image by the motion capture system by utilizing the time sequence output by the two-dimensional posture adjustment model; and obtaining an estimation value sequence of the preset physical parameters of the motion capture system by using the three-dimensional bone node information of each frame of the rehabilitation motion image and adopting a least square method for estimation.
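The least-squares fitting and the resulting loss can be sketched as follows. This is a hedged illustration only: it assumes a pinhole projection `u = f * X / Z + c` along one coordinate direction, which the source does not spell out, and the names `fit_camera_axis`, `fit_camera_sequence` and `camera_loss` are hypothetical.

```python
import numpy as np

def fit_camera_axis(coords_3d, coords_2d, axis=0):
    """Least-squares estimate of (focal length, principal-point coordinate) on one axis.

    Assumed model (not stated in the source): u = f * X / Z + c.
    coords_3d: (J, 3) predicted 3D bone nodes, coords_2d: (J, 2) 2D bone nodes.
    """
    ratio = coords_3d[:, axis] / coords_3d[:, 2]
    design = np.stack([ratio, np.ones_like(ratio)], axis=1)   # columns: [X/Z, 1]
    (f, c), *_ = np.linalg.lstsq(design, coords_2d[:, axis], rcond=None)
    return float(f), float(c)

def fit_camera_sequence(seq_3d, seq_2d, axis=0):
    """One (f, c) estimate per frame, i.e. the estimated value sequence."""
    return [fit_camera_axis(p3, p2, axis) for p3, p2 in zip(seq_3d, seq_2d)]

def camera_loss(estimates, true_f, true_c):
    """Mean squared error between the estimated sequence and the real camera values."""
    est = np.asarray(estimates)
    return float(np.mean((est[:, 0] - true_f) ** 2 + (est[:, 1] - true_c) ** 2))
```

The appeal of this kind of loss is that the camera parameters are known and constant, so any drift in the per-frame estimates signals an inconsistent two-dimensional sequence.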

In some examples, the bone node tree structure is created by analyzing a human pose in a rehabilitative exercise, which contains a plurality of bone joint pairs, with two bone node parent-child associations in each bone joint pair.

In some examples, predicting three-dimensional bone node information of each frame of the rehabilitation moving image through a pre-created bone node tree structure includes: predicting the length and direction unit vectors of each bone node pair in the bone node tree structure through a pre-constructed first prediction sub-model based on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion image; and estimating the three-dimensional bone node information of each bone node through a pre-constructed second prediction sub model according to the length and direction unit vectors of the bone joint pairs in each frame of rehabilitation motion image and the time stamps thereof.
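To show why a parent-child tree plus per-pair lengths and unit direction vectors is enough to place every bone node, the following sketch composes node positions geometrically. Note that the source estimates the three-dimensional information with a learned second prediction sub-model; the plain composition below is a swapped-in simplification, and `joints_from_bones` and its argument layout are hypothetical.

```python
def joints_from_bones(parents, lengths, directions, root=(0.0, 0.0, 0.0)):
    """Place each bone node at parent + length * unit_direction.

    parents[j]    : index of the parent of node j, or -1 for the root node
    lengths[j]    : length of the bone from parents[j] to j (ignored for the root)
    directions[j] : unit direction vector of that bone (ignored for the root)
    Nodes must be ordered so that every parent appears before its children.
    """
    joints = [None] * len(parents)
    for j, parent in enumerate(parents):
        if parent < 0:
            joints[j] = tuple(float(v) for v in root)
        else:
            joints[j] = tuple(
                joints[parent][k] + lengths[j] * directions[j][k] for k in range(3)
            )
    return joints
```

For example, a three-node chain with `parents = [-1, 0, 1]` is resolved root first, then its child, then its grandchild, so an error in one bone only propagates to its descendants.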

In some examples, the three-dimensional bone node information includes three-dimensional coordinates of each bone node and a pose angle of each bone node pair.

In some examples, the first predictor model includes a multi-layer perceptron (MLP) that includes two hidden layers.

In some examples, the first predictor model is trained by a pre-constructed length loss function and a direction loss function.
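The source names a length loss and a direction loss without giving their exact form. The sketch below shows one plausible pairing, assuming a mean absolute error on bone lengths and a mean cosine distance on bone directions; both choices and both function names are assumptions made for illustration.

```python
import math

def length_loss(pred_lengths, true_lengths):
    """Mean absolute error between predicted and reference bone lengths (assumed form)."""
    return sum(abs(p - t) for p, t in zip(pred_lengths, true_lengths)) / len(pred_lengths)

def direction_loss(pred_dirs, true_dirs):
    """Mean of (1 - cosine similarity) between predicted and reference directions (assumed form)."""
    total = 0.0
    for p, t in zip(pred_dirs, true_dirs):
        dot = sum(a * b for a, b in zip(p, t))
        norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in t))
        total += 1.0 - dot / norm
    return total / len(pred_dirs)
```

Keeping the two terms separate lets a training loop weight length accuracy and direction accuracy independently.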

In some examples, the second predictor model includes two multi-layered perceptrons, one of which is configured to estimate a distance between two bone nodes in a pair of bone nodes and whose input data is the two-dimensional bone node information, and the other of which is configured to estimate an angle between two bone nodes in a pair of bone nodes and whose input data includes timestamps of the two-dimensional bone node information and its corresponding rehabilitation motion image frame.

In some examples, the method of assessing rehabilitative motion further comprises: before the rehabilitation movement of the evaluation object is evaluated, the three-dimensional bone node information of the continuous multi-frame rehabilitation movement image is optimized by using the time sequence characteristics of the rehabilitation movement.

In some examples, the three-dimensional bone node information of the continuous multi-frame rehabilitation motion images is optimized, using the time sequence characteristics of rehabilitation motion, through a pre-constructed three-dimensional posture adjustment model, wherein the three-dimensional posture adjustment model comprises a Mogrifier long short-term memory network (Mogrifier LSTM).

In some examples, the three-dimensional posture adjustment model further includes a dropout layer disposed before the Mogrifier LSTM.

In some examples, optimizing the three-dimensional bone node information of the continuous multi-frame rehabilitation motion images using the time sequence characteristics of rehabilitation motion includes: alternately multiplying the three-dimensional bone node information at the current moment and the three-dimensional bone node information at the previous moment, and then performing the long short-term memory network computation.
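The alternating multiplication followed by an LSTM update can be sketched in NumPy as below. This follows the commonly published Mogrifier gating scheme, in which the current input and the previous hidden state rescale each other in turn; reading the "previous moment" information as the previous hidden state is an interpretation, and the function names, matrix shapes and gate ordering are assumptions rather than details from the source.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mogrify(x, h_prev, q_mats, r_mats):
    """Alternately gate the current input x and the previous state h_prev.

    q_mats[i]: (d_x, d_h) matrices used to rescale x from h_prev
    r_mats[i]: (d_h, d_x) matrices used to rescale h_prev from x
    Requires len(q_mats) == len(r_mats) or len(q_mats) == len(r_mats) + 1.
    """
    for i in range(len(q_mats) + len(r_mats)):
        if i % 2 == 0:
            x = 2.0 * sigmoid(q_mats[i // 2] @ h_prev) * x
        else:
            h_prev = 2.0 * sigmoid(r_mats[i // 2] @ x) * h_prev
    return x, h_prev

def lstm_step(x, h_prev, c_prev, w, u, b):
    """Standard LSTM cell update; w: (4*d_h, d_x), u: (4*d_h, d_h), b: (4*d_h,)."""
    d = h_prev.shape[0]
    gates = w @ x + u @ h_prev + b
    i, f, o = (sigmoid(gates[k * d:(k + 1) * d]) for k in range(3))  # input, forget, output
    g = np.tanh(gates[3 * d:])                                       # candidate cell state
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```

A typical step would call `mogrify` first and pass its two outputs to `lstm_step`, so the cell sees an input and a state that have already been conditioned on each other.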

In some examples, evaluating the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of the rehabilitation motion image of a single frame includes: periodically estimating the difference value between the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding moment, and comparing the difference value with a preset threshold value to obtain the motion posture estimation result of the estimation object at the corresponding moment.

In some examples, the standard three-dimensional bone node information is extracted from a continuous multi-frame standard moving image synchronized with the continuous multi-frame rehabilitation moving image.

In some examples, the method of assessing rehabilitative motion further comprises: before the rehabilitation movement of the evaluation object is evaluated, normalization operation is respectively carried out on the three-dimensional bone node information of the single-frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding moment.

In some examples, the difference between the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding time is estimated according to the following relation:

$$m = \lVert P_t - \hat{P}_t \rVert_2^2 + \lVert A_t - \hat{A}_t \rVert_2^2$$

wherein $\lVert \cdot \rVert_2$ denotes the L2 norm and $\lVert \cdot \rVert_2^2$ denotes the square of the L2 norm; $m$ represents the difference between the three-dimensional bone node information of the t-th frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding time; $P_t$ represents the normalized bone node three-dimensional coordinates of the t-th frame rehabilitation motion image; $\hat{P}_t$ represents the normalized bone node three-dimensional coordinates of the t-th frame standard motion image; $A_t$ represents the normalized attitude angle information of the t-th frame rehabilitation motion image; and $\hat{A}_t$ represents the normalized attitude angle information of the t-th frame standard motion image.
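The difference estimate and the threshold comparison can be sketched in Python. This is a minimal sketch, assuming the two squared L2 norms (bone node coordinates and attitude angles) are simply summed and that a difference above the threshold marks the posture as wrong; the function names, the flat-list input layout and the label strings are illustrative and not taken from the source.

```python
def squared_l2(a, b):
    """Square of the L2 norm of (a - b) for two equally long flat sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def pose_difference(coords, std_coords, angles, std_angles):
    """Difference m between a patient frame and the standard frame (assumed sum form).

    All inputs are already normalized and flattened into plain lists of floats.
    """
    return squared_l2(coords, std_coords) + squared_l2(angles, std_angles)

def evaluate_pose(m, threshold):
    """Compare the difference with a preset threshold."""
    return "correct" if m <= threshold else "wrong"
```

For instance, `pose_difference([0, 0, 0], [1, 2, 2], [0], [1])` gives 10, which `evaluate_pose` would flag as wrong for any threshold below 10.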

In some examples, the method of assessing rehabilitative motion further comprises: and controlling a playing device to play a standard video formed by the standard moving image to the evaluation object at the same time of evaluating the rehabilitation motion of the evaluation object or after the evaluation so as to guide the evaluation object to complete the rehabilitation motion.

In some examples, the method of assessing rehabilitative motion further comprises: periodically adjusting the current playing frame number of the standard video so that the played standard video is synchronized with the rehabilitation motion of the evaluation subject.

In some examples, the current playing frame number of the standard video is calculated by the following formula:

$$\tau = \arg\min_{i \in Q} \left( \lVert P_t - \hat{P}_{i+t} \rVert_2^2 + \lVert A_t - \hat{A}_{i+t} \rVert_2^2 \right)$$

wherein $\lVert \cdot \rVert_2$ denotes the L2 norm and $\lVert \cdot \rVert_2^2$ denotes the square of the L2 norm; $Q$ represents a preset search range, $Q = [b_t - b_f,\ b_t + b_f]$, where $b_t$ represents the current playing frame number of the standard video and $b_f$ represents the number of frames transmitted per second of the standard video; $P_t$ and $A_t$ represent the normalized bone node three-dimensional coordinates and the normalized attitude angle information of the t-th frame rehabilitation motion image; $\hat{P}_{i+t}$ represents the normalized bone node three-dimensional coordinates of the (i+t)-th frame standard motion image in the standard video; $\hat{A}_{i+t}$ represents the normalized attitude angle information of the (i+t)-th frame standard motion image in the standard video; and $\tau$ represents the current playing frame number of the standard video.
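The synchronization search can be sketched as a bounded nearest-pose lookup. This is a hedged sketch: it assumes the adjusted frame number is the frame in the search range whose standard pose is closest to the patient's current pose, it treats the search index as an absolute frame number of the standard video, and it clamps the range to the video length. The function names and the flat pose layout (coordinates and angles concatenated) are hypothetical.

```python
def squared_distance(a, b):
    """Squared L2 distance between two flat pose vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def best_sync_frame(patient_pose, standard_poses, b_t, b_f):
    """Pick the standard-video frame in [b_t - b_f, b_t + b_f] closest to the patient.

    patient_pose   : flat list, normalized coordinates followed by attitude angles
    standard_poses : list of flat lists, one per frame of the standard video
    b_t            : current playing frame number of the standard video
    b_f            : frames per second of the standard video
    """
    first = max(0, b_t - b_f)
    last = min(len(standard_poses) - 1, b_t + b_f)
    return min(
        range(first, last + 1),
        key=lambda i: squared_distance(patient_pose, standard_poses[i]),
    )
```

Limiting the search to one second on either side of the current frame keeps the playback from jumping to a distant frame that merely happens to look similar.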

In some examples, the method for evaluating rehabilitative exercise further comprises:

and when the motion posture evaluation result of the evaluation object indicates that the motion posture of the evaluation object is wrong, sending an alarm signal, intercepting the rehabilitation motion image frame and/or the standard motion image frame at the corresponding moment, and controlling a display device to display the rehabilitation motion image frame and/or the standard motion image frame to the evaluation object.

In one aspect of the present application, there is provided a rehabilitation exercise evaluation device including:

an acquisition unit configured to acquire a continuous multi-frame rehabilitation moving image of an evaluation subject;

a two-dimensional posture estimation unit configured to extract two-dimensional bone node information frame by frame for the continuous multi-frame rehabilitation motion image;

the three-dimensional posture estimation unit is configured to predict three-dimensional bone node information of each frame of the rehabilitation motion image through a pre-established bone node tree structure based on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion image;

and the evaluation unit is configured to evaluate the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of the rehabilitation motion image of a single frame.

In some examples, the rehabilitation exercise evaluation device further includes: and the two-dimensional posture adjusting unit is configured to perform time dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion images by using a pre-constructed two-dimensional posture adjusting model to obtain a time sequence before predicting the three-dimensional bone node information of each frame of rehabilitation motion images, wherein the time sequence comprises the optimized two-dimensional bone node information which is sorted according to the frame number or time.

In some examples, the bone node tree structure is created by analyzing the objective rules of posture changes in rehabilitation exercises, and comprises a plurality of bone joint pairs, wherein the two bone nodes in each bone joint pair have a parent-child association.

In some examples, the three-dimensional pose estimation unit includes: a first estimation subunit configured to predict, based on the two-dimensional bone node information of the consecutive multi-frame rehabilitation moving images, a length and direction unit vector of each bone node pair in the bone node tree structure through a pre-constructed first prediction sub-model; and the second estimation subunit is configured to estimate the three-dimensional bone node information of each bone node through a second pre-constructed prediction sub-model according to the length and direction unit vectors of the bone joint pairs in each frame of the rehabilitation motion image and the time stamps thereof.

In some examples, the rehabilitation exercise evaluation device further includes: and the three-dimensional posture adjusting unit is configured to optimize the three-dimensional bone node information of the continuous multi-frame rehabilitation motion image by using the time sequence characteristics of rehabilitation motion before evaluating the rehabilitation motion of the evaluation object.

In some examples, the three-dimensional posture adjustment unit is configured to optimize the three-dimensional bone node information of the continuous multi-frame rehabilitation motion images, using the time sequence characteristics of rehabilitation motion, through a pre-constructed three-dimensional posture adjustment model, and the three-dimensional posture adjustment model comprises a Mogrifier long short-term memory network (Mogrifier LSTM).

In some examples, the evaluation unit includes: a difference estimation subunit configured to periodically estimate a difference between the three-dimensional bone node information of the single-frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding time; and the comparison subunit is configured to compare the difference value with a preset threshold value to obtain a motion posture evaluation result of the evaluation object at the corresponding moment.

In some examples, the rehabilitation exercise evaluation device further includes: and the normalization operation unit is configured to respectively perform normalization operation on the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding moment.

In some examples, the rehabilitation exercise evaluation device further includes: a playback control unit configured to control a playback device to play back a standard video formed of the standard moving image to the evaluation subject so as to instruct the evaluation subject to complete a rehabilitation exercise.

In some examples, the rehabilitation exercise evaluation device further includes: a synchronization unit configured to periodically adjust a current play frame number of the standard video so that the played standard video is synchronized with a rehabilitation motion of the evaluation subject.

In some examples, the rehabilitation exercise evaluation device further includes: a prompting unit configured to send out an alarm signal when the motion posture evaluation result determined by the evaluation unit indicates that the motion posture of the evaluation object is wrong, and at the same time to intercept the rehabilitation motion image frame and/or the standard motion image frame at the corresponding moment and control the display device to display the rehabilitation motion image frame and/or the standard motion image frame to the evaluation object.

In one aspect of the present application, there is provided a computing device comprising:

one or more processors;

a memory for storing the processor-executable instructions;

the processor is used for reading the executable instructions stored in the memory to execute the evaluation method of the rehabilitation exercise.

In one aspect of the present application, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor is capable of implementing the above-mentioned rehabilitation exercise assessment method.

According to the embodiment of the application, the three-dimensional bone node information of the evaluation object is obtained by performing two-dimensional analysis and three-dimensional prediction on the continuous multi-frame rehabilitation motion image of the evaluation object, accurate measurement and three-dimensional quantification of rehabilitation motions of the evaluation object can be realized without arranging too many sensors and complicated operations, the rehabilitation motion posture of the evaluation object can be evaluated according to the three-dimensional bone node information, efficient and accurate evaluation of various rehabilitation motion postures of the evaluation object is realized with lower hardware cost, and the method has special significance for judging motion quality and motion errors made by the evaluation object and knowing the initial standing habit of the evaluation object.

Drawings

Several embodiments of the present application are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:

Fig. 1 is a schematic flow chart of a rehabilitation exercise evaluation method according to an embodiment of the present application.

Fig. 2 is an exemplary diagram of a human bone node according to an embodiment of the present application.

Fig. 3 is a diagram of an exemplary network structure of HigherHRNet in an embodiment of the present application.

Fig. 4 is a schematic diagram of a calculation process of each hidden layer in TrellisNet according to an embodiment of the present application.

Fig. 5 is an exemplary network structure diagram of TrellisNet in an embodiment of the present application.

Fig. 6 is a diagram of an exemplary network structure of the Mogrifier LSTM according to an embodiment of the present application.

Fig. 7 is a schematic flowchart of an evaluation apparatus for rehabilitation exercise according to an embodiment of the present application.

Fig. 8 is a schematic structural diagram of a computing device according to an embodiment of the present application.

Detailed Description

The principles and spirit of the present application will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present application, and are not intended to limit the scope of the present application in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

Fig. 1 shows an exemplary flow of an evaluation method of rehabilitation exercise in the embodiment of the present application. Referring to fig. 1, the method for evaluating rehabilitation exercise in the embodiment of the present application may include the following steps:

step S110, acquiring continuous multi-frame rehabilitation motion images of an evaluation object;

step S120, extracting two-dimensional bone node information frame by frame aiming at the continuous multi-frame rehabilitation motion image;

step S130, based on the two-dimensional bone node information of continuous multi-frame rehabilitation motion images, predicting the three-dimensional bone node information of each frame of rehabilitation motion image through a pre-established bone node tree structure;

and step S140, evaluating the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of the single-frame rehabilitation motion image.

As can be seen from the above, in the embodiment of the present application, the three-dimensional bone node information of the evaluation object is obtained by performing two-dimensional analysis and three-dimensional prediction on the continuous multi-frame rehabilitation moving image of the evaluation object, accurate measurement and three-dimensional quantization of rehabilitation motions of the evaluation object can be achieved without arranging too many sensors, and the rehabilitation motion posture of the evaluation object can be evaluated according to the three-dimensional bone node information, so that efficient and accurate evaluation of various rehabilitation motion postures of the evaluation object is achieved with a low hardware cost, and special significance is provided for determining motion quality and motion errors made by the evaluation object (e.g., a patient or a trainee) and understanding an initial standing habit of the evaluation object.
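The flow of steps S110 to S140 can be summarized as a small pipeline skeleton. It is only a structural sketch: the three callables stand in for the two-dimensional pose estimation model, the three-dimensional prediction through the bone node tree structure, and the posture evaluation, and all names are hypothetical.

```python
def evaluate_rehab_frames(frames, extract_2d, lift_to_3d, evaluate_pose):
    """Run steps S120 to S140 on frames already acquired in step S110.

    extract_2d(frame)    -> 2D bone node information of one frame
    lift_to_3d(poses_2d) -> list of 3D bone node information, one entry per frame
    evaluate_pose(pose)  -> evaluation result for one frame
    """
    poses_2d = [extract_2d(frame) for frame in frames]    # S120: frame by frame
    poses_3d = lift_to_3d(poses_2d)                       # S130: uses the whole 2D sequence
    if len(poses_3d) != len(frames):
        raise ValueError("expected one 3D pose per input frame")
    return [evaluate_pose(pose) for pose in poses_3d]     # S140: one result per frame
```

The asymmetry is deliberate: two-dimensional extraction and evaluation work on single frames, while the three-dimensional prediction receives the full sequence so it can exploit temporal context.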

In step S110, the continuous multi-frame rehabilitation moving image refers to a multi-frame rehabilitation moving image of a predetermined time window, and each frame of rehabilitation moving image includes rehabilitation movement of the evaluation object. Here, the time window may be set in advance. In the embodiment of the present application, the selection of the time window is related to the two-dimensional pose adjustment model in step S130 and the three-dimensional pose adjustment model in step S150. For example, the time window may be set to 9 frames.
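A fixed time window over an incoming frame stream can be kept with a bounded queue. The sketch below assumes a sliding window that advances one frame at a time; the source only states that the window length is preset (for example 9 frames), so the stepping policy and the name `sliding_windows` are assumptions.

```python
from collections import deque

def sliding_windows(frames, window=9):
    """Yield every run of `window` consecutive frames, advancing one frame at a time."""
    buffer = deque(maxlen=window)      # oldest frame is dropped automatically
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == window:
            yield list(buffer)
```

With 12 input frames and a 9-frame window this yields 4 windows, the first covering frames 0 to 8 and the last covering frames 3 to 11.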

In step S120, two-dimensional bone node information may be extracted frame by frame through a pre-constructed two-dimensional pose estimation model. Here, the two-dimensional bone node information may include, but is not limited to, two-dimensional coordinate information of a human bone node designated in advance. Fig. 2 shows an exemplary view of a human bone node in an embodiment of the present application. Referring to fig. 2, the number of human bone nodes specified in advance is J = 18.

In at least some embodiments, the two-dimensional pose estimation model described above can be implemented by a HigherHRNet network. Fig. 3 shows an exemplary network structure of HigherHRNet in the embodiment of the present application, which outputs the two-dimensional bone node information of the corresponding frame from an input rehabilitation motion image frame.

Referring to Fig. 3, HigherHRNet may include a preprocessing module 31, a first-stage network module 32, a second-stage network module 33, a third-stage network module 34, a deconvolution module 35, an upsampling module (not shown in the figure), and a feature refinement module 36, which are connected in sequence. The legend symbols in Fig. 3 (not reproduced here) denote, in order, a feature map, a convolution operation, a strided convolution, upsampling, feature concatenation, and deconvolution.

The preprocessing module 31 may be implemented by a Stem module, and is configured to extract a first feature map of one frame of the rehabilitation motion image and reduce the resolution of the first feature map to 1/4 of the original image. Here, the preprocessing module includes two convolution layers with identical convolution kernels. For example, 3 × 3 convolution kernels with 64 channels may be used, with a stride of 2 and a padding parameter (padding) of 1 for the convolution operation.
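The 1/4 resolution follows from the standard convolution output-size arithmetic, as this short check shows. It assumes a 3 × 3 spatial kernel together with the stated stride of 2 and padding of 1; the helper name `conv_out` and the 512-pixel input are illustrative.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# Two identical stride-2 layers applied to a 512-pixel side.
after_first = conv_out(512)           # 256
after_second = conv_out(after_first)  # 128, i.e. 1/4 of 512
```

Each stride-2 layer halves the side length, so two of them in sequence give the quarter-resolution feature map.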

The first-stage network module 32 may be configured to reduce the width of the first feature map to a predetermined value C. Specifically, the first-stage network module 32 may include two residual units and one convolution layer, wherein each residual unit is formed by a bottleneck layer (bottleneck) with a width (number of channels) of 64; the convolution layer may be a 3 × 3 convolution layer that reduces the width of the first feature map to C. For example, C may be set to 32.

The second-stage network module 33 may include one or two resolution blocks, each of which may include two 3 × 3 convolution layers, and four residual units.

The third-stage network module 34 includes three resolution modules and three residual units, and each resolution module includes two 3 × 3 convolutions with convolution widths of C, 2C and 4C, respectively.

The deconvolution module 35 takes as input the feature map from the HigherHRNet and the predicted heat map, and generates a new feature map with twice the resolution of the input feature map. In the present embodiment, a high-quality and high-resolution feature map can be efficiently generated by deconvolution. In some examples, the deconvolution module 35 may perform the deconvolution operation by using a 4 × 4 convolution kernel.
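As a rough check of the doubling, the output size of a transposed convolution can be computed as below. The 4 × 4 kernel comes from the text, while the stride of 2 and the padding of 1 are assumptions chosen so that the resolution doubles exactly; the helper name `deconv_out_size` is hypothetical.

```python
def deconv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output size of a transposed convolution (no output padding, no dilation)."""
    return (size - 1) * stride - 2 * padding + kernel


# A 128 x 128 feature map (1/4 of a 512 x 512 image) becomes 256 x 256 (1/2).
upsampled = deconv_out_size(128, kernel=4, stride=2, padding=1)
```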

The upsampling module is configured to learn to upsample the input feature map using batch normalization (BatchNorm) and activation function processing (ReLU) after the deconvolution module 35.

The feature refinement module 36 may include 4 residual units to refine the feature map output by the upsampling module, and finally obtains and outputs the two-dimensional bone node information of one frame of rehabilitation motion image.

In addition, the network module connected between the third-stage network module 34 and the deconvolution module 35 in fig. 3 implements feature map aggregation. If the third-stage network module 34 up-samples all feature maps of different resolutions to the resolution of the input image by bilinear interpolation, this network module can average the feature maps of all scales for the final prediction. It may be regarded either as part of the third-stage network module 34 or as a network module independent of the third-stage network module 34. In fig. 3, "1/4", "1/8", "1/16" and "1/2" respectively indicate the resolution of the feature map output by the corresponding module relative to the size of the original image. Assuming that the original image input to the HigherHRNet is 512 × 512, the feature map output by the first-stage network module 32 has a resolution of 1/4, i.e., 128 × 128; the feature map output by the second-stage network module 33 has a resolution of 1/8, i.e., 64 × 64; the feature map output by the third-stage network module 34 has a resolution of 1/16, i.e., 32 × 32; and the feature map output by the feature refinement module 36 has a resolution of 1/2, i.e., 256 × 256.

In the embodiment of the present application, the two-dimensional pose estimation model may also be implemented by, but is not limited to, HRNet (High-Resolution Network), OpenPose, DeepCut, RMPE (Regional Multi-Person Pose Estimation), Mask R-CNN, Hourglass, CPN, MSPN, and the like.

Referring to fig. 1, after step S120 and before step S130, optionally, the rehabilitation exercise evaluation method according to the embodiment of the present application may further include: step S121, performing time dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion image by using a pre-constructed two-dimensional posture adjustment model to obtain a time sequence, wherein the time sequence comprises the optimized two-dimensional bone node information which is sorted according to the frame number or time. In the embodiment of the application, a time sequence is obtained through optimization and adjustment of time dimension, the relative positions of the bone nodes in each frame of image can be reflected, and the continuous change of the information of the bone nodes along with time can be reflected, namely the continuous change of the rehabilitation motion posture of the evaluation object along with time can be reflected.

In at least some embodiments, the two-dimensional pose adjustment model may include a plurality of cascaded basic units, each basic unit in the middle includes one or more hidden layers, and the input of each hidden layer includes the hidden state of the previous hidden layer at time t, the hidden state at time t +1, the input vector at time t and the input vector at time t +1, and the hidden state at time t +1 is output after being subjected to feedforward linear transformation and nonlinear processing of an activation function. Therefore, the two-dimensional bone node information can be optimized in the time dimension through the two-dimensional posture adjustment model, so that the two-dimensional bone node information can reflect the continuous change of the corresponding bone node position, and the continuous change of the rehabilitation motion posture of the evaluation object along with the time can also be reflected.

In at least some embodiments, the two-dimensional pose adjustment model may be, but is not limited to, a TrellisNet, that is, the TrellisNet may be used to perform time-dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion images. Here, the TrellisNet may be implemented by a long short-term memory network (LSTM), a temporal convolutional network (TCN), or other similar network models.

The TrellisNet comprises a plurality of cascaded basic units, each basic unit in the middle comprises one or more hidden layers connected in sequence, and each hidden layer comprises a hidden state extraction operation, a feedforward linear transformation operation and an activation operation. Fig. 4 shows the calculation process of each hidden layer in the TrellisNet. Referring to fig. 4, t represents time, i represents the network layer, W represents a weight, x represents an input vector, and z represents a hidden state. The input of each hidden layer (i.e., layer i + 1 in fig. 4) consists of the hidden states of the previous layer i at time t and time t + 1 and the input vectors at time t and time t + 1; the hidden state of the current layer i + 1 at time t + 1 is output after a feedforward linear transformation (bias omitted) and the nonlinear processing of an activation function.

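In some examples, the feedforward linear transformation described above may be represented by the following equation (1):

$$\hat{z}_{t+1}^{(i+1)} = W_1 \begin{bmatrix} x_t \\ z_t^{(i)} \end{bmatrix} + W_2 \begin{bmatrix} x_{t+1} \\ z_{t+1}^{(i)} \end{bmatrix} \tag{1}$$

In some examples, the operation of the activation function f in the hidden layer may be expressed as the following formula (2):

$$z_{t+1}^{(i+1)} = f\left(\hat{z}_{t+1}^{(i+1)}\right) \tag{2}$$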

here, the activation function f may employ, but is not limited to, a tanh activation function.
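The hidden-layer computation described above may be sketched as follows. This is an illustrative reading of the description rather than the embodiment's implementation: the names `trellis_layer`, `w1` and `w2`, the zero padding before the first frame, and the use of plain tanh are assumptions.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def trellis_layer(xs: List[Vector], zs: List[Vector], w1: Matrix, w2: Matrix) -> List[Vector]:
    """Compute the layer i+1 hidden states from inputs xs and layer-i states zs.

    For each time step, [x_t; z_t] is weighted by w1 and [x_{t+1}; z_{t+1}]
    by w2 (bias omitted), and the sum is passed through tanh.
    """
    prev = [0.0] * (len(xs[0]) + len(zs[0]))  # zero padding before the first frame
    out = []
    for x, z in zip(xs, zs):
        cur = x + z  # concatenation of input vector and hidden state
        pre = [a + b for a, b in zip(matvec(w1, prev), matvec(w2, cur))]
        out.append([math.tanh(v) for v in pre])
        prev = cur
    return out


# Toy example: 3 frames, 2-dimensional inputs, 2-dimensional hidden states.
xs = [[0.0, 0.5], [1.0, -1.0], [0.2, 0.3]]
zs = [[0.0, 0.0] for _ in xs]
pick_x = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # copies the x part only
states = trellis_layer(xs, zs, w1=pick_x, w2=pick_x)
```

With these toy weights, each output state is the tanh of the sum of the current and previous input vectors, which makes the mixing of adjacent time steps easy to see.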

Assuming that two hidden layers are arranged in each basic unit in the middle and that the time window is 9 frames, the overall model structure of the TrellisNet can adopt the structure shown in fig. 5. In fig. 5, each block represents one hidden layer as shown in fig. 4, x_i (i = 1, …, 9) represents the two-dimensional bone node information of each frame of rehabilitation motion image in the time window, and y_i (i = 1, …, 9) represents the optimized two-dimensional bone node information. As can be seen from fig. 5, the TrellisNet can perform time-dimension optimization and adjustment on the two-dimensional bone node information output by the HigherHRNet.

Training of a two-dimensional pose adjustment model (e.g., the above trellis net) can be achieved through a pre-constructed loss function. The two-dimensional pose adjustment model can be trained by one, two or more loss functions.

In some embodiments, a sequence of estimated values of a predetermined physical parameter of the motion capture system may be obtained by fitting based on the time series output by the two-dimensional pose adjustment model. A loss function of the two-dimensional pose adjustment model is then constructed from the estimated value sequence and the real values of the predetermined physical parameter, and the two-dimensional pose adjustment model is trained with this loss function to further enhance its reliability. Here, the specific type of the predetermined physical parameter is not limited. In some examples, the predetermined physical parameters of the motion capture system may include, but are not limited to, the focal length and focal coordinates of the camera along one coordinate direction of the motion capture system.

In some examples, fitting based on the time series output by the two-dimensional pose adjustment model to obtain a sequence of estimated values of a predetermined physical parameter of the motion capture system may include: a1, predicting the three-dimensional bone node information of each frame of rehabilitation motion image by the motion capture system by using the time sequence output by the two-dimensional posture adjustment model; and a2, estimating by using the three-dimensional bone node information of each frame of rehabilitation motion image by using a least square method to obtain an estimation value sequence of the preset physical parameters of the motion capture system.

For example, when training the TrellisNet, the two-dimensional bone node information output by the TrellisNet is used to fit, by the least square method, a focal length estimate $\hat{f}_x$ and a focal position estimate $\hat{c}_x$ of the camera in the x direction (i.e., the depth direction in which the object moves away from or toward the camera, namely the optical axis direction of the camera). Comparing the fitted focal length estimate $\hat{f}_x$ and focal position estimate $\hat{c}_x$ with the real focal length $f_x$ and the real focal position $c_x$ of the camera in the x direction yields a loss function that can enhance the reliability of the TrellisNet model.

Still taking the trellis net as an example, it is assumed that the two-dimensional bone node information output by the trellis net can be expressed as the following formula (3).

（3）

In formula (3), the output comprises the coordinate values, in the two-dimensional coordinate system, of the bone nodes at time t obtained by the TrellisNet optimization, together with the confidence at time t output by the TrellisNet; $\mathbb{R}^{J \times 2}$ denotes a two-dimensional real matrix whose dimensions are J (the predefined number of bone nodes) and 2. The real coordinate values of the corresponding bone nodes in the sample images used to train the TrellisNet can be obtained by manual annotation and are denoted $a_t$, $b_t$.

The first loss function of the trellis net, that is, the loss function corresponding to the two-dimensional bone node information thereof, can be constructed as shown in formula (4):

（4）

After the optimized two-dimensional bone node information is obtained through the TrellisNet, the corresponding three-dimensional bone node information can be obtained by fitting with a motion capture system, structured light photography, or the like, and the focal length estimate and the focal coordinate estimate of the corresponding camera in the x direction are estimated from the three-dimensional bone node information by the least square method, as shown in the following formula (5).

（5）

In formula (5), the symbols denote, in the order in which they are defined, the average of the abscissa values of the J bone nodes, the sum of the coordinate values of the J bone nodes, and the abscissa value of the j-th bone node; $x^j$ denotes the x coordinate value of the j-th bone node among the predefined J bone nodes.

A second loss function for training the TrellisNet is then constructed from the estimated focal length and focal coordinate of the camera in the x direction and the real focal length and focal coordinate of the camera in the x direction, as shown in the following formula (6).

（6）
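The least-squares estimation and the second loss may be sketched as follows. The sketch assumes a pinhole projection in which the pixel abscissa u of a bone node equals f_x · (X / Z) + c_x, so that f_x and c_x are the slope and intercept of a straight-line fit; this projection model, the squared-error form of the loss and the names `fit_focal_and_center` and `camera_loss` are assumptions and are not taken from formulas (5) and (6).

```python
from typing import Sequence, Tuple


def fit_focal_and_center(u: Sequence[float], ratio: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of u = f * ratio + c; returns (f, c).

    u     : pixel abscissas of the bone nodes (two-dimensional information)
    ratio : X / Z of the same bone nodes (three-dimensional information)
    """
    n = len(u)
    mean_r = sum(ratio) / n
    mean_u = sum(u) / n
    cov = sum((r - mean_r) * (v - mean_u) for r, v in zip(ratio, u))
    var = sum((r - mean_r) ** 2 for r in ratio)
    f = cov / var
    return f, mean_u - f * mean_r


def camera_loss(f_hat: float, c_hat: float, f_true: float, c_true: float) -> float:
    """Squared error between estimated and real camera parameters (assumed form)."""
    return (f_hat - f_true) ** 2 + (c_hat - c_true) ** 2


# Synthetic bone nodes generated with f_x = 1000 and c_x = 320.
ratios = [-0.2, -0.1, 0.0, 0.1, 0.3]
pixels = [1000.0 * r + 320.0 for r in ratios]
f_hat, c_hat = fit_focal_and_center(pixels, ratios)
loss = camera_loss(f_hat, c_hat, 1000.0, 320.0)
```

On noise-free synthetic data the fit recovers the generating parameters, so the loss is close to zero; with real predictions the loss grows as the predicted poses become inconsistent with the camera geometry.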

In step S130, the bone node tree structure is created by analyzing the objective rules of posture change in rehabilitation exercise. The bone node tree structure includes a plurality of bone node pairs, and the two bone nodes in each bone node pair are in a parent-child relationship. The bone node relationships in various three-dimensional postures can be described through the bone node tree structure, so that the three-dimensional bone node information of the evaluation object can be predicted more accurately.

Assume that, in the rehabilitation exercise evaluation method of the embodiment of the present application, the evaluation object lifts the right hand, the wrist being higher than the head is taken as the mark that the evaluation object is ready, namely the mark for starting the evaluation, and the wrist being higher than the head is taken as the mark for the end of the evaluation. From this, it can be analyzed that the wrist node is a child node of the elbow node, and the elbow node is a parent node of the wrist node and is also a child node of the shoulder node. Similarly, the bone node tree structure of the corresponding rehabilitation exercise can be constructed in advance by analyzing the specific postures of the human body in various rehabilitation exercises. Taking fig. 2 as an example, the number of bone nodes is J = 18, there are 17 bone node pairs in a parent-child relationship, and the corresponding bone node tree structure can be represented as { [14, 16], [0, 14], [15, 17], [0, 15], [0, 1], [12, 13], [11, 12], [6, 7], [5, 6], [9, 10], [8, 9], [2, 3], [3, 4], [1, 2], [1, 5], [1, 8], [1, 11] }, where each number is the predetermined number of one bone node and each number pair represents one bone node pair. Correspondingly, a total of 17 attitude angles are set. To correspond to the number of bone nodes, a first attitude angle is set to 0.
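For illustration, the tree structure listed above can be held as a parent lookup table. The sketch assumes that each pair is ordered as (parent, child), which is consistent with every node other than node 0 appearing exactly once as a second element; the names `BONE_PAIRS`, `PARENT` and `path_to_root` are hypothetical.

```python
from typing import Dict, List

# The 17 bone node pairs of the example, assumed to be ordered (parent, child).
BONE_PAIRS = [
    (14, 16), (0, 14), (15, 17), (0, 15), (0, 1), (12, 13), (11, 12),
    (6, 7), (5, 6), (9, 10), (8, 9), (2, 3), (3, 4), (1, 2), (1, 5),
    (1, 8), (1, 11),
]

# Every child has exactly one parent, so a dictionary is enough for the tree.
PARENT: Dict[int, int] = {child: parent for parent, child in BONE_PAIRS}


def path_to_root(node: int) -> List[int]:
    """Return the chain of bone nodes from `node` up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


chain = path_to_root(4)  # node 4 -> 3 -> 2 -> 1 -> 0
```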

In step S130, three-dimensional bone node information may be predicted by a three-dimensional pose estimation model constructed in advance. In some embodiments, the three-dimensional pose estimation model may include a first predictor model operable to predict length and direction unit vectors for each bone node pair and a second predictor model operable to estimate three-dimensional bone node information for the bone nodes. That is, the prediction process in step S130 may include: b1, based on the two-dimensional bone node information of continuous multiframe rehabilitation motion images, predicting the length and direction unit vector of each bone node pair in the bone node tree structure through a pre-constructed first prediction sub model; and b2, estimating the three-dimensional bone node information of each bone node through a pre-constructed second predictor model according to the length and direction unit vectors of the bone joint pairs in each frame of rehabilitation motion image and the time stamps thereof.

In step S130, before step b1, input data for predicting the length and direction unit vectors of each bone node pair in the bone node tree structure is constructed using the two-dimensional bone node information of the consecutive multi-frame rehabilitation moving images. Wherein the input data may include: the time sequence formed by two-dimensional bone node information of continuous multi-frame rehabilitation moving images, the time sequence formed by the difference value between the father and son bone nodes of each bone joint pair in the continuous multi-frame rehabilitation moving images, and the time sequence formed by the Euclidean distance of pixels between the father and son bone nodes of each bone joint pair in the continuous multi-frame rehabilitation moving images. In particular, the input data may be represented as

where the symbols denote, in the order in which they are defined, the two-dimensional coordinate sequence of the child bone node, the two-dimensional coordinate sequence of the parent bone node, and the pixel Euclidean distance between the two bone nodes in a parent-child relationship; $\mathbb{R}$ is the set of real numbers, and $\mathbb{R}^{9 \times J \times 1}$ denotes a three-dimensional tensor whose sizes along its dimensions are 9, J and 1, respectively.
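The three time series that make up the input data may be assembled as in the following sketch. The function name `build_inputs`, the child-minus-parent sign convention and the nested-list layout are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pair = Tuple[int, int]  # (parent index, child index)


def build_inputs(frames: Sequence[Sequence[Point]], pairs: Sequence[Pair]):
    """Return per-frame coordinates, parent-child differences and pixel distances."""
    coords, offsets, dists = [], [], []
    for joints in frames:
        frame_offsets: List[Point] = []
        frame_dists: List[float] = []
        for parent, child in pairs:
            dx = joints[child][0] - joints[parent][0]
            dy = joints[child][1] - joints[parent][1]
            frame_offsets.append((dx, dy))
            frame_dists.append(math.hypot(dx, dy))  # pixel Euclidean distance
        coords.append(list(joints))
        offsets.append(frame_offsets)
        dists.append(frame_dists)
    return coords, offsets, dists


# Toy example: 2 frames, 3 bone nodes, 2 parent-child pairs.
toy_frames = [
    [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)],
    [(1.0, 1.0), (4.0, 5.0), (4.0, 11.0)],
]
coords, offsets, dists = build_inputs(toy_frames, pairs=[(0, 1), (1, 2)])
```

In the toy example the whole skeleton is shifted by one pixel between the two frames, so the differences and distances are identical in both frames while the coordinates change.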

The first predictor model may predict the length and the direction unit vector between the two bone nodes in a parent-child relationship, i.e., of a bone node pair. It may comprise a multi-layer perceptron (MLP) with two hidden layers.

Here, the first predictor model is trained by a length loss function and a direction loss function constructed in advance. Specifically, the length loss function may be constructed as a function shown in the following equation (7), and the directional loss function may be constructed as a function shown in the following equation (8).

（7）

（8）

where l and r are the true values of the length and the direction unit vector, and the symbol "< >" denotes the product of the two direction unit vectors.

The second predictor model, which may include an MLP comprising two hidden layers, may calculate the three-dimensional bone node information of each bone node based on the length and the direction unit vector between the two bone nodes in a parent-child relationship. The three-dimensional bone node information may include the three-dimensional coordinates of each bone node and the pose angle of each bone node pair. Specifically, the calculation result of the second predictor model can be expressed as shown in the following expression (9).

（9）

In expression (9), the distance (offset) between the two bone nodes in the j-th bone node pair equals the product of the length and the direction unit vector between the two bone nodes in a parent-child relationship; the remaining symbols denote the three-dimensional coordinates of the child bone node in the j-th bone node pair and the three-dimensional coordinates of the parent bone node in the j-th bone node pair.
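The relation between parent and child coordinates may be illustrated as below: each child bone node is placed at its parent's position plus the length multiplied by the direction unit vector. The function name `place_joints`, the explicit root position and the repeated passes over the pair list are assumptions of this sketch.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Pair = Tuple[int, int]  # (parent index, child index)


def place_joints(root_id: int, root_xyz: Vec3, pairs: Sequence[Pair],
                 lengths: Sequence[float], directions: Sequence[Vec3]) -> Dict[int, Vec3]:
    """Compute child = parent + length * direction for every bone node pair."""
    joints: Dict[int, Vec3] = {root_id: root_xyz}
    pending = list(zip(pairs, lengths, directions))
    while pending:
        remaining: List = []
        for (parent, child), length, direction in pending:
            if parent in joints:  # the parent is already placed
                px, py, pz = joints[parent]
                dx, dy, dz = direction
                joints[child] = (px + length * dx, py + length * dy, pz + length * dz)
            else:  # try again on the next pass
                remaining.append(((parent, child), length, direction))
        if len(remaining) == len(pending):
            raise ValueError("pairs do not form a tree rooted at root_id")
        pending = remaining
    return joints


# Toy chain 0 -> 1 -> 2, deliberately listed child-first to exercise the passes.
joints = place_joints(
    root_id=0, root_xyz=(0.0, 0.0, 0.0),
    pairs=[(1, 2), (0, 1)],
    lengths=[2.0, 1.0],
    directions=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
)
```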

The second prediction submodel may include two multi-layer perceptrons, wherein one multi-layer perceptron is configured to estimate a distance between two bone nodes in the bone node pair and input data of the multi-layer perceptron is the two-dimensional bone node information, and the other multi-layer perceptron is configured to estimate an angle between two bone nodes in the bone node pair and input data of the multi-layer perceptron includes the two-dimensional bone node information and a timestamp of a rehabilitation motion image frame corresponding to the two-dimensional bone node information. Here, when the second predictor model is trained, the loss function of the second predictor model may be constructed by comparing real data of a rehabilitation exercise performed by an evaluation target acquired by a motion capture (MOCAP) system or a camera having an active structured light projection function with a result predicted by the second predictor model.

In this embodiment of the application, after step S130 and before step S140, the method may further include: step S131, before evaluating the rehabilitation movement of the evaluation object, optimizing the three-dimensional bone node information of the continuous multi-frame rehabilitation movement image by using the time sequence characteristics of the rehabilitation movement.

Specifically, the three-dimensional bone node information of the continuous multi-frame rehabilitation motion images can be optimized by utilizing the time sequence characteristics of the rehabilitation motion through a pre-constructed three-dimensional posture adjustment model, wherein the three-dimensional posture adjustment model comprises a Mogrifier long short-term memory network (Mogrifier LSTM).

Here, the input of the Mogrifier LSTM is the three-dimensional bone node information of the above-mentioned continuous multi-frame rehabilitation motion images (for example, 9 frames of rehabilitation motion images), and the output is a temporally optimized time series, which includes optimized three-dimensional bone node information sorted by frame number or time, each item of optimized three-dimensional bone node information corresponding to one frame of rehabilitation motion image.

To better embody the sequence feature, step S131 may include: letting the three-dimensional bone node information at the current moment and the long short-term memory network state at the previous moment alternately modulate each other by multiplication, and then performing the long short-term memory network computation.

Taking the Mogrifier LSTM as an example, the main process can be as follows: before the normal long short-term memory network (LSTM) computation, the input $x^{i}$ (i.e., the three-dimensional bone node information of the above-described continuous multi-frame rehabilitation motion images, for example 9 rehabilitation motion images, i = 1, …, 9) and the result $h_{prev}$ output by the Mogrifier LSTM at the previous moment (i.e., a time series comprising optimized three-dimensional bone node information sorted by frame number or time, each item of which corresponds to one frame of rehabilitation motion image) are made to interact alternately (e.g., by element-wise multiplication).

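The process of the Mogrifier LSTM can be expressed as the following formula (10), in which $x^{\uparrow}$ and $h^{\uparrow}_{prev}$ are defined as the $x^{i}$ and $h^{i}_{prev}$ whose superscript i is the largest, and $c_{prev}$ is the cell state at the previous moment.

$$\mathrm{Mogrify}(x, c_{prev}, h_{prev}) = \mathrm{LSTM}\left(x^{\uparrow}, c_{prev}, h^{\uparrow}_{prev}\right) \tag{10}$$

Specifically, the calculation process of the Mogrifier LSTM is expressed as the following formula (11), in which $\sigma$ is the sigmoid function, $\odot$ is element-wise multiplication, and $Q^{i}$ and $R^{i}$ are learned matrices.

$$x^{i} = 2\sigma\left(Q^{i} h^{i-1}_{prev}\right) \odot x^{i-2} \ \text{ for odd } i \in [1, r], \qquad h^{i}_{prev} = 2\sigma\left(R^{i} x^{i-1}\right) \odot h^{i-2}_{prev} \ \text{ for even } i \in [1, r] \tag{11}$$

where $x^{-1} = x$ and $h^{0}_{prev} = h_{prev}$.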

The number of rounds r is a hyperparameter. If r =0, Mogrifier LSTM is a normal LSTM. In the embodiment of the present application, r =5 is taken, and an exemplary structure of the model is shown in fig. 6.
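The alternating interaction may be sketched as follows, following the published Mogrifier LSTM scheme in which odd rounds rescale the input with a gate computed from the previous state and even rounds rescale the previous state with a gate computed from the input. The function name `mogrify` and the toy gate matrices `q_mats` and `r_mats` are assumptions; the subsequent ordinary LSTM step is omitted.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mogrify(x: Vector, h_prev: Vector, q_mats: List[Matrix],
            r_mats: List[Matrix], rounds: int = 5) -> Tuple[Vector, Vector]:
    """Alternately gate x by h_prev (odd rounds) and h_prev by x (even rounds)."""
    x, h = list(x), list(h_prev)
    for i in range(1, rounds + 1):
        if i % 2 == 1:  # odd round: update the input
            gate = matvec(q_mats[i // 2], h)
            x = [2.0 * sigmoid(g) * v for g, v in zip(gate, x)]
        else:  # even round: update the previous state
            gate = matvec(r_mats[i // 2 - 1], x)
            h = [2.0 * sigmoid(g) * v for g, v in zip(gate, h)]
    return x, h


# r = 5 needs three input-side gates (rounds 1, 3, 5) and two state-side gates.
zero_q = [[[0.0], [0.0]] for _ in range(3)]  # each maps a 1-d state to a 2-d gate
zero_r = [[[0.0, 0.0]] for _ in range(2)]    # each maps a 2-d input to a 1-d gate
same_x, same_h = mogrify([1.0, 2.0], [0.5], zero_q, zero_r, rounds=5)

# One round with a strong gate: the first component doubles, the second vanishes.
strong_q = [[[100.0], [-100.0]]]
gated_x, _ = mogrify([1.0, 2.0], [0.5], strong_q, [], rounds=1)
```

With all-zero gate matrices every gate equals 2σ(0) = 1, so the input and the state pass through unchanged, which mirrors the remark that r = 0 reduces the model to a normal LSTM.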

In order to better suppress overfitting, the three-dimensional pose adjustment model may further include a dropout layer arranged before the Mogrifier LSTM. Here, the dropout rate of the dropout layer may be 1 − k, where k is the confidence of each bone node.

Assume that the optimized output of the three-dimensional pose adjustment model is $S_{opt}$. Considering the continuity and smoothness of the rehabilitation exercise motion in the time series, the loss function of the three-dimensional posture adjustment model can be constructed as shown in the following formula (12).

（12）

Here, the functions H and F are the first-order and second-order derivatives of S with respect to time, which guarantees that S is smooth and continuous to the first and second order. S is a vector of size 3 × J (J being the number of bone nodes) that represents the set of three-dimensional coordinates of all bone nodes; differentiating S with respect to time ensures the continuity and smoothness of the motion.
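A smoothness penalty of this kind may be sketched with finite differences, as below. The exact form of formula (12) is not reproduced; approximating the first and second derivatives by frame-to-frame differences, squaring them and weighting both terms equally are assumptions, as are the function names.

```python
from typing import List, Sequence


def differences(seq: Sequence[Sequence[float]]) -> List[List[float]]:
    """Frame-to-frame differences of a sequence of coordinate vectors."""
    return [[b - a for a, b in zip(p, q)] for p, q in zip(seq, seq[1:])]


def sum_of_squares(rows: Sequence[Sequence[float]]) -> float:
    return sum(v * v for row in rows for v in row)


def smoothness_loss(seq: Sequence[Sequence[float]]) -> float:
    """Penalize first-order and second-order temporal changes of S."""
    first = differences(seq)     # discrete first derivative
    second = differences(first)  # discrete second derivative
    return sum_of_squares(first) + sum_of_squares(second)


# One coordinate tracked over four frames.
steady = smoothness_loss([[0.0], [1.0], [2.0], [3.0]])  # uniform motion
jitter = smoothness_loss([[0.0], [1.0], [0.0], [1.0]])  # back-and-forth motion
```

Uniform motion is penalized only through its first differences, while jittery motion also incurs a large second-difference term.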

An exemplary implementation of step S140 may include: step c1, periodically (for example, every 100 ms after synchronization) estimating the difference between the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding moment; and step c2, comparing the difference with a preset threshold to obtain the motion posture evaluation result of the evaluation object at the corresponding moment.

Here, the standard three-dimensional bone node information is extracted from a continuous multi-frame standard moving image synchronized with a continuous multi-frame rehabilitation moving image.

Since the evaluation objects are different in height and weight, in order to perform unified calculation and enhance the reliability of the system, in step S140, before the rehabilitation motion of the evaluation object is evaluated, normalization operations may be performed on the three-dimensional bone node information of the single frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding time, respectively. Specifically, to simplify the calculation, the normalization operation may be performed with respect to the three-dimensional coordinate information in the three-dimensional bone node information.

Assume that the normalized three-dimensional bone node information consists of $k_{3D,t}$ and $\theta_{3D,t}$, where $k_{3D,t}$ is the three-dimensional coordinate information of the bone nodes and $\theta_{3D,t}$ is the pose angle information of the bone node pairs. In step c1, the difference between the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding time can be estimated according to the following relation:

（13）

where $\|\cdot\|_2$ denotes computing the L2 norm, and the remaining symbol denotes the normalized attitude angle information of the t-th frame rehabilitation motion image.

In step c2, the motion posture estimation result of the object under evaluation at the corresponding time can be obtained by the following formula (14).

$$f(m) = \begin{cases} 1, & m \ge m_t \\ 0, & m < m_t \end{cases} \tag{14}$$

where $m_t$ is a threshold. $m_t$ may take an empirical value, and its value differs for different types of rehabilitation exercise. For example, $m_t$ may be set to 15.24 according to the requirements of the rehabilitation exercise. That is, when the difference m is greater than or equal to the threshold $m_t$, f(m) takes 1, indicating that the motion posture of the evaluation object at the corresponding moment is wrong; when the difference m is smaller than the threshold $m_t$, f(m) takes 0, indicating that the motion posture of the evaluation object at the corresponding moment is correct.
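Steps c1 and c2 may be sketched as follows. The thresholding follows the description of formula (14) with the example threshold 15.24; the form of the difference (a sum of squared differences over coordinates and pose angles) is only an assumed stand-in for relation (13), and the function names are hypothetical.

```python
from typing import Sequence


def pose_difference(k: Sequence[float], k_std: Sequence[float],
                    theta: Sequence[float], theta_std: Sequence[float]) -> float:
    """Assumed difference: squared L2 distance of coordinates plus that of angles."""
    coord_term = sum((a - b) ** 2 for a, b in zip(k, k_std))
    angle_term = sum((a - b) ** 2 for a, b in zip(theta, theta_std))
    return coord_term + angle_term


def evaluate_pose(m: float, m_t: float = 15.24) -> int:
    """Return 1 (posture wrong) if m >= m_t, otherwise 0 (posture correct)."""
    return 1 if m >= m_t else 0


# Flattened toy data: one bone node and one pose angle.
m = pose_difference([0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0], [0.0])
result = evaluate_pose(m)
```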

After step S140, when the motion posture evaluation result of the evaluation object at a given moment indicates that the motion posture of the evaluation object is wrong, an alarm signal may be sent. At the same time, the rehabilitation motion image frame and/or the standard motion image frame at the corresponding moment may be captured and the display device may be controlled to display them to the evaluation object, so that the evaluation object is automatically prompted to correct the motion posture while the alarm is raised.

Simultaneously with or after step S140, the method may further include: controlling the playback device to play a standard video formed of standard motion images to the evaluation object, simultaneously with or after the evaluation of the rehabilitation motion of the evaluation object, so as to guide the evaluation object to complete the rehabilitation exercise. Therefore, during rehabilitation exercise training, the evaluation object can learn independently without the participation of a professional (such as a rehabilitation therapist) and without restrictions of time and place, which enables remote rehabilitation training and guidance.

Here, simultaneously with or after step S140, the method may further include: periodically adjusting the current playing frame number of the standard video so that the played standard video is synchronized with the rehabilitation motion of the evaluation object.

Since an evaluation object (e.g., a patient or a trainee) goes through a learning process when doing the rehabilitation exercise, the exercise speed is often slow. In order to improve the rehabilitation exercise effect and guide the evaluation object to complete the standard rehabilitation exercise, the system needs to periodically (for example, every 3 seconds) adjust the playing frame number of the standard video so that it is synchronized with the rehabilitation motion of the evaluation object. Here, the basic idea of synchronization is to determine the current frame number of the standard video as the frame number for which the difference in position and angle between the three-dimensional bone node information of the evaluation object over a short period of time and the three-dimensional bone node information of the standard video over that period is minimal.

In some examples, the current playing frame number of the standard video is calculated by the following equation (15):

（15）

where $\|\cdot\|_2$ denotes computing the L2 norm and $\|\cdot\|_2^2$ denotes the square of the L2 norm; Q denotes a preset search range, $Q = [b_t - b_f,\ b_t + b_f]$; $b_t$ denotes the current playing frame number of the standard video; $b_f$ denotes the number of frames per second (FPS) of the standard video; the remaining symbols denote, in the order in which they are defined, the normalized three-dimensional bone node coordinates of the t-th frame rehabilitation motion image, the normalized attitude angle information of the t-th frame rehabilitation motion image, and the normalized attitude angle information of the (i + T)-th frame standard motion image in the standard video; τ denotes the current playing frame number of the standard video; and T denotes the number of frames in the preset time window (for example, T = 9 may be set, see above).

The frame number of the standard video that should currently be played is obtained through formula (15), and the standard video is then adjusted to play the standard motion image of that frame number, so that the standard video can correctly guide the evaluation object to carry out effective rehabilitation exercise.
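The search for the frame number may be sketched as follows. The search range [b_t − b_f, b_t + b_f] follows the text; the alignment convention (the window of T frames ends at the candidate frame), the per-frame feature vectors that merge coordinates and angles, and the names `best_standard_frame` and `sq_dist` are assumptions and do not reproduce formula (15) term by term.

```python
from typing import List, Optional, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared L2 distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_standard_frame(patient: Sequence[Sequence[float]],
                        standard: Sequence[Sequence[float]],
                        b_t: int, b_f: int) -> Optional[int]:
    """Return the frame in [b_t - b_f, b_t + b_f] whose preceding window of
    len(patient) standard frames is closest to the patient's window."""
    window = len(patient)
    best_tau, best_cost = None, float("inf")
    for tau in range(b_t - b_f, b_t + b_f + 1):
        start = tau - window + 1
        if start < 0 or tau >= len(standard):
            continue  # the candidate window falls outside the standard video
        cost = sum(sq_dist(p, standard[start + i]) for i, p in enumerate(patient))
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau


# Toy data: the standard video moves one unit per frame; the evaluation object's
# last 9 frames match standard frames 32..40 although frame 50 is being played.
standard_video: List[List[float]] = [[float(i)] for i in range(100)]
patient_window = [[float(i)] for i in range(32, 41)]
tau = best_standard_frame(patient_window, standard_video, b_t=50, b_f=30)
```

In the toy example the standard video would be rewound from frame 50 to frame 40 so that it matches the slower evaluation object.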

Fig. 7 shows an exemplary structure of a rehabilitation exercise evaluation device provided in an embodiment of the present application. The rehabilitation exercise evaluation device may be configured to implement the rehabilitation exercise evaluation method described above. Referring to fig. 7, the rehabilitation exercise evaluation apparatus may include:

an acquisition unit 71 configured to acquire a continuous multi-frame rehabilitation moving image of an evaluation subject;

a two-dimensional posture estimation unit 72 configured to extract two-dimensional bone node information frame by frame for the continuous multi-frame rehabilitation moving image;

a three-dimensional posture estimation unit 73 configured to predict three-dimensional bone node information of each frame of the rehabilitation moving image through a pre-created bone node tree structure based on two-dimensional bone node information of the consecutive multi-frame rehabilitation moving images;

an evaluation unit 74 configured to evaluate the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of a single frame of the rehabilitation motion image.

In some examples, the above apparatus for evaluating rehabilitation exercise may further include: the two-dimensional posture adjustment unit 75 is configured to perform time dimension optimization on the two-dimensional bone node information of the consecutive multi-frame rehabilitation moving images by using a pre-constructed two-dimensional posture adjustment model to obtain a time series before the three-dimensional posture estimation unit 73 predicts the three-dimensional bone node information of each frame rehabilitation moving image, wherein the time series includes the optimized two-dimensional bone node information sorted according to the number of frames or time.

In some examples, the two-dimensional pose adjustment model comprises a plurality of cascaded basic units, each basic unit comprises one or more hidden layers, the input of each hidden layer comprises a hidden state of a previous hidden layer at a time t, a hidden state at a time t +1, an input vector at the time t and an input vector at the time t +1, and the hidden state at the time t +1 is output after the feedforward linear transformation and the nonlinear processing of an activation function.

In some examples, the three-dimensional pose estimation unit 73 may include:

a first estimating sub-unit 731 configured to predict a length and a direction unit vector of each bone node pair in the bone node tree structure through a pre-constructed first prediction sub-model based on two-dimensional bone node information of the consecutive multi-frame rehabilitation moving images;

a second estimation sub-unit 732 configured to estimate three-dimensional bone node information of each bone node through a pre-constructed second prediction sub-model according to the length and direction unit vectors of the bone node pairs in each frame of the rehabilitation moving image and the time stamps thereof.
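By way of illustration only, the following sketch shows why a length and a direction unit vector per bone node pair are sufficient to determine the three-dimensional coordinates of every bone node in a tree structure: each child node lies at its parent's position plus the bone length times the bone direction. This is not the second prediction sub-model itself, which is a learned model; the five-node tree `PARENT` and the function name `joints_from_bones` are hypothetical.

```python
import numpy as np

# Hypothetical 5-node tree: child index -> parent index (root has parent -1).
PARENT = {0: -1, 1: 0, 2: 1, 3: 0, 4: 3}

def joints_from_bones(lengths, directions, root=None):
    """Place each child at parent + length * unit direction.

    lengths[c] and directions[c] describe the bone from PARENT[c] to c.
    Returns an (n_nodes, 3) array of coordinates.
    """
    pos = {0: np.zeros(3) if root is None else np.asarray(root, dtype=float)}
    for child in sorted(PARENT):          # parents have smaller indices here
        parent = PARENT[child]
        if parent < 0:
            continue                      # the root is already placed
        d = np.asarray(directions[child], dtype=float)
        d = d / np.linalg.norm(d)         # re-normalise the predicted direction
        pos[child] = pos[parent] + lengths[child] * d
    return np.stack([pos[i] for i in sorted(pos)])

lengths = {1: 0.5, 2: 0.4, 3: 0.5, 4: 0.4}
directions = {1: (0, 1, 0), 2: (0, 1, 0), 3: (1, 0, 0), 4: (0, 0, 2)}
joints = joints_from_bones(lengths, directions)
```

Representing the skeleton in this way keeps every bone length explicit, so a predicted posture cannot stretch or shrink a limb from frame to frame unless the length prediction itself changes.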

In some examples, the above apparatus for evaluating rehabilitation exercise may further include: a three-dimensional posture adjustment unit 76 configured to optimize the three-dimensional bone node information of the continuous multi-frame rehabilitation moving image by using the time-series characteristics of the rehabilitation movement before the rehabilitation movement of the evaluation object is evaluated.

Here, the three-dimensional posture adjustment unit 76 may be configured to optimize the three-dimensional bone node information of the continuous multi-frame rehabilitation moving image by using the time-series characteristics of the rehabilitation motion through a pre-constructed three-dimensional posture adjustment model that includes a variant long short-term memory (LSTM) network.
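By way of illustration only, the claims describe the temporal optimization as alternately multiplying the three-dimensional bone node information at the current moment with the network state at the previous moment before the long short-term memory computation is performed. The following sketch assumes this corresponds to a Mogrifier-style gating followed by a standard LSTM cell; that correspondence, the number of gating rounds, and the names `mogrify`, `lstm_step`, `Q` and `R` are assumptions rather than details taken from the present application.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mogrify(x, h, Q, R, rounds=4):
    """Alternately gate the input with the previous state and vice versa.

    x: (D,) current 3D bone node vector; h: (H,) previous hidden state;
    Q: (D, H) and R: (H, D) are gating matrices (assumed learned).
    """
    for i in range(rounds):
        if i % 2 == 0:
            x = 2.0 * sigmoid(Q @ h) * x   # previous state modulates the input
        else:
            h = 2.0 * sigmoid(R @ x) * h   # modulated input modulates the state
    return x, h

def lstm_step(x, h, c, W, U, b):
    """Standard LSTM cell; W: (4H, D), U: (4H, H), b: (4H,)."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c_new = f * c + i * g                  # update the cell state
    h_new = o * np.tanh(c_new)             # emit the new hidden state
    return h_new, c_new
```

For each frame, `mogrify` would be applied first and its outputs passed to `lstm_step`; with all-zero gating matrices the gating reduces to the identity, so the sketch degrades gracefully to an ordinary LSTM.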

In some examples, the evaluation unit 74 may include:

a difference estimation subunit 741 configured to periodically estimate a difference between the three-dimensional bone node information of the single-frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding time;

a comparison subunit 742 configured to compare the difference with a preset threshold to obtain a motion posture evaluation result of the evaluation object at the corresponding time.
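By way of illustration only, the difference estimation and threshold comparison performed by the two subunits above may be sketched as follows. The sketch assumes that the difference is the L2 norm taken over all bone node coordinates of the frame and that a posture is accepted when the difference does not exceed the threshold; the exact relation used in the present application is not reproduced here, and the function name `evaluate_pose` and the threshold values are hypothetical.

```python
import numpy as np

def evaluate_pose(pose, standard, threshold):
    """Return (difference, is_correct) for one frame.

    pose, standard: (n_nodes, 3) arrays of three-dimensional bone node
    coordinates at the same moment. The difference is the L2 norm of their
    element-wise difference (an assumption).
    """
    diff = float(np.linalg.norm(np.asarray(pose, dtype=float)
                                - np.asarray(standard, dtype=float)))
    return diff, diff <= threshold

standard = np.zeros((2, 3))
pose = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])
diff, ok = evaluate_pose(pose, standard, threshold=6.0)
```

A stricter threshold flags smaller deviations as errors; in practice the value would be chosen per exercise.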

In some examples, the above apparatus for evaluating rehabilitation exercise may further include: a normalization operation unit 77 configured to perform normalization operations on the three-dimensional bone node information of the single-frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding time, respectively.
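By way of illustration only, one possible normalization operation is sketched below. It assumes that each posture is translated so that a root bone node lies at the origin and is scaled by the length of a reference bone, so that evaluation objects of different body size or camera distance become comparable with the standard posture. The present application does not specify the normalization; the function name `normalize_pose` and the choice of root node and reference bone are hypothetical.

```python
import numpy as np

def normalize_pose(joints, root=0, ref_bone=(0, 1)):
    """Centre the posture on the root node and scale by a reference bone.

    joints: (n_nodes, 3) array. root is the index of the node moved to the
    origin; ref_bone is the pair of node indices whose distance becomes 1.
    """
    joints = np.asarray(joints, dtype=float)
    centered = joints - joints[root]
    a, b = ref_bone
    scale = np.linalg.norm(joints[b] - joints[a])
    if scale == 0:
        raise ValueError("reference bone has zero length")
    return centered / scale

joints = np.array([[1.0, 1.0, 1.0], [1.0, 3.0, 1.0], [2.0, 1.0, 1.0]])
normalized = normalize_pose(joints)
```

Applying the same operation to both the single-frame posture and the standard posture removes differences in position and overall scale before the difference is estimated.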

In some examples, the above apparatus for evaluating rehabilitation exercise may further include: a playback control unit 78 configured to control a playback device to play back a standard video formed of the standard moving image to the evaluation subject so as to instruct the evaluation subject to complete a rehabilitation exercise.

In some examples, the above apparatus for evaluating rehabilitation exercise may further include: a synchronization unit 79 configured to periodically adjust the current play frame number of the standard video so that the played standard video is synchronized with the rehabilitation exercise of the evaluation subject.
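By way of illustration only, the periodic adjustment of the current play frame number may be sketched as a nearest-frame search. The sketch assumes that, around the current play frame number τ, the standard frame whose normalized posture angles are closest in L2 norm to those of the evaluation subject is selected as the new play frame; the formula of the present application is not reproduced here, and the function name `resync_frame` and the window size are hypothetical.

```python
import numpy as np

def resync_frame(subject_angles, standard_angles, tau, window=15):
    """Return the standard-video frame index best matching the subject.

    subject_angles: (n_angles,) normalized posture angles of the subject now.
    standard_angles: (n_frames, n_angles) normalized posture angles per frame.
    tau: current play frame number; only frames within +/- window are searched.
    """
    standard_angles = np.asarray(standard_angles, dtype=float)
    subject_angles = np.asarray(subject_angles, dtype=float)
    lo = max(0, tau - window)
    hi = min(len(standard_angles), tau + window + 1)
    dists = np.linalg.norm(standard_angles[lo:hi] - subject_angles, axis=1)
    return lo + int(np.argmin(dists))

# Hypothetical standard video: 100 frames, two posture angles per frame.
standard_angles = np.stack([np.arange(100), 2 * np.arange(100)], axis=1)
new_tau = resync_frame([42.2, 84.1], standard_angles, tau=40)
```

Restricting the search to a window around τ keeps the played video from jumping to a distant frame that happens to show a similar posture, for example in a repeated movement.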

In some examples, the above apparatus for evaluating rehabilitation exercise may further include: a prompting unit 710 configured to send out an alarm signal when the motion posture evaluation result determined by the evaluation unit 74 indicates that the motion posture of the evaluation object is wrong, and at the same time to capture the rehabilitation motion image frame and/or the standard motion image frame at the corresponding moment and control the display device to display the captured frame or frames to the evaluation object.

The above-described rehabilitation exercise evaluation device may be implemented by software, hardware, or a combination of both. For other technical details of the rehabilitation exercise evaluation device, reference is made to the above description of the rehabilitation exercise evaluation method, and details are not repeated.

Fig. 8 shows an exemplary structure of a computing device in an embodiment of the present application. In practice, the computing device may be a computer, server, or cluster thereof having high-performance processing capabilities. It should be noted that the computing device shown in fig. 8 is only an example, and the specific structure of the computing device of the embodiment of the present application is not limited thereto.

Referring to fig. 8, a computing device may include: one or more processors or processing units 801, a memory 802, a bus 803 connecting the various system components (including the memory 802 and the processing unit 801), the memory 802 operable to store processor executable instructions, the one or more processors or processing units 801 operable to read the executable instructions stored in the memory 802 to perform the above rehabilitation exercise assessment method.

The computing devices typically include a variety of computer system readable media. Such media may be any available media that is accessible by a computing device and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 802 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 8021 and/or cache memory 8022. The computing device may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, ROM8023 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 8, and typically referred to as a "hard disk drive"). Although not shown in FIG. 8, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 803 by one or more data media interfaces. Included in memory 802 may be at least one program product having a set (e.g., at least one) of program modules configured to perform the steps of the rehabilitation motion assessment method in the embodiments of the present application.

Program/utility 8025, having a set (at least one) of program modules 8024, can be stored, for example, in memory 802, and such program modules 8024 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment. Program modules 8024 generally perform the functions and/or methods of embodiments described herein.

The computing device may also communicate with one or more external devices 804, such as an image capture device, a pointing device, a display device, a video capture card, etc. Such communication may be through input/output (I/O) interfaces 805. Also, the computing device may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through the network adapter 806. As shown in FIG. 8, the network adapter 806 communicates with other modules of the computing device (e.g., processing unit 801, etc.) via the bus 803. It should be appreciated that although not shown in FIG. 8, other hardware and/or software modules may be used in conjunction with the computing device.

When the external device 804 includes a display device, the display device (e.g., a display screen, a touch display screen, etc.) may, under the control of the processor or processing unit 801, issue an alert or display to the evaluation subject the rehabilitation motion image frame and/or the standard motion image frame corresponding to an erroneous motion posture; and/or may, under the control of the processor or processing unit 801, play a standard video formed of the standard moving image to the evaluation subject so as to guide the evaluation subject to complete a rehabilitation exercise.

The external device 804 may further include an image pickup device, such as a camera or a video camera, which can capture a rehabilitation moving image of the evaluation subject in real time.

The video capture card in the external device 804 may be used to convert the rehabilitation motion image captured by the image capture device in real time into data recognizable by the processor or processing unit 801 and transmit the data to the processor or processing unit 801 or the memory 802.

The embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and the computer program can implement the above-mentioned rehabilitation exercise evaluation method when being executed by a processor. Here, examples of the computer-readable storage medium may include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.

In the description of the present application, it is noted that the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus, the device, and the computer-readable storage medium described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, or a portion of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program code.

Finally, it should be noted that the above embodiments are only specific embodiments of the present application, intended to illustrate its technical solutions rather than to limit them, and the scope of protection of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art may, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present application, and are intended to be covered by the scope of protection of the present application.

Claims

1. A method of assessing rehabilitative exercise, comprising:

2. The rehabilitation exercise assessment method of claim 1, further comprising:

3. The rehabilitation exercise evaluation method according to claim 2, wherein the two-dimensional posture adjustment model comprises a plurality of cascaded basic units, each basic unit in the middle comprises one or more hidden layers, and the input of each hidden layer comprises a hidden state of a previous hidden layer at time t, a hidden state at time t +1, an input vector at time t and an input vector at time t +1, and the hidden state at time t +1 is output after being subjected to feedforward linear transformation and nonlinear processing of an activation function.

4. The rehabilitation exercise evaluation method according to claim 2, before performing time dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion image by using the pre-constructed two-dimensional posture adjustment model to obtain a time sequence, further comprising:

fitting based on the time sequence output by the two-dimensional attitude adjustment model to obtain an estimated value sequence of a preset physical parameter of the motion capture system;

and constructing a loss function of the two-dimensional posture adjustment model based on the estimated value sequence of the preset physical parameters and the real values of the preset physical parameters so as to train the two-dimensional posture adjustment model by using the loss function.

5. The rehabilitation exercise evaluation method of claim 4, wherein the predetermined physical parameters of the motion capture system include a focal length and a focal point coordinate of the camera in a coordinate direction in the motion capture system.

6. The rehabilitation exercise assessment method of claim 4, wherein fitting based on the time series output by the two-dimensional pose adjustment model to obtain a series of estimated values of a predetermined physical parameter of a motion capture system comprises:

predicting the three-dimensional bone node information of each frame of the rehabilitation motion image by the motion capture system by utilizing the time sequence output by the two-dimensional posture adjustment model;

and obtaining an estimation value sequence of the preset physical parameters of the motion capture system by using the three-dimensional bone node information of each frame of the rehabilitation motion image and adopting a least square method for estimation.

7. The rehabilitation exercise assessment method of claim 1, wherein the bone node tree structure is created by analyzing a human body posture in the rehabilitation exercise and comprises a plurality of bone joint pairs, the two bone nodes in each bone joint pair being associated as parent and child.

8. The evaluation method of rehabilitation exercise according to claim 1 or 7, wherein predicting three-dimensional bone node information of each frame of the rehabilitation moving image through a pre-created bone node tree structure includes:

predicting the length and direction unit vectors of each bone node pair in the bone node tree structure through a pre-constructed first prediction sub-model based on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion image;

and estimating the three-dimensional bone node information of each bone node through a pre-constructed second prediction sub model according to the length and direction unit vectors of the bone joint pairs in each frame of rehabilitation motion image and the time stamps thereof.

9. The rehabilitation exercise evaluation method of claim 8, wherein the three-dimensional bone node information includes three-dimensional coordinates of each bone node and a posture angle of each bone node pair.

10. The rehabilitation exercise assessment method of claim 8, wherein the first prediction sub-model comprises a multi-layer perceptron including two hidden layers.

11. The rehabilitation exercise evaluation method of claim 8, wherein the first prediction sub-model is trained by a length loss function and a direction loss function constructed in advance.

12. The rehabilitation exercise assessment method of claim 8, wherein the second prediction sub-model comprises two multi-layered perceptrons, one of the multi-layered perceptrons is used for estimating the distance between two bone nodes in the bone node pair and its input data is the two-dimensional bone node information, the other multi-layered perceptron is used for estimating the angle between two bone nodes in the bone node pair and its input data comprises the two-dimensional bone node information and its corresponding time stamp of the rehabilitation exercise image frame.

13. The rehabilitation exercise assessment method of claim 1, further comprising:

before the rehabilitation movement of the evaluation object is evaluated, the three-dimensional bone node information of the continuous multi-frame rehabilitation movement image is optimized by using the time sequence characteristics of the rehabilitation movement.

14. The rehabilitation exercise evaluation method of claim 13, wherein the three-dimensional bone node information of the continuous multi-frame rehabilitation motion image is optimized by using the time-series characteristics of the rehabilitation exercise through a pre-constructed three-dimensional posture adjustment model, and the three-dimensional posture adjustment model comprises a variant long short-term memory network.

15. The rehabilitation exercise evaluation method according to claim 14, wherein the three-dimensional posture adjustment model further includes a dropout layer disposed before the variant long short-term memory network.

16. The rehabilitation exercise evaluation method of claim 13, wherein optimizing the three-dimensional bone node information of the continuous multi-frame rehabilitation motion image using the time-series characteristics of the rehabilitation exercise comprises: alternately multiplying the three-dimensional bone node information at the current moment with the long short-term memory network state at the previous moment, and then performing the long short-term memory network computation.

17. The rehabilitation exercise evaluation method according to claim 1, wherein evaluating the rehabilitation motion posture of the evaluation object at the corresponding moment based on the three-dimensional bone node information of a single frame of the rehabilitation motion image comprises: periodically estimating the difference between the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding moment, and comparing the difference with a preset threshold to obtain the motion posture evaluation result of the evaluation object at the corresponding moment.

18. The rehabilitation exercise evaluation method according to claim 17, wherein the standard three-dimensional bone node information is extracted from a continuous multi-frame standard moving image synchronized with the continuous multi-frame rehabilitation moving image.

19. The rehabilitation exercise assessment method of claim 17, further comprising:

before the rehabilitation movement of the evaluation object is evaluated, normalization operation is respectively carried out on the three-dimensional bone node information of the single-frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding moment.

20. The evaluation method of rehabilitation exercise according to claim 17 or 18, wherein the difference between the three-dimensional bone node information of a single frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding time is estimated according to the following relation:

wherein ‖·‖₂ denotes the L2 norm,

21. The rehabilitation exercise assessment method of claim 18, further comprising:

and controlling a playing device to play a standard video formed by the standard moving image to the evaluation object at the same time of evaluating the rehabilitation motion of the evaluation object or after the evaluation so as to guide the evaluation object to complete the rehabilitation motion.

22. The rehabilitation exercise assessment method of claim 21, further comprising:

periodically adjusting the current playing frame number of the standard video so that the played standard video is synchronized with the rehabilitation motion of the evaluation subject.

23. The rehabilitation exercise assessment method of claim 22, wherein the current playing frame number of the standard video is calculated by the following formula:

wherein ‖·‖₂ denotes the L2 norm,

denotes the normalized posture angle information of the (i + t)-th frame standard motion image in the standard video, and τ denotes the current playing frame number of the standard video.

24. The rehabilitation exercise assessment method of claim 17, further comprising:

25. An apparatus for evaluating rehabilitation exercise, comprising:

26. The rehabilitation exercise evaluation device of claim 25, further comprising:

a two-dimensional posture adjustment unit configured to, before the three-dimensional bone node information of each frame of the rehabilitation motion image is predicted, perform time dimension optimization on the two-dimensional bone node information of the continuous multi-frame rehabilitation motion images by using a pre-constructed two-dimensional posture adjustment model to obtain a time sequence, wherein the time sequence comprises the optimized two-dimensional bone node information sorted by frame number or time.

27. The rehabilitation exercise evaluation device of claim 26, wherein the two-dimensional pose adjustment model comprises a plurality of cascaded basic units, each basic unit in the middle comprises one or more hidden layers, and the input of each hidden layer comprises a hidden state of a previous hidden layer at time t, a hidden state at time t +1, an input vector at time t and an input vector at time t +1, and the hidden state at time t +1 is output after being subjected to feedforward linear transformation and nonlinear processing of an activation function.

28. The rehabilitation exercise assessment device of claim 25, wherein the bone node tree structure is created by analyzing objective rules of posture changes in rehabilitation exercise and comprises a plurality of bone joint pairs, the two bone nodes in each bone joint pair being associated as parent and child.

29. The rehabilitation exercise evaluation device of claim 25, wherein the three-dimensional posture estimation unit includes:

a first estimation subunit configured to predict, based on the two-dimensional bone node information of the consecutive multi-frame rehabilitation moving images, a length and direction unit vector of each bone node pair in the bone node tree structure through a pre-constructed first prediction sub-model;

and a second estimation subunit configured to estimate the three-dimensional bone node information of each bone node through a pre-constructed second prediction sub-model according to the length and direction unit vectors of the bone joint pairs in each frame of the rehabilitation motion image and the time stamps thereof.

30. The rehabilitation exercise evaluation device of claim 25, further comprising:

a three-dimensional posture adjustment unit configured to optimize the three-dimensional bone node information of the continuous multi-frame rehabilitation motion image by using the time-series characteristics of rehabilitation motion before the rehabilitation motion of the evaluation object is evaluated.

31. The rehabilitation exercise evaluation device of claim 30, wherein the three-dimensional posture adjustment unit is configured to optimize the three-dimensional bone node information of the continuous multi-frame rehabilitation motion image by using the time-series characteristics of the rehabilitation exercise through a pre-constructed three-dimensional posture adjustment model, and the three-dimensional posture adjustment model comprises a variant long short-term memory network.

32. The rehabilitation exercise evaluation device of claim 25, wherein the evaluation unit includes:

a difference estimation subunit configured to periodically estimate a difference between the three-dimensional bone node information of the single-frame rehabilitation moving image and the standard three-dimensional bone node information at the corresponding time;

and a comparison subunit configured to compare the difference with a preset threshold to obtain a motion posture evaluation result of the evaluation object at the corresponding moment.

33. The rehabilitation exercise evaluation device of claim 32, further comprising:

a normalization operation unit configured to perform normalization operations on the three-dimensional bone node information of the single-frame rehabilitation motion image and the standard three-dimensional bone node information at the corresponding moment, respectively.

34. The rehabilitation exercise evaluation device of claim 25, further comprising:

a playback control unit configured to control a playback device to play a standard video formed of a standard moving image to the evaluation subject so as to instruct the evaluation subject to complete a rehabilitation exercise.

35. The rehabilitation exercise assessment device of claim 34, further comprising:

a synchronization unit configured to periodically adjust a current play frame number of the standard video so that the played standard video is synchronized with a rehabilitation motion of the evaluation subject.

36. The rehabilitation exercise evaluation device of claim 32, further comprising:

a prompting unit configured to send out an alarm signal when the motion posture evaluation result determined by the evaluation unit indicates that the motion posture of the evaluation object is wrong, and at the same time to capture the rehabilitation motion image frame and/or the standard motion image frame at the corresponding moment and control the display device to display the captured frame or frames to the evaluation object.

37. A computing device, the computing device comprising:

one or more processors;

a memory for storing the processor-executable instructions;

the processor for reading executable instructions stored in the memory to perform the method of assessing rehabilitative exercise of any of claims 1-24.

38. A computer-readable storage medium, which stores a computer program which, when executed by a processor, is capable of implementing the assessment method of rehabilitative exercise of any one of claims 1-24.