CN112562850A

CN112562850A - Facial nerve paralysis rehabilitation detection system based on artificial intelligence

Info

Publication number: CN112562850A
Application number: CN202011466952.4A
Authority: CN
Inventors: 黄振海; 徐双双
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-12-14
Filing date: 2020-12-14
Publication date: 2021-03-26

Abstract

The invention provides a facial nerve paralysis rehabilitation detection system based on artificial intelligence, which comprises: a facial three-dimensional model construction module for constructing a three-dimensional model of the patient's face from the acquired facial image of the patient; a facial three-dimensional model calibration module for calibrating a three-dimensional model of the patient's face based on the acquired thermal image of the patient's face; a three-dimensional model parameter prediction module for predicting a next frame of the three-dimensional model based on the thermal imaging map of the patient's face and parameters of the three-dimensional model; the facial muscle detection module is used for obtaining the responsiveness of facial muscles based on the obtained myoelectric current response sequence and obtaining the motion degree of the facial muscles based on a three-dimensional model obtained through prediction; and the rehabilitation degree detection module is used for obtaining the rehabilitation degree of the patient based on the motion degree of the facial muscles and the responsiveness of the facial muscles. The system can accurately quantify the rehabilitation degree of the patient and assist the patient to carry out rehabilitation training.

Description

Facial nerve paralysis rehabilitation detection system based on artificial intelligence

Technical Field

The invention relates to the field of medical treatment and artificial intelligence, in particular to a facial paralysis rehabilitation detection system based on artificial intelligence.

Background

At present, most methods for detecting facial nerve paralysis diseases use methods of facial semantic segmentation and key point detection or analysis methods based on facial symmetry, but the methods cannot completely analyze and extract the characteristics of facial muscles of patients, and accuracy and efficiency are lacked. For example, patent application publication No. CN111553250A proposes a method for detecting facial paralysis based on analyzing movement of each region of the face, in which all facial key points in each motion sequence are analyzed, but the key points described in the invention do not describe more detailed movement characteristics of facial muscles.

Disclosure of Invention

In order to solve the above problems, the present invention provides an artificial intelligence-based facial paralysis rehabilitation detection system, which comprises:

the facial three-dimensional model building module is used for building a three-dimensional model of the face of the patient according to the collected facial image of the patient;

the facial three-dimensional model calibration module is used for calibrating the three-dimensional model of the face of the patient based on the acquired thermal imaging image of the face of the patient;

the three-dimensional model parameter prediction module is used for predicting a next frame of three-dimensional model based on the thermal imaging image of the face of the patient and the parameters of the calibrated three-dimensional model;

the facial muscle detection module is used for obtaining the responsiveness of facial muscles based on the obtained myoelectric current response sequence and obtaining the motion degree of the facial muscles based on a three-dimensional model obtained through prediction; the method for acquiring the exercise degree of the facial muscles comprises the following steps:

selecting grid points from the ith frame three-dimensional model to obtain a grid point set p_iWith p_iGenerating Gaussian hot spots by taking each grid point as a center to obtain a Gaussian hot spot set

For Gaussian hot spot at a certain position in space

And (3) performing superposition processing:

H_i(x) Shows the result of superposition of the Gaussian hotspots, H_i-1(x) Showing the superposition result of the Gaussian hot spot before the current frame, alpha is a superposition coefficient,

the motion degree of the facial muscle corresponding to the ith frame of three-dimensional model is as follows:

M_i(x) For the coefficient of interest, M_i(x) Is 0 or 1, x belongs to R³Representing x as belonging to a three-dimensional real space;

and the rehabilitation degree detection module is used for obtaining the rehabilitation degree of the patient based on the motion degree of the facial muscles and the responsiveness of the facial muscles.

The calibration of the three-dimensional model of the face of the patient is specifically as follows:

obtaining a first semantic graph of facial features of a patient according to a three-dimensional model of the face of the patient, and obtaining a second semantic graph of the facial features of the patient through a semantic segmentation network by using a thermal imaging graph of the face of the patient; the facial features include eyebrows, eyes, nose, mouth;

for any one of the facial features: acquiring a first semantic region and a second semantic region of the facial feature according to the first semantic map and the second semantic map, calculating the intersection ratio of the first semantic region and the second semantic region, and if the intersection ratio is greater than a threshold value, the facial feature region in the three-dimensional model does not need to be calibrated; otherwise, respectively acquiring central points of the first semantic region and the second semantic region, projecting the two central points into the three-dimensional model to obtain a space coordinate q₁、q₂According to q₁And q is₂Obtaining a vector q, and when the three-dimensional model is calibrated, enabling the facial feature region in the three-dimensional model to move along the direction of the vector q until the cross-over ratio is greater than a threshold value, wherein the moving step length of each time is d; in particular, the amount of the solvent to be used,

b is a hyperparameter, d₀Representing the distance between two pixels with the farthest distance in the second semantic region, A₂Representing the area of the second semantic region, A₃Representing according to a first semantic regionThe area of a third semantic region obtained by the domain and the second semantic region;

the third semantic area is obtained by the following steps: and acquiring an overlapping region of the first semantic region and the second semantic region, and removing the overlapping region from the second semantic region to obtain a third semantic region.

The predicting of the next frame three-dimensional model specifically comprises:

and sending the parameters of the multi-frame second semantic graph and the current frame three-dimensional model into a deep neural network, and outputting the parameters of the next frame three-dimensional model after processing.

The neural network comprises an encoder, a first full-connection layer and a second full-connection layer, wherein the input of the encoder is a diagram formed by overlapping a plurality of frames of second semantic diagrams, the output of the encoder is a characteristic diagram, the input of the first full-connection layer is a characteristic diagram, the output of the first full-connection layer is a high-dimensional vector, the length of the high-dimensional vector is consistent with the length of a three-dimensional model parameter, the high-dimensional vector is added with the parameter of a current frame three-dimensional model and then input into the second full-connection layer, the parameter of a next frame of three-dimensional model is output after processing, and the next frame of three-dimensional model can be obtained.

The obtained responsivity of the facial muscles based on the obtained myoelectric current response sequence is specifically as follows:

obtaining electromyographic current sequence I ═ { I ═ at a certain spatial position in the three-dimensional model₁,I₂,I₃,…,I_m,…,I_nN is the sequence length, the electromyographic current sequence is filtered, the length of a filtering window is l, and the filtering result is that delta I is equal to { delta I ═ I₁,ΔI₂,ΔI₃,…,ΔI_mM is the length of the filtering result, Δ I_j＝max(I_j,I_j+1,…,I_j+l-1)-min(I_j,I_j+1,…,I_j+l-1) J has a value range of [1, m]Max is a function for taking the maximum value, min is a function for taking the minimum value; the degree of myoelectric current response at that location is

top_k(Δ I) means that the first k data are selected after sorting Δ I in descending order, mean is the averaging function,

the mean value of the electromyographic current sequence I is shown;

the responsiveness of the facial muscles is then:

c represents the responsiveness of facial muscles, w (x) is a weight coefficient, w (x) and H_i(x) Is in positive correlation.

The patient's recovery degree is:

z represents the degree of rehabilitation of the patient, gamma and beta are hyper-parameters, s₀Representing the controllable facial muscle area of a normal person, s representing the uncontrollable facial muscle area of a patient, L_fThe degree of movement of the facial muscles obtained when the value of i is f.

The invention has the beneficial effects that:

1. the method obtains the motion degree of the facial muscles of the patient by analyzing the grid points in the three-dimensional model of the face of the patient, wherein the motion degree of the facial muscles is obtained by superposing the spatial heat and is used for representing the motion range, the motion speed, the motion amplitude and the like of the facial muscles; in addition, the present invention combines the varying characteristics of muscle current to obtain a patient's facial muscle responsiveness to muscle current that characterizes a tendency for muscle recovery to be good. The rehabilitation degree of the patient is obtained by fusing the muscle movement degree and the responsiveness of the patient, so that the rehabilitation condition of the patient can be quantized, and the movement characteristic and the muscle current characteristic of the muscle are combined during the quantization result, so that the quantization result is accurate and reliable, and the method has an important auxiliary value for the rehabilitation training of the patient.

2. The method corrects the constructed three-dimensional model of the face of the patient based on the thermal imaging image of the face of the patient, so that the obtained three-dimensional model is more accurate, and predicts the three-dimensional model of the face of the patient of the next frame based on the multi-frame thermal imaging image, thereby reducing the power consumption of the system and avoiding frequently using a three-dimensional face reconstruction network with complex parameters and large calculation amount.

Drawings

FIG. 1 is a block diagram of the system of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, the following detailed description will be given with reference to the accompanying examples.

The diagnosis of patients with facial neuroparalysis diseases mainly depends on detecting the motor function of facial muscles. If the motor function of the facial muscles of the patient gradually becomes stronger, the patient is in slow rehabilitation, otherwise, the patient has no tendency of rehabilitation.

The invention aims to detect the rehabilitation degree of a patient by detecting the facial muscle movement condition of the patient with the facial nerve paralysis disease and combining the change characteristics of facial muscle current; the system structure is shown in fig. 1, and comprises a face three-dimensional model building module for building a three-dimensional model of the face of a patient according to the acquired face image of the patient; a facial three-dimensional model calibration module for calibrating a three-dimensional model of the patient's face based on the acquired thermal image of the patient's face; a three-dimensional model parameter prediction module for predicting a next frame of three-dimensional model based on the thermal imaging image of the patient's face and the parameters of the current frame of three-dimensional model; the facial muscle detection module is used for obtaining the responsiveness of facial muscles based on the obtained myoelectric current response sequence and obtaining the motion degree of the facial muscles based on a three-dimensional model obtained through prediction; and the rehabilitation degree detection module is used for obtaining the rehabilitation degree of the patient based on the motion degree of the facial muscles and the responsiveness of the facial muscles.

Example (b):

the face of a patient is opposite to the camera, RGB image data of the face of the patient are collected, and a 3D model of the face of the patient is obtained by the face three-dimensional model building module, namely 3D reconstruction of the face. There are many methods for 3D reconstruction of human face, such as PRNet, VRNet, 2DASL, etc., and embodiments use the 2DASL method to obtain a 3DMM model of the face, i.e., a 3D mesh of the face.

2DASL is a public human face three-dimensional reconstruction method, the input of which is a human face image and a special single-channel image, the pixel value of the single-channel image at the key point of the human face is 1, and the rest positions are-1; and outputting parameters of the 3DMM model, wherein the parameters act on the 3DMM model to obtain a face 3D mesh corresponding to the face image. The 3D DMM model is a deformable 3D face 3D mesh, and the shape and the expression of the face 3D mesh are adjusted by changing some parameters, so that the 3D mesh can be deformed into different faces, and the purpose of face 3D reconstruction is achieved.

Because the data sets used by the 2DASL three-dimensional face reconstruction method are all normal faces, when the method is used for reconstructing the face 3D mesh of a normal person, the result is relatively accurate, but when the method is used for reconstructing the face 3D mesh of a facial paralysis patient, a large error may exist, because all the data sets for performing three-dimensional face reconstruction are the data sets of the normal faces, and because of reasons such as privacy protection, the face RGB image data of the facial paralysis patient are very rare, and an accurate three-dimensional face model of the facial paralysis patient cannot be obtained; it is therefore desirable to reduce such errors before patient recovery can be achieved.

Although the face image data of the facial nerve palsy patient is protected by privacy and cannot acquire a large number of data sets, the face thermal imaging image of the patient does not reveal the privacy of the patient and can be acquired in a large number to form the data sets; however, the face thermal imaging image cannot describe the details of the face, and the 3D mesh of the face cannot be reconstructed directly according to the face thermal imaging image, so that the invention obtains the positions and the areas of the five sense organs of the face according to the face thermal imaging image of the patient, and then corrects the 3D mesh of the face obtained by the 2DASL according to the positions and the areas of the five sense organs, wherein the five sense organs are specifically the nose, the mouth, the eyes and the eyebrows.

The size of the whole shape of the face of a facial nerve paralysis patient is the same as that of a normal person, only the difference between five sense organs and the expression is large, so that although an error exists in the obtained 3D mesh of the face of the patient, the shape and the size of the mesh are accurate, the obtained 3D mesh of the face can be accurately aligned with the face on an image, and only grid points at the five sense organs or muscles have errors.

The specific steps of calibrating the three-dimensional model of the face of the patient based on the thermal imaging image of the face of the patient in the face three-dimensional model calibration module are as follows:

as is well known, facial muscles of a human face correspond to a plurality of grid points on a 3D mesh, a grid corresponding to five sense organs is obtained on the 3D mesh, and the grid is projected onto an image plane, so as to obtain a region of the five sense organs on the 3D mesh corresponding to an image, that is, a first semantic graph of the five sense organs on the 3D mesh.

And inputting the facial thermal imaging image of the patient into a semantic segmentation network to obtain a second semantic image of the facial features of the patient. Common semantic segmentation networks include DeepLabV3, Mask-RCNN, etc., and embodiments use the DeepLabV3 network to obtain semantic graphs.

And comparing and calculating the first semantic graph of the five sense organs on the 3D mesh with the second semantic graph obtained according to the facial thermal imaging graph, so as to obtain the error between the 3D mesh and the real patient face.

The semantic graph comprises several facial features of a nose, a mouth, eyes and eyebrows, the embodiment takes the mouth as an example, and the calibration process is explained, wherein firstly, the error between the 3D mesh and the real patient face is calculated:

the mouth corresponds to a first semantic region in the first semantic map and corresponds to a second semantic region in the second semantic map, it should be noted that the first semantic region and the second semantic region are binary mask maps, the pixel value of the mouth region is 1, and the pixel values of other regions are 0; calculating an intersection ratio IoU of the first semantic region and the second semantic region, and when IoU is greater than a threshold value of 0.9, indicating that the first semantic region and the second semantic region are basically overlapped, namely the error of the mouth region on the 3D mesh of the face of the patient is small; when IoU is less than or equal to the threshold value of 0.9, it is indicated that the overlapping area of the first semantic area and the second semantic area is small, that is, the error of the mouth area on the 3D mesh of the patient face is large, calibration is required, that is, the mouth movement is controlled by iterating and continuously adjusting the parameters of the face 3DMM model, and when IoU is detected to be greater than 0.9, the iteration is stopped, and at this time, the obtained position or posture of the mouth is accurate; specifically, how to reduce the iteration times needs to be considered during iteration to quickly calibrate the mouth to an accurate position, and the invention provides the following calibration method:

obtaining a first semantic region E₁And a second semantic area E₂The second semantic region removes the overlapping region to obtain a third semantic region E₃Specifically: e₃＝max(E₂-E₁0), max is taken to be the maximum value, which is indicated at E₂In (E)₁And E₂The pixel value of the overlapping connected component area of (2) is set to 0.

Respectively obtaining central points of the first semantic area and the second semantic area, projecting the two central points into the three-dimensional model to obtain a space coordinate q₁、q₂According to q₁And q is₂Obtaining a vector q, and when the three-dimensional model is calibrated, enabling a mouth region in the three-dimensional model to move along the direction of the vector q until the cross-over ratio is greater than a threshold value, wherein the moving step length of each time is d; specifically, the method comprises the following steps:

denotes the length of the vector q, d₀Representing the Euclidean distance between two pixels with the farthest distance in the second semantic region, A₂Representing the area of the second semantic region, A₃Representing the area of the third semantic region,

characterized by E₁And E₂The greater the ratio, the greater the degree of non-overlap of (A) is, the more E is₁And E₂The smaller the overlap between them, the larger the step d; b is a hyperparameter determined by the practitioner, in which b is the angle between the left and right corners of the 3D meshThe euclidean distance between them.

The calibration of the mouth position can be completed quickly by using the method, and the calibration method of the nose, eyes and eyebrow positions is similar to the method, so that the invention is not repeated.

And finishing the calibration of the three-dimensional face model.

The invention needs to detect the movement of muscles to predict the rehabilitation condition of the patient, needs the patient to make some expression actions including but not limited to smiling, mouth opening and the like to face the camera, and therefore, each frame of image needs to be subjected to three-dimensional reconstruction of the face, but the DNN network scale of the three-dimensional reconstruction of the face is large, the parameter quantity is large, and the calculation resources are consumed. Considering that the patient does not have too large facial movement when using the invention and the muscle control capability of the patient on the face is weaker, the face of the patient does not have large muscle movement change, so the invention obtains the trend of the face or the muscle movement of the face through the second semantic graph, the 3D mesh of the face of the current frame adjusts the mesh shape according to the movement trend, and further constructs the 3D mesh of the face of the next frame, and the purpose of doing so is to reduce the power consumption.

The specific steps of the three-dimensional model prediction of the next frame part in the three-dimensional model parameter prediction module are as follows:

and sending the parameters of the multi-frame second semantic graph and the three-dimensional model of the face of the patient at the current frame into a deep neural network, and outputting the parameters of the three-dimensional model at the next frame after processing.

The deep neural network comprises an encoder, a first full-connection layer and a second full-connection layer, wherein the input of the encoder is a diagram formed by overlapping multiple frames of second semantic diagrams including a current frame, in the embodiment, 5 frames of second semantic diagrams are overlapped to form a diagram with 5 channels as the input of the encoder, the output is a characteristic diagram, the input of the first full-connection layer is a characteristic diagram, the output is a high-dimensional vector, the length of the high-dimensional vector is consistent with the length of a three-dimensional model parameter, the high-dimensional vector is added with the parameter of the three-dimensional model of the current frame and then input into the second full-connection layer, and the parameter of the three-dimensional model of the next frame is output after processing.

The training data set of the deep neural network can be simply and efficiently acquired by a simulator in a large quantity, which is also a design advantage of the deep neural network, and the acquisition method of the data set is as follows:

in a three-dimensional simulator, for example, maya and 3Dmax, parameters of a 3DMM model are adjusted, so that the 3DMM model simulates facial postures of different facial nerve paralysis disease patients, and the 3DMM model performs different actions and facial expressions, such as face rotation, smiling, mouth opening and the like; in addition, grid areas of five sense organs are marked on the 3DMM model in advance; the method comprises the steps that animation video data of a 3DMM model are collected by a virtual camera, each frame of image in the video only has a semantic area of five sense organs and does not contain other information. The obtained semantic area image data of the multiframe five sense organs is input data of an encoder; and acquiring 3DMM model parameters corresponding to each frame of acquired image data, wherein the parameters are one of input data of the second full-connection layer and label data.

Thus, input data and label data are obtained, a loss value of the network is calculated by adopting a mean square error loss function, and the gradient of the network model is updated by adopting a random gradient descent algorithm.

The deep neural network has the advantages of small calculated amount, uncomplicated model and low power consumption. So far, the prediction of the next frame of three-dimensional model can be completed; in order to reduce errors, the method reconstructs the 3D mesh of the human face by reusing the 2DASL model every 10 frames.

Next, the motion degree of the facial muscles and the responsiveness of the facial muscles are obtained at a facial muscle detection module:

when expression action is carried out, facial muscles can move, grid points of 3D mesh of the human face can move along with the muscles, only the orbicularis oris muscle, the levator labialis, the masseter muscle and the buccinator muscle of the face are concerned, therefore, only the grid points corresponding to the muscles and the grid points corresponding to the positions of several facial features are concerned, and the subsequent grid points refer to the grid points at the positions.

Selecting grid points from the three-dimensional model corresponding to the ith frame facial image to obtain a set p of coordinates of a plurality of grid points_i' considering that the human face may move when the image is acquired, such as the small-range rotation of the human face or the small-amplitude head-lifting action, which may affect the subsequent calculation, considerThe position of the grid point of the eyebrow center is only related to the whole position of the face and can not change along with the movement of muscles, and the coordinates of the grid point of the eyebrow center are set as

Then

I.e. p'_iThe abscissa of each grid point of (1) minus the abscissa of the eyebrow center grid point, the ordinate of each grid point minus the ordinate of the eyebrow center grid point, p_iIndependent of the overall movement of the face, only the movement of muscles.

With p_iGenerating Gaussian hot spots by taking each grid point as a center to obtain a Gaussian hot spot set

Finally, each three-dimensional grid point corresponds to a Gaussian hot spot, so each three-dimensional coordinate point in the space corresponds to a heat value, the heat value of the center of the hot spot is the highest and is 1.0, the heat value is gradually attenuated to zero from the center of the hot spot to the periphery, the diameter of the hot spot is d in the same way as the hot spot of a key point on an image₁Obtaining Euclidean distances between all adjacent grid points, and averaging all distances, in the embodiment d₁Is 0.8 times the mean value.

It is important to point out that

Each of the hot spots in (a) is continuously varied in three-dimensional space, and for convenience of description, is used

To represent a collection

Heat value at different spatial positions on a certain hot spot: x is any point in the three-dimensional space,

the heat value of the hot spot at x is shown, and if x is not a hot spot, the heat value is noted as 0.

Superimposing the hot spots at x over multiple frames:

H_i(x) Shows the result of superposition of the Gaussian hotspots, H_i-1(x) Showing the superposition result of the Gaussian hot spot before the current frame, wherein alpha is a superposition coefficient, in the embodiment, alpha is 0.05,

when the muscle motion amplitude is relatively large, the hot spot superposition result H of the grid points corresponding to the muscles_i(x) The heat value of (2) is lower, and the heat value of the superposition result is larger when the muscle movement speed or movement amplitude is smaller. The spatial position of the heat value superposition result close to 1.0 represents that the muscle at the position has no movement or has small movement degree, and the spatial position of the heat value close to 0 represents that the muscle at the position has strong movement capability.

For subsequent analysis and calculation, the invention needs to obtain the motion area of each grid point, and introduces the attention coefficient M_i(x)，M_i(x) Representing the motion area of all the grid points in the three-dimensional model of the ith frame is a spatial mask, and if the three-dimensional space position x belongs to the motion area of the grid point, M is_i(x) 1 is ═ 1; if the spatial position x is not the motion region of the grid point, M_i(x)＝0；M_i(x) The specific calculation method is as follows:

considering that the grid points are one point and the point has no concept of "area" in space, the present invention indicates the area of the grid point by the hot spot area of each grid point and indicates the motion area of the grid point by the motion area of the hot spot.

Because the heat value of the grid point corresponding to the hot spot is attenuated from the center to the periphery, the invention adjusts the attenuation of the heat value through the following mapping relationForce:

the purpose is to make the heat value with smaller heat value on the hot spot smaller and close to 0, so that the area of the hot spot is reduced a little;

the heat value after mapping at the space position x in the three-dimensional model of the ith frame, namely the heat spot distribution of all the grid points in the space, is shown.

As is well known, the motion region M of all grid points in the three-dimensional model of the current ith frame_i(x) Is the motion region M of all grid points in the previous frame of three-dimensional model_i-1(x) The position areas of all grid points in the current ith frame three-dimensional model

The union of (a) and (b), thus:

m obtained at this time_i(x) Is not binary, so it is necessary to pair M_i(x) Carrying out binarization treatment:

M_i(x) When greater than 0.01, M_i(x) Reassign value to 1, M_i(x) When it is 0.01 or less, M_i(x) The value is reassigned to 0. M_i(x) The spatial region having a value of 1 is a motion region of the grid point.

M_i(x) Is taken as0 or 1, x ∈ R³The representation x belongs to a three-dimensional real number space, and x represents the position of the grid point in the three-dimensional space.

H_i(x) Is the superposition result of the heat of the hot spot at the position x in the space, the points are continuous, and the grid points with large motion degree indicate that the muscle motion degree is large, and the grid points with large motion degree are H_i(x) The smaller the heat value of the corresponding point in (1), otherwise, the larger the heat value, the maximum value is 1.0. M_i(x) Corresponding to a spatial mask, only points within the mask can participate in the calculation of the degree of muscle movement.

The integration is performed over the entire three-dimensional real space. The integration here is only to add up the values corresponding to spatially continuous points.

To this end, the degree of movement of the facial muscles, L, is obtained_iThe motor capacity of facial muscles is characterized, and the specific characteristics are the movement speed, the movement range and the like of the muscles in a three-dimensional space.

L_iThe larger the patient is, the greater the degree of rehabilitation of the patient is indicated; in the embodiment, the movement degree of the facial muscle obtained when the value of i is f is the movement degree L of the facial muscle for subsequent calculation of the rehabilitation degree of the patient_f(ii) a The value of f in the examples is 100.

But for some muscles of the face that do not move much or do not move, L_iThe degree of movement of these muscles cannot be accurately characterized, and for this case, the recovery degree of the muscles needs to be detected by muscle current in an auxiliary way; generally, when a large change in muscle current is detected, it indicates that the muscle is sensitive to exercise and tends to recover.

Specifically, the patient needs to use a sensor, such as an electromyography, to acquire the muscle current of the facial muscle of the patient, and the specific method is as follows:

the sensor is attached to the skin of the face of a patient, and the position of the sensor on the face can be obtained by using visual detection methods such as semantic segmentation, key point positioning and the like; the sensor is provided with a plurality of electricity sensing units, each electricity sensing unit can detect myoelectric current of muscles to enable a patient to do some expression actions, myoelectric currents of different positions of facial muscles of the patient are fluctuated when the patient does the expression actions, and the position of each electricity sensing unit on a face can be obtained according to the position of the electricity sensing unit and the position of the sensor on the face. Finally, projecting the position to a human face 3D mesh; according to the position of the electricity sensing unit and the output current data, the muscle current of the corresponding position on the human face can be obtained.

In the above, only one way of obtaining the muscle current at different positions of the face of the patient is to obtain the muscle current at the corresponding positions, and the practitioner can obtain the muscle current at the corresponding positions by other embodiments.

Obtaining electromyographic current sequence I ═ { I ═ at a certain spatial position in the three-dimensional model₁,I₂,I₃,…,I_m,…,I_nN is the sequence length, the electromyographic current sequence is filtered, the filtering window length is l, the value of l in the embodiment is 10, and the filtering result is that Δ I ═ Δ I₁,ΔI₂,ΔI₃,…,ΔI_mM is the length of the filtering result, m<n; filtering result delta I obtained after each movement of filtering window_jIs the difference value of the maximum and minimum myoelectric current values in the filtering window, which reflects the variation amplitude of the myoelectric current at local time, specifically, Delta I_j＝max(I_j,I_j+1,…,I_j+l-1)-min(I_j,I_j+1,…,I_j+l-1) J has a value range of [1, m]Max is a function for taking the maximum value, min is a function for taking the minimum value; the degree of myoelectric current response at that location is

top_k(Delta I) means that after the Delta I is sequenced from big to small, the first k data are selected, mean is an averaging function,

is the mean value of the electromyographic current sequence I.

D (x) reflects the local fluctuation condition of the myoelectric current at the position x in the human face three-dimensional model and the average size of the myoelectric current. The present invention sets the myoelectric current response degree of a position where no myoelectric current is detected to 0.

The myoelectric current control method disclosed by the invention focuses more on the myoelectric current response degree of the muscle with smaller muscle movement degree, and the myoelectric current data has more research significance because the smaller muscle movement degree indicates that the control capability of the patient on the muscle is weaker. Therefore, it is necessary to assign a weight w (x) to the myoelectric current of the muscle at different positions, and the weight w (x) is based on the heat superposition result H_i(x) And the value of i is f:

epsilon and delta are hyperparameters, wherein epsilon is 8.6, delta is-0.4, w (x) and H_f(x) Is in positive correlation and w (x)>0, if the heat value is large, the muscle movement degree is small, the corresponding weight w (x) is large, and if the heat value is small, the muscle movement degree is large, the corresponding weight w (x) is small;

the responsiveness of the facial muscles is then:

c represents the responsiveness of facial muscles, w (x) is a weight coefficient, and the integral interval is the whole three-dimensional real space. And C represents the weighted summation of the muscle current response degree, and the muscle response degree with smaller motion degree is mainly concerned.

According to the degree of movement L of the facial muscles_fAnd the responsiveness C of the facial muscles to calculate the degree of rehabilitation of the patient:

z represents the degree of rehabilitation of the patient; gamma and beta are hyper-parameters, wherein the value of gamma is 1 and the value of beta is 0.1 in the embodiment; s₀Represents the controllable facial muscle area of a normal person; s represents the area of facial muscles that the patient cannot control; wherein

Is that

The result of the simplification of (2) is,

indicating the degree of myoelectric current response per unit area of damaged muscle,

the greater the degree of rehabilitation.

As previously described, H_i(x) The larger the size of the region, the smaller the degree of movement of the muscle, the weaker the movement ability, and since the movement ability of the grid points of these regions is weak, these regions are planar, while the region composed of the grid points having stronger movement ability is stereoscopic, and therefore the area of the planar region is related to s.

On the other hand, considering that the muscles with weak exercise capacity are not necessarily all damaged muscles, and some muscles are still not moved on the normal face, it is also necessary to obtain the area of the muscles that do not exercise on the normal face, and the specific method is as follows: obtaining the 3D mesh of the face of a normal person, calculating the heat superposition result of grid points on the 3D mesh, and specifically obtaining the average value H of the heat superposition results of a plurality of normal persons₀(x)；H₀(x) Is calculated by_f(x) The present invention is not explained much as the calculation method.

The calculation method of s is as follows:

when H is present_f(x) H is more than or equal to 0.85₁(x) Has a value of 1 when H_f(x)<0.85 hour, h₁(x) Is 0; when H is present₀(x) H is more than or equal to 0.98₂(x) Has a value of 0 when H₀(x)<0.98 hour, h₂(x) Has a value of 1; h is₁(x) Is to H_f(x) Performing a thresholding process to indicate areas of facial muscle damage in the patient, h₂(x) Indicating the area of the normal face excluding the non-moving muscles of the face, h₁(x)、h₂(x) Is a three-dimensional mask, s is equivalent to taking h₁(x) And h₂(x) The area of the intersection.

After the rehabilitation degree of the patient is obtained, a rehabilitation scheme can be appointed to the patient according to the change of the rehabilitation degree of the patient.

The above description is intended to provide those skilled in the art with a better understanding of the present invention, and is not intended to limit the present invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention.

Claims

1. A facial nerve palsy rehabilitation detection system based on artificial intelligence, characterized in that the system comprises: the facial three-dimensional model building module is used for building a three-dimensional model of the face of the patient according to the collected facial image of the patient;

For Gaussian hot spot at a certain position in space

And (3) performing superposition processing:

2. The system of claim 1, wherein the calibrating the three-dimensional model of the patient's face is specifically:

for any one of the facial features: acquiring a first semantic region and a second semantic region of the facial feature according to the first semantic map and the second semantic map, calculating the intersection ratio of the first semantic region and the second semantic region, and if the intersection ratio is greater than a threshold value, the facial feature region in the three-dimensional model does not need to be calibrated; otherwise, respectively acquiring the central points of the first semantic region and the second semantic region, and dividing the two regionsProjecting the center point to a three-dimensional model to obtain a space coordinate q₁、q₂According to q₁And q is₂Obtaining a vector q, and when the three-dimensional model is calibrated, enabling the facial feature region in the three-dimensional model to move along the direction of the vector q until the cross-over ratio is greater than a threshold value, wherein the moving step length of each time is d; in particular, the amount of the solvent to be used,

b is a hyperparameter, d₀Representing the distance between two pixels with the farthest distance in the second semantic region, A₂Representing the area of the second semantic region, A₃Representing the area of a third semantic region obtained according to the first semantic region and the second semantic region;

3. The system of claim 2, wherein the predicting the three-dimensional model of the next frame is specifically:

4. The system of claim 3, wherein the neural network comprises an encoder, a first fully connected layer and a second fully connected layer, the input of the encoder is a diagram formed by overlapping multiple frames of second semantic diagrams, the output of the encoder is a feature diagram, the input of the first fully connected layer is a feature diagram, the output of the first fully connected layer is a high-dimensional vector, the length of the high-dimensional vector is consistent with the length of a three-dimensional model parameter, the high-dimensional vector is added with a parameter of a three-dimensional model of a current frame and then input into the second fully connected layer, a parameter of a three-dimensional model of a next frame is output after processing, and a three-dimensional model of the next frame can be obtained according to the output parameter.

5. System according to claim 4, characterized in that the derivation of the responsiveness of the facial muscles based on the acquired electromyographic response sequences is in particular:

obtaining electromyographic current sequence I ═ { I ═ at a certain spatial position in the three-dimensional model₁，I₂，I₃，...，I_m，...，I_nN is the sequence length, the electromyographic current sequence is filtered, the length of a filtering window is l, and the filtering result is that delta I is equal to { delta I ═ I₁，ΔI₂，ΔI₃，...，ΔI_mM is the length of the filtering result, Δ I_j＝max(I_j，I_j+1，...，I_j+l-1)-min(I_j，I_j+1，...，I_j+l-1) J has a value range of [1, m]Max is a function for taking the maximum value, min is a function for taking the minimum value; the degree of myoelectric current response at that location is

the mean value of the electromyographic current sequence I is shown;

the responsiveness of the facial muscles is then:

6. The system of claim 5, wherein the patient's degree of rehabilitation is: