CN112562850A - Facial nerve paralysis rehabilitation detection system based on artificial intelligence - Google Patents


Info

Publication number
CN112562850A
CN112562850A
Authority
CN
China
Prior art keywords
dimensional model
facial
patient
semantic
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011466952.4A
Other languages
Chinese (zh)
Inventor
黄振海
徐双双
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN202011466952.4A priority Critical patent/CN112562850A/en
Publication of CN112562850A publication Critical patent/CN112562850A/en
Withdrawn legal-status Critical Current

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/60 - Type of objects
    • G06V 20/64 - Three-dimensional objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G06V 40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Abstract

The invention provides a facial nerve paralysis rehabilitation detection system based on artificial intelligence, which comprises: a facial three-dimensional model construction module for constructing a three-dimensional model of the patient's face from the acquired facial image of the patient; a facial three-dimensional model calibration module for calibrating the three-dimensional model of the patient's face based on the acquired thermal imaging image of the patient's face; a three-dimensional model parameter prediction module for predicting the next frame of the three-dimensional model based on the thermal imaging image of the patient's face and the parameters of the three-dimensional model; a facial muscle detection module for obtaining the responsiveness of the facial muscles based on the acquired myoelectric current response sequence and obtaining the motion degree of the facial muscles based on the predicted three-dimensional model; and a rehabilitation degree detection module for obtaining the rehabilitation degree of the patient based on the motion degree and the responsiveness of the facial muscles. The system can accurately quantify the rehabilitation degree of the patient and assist the patient in rehabilitation training.

Description

Facial nerve paralysis rehabilitation detection system based on artificial intelligence
Technical Field
The invention relates to the field of medical treatment and artificial intelligence, in particular to a facial paralysis rehabilitation detection system based on artificial intelligence.
Background
At present, most methods for detecting facial nerve paralysis rely on facial semantic segmentation and key point detection, or on analysis of facial symmetry, but these methods cannot fully analyze and extract the characteristics of the patient's facial muscles and lack accuracy and efficiency. For example, patent application publication No. CN111553250A proposes a facial paralysis detection method based on analyzing the movement of each facial region, in which all facial key points in each motion sequence are analyzed; however, the key points described in that invention do not capture the finer movement characteristics of the facial muscles.
Disclosure of Invention
In order to solve the above problems, the present invention provides an artificial intelligence-based facial paralysis rehabilitation detection system, which comprises:
the facial three-dimensional model building module is used for building a three-dimensional model of the face of the patient according to the collected facial image of the patient;
the facial three-dimensional model calibration module is used for calibrating the three-dimensional model of the face of the patient based on the acquired thermal imaging image of the face of the patient;
the three-dimensional model parameter prediction module is used for predicting a next frame of three-dimensional model based on the thermal imaging image of the face of the patient and the parameters of the calibrated three-dimensional model;
the facial muscle detection module is used for obtaining the responsiveness of the facial muscles based on the obtained myoelectric current response sequence, and obtaining the motion degree of the facial muscles based on the three-dimensional model obtained through prediction; the method for acquiring the motion degree of the facial muscles comprises the following steps:
selecting grid points from the ith frame three-dimensional model to obtain a grid point set p_i; generating a Gaussian hot spot with each grid point in p_i as a center to obtain a Gaussian hot spot set, whose heat value at a spatial position x is denoted G_i(x);
superposing the Gaussian hot spots at a certain position x in space:
H_i(x) = α · G_i(x) + (1 − α) · H_{i−1}(x)
where H_i(x) denotes the superposition result of the Gaussian hot spots, H_{i−1}(x) denotes the superposition result of the Gaussian hot spots before the current frame, and α is a superposition coefficient, 0 < α < 1;
the motion degree of the facial muscles corresponding to the ith frame three-dimensional model is:
L_i = ∫_{x ∈ R³} M_i(x) · (1 − H_i(x)) dx
where M_i(x) is the attention coefficient, M_i(x) is 0 or 1, and x ∈ R³ indicates that x belongs to the three-dimensional real space;
and the rehabilitation degree detection module is used for obtaining the rehabilitation degree of the patient based on the motion degree of the facial muscles and the responsiveness of the facial muscles.
The calibration of the three-dimensional model of the patient's face is specifically as follows:
obtaining a first semantic map of the patient's facial features according to the three-dimensional model of the patient's face, and obtaining a second semantic map of the patient's facial features by passing the thermal imaging image of the patient's face through a semantic segmentation network; the facial features include the eyebrows, eyes, nose and mouth;
for any one of the facial features: acquiring a first semantic region and a second semantic region of the facial feature according to the first semantic map and the second semantic map, and calculating the intersection-over-union ratio of the first semantic region and the second semantic region. If the intersection-over-union ratio is greater than a threshold value, the facial feature region in the three-dimensional model does not need to be calibrated; otherwise, the central points of the first semantic region and the second semantic region are respectively acquired and projected into the three-dimensional model to obtain spatial coordinates q_1 and q_2, from which a vector q is obtained. When the three-dimensional model is calibrated, the facial feature region in the three-dimensional model is moved along the direction of the vector q, with a step length of d per move, until the intersection-over-union ratio is greater than the threshold value. Specifically,
d = (A_3 / A_2) · (d_0 / b) · ‖q‖
where b is a hyperparameter, d_0 denotes the distance between the two farthest-apart pixels in the second semantic region, A_2 denotes the area of the second semantic region, and A_3 denotes the area of a third semantic region obtained from the first semantic region and the second semantic region;
the third semantic region is obtained as follows: the overlapping region of the first semantic region and the second semantic region is acquired, and the overlapping region is removed from the second semantic region to obtain the third semantic region.
The prediction of the next frame three-dimensional model specifically comprises:
sending the multi-frame second semantic maps and the parameters of the current frame three-dimensional model into a deep neural network, which outputs the parameters of the next frame three-dimensional model after processing.
The neural network comprises an encoder, a first fully-connected layer and a second fully-connected layer. The input of the encoder is an image formed by stacking multiple frames of second semantic maps, and its output is a feature map. The input of the first fully-connected layer is the feature map, and its output is a high-dimensional vector whose length is consistent with the length of the three-dimensional model parameters. The high-dimensional vector is added to the parameters of the current frame three-dimensional model and input into the second fully-connected layer, which outputs the parameters of the next frame three-dimensional model after processing, from which the next frame three-dimensional model can be obtained.
The responsiveness of the facial muscles obtained based on the acquired myoelectric current response sequence is specifically as follows:
obtaining the electromyographic current sequence I = {I_1, I_2, I_3, …, I_m, …, I_n} at a certain spatial position in the three-dimensional model, where n is the sequence length; filtering the electromyographic current sequence with a filtering window of length l to obtain the filtering result ΔI = {ΔI_1, ΔI_2, ΔI_3, …, ΔI_m}, where m is the length of the filtering result, ΔI_j = max(I_j, I_{j+1}, …, I_{j+l−1}) − min(I_j, I_{j+1}, …, I_{j+l−1}), the value range of j is [1, m], max is the maximum-value function and min is the minimum-value function. The degree of myoelectric current response at that location is
r(x) = mean(topk(ΔI)) / Ī
where topk(ΔI) means selecting the first k data after sorting ΔI in descending order, mean is the averaging function, and Ī denotes the mean value of the electromyographic current sequence I;
the responsiveness of the facial muscles is then:
C = ∫_{x ∈ R³} w(x) · r(x) dx
where C denotes the responsiveness of the facial muscles and w(x) is a weight coefficient positively correlated with H_i(x).
The rehabilitation degree of the patient is:
Z = γ · (1 − s / s_0) · L_f + β · C
where Z denotes the rehabilitation degree of the patient, γ and β are hyperparameters, s_0 denotes the controllable facial muscle area of a normal person, s denotes the uncontrollable facial muscle area of the patient, and L_f is the motion degree of the facial muscles obtained when i takes the value f.
The invention has the beneficial effects that:
1. The invention obtains the motion degree of the patient's facial muscles by analyzing the grid points in the three-dimensional model of the patient's face; the motion degree is obtained by superposing spatial heat values and characterizes the motion range, motion speed, motion amplitude and so on of the facial muscles. In addition, the invention combines the varying characteristics of the muscle current to obtain the responsiveness of the patient's facial muscles to the muscle current, which characterizes a favorable tendency of muscle recovery. The rehabilitation degree of the patient is obtained by fusing the muscle motion degree and responsiveness of the patient, so that the rehabilitation condition of the patient can be quantified; since the quantification combines the motion characteristics and the current characteristics of the muscles, the result is accurate and reliable and has important auxiliary value for the patient's rehabilitation training.
2. The method corrects the constructed three-dimensional model of the face of the patient based on the thermal imaging image of the face of the patient, so that the obtained three-dimensional model is more accurate, and predicts the three-dimensional model of the face of the patient of the next frame based on the multi-frame thermal imaging image, thereby reducing the power consumption of the system and avoiding frequently using a three-dimensional face reconstruction network with complex parameters and large calculation amount.
Drawings
FIG. 1 is a block diagram of the system of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the following detailed description will be given with reference to the accompanying examples.
The diagnosis of patients with facial nerve paralysis mainly depends on detecting the motor function of the facial muscles. If the motor function of the patient's facial muscles gradually becomes stronger, the patient is slowly recovering; otherwise, the patient shows no tendency toward rehabilitation.
The invention aims to detect the rehabilitation degree of a patient by detecting the facial muscle movement condition of the patient with the facial nerve paralysis disease and combining the change characteristics of facial muscle current; the system structure is shown in fig. 1, and comprises a face three-dimensional model building module for building a three-dimensional model of the face of a patient according to the acquired face image of the patient; a facial three-dimensional model calibration module for calibrating a three-dimensional model of the patient's face based on the acquired thermal image of the patient's face; a three-dimensional model parameter prediction module for predicting a next frame of three-dimensional model based on the thermal imaging image of the patient's face and the parameters of the current frame of three-dimensional model; the facial muscle detection module is used for obtaining the responsiveness of facial muscles based on the obtained myoelectric current response sequence and obtaining the motion degree of the facial muscles based on a three-dimensional model obtained through prediction; and the rehabilitation degree detection module is used for obtaining the rehabilitation degree of the patient based on the motion degree of the facial muscles and the responsiveness of the facial muscles.
Example (b):
the face of a patient is opposite to the camera, RGB image data of the face of the patient are collected, and a 3D model of the face of the patient is obtained by the face three-dimensional model building module, namely 3D reconstruction of the face. There are many methods for 3D reconstruction of human face, such as PRNet, VRNet, 2DASL, etc., and embodiments use the 2DASL method to obtain a 3DMM model of the face, i.e., a 3D mesh of the face.
2DASL is a public three-dimensional face reconstruction method. Its input is a face image and a special single-channel image whose pixel value is 1 at the face key points and −1 elsewhere; its output is the parameters of the 3DMM model, which act on the 3DMM model to obtain the face 3D mesh corresponding to the face image. The 3DMM model is a deformable face 3D mesh: by changing certain parameters, the shape and expression of the face 3D mesh are adjusted, so that the 3D mesh can be deformed into different faces, achieving the purpose of face 3D reconstruction.
Because the data sets used by the 2DASL three-dimensional face reconstruction method all contain normal faces, the result is relatively accurate when reconstructing the face 3D mesh of a normal person, but a large error may exist when reconstructing the face 3D mesh of a facial paralysis patient: all available data sets for three-dimensional face reconstruction are of normal faces, and for reasons such as privacy protection, facial RGB image data of facial paralysis patients are very scarce, so an accurate three-dimensional face model of a facial paralysis patient cannot be obtained directly. This error therefore needs to be reduced before the patient's rehabilitation degree can be detected.
Although the facial image data of facial nerve paralysis patients are privacy-protected and large data sets cannot be acquired, facial thermal imaging images of patients do not reveal patient privacy and can be collected in large quantities to form data sets. However, a facial thermal imaging image cannot describe the details of the face, and the face 3D mesh cannot be reconstructed directly from it. The invention therefore obtains the positions and areas of the facial features from the patient's facial thermal imaging image, and then corrects the face 3D mesh obtained by 2DASL according to these positions and areas; the facial features are specifically the nose, mouth, eyes and eyebrows.
The overall shape and size of a facial nerve paralysis patient's face are the same as those of a normal person; only the facial features and expression differ greatly. Therefore, although errors exist in the obtained 3D mesh of the patient's face, the shape and size of the mesh are accurate, the obtained face 3D mesh can be accurately aligned with the face in the image, and only the grid points at the facial features or muscles have errors.
The specific steps of calibrating the three-dimensional model of the face of the patient based on the thermal imaging image of the face of the patient in the face three-dimensional model calibration module are as follows:
as is well known, facial muscles of a human face correspond to a plurality of grid points on a 3D mesh, a grid corresponding to five sense organs is obtained on the 3D mesh, and the grid is projected onto an image plane, so as to obtain a region of the five sense organs on the 3D mesh corresponding to an image, that is, a first semantic graph of the five sense organs on the 3D mesh.
And inputting the facial thermal imaging image of the patient into a semantic segmentation network to obtain a second semantic image of the facial features of the patient. Common semantic segmentation networks include DeepLabV3, Mask-RCNN, etc., and embodiments use the DeepLabV3 network to obtain semantic graphs.
And comparing and calculating the first semantic graph of the five sense organs on the 3D mesh with the second semantic graph obtained according to the facial thermal imaging graph, so as to obtain the error between the 3D mesh and the real patient face.
The semantic maps comprise several facial features: the nose, mouth, eyes and eyebrows. The embodiment takes the mouth as an example to explain the calibration process. First, the error between the 3D mesh and the real patient face is calculated:
The mouth corresponds to a first semantic region in the first semantic map and a second semantic region in the second semantic map. It should be noted that the first and second semantic regions are binary mask maps: the pixel value of the mouth region is 1, and the pixel values of other regions are 0. The intersection-over-union ratio IoU of the first and second semantic regions is calculated. When IoU is greater than the threshold value of 0.9, the two regions essentially coincide, i.e. the error of the mouth region on the 3D mesh of the patient's face is small. When IoU is less than or equal to the threshold value of 0.9, the overlapping area of the two regions is small, i.e. the error of the mouth region on the 3D mesh of the patient's face is large and calibration is required: the parameters of the face 3DMM model are iteratively adjusted to control the mouth movement, and the iteration stops when IoU is detected to be greater than 0.9, at which point the obtained position and posture of the mouth are accurate. During iteration, how to reduce the number of iterations must be considered so that the mouth is quickly calibrated to an accurate position; the invention provides the following calibration method:
obtaining a first semantic region E1And a second semantic area E2The second semantic region removes the overlapping region to obtain a third semantic region E3Specifically: e3=max(E2-E10), max is taken to be the maximum value, which is indicated at E2In (E)1And E2The pixel value of the overlapping connected component area of (2) is set to 0.
The central points of the first and second semantic regions are respectively acquired and projected into the three-dimensional model to obtain spatial coordinates q_1 and q_2, from which a vector q is obtained. When the three-dimensional model is calibrated, the mouth region in the three-dimensional model is moved along the direction of the vector q, with a step length of d per move, until the intersection-over-union ratio is greater than the threshold value. Specifically:
d = (A_3 / A_2) · (d_0 / b) · ‖q‖
where ‖q‖ denotes the length of the vector q, d_0 denotes the Euclidean distance between the two farthest-apart pixels in the second semantic region, A_2 denotes the area of the second semantic region, and A_3 denotes the area of the third semantic region. The ratio A_3 / A_2 characterizes the degree of non-overlap of E_1 and E_2: the greater the ratio, the smaller the overlap between E_1 and E_2 and the larger the step length d. b is a hyperparameter determined by the practitioner; in this embodiment, b is the Euclidean distance between the left and right mouth corners on the 3D mesh.
The calibration of the mouth position can be completed quickly by using the method, and the calibration method of the nose, eyes and eyebrow positions is similar to the method, so that the invention is not repeated.
And finishing the calibration of the three-dimensional face model.
The invention needs to detect muscle movement to predict the patient's rehabilitation condition, which requires the patient to make some expressive actions facing the camera, including but not limited to smiling and opening the mouth. Each frame of image would therefore need three-dimensional face reconstruction, but the DNN used for three-dimensional face reconstruction is large in scale and parameter count and consumes considerable computing resources. Considering that the patient does not make large facial movements when using the invention, and that the patient's control over the facial muscles is weak, no large muscle movement changes occur on the patient's face. The invention therefore obtains the trend of face or muscle movement from the second semantic maps, and the face 3D mesh of the current frame adjusts its shape according to this movement trend to construct the face 3D mesh of the next frame; the purpose of doing so is to reduce power consumption.
The specific steps of the three-dimensional model prediction of the next frame part in the three-dimensional model parameter prediction module are as follows:
and sending the parameters of the multi-frame second semantic graph and the three-dimensional model of the face of the patient at the current frame into a deep neural network, and outputting the parameters of the three-dimensional model at the next frame after processing.
The deep neural network comprises an encoder, a first full-connection layer and a second full-connection layer, wherein the input of the encoder is a diagram formed by overlapping multiple frames of second semantic diagrams including a current frame, in the embodiment, 5 frames of second semantic diagrams are overlapped to form a diagram with 5 channels as the input of the encoder, the output is a characteristic diagram, the input of the first full-connection layer is a characteristic diagram, the output is a high-dimensional vector, the length of the high-dimensional vector is consistent with the length of a three-dimensional model parameter, the high-dimensional vector is added with the parameter of the three-dimensional model of the current frame and then input into the second full-connection layer, and the parameter of the three-dimensional model of the next frame is output after processing.
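The data flow of the prediction network can be sketched as follows. This is a shape-level illustration only: the encoder is an untrained stand-in (global average pooling) rather than a real convolutional network, and the parameter length (62) and feature length (128) are assumptions, not values from the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)

N_PARAMS = 62      # assumed length of the 3DMM parameter vector
FEAT_DIM = 128     # assumed length of the encoder feature vector

def encoder(stack):
    """Stand-in encoder: per-channel global average pooling of the
    5-channel stacked semantic maps, tiled up to FEAT_DIM values.
    A real implementation would be a convolutional network."""
    pooled = stack.mean(axis=(1, 2))            # shape (5,)
    reps = FEAT_DIM // pooled.size + 1
    return np.tile(pooled, reps)[:FEAT_DIM]     # shape (FEAT_DIM,)

# untrained fully-connected layers (random weights, illustration only)
W1 = rng.standard_normal((N_PARAMS, FEAT_DIM)) * 0.01   # first FC layer
W2 = rng.standard_normal((N_PARAMS, N_PARAMS)) * 0.01   # second FC layer

def predict_next_params(semantic_stack, cur_params):
    """semantic_stack: (5, H, W) stacked second semantic maps;
    cur_params: (N_PARAMS,) current-frame 3DMM parameters.
    Returns the next-frame parameter vector."""
    feat = encoder(semantic_stack)       # feature map -> feature vector
    vec = W1 @ feat                      # high-dimensional vector, len N_PARAMS
    return W2 @ (vec + cur_params)       # add, then second fully-connected layer
```

The key structural point is the addition of the encoder-derived vector to the current-frame parameters before the second fully-connected layer, so the network predicts an adjustment of the current pose rather than a pose from scratch.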
The training data set of the deep neural network can be acquired simply, efficiently and in large quantities from a simulator, which is another design advantage of this network. The data set acquisition method is as follows:
In a three-dimensional simulator such as Maya or 3ds Max, the parameters of a 3DMM model are adjusted so that the model simulates the facial postures of different facial nerve paralysis patients and performs different actions and facial expressions, such as face rotation, smiling and mouth opening. In addition, the mesh regions of the facial features are marked on the 3DMM model in advance. A virtual camera collects animation video data of the 3DMM model; each frame of image in the video contains only the semantic regions of the facial features and no other information. The obtained multi-frame semantic region image data of the facial features are the input data of the encoder; the 3DMM model parameters corresponding to each collected frame are also acquired, and these parameters serve both as part of the input data of the second fully-connected layer and as the label data.
Input data and label data are thus obtained; the loss value of the network is calculated with a mean-square-error loss function, and the network model is updated by stochastic gradient descent.
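The mean-square-error loss with a stochastic-gradient update can be illustrated on a linear stand-in model (the real network is the encoder plus two fully-connected layers; this sketch only shows the update rule):

```python
import numpy as np

def sgd_mse_step(w, x, y, lr=0.1):
    """One stochastic-gradient-descent step on the mean-square-error loss
    L = mean((w @ x - y)**2), for a linear stand-in model w."""
    pred = w @ x                                  # forward pass
    grad = 2.0 * np.outer(pred - y, x) / y.size   # dL/dw
    return w - lr * grad
```

Iterating the step drives the prediction toward the label, mirroring how the network parameters are fitted to the simulator-generated 3DMM labels.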
The deep neural network has the advantages of a small amount of calculation, an uncomplicated model and low power consumption. The prediction of the next frame three-dimensional model is thus complete. To reduce accumulated errors, the invention reconstructs the face 3D mesh with the 2DASL model once every 10 frames.
Next, the motion degree of the facial muscles and the responsiveness of the facial muscles are obtained at a facial muscle detection module:
when expression action is carried out, facial muscles can move, grid points of 3D mesh of the human face can move along with the muscles, only the orbicularis oris muscle, the levator labialis, the masseter muscle and the buccinator muscle of the face are concerned, therefore, only the grid points corresponding to the muscles and the grid points corresponding to the positions of several facial features are concerned, and the subsequent grid points refer to the grid points at the positions.
Grid points are selected from the three-dimensional model corresponding to the ith frame facial image to obtain a set p'_i of grid point coordinates. Considering that the face may move while the images are acquired, such as a small-range rotation or a slight head-raising action, which would affect subsequent calculation, note that the position of the grid point at the center of the eyebrows is related only to the overall position of the face and does not change with muscle movement. Let the coordinates of the eyebrow-center grid point be c_i; then
p_i = { p − c_i | p ∈ p'_i }
i.e. the abscissa of each grid point in p'_i minus the abscissa of the eyebrow-center grid point, and the ordinate of each grid point minus the ordinate of the eyebrow-center grid point; p_i is then independent of the overall movement of the face and is related only to the movement of the muscles.
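The eyebrow-center normalization amounts to a coordinate subtraction, for example:

```python
import numpy as np

def normalize_to_eyebrow_center(grid_points, eyebrow_center):
    """Subtract the eyebrow-center grid point from every grid point, so the
    resulting set depends only on muscle motion, not whole-face motion."""
    return np.asarray(grid_points, dtype=float) - np.asarray(eyebrow_center, dtype=float)
```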
With each grid point in p_i as a center, Gaussian hot spots are generated to obtain a Gaussian hot spot set. Each three-dimensional grid point corresponds to one Gaussian hot spot, so each three-dimensional coordinate point in space corresponds to a heat value: the heat value at the center of a hot spot is the highest, 1.0, and decays gradually to zero from the center outward, in the same way as the hot spot of a key point on an image. The hot spot diameter is d_1: the Euclidean distances between all adjacent grid points are obtained and averaged, and in this embodiment d_1 is 0.8 times this mean value.
It is important to point out that each hot spot in the set varies continuously in three-dimensional space. For convenience of description, G_i(x) is used to represent the heat value of the hot spot set at different spatial positions: x is any point in three-dimensional space, and G_i(x) is the heat value of the hot spot at x; if x is not on any hot spot, its heat value is recorded as 0.
The hot spots at x are superposed over multiple frames:
H_i(x) = α · G_i(x) + (1 − α) · H_{i−1}(x)
where H_i(x) denotes the superposition result of the Gaussian hot spots, H_{i−1}(x) denotes the superposition result before the current frame, and α is the superposition coefficient; in this embodiment, α is 0.05.
when the muscle motion amplitude is relatively large, the hot spot superposition result H of the grid points corresponding to the musclesi(x) The heat value of (2) is lower, and the heat value of the superposition result is larger when the muscle movement speed or movement amplitude is smaller. The spatial position of the heat value superposition result close to 1.0 represents that the muscle at the position has no movement or has small movement degree, and the spatial position of the heat value close to 0 represents that the muscle at the position has strong movement capability.
For subsequent analysis and calculation, the invention needs the motion region of each grid point, and introduces the attention coefficient M_i(x). M_i(x) represents the motion region of all grid points in the i-th frame three-dimensional model and is a spatial mask: if the three-dimensional spatial position x belongs to the motion region of a grid point, M_i(x) = 1; if it does not, M_i(x) = 0. M_i(x) is calculated as follows:
Considering that a grid point is a single point and a point has no concept of "area" in space, the invention uses the hot-spot area of each grid point to represent the grid point's area, and the motion area of the hot spot to represent the motion area of the grid point.
Because the heat value of a grid point's hot spot decays from the center outward, the invention adjusts the decay of the heat value through a mapping relation whose purpose is to push the smaller heat values on a hot spot closer to 0, so that the hot-spot area shrinks slightly; the mapped heat value at spatial position x in the i-th frame three-dimensional model, i.e. the hot-spot distribution of all grid points in space after mapping, is denoted h'_i(x).
By construction, the motion region M_i(x) of all grid points in the current i-th frame three-dimensional model is the union of the motion region M_{i-1}(x) of all grid points in the previous-frame three-dimensional model with the position regions of all grid points in the current i-th frame, i.e. with the mapped hot-spot distribution h'_i(x); the union is taken pointwise over space.
The M_i(x) obtained at this point is not binary, so M_i(x) must be binarized: when M_i(x) is greater than 0.01, M_i(x) is reassigned to 1; when M_i(x) is less than or equal to 0.01, M_i(x) is reassigned to 0. The spatial region where M_i(x) equals 1 is the motion region of the grid points.
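As a sketch of the union-and-binarize step above (the union is taken here as a pointwise maximum, one consistent reading of the source; threshold 0.01 as in the embodiment):

```python
import numpy as np

def update_motion_region(prev_M, mapped_heat, thresh=0.01):
    # Union of the previous motion region with the current mapped hot-spot
    # distribution, then binarized: values above thresh become 1, the rest 0.
    M = np.maximum(prev_M, mapped_heat)
    return (M > thresh).astype(np.uint8)
```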
The motion degree of the facial muscles corresponding to the i-th frame three-dimensional model is:

L_i = ∫ M_i(x) * (1 - H_i(x)) dx, x ∈ R^3

M_i(x) takes the value 0 or 1; x ∈ R^3 means x ranges over three-dimensional real space, i.e. over the positions of the grid points in three-dimensional space.
H_i(x) is the superposition result of the hot-spot heat at spatial position x; the points are continuous. A grid point with a large motion degree indicates a large degree of muscle movement, and the larger a grid point's motion degree, the smaller the heat value of the corresponding point in H_i(x); conversely, the heat value is larger, with a maximum of 1.0. M_i(x) corresponds to a spatial mask: only points inside the mask participate in the calculation of the degree of muscle movement. The integration is performed over the whole three-dimensional real space; it simply accumulates the values at spatially continuous points.
At this point the degree of movement of the facial muscles is obtained. L_i characterizes the motor capacity of the facial muscles, specifically their movement speed, movement range, and so on in three-dimensional space. The larger L_i is, the greater the patient's degree of rehabilitation. In this embodiment, the facial-muscle movement degree obtained when i equals f is the movement degree L_f used in the subsequent calculation of the patient's rehabilitation degree; f takes the value 100 in this embodiment.
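For illustration, a discrete version of the movement degree L_i, assuming the integrand M_i(x) * (1 - H_i(x)) consistent with the properties stated above (low accumulated heat inside the mask contributes most):

```python
import numpy as np

def movement_degree(M, H, voxel_volume=1.0):
    # Sum (1 - heat) over the motion region; sampled on a voxel grid the
    # integral becomes a sum scaled by the voxel volume.
    return float(np.sum(M * (1.0 - H)) * voxel_volume)
```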
However, for facial muscles that move little or not at all, L_i cannot accurately characterize the degree of movement. In this case the recovery of those muscles must additionally be detected through muscle current: generally, when a large change in muscle current is detected, the muscle is sensitive to movement and tends to be recovering.
Specifically, a sensor such as an electromyograph is used to acquire the muscle current of the patient's facial muscles, as follows:
The sensor is attached to the skin of the patient's face, and its position on the face can be obtained with visual detection methods such as semantic segmentation and key-point localization. The sensor carries multiple current-sensing units, each of which can detect the myoelectric current of the muscle beneath it. The patient is asked to perform some expression actions, during which the myoelectric current at different positions of the facial muscles fluctuates; the position of each sensing unit on the face is obtained from the unit's position within the sensor and the sensor's position on the face. Finally, these positions are projected onto the face 3D mesh; from each sensing unit's position and its output current data, the muscle current at the corresponding position on the face is obtained.
The above is only one way of obtaining the muscle current at different positions of the patient's face; a practitioner may obtain the muscle current at the corresponding positions through other embodiments.
The myoelectric current sequence I = {I_1, I_2, I_3, ..., I_m, ..., I_n} at a certain spatial position in the three-dimensional model is obtained, where n is the sequence length. The sequence is filtered with a window of length l (l = 10 in this embodiment), giving the filtering result ΔI = {ΔI_1, ΔI_2, ΔI_3, ..., ΔI_m}, where m is the length of the filtering result and m < n. The result ΔI_j obtained after each movement of the filtering window is the difference between the maximum and minimum myoelectric current values inside the window, reflecting the local variation amplitude of the myoelectric current; specifically, ΔI_j = max(I_j, I_{j+1}, ..., I_{j+l-1}) - min(I_j, I_{j+1}, ..., I_{j+l-1}), with j ranging over [1, m], where max takes the maximum value and min the minimum value. The myoelectric current response degree D(x) at that position is then computed from mean(top_k(ΔI)) together with the mean value of the sequence I, where top_k(ΔI) selects the first k values of ΔI after sorting in descending order and mean is the averaging function.
D(x) reflects the local fluctuation of the myoelectric current at position x in the three-dimensional face model as well as its average magnitude. The invention sets the myoelectric current response degree of positions where no myoelectric current is detected to 0.
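The sliding-window filtering above can be sketched as follows (the source does not reproduce how the top-k fluctuation mean and the sequence mean are combined into D(x); the product used here is purely illustrative):

```python
import numpy as np

def emg_fluctuation(I, l=10):
    # Delta I_j = max - min inside each length-l window, j = 1..m, m = n - l + 1.
    I = np.asarray(I, dtype=float)
    return np.array([I[j:j + l].max() - I[j:j + l].min()
                     for j in range(len(I) - l + 1)])

def response_degree(I, l=10, k=3):
    # Combine the mean of the k largest local fluctuations with the mean of I
    # (the combination operator is an assumption, not from the patent).
    dI = np.sort(emg_fluctuation(I, l))[::-1][:k]
    return float(dI.mean() * np.mean(I))
```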
The invention pays more attention to the myoelectric current response of muscles with a smaller movement degree: a small movement degree indicates weak control of the muscle by the patient, making the myoelectric current data more meaningful to study. A weight w(x) is therefore assigned to the muscle current at each position. The weight is obtained from the heat superposition result H_i(x) with i = f, through a mapping with hyperparameters epsilon and delta (epsilon = 8.6 and delta = -0.4 in this embodiment); w(x) is positively correlated with H_f(x) and w(x) > 0. If the heat value is large, the movement degree of the muscle is small and the corresponding weight w(x) is large; if the heat value is small, the movement degree is large and the corresponding weight w(x) is small.
The responsiveness of the facial muscles is then:

C = ∫ w(x) * D(x) dx, x ∈ R^3

C represents the responsiveness of the facial muscles, w(x) is the weight coefficient, and the integration interval is the whole three-dimensional real space. C is the weighted sum of the muscle-current response degrees, mainly attending to muscles with a smaller movement degree.
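A discrete sketch of the weighted responsiveness C (the exact mapping producing w(x) from H_f(x) is not reproduced in the source, so weights are supplied directly here):

```python
import numpy as np

def responsiveness(D, w, voxel_volume=1.0):
    # Weighted sum of the response degree D(x): weights w(x) are larger in
    # high-heat (low-movement) regions, focusing C on weakly moving muscles.
    return float(np.sum(w * D) * voxel_volume)
```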
The patient's rehabilitation degree Z is calculated from the movement degree L_f of the facial muscles and the responsiveness C of the facial muscles. gamma and beta are hyperparameters (gamma = 1 and beta = 0.1 in this embodiment); s_0 represents the controllable facial-muscle area of a normal person, and s represents the facial-muscle area that the patient cannot control. In the simplified form of the formula, the term C/s indicates the myoelectric current response degree per unit area of damaged muscle: the larger it is, the greater the degree of rehabilitation.
As previously described, the larger H_i(x) is in a region, the smaller the degree and the weaker the ability of muscle movement there. Because the grid points of such regions move weakly, these regions are planar, whereas regions composed of grid points with stronger movement ability are three-dimensional; the area of the planar regions is therefore related to s.
On the other hand, muscles with weak movement ability are not necessarily all damaged: some muscles do not move even on a normal face. The area of the non-moving muscles of a normal face must therefore also be obtained, as follows: obtain the face 3D mesh of normal persons and calculate the heat superposition result of its grid points, specifically the average H_0(x) of the heat superposition results of several normal persons. H_0(x) is calculated in the same way as H_f(x), so the calculation is not repeated here.
s is calculated as follows:

s = ∫ h1(x) * h2(x) dx, x ∈ R^3

When H_f(x) ≥ 0.85, h1(x) = 1; when H_f(x) < 0.85, h1(x) = 0. When H_0(x) ≥ 0.98, h2(x) = 0; when H_0(x) < 0.98, h2(x) = 1. h1(x) thresholds H_f(x) and indicates the damaged facial-muscle regions of the patient; h2(x) indicates the normal-face regions excluding muscles that do not move on a normal face. h1(x) and h2(x) are three-dimensional masks, and s is equivalent to the area of the intersection of h1(x) and h2(x).
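A discrete sketch of the computation of s from the two thresholded masks (thresholds 0.85 and 0.98 as in the embodiment; names are illustrative):

```python
import numpy as np

def uncontrollable_area(H_f, H_0, voxel_volume=1.0):
    # h1: candidate damaged regions of the patient (accumulated heat >= 0.85);
    # h2: regions that do move on a normal face (normal-face heat < 0.98);
    # s is the area (voxel count times voxel volume) of their intersection.
    h1 = (np.asarray(H_f) >= 0.85)
    h2 = (np.asarray(H_0) < 0.98)
    return float(np.sum(h1 & h2) * voxel_volume)
```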
After the patient's rehabilitation degree is obtained, a rehabilitation scheme can be formulated for the patient according to changes in the rehabilitation degree.
The above description is intended to help those skilled in the art better understand the present invention; it is not intended to limit the invention to the particular forms disclosed, and all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention are intended to be covered.

Claims (6)

1. A facial nerve palsy rehabilitation detection system based on artificial intelligence, characterized in that the system comprises: the facial three-dimensional model building module is used for building a three-dimensional model of the face of the patient according to the collected facial image of the patient;
the facial three-dimensional model calibration module is used for calibrating the three-dimensional model of the face of the patient based on the acquired thermal imaging image of the face of the patient;
the three-dimensional model parameter prediction module is used for predicting a next frame of three-dimensional model based on the thermal imaging image of the face of the patient and the parameters of the calibrated three-dimensional model;
the facial muscle detection module is used for obtaining the responsiveness of facial muscles based on the obtained myoelectric current response sequence and obtaining the motion degree of the facial muscles based on a three-dimensional model obtained through prediction; the method for acquiring the exercise degree of the facial muscles comprises the following steps:
selecting grid points from the i-th frame three-dimensional model to obtain a grid point set p_i; with each grid point in p_i as a center, generating Gaussian hot spots to obtain a Gaussian hot-spot set; performing superposition processing on the Gaussian hot spots at a certain spatial position x:

H_i(x) = (1 - alpha) * H_{i-1}(x) + alpha * h_i(x)

H_i(x) is the superposition result of the Gaussian hot spots, H_{i-1}(x) is the superposition result before the current frame, h_i(x) is the heat value of the current frame's hot spots at x, and alpha is a superposition coefficient;
the motion degree of the facial muscles corresponding to the i-th frame three-dimensional model is:

L_i = ∫ M_i(x) * (1 - H_i(x)) dx, x ∈ R^3

M_i(x) is the attention coefficient, taking the value 0 or 1, and x ∈ R^3 means x belongs to three-dimensional real space;
and the rehabilitation degree detection module is used for obtaining the rehabilitation degree of the patient based on the motion degree of the facial muscles and the responsiveness of the facial muscles.
2. The system of claim 1, wherein the calibrating the three-dimensional model of the patient's face is specifically:
obtaining a first semantic graph of facial features of a patient according to a three-dimensional model of the face of the patient, and obtaining a second semantic graph of the facial features of the patient through a semantic segmentation network by using a thermal imaging graph of the face of the patient; the facial features include eyebrows, eyes, nose, mouth;
for any one of the facial features: acquiring the first semantic region and the second semantic region of the facial feature from the first semantic map and the second semantic map, and calculating the intersection-over-union of the two regions; if the intersection-over-union is greater than a threshold, the facial-feature region in the three-dimensional model does not need calibration; otherwise, acquiring the center points of the first and second semantic regions respectively, projecting the two center points onto the three-dimensional model to obtain spatial coordinates q_1 and q_2, and obtaining a vector q from q_1 and q_2; when calibrating the three-dimensional model, the facial-feature region in the three-dimensional model is moved along the direction of vector q until the intersection-over-union exceeds the threshold, with a step length of d per move; specifically, d is determined by a hyperparameter b, by d_0, the distance between the two farthest-apart pixels in the second semantic region, by A_2, the area of the second semantic region, and by A_3, the area of a third semantic region obtained from the first and second semantic regions;
the third semantic area is obtained by the following steps: and acquiring an overlapping region of the first semantic region and the second semantic region, and removing the overlapping region from the second semantic region to obtain a third semantic region.
3. The system of claim 2, wherein the predicting the three-dimensional model of the next frame is specifically:
the multiple frames of second semantic maps and the parameters of the current-frame three-dimensional model are sent into a deep neural network, which after processing outputs the parameters of the next-frame three-dimensional model.
4. The system of claim 3, wherein the neural network comprises an encoder, a first fully connected layer, and a second fully connected layer; the input of the encoder is an image formed by stacking multiple frames of second semantic maps and its output is a feature map; the input of the first fully connected layer is the feature map and its output is a high-dimensional vector whose length matches that of the three-dimensional model parameters; the high-dimensional vector is added to the parameters of the current-frame three-dimensional model and input into the second fully connected layer, which after processing outputs the parameters of the next-frame three-dimensional model, from which the next-frame three-dimensional model is obtained.
5. The system of claim 4, wherein deriving the responsiveness of the facial muscles based on the acquired electromyographic response sequences is specifically: obtaining the myoelectric current sequence I = {I_1, I_2, I_3, ..., I_m, ..., I_n} at a certain spatial position in the three-dimensional model, n being the sequence length; filtering the sequence with a window of length l to obtain the filtering result ΔI = {ΔI_1, ΔI_2, ΔI_3, ..., ΔI_m}, m being the length of the filtering result, with ΔI_j = max(I_j, I_{j+1}, ..., I_{j+l-1}) - min(I_j, I_{j+1}, ..., I_{j+l-1}), j ranging over [1, m], max being the maximum function and min the minimum function; the myoelectric current response degree D(x) at that position is computed from mean(top_k(ΔI)) together with the mean value of the sequence I, where top_k(ΔI) selects the first k values of ΔI after sorting in descending order and mean is the averaging function;
the responsiveness of the facial muscles is then:

C = ∫ w(x) * D(x) dx, x ∈ R^3

C represents the responsiveness of the facial muscles, and w(x) is a weight coefficient positively correlated with H_i(x).
6. The system of claim 5, wherein the patient's rehabilitation degree Z is calculated from L_f, the movement degree of the facial muscles obtained when i equals f, and the responsiveness C, with hyperparameters gamma and beta, where s_0 represents the controllable facial-muscle area of a normal person and s represents the facial-muscle area that the patient cannot control.
CN202011466952.4A 2020-12-14 2020-12-14 Facial nerve paralysis rehabilitation detection system based on artificial intelligence Withdrawn CN112562850A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011466952.4A CN112562850A (en) 2020-12-14 2020-12-14 Facial nerve paralysis rehabilitation detection system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011466952.4A CN112562850A (en) 2020-12-14 2020-12-14 Facial nerve paralysis rehabilitation detection system based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN112562850A true CN112562850A (en) 2021-03-26

Family

ID=75064391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011466952.4A Withdrawn CN112562850A (en) 2020-12-14 2020-12-14 Facial nerve paralysis rehabilitation detection system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN112562850A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113570590A (en) * 2021-08-03 2021-10-29 江苏仁和医疗器械有限公司 Facial nerve palsy patient rehabilitation detection system based on visual perception
CN117153379A (en) * 2023-10-31 2023-12-01 深圳市前海蛇口自贸区医院 Prediction device for thoracic outlet syndrome
CN117153379B (en) * 2023-10-31 2024-02-20 深圳市前海蛇口自贸区医院 Prediction device for thoracic outlet syndrome

Similar Documents

Publication Publication Date Title
CN108446020B (en) Motor imagery idea control method fusing visual effect and deep learning and application
Shanmuganathan et al. R-CNN and wavelet feature extraction for hand gesture recognition with EMG signals
US11482048B1 (en) Methods and apparatus for human pose estimation from images using dynamic multi-headed convolutional attention
Bajpai et al. Movenet: A deep neural network for joint profile prediction across variable walking speeds and slopes
CN104851123B (en) A kind of three-dimensional face change modeling method
CN112562850A (en) Facial nerve paralysis rehabilitation detection system based on artificial intelligence
CN111861910A (en) CT image noise reduction system and method
Hossain et al. Deepbbwae-net: A cnn-rnn based deep superlearner for estimating lower extremity sagittal plane joint kinematics using shoe-mounted imu sensors in daily living
CN112465773A (en) Facial nerve paralysis disease detection method based on human face muscle movement characteristics
CN112001122A (en) Non-contact physiological signal measuring method based on end-to-end generation countermeasure network
CN112258423A (en) Deartifact method, device, equipment and storage medium based on deep learning
CN116110597B (en) Digital twinning-based intelligent analysis method and device for patient disease categories
Ma et al. Human motion gesture recognition based on computer vision
Sugimoto et al. A method for detecting transitions of emotional states using a thermal facial image based on a synthesis of facial expressions
CN106846372A (en) Human motion quality visual A+E system and method
Chen et al. Measurement of body joint angles for physical therapy based on mean shift tracking using two low cost Kinect images
CN107967941A (en) A kind of unmanned plane health monitoring method and system based on intelligent vision reconstruct
Xu et al. An inertial sensing-based approach to swimming pose recognition and data analysis
CN112401905B (en) Natural action electroencephalogram recognition method based on source localization and brain network
CN115147768B (en) Fall risk assessment method and system
CN116543455A (en) Method, equipment and medium for establishing parkinsonism gait damage assessment model and using same
Masullo et al. CaloriNet: From silhouettes to calorie estimation in private environments
Liu et al. Tai chi movement recognition method based on deep learning algorithm
CN115813409A (en) Ultra-low-delay moving image electroencephalogram decoding method
Lueken et al. Using synthesized imu data to train a long-short term memory-based neural network for unobtrusive gait analysis with a sparse sensor setup

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20210326