CN117883074A

CN117883074A - Parkinson's disease gait quantitative analysis method based on human body posture video

Info

Publication number: CN117883074A
Application number: CN202410056044.XA
Authority: CN
Inventors: 魏滔; 刘瑞清; 王治忠; 王松伟; 牛晓可; 刘晓华; 肖灿; 韩佳欣; 朱俊才; 李佳佳
Original assignee: Henan Provincial Science And Technology Innovation Promotion Center; Zhengzhou University
Current assignee: Henan Provincial Science And Technology Innovation Promotion Center; Zhengzhou University
Priority date: 2024-01-15
Filing date: 2024-01-15
Publication date: 2024-04-16

Abstract

The invention discloses a parkinsonism gait quantitative analysis method based on human body posture video, which comprises the following steps: a: constructing a human body posture data set; b: constructing an HRNet deep neural network model; c: obtaining a key point sequence consisting of 17 key points of a PD patient in a human body posture video by using the trained HRNet depth neural network model; d: c, extracting motion characteristic signals of the 17 human body key point sequences obtained in the step C; e: obtaining a peak sequence and a trough sequence representing periodic motion changes of PD patients in a human body posture video; f: quantifying six gait movement characteristic parameters; g: based on the six gait movement characteristic parameters, the trained classifier is used for scoring according to MDS-UPDRS. The invention can take the human body posture video clips as carriers to realize flexible and interpretable gait quantification so as to assist in early detection, routine monitoring and treatment evaluation of Parkinson's disease.

Description

Parkinson's disease gait quantitative analysis method based on human body posture video

Technical Field

The invention relates to the field of parkinsonism gait quantification, in particular to a parkinsonism gait quantification analysis method based on human body posture video.

Background

Abnormal gait of Parkinson's disease refers to the special posture and gait changes that occur when a patient with Parkinson's disease walks. Parkinsonism gait disorder is one of the common motor symptoms of Parkinsonism (PD) patients and is also a main basis for clinical screening and diagnosis. Currently, the gold standard for assessing the severity of parkinsonism gait is the unified parkinsonism rating scale (MDS-UPDRS). One typical gait program for parkinson's disease is listed in the MDS-UPDRS exercise exam: i.e., leave and walk the parkinson's disease patient to the inspector, during which the parkinson's disease patient should walk at least 10 meters (30 feet) and then turn back around to the inspector. The project is capable of measuring a variety of behaviors: stride, stride frequency, foot lifting height, heel strike while walking, turn around, and arm swing, and then the inspector evaluates the severity of the action between "normal" and "severely impaired" at a 5-point scale. Although the inspector is often trained and the scoring categories are as clear as possible, this assessment is subjective in nature and often occurs when the inspector scores the same parkinson's disease patient with a large variance. Furthermore, due to the size and subtlety of parkinson's disease gait action, the human eye may not be able to capture every detail of a parkinson's patient under a rapid motor test.

Currently, studies on parkinson's disease gait abnormalities rely on various techniques, including gait walkways and wearable sensors, such as accelerometers or rhythms, for objectively assessing the gait of PD. While they enable detailed characterization of gait, they are often burdensome to both PD patients and evaluators, require additional equipment, and add complexity, time and cost to the evaluation, and are often impractical in a home environment. It is also common practice for a clinician to record video during gait exams using commercially available camera devices, with video for gait anomaly analysis, and with the development of a combination of marker-free pose estimation based on deep learning, patient posture key points can be objectively located from the PD gait video in a completely non-contact fashion from the video recordings. These body posture key point sequences contain rich spatiotemporal information and measure the characteristics of the patient's gait in a compact form, and finally score the severity. The above method does not require additional equipment, cost or inconvenience to the inspector or patient. However, due to complex human body posture changes, serious self-occlusion, limited size and flexibility of the public human body posture data set, and the like, human body posture estimation is challenging, and the accuracy of analysis results is poor. Furthermore, most of the current research focuses on marker-based pose estimation, which may lead to errors due to marker placement inconsistencies.

The invention patent of application number 202310812276.9 and application day 2023, 7 and 4 discloses a technical scheme of a system for automatically evaluating the severity and stage of Parkinson's disease based on gait data, which comprises a wearable multi-source gait acquisition module, a quantitative feature analysis module, a motion symptom severity quantification module, a parkinsonism severity quantification module and an automatic Hoehn-Yahr stage diagnosis module; the wearable multi-source gait acquisition module is responsible for acquiring various gait signals; the quantitative feature analysis module is responsible for processing gait signals, extracting quantitative features corresponding to the MDS-UPDRS sub-items and used for quantitative evaluation of the sub-items; the motion symptom severity quantization module is responsible for calculating motion feature weights, performing feature fusion and outputting motion feature severity quantization scores; the parkinsonism severity quantification module is responsible for calculating effective feature weights, carrying out feature fusion and outputting parkinsonism severity quantification scores; the automatic Hoehn-Yahr stage diagnosis module is responsible for bringing the quantitative characteristics into the judging process and outputting a stage result; the invention automatically completes diagnosis, realizes real-time monitoring, and is beneficial to diagnosis mobility and convenience. The technical scheme also uses an additional wearable multi-source gait acquisition module to acquire gait data, and increases the complexity, time and cost of evaluation.

Disclosure of Invention

The invention aims to provide a human body posture video-based parkinsonian gait quantitative analysis method which can realize flexible and interpretable gait quantification by taking human body posture video fragments as carriers so as to assist in early detection, conventional monitoring and treatment evaluation of parkinsonian diseases.

The invention adopts the following technical scheme:

a parkinsonism gait quantitative analysis method based on human body posture video sequentially comprises the following steps:

a: constructing a human body posture data set for PD clinical video human body posture estimation;

b: constructing an HRNet depth neural network model, wherein the HRNet depth neural network model comprises a multi-resolution input module, a multi-resolution feature extraction module, a multi-resolution feature fusion module, a feature pyramid module and a key point prediction module; the multi-resolution input module is used for converting an input original image into input images with different resolutions; the multi-resolution feature extraction module is used for extracting features of input images with different resolutions; the multi-resolution feature fusion module is used for fusing the features with different resolutions extracted by the multi-resolution feature extraction module to obtain a fused feature map; the feature pyramid module is used for carrying out convolution operation on the obtained fused feature graphs through convolution kernels with different sizes and forming a feature pyramid formed by stacking feature graphs with different scales; the key point prediction module is used for predicting key points by outputting key point heat maps according to the feature maps of different scales in the feature pyramid;

C: training the HRNet depth neural network model in the step B by utilizing the training set in the human body posture data set constructed in the step A, and analyzing the human body posture video to be detected through the trained HRNet depth neural network model to obtain a key point sequence consisting of 17 key points of a human body of a PD patient in the human body posture video;

d: c, extracting motion characteristic signals of the 17 human body key point sequences obtained in the step C to obtain seven corresponding motion characteristic signals; the seven motion characteristic signals include a difference in leg ratio, a vertical angle of the body, a horizontal angle of the ankle, a horizontal angle of the wrist, a horizontal distance between the ankles, a left ankle velocity, and a right ankle velocity;

e: c, respectively carrying out Savitzky-Golay smooth filtering and AMPD peak detection processing on the three motion characteristic signals of the leg ratio difference, the vertical angle of the body and the horizontal angle of the ankle joint obtained in the step D to obtain a peak sequence and a trough sequence representing periodic motion change of a PD patient in a human body posture video;

f: quantifying six gait motion characteristic parameters based on the peak sequence and the trough sequence obtained in the step E, wherein the six gait motion characteristic parameters are respectively a step frequency, an arm swing speed, an arm swing peak value, gesture control, a walking roughness minimum value and a walking roughness maximum value; the step frequency refers to the number of steps walked in unit time; the arm swing speed refers to the speed of arm swing in unit time; the arm swing peak value refers to the median amplitude at the arm swing peak value in unit time; posture control refers to variability in the width of a patient's stride; the walking roughness minimum value refers to the minimum value of the roughness of the patient walking within a set time; the walking roughness maximum value refers to the maximum value of the walking roughness of a patient in a set time;

G: and (3) scoring according to MDS-UPDRS by using a trained classifier based on the six gait movement characteristic parameters obtained in the step (F).

The step A comprises the following specific steps:

a1: using a video acquisition device to acquire human body posture videos of the PD patient for executing the actions required by the MDS-UPDRS rating scale;

a2: randomly disturbing all the image frames in the acquired human body posture video, and uniformly sampling N image frames to construct a human body posture data set; the human body posture data set comprises a training set and a testing set; finally, manually marking the human body boundary frame and 17 key points; the human body boundary frame is a rectangular frame tightly surrounding the human body and is formed by the left upper corner coordinates (x ₁ ，y ₁ ) With lower right angular position (x ₂ ，y ₂ ) Determining; the 17 key points of the human body refer to 17 joint points in human body posture estimation.

The HRNet deep neural network model is constructed based on a PyTorch frame, and Basicblock standard convolution and residual error modules Bottleneck in a multi-resolution input module, a multi-resolution feature extraction module, a multi-resolution feature fusion module, a feature pyramid module and a key point prediction module are respectively replaced by a Ghost module and a 3 multiplied by 3 Sandglass module, and attention modules ECA are added in the Ghost module and the Sandglass module.

The multi-resolution feature extraction module adopts a plurality of parallel branches in the HRNet deep neural network model, and each branch performs feature extraction on different input images at different resolutions; each branch adopts a basic residual block of ResNet and is applied to an input image with corresponding resolution; the feature extraction process inside each branch performs feature extraction through a series of convolution, batch normalization and activation functions.

In the step D, 17 human body key point sequences are obtainedIn P _i ^(t) Representing the coordinates of the subject key point i at the t-th frame, expressed as a pair value +.>The main body key points corresponding to i from 1 to 17 are as follows: nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, and right ankle; wherein:

difference in leg ratio R _legs The extraction method comprises the following steps: calculating the length ratio of the left leg to the right leg, and the difference value of the length ratio of the right leg to the left leg;

vertical angle of bodyThe extraction method comprises the following steps: calculating the angle between the y-axis of the image in the video and a straight line passing through the midpoint of the connecting line of the left shoulder key point and the right ankle key point;

Horizontal angle of ankle jointThe extraction method comprises the following steps: calculating an angle between an x-axis of an image in the video and a straight line passing through key points of two left and right ankles;

horizontal angle of wristThe extraction method comprises the following steps: calculating an angle between an x-axis of an image in the video and a straight line passing through two wrist key points;

horizontal distance between ankleThe extraction method comprises the following steps: calculating the absolute value of the distance between the x-axis abscissas of the left ankle and right ankle keypoints, and normalizing by dividing by the estimated height H (t) of the PD patient; the estimated height H (t) of the PD patient is the height of the human body boundary box;

left ankle joint velocityThe extraction method comprises the following steps: calculating Euclidean distance between left ankle joint coordinates in two continuous frames of images, and normalizing by dividing the Euclidean distance by estimated height H (t) of a PD patient;

right ankle joint velocityThe extraction method comprises the following steps: the euclidean distance between the right ankle coordinates in two consecutive frames of images is calculated and normalized by dividing by the estimated height H (t) of the PD patient.

The step E comprises the following specific steps:

e1: for the leg ratio difference R obtained in step D _legs Vertical angle of bodyAnd the horizontal angle of the ankle jointThe three motion characteristic signals are respectively subjected to Savitzky-Golay smoothing filtering; the three motion characteristic signals after filtering are collectively expressed as +. >

E2: filtering the three motion characteristic signals obtained in the step E1Taking the peak detection as a quasi-periodic signal, adopting a multiscale-based automatic peak detection algorithm to carry out peak detection to respectively obtain corresponding peak sequences +.>And the trough sequence->Wherein p is ₁ Representing the first peak, v ₁ Representing the first trough.

The step F comprises the following specific steps:

f1: based on the peak sequence obtained in step EAnd the trough sequence->Consider a "peak-valley-peak" period as a cycle of motion; given the sampling frequency f, the leg ratio difference R _legs Vertical angle of body->And horizontal angle of ankle joint->The event occurrence number of the three motion characteristic signals is as follows:

wherein r is _Rl (t)、r _Vb (t) and r _Ha (t) represents the leg ratio difference R in the t-th frame _legs Vertical angle of bodyAnd horizontal angle of ankle joint->The sum of the numbers of wave crests and wave troughs of the three motion characteristic signals at the moment t;

f2: based on the seven motion characteristic signals, the difference R of the leg ratios _legs Vertical angle of bodyAnd horizontal angle of ankle joint->Number of occurrences r of three motion feature signals at t frame _Rl (t)、r _Vb (t) and r _Ha (t) respectively calculating six gait movement characteristic parameters:

(1) Step frequency: by r _Rl (t)、r _Vb (t) and r _Ha (t) calculating a posterior mean value at the kth frame where the last event of the video occurred and as the final step frequency for the patient;

The final step frequency calculation method of the patient is as follows:

firstly, defining lambda-Gamma model expression as:

where λ is a hypothetical variable and obeys the Gamma distribution of parameters α and β. Alpha ₀ And beta ₀ Is a priori parameter, Y _i Is the number r of event occurrences in the ith frame _Rl (t)、r _Vb (t) and r _Ha The sum of (t), N, is the number of elapsed time intervals. A priori is set to lambda Gamma alpha ₀ ＝2，β ₀ =1), the average value thereof isThe prior selection reflects the range of reasonable human motion, and the step frequency of 2Hz is typical normal gait;

next, the posterior update of each frame of image is:

wherein F is the frame rate of the video;

finally, a posterior mean value Eλ at the kth frame where the last event of the video occurs is calculated _k ]And as the final step frequency of the patient;

wherein alpha is _k And beta _k Is the parameter of the k-th frame posterior.

(2) Arm swing speed: by means of the horizontal angle of the wristAnd a horizontal angle time series signal of the wrist, calculating the median of the absolute first-order difference as a median speed, and taking the median speed as an arm swing speed;

the wrist horizontal angle time series signal comprises N data points, which are expressed asThen

Differences＝[|a ₂ -a ₁ |，|a ₃ -a ₂ |，|a ₄ -a ₃ |，...，|a _N -a _N-1 |]；

Median_Velocity＝Median(Differences)；

Wherein Differences represent the absolute value of the difference between adjacent time points, and median_velocity represents the Median Velocity;

(3) Arm swing peak: calculating the horizontal angle of the wristAnd median amplitude at the wrist time series signal peak, and taking the median amplitude as an arm swing peak;

the wrist horizontal angle time series signal comprises N data points, which are expressed asThen using peak detection signal processing technology to determine peak value and valley bottom of wrist horizontal angle signal; for each peak value, finding the lowest valley bottom in two adjacent valley bottoms as a lower limit; calculating the peak value of each peak value, setting the lower limit value of the peak value as L and the peak value as P, and setting the peak value as H= |P-L|; finally, respectively calculating peak height values of all peaks, and taking the median of the obtained peak height values as peak median amplitude;

Median_H＝Median(H)；

wherein median_h represents the Median of the peak amplitudes;

(4) Gesture control value: by using horizontal distance between ankleCalculating variability C of stride width of a patient _v ，/>And will be variable C _v As a gesture control value; wherein sigma represents all->Standard deviation of the data, μ represents all +.>An average of the data;

(5) Walk roughness minimum and walk roughness maximum:

based on left ankle speedAnd right ankle speed +.>Respectively calculating differential values of the left ankle and the right ankle, namely absolute acceleration; the median of the absolute acceleration is then calculated and normalized: dividing the obtained normalized median by the value of the corresponding ankle signal on each frame; taking the minimum characteristic value and the maximum characteristic value obtained after the final recoding as a walking roughness minimum value and a walking roughness maximum value respectively;

First, based on left ankle velocityAnd right ankle speed +.>Respectively calculating differential values of the left ankle and the right ankle, namely absolute acceleration;

wherein deltat is the adjacent time interval,and->The differential values of the left and right ankle, that is, the absolute acceleration, respectively.

Then, the median of the absolute acceleration is calculated and normalized:

wherein MedianAcc _ankle (L) represents the median value of the absolute acceleration of the left ankle, medianAcc _ankle (R) represents the median value of the absolute acceleration of the right ankle;

the median is then divided by the value of the corresponding ankle signal on each frame:

wherein, feature _ankle (L) and Feature _ankle (R) values representing the median of the absolute acceleration of the left and right ankle divided by the left and right ankle signals on each frame, respectively, as normalized characteristic values;

finally recoding the characteristic value, and mapping the characteristic value to a range of 0-1:

wherein, recodef _ankle(L) And Recodef eature _ankle(R) The characteristic values obtained after recoding the left ankle and the right ankle are respectively taken as the minimum value and the maximum value of the walking roughness.

In the step C, before training by using a training set in a human body posture data set, the HRNet deep neural network model is pre-trained by the existing ImageNet image database; in the training process by using the training set in the human body posture data set, a random gradient descent optimizer is adopted, the initial learning rate is 1e-3, the iteration times are 30K, and the times are respectively reduced by 10 times when the 10K and 20K iterations are performed; the weight decay rate, batch_size, and momentum parameters were set to 0.0001, 16, and 0.9, respectively.

In the step E1, the obtained leg ratio difference R _legs When Savitzky-Golay filtering is applied, the window width is 11, and the fitting order is 2; for the resulting vertical angleWhen Savitzky-Golay filtering is applied, the window width is 15, and the fitting order is 2; horizontal angle for the obtained ankle joint +.>When Savitzky-Golay filtering is applied, the window width is 13 and the fitting order is 2.

The classifier comprises an LDA linear discrimination model, a GBDT gradient lifting decision tree, a RFC random forest, an SVM vector machine and/or XGBoost extreme gradient lifting.

The invention has the following beneficial effects:

1) The invention constructs a human body posture data set in a specific field, wherein the data set focuses on the unique characteristics of human body posture in clinical examination videos of MDS-UPDRS related projects executed by a subject, such as the left side and the right side of a human body are easily observed in test gait, and simultaneously, the legs and the legs as well as the arms and the human body are shielded when a patient walks;

2) Constructing an HRNet model to realize end-to-end human body posture estimation, using a Ghost module and a Sandglass module to replace Bottleneck and Basicblock, and adding an attention module method to the two basic modules to obtain stronger feature extraction capability and ensure key point regression accuracy;

3) Six typical gait movement characteristic parameters are quantified by analyzing the movement time sequence of the PD clinical video human body key points, and the parameters integrate the clinical symptoms described in the MDS-UPDRS score, so that more objective and specific explanation on the movement retardation of the PD patient can be provided for a clinician.

Drawings

FIG. 1 is a schematic flow chart of the present invention;

fig. 2 is a schematic diagram of the positions of 17 key points of the human body in the invention.

Detailed Description

The invention is described in detail below with reference to the attached drawings and examples:

as shown in fig. 1, the human body posture video-based parkinsonism gait quantitative analysis method provided by the invention comprises the following steps:

the step A comprises the following specific steps:

a1: using a video acquisition device to acquire human body posture videos of PD patients for executing specified actions; the appointed action refers to the action required by the PD patient to execute the MDS-UPDRS rating scale, namely, the PD patient (parkinsonism patient) leaves and goes to the inspector so as to accurately observe the left and right sides of the PD patient body; meanwhile, the PD patient should walk at least 10 meters (30 feet) and then turn around to the inspector;

Wherein, the picture in the human body posture video only comprises an embodiment that a PD patient walks back and forth (at least more than five meters of effective road sections), and the PD patient is positioned in the central position of the picture as far as possible; each example is a complete walking video sequence completed by a PD patient without any items in the hands and wearing normal clothing; compared with most videos shot in a specific laboratory, the video shooting method can be shot in a hospital corridor, has small requirements on the field, is wider in application scene and has popularization. The above action requirements are required in the existing MDS-UPDRS rating scale, and are not described herein.

In this embodiment, the video acquisition device may use a smart phone, where the shooting resolution is not less than 1920×1080 and the frame rate is 30/fps.

A2: using a random. Shuffle function of Python to randomly shuffle all image frames in the acquired human body posture video, and uniformly sampling N image frames to construct a human body posture data set; the human body posture dataset is then processed according to 7:3 is randomly divided into a training set (PH-train) and a testing set (PH-test), and by using a five-time cross-validation method, the stability and generalization capability of the model are improved, the reliability evaluation of the model on the performance of different data sets is ensured, so that training and testing can be performed more effectively, and the optimal model parameters and configuration are found. And finally, manually marking the human body boundary box and 17 key points by using an on-off source key point marking tool coco-anotator.

The number ratio of men and women is controlled to be about 1 as much as possible in consideration of the influence of sex factors, meanwhile, in order to take care of each classification, correlation and difference detection can be carried out, the number of patients in each sample interval is ensured to reach the specified requirement, the distribution proportion of each classification interval accords with the actual condition, and the accuracy and the robustness of the model can be improved to a certain extent.

In this embodiment, the value of N may be 2537; the human body boundary frame is a rectangular frame tightly surrounding the human body and is formed by the left upper corner coordinates (x ₁ ，y ₁ ) With lower right angular position (x ₂ ，y ₂ ) Determining; the 17 key points of the human body refer to 17 joint points commonly used in human body posture estimation, as shown in fig. 2. Human body boundary frame and human bodyThe 17 key points are all conventional in the art and are not described herein.

B: constructing an HRNet depth neural network model, wherein the HRNet depth neural network model comprises a multi-resolution input module, a multi-resolution feature extraction module, a multi-resolution feature fusion module, a feature pyramid module and a key point prediction module, and the structure of the HRNet depth neural network model is shown as follows;

the multi-resolution input module is used for converting an input original image into input images with different resolutions so as to realize multi-scale image input;

In this embodiment, the input original image is converted into an input image of different resolutions through downsampling, such as an input image of original resolution, an input image of 1/2 resolution (i.e., an input image of 1/2 of the original resolution through downsampling), an input image of 1/4 resolution (i.e., an input image of 1/4 of the original resolution through downsampling), an input image of 1/8 resolution (i.e., an input image of 1/8 of the original resolution through downsampling), and so on. The input images with different resolutions are respectively used for realizing feature extraction with different resolutions in the subsequent steps.

The multi-resolution feature extraction module is used for extracting features of input images with different resolutions so as to realize multi-scale feature extraction;

in the embodiment, the multi-resolution feature extraction module adopts a plurality of parallel branches in the HRNet deep neural network model, and each branch performs feature extraction on different input images at different resolutions; each branch takes the basic residual block of the res net and is applied on the input image of the corresponding resolution. For example, the first branch performs feature extraction on an original resolution input image, the second branch performs feature extraction on a 1/2 resolution input image, the third branch performs feature extraction on a 1/4 resolution input image, and the fourth branch performs feature extraction on a 1/8 resolution input image.

In this embodiment, the feature extraction process inside each branch is similar to a standard ResNet structure, including a series of convolutions, batch normalization, and activation functions for feature extraction. The multi-resolution feature extraction can obtain specific feature representation on each resolution, and the global information and the local information can be reserved at the same time.

The multi-resolution feature fusion module is used for fusing the features with different resolutions extracted by the multi-resolution feature extraction module to obtain a fused feature map so as to form more comprehensive and rich feature representation;

in this embodiment, compared with the existing fusion method, the multi-resolution feature fusion module realizes strong feature representation capability through the characteristics of parallel connection, repeated multi-scale fusion, aggregation representation, spatial accuracy and the like. Specifically, the features of the original resolution and the features of the 1/2 resolution are added, then the obtained result is added with the features of the 1/4 resolution, finally the obtained result is added with the features of the 1/8 resolution, and the obtained fused feature map with uniform spatial resolution contains information from different scales.

The feature pyramid module is used for carrying out convolution operation on the obtained fused feature graphs through convolution kernels with different sizes and forming a feature pyramid formed by stacking feature graphs with different scales; the feature pyramid module can extract information from feature graphs with different scales, and convolves the feature graphs after the fusion by adopting convolution checks with different sizes, so that a series of feature graphs with different scales are finally formed.

In the embodiment, in the feature pyramid module, convolution kernels with different sizes such as 3x3, 5x5 and 7x7 are adopted to carry out convolution operation on the fused feature graphs; the convolution kernels with different sizes can capture detail features and global features respectively, and finally feature graphs with different scales obtained by convolution are stacked together to form a feature pyramid, so that the model can better understand the human body posture on different scales, and the detection accuracy is improved, especially for the situation of key points with different scales. The construction of the feature pyramid is beneficial to the simultaneous focusing of local and global information of the model, and the task performance is improved. For subsequent processing;

the key point prediction module is used for predicting key points by outputting a key point heat map (key point heat map) according to the feature maps of different scales in the feature pyramid, and the value of each point in the key point heat map is used for indicating the existence and position coordinates of a key point.

In this embodiment, the HRNet deep neural network model is constructed based on a pyretch framework, and simultaneously, multi-resolution input, multi-resolution feature extraction, multi-resolution feature fusion, feature pyramid construction and key point prediction are sequentially realized by parallel connection of high-resolution to low-resolution subnets and repeated multi-scale fusion. The HRNet deep neural network model adopted in the invention is characterized in that basic block standard convolution is replaced by a Ghost module, residual module Bottleneck is replaced by a 3×3 Sandglass module, and attention modules ECA are added in the Ghost module and the Sandglass module to obtain stronger feature extraction capability. In an HRNet depth neural network model constructed based on a PyTorch framework, replacing a Basicblock standard convolution and residual error module Bottleneck in a multi-resolution input module, a multi-resolution feature extraction module, a multi-resolution feature fusion module, a feature pyramid module and a key point prediction module with a Ghost module and a 3 multiplied by 3 Sandglass module respectively, wherein an attention module ECA is added in each of the Ghost module and the Sandglass module; the HRNet deep neural network model can greatly reduce the parameter quantity and ensure the improvement of the reasoning speed and the accuracy, thereby being better applied to the outpatient service of hospitals.

C: training the improved HRNet depth neural network model in the step B by utilizing the training set in the human body posture data set constructed in the step A, and analyzing the human body posture video to be detected through the trained HRNet depth neural network model to obtain a key point sequence consisting of 17 key points of a human body of a PD patient in the human body posture video17 human key points are shown in figure 2;

in the invention, before training by using a training set in a human body posture data set, the HRNet deep neural network model is pre-trained by the existing ImageNet image database; in the training process by using the training set in the human body posture data set, a random gradient descent (SGD) optimizer is adopted, the initial learning rate is 1e-3, the iteration number is 30K, and the initial learning rate is reduced by 10 times when the 10K and 20K iterations are performed respectively; the weight decay rate, batch_size, and momentum parameters were set to 0.0001, 16, and 0.9, respectively.

D: c, obtaining 17 human body key point sequences in the step CExtracting motion characteristic signals to obtain seven corresponding motion characteristic signals;

in step D, the method for extracting the motion feature signal is as follows:

at a given key point sequenceUnder, aiming at the inspection requirement of MDS-UPDRS on the gait inspection project of the Parkinson's disease, seven different motion characteristic signal extraction modes are defined, and the extraction modes are respectively as follows:

Difference in leg ratio R _legs Vertical angle of bodyHorizontal angle of ankle joint->Horizontal angle of wrist +.>Horizontal distance between ankle->Left ankle velocity +.>And right ankle velocity->

Wherein P is _i ^(t) Representing coordinates of the subject key point i at the t-th frame, expressed as a pair valueThe i corresponding to the main body key points at different positions are as follows in sequence:

nose, i=1; left eye, i=2; right eye, i=3; left ear, i=4; right ear, i=5; left shoulder, i=6; right shoulder, i=7; left elbow, i=8; right elbow, i=9; left wrist, i=10; right wrist, i=11; left buttocks, i=12; right buttocks, i=13; left knee, i=14; right knee, i=15; left ankle, i=16; right ankle, i=17; l and R represent left and right, respectively, as shown in fig. 2.

1) Difference in leg ratio R _legs The extraction method comprises the following steps: calculating the length ratio of the left leg to the right leg, and the difference value of the length ratio of the right leg to the left leg, namely;

wherein the subscript legs stands for leg,representing the ratio of the lengths of the left leg and the right leg, < >>Representing the ratio of the lengths of the right leg and the left leg, < >>Vector modular length representing the composition of the twelfth and sixteenth human body key points, namely the length of the left leg;vector modular length consisting of thirteenth and seventeenth human body key points, namely right leg length;

2) Vertical angle of bodyThe extraction method comprises the following steps: calculating the y-axis of the image in the video and passing through the left and right shouldersAn angle of a straight line between a connecting line midpoint of the shoulder key points and connecting line midpoints of the left ankle key point and the right ankle key point;

wherein the subscript body represents the body, the superscript vert represents the vertical angle,an abscissa representing the midpoint of the sixth and seventh human keypoints, +.>An abscissa representing the midpoint of the sixteenth and seventeenth human keypoints,vector modulo length, sin, representing the composition of two midpoints ^-1 Representing an arcsine;

3) Horizontal angle of ankle jointThe extraction method comprises the following steps: calculating an angle between an x-axis of an image in the video and a straight line passing through key points of two left and right ankles;

wherein the subscript ankles represents the ankle, the superscript horiz represents the horizontal angle,and->Ordinate representing key points of sixteenth and seventeenth human body, respectively, +.>And->The abscissa, tan, representing the sixteenth and seventeenth human critical points, respectively ^-1 Represents an arctangent;

4) Horizontal angle of wristThe extraction method comprises the following steps: calculating an angle between an x-axis of an image in the video and a straight line passing through two wrist key points;

/>

wherein the subscript writes stands for wrist, the superscript horiz stands for horizontal angle, And->Ordinate representing tenth and eleventh human body key points, respectively, +.>And->The abscissa of the tenth and eleventh human body key points, respectively;

5) Horizontal distance between ankleThe extraction method comprises the following steps: calculating the absolute value of the distance between the x-axis abscissas of the left ankle and right ankle keypoints, and normalizing by dividing by the estimated height H (t) of the PD patient; in the embodiment, the four sides of the human body boundary box are closely attached to the patient appearing in the image, so that the estimated height of the PD patient is replaced by the high level of the boundary box;

wherein the subscript heel represents the ankle, the superscript horiz represents the horizontal angle,and->The abscissa representing the sixteenth and seventeenth human keypoints, respectively, abs (·) representing the absolute function, H (t) representing the estimated height of the PD patient;

6) Left ankle joint velocityThe extraction method comprises the following steps: calculating Euclidean distance between left ankle joint coordinates in two continuous frames of images, and normalizing by dividing the Euclidean distance by the estimated height H (t) of the PD patient;

wherein the subscript ankle (L) represents the left ankle, the superscript Euclidean distance,and->Respectively representing coordinates of a left ankle joint in a t frame image and a t+1 frame image; />Indicating the euclidean distance between the coordinates of the left ankle joint in the t-th frame image and the t+1-th frame image, and H (t) indicating the estimated height of the PD patient.

7) Right ankle joint velocityThe extraction method comprises the following steps: calculating Euclidean distance between right ankle joint coordinates in two continuous frames of images, and normalizing by dividing the Euclidean distance by the estimated height H (t) of the PD patient;

wherein the subscript ankel (R) represents the right ankle, the superscript Euclidean distance,and->Respectively representing coordinates of a right ankle joint in a t-th frame image and a t+1st frame image; />Indicating the euclidean distance between the coordinates of the right ankle joint in the t-th frame image and the t+1-th frame image, and H (t) indicating the estimated height of the PD patient.

E: for the leg ratio difference Rlegs and the vertical angle of the body obtained in step DAnd the horizontal angle of the ankle jointThe three motion characteristic signals are respectively subjected to Savitzky-Golay smoothing filtering and AMPD peak detection processing to obtain a peak sequence and a trough sequence representing periodic motion change of a PD patient in a human body posture video;

in the invention, the step E comprises the following specific steps:

e1: for the leg ratio difference Rlegs and the vertical angle of the body obtained in step DAnd horizontal angle of ankle joint->The three motion characteristic signals are respectively subjected to Savitzky-Golay filtering;

due to unavoidable small prediction errors or discontinuous data labeling, and the like, the extracted motion characteristic signals are easy to generate slight jitter (high-frequency noise such as peaks, saw teeth and the like), and pseudo local extremum can be generated. Therefore, the present invention also provides the difference R in the obtained leg ratios _legs Vertical angle of bodyAnd horizontal angle of ankle joint->The three motion characteristic signals are subjected to Savitzky-Golay filtering, and parameters are set as follows:

for the resulting difference in leg ratio R _legs When Savitzky-Golay filtering is applied, the window width is 11, and the fitting order is 2;

for the resulting vertical angleWhen Savitzky-Golay filtering is applied, the window width is 15, and the fitting order is 2;

for the obtained horizontal angle of the ankle jointWhen Savitzky-Golay filtering is applied, the window width is 13, and the fitting order is 2;

the three motion characteristic signals after filtering are uniformly expressed as

E2: filtering the three motion characteristic signals obtained in the step E1Regarding as a quasi-periodic signal, adopting a multiscale-based automatic peak detection (Automatic Multiscale-based Peak Detection, AMPD) algorithm to carry out peak detection to respectively obtain corresponding peak sequences +.>And the trough sequence->

Wherein p is ₁ Representing the first peak, v ₁ Representing the first trough.

F: based on the peak sequence obtained in step EColumn ofAnd the trough sequence->Six gait motion characteristic parameters are quantized, wherein the six gait motion parameters are respectively Step frequency, arm swing speed Arm swing velocity, arm swing peak Arm swing amplitude, gesture control Postural control, walking Roughness minimum value Roughness min and walking Roughness maximum value Roughness max;

Six gait movement characteristic parameters are shown below;

step frequency: the number of steps walked per unit time;

arm swing speed: the speed of arm swing per unit time;

arm swing peak: median amplitude at arm swing peak per unit time;

gesture control: variability in patient stride width;

minimum value of walking roughness: setting the minimum value of the roughness degree of the walking of the patient in the time;

maximum walking roughness: setting the maximum value of the walking roughness of the patient in the time;

in the invention, the step F comprises the following specific steps:

/>

f2: based on the seven motion characteristic signals, the difference R of the leg ratios _legs Vertical angle of body And horizontal angle of ankle joint->Number of occurrences r of three motion feature signals at t frame _Rl (t)、r _Vb (t) and r _Ha (t) defining six gait motion characteristic parameters as:

step frequency: by r _Rl (t)、r _Vb (t) and r _Ha (t) calculating a posterior mean value at the kth frame where the last event of the video occurred and as the final step frequency for the patient;

the final step frequency calculation method of the patient is as follows:

firstly, defining lambda-Gamma model expression as:

next, the posterior update of each frame of image is:

wherein F is the frame rate of the video;

wherein alpha is _k And beta _k Is the parameter of the k-th frame posterior.

Arm swing speed: by means of the horizontal angle of the wristAnd a horizontal angle time series signal of the wrist, calculating a median of the absolute first step difference as a median speed, and taking the median speed as an arm swing speed.

The median velocity is a characteristic for describing the rate of change of the time-series signal, and the absolute first-order difference refers to the absolute value of the numerical difference between adjacent time points. In the invention, firstly, the numerical value difference between adjacent time points is calculated, then the absolute value of the difference is taken, and then the median of the absolute value is calculated. The median velocity represents the average velocity or intensity of the change in wrist level angle signal over a period of time, which aids in assessing the rate of swing of the patient's wrist during walking or other activities.

The specific calculation of the arm swing speed is as follows:

the wrist horizontal angle time series signal comprises N data points, which are expressed as

Differences＝[|a ₂ -a ₁ |，|a ₃ -a ₂ |，|a ₄ -a ₃ |，...，|a _N -a _N -1|]；

Median_Velocity＝Median(Differences)；

Wherein Differences represent the absolute value of the difference between adjacent time points and median_velocity represents the Median Velocity.

Arm swing peak: calculating the horizontal angle of the wristAnd median amplitude at the wrist time series signal peak, and taking the median amplitude as an arm swing peak;

the median amplitude of the signal peaks characterizes the amplitude of the signal. In the present invention, the signal is a wrist level angle signal, and the peak value is the maximum point in the signal, generally corresponding to the maximum deflection of the swing arm. To calculate the peak median amplitude, it is first necessary to detect the peaks and valleys (minimum points between peaks) of the signal. Then, a lower limit value is defined using these detected valleys, each peak value is compared with the corresponding lower limit value, and a peak height value from the lower limit value to the peak value is calculated as a peak height value. And finally, taking the median of the peak height values of all the peaks to obtain the median amplitude of the peak, and taking the median amplitude as the arm swing peak. The median amplitude of the peak is helpful for measuring the motion characteristics of the patient in the swing arm aspect, and is an important index for evaluating the gait of the patient.

The arm swing peak, i.e., the median amplitude of the peak, is calculated as follows:

first, let the wrist horizontal angle time series signal include N data points, expressed as Then peak value detection and valley bottom detection are carried out, and peak value and valley bottom of the wrist horizontal angle signal are found by using a peak value detection signal processing technology; for each peak value, finding the lowest valley bottom in two adjacent valley bottoms as a lower limit; calculating the peak value of each peak value, setting the lower limit value of the peak value as L and the peak value as P, and setting the peak value as H= |P-L|; finally, respectively calculating peak height values of all peaks, and taking the median of the obtained peak height values as peak median amplitude;

Median_H＝Median(H)；

wherein median_h represents the Median of the peak amplitudes;

gesture control value: by using horizontal distance between ankleCalculating variability C of stride width of a patient _v ，And will be variable C _v As a gesture control value; wherein sigma represents all->Standard deviation of the data, μ represents allAn average of the data;

the larger the stride width variability value, the worse the gesture control ability is represented, whereas the smaller the value, the better the gesture control ability is.

Minimum value of walking roughness: based on left ankle speed And right ankle speed +.>Respectively calculating differential values of the left ankle and the right ankle, namely absolute acceleration; the median of the absolute acceleration is then calculated and normalized: dividing the obtained normalized median by the value of the corresponding ankle signal on each frame, and finally recoding to obtain the minimum characteristic value;

absolute acceleration is obtained by calculationThe first step of the signal, which is the absolute value of the numerical difference between adjacent time points, is taken to be the value obtained by taking the absolute value of the signal, so as to ensure that all values are positive numbers. The absolute acceleration represents the rapid change in ankle signal or the magnitude of acceleration between adjacent time points, the median of the absolute acceleration is taken and normalized, and then divided by +/per frame>The value of the ankle signal, and the minimum eigenvalue obtained after the last recoding.

Maximum walking roughness: based on left ankle speedAnd right ankle speed +.>Respectively calculating differential values of the left ankle and the right ankle, namely absolute acceleration; the median of the absolute acceleration is then calculated and normalized: dividing the obtained normalized median by the value of the corresponding ankle signal on each frame, and finally recoding to obtain the maximum characteristic value; />

The walking roughness minimum and maximum are calculated as follows:

wherein deltat is the adjacent time interval,and->The differential values, i.e., absolute accelerations, of the left and right ankle, respectively.

The median of the absolute acceleration is then calculated and normalized:

wherein, feature _ankle (L) and Feature _ankle (R) respectively substitutesDividing the median value of the absolute acceleration of the left ankle and the right ankle of the table by the value of the signals of the left ankle and the right ankle of each frame to obtain a normalized characteristic value;

In this embodiment, six gait feature parameters are extracted from seven gait feature signals, and compared with the method of extracting features only once in the prior art, some gait performances of a patient can be found more deeply, and features which are difficult to find by naked eyes can be better shown for a doctor;

In the step G, five classifiers LDA, GBDT, RFC, SVM, XGBOOST are trained respectively, and the GBDT gradient lifting decision tree obtained by comparison has the best effect and is more in line with the diagnosis of a clinician. The confusion matrix results based on the GBDT gradient lifting decision tree are shown in the following table, and the obtained accuracy is 0.829, and the accuracy (+ -1) is 0.967. In addition, the results obtained by RFC random forest, LDA linear discrimination, SVM vector machine and XGBoost extreme gradient lifting are shown in the following table;

model	Accuracy	Accuracy(±1)
			LDA	0.685	0.891
GBDT	0.829	0.967
			RFC	0.813	0.959
SVM	0.781	0.949
			XGBOOST	0.796	0.951

Wherein, the accuracy (+ -1) refers to the overall accuracy of the scoring result including the current score plus one or minus one.

Claims

1. A human body posture video-based parkinsonism gait quantitative analysis method is characterized by comprising the following steps of: the method sequentially comprises the following steps of:

2. The quantitative analysis method for parkinsonism gait based on human posture video according to claim 1, wherein the step a comprises the following specific steps:

3. The human body posture video-based parkinsonism gait quantitative analysis method according to claim 1, wherein: the HRNet deep neural network model is constructed based on a PyTorch frame, and Basicblock standard convolution and residual error modules Bottleneck in a multi-resolution input module, a multi-resolution feature extraction module, a multi-resolution feature fusion module, a feature pyramid module and a key point prediction module are respectively replaced by a Ghost module and a 3 multiplied by 3 Sandglass module, and attention modules ECA are added in the Ghost module and the Sandglass module.

4. The human body posture video-based parkinsonism gait quantitative analysis method according to claim 2, wherein: the multi-resolution feature extraction module adopts a plurality of parallel branches in the HRNet deep neural network model, and each branch performs feature extraction on different input images at different resolutions; each branch adopts a basic residual block of ResNet and is applied to an input image with corresponding resolution; the feature extraction process inside each branch performs feature extraction through a series of convolution, batch normalization and activation functions.

5. The human body posture video-based parkinsonism gait quantitative analysis method according to claim 2, wherein: in the step D, 17 human body key point sequences are obtainedIn P _i ^(t) Representing the coordinates of the subject key point i at the t-th frame, expressed as a pair value +.>The main body key points corresponding to i from 1 to 17 are as follows: nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, and right ankle; wherein:

6. The quantitative analysis method for parkinsonism gait based on human posture video according to claim 1, wherein the step E comprises the following specific steps:

e1: for the leg ratio difference R obtained in step D _legs Vertical angle of bodyAnd the horizontal angle of the ankle jointThe three motion characteristic signals are subjected to Savitzky-Golay smoothing filtering; the three motion characteristic signals after filtering are collectively expressed as +.>

7. The quantitative analysis method for parkinsonism gait based on human posture video according to claim 6, wherein the step F comprises the following specific steps:

f1: based on the peak sequence obtained in step EAnd the trough sequence->Consider a "peak-valley-peak" period as a cycle of motion; given the sampling frequency f, the leg ratio difference R _legs Vertical angle of bodyAnd horizontal angle of ankle joint->The event occurrence number of the three motion characteristic signals is as follows:

f2: based on the seven motion characteristic signals, the difference R of the leg ratios _legs Vertical angle of bodyAnd horizontal angle of ankle joint->Three movementsNumber of event occurrences r of characteristic signal at t frame _Rl (t)、r _Vb (t) and r _Ha (t) respectively calculating six gait movement characteristic parameters:

the final step frequency calculation method of the patient is as follows:

firstly, defining lambda-Gamma model expression as:

where λ is a hypothetical variable and obeys the Gamma distribution of parameters α and β, α ₀ And beta ₀ Is a priori parameter, Y _i Is the number r of event occurrences in the ith frame _Rl (t)、r _Vb (t) and r _Ha The sum of (t), N being the number of elapsed time intervals;

next, the posterior update of each frame of image is:

wherein F is the frame rate of the video;

Wherein alpha is _k And beta _k Parameters for the k-th frame posterior;

(2) Arm swing speed: by means of the horizontal angle of the wristAnd wrist horizontal angle time series signal, calculating the median of absolute first order difference as the medianThe speed, and the middle speed is taken as the swing speed of the arm;

Differences＝[|a ₂ -a ₁ |,|a ₃ -a ₂ |,|a ₄ -a ₃ |,…,|a _N -a _N-1 |]；

Median_Velocity＝Median(Differences)；

Median_H＝Median(H)；

Wherein median_h represents the Median of the peak amplitudes;

(4) Gesture control value: by using horizontal distance between ankleCalculating variability C of stride width of a patient _v ，And will be variable C _v As a gesture control value; wherein sigma represents all->Standard deviation of the data, μ represents allAn average of the data;

(5) Walk roughness minimum and walk roughness maximum:

Then, the median of the absolute acceleration is calculated and normalized:

wherein MedianAcc _ankle(L) Represent the median of the absolute acceleration of the left ankle, medianAcc _ankle(R) Representing the median value of the absolute acceleration of the right ankle;

wherein, feature _ankle(L) And Feature _ankle(R) Median value representing absolute acceleration of left and right ankle divided by left and right ankle signal on each frameAs normalized eigenvalues;

8. The human body posture video-based parkinsonism gait quantification analysis method according to claim 3, wherein: in the step C, before training by using a training set in a human body posture data set, the HRNet deep neural network model is pre-trained by the existing ImageNet image database; in the training process by using the training set in the human body posture data set, a random gradient descent optimizer is adopted, the initial learning rate is 1e-3, the iteration times are 30K, and the times are respectively reduced by 10 times when the 10K and 20K iterations are performed; the weight decay rate, batch_size, and momentum parameters were set to 0.0001, 16, and 0.9, respectively.

9. The human body posture video-based parkinsonism gait quantitative analysis method according to claim 6, wherein the method comprises the following steps of: in the step E1, the obtained leg ratio difference R _legs When Savitzky-Golay filtering is applied, the window width is 11, and the fitting order is 2; for the resulting vertical angleWhen Savitzky-Golay filtering is applied, the window width is 15, and the fitting order is2; horizontal angle for the obtained ankle joint +.>When Savitzky-Golay filtering is applied, the window width is 13 and the fitting order is 2.

10. The human body posture video-based parkinsonism gait quantitative analysis method according to claim 1, wherein: the classifier comprises an LDA linear discrimination model, a GBDT gradient lifting decision tree, a RFC random forest, an SVM vector machine and/or XGBoost extreme gradient lifting.