CN109192221A

CN109192221A - A clustering-based method for detecting Parkinson's disease severity from speech

Info

Publication number: CN109192221A
Application number: CN201811032625.0A
Authority: CN
Inventors: 宝颜鹏; 金博; 魏小鹏
Original assignee: Dalian University of Technology
Current assignee: Dalian University of Technology
Priority date: 2018-03-30
Filing date: 2018-09-05
Publication date: 2019-01-11

Abstract

The invention discloses a clustering-based method for detecting Parkinson's disease severity from speech, comprising the following steps: (1) acquisition of the voice signal; (2) preprocessing of the voice signal; (3) extraction of all speech features, including the fundamental frequency feature (Pitch), fundamental frequency perturbation (Jitter), amplitude perturbation (Shimmer), signal-to-noise ratio features and nonlinear features; (4) model construction and computation; (5) prediction: for each cluster, the corresponding classification and regression models are loaded, a classification result is obtained, and the patient's severity is estimated from the predicted label value. Finally, the prediction is returned to the front end through an interface and shown to the user. The invention is carried out by computer software analysis. It addresses the lack of a fixed clinical index for determining whether a person has Parkinson's disease, as well as the long observation period and high cost of clinical assessment, and it is real-time, efficient and low-cost.

Description

A clustering-based method for detecting Parkinson's disease severity from speech

Technical field

The present invention relates to machine learning, artificial intelligence, voice-based diagnosis and data mining, and more specifically to a clustering-based method for detecting Parkinson's disease severity from speech.

Background art

The support vector machine (SVM) is one of the most popular classifiers. The basic problem an SVM solves is binary classification. Its core idea is to use a convex optimization algorithm to find, from the training data, one hyperplane or a group of hyperplanes that separates the two classes. At prediction time, these hyperplanes determine which class a new sample belongs to. Existing machine-learning-based Parkinson's diagnostic techniques only perform diagnostic classification, that is, they determine whether a suspected patient has Parkinson's disease. Because Parkinson's disease is irreversible, such a yes/no result does little to solve the patient's actual problem.

Support vector regression (SVR) is a regression algorithm adapted from the basic idea of the SVM. Its main idea is to find a hyperplane that maps the samples. Unlike other regression algorithms, if the absolute difference between the mapped value and the true value is smaller than a specified range, that difference is not counted in the loss.

The prior art still has defects. Some methods can only classify whether Parkinson's disease is present and cannot indicate the degree of illness. Other patents do assess severity with the UPDRS (Unified Parkinson's Disease Rating Scale), but they compute and predict over all patients as a whole, so they cannot make personalized predictions according to each patient's condition and do not improve the efficiency of diagnosis.
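The two ideas above, a separating hyperplane for classification and a loss that ignores small errors for regression, can be illustrated with a minimal sketch. The weights `w`, bias `b` and tolerance `epsilon` are illustrative assumptions, not parameters taken from this method.

```python
def svm_decide(x, w, b):
    """Return +1 or -1 depending on which side of the hyperplane w.x + b = 0 x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """SVR-style loss: an error smaller than epsilon contributes nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

For example, with `epsilon=0.1` a prediction of 1.05 for a true value of 1.0 incurs no loss, while a prediction of 1.5 is penalized only for the part of the error beyond the tolerance.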

Summary of the invention

The present invention provides a clustering-based method for detecting Parkinson's disease severity from speech.

To achieve the above object, the method comprises the following steps:

S1. Acquisition of the voice signal

A sustained vowel is selected as the utterance. Using the acquisition device, the following is collected: patient number, name, age, gender, whether the patient has been diagnosed with Parkinson's disease, whether the patient has other diseases that cause voice disorders, duration of illness, UPDRS (motor), UPDRS (total), acquisition date and time, and the sequence number of the acquisition on that day.

S2. Preprocessing of the voice signal

The voice signal is preprocessed, including format conversion, sampling-frequency conversion, pre-emphasis, windowing and framing, removal of unvoiced segments, and fundamental frequency extraction.

S3. Extraction of all speech features

The features include the fundamental frequency feature (Pitch), fundamental frequency perturbation (Jitter), amplitude perturbation (Shimmer), signal-to-noise ratio features and nonlinear features.

S4. Model construction and computation

The classification algorithm is based on the support vector machine, using a linearly separable SVM classifier and a nonlinear model in which the SVM builds the model by introducing a kernel function. The feature data obtained in S3 are matched one-to-one with the information provided by the doctor to form a data set. The data are clustered with k-means, and for each cluster the data set is divided into a training set and a test set in a 3:1 ratio. For each cluster, a classification model is trained on the training set with the SVM classification algorithm, and a regression model is trained with support vector regression (SVR). The models are optimized by grid search, and the trained model parameters of each cluster are saved.

S5. Prediction

For each cluster, the classification and regression models obtained in the training process above are loaded. The data to be judged and predicted are input, the distances to the clusters are computed to determine the cluster to which the data belong, and the model of that cluster is used to classify the data, yielding a classification result. The classification result is then post-processed: for a subject predicted not to have Parkinson's disease, the label value is set to 0; for a subject predicted to have Parkinson's disease, SVR is used for regression prediction to obtain a computed label value, from which the patient's severity is estimated, and the process ends.

S6. Feedback of the result

The prediction result is returned to the front end through an interface and shown to the user.
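The per-cluster 3:1 split of S4 and the routing logic of S5 can be sketched as follows. This is a minimal illustration only: the centroids, the `classifiers` and `regressors` callables and all numeric values are hypothetical stand-ins for the trained k-means, SVM and SVR models, which are not reproduced here.

```python
import math
import random


def nearest_cluster(x, centroids):
    """Return the index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(x, centroids[c]))


def split_3_to_1(samples, seed=0):
    """Shuffle one cluster's samples and split them 3:1 into train and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


def predict_label(x, centroids, classifiers, regressors):
    """S5: route x to its cluster, classify, and regress only if positive."""
    c = nearest_cluster(x, centroids)
    if not classifiers[c](x):      # stand-in for the cluster's SVM classifier
        return 0.0                 # label value is fixed at 0 for non-Parkinson's
    return regressors[c](x)        # stand-in for the cluster's SVR model


# Toy stand-ins (hypothetical values, for illustration only).
centroids = [(0.0, 0.0), (10.0, 10.0)]
classifiers = [lambda x: x[0] > 1.0, lambda x: True]
regressors = [lambda x: 20.0, lambda x: 45.0]
```

Keeping one classifier and one regressor per cluster means a new sample is only ever scored by models trained on similar samples, which is the personalization the method relies on.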

In a preferred embodiment, in step S3 the fundamental frequency feature Pitch is extracted by the autocorrelation method.

For a deterministic signal, the short-time autocorrelation function is defined as:

Then, for the autocorrelation function of each frame, the largest peak after its first zero crossing is located. The corresponding index k is the pitch period of that frame of speech, and the reciprocal of k is the fundamental frequency.

The fundamental frequency is generally denoted F0. Because the voice signal was divided into frames during preprocessing, each frame has a corresponding F0, which yields a fundamental frequency sequence.

The fundamental frequency features are simple statistical parameters computed on the extracted fundamental frequency sequence.

In a preferred embodiment, in step S3 Jitter denotes the perturbation of the fundamental frequency, that is, the degree to which the pitch period deviates from a regular period. Because the recording uses sustained phonation of a long vowel, fundamental frequency perturbations caused by alternating vowels and consonants can be excluded.

Jitter is computed not from the fundamental frequency but from the pitch period, which is defined as follows:

$$T_{0,n} = \frac{1}{F_{0,n}}, \quad n = 1, 2, \ldots, N \tag{2.1}$$

where n is the frame index, N is the total number of frames, $F_{0,n}$ is the fundamental frequency of the n-th frame, and $T_{0,n}$ is the pitch period of the n-th frame.

(1) Jitter: the relative perturbation of the pitch period, i.e. the ratio of the mean absolute difference between adjacent pitch periods to the mean pitch period. It reflects the subject's overall relative control of vocal cord vibration. Its formula is:

(2) Jitter_abs: the absolute perturbation of the pitch period, i.e. the mean absolute difference between adjacent pitch periods. It reflects the subject's overall absolute control of vocal cord vibration. Its formula is:

(3) Jitter_PPQ5: the five-point perturbation of the pitch period, i.e. the mean absolute difference between the pitch period of a frame and the average pitch period of its 5 neighbouring frames. It reflects, to some extent, the subject's control of vocal cord vibration over a period of time. Its formula is:

(4) Jitter_rap: the three-point perturbation of the pitch period, i.e. the mean absolute difference between the pitch period of a frame and the average pitch period of its 3 neighbouring frames. It reflects, to some extent, the subject's control of vocal cord vibration over a period of time. Its formula is:

(5) Jitter_ddp: the difference of the three-point perturbations of the pitch period, i.e. the difference between the differences of the pitch periods of 3 adjacent frames, averaged in absolute value. It reflects, to some extent, the variation in the subject's control of vocal cord vibration over a short time. Its formula is:

In a preferred embodiment, in step S3 Shimmer is the perturbation of the speech amplitude, that is, the degree to which the amplitude deviates from the mean amplitude.

The amplitude measured by Shimmer is defined as:

$$A_{0,n} = \max(P_n) - \min(P_n), \quad n = 1, 2, \ldots, N \tag{3.1}$$

where the sequence $P_n$ is the sequence of voice signal values in the n-th frame and $A_{0,n}$ is the amplitude of the n-th frame.

(1) Shimmer: the amplitude perturbation (as a percentage), i.e. the ratio of the mean absolute difference between adjacent amplitudes to the mean amplitude. It reflects the subject's relative control of voice amplitude. Its formula is:

(2) Shimmer_dB: the amplitude perturbation (in decibels), i.e. the average of the ratio of adjacent amplitudes, expressed in decibels (dB). It reflects the subject's relative control of voice amplitude. Its formula is:

(3) Shimmer_APQ5: the five-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the average amplitude of its 5 neighbouring frames. It reflects, to some extent, the subject's control of voice amplitude over a period of time. Its formula is:

(4) Shimmer_APQ3: the three-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the average amplitude of its 3 neighbouring frames. It reflects, to some extent, the subject's control of voice amplitude over a period of time. Its formula is:

(5) Shimmer_dda: the difference of the three-point perturbations of the amplitude, i.e. the difference between the differences of the amplitudes of 3 adjacent frames, averaged in absolute value. It reflects, to some extent, the variation in the subject's control of voice amplitude over a short time. Its formula is:

(6) Shimmer_APQ11: the eleven-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the average amplitude of its 11 neighbouring frames. It reflects, to some extent, the subject's control of voice amplitude over a longer period of time. Its formula is:

In a preferred embodiment, in step S3 the nonlinear features are detrended fluctuation analysis (DFA), recurrence period density entropy (RPDE) and pitch period entropy (PPE).

The invention is carried out by computer software analysis. It addresses the lack of a fixed clinical index for determining whether a person has Parkinson's disease, as well as the long observation period and high cost of clinical assessment, and it is real-time, efficient and low-cost. The clustering-based method can not only judge from speech whether a subject has Parkinson's disease but also assess the severity of the disease from the predicted UPDRS value. Clustering divides the data into multiple groups, and performing prediction and regression within these smaller groups can significantly improve prediction accuracy. The prediction can then inform the decision on what auxiliary treatment the patient should receive next, which benefits the personalized treatment of patients with Parkinson's disease.

Brief description of the drawings

Fig. 1 is the model construction flow chart;

Fig. 2 is the UPDRS prediction flow chart;

Fig. 3 shows the linearly separable SVM classifier;

Fig. 4 shows the linear SVR regressor.

Detailed description of the embodiments

I. Acquisition of the voice signal

1.1 Selection of the utterance

The utterance to be recorded must be brief while still reflecting the patient's voice disorder to some extent. Considering that speakers differ in language, dialect and accent, and that the effects of aphasia must be avoided, sustained phonation was chosen. This approach is widely used, effective and easy to carry out.

A vowel was chosen as the utterance because different sounds are produced by different mechanisms:

(1) Vowel: a sound produced when the airflow passes through the oral cavity without obstruction. The airflow is exhaled from the lungs, passes through the glottis and strikes the vocal cords so that they vibrate evenly. It then passes unobstructed through the oral and nasal cavities, and different sounds are produced by adjusting the tongue and lips. The vocal cords therefore necessarily vibrate when a vowel is produced.

(2) Consonant: a sound produced when the airflow is obstructed in the oral cavity or pharynx. The airflow is obstructed in various ways by several articulators, and the sound relies mainly on this obstruction. The vocal cords therefore do not necessarily vibrate when a consonant is produced.

1.2 Acquisition equipment

Unlike earlier experiments that used professional recording equipment, the recordings here were made with ordinary mobile phones so that the method can be packaged as an app. Because phones run different operating systems, two phones were used for acquisition: an iPhone 5s and an OPPO R9s. The system uses the Python Flask framework to build a back-end processing platform and an audio transmission interface on a server. Recordings are collected with the purpose-built app, and the voice files are stored on the back-end server for subsequent processing.

1.3 Acquisition scheme

To obtain a large amount of experimental data, each Parkinson's patient is measured multiple times (on state, off state, before medication, after medication, 1 hour after medication, 3 hours after medication). Sampling takes place in a quiet room in which the subject can relax, and the person assisting with the recording must also remain quiet. If an unexpected noise occurs during recording, the recording is deleted and made again to guarantee recording quality. The collected content includes: patient number, name, age, gender, whether the patient has been diagnosed with Parkinson's disease, whether the patient has other diseases that cause voice disorders, duration of illness, UPDRS (motor), UPDRS (total), acquisition date and time, and the sequence number of the acquisition on that day.

II. Preprocessing of the voice signal

The voice signal is preprocessed, including format conversion, sampling-frequency conversion, pre-emphasis, windowing and framing, removal of unvoiced segments, and fundamental frequency extraction, using Matlab R2017a as the processing tool. (Implementations of these steps are readily available online.)
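Three of the preprocessing steps named above (pre-emphasis, windowing and framing) can be sketched in a few lines. This is a generic illustration, not the Matlab code used in the experiment: the pre-emphasis coefficient 0.97, the Hamming window and the frame and hop lengths are assumed, commonly used choices.

```python
import math


def pre_emphasis(x, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed, common value."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def hamming(length):
    """Hamming window of the given length (assumed window type)."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames and apply the window to each frame."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frames.append([x[start + n] * window[n] for n in range(frame_len)])
    return frames
```

Each returned frame is then the unit on which the per-frame features in the following sections are computed.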

III. Extraction of all speech features

The features include the fundamental frequency feature (Pitch), fundamental frequency perturbation (Jitter), amplitude perturbation (Shimmer), signal-to-noise ratio features and nonlinear features. The speech features are computed as follows:

Table 1. All voice disorder features used

1. Fundamental frequency feature (Pitch)

The fundamental frequency is extracted with the most common approach, the autocorrelation method. It is easy to interpret and well suited to a mobile phone application that needs fast speech processing. The computation is fairly simple: the autocorrelation function is used to estimate the fundamental frequency.

First, the short-time autocorrelation of each previously extracted speech frame is computed. For a deterministic signal, the short-time autocorrelation function is defined as:

Then, for the autocorrelation function of each frame, the largest peak after its first zero crossing is located. The corresponding index k is the pitch period of that frame of speech, and the reciprocal of k is the fundamental frequency.

(1) F0_mean: the mean of the fundamental frequency sequence. It reflects the overall level of the subject's vocal cord vibration frequency, which differs somewhat between men and women.

(2) F0_max: the maximum of the fundamental frequency sequence, i.e. the highest vocal cord vibration frequency of the subject.

(3) F0_min: the minimum of the fundamental frequency sequence, i.e. the lowest vocal cord vibration frequency of the subject.

(4) F0_median: the median of the fundamental frequency sequence, which also reflects, to some extent, the overall level of the subject's vocal cord vibration frequency.

(5) F0_std: the standard deviation of the fundamental frequency sequence, which reflects the dispersion of the subject's vocal cord vibration frequency.
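The autocorrelation pitch estimate and the five statistics above can be sketched as follows. The sketch assumes an unnormalized short-time autocorrelation, a lag k counted in samples (so that F0 in Hz is the sampling rate `fs` divided by k), and the sample standard deviation for F0_std. None of these details is fixed by the text.

```python
import math
import statistics


def autocorr(frame, k):
    """Short-time autocorrelation of one frame at lag k (unnormalized)."""
    return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))


def pitch_lag(frame):
    """Lag of the largest autocorrelation value after the first zero crossing."""
    r = [autocorr(frame, k) for k in range(len(frame))]
    crossing = next((k for k, v in enumerate(r) if v <= 0), None)
    if crossing is None:
        return None  # no zero crossing found; treat the frame as unvoiced
    return max(range(crossing, len(r)), key=lambda k: r[k])


def frame_f0(frame, fs):
    """F0 in Hz, assuming the lag is counted in samples at sampling rate fs."""
    k = pitch_lag(frame)
    return fs / k if k else None


def f0_statistics(f0_seq):
    """The five summary statistics of a fundamental frequency sequence."""
    return {
        "F0_mean": statistics.mean(f0_seq),
        "F0_max": max(f0_seq),
        "F0_min": min(f0_seq),
        "F0_median": statistics.median(f0_seq),
        "F0_std": statistics.stdev(f0_seq),
    }
```

Applied to a synthetic 100 Hz sine sampled at 8000 Hz, `frame_f0` should return a value close to 100 Hz, limited by the one-sample resolution of the lag.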

2. Fundamental frequency perturbation (Jitter)

Jitter literally means shaking. In probability theory it refers to the deviation between the actual and the ideal time of occurrence of an event, and it consists of a deterministic component and a Gaussian (random) component.

Here, Jitter denotes the perturbation of the fundamental frequency, that is, the degree to which the pitch period deviates from a regular period. Because the recording uses sustained phonation of a long vowel, fundamental frequency perturbations caused by alternating vowels and consonants can be excluded, so Jitter reflects, to some extent, the subject's control of vocal cord vibration.

Jitter is computed not from the fundamental frequency but from the pitch period, which is defined as follows:

$$T_{0,n} = \frac{1}{F_{0,n}}, \quad n = 1, 2, \ldots, N \tag{2.1}$$

where n is the frame index, N is the total number of frames, $F_{0,n}$ is the fundamental frequency of the n-th frame, and $T_{0,n}$ is the pitch period of the n-th frame.

(1) Jitter: the relative perturbation of the pitch period, i.e. the ratio of the mean absolute difference between adjacent pitch periods to the mean pitch period. It reflects the subject's overall relative control of vocal cord vibration. Its formula is:

(2) Jitter_abs: the absolute perturbation of the pitch period, i.e. the mean absolute difference between adjacent pitch periods. It reflects the subject's overall absolute control of vocal cord vibration. Its formula is:

(3) Jitter_PPQ5: the five-point perturbation of the pitch period, i.e. the mean absolute difference between the pitch period of a frame and the average pitch period of its 5 neighbouring frames. It reflects, to some extent, the subject's control of vocal cord vibration over a period of time. Its formula is:

(4) Jitter_rap: the three-point perturbation of the pitch period, i.e. the mean absolute difference between the pitch period of a frame and the average pitch period of its 3 neighbouring frames. It reflects, to some extent, the subject's control of vocal cord vibration over a period of time. Its formula is:

(5) Jitter_ddp: the difference of the three-point perturbations of the pitch period, i.e. the difference between the differences of the pitch periods of 3 adjacent frames, averaged in absolute value. It reflects, to some extent, the variation in the subject's control of vocal cord vibration over a short time. Its formula is:
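The five Jitter measures can be sketched directly from their verbal definitions. Two points are assumptions rather than statements from the text: the neighbourhood of a frame is taken to be centred on that frame and to include it, and Jitter_PPQ5, Jitter_rap and Jitter_ddp are divided by the mean pitch period, as is conventional in tools such as Praat.

```python
def _mean(values):
    return sum(values) / len(values)


def jitter_features(T):
    """T: per-frame pitch periods T_{0,n}, e.g. in seconds."""
    N = len(T)
    mean_T = _mean(T)
    jitter_abs = _mean([abs(T[n] - T[n + 1]) for n in range(N - 1)])

    def ppq(width):
        # Deviation of each frame from the average of a centred window.
        half = width // 2
        if N < width:
            return None
        devs = [abs(T[n] - _mean(T[n - half:n + half + 1]))
                for n in range(half, N - half)]
        return _mean(devs) / mean_T

    # Difference between consecutive differences, averaged in absolute value.
    ddp = _mean([abs((T[n + 1] - T[n]) - (T[n] - T[n - 1]))
                 for n in range(1, N - 1)]) / mean_T

    return {
        "Jitter": jitter_abs / mean_T,
        "Jitter_abs": jitter_abs,
        "Jitter_PPQ5": ppq(5),
        "Jitter_rap": ppq(3),
        "Jitter_ddp": ddp,
    }
```

Under these assumptions Jitter_ddp is exactly three times Jitter_rap, which is a convenient sanity check on an implementation.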

3. Amplitude perturbation (Shimmer)

Shimmer literally means flickering. Here it plays the same role as Jitter but measures a different quantity: the perturbation of the speech amplitude, that is, the degree to which the amplitude deviates from the mean amplitude.

The amplitude measured in the experiment depends not only on the subject's phonation but also on other factors such as the distance between the subject and the phone, so the raw amplitude is difficult to use directly as a feature. However, one characteristic of Parkinson's patients is that their voice tends to become quieter as they speak, so they control voice amplitude less well than healthy people, and Shimmer captures this well.

The amplitude measured by Shimmer is defined as:

$$A_{0,n} = \max(P_n) - \min(P_n), \quad n = 1, 2, \ldots, N \tag{3.1}$$

where the sequence $P_n$ is the sequence of voice signal values in the n-th frame and $A_{0,n}$ is the amplitude of the n-th frame.

(1) Shimmer: the amplitude perturbation (as a percentage), i.e. the ratio of the mean absolute difference between adjacent amplitudes to the mean amplitude. It reflects the subject's relative control of voice amplitude. Its formula is:

(2) Shimmer_dB: the amplitude perturbation (in decibels), i.e. the average of the ratio of adjacent amplitudes, expressed in decibels (dB). It reflects the subject's relative control of voice amplitude. Its formula is:

(3) Shimmer_APQ5: the five-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the average amplitude of its 5 neighbouring frames. It reflects, to some extent, the subject's control of voice amplitude over a period of time. Its formula is:

(4) Shimmer_APQ3: the three-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the average amplitude of its 3 neighbouring frames. It reflects, to some extent, the subject's control of voice amplitude over a period of time. Its formula is:

(5) Shimmer_dda: the difference of the three-point perturbations of the amplitude, i.e. the difference between the differences of the amplitudes of 3 adjacent frames, averaged in absolute value. It reflects, to some extent, the variation in the subject's control of voice amplitude over a short time. Its formula is:

(6) Shimmer_APQ11: the eleven-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the average amplitude of its 11 neighbouring frames. It reflects, to some extent, the subject's control of voice amplitude over a longer period of time. Its formula is:
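The Shimmer measures follow the same pattern as Jitter, applied to the per-frame amplitudes of formula (3.1). In this sketch the window is assumed to be centred on the frame and to include it, the APQ and dda measures are assumed to be divided by the mean amplitude, and Shimmer_dB is assumed to average the absolute value of 20·log10 of the ratio of adjacent amplitudes. These conventions are common but are not spelled out in the text.

```python
import math


def frame_amplitude(frame):
    """A_{0,n}: peak-to-peak amplitude of one frame, as in formula (3.1)."""
    return max(frame) - min(frame)


def _mean(values):
    return sum(values) / len(values)


def shimmer_features(A):
    """A: per-frame amplitudes A_{0,n} (all assumed positive)."""
    N = len(A)
    mean_A = _mean(A)

    def apq(width):
        # Deviation of each frame from the average of a centred window.
        half = width // 2
        if N < width:
            return None  # not enough frames for this window
        devs = [abs(A[n] - _mean(A[n - half:n + half + 1]))
                for n in range(half, N - half)]
        return _mean(devs) / mean_A

    return {
        "Shimmer": _mean([abs(A[n] - A[n + 1]) for n in range(N - 1)]) / mean_A,
        "Shimmer_dB": _mean([abs(20 * math.log10(A[n + 1] / A[n]))
                             for n in range(N - 1)]),
        "Shimmer_APQ3": apq(3),
        "Shimmer_APQ5": apq(5),
        "Shimmer_APQ11": apq(11),
        "Shimmer_dda": _mean([abs(A[n + 1] - 2 * A[n] + A[n - 1])
                              for n in range(1, N - 1)]) / mean_A,
    }
```

Because every measure is a ratio or a relative difference, a constant gain change (for example a phone held further away) leaves the values unchanged, which is why Shimmer is usable where raw amplitude is not.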

4. Signal-to-noise ratio features

The harmonics-to-noise ratio (HNR) and the noise-to-harmonics ratio (NHR) are a commonly used pair of features for assessing the proportion of noise in speech, and their diagnostic value in speech pathology has been widely reported. They measure well the additive noise in the acoustic waveform of a voiced speech signal. Since the long vowel used in this experiment is a voiced signal, HNR and NHR are good voice disorder features.

If the recorded voice signal is clean enough, most of the noise in it comes from incomplete closure of the vocal cords during vibration. Parkinson's patients have insufficient control of the muscles near the vocal cords, which can cause tremor, so these features can be used to assess the ability of the vocal cords to close during phonation in Parkinson's patients.

A relatively simple calculation method is again based on the autocorrelation function. First, the short-time autocorrelation function of each frame is computed with the method from the fundamental frequency subsection. A parameter l_max is defined as the index k, other than zero lag, at which the autocorrelation value is largest; it approximates the pitch period of the frame. Then, according to formula (1.1), R_xx(l_max) is the autocorrelation value at that maximum point.

The harmonic component is then:

The noise component is:

HNR (in decibels) is then defined as:

NHR (as a ratio) is then defined as:

Each frame yields one HNR and one NHR value, so, as with the fundamental frequency features, simple statistical parameters are computed on the HNR and NHR sequences.

(1) HNR_mean: the mean of the harmonics-to-noise ratio. It assesses, on a decibel scale, the subject's average vocal cord closure ability during phonation.

(2) HNR_std: the standard deviation of the harmonics-to-noise ratio. It assesses, on a decibel scale, the variability of the subject's vocal cord closure during phonation.

(3) NHR_mean: the mean of the noise-to-harmonics ratio. It assesses, on a linear scale, the subject's average vocal cord closure ability during phonation.

(4) NHR_std: the standard deviation of the noise-to-harmonics ratio. It assesses, on a linear scale, the variability of the subject's vocal cord closure during phonation.
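A per-frame HNR and NHR computation along these lines can be sketched as follows. The definitions used here are assumptions in the usual autocorrelation form: harmonic energy R_xx(l_max), noise energy R_xx(0) − R_xx(l_max), HNR = 10·log10(harmonic/noise) and NHR = noise/harmonic. The search for l_max starts after the first zero crossing, as in the pitch subsection, so that lags adjacent to zero are not picked.

```python
import math


def autocorr(frame, k):
    """Short-time autocorrelation of one frame at lag k (unnormalized)."""
    return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))


def hnr_nhr(frame):
    """Return (HNR in dB, NHR as a ratio) for one voiced frame, or None."""
    r = [autocorr(frame, k) for k in range(len(frame))]
    crossing = next((k for k, v in enumerate(r) if v <= 0), None)
    if crossing is None or crossing == 0:
        return None  # silent or unusable frame
    l_max = max(range(crossing, len(r)), key=lambda k: r[k])
    harmonic = r[l_max]            # assumed harmonic energy
    noise = r[0] - harmonic        # assumed noise energy
    if harmonic <= 0 or noise <= 0:
        return None
    return 10 * math.log10(harmonic / noise), noise / harmonic
```

Because the unnormalized autocorrelation shrinks with lag, even a perfectly periodic frame gives a finite HNR in this sketch; a practical implementation would normally compensate for the window before forming the ratio.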

5. Nonlinear features

Experiments show that models built on nonlinear features distinguish the voices of Parkinson's patients from those of healthy people better, so in recent years more and more studies have worked on new nonlinear features.

Nonlinear features are more complicated to compute. The nonlinear features used in this experiment are detrended fluctuation analysis (DFA), recurrence period density entropy (RPDE), correlation dimension (D2) and pitch period entropy (PPE). The principles and calculation methods of these features are introduced below.

5.1 DFA (detrended fluctuation analysis)

Detrended fluctuation analysis (DFA) is one of the most direct methods for assessing the fractal autocorrelation of a signal. It was originally used to assess variation in the ordering of DNA chain structure. Its main idea is to remove the inherent trend in the signal in order to obtain its fluctuation characteristics. It is a nonlinear model, used mainly to compute the scaling exponent of non-stationary time series (in statistics, series whose properties such as variance and autocorrelation change over time).

The computation is fairly involved. First, the input signal x(n) is cumulatively summed:

It is generally assumed that x(n) is independent and identically distributed, so y(n) can be regarded as statistically autocorrelated. y(n) is then divided into non-overlapping segments. By default, the number of segments ranges from 2 up to the integer closest to log₂ M, where M is the total number of samples of the input signal.

After segmentation, each segment is fitted with a straight line, usually by first-order least squares (linear regression), which gives a slope a and an intercept b for each segment. The mean square error between the fit and the true values (i.e. the fluctuation) is then computed:

where L is the length of each segment, y_i(n) is the n-th value of y in the i-th segment, and a_i and b_i are the slope and intercept fitted for the i-th segment.

If the segmentation scale takes N values, i.e. L takes N values, the corresponding F(L) must be computed for each of these N values. Because the relationship between the segment length L and the scaling exponent α is F(L) ∝ L^α, α cannot be obtained directly. The relationship is usually mapped into log-log space, i.e. log(F(L)) ∝ α log(L), and fitted linearly, again by first-order least squares (linear regression). This yields α, which is the DFA value.
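The DFA steps (cumulative sum, per-segment linear fit, fluctuation F(L), log-log fit) can be sketched as follows. The segment lengths are passed in explicitly because the default scale selection is not fully specified here, and F(L) is taken as the root of the mean squared residual; both are assumptions of this sketch.

```python
import math


def linear_fit(xs, ys):
    """First-order least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa(x, seg_lengths):
    """Return the scaling exponent alpha of the signal x."""
    # Step 1: cumulative sum y(n) of the input.
    y, total = [], 0.0
    for value in x:
        total += value
        y.append(total)
    log_L, log_F = [], []
    for L in seg_lengths:
        squared, count = 0.0, 0
        # Step 2: non-overlapping segments, each detrended by a line.
        for start in range(0, len(y) - L + 1, L):
            seg = y[start:start + L]
            a, b = linear_fit(list(range(L)), seg)
            squared += sum((seg[n] - (a * n + b)) ** 2 for n in range(L))
            count += L
        # Step 3: fluctuation at this scale.
        log_L.append(math.log(L))
        log_F.append(math.log(math.sqrt(squared / count)))
    # Step 4: slope in log-log space is the DFA value.
    alpha, _ = linear_fit(log_L, log_F)
    return alpha
```

As a rough check, uncorrelated noise should give an exponent near 0.5 and its running sum (a random walk) an exponent near 1.5.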

5.2 RPDE (recurrence period density entropy)

Recurrence period density entropy (RPDE) is an algorithm based on chaos theory for finding the periodicity of a signal. It measures how strongly a time series repeats itself. Its basic idea is to measure the repetitiveness of the signal, i.e. its recurrence periods, in phase space, so that the amount of random noise can be judged. It is a nonlinear model.

Its computation involves chaos theory.

(1) Chaos theory

Some basics of chaos theory are introduced first. Unlike ordinary time series analysis, chaotic time series analysis requires mapping the series into phase space, i.e. phase space reconstruction. In general, according to the embedding theorem, phase space can be reconstructed by the coordinate delay method:

$$X = \{ X_i \mid X_i = [x_i, x_{i+\tau}, x_{i+2\tau}, \ldots, x_{i+(m-1)\tau}]^T,\ i = 1, 2, \ldots, M \} \tag{5.2.1}$$

In formula (5.2.1), $x_i$ is the i-th value of the input time series x, $X_i$ is an embedding vector produced by phase space reconstruction, τ is the embedding delay, m is the embedding dimension (the length of each embedding vector), and M is the number of embedding vectors. The choice of embedding delay and embedding dimension is a crucial part of phase space reconstruction.
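The coordinate delay reconstruction of formula (5.2.1) amounts to a few lines of code. The sketch assumes the usual count of embedding vectors, M = N − (m − 1)τ, for an input series of length N, which the text does not state explicitly.

```python
def delay_embed(x, m, tau):
    """Return the embedding vectors X_i = [x_i, x_{i+tau}, ..., x_{i+(m-1)tau}]."""
    M = len(x) - (m - 1) * tau  # assumed number of embedding vectors
    return [[x[i + j * tau] for j in range(m)] for i in range(M)]
```

For example, `delay_embed([0, 1, 2, 3, 4, 5], m=3, tau=2)` yields the two vectors `[0, 2, 4]` and `[1, 3, 5]`.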

(2) C-C algorithm

Many theories exist for determining τ and m. The C-C algorithm is one of the most classical and can determine both directly. It is based on the correlation integral:

where d_ij = ||X_i - X_j||_∞ is the infinity norm of the difference, and N is the length of the input time series.

For the computation, the input time series is decomposed into the following t subsequences:

where N is an integer multiple of t. Next:

Adjusting the radius r, define:

$$\Delta S_2(m, r, t) = \max_j \{ S_2(m, r_j, t) \} - \min_j \{ S_2(m, r_j, t) \} \tag{5.2.6}$$

The values of N, m and r can usually be chosen from BDS statistical results: N = 3000, m = 2, 3, 4, 5, and r_i = i × 0.5σ for i = 1, 2, 3, 4, where σ is the standard deviation of the input time series. Then:

The t corresponding to the first zero of the averaged S₂ statistic, or to the first minimum of the averaged ΔS₂, is the optimal delay τ.

The t corresponding to the minimum of S_2cor(t) gives the embedding window L = (m-1)t, from which the optimal m is obtained.

(3) RPDE

Once m and τ have been obtained, the M embedding vectors X follow from formula (5.2.1), and the close-returns method is applied. A radius r and a vector P of length M are defined in advance, with P initialized to the zero vector. The i-th embedding vector X_i is chosen first and compared with the vectors that follow it. Counting starts at the first X_j whose distance to X_i is less than r and stops at the first X_{j+n} whose distance to X_i is greater than r, so that n-1 vectors lie within distance r of X_i; the (n-1)-th entry of P is then incremented by 1. The same operation is repeated for the next vector X_{i+1}, until all vectors have been processed.

RPDE is then:

where T_max is the length of P that is used; by default the whole of P is used.
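The final entropy step can be sketched in Python as follows. The sketch assumes the commonly used normalized form, H = -Σ p_i ln p_i / ln T_max, where p_i is the histogram P divided by its total count; the function name `rpde_from_histogram` is illustrative, and the histogram P is taken as already computed by the close-returns procedure.

```python
import math


def rpde_from_histogram(counts):
    """Normalized entropy of a recurrence-time histogram.

    counts plays the role of P; T_max is taken as len(counts), i.e. the
    whole of P is used (the default described in the text).
    """
    total = sum(counts)
    t_max = len(counts)
    if total == 0 or t_max < 2:
        raise ValueError("histogram must be non-empty with at least two bins")
    entropy = 0.0
    for c in counts:
        if c > 0:  # empty bins contribute nothing (0 * ln 0 := 0)
            p = c / total
            entropy -= p * math.log(p)
    return entropy / math.log(t_max)
```

With this normalization a perfectly periodic signal (all counts in one bin) scores 0 and a uniform spread of recurrence times scores 1.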

5.3 D2 (correlation dimension)

The correlation dimension (D2) is another measure derived from chaos theory; it reflects the degree of correlation in the signal.

D2 is computed from the correlation integral C(r), defined as in formula (5.2.2) except that d_ij uses the 2-norm instead of the infinity norm, i.e. d_ij = ||X_i - X_j||₂. In formula (5.2.2) r is the radius, and the correlation exponent v is assumed to relate to r as C(r) ∝ r^v. As with DFA, this can be mapped into log-log space, where log(C(r)) is linear in log(r) with slope v, and fitted linearly. A first-order least-squares fit (linear regression) gives v, which is the D2 value.
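The log-log fit can be illustrated with a short Python sketch. It takes a list of radii and the corresponding correlation-integral values as given and returns the least-squares slope; the name `loglog_slope` is illustrative.

```python
import math


def loglog_slope(radii, c_values):
    """Least-squares slope of log(C(r)) against log(r), i.e. the exponent v."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(c) for c in c_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Radii for which C(r) is zero must be dropped before the call, since their logarithm is undefined.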

5.4 PPE (pitch period entropy)

Pitch period entropy (PPE) is a feature that measures the stability of the fundamental frequency during sustained phonation. The idea is to map the fundamental frequency onto a logarithmic scale, remove various confounding factors, and then compute the entropy of the pitch period. It is a nonlinear measure.

The fundamental frequency F₀ is first converted to a logarithmic semitone sequence:

F_0,per = 12 log₂(F₀ / 127) (5.4.1)
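As a small Python illustration, formula (5.4.1) is read here as the usual semitone mapping relative to a 127 Hz reference; that reading, and the function name `to_semitones`, are assumptions.

```python
import math


def to_semitones(f0_values, reference_hz=127.0):
    """Map fundamental frequencies in Hz to semitones relative to reference_hz."""
    return [12.0 * math.log2(f0 / reference_hz) for f0 in f0_values]
```

Doubling the frequency raises the result by exactly 12 semitones, so equal pitch ratios map to equal distances regardless of the speaker's average pitch.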

The power spectrum of F_0,per then needs to be flattened. This experiment uses a common linear whitening filter: the power spectrum G_x(ω) of the input speech signal is computed first, and the filter is then:

Spectral flattening removes the influence of the mean semitone, which depends on gender and on the content spoken. The filtered sequence is denoted r, and its power spectrum P(r) is computed. Finally, the entropy is computed in the same way as for RPDE:

where L is the length of the power spectrum sequence.

All of the features above are computed for every recorded audio segment.

4. Models and calculation process

The support vector machine (SVM) is one of the most classical classification algorithms. Its main idea is to find a hyperplane that separates the samples while making the margin between the positive and negative samples as large as possible. Fig. 3 illustrates a basic linearly separable SVM classifier.

Assume M samples {(x_i, y_i) | i ∈ {1, 2, ..., M}}, where x_i is an n-dimensional vector, the feature vector of the i-th sample, whose n components are the sample's n features, and y_i ∈ {+1, -1} is the class label of the i-th sample, i.e. positive or negative. In Fig. 3, H is the separating hyperplane, described by the linear equation:

w^T x + b = 0 (1)

where w = (w₁; w₂; ...; w_n) is the normal vector of H, also n-dimensional, and b is the bias term, which sets the offset of H from the origin. For the i-th sample (x_i, y_i), w^T x_i + b > 0 if y_i = +1 and w^T x_i + b < 0 if y_i = -1, so the sign of w^T x + b can be used to classify.
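The decision rule can be written as a few lines of Python. This is only an illustrative sketch with hypothetical names (`decision_value`, `classify`); w and b are taken as already known.

```python
def decision_value(w, b, x):
    """Evaluate w^T x + b for one feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 if x lies on the positive side of the hyperplane, else -1."""
    return 1 if decision_value(w, b, x) > 0 else -1
```

A point lying exactly on H (decision value 0) is assigned to the negative class in this sketch; that tie-breaking choice is arbitrary.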

H1 and H2 are two hyperplanes parallel to H and equidistant from it, with H1: w^T x + b = -1 and H2: w^T x + b = 1. In SVM, a good classifier must satisfy two conditions: (1) no sample lies between H1 and H2; (2) the distance between H1 and H2 is as large as possible.

The distance between H1 and H2 follows from the formula for the distance between parallel lines:

D = 2 / ||w|| (2)

where D is the distance between the two hyperplanes H1 and H2. By condition (1), no sample may lie between H1 and H2; by condition (2), D must be maximized, which is equivalent to minimizing ||w||. The problem therefore becomes:

min_{w,b} (1/2) ||w||^2 (3)

s.t. y_i(w^T x_i + b) ≥ 1, i = 1, 2, ..., M

Formula (3) is the basic SVM model. It is clearly a convex quadratic programming problem, so it can be solved with convex optimization methods: following the Lagrange multiplier method, the optimum is obtained through the dual problem.

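Lagrange multipliers are first introduced for the problem above, giving the Lagrangian:

L(w, b, α) = (1/2) ||w||^2 + Σ_{i=1}^{M} α_i (1 - y_i(w^T x_i + b)) (4)

where α = (α₁; α₂; ...; α_M). Setting the partial derivatives of this function with respect to w and b to zero gives:

w = Σ_{i=1}^{M} α_i y_i x_i (5)

0 = Σ_{i=1}^{M} α_i y_i (6)

Substituting formulas (5) and (6) into (4) yields the dual problem:

max_α Σ_{i=1}^{M} α_i - (1/2) Σ_{i=1}^{M} Σ_{j=1}^{M} α_i α_j y_i y_j x_i^T x_j (7)

s.t. Σ_{i=1}^{M} α_i y_i = 0

α_i ≥ 0, i = 1, 2, ..., M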

This problem is usually solved with the SMO algorithm. By duality theory, w can be obtained once α is known and vice versa, so the optimization over w is converted into an optimization over α. Once the optimum α* is found, substituting it into formula (4) gives w* and b*, and hence the optimal hyperplane H*. When a new sample x is input, substituting it into the decision function f(x) = sign(w*^T x + b*), which follows from formula (1), gives the class of the sample.

However, if the data are noisy, applying the above model directly introduces error. The model can instead be allowed to let some data points deviate from the hyperplane within a certain range, which is done by introducing slack variables ξ_i, with ξ_i ≥ 0:

min_{w,b,ξ} (1/2) ||w||^2 + C Σ_{i=1}^{M} ξ_i (8)

s.t. y_i(w^T x_i + b) + ξ_i ≥ 1, i = 1, 2, ..., M

ξ_i ≥ 0, i = 1, 2, ..., M

After the slack variables ξ_i are introduced, the functional margin of a sample point may fall below the required value, so some sample points may lie between the hyperplanes or even in the region on the other side. The second term of the objective function is a penalty on outliers: the more outliers there are, the larger the objective value, so the objective must be made as small as possible. C is the weight given to outliers; the larger C is, the larger the objective value.
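The role of the penalty term can be made concrete with a short Python sketch. It assumes the usual soft-margin objective (1/2)||w||² + C Σ ξ_i, and uses the fact that, for fixed w and b, the smallest slack satisfying both constraints is ξ_i = max(0, 1 - y_i(w^T x_i + b)). The function name `soft_margin_objective` is illustrative.

```python
def soft_margin_objective(w, b, X, y, C):
    """Return (objective value, slack list) for fixed w and b.

    Each slack is the smallest xi_i >= 0 with y_i (w^T x_i + b) + xi_i >= 1.
    """
    slack = []
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        slack.append(max(0.0, 1.0 - margin))
    objective = 0.5 * sum(wj * wj for wj in w) + C * sum(slack)
    return objective, slack
```

Samples that satisfy the margin get zero slack and add nothing to the objective; only points inside the margin or on the wrong side are penalized, in proportion to C.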

In practice, however, a linear model usually cannot classify the data accurately (the data are not linearly separable), and a nonlinear model is required. SVM handles this by introducing a kernel function: the samples are mapped from the original space into a higher-dimensional feature space in which they become linearly separable.

Let φ(x) denote the feature vector of x after mapping. Substituting it into formula (8) gives:

The computation requires φ(x_i)^T φ(x_j), which motivates the kernel function:

κ(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩ = φ(x_i)^T φ(x_j) (10)

Because of this, the choice of kernel function is closely tied to the performance of the classifier. Table 5.1 lists several common kernel functions.

Table 5.1 Common kernel functions

Solving finally gives:

SVM can be implemented in many ways. The most popular at present is the LIBSVM toolbox developed by Professor Chih-Jen Lin of National Taiwan University. It is easy to use, fast and full-featured, runs on Windows and Unix, and supports a wide range of languages, including Java, Python, R, Matlab and Ruby. This experiment calls LIBSVM version 3.21 from Matlab. LIBSVM itself supports linear, polynomial, RBF and sigmoid kernels; after testing, this experiment uses the Gaussian (RBF) kernel.
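For reference, the four kernel types named above can be written out in plain Python. The parameterisation (gamma, coef0, degree) follows the form documented for LIBSVM; the function names are illustrative, and this sketch is not the Matlab code used in the experiment.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return _dot(u, v)


def polynomial_kernel(u, v, gamma, coef0, degree):
    return (gamma * _dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma):
    # Gaussian kernel: depends only on the squared Euclidean distance.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, gamma, coef0):
    return math.tanh(gamma * _dot(u, v) + coef0)
```

The RBF kernel has a single shape parameter, gamma, which together with the penalty weight C gives a two-dimensional grid to search when tuning.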

Support vector regression (SVR) is a regression algorithm adapted from the basic idea of SVM. Its main idea is to find a hyperplane that maps the samples; unlike other regression algorithms, it counts no loss when the absolute difference between the mapped value and the true value is below a specified range. Fig. 2 illustrates a basic linear SVR regressor.

Assume M samples {(x_i, y_i) | i ∈ {1, 2, ..., M}}, where x_i is an n-dimensional vector, the feature vector of the i-th sample, whose n components are the sample's n features, and y_i is the regression target of the i-th sample, a continuous value. In Fig. 2, H is the regression hyperplane, described by the linear equation:

w^T x + b = 0 (12)

where w = (w₁; w₂; ...; w_n) is the normal vector of H, also n-dimensional, and b is the bias term, which sets the offset of H from the origin. For the i-th sample (x_i, y_i), feeding its feature vector into the regressor gives the model output f(x_i) = w^T x_i + b, which is how the regression is performed.

H1 and H2 are two hyperplanes parallel to H and equidistant from it, with H1: w^T x + b = -ε and H2: w^T x + b = ε. The basic idea of SVR is to tolerate an error of up to ε between the true output y and the model output f(x); loss is counted only when the error exceeds ε. As shown in Fig. 4, predictions that fall between H1 and H2 are therefore treated as correct. At the same time H1 and H2 should be as close together as possible, so formula (13) follows from formula (2) and the conditions above:

where C is the regularization coefficient and l_ε is the loss to be computed, known as the ε-insensitive loss:
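A minimal Python sketch of the ε-insensitive loss follows, assuming the standard definition l_ε(z) = 0 when |z| ≤ ε and |z| - ε otherwise, with z the difference between the model output and the true value. The function name `eps_insensitive_loss` is illustrative.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss is zero inside the eps-wide band and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)
```

For a UPDRS-style target, ε sets how many score points of error are tolerated before a prediction starts to contribute to the training loss.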

As in SVM, slack variables are introduced: ξ_i and a counterpart for the opposite side, so that the degree of relaxation on the two sides can differ. The kernel function κ(x_i, x_j) is then introduced, defined as for SVM. The problem finally becomes:

It is solved exactly as for SVM: Lagrange multipliers are introduced and the dual problem is then solved.

SVR can also be implemented in many ways. For convenience, the LIBSVM toolbox is again called from Matlab; only the parameter that selects the algorithm type needs to be changed. The version used in this experiment offers two types, ε-SVR and ν-SVR; after testing, the more commonly used ε-SVR was selected.

The feature data produced in the previous subsection are matched one-to-one with the UPDRS scores provided by doctors to form the data set. The data set is clustered with k-means and, within each cluster, divided into a training set and a test set at a ratio of 3:1. For each cluster, a classification model is trained on the training set with SVM and a regression model is trained with SVR. The models are optimized by grid search, and the trained model parameters of each cluster are saved.
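The clustering and splitting steps can be sketched in plain Python as follows. This is an illustrative sketch only: the experiment itself uses Matlab with LIBSVM, the SVM/SVR training and grid search are omitted, and the names `kmeans` and `split_3_to_1` as well as the choice to pass the initial centroids explicitly are assumptions.

```python
import random


def kmeans(points, centroids, iters=20):
    """Plain Lloyd iterations from given initial centroids.

    Returns (labels, centroids).
    """
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def split_3_to_1(items, seed=0):
    """Shuffle and split into (training set, test set) at a 3:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = (3 * len(items)) // 4
    return items[:cut], items[cut:]
```

In the full procedure, `split_3_to_1` would be applied separately to the samples of each cluster, and one SVM classifier and one SVR regressor would then be fitted per cluster.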

5. Prediction

For each cluster, the classification and regression models obtained during training are loaded. The data to be assessed are input, and their distance to each cluster is computed to determine the cluster they belong to. The model for that cluster then classifies the data to give a classification result. The result is post-processed: for testers predicted not to have Parkinson's disease, the UPDRS value is set to 0; for testers predicted to have Parkinson's disease, SVR is then used for regression prediction to obtain a computed UPDRS value, from which the patient's severity is inferred. The process then ends.
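The prediction flow can be sketched in Python as below. The per-cluster models are represented by plain callables, a hypothetical stand-in for the trained SVM and SVR models, and the name `predict_updrs` is illustrative.

```python
def predict_updrs(x, centroids, classifiers, regressors):
    """Route x to its nearest cluster, classify, then regress if positive.

    classifiers[k](x) is assumed to return +1 (Parkinson's) or -1;
    regressors[k](x) is assumed to return a UPDRS estimate.
    Returns (cluster index, predicted UPDRS).
    """
    cluster = min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])),
    )
    if classifiers[cluster](x) != 1:
        return cluster, 0.0  # predicted healthy: UPDRS is set to 0
    return cluster, regressors[cluster](x)
```

The regressor is consulted only after a positive classification, so a tester predicted to be healthy always receives a UPDRS of 0 regardless of what the regression model would output.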

6. Feedback of results

The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.

Claims

1. A cluster-based method for detecting Parkinson's disease severity from speech, characterized by comprising the following steps:

S1. Acquisition of the voice signal

A vowel is selected and, using the acquisition equipment, the following are collected: patient number, name, age, gender, whether the patient has been diagnosed with Parkinson's disease, whether the patient has other diseases that cause voice disorders, duration of illness, UPDRS (motor), UPDRS (total), date and time of acquisition, and the ordinal number of the acquisition on that day;

S2. Preprocessing of the voice signal

Preprocessing of the voice signal includes format conversion, sampling-frequency conversion, pre-emphasis, windowing and framing of the speech, removal of unvoiced segments, and fundamental frequency extraction;

S3. Extraction of all speech features

The features include the fundamental frequency feature Pitch, the fundamental frequency perturbation Jitter, the amplitude perturbation Shimmer, signal-to-noise ratio features, and nonlinear features;

S4. Models and calculation

A classification algorithm based on support vector machines is used, with a linearly separable SVM classifier and a nonlinear model in which the SVM builds the model by introducing a kernel function; the feature data obtained in S3 are matched one-to-one with the information provided by doctors to form a data set; the data set is clustered with k-means and, within each cluster, divided into a training set and a test set at a ratio of 3:1; for each cluster, a classification model is trained on the training set with the support vector machine classification algorithm and SVM model, and a regression model is trained with the support vector regression (SVR) method; the models are optimized by grid search, and the trained model parameters of each cluster are saved;

S5. Prediction

For each cluster, the classification and regression models obtained during training are loaded; the data to be assessed are input, their distance to each cluster is computed to determine the cluster they belong to, and the model for that cluster classifies the data to give a classification result; the classification result is then post-processed: for testers predicted not to have Parkinson's disease, the label value is set to 0; for testers predicted to have Parkinson's disease, SVR is then used for regression prediction to obtain a computed label value, from which the patient's severity is inferred, and the process ends;

S6. Feedback of results

2. The cluster-based method for detecting Parkinson's disease severity from speech according to claim 1, characterized in that, in step S3, the fundamental frequency feature Pitch is extracted using the autocorrelation method;

For a deterministic signal, the short-time autocorrelation function is defined as:

Then, for the autocorrelation function of each frame, the maximum peak after its first zero crossing is located; the corresponding index k is the pitch period of that frame of speech, and the reciprocal of k is the fundamental frequency;

3. The cluster-based method for detecting Parkinson's disease severity from speech according to claim 1, characterized in that, in step S3, Jitter denotes the perturbation of the fundamental frequency, i.e. the degree to which the pitch period deviates from strict periodicity; since the audio used is sustained phonation of a long vowel, the fundamental frequency perturbation caused by alternation between vowels and consonants can be excluded;

where n is the frame index, N is the total number of frames, and the two remaining symbols denote the fundamental frequency and the pitch period of the n-th frame, respectively;

(1) Jitter: the relative perturbation of the pitch period, i.e. the ratio of the mean absolute difference between adjacent pitch periods to the mean pitch period; it reflects the subject's overall relative control of vocal fold vibration; its formula is:

(2) Jitter_abs: the absolute perturbation of the pitch period, i.e. the mean absolute difference between adjacent pitch periods; it reflects the subject's overall absolute control of vocal fold vibration; its formula is:

(3) Jitter_PPQ5: the five-point perturbation of the pitch period, i.e. the mean absolute difference between the pitch period of a frame and the average pitch period of its 5 neighbouring frames; it reflects, to some extent, the subject's control of vocal fold vibration over a period of time; its formula is:

(4) Jitter_rap: the three-point perturbation of the pitch period, i.e. the mean absolute difference between the pitch period of a frame and the average pitch period of its 3 neighbouring frames; it reflects, to some extent, the subject's control of vocal fold vibration over a period of time; its formula is:

(5) Jitter_ddp: the difference of the three-point perturbation of the pitch period, i.e. the mean absolute value of the difference between the differences of the pitch periods of 3 adjacent frames; it reflects, to some extent, the variation in the subject's control of vocal fold vibration over a short period of time; its formula is:

4. The cluster-based method for detecting Parkinson's disease severity from speech according to claim 1, characterized in that, in step S3, Shimmer is the perturbation of the voice amplitude, i.e. the degree to which the amplitude deviates from the mean amplitude:

The amplitude measured by Shimmer is defined as:

A_0,n = max(P_n) - min(P_n) (3.1)

s.t. n = 1, 2, ..., N

(1) Shimmer: the amplitude perturbation (as a percentage), i.e. the ratio of the mean absolute difference between adjacent amplitudes to the mean amplitude; it reflects the subject's relative control of voice amplitude; its formula is:

(2) Shimmer_dB: the amplitude perturbation (in decibels), i.e. the mean of the ratio of adjacent amplitudes expressed in decibels (dB); it reflects the subject's relative control of voice amplitude; its formula is:

(3) Shimmer_APQ5: the five-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the mean amplitude of its 5 neighbouring frames; it reflects, to some extent, the subject's control of voice amplitude over a period of time; its formula is:

(4) Shimmer_APQ3: the three-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the mean amplitude of its 3 neighbouring frames; it reflects, to some extent, the subject's control of voice amplitude over a period of time; its formula is:

(5) Shimmer_dda: the difference of the three-point perturbation of the amplitude, i.e. the mean absolute value of the difference between the differences of the amplitudes of 3 adjacent frames; it reflects, to some extent, the variation in the subject's control of voice amplitude over a short period of time; its formula is:

(6) Shimmer_APQ11: the eleven-point perturbation of the amplitude, i.e. the mean absolute difference between the amplitude of a frame and the mean amplitude of its 11 neighbouring frames; it reflects, to some extent, the subject's control of voice amplitude over a longer period of time; its formula is:

5. The cluster-based method for detecting Parkinson's disease severity from speech according to claim 1, characterized in that, in step S3, the nonlinear features are detrended fluctuation analysis (DFA), recurrence period density entropy (RPDE), and pitch period entropy (PPE).