CN110211566A

CN110211566A - A compressed-sensing-based method for classifying speech disorders in hepatolenticular degeneration

Info

Publication number: CN110211566A
Application number: CN201910494055.5A
Authority: CN
Inventors: 马春; 汪庆; 杜炜; 阚红星
Original assignee: Anhui University of Traditional Chinese Medicine AHUTCM
Current assignee: Anhui University of Traditional Chinese Medicine AHUTCM
Priority date: 2019-06-08
Filing date: 2019-06-08
Publication date: 2019-09-06

Abstract

The present invention relates to a compressed-sensing-based method for classifying speech disorders in hepatolenticular degeneration (Wilson's disease, WD), comprising the following steps: proposing a sound preprocessing method for WD patients' speech signals; establishing a database of WD patients' speech, image, video, medical-history, and CT/magnetic-resonance data; obtaining the characteristic attributes of the speech feature parameters of WD patients' speech disorders, and constructing a more standardized and effective dedicated word list on the basis of existing word lists; proposing a feasible speech classification and recognition algorithm to grade disease severity; and developing a compressed-sensing-based classification system for hepatolenticular degeneration speech disorders that meets the requirements of clinical application.

Description

A compressed-sensing-based method for classifying speech disorders in hepatolenticular degeneration

Technical field

The present invention relates to a compressed-sensing-based method for classifying speech disorders in hepatolenticular degeneration.

Background technique

Hepatolenticular degeneration (HLD), also known as Wilson's disease (WD), is an autosomal recessive disorder of copper metabolism that causes degeneration of the basal ganglia and liver dysfunction. It most often appears in adolescence. Mutations in the ATP7B gene at 13q14.3 impair the transmembrane transport of intracellular copper ions, so that copper is deposited in the liver, the basal ganglia region of the brain, the cornea, and other sites^[1]. The lenticular nucleus is particularly affected: the putamen is involved earliest and most markedly, followed by the globus pallidus, caudate nucleus, and cerebral cortex; the subthalamic nucleus, red nucleus, substantia nigra, thalamus, and dentate nucleus can also be involved. Clinical manifestations include progressively worsening extrapyramidal symptoms, corneal K-F rings, cirrhosis, psychiatric symptoms, and impaired kidney function^[2]. The extrapyramidal symptoms are complex and varied, mainly dysarthria, abnormal facial expression, drooling, clumsy movement, unsteady gait, tremor, rigidity, chorea, and athetosis. Dysarthria is one of the main manifestations^[3]; it reduces the patient's speech intelligibility and ability to communicate, and affects quality of life and social functioning^[4]. WD is rare in most European and American countries, with an incidence of (15-30)/1,000,000. Although China has no large-scale epidemiological reports, the incidence is generally believed to be much higher than in Western countries. Early dysarthria is easily overlooked; WD that presents with mild dysarthria in particular is often misdiagnosed and mistreated^[5], so that the best window for treatment is missed. Dysarthria in WD takes many forms, mainly difficulty with or inability to articulate because the tongue, oral cavity, pharynx, and other articulatory organs are affected by hypertonia and/or tremor. Mild cases show slow speech rhythm, dull tone, tremulous voice, disordered phonology, and non-fluent speech, without appreciably affecting communication; moderate cases articulate indistinctly, are hard for others to understand, and can barely communicate in short sentences, so communication is affected; severe cases lose oral communication entirely and cannot communicate with others^[6,7].

WD dysarthria is divided into 6 types according to the site of nervous system damage and the degree of speech impairment: spastic, flaccid, hypokinetic, hyperkinetic, ataxic, and mixed^[6,8,9]. The type and severity of dysarthria are related to the site and severity of the brain damage^[10]. All of the above types occur in WD dysarthria^[6,11,12]; on imaging, the intracranial lesions of patients with dysarthria are concentrated mainly in the lenticular nucleus, midbrain, pons, head of the caudate nucleus, thalamus, and cerebellum^[13]. Clinically, the spastic type speaks slowly and with effort, in a low and monotonous tone, often with dysphagia; the flaccid type speaks with a nasal voice and makes errors in consonants and vowels, often with choking and coughing when eating; the ataxic type articulates inaccurately, with explosive speech, delayed phonemes, pauses, and prolongation, in a scanning (recitation-like) pattern; the hyperkinetic type speaks at an irregular rate, with prolonged vowels, pitch changes, excessive variation in volume, or interruptions of voice; the hypokinetic type has a low, flat, monotonous voice, sometimes with tremor and stuttering, together with poorly coordinated movement and drooling; the mixed type presents partly as spastic and flaccid dysarthria, with the spastic component predominant, and its more common components are the ataxic, hypokinetic, and spastic types^[6,9,14]. Mixed dysarthria is the most common form in WD patients.

Recovery after treatment of WD dysarthria is often slow, and sequelae of varying degree commonly remain, placing psychological and emotional pressure on patients and their families and hindering the patient's comprehensive rehabilitation. The clinical manifestations of WD dysarthria are complex and varied, and its treatment and prognosis differ from those of dysarthria from other causes, so early diagnosis and early treatment should be achieved as far as possible. Whenever dysarthria of unknown cause is encountered clinically, especially with cirrhosis or other extrapyramidal symptoms, or in patients who have had transient liver damage or have a similar family history, WD should be considered, and copper biochemistry (serum copper, ceruloplasmin, copper oxidase), liver function, brain MRI, and corneal K-F ring examinations should be completed to minimize misdiagnosis and missed diagnosis.

Modern medical research on the treatment of speech disorders emphasizes evaluation and speech therapy. For WD patients with speech disorders, greater attention should be paid to treating the speech symptoms while working toward a standardized evaluation of the type and severity of the disorder. The opportunity for early speech rehabilitation should be seized and, according to the type and severity of the disorder, an individualized comprehensive treatment plan adopted that combines traditional Chinese medicine with modern rehabilitation training so that the two complement each other. Patients should also be encouraged to persist with intensive speech rehabilitation over the long term so as to better restore speech function and improve quality of life.

The question studied here is how to combine speech recognition technology with the assessment and rehabilitation of speech disorders, applying artificial intelligence to clinical medicine to achieve computer-assisted, automated assessment and rehabilitation. Such a system can assist physicians in diagnosis and treatment and can also be used by patients for self-assessment, which greatly helps in detecting the condition promptly and tracking its progression. The present invention therefore has high clinical value as well as social and practical significance.

2 Analysis of the state of research in China and abroad:

2.1 Research on WD dysarthria:

As early as 1912, Wilson first described WD in detail; most of the 12 patients he reported had dysarthria, and he considered mild dysarthria one of the earliest neurological signs^[15]. Later scholars such as Fister and Martin made similar reports^[11], and in China Zhu Yunbo et al.^[16] also reported WD with dysarthria as the presenting symptom, but did not describe the dysarthria in detail. Most of the literature describes WD dysarthria as coexisting with movement disorders and behavioral disturbances and regards its incidence as quite high: Liu Wei et al. reported up to 77.4%^[17], similar to the report of Machado^[18] abroad. WD dysarthria still needs further analysis and assessment by rating-scale and instrumental methods, and a more effective speech therapy approach remains to be worked out.

Dysarthria results from central or peripheral nervous system lesions that paralyze the speech muscles or make their movement uncoordinated. Lesions anywhere from the cerebral pathways to the muscles themselves can cause it, and it is a common clinical speech disorder^[14]. Most dysarthria falls within the scope of movement disorders and is therefore also called motor dysarthria. WD dysarthria is mostly of the motor type^[19]. Speech is a fairly complex process, generally accomplished by the coordination of three systems: the respiratory system, formed by the lungs and the muscles and non-muscular tissues around the thorax and abdomen; the phonatory system, formed by the laryngeal cartilages and muscles (including the vocal cords); and the articulatory system^[6]. Cranial nerves IX, X, and XII jointly innervate these vocal organs, and appropriate shaping of the airflow lets the oral cavity produce fluent speech; such a complex process requires precise motor planning and regulation^[10,14]. The occurrence of WD dysarthria is related to basal ganglia dysfunction. The basal ganglia normally keep the indirect and direct pathways in balance and regulate movement appropriately; when their function is disturbed, movement disorders appear, causing muscle rigidity and inflexibility and impaired articulation, with changes in volume, speech rate, and rhythm that make speech unclear and pronunciation erroneous. Some patients cannot produce sound at all because the laryngeal or vocal-cord muscles move in an uncoordinated way. Cerebellar lesions can cause ataxic speech in a scanning (recitation-like) pattern. After thalamic lesions, spontaneous speech is reduced, and volume and clarity decline markedly^[6]. In sum, the varied dysarthria of WD is closely related to its lesion sites in the basal ganglia, cerebellum, and thalamus; damage to these structures directly or indirectly affects the conduction and metabolism of neurotransmitters, producing dystonia of the phonatory muscles and related organs and thus dysarthria^[6,14].

For the evaluation of dysarthria there is currently no unified assessment method in China^[20], and there is no dedicated assessment standard for WD dysarthria. Most practitioners use the Frenchay Dysarthria Assessment^[21] or a modified version of it, or the China Rehabilitation Research Center dysarthria checklist: a clinician or rehabilitation physician examines, scores, and records, and thereby evaluates the degree and type of dysarthria^[6,21,22].

China has many patients with articulation and voice disorders, and the relevant assessment methods rely mainly on subjective auditory perception, so they lack objectivity and stability. In recent years speech recognition technology has been widely applied in many fields, and research on its application to spoken-language education has also produced results. In the field of speech disorder assessment and rehabilitation, however, results based on speech recognition are rare and have not attracted enough attention. Drawing on the status and trends of research in China and abroad on assessing articulation and voice disorders, and integrating results from the application of speech recognition to spoken-language education, this research carried out a pilot study of automatic assessment focused on WD patients with speech disorders. The work can later be extended to the assessment and diagnosis of speech disorders caused by other nervous system diseases (such as Parkinson's disease and motor neuron disease). During the research we also sampled and analyzed the speech of patients whose speech disorders were caused by these diseases, laying a foundation for later typing and automatic judgment.

2.2 Research on methods for assessing speech articulation:

In recent years, with the rapid development of computer technology, signal processing, pattern recognition, and acoustics, speech recognition has been widely applied in industry, the military, transportation, civilian use, and many other areas. In spoken-language education, applied research on speech recognition has also produced results; in China, for example, it is concentrated in foreign language learning, language education for typically developing children, and the Mandarin proficiency test system. For the assessment and rehabilitation education of speech disorders, where speech itself is the main object of study, results based on speech recognition are very few and have not attracted enough attention.

Speech scientists and phoneticians abroad have studied the movement of the articulatory organs mainly by subjective observation and medical imaging. Because it lacks quantitative monitoring of the articulatory system, assessment of articulatory movement by subjective observation can only give a qualitative description of articulatory organs such as the jaw, lips, and tongue, and it demands considerable professional knowledge from the assessor. Research based on medical imaging began earlier abroad: around the year [missing in the source text], several speech science and phonetics research centers were established in Europe in succession. Among them, the German scientist Menzerath, using X-ray imaging and film recording, was among the first to observe closely the movement and relative position changes of individual articulatory organs^[23], and speech scientists of the same period also used medical imaging to work out articulatory features such as consonant voicing and vowel tongue position^[24]. From 1960 to 1980, with the rapid development of electronic computers, speech scientists and phoneticians were no longer satisfied with simple analogies to electrophysiological signals in studying articulatory movement, and various computer-simulated vocal tract models emerged. The best known is the articulatory model of speech production proposed by P. Mermelstein (1973) in the United States^[25], a two-dimensional vocal tract simulation model drawn with straight lines, curves, and angles. The model offers strong guidance for assessing articulatory movement, but because magnetic resonance imaging is currently costly, it is seldom used in actual clinical diagnosis, and related research is also relatively scarce. Materials for assessing articulatory and phonetic function abroad generally use monosyllables, usually presented as phoneme pairs. Whitehill and Ciocca studied the effect of 17 phoneme contrasts on speech intelligibility; the results showed that 6 contrasts, including guttural versus non-guttural, tone contrast, aspirated versus unaspirated, nasal final versus non-nasal final, and long versus short vowel, have an important effect on speech intelligibility^[26].

The methods Chinese researchers use to assess articulatory movement are likewise concentrated on subjective evaluation and medical imaging. Subjective evaluation mainly observes whether the morphology and movement of the patient's articulatory organs are normal; Huang Zhaoming, Lu Hongyun, and others have proposed a relatively complete subjective evaluation scheme for articulatory movement^[27].

Research in China on evaluating articulatory and phonetic function is concentrated mainly on subjective evaluation, and only a few researchers distinguish the concepts of articulation and phonology. Huang Zhaoming et al. proposed the "Chinese articulation ability test word list"^[28], which contains 50 words. By evaluating the subject's articulation of the 50 words, a speech rehabilitation therapist can comprehensively assess the subject's ability to articulate 21 initials and 4 tones, and can assess the subject's phonemic ability through 18 phoneme contrasts and 37 minimal pairs. Chen Sanding et al. evaluated the Mandarin initials, finals, and tones of 50 deaf children, revealed the developmental pattern of articulation in Mandarin-speaking deaf children, and further proposed the speech rehabilitation education principles of "early, sequential, error-tolerant, and consolidated"^[29]. Dr. Zhang Jing of East China Normal University studied the main error tendencies of hearing-impaired children in consonant articulation, analyzed their causes, and accordingly proposed a treatment framework for consonant phonemes in hearing-impaired children^[30].

2.3 Applied research on speech recognition:

The concept of "speech recognition" was born in the 1950s, when AT&T's laboratories in the United States first completed recognition experiments on monosyllables and isolated words^[31]. The core idea was to build a model from the extracted feature parameters of the speech signal and to recognize by model matching. The continued development and enrichment of speech signal processing drove major technical innovation in speech recognition: feature extraction and matching algorithms such as linear predictive coding (LPC) and dynamic time warping (DTW) set off the first wave of speech recognition research. LPC effectively solved the problem of extracting speech features, and DTW effectively solved the problem of uneven speaking rate in isolated word recognition, giving excellent results for speaker-dependent recognition. In recent years speech recognition has received attention and wide use in many fields abroad, and the core algorithms in particular have matured. Many companies have released SDKs for developers, the better known being IBM's ViaVoice^[32] and the Microsoft Speech SDK^[33]. These products support many of the world's languages and have promoted the wide use of speech recognition in homes, consumer electronics, services, education, and medicine worldwide.

Research in China on speech recognition for Mandarin has already produced rich results, and the technical level has kept close pace with developments abroad. The national "863" program has approved several research projects on large-vocabulary speech recognition. One of them, "an improved hidden Markov model for speech recognition", proposed an inhomogeneous hidden Markov model based on duration distribution, an important improvement to the HMM used in speech recognition. The THEESP speech recognition system developed on this basis achieves a recognition rate of up to 98.7%^[34], which is also the best level in China in this field at present. To date, research in China on applying speech recognition to the assessment and training of speech function in people with speech disorders is quite scarce. At the same time, the quality and number of speech rehabilitation therapists in China fall far short of the rehabilitation needs of patients with speech disorders, whose demand for automated speech assessment and rehabilitation equipment is urgent. From the development and status of speech recognition technology and of the assessment and training of articulation and voice disorders, it is clear that the theories, methods, and techniques of the two fields are each largely mature, yet they have not been combined to serve the many people who need them. This problem calls for an emerging interdisciplinary field to solve. Studying how to combine speech recognition with speech disorder assessment and rehabilitation, so as to achieve computer-assisted automated assessment and rehabilitation, has considerable social and practical significance.

Main references:

[1] Lutsenko S, Barnes NL, Bartee MY, et al. Function and regulation of human copper-transporting ATPases [J]. Physiol Rev, 2007, 87(3): 1011-1046.

[2] Ala A, Walker AP, Ashkan K, et al. Wilson's disease [J]. Lancet, 2007, 369(9559): 397-408.

[3] Wilson SAK. Progressive lenticular degeneration: a familial nervous disease associated with cirrhosis of the liver [J]. Brain, 1912, 34: 295-509.

[4] Yunusova Y, Weismer G, Kent RD, et al. Breath-group intelligibility in dysarthria: characteristics and underlying correlates [J]. J Speech Lang Hear Res, 2005, 48(6): 12.

[5] Hu Jiyuan, Lv Daping, Wang Gongqiang, et al. Clinical misdiagnosis of hepatolenticular degeneration [J]. Chinese Medical Journal, 2001, 81(11): 642-644.

[6] Wang Xiaoyang, Bao Yuancheng. Research on dysarthria in patients with hepatolenticular degeneration [J]. Clinical Journal of Traditional Chinese Medicine, 2012, 24(3): 202-204.

[7] Brabo NC, Cera ML, Barreto SS, et al. Dysarthria in Wilson's disease: analysis of two cases in different stages [J]. Rev CEFAC [online], 2010, 12(3): 509-515.

[8] Wang YT, Kent RD, Duffy JR, et al. Dysarthria associated with traumatic brain injury: speaking rate and emphatic stress [J]. J Commun Disord, 2005, 38(3): 231-260.

[9] Li Shengli. Evaluation and rehabilitation of dysarthria [J]. Chinese Scientific Journal of Hearing and Speech Rehabilitation, 2009, 32(1): 8-12.

[10] Kent RD, Vorperian HK, Kent JF, et al. Voice dysfunction in dysarthria: application of the Multi-Dimensional Voice Program [J]. J Commun Disord, 2003, 36(4): 280-306.

[11] Berry WR, Darley FL, Aronson AE. Dysarthria in Wilson's disease [J]. J Speech Hearing Res, 1974, 17(2): 169-183.

[12] He Weijia, Li Shengli. Progress in objective acoustic-level evaluation of dysarthric speech [J]. Chinese Journal of Rehabilitation Theory and Practice, 2010, 16(2): 118-120.

[13] Yu Xuen, Yang Renmin. Brain MR imaging in 132 cases of hepatolenticular degeneration [J]. Journal of Apoplexy and Nervous Diseases, 2007, 24(1): 30-33.

[14] Li Shengli. Speech Therapy [M]. Beijing: Huaxia Press, 2003: 77-85.

[15]Wilson SAK.Progressive lenticular degeneration:a familial nervous disease associated with cirrhosis of the liver[J].Brain,1912,34:295-509.

[16] Zhu Yunbo, Song Lei, Tian Zhu, et al. Analysis of 2 cases of hepatolenticular degeneration with dysarthria as the presenting symptom [J]. Chinese Journal of Chinese and Western Medicine, 2010, 8(3): 28-29.

[17] Liu Wei, Shi Youkun, Li Jing. Assessment of dysfunction in 35 cases of Wilson disease [J]. Modern Rehabilitation, 2000, 4(9): 1334-1335.

[18] Machado A, Chien HF, Deguti MM, et al. Neurological manifestations in Wilson's disease: report of 119 cases [J]. Mov Disord, 2006, 21: 2192-2196.

[19] Bi Caiqin, Sun Junqi, Zhao Hong, et al. Rehabilitation nursing for dysarthria in patients with hepatolenticular degeneration [J]. Chinese Journal of Modern Nursing, 2011, 17(22): 2609-2613.

[20] Li Huan. A review of research on dysarthria evaluation [J]. Chinese Journal of Special Education, 2010, 6: 59-62.

[21] Enderby PM. Frenchay Dysarthria Assessment [M]. San Diego: California College-Hill Press, 1983: 34-53.

[22] Mao Jianming. Clinical practice in dysarthria [J]. Massage and Rehabilitation Medicine, 2010, 36: 54-55.

[23] Altmann G. Prolegomena to Menzerath's Law [J]. Glottometrika, 1980, 2: 1-10.

[24] Wu Zongji, Lin Maocan. Outline of Experimental Phonetics [M]. Beijing: Higher Education Press, 1989: 153-190.

[25] Mermelstein P. Articulatory model for the study of speech production [J]. Journal of the Acoustical Society of America, 1973, 61(2): 581-587.

[26] Whitehill TL, Ciocca V. Speech errors in Cantonese speaking adults with cerebral palsy [J]. Clinical Linguistics & Phonetics, 2000, 14: 111-130.

[27] Lu Hongyun, Huang Zhaoming, Zhou Hong. Configuration standard for speech rehabilitation instruments and equipment in provincial special-education schools [J]. Modern Special Education, 2010, (6): 31-34.

[28] Huang Zhaoming, Du Xiaoxin. Speech Therapy [M]. Shanghai: East China Normal University Press, 2014.

[29] Chen Sanding, Xu Changhong, Li Yuming. Patterns of phonetic speech development in Han Chinese deaf children and rehabilitation strategies [J]. Chinese Journal of Rehabilitation, 1996, 11(2): 53-54.

[30] Zhang Lei. Analysis and treatment strategies for abnormal initial-consonant articulation in hearing-impaired children [Master's thesis]. Shanghai: East China Normal University, 2009.

[31] Davis KH, Biddulph R, Balashek S. Automatic recognition of spoken digits [J]. Journal of the Acoustical Society of America, 1952, 24(6): 637-642.

[32] International Business Machines. IBM Desktop ViaVoice. 2003, http://www-01.ibm.com/software/pervasive/viavoice.html

[33] Microsoft. Microsoft Speech Platform - Software Development Kit. 2010, http://www.microsoft.com/en-us/download/details.aspx?id=14373

[34] Zhan Puming, Wang Zuoying, Lu Dajin. An improved hidden Markov model for speech recognition [J]. Acta Electronica Sinica, 1994, 22(1): 9-15.

Summary of the invention

The present invention provides a compressed-sensing-based method for classifying speech disorders in hepatolenticular degeneration. The technical problem it solves is how to combine speech recognition technology with the assessment and rehabilitation of speech disorders, applying artificial intelligence to clinical medicine to achieve computer-assisted, automated assessment and rehabilitation. The result can assist physicians in diagnosis and treatment and can also be used by patients for self-assessment, which greatly helps in detecting the condition promptly and tracking its progression.

To solve the above technical problem, the present invention adopts the following scheme:

A compressed-sensing-based method for classifying speech disorders in hepatolenticular degeneration, comprising the following steps:

proposing a sound preprocessing method for WD patients' speech signals;

establishing a database of WD patients' speech, image, video, medical-history, and CT/magnetic-resonance data;

obtaining the characteristic attributes of the speech feature parameters of WD patients' speech disorders, and constructing a more standardized and effective dedicated word list on the basis of existing word lists;

proposing a feasible speech classification and recognition algorithm to grade disease severity;

developing a compressed-sensing-based classification system for hepatolenticular degeneration speech disorders that meets the requirements of clinical application.

Further, the preprocessing method for WD patients' speech signals comprises the following steps: step 1, speech input and filtering; step 2, compressed sensing of the speech signal; step 3, endpoint detection; step 4, pre-emphasis; step 5, windowing and framing.
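The compressed-sensing step (step 2) can be illustrated with a minimal sketch of the measurement stage only. It assumes a random Gaussian measurement matrix and the usual linear model y = Φx; the source does not specify the measurement matrix, the compression ratio, or the reconstruction algorithm, so the function names, the sizes m = 8 and n = 32, and the seed are hypothetical, and recovery of the frame from the measurements is not shown.

```python
import math
import random


def gaussian_measurement_matrix(m, n, seed=0):
    """Return an m x n random Gaussian matrix scaled by 1/sqrt(m).

    Assumption: the source does not name a measurement matrix; a Gaussian
    matrix is a common textbook choice for compressed sensing.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def compress_frame(frame, phi):
    """Compute the measurements y = phi * frame (m values from n samples)."""
    if any(len(row) != len(frame) for row in phi):
        raise ValueError("matrix width must equal frame length")
    return [sum(p * x for p, x in zip(row, frame)) for row in phi]


# Hypothetical example: 32 samples of a sinusoid reduced to 8 measurements.
phi = gaussian_measurement_matrix(m=8, n=32)
frame = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
measurements = compress_frame(frame, phi)
```

Only the 8 measurements would need to be transmitted or stored; reconstructing the frame requires a sparse-recovery method chosen separately.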

Further, in step 1 a band-pass filter is applied to the speech signal, with upper and lower cutoff frequencies f_H = 20 kHz and f_L = 60 Hz, respectively.
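A minimal sketch of such a band-pass stage, assuming a cascade of a first-order high-pass at 60 Hz and a first-order low-pass at 20 kHz. The filter type and order are not given in the source, and the 48 kHz sampling rate and function names are hypothetical; a practical system would likely use a steeper filter.

```python
import math


def high_pass(signal, cutoff_hz, fs):
    """First-order RC high-pass: y[i] = a * (y[i-1] + x[i] - x[i-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = rc / (rc + 1.0 / fs)
    out = []
    for i, x in enumerate(signal):
        out.append(x if i == 0 else a * (out[-1] + x - signal[i - 1]))
    return out


def low_pass(signal, cutoff_hz, fs):
    """First-order RC low-pass: y[i] = y[i-1] + b * (x[i] - y[i-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    b = (1.0 / fs) / (rc + 1.0 / fs)
    out, y = [], 0.0
    for x in signal:
        y += b * (x - y)
        out.append(y)
    return out


def band_pass(signal, fs, f_low=60.0, f_high=20000.0):
    """Cascade the two stages; fs must exceed twice the upper cutoff."""
    if fs <= 2 * f_high:
        raise ValueError("sampling rate too low for the upper cutoff")
    return low_pass(high_pass(signal, f_low, fs), f_high, fs)


FS = 48000  # hypothetical sampling rate in Hz
tone = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(4800)]
filtered_tone = band_pass(tone, FS)
filtered_dc = band_pass([1.0] * 9600, FS)
```

A 1 kHz tone inside the pass band keeps nearly all of its amplitude, while a constant (0 Hz) offset decays toward zero.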

Further, in step 3, endpoint detection of valid speech is achieved by combining the autocorrelation maximum of the speech signal with the threshold-crossing rate. Assume the time-domain expression of the windowed speech signal is x(m), the n-th frame is x_n(m), and the frame length is N; the short-time autocorrelation function of the speech signal is then:

$$R_n(k) = \sum_{m=0}^{N-1-k} x_n(m)\, x_n(m+k)$$

where 0 ≤ k ≤ K, and K is the maximum lag in samples.
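A minimal sketch of the short-time autocorrelation used in step 3, assuming the standard definition in which each frame sample x_n(m) is multiplied by the sample k positions later and the products are summed over the frame. The function names are hypothetical, and the normalized-peak helper is only one illustrative way to summarize an "autocorrelation maximum"; the source does not state how that maximum is combined with the threshold-crossing rate.

```python
def short_time_autocorr(frame, max_lag):
    """Return [R(0), ..., R(max_lag)] with R(k) = sum_m x(m) * x(m + k)."""
    n = len(frame)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the frame length")
    return [
        sum(frame[m] * frame[m + k] for m in range(n - k))
        for k in range(max_lag + 1)
    ]


def normalized_peak(frame, min_lag, max_lag):
    """Largest R(k) / R(0) for min_lag <= k <= max_lag (0.0 for silence).

    Assumption: a high value suggests periodic (voiced) speech; the decision
    threshold applied to it would have to be tuned on real recordings.
    """
    r = short_time_autocorr(frame, max_lag)
    if r[0] == 0:
        return 0.0
    return max(r[k] / r[0] for k in range(min_lag, max_lag + 1))


# A frame with period 2 correlates strongly with itself at lag 2.
periodic_frame = [1, -1] * 8
peak = normalized_peak(periodic_frame, min_lag=1, max_lag=4)
```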

Further, the purpose of step 4 is to boost the energy of the high-frequency components and effectively raise the signal-to-noise ratio, so that a uniform signal-to-noise ratio can be used when performing spectral analysis and computing vocal tract parameters, which reduces the computational difficulty. A first-order digital filter is used as a pre-emphasis filter that raises the high-frequency components of the signal at 6 dB/octave:

$$H(z) = 1 - \mu z^{-1} \qquad (3)$$

The system function H(z) is a polynomial in z^{-1} whose pole lies at the origin of the Z-plane, and μ is the pre-emphasis coefficient.
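In the time domain, H(z) = 1 - μz^{-1} corresponds to the difference equation y(n) = x(n) - μ·x(n-1). A minimal sketch follows; the default μ = 0.97 is an assumption (a commonly used value), since the source does not state the coefficient, and the function name is hypothetical.

```python
def pre_emphasis(signal, mu=0.97):
    """Apply y[n] = x[n] - mu * x[n-1]; the first sample is passed through.

    Assumption: mu = 0.97 is a typical choice, not a value from the source.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - mu * signal[n - 1])
    return out


# A constant signal is attenuated after the first sample.
flattened = pre_emphasis([1.0, 1.0, 1.0], mu=0.5)
```

Slowly varying (low-frequency) content is largely cancelled by the subtraction, while rapid sample-to-sample changes pass through, which is what produces the high-frequency boost.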

Further, in step 5 the window function, denoted ω(n), is the tool used to extract "speech frames" from continuous speech. The property of a window function is that it sets to zero all speech outside the region to be processed, which is how the speech signal is "framed". There are two framing methods, contiguous framing and overlapping framing; overlapping framing is the one generally used. The windowed speech signal is expressed as:

$$s_\omega(n) = s(n) \cdot \omega(n) \qquad (4)$$

where ω(n) is the window function and s(n) is the original speech signal.
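A minimal sketch of overlapping framing with a window applied to each frame, following s_ω(n) = s(n)·ω(n). The Hamming window, the frame length, and the hop size are assumptions; the source does not name a specific window or overlap, and the function names are hypothetical.

```python
import math


def hamming(length):
    """Hamming window: 0.54 - 0.46 * cos(2 * pi * n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def frame_signal(signal, frame_len, hop):
    """Split the signal into windowed frames; hop < frame_len gives overlap.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames


# Hypothetical sizes: 4-sample frames advancing by 2 samples (50% overlap).
frames = frame_signal([1.0] * 10, frame_len=4, hop=2)
```

Setting hop equal to frame_len gives contiguous framing instead of overlapping framing.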

Further, the speech feature parameters of WD patients' speech disorders are short-time energy, Mel-frequency cepstral coefficients (MFCC), and nonlinear feature parameters of WD patients' speech.
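Two of the simpler ingredients behind these features can be sketched as follows: the short-time energy of one frame, and the Hz-to-Mel mapping on which MFCC filter banks are built. The Mel formula shown is the widely used 2595·log10(1 + f/700) variant, which is an assumption since the source does not specify one; the full MFCC pipeline (FFT, Mel filter bank, logarithm, DCT) and the nonlinear features are not shown, and the function names are hypothetical.

```python
import math


def short_time_energy(frame):
    """Sum of squared samples within one frame."""
    return sum(x * x for x in frame)


def hz_to_mel(freq_hz):
    """Map frequency in Hz to the Mel scale (2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


energy = short_time_energy([1, 2, 3])
mel_700 = hz_to_mel(700.0)
```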

An automatic assessment system built on the classification method comprises an automatic dysarthria assessment module, an automatic voice disorder assessment module, an automatic disease-classification recognition module, and an automatic disease-grade assessment module. The automatic dysarthria assessment module judges the reading of characters, words, and tones: the subject pronounces entries randomly drawn from the word list; the subject's sex and the disorder type and level must be selected beforehand; after the test the specific disorder type is given and a test report is generated. The automatic voice disorder assessment module includes "speech repetition" and "speech switching" assessments: the subject pronounces entries and sentences randomly drawn from the word list; the result can be combined with subjective judgment and recorded in a table, and a detailed record of the results is available after the test. The automatic disease-classification recognition module produces a test report from an overall evaluation of selected initials and finals and of the words, phrases, and sentences in the word list, and can identify the cause of the dysarthria. The automatic disease-grade assessment module mainly grades hepatolenticular degeneration, evaluating the disease grade by examining the subject's pronunciation of words in the system word list.

Collecting speech from patients whose disfluency is caused by different conditions, and building a speech database organized by category;

Drawing up an experimental vocabulary with words as the test objects;

Classifying the patient's articulation function with equipment such as a computer in three respects: content, tone, and disorder type;

On the basis of the Speech SDK recognition engine, combined with a self-built recognition engine, developing a feasibility analysis system for dysarthria illness classification;

Given the small sample size in the early stage, using a speech recognition algorithm based on the hidden Markov model to achieve the best recognition process in the ensemble-average sense;

Extracting speech spectrum characteristic parameters for different sexes and age groups, discussing how pitch period, frequency, short-time spectrum, short-time energy, and short-time average magnitude differ between conditions, and then analysing the differences between the disfluencies of various nervous system diseases through indices such as the fundamental frequency, frequency perturbation quotient, and amplitude perturbation quotient (shimmer) of the voice samples;

Storing the disfluency patient's test information and, after a period of treatment, testing again and comparing the results to analyse how the patient's condition has developed.
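The hidden Markov model step above can be illustrated with a minimal sketch of isolated-word recognition: each vocabulary entry has its own model, and the entry whose model gives the observation sequence the highest likelihood is chosen. The sketch assumes discrete observation symbols (for example vector-quantized feature frames), which the text does not specify; the names `forward_likelihood` and `recognize` are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    for symbol in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in states) * emit[s][symbol]
                 for s in states]
    return sum(alpha)


def recognize(obs, word_models):
    """Return the word whose model (start, trans, emit) best explains obs."""
    return max(word_models, key=lambda w: forward_likelihood(obs, *word_models[w]))
```

For long utterances a practical implementation would scale the forward variables or work in the log domain to avoid numerical underflow; that refinement is left out here for brevity.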

The compressed-sensing-based classification method for hepatolenticular degeneration disfluency has the following advantages:

(1) The invention proposes an improved compressed-sensing sampling of speech signals, which provides strong hardware support and data compression technology for sampling data from remote patients. This is of great practical significance for collecting and testing data remotely, lays the foundation for gathering the regional sample data needed to build the database later, and provides strong technical support for making the automatic identification system available to all patients.

(2) The invention studies the traditional vocabularies used for disfluency, applies mathematical tools to carry out a hierarchical analysis of the speech signals that characterize WD patients' disfluency, extracts the characteristic attributes, and establishes a dedicated vocabulary suitable for WD patients.

(3) The invention builds a WD database and proposes that the patient's various clinical features be classified and discussed on the basis of data. Applied in the automatic identification system, this has important practical significance and provides an accurate data basis.

Brief description of the drawings

Fig. 1: the block diagram of the extraction process of Mel cepstrum coefficient in the present invention.

Fig. 2: speech processes schematic diagram in the present invention.

Fig. 3: experimental comparison figure of the present invention.

Specific embodiment

Below with reference to Fig. 1 to Fig. 3, the present invention will be further described:

Given that the evaluation of dysarthria still lacks quantitative research, and that research on dysarthria in WD is scarcer still, the invention proposes first sampling the speech signal with compressed sensing and then classifying the condition of WD patients with several classification methods such as random forests and SVM classifiers, and it studies the key technologies needed to realise this.

The invention mainly comprises four big functional modules:

The automatic dysarthria evaluation module, the disfluency illness classification module, the automatic illness classification module, and the illness grade classification module.

The automatic dysarthria evaluation module judges the reading of characters, the reading of words, and tone. The system randomly selects entries from the vocabulary for the subject to pronounce, with attributes such as gender and illness type selected beforehand. After evaluation the system reports the specific disorder type and generates a test report.

The disfluency illness classification module comprises "voice repetition" and "voice switching" evaluation modules. The system randomly selects entries and sentences from the vocabulary for the subject to pronounce. The result of this module can be recorded in a table in combination with subjective judgment, and a detailed record of the results is also available after evaluation.

The automatic illness classification module produces a test report from a comprehensive evaluation of the initials, finals, words, phrases, and sentences selected from the vocabulary, and can identify the cause of the dysarthria.

The illness grade classification module mainly grades hepatolenticular degeneration, evaluating the illness grade by examining how the subject pronounces the words in the system vocabulary.

1.2 Objectives of the invention:

The invention addresses the pressing need for clinical diagnosis and condition analysis of WD patients. Its final goal is to provide accurate basic parameters for the degree of brain injury and the illness grade of WD patients through quantitative analysis and evaluation of the patient's disfluency. Research is carried out around the key scientific problems of compressed-sensing sampling of speech signals and of classification and recognition algorithms for small-sample data. The specific research objectives are as follows:

(1) Propose a sound compressed sensing method for speech signals.

(2) Establish a database of WD patient data such as speech, images, video, medical history, and CT and magnetic resonance scans.

(3) Obtain the characteristic attributes of the speech parameters for WD patients' disfluency, and build a more standardized and effective dedicated vocabulary on the basis of existing vocabularies.

(4) Propose a feasible speech classification and recognition algorithm that divides the illness into grades.

(5) Develop a WD disfluency illness classification method that meets the requirements of practical clinical application.

1.3.1 Preprocessing of the speech signal:

The speech preprocessing steps of the invention include filtering, compression sampling, conversion, pre-emphasis, windowing, and framing.

(1) Input and filtering of sound

The patient's speech is sampled with a high-quality unidirectional microphone and used as the input signal of the invention. Filtering suppresses the components of the input signal whose frequency exceeds half the sampling frequency, to prevent aliasing; it also suppresses interference from the 50 Hz mains frequency. Because high-quality speech is usually sampled at 44100 Hz, frequency components above 22050 Hz must be suppressed. The filtering step therefore amounts to applying a band-pass filter to the speech signal; in the invention, the upper and lower cutoff frequencies of this band-pass filter are f_H = 20 kHz and f_L = 60 Hz respectively.
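As a rough illustration of the band-pass step, the sketch below designs a windowed-sinc FIR filter with the cutoffs named above (f_L = 60 Hz, f_H = 20 kHz) at a 44100 Hz sampling rate. The filter type, the tap count, and the helper names `bandpass_fir` and `gain` are assumptions made for illustration; the text does not say how the filter is realised.

```python
import math


def bandpass_fir(f_low, f_high, fs, num_taps):
    """Windowed-sinc band-pass FIR: ideal low-pass(f_high) minus ideal low-pass(f_low),
    tapered with a Hamming window. num_taps must be odd so the filter is symmetric."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * (f_high - f_low) / fs
        else:
            ideal = (math.sin(2.0 * math.pi * f_high * k / fs)
                     - math.sin(2.0 * math.pi * f_low * k / fs)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def gain(taps, freq, fs):
    """Magnitude of the filter's frequency response at freq (Hz)."""
    re = sum(h * math.cos(2.0 * math.pi * freq * n / fs) for n, h in enumerate(taps))
    im = sum(h * math.sin(2.0 * math.pi * freq * n / fs) for n, h in enumerate(taps))
    return math.hypot(re, im)


def apply_fir(signal, taps):
    """Direct-form convolution, truncated to the length of the input."""
    return [sum(h * signal[n - i] for i, h in enumerate(taps) if n - i >= 0)
            for n in range(len(signal))]
```

A 60 Hz lower cutoff leaves only 10 Hz of margin above the 50 Hz mains component, so a practical design would need a very long filter or a dedicated notch to attenuate it strongly; the sketch does not attempt this.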

(2) Compressed sensing of the speech signal

Because speech signals are time-varying, compression and sensing must be applied to short-time segments of the signal.

Speech analysis usually means either linear prediction analysis of a sinusoidal model of the speech signal, or cepstral analysis of the formant and frequency information obtained by decomposing the signal x. Many methods of sparsifying speech signals are currently studied: methods based on redundant dictionaries, methods based on the approximate KLT, the discrete cosine transform (DCT), and sparse representations based on wavelet bases. For the sparse representation of the form x = ψα used in compressed sensing theory, the speech signal can be substituted with a DCT of the form of formula (1):

x = C^-1·θ₁    (1)

Here C is a real transformation matrix and θ₁ is the vector of DCT coefficients.

In applications of compressed sensing (CS), random matrices are widely used as the sampling basis Φ. This is because CS requires a high degree of incoherence, and a random matrix is highly incoherent with any basis. White Gaussian or uniform noise provides a good sampling basis for CS, so an independent and identically distributed (IID) random matrix is normally used, including random Gaussian matrices and Toeplitz and circulant matrices.

Signal reconstruction has a central place in CS theory: the signal can be reconstructed, or recovered, from a small set of measurements. With an optimization algorithm, the signal can be reconstructed at the receiving end without loss of information. Methods for solving the CS reconstruction problem generally fall into two classes: one recovers the data by linear programming (for example basis pursuit, BP); the other uses second-order greedy algorithms (for example orthogonal matching pursuit, OMP).
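The following is a minimal sketch of that pipeline, under stated assumptions: a frame that is exactly sparse in the DCT domain, an IID Gaussian sampling matrix Φ, and reconstruction by orthogonal matching pursuit. The matrix sizes, the known sparsity level, and all function names are illustrative choices, not values given in the text; real speech frames are only approximately sparse.

```python
import math
import random


def dct_synthesis_matrix(n):
    """Orthonormal inverse-DCT matrix Psi (the role of C^-1), so that x = Psi * theta."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (t + 0.5) * k / n) for k in range(n)]
            for t in range(n)]


def matvec(a, v):
    return [sum(p * q for p, q in zip(row, v)) for row in a]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def solve(a, b):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))) / aug[i][i]
    return z


def omp(a, y, sparsity):
    """Orthogonal matching pursuit: pick the column most correlated with the
    residual, then refit all chosen columns to y by least squares."""
    rows, cols = len(a), len(a[0])
    columns = list(zip(*a))
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    support, coef, residual = [], [], list(y)
    for _ in range(sparsity):
        scores = [-1.0 if j in support else
                  abs(sum(c * r for c, r in zip(columns[j], residual))) / norms[j]
                  for j in range(cols)]
        support.append(max(range(cols), key=scores.__getitem__))
        chosen = [columns[j] for j in support]
        gram = [[sum(p * q for p, q in zip(u, v)) for v in chosen] for u in chosen]
        rhs = [sum(p * q for p, q in zip(u, y)) for u in chosen]
        coef = solve(gram, rhs)  # normal equations on the current support
        residual = [y[i] - sum(c * col[i] for c, col in zip(coef, chosen))
                    for i in range(rows)]
    theta = [0.0] * cols
    for j, c in zip(support, coef):
        theta[j] = c
    return theta


def compress_and_recover(theta, num_measurements, sparsity, seed=0):
    """Build x from sparse DCT coefficients, measure y = Phi * x, and reconstruct."""
    rng = random.Random(seed)
    n = len(theta)
    psi = dct_synthesis_matrix(n)
    x = matvec(psi, theta)  # x = C^-1 * theta_1, as in formula (1)
    phi = [[rng.gauss(0.0, 1.0) / math.sqrt(num_measurements) for _ in range(n)]
           for _ in range(num_measurements)]  # IID Gaussian sampling matrix
    y = matvec(phi, x)  # compressed measurements
    theta_hat = omp(matmul(phi, psi), y, sparsity)
    return x, matvec(psi, theta_hat)
```

Basis pursuit, the linear-programming alternative mentioned above, would replace `omp` with an l1-minimization solver; it is not shown here.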

(3) Endpoint detection

Choosing a suitable endpoint detection method shortens signal processing time and also removes the noise mixed into the non-speech sections of the signal, which improves the recognition performance of the system. Two or more characteristic parameters generally need to be combined to make the decision. The invention uses an improved approach that combines the autocorrelation maximum of the speech signal with its threshold-crossing rate to detect the endpoints of valid speech. Suppose the time-domain expression of the windowed speech signal is x(m), the n-th frame of the signal is x_n(m), and the frame length is N. The short-time autocorrelation function of the speech signal is then:

R_n(k) = Σ_{m=0}^{N-1-k} x_n(m)·x_n(m+k)    (2)

where 0 ≤ k ≤ K and K is the maximum number of delay points.
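A minimal sketch of the autocorrelation part of this detector follows. It assumes the usual definition of the short-time autocorrelation, R_n(k) as the sum over m of x_n(m)·x_n(m+k), and flags a frame as speech when the largest normalized autocorrelation value at a non-trivial lag exceeds a threshold. The threshold values and the function names are assumptions, and the combination with the threshold-crossing rate is not shown.

```python
def short_time_autocorr(frame, max_lag):
    """R_n(k) = sum over m of x_n(m) * x_n(m + k), for k = 0 .. max_lag."""
    n = len(frame)
    return [sum(frame[m] * frame[m + k] for m in range(n - k))
            for k in range(max_lag + 1)]


def is_speech_frame(frame, max_lag, min_lag=2, peak_threshold=0.3, energy_floor=1e-6):
    """Decide speech vs. non-speech from the normalized autocorrelation peak.

    Frames with negligible energy R_n(0) are rejected outright; otherwise the
    frame is accepted when max R_n(k) / R_n(0) over min_lag <= k <= max_lag
    reaches peak_threshold (periodic, voiced-like frames score close to 1).
    """
    r = short_time_autocorr(frame, max_lag)
    if r[0] <= energy_floor:
        return False
    return max(r[min_lag:]) / r[0] >= peak_threshold
```

Endpoints can then be taken as the first and last frames of a run of frames for which `is_speech_frame` returns True.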

(4) Pre-emphasis

The purpose of pre-emphasis is to boost the energy of the high-frequency components and so improve the signal-to-noise ratio, so that a uniform signal-to-noise ratio can be used when performing spectrum analysis and computing vocal tract parameters, which reduces the computational difficulty. A first-order digital filter is used as the pre-emphasis filter, raising the high-frequency components of the signal by 6 dB/octave:

H(z) = 1 - μz^-1    (3)

where μ is the pre-emphasis coefficient.
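In the time domain, formula (3) corresponds to the difference equation y(n) = x(n) - μ·x(n-1). The sketch below applies it sample by sample; the default μ = 0.97 is an assumed, commonly used value, since the text does not state the coefficient.

```python
def pre_emphasis(signal, mu=0.97):
    """Apply y(n) = x(n) - mu * x(n-1), the time-domain form of H(z) = 1 - mu * z^-1.

    The first sample is passed through unchanged because it has no predecessor.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - mu * signal[n - 1])
    return out
```

A constant (zero-frequency) input is attenuated to 1 - μ of its value, while an input that alternates in sign at the highest representable frequency is amplified to 1 + μ, which is the high-frequency boost the step is meant to provide.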

(5) Windowing and framing

The window function, denoted ω(n), is the tool used to extract "speech frames" from continuous speech. Its defining property is that it sets to zero every speech sample outside the region to be processed, which is what "framing" the speech signal means. There are two framing methods, contiguous framing and overlapping framing; overlapping framing is the one generally used. The windowed speech signal is expressed as:

s_ω(n) = s(n)·ω(n)    (4)

The invention uses a Hamming window to frame the speech signal with overlap. The window starts at the part of the speech signal to be analysed and slides successively along the time axis.

A Hamming window of frame length N has the form:

ω(n) = 0.54 - 0.46·cos(2πn/(N-1)) for 0 ≤ n ≤ N-1, and ω(n) = 0 otherwise    (5)
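The overlapping framing with a Hamming window described above can be sketched as follows. The sketch assumes the standard Hamming definition, ω(n) = 0.54 - 0.46·cos(2πn/(N-1)); the frame length, the hop size, and the function names are illustrative choices that the text does not specify.

```python
import math


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46 * cos(2*pi*i / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def overlapping_frames(signal, frame_len, hop):
    """Slide a Hamming window along the signal and return the windowed frames.

    Consecutive frames overlap by frame_len - hop samples; a trailing segment
    shorter than frame_len is dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames
```

With `hop` equal to `frame_len` the same function produces contiguous (non-overlapping) frames, the other framing method mentioned above.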

1.3.2 Selection of characteristic parameters:

Commonly used recognition parameters fall into three types: time domain, frequency domain, and transform domain, each containing several parameters. The characteristic parameters commonly used for recognition are listed in Table 1:

Table 1 Common characteristic parameters

(1) Short-time energy:

Short-time energy is one of the most important features for judging the type of a sound. The invention first selects the short-time logarithmic energy as one of the characteristic parameters for speech recognition:
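A minimal sketch of the short-time logarithmic energy of one frame follows. It assumes the usual definition, the logarithm of the sum of squared samples in the frame; the choice of the natural logarithm and the small floor added to avoid log(0) on silent frames are assumptions.

```python
import math


def short_time_log_energy(frame, floor=1e-12):
    """Natural log of the frame energy sum(x^2); floor guards against log(0)."""
    return math.log(sum(x * x for x in frame) + floor)
```

Applied to every windowed frame in turn, this yields one energy value per frame.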

(2) Mel cepstrum coefficients (MFCC):

The invention selects the Mel cepstrum coefficients (MFCC) as the other class of characteristic parameter for its speech recognition. The basic procedure for extracting Mel cepstrum coefficients from the preprocessed speech signal is shown in Fig. 1. First, a fast Fourier transform (FFT) is applied to each windowed speech frame to obtain its power spectrum. The power spectrum is then passed through the Mel filter bank, which is in practice a set of normalized triangular band-pass filters, to obtain the Mel spectrum, and the logarithm of the spectrum is taken. Finally, the cepstrum of the Mel spectrum is computed by applying a discrete cosine transform (DCT), giving the Mel cepstrum. The Mel cepstrum coefficients form a feature vector, namely:

where N is the number of triangular filters in the Mel filter bank; N = 20 in the invention. The 12 values C_k for k = 2, 3, ..., 13 are taken as the MFCC coefficients. This set of 12-dimensional MFCC coefficients is the characteristic parameter of the speech frame that the invention uses for speech recognition, as shown in Fig. 1.
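The extraction chain (power spectrum, Mel filter bank, logarithm, DCT) can be sketched as below. Several details are assumptions rather than statements from the text: the Mel scale is taken as 2595·log10(1 + f/700), the triangular filters have unit peak height, and the DCT uses the common textbook form C_k = Σ_j log E_j · cos(πk(j - 0.5)/N) with j running from 1 to N. A direct DFT stands in for the FFT to keep the sketch dependency-free.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of a real frame for bins 0 .. n/2 (direct DFT, O(n^2))."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filterbank(num_filters, n_fft, fs):
    """Unit-peak triangular filters spaced evenly on the Mel scale from 0 to fs/2."""
    top = hz_to_mel(fs / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) * n_fft / fs
             for i in range(num_filters + 2)]  # filter edges as fractional bin indices
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(n_fft // 2 + 1):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def cepstrum_from_log_mel(log_mel, ks):
    """DCT of the log Mel energies, evaluated only at the requested indices k."""
    n = len(log_mel)
    return [sum(e * math.cos(math.pi * k * (j + 0.5) / n) for j, e in enumerate(log_mel))
            for k in ks]


def mfcc(frame, fs, num_filters=20, ks=range(2, 14)):
    """12 coefficients C_k, k = 2..13, from N = 20 Mel filters, as in the text."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), fs)
    log_mel = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-12) for row in bank]
    return cepstrum_from_log_mel(log_mel, ks)
```

The frame passed to `mfcc` is expected to be pre-emphasized and windowed already, matching the preprocessing order described earlier.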

(3) Extraction of nonlinear characteristic parameters of WD patients' speech:

Given the characteristics of WD patients and of the recording environment, the choice of speech features to characterize the information must take into account that features such as short-time average energy (or amplitude) and short-time zero-crossing rate are easily disturbed by the distance and angle between the patient and the microphone and by ambient noise, while speech duration (speaking rate) and the voiced/unvoiced alternation pattern are influenced by individual language differences (dialect) and by the speaker's subjective attitude (emotion). Features that are not easily affected by other factors should therefore be chosen, and features that are easily disturbed should be discarded or given correspondingly lower weight. Judging from the effect of fatigue on the speech signal, when the same person produces the same speech, changes in the nonlinear features (maximum Lyapunov exponent (MLE), approximate entropy (ApEn), and fractal dimension) and in the traditional features (fundamental frequency, formants, and time-varying vocal tract system parameters) reflect the patient's speech information more objectively.
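Of the nonlinear features named above, approximate entropy is the simplest to illustrate. The sketch below follows the standard ApEn definition; the embedding dimension m = 2 and the tolerance r = 0.2 times the standard deviation are conventional defaults assumed here, not values taken from the text. The maximum Lyapunov exponent and the fractal dimension are not shown.

```python
import math


def approximate_entropy(series, m=2, r=None):
    """Approximate entropy ApEn(m, r) of a sequence.

    Low values indicate a regular, predictable signal; higher values indicate
    irregularity. Self-matches are counted, as in the standard definition.
    """
    n = len(series)
    if r is None:
        mean = sum(series) / n
        r = 0.2 * math.sqrt(sum((x - mean) ** 2 for x in series) / n)

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r of template a.
            matches = sum(1 for b in templates
                          if max(abs(p - q) for p, q in zip(a, b)) <= r)
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)
```

The double loop makes the cost quadratic in the series length, so in practice the measure would be computed on short frames or downsampled segments.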

1.4 Key problems to be solved:

(1) Establishing a disfluency test vocabulary suitable for WD patients.

By analysing the collected WD patient speech samples, and building on the existing Huang Mingzhao-Han Zhijuan vocabulary already in use, a dedicated vocabulary suitable for WD patients is designed. Specific vocabularies can also be designed for patients from different regions who speak different dialects.

(2) Speech parameter feature extraction for WD disfluency and database construction.

The collected WD patient speech samples undergo speech signal feature analysis to extract typical features of experimental value. The differences in speech characteristic parameters between patients, or between illness grades, are analysed to identify the characteristic parameters that affect the illness grade classification. Simulation experiments are used to quantify the correlation between each characteristic parameter and the illness grade, from which the characteristic attributes are extracted, and the patient speech database is built at the same time.

(3) Research on speech signal classification and recognition algorithms.

To classify the collected WD patient speech data, machine learning methods are first used to find the valuable regularities in the sample data; from the treatment outcomes and illness grades recorded in the database, an algorithm learns the objective function from the training examples. Using the training samples, the data are then classified with classifiers based on different algorithms, so as to grade the illness.

2.1 Research methods to be adopted:

(1) Theoretical research is combined with experimental research.

(2) Based on compressed sensing theory, using a sparse dictionary and solving with a convex optimization algorithm, and drawing on the analysis of WD patients' speech signals, compression sampling is applied to the speech signals of WD patients.

(3) Through theoretical analysis and machine learning experiments, several classification and recognition algorithms, such as random forests, neural networks, and support vector machines (SVM), are studied, and optimized kernel functions are analysed.

(4) Through theoretical analysis and numerical analysis, the invention studies how to grade the condition of WD patients, updates and revises the vocabulary, and expands and maintains the overall WD patient database.

2.2 Technical route, as shown in Fig. 2: it comprises a preprocessing module, a recognition module, and an output module. In the preprocessing module, the speech input passes through A/D conversion, endpoint detection, pre-emphasis, windowing, and parameter extraction. The recognition module comprises standard acoustic templates and recognition: after parameter extraction, the signal is processed against the standard acoustic templates and then recognized, with recognition based on a dictionary and grammar rules. The recognition result is output through the output interface of the output module.

2.3 Experimental scheme:

(1) Design of the main speech recognition performance indicators

Vocabulary

The selected entries should comprehensively reflect the speaker's Mandarin articulation ability.

They include sound combinations of 21 initial and 18 final phonemes.

The "Chinese Articulation Ability Assessment Vocabulary" (2006) compiled by Huang Mingzhao and Han Zhijuan is provisionally chosen as the reference.

Taking into account the different manifestations of disfluency in nervous system diseases, the vocabulary is divided into groups, and the specific 50 words that make up the vocabulary are finally determined.

Acquisition mode: the subject says one vocabulary entry at a time, so the task is isolated-word recognition.

Recognition is carried out jointly by a speech recognition engine based on isolated-word recognition and a recognition engine based on Microsoft SAPI continuous speech recognition.

Recognition target: unspecified speakers (speaker-independent).

Noise requirements: the system's noise immunity must adapt to environmental changes, and the system should also be used in a quiet environment with little or no noise; a high degree of robustness is required.

(2) Design of the experimental procedure:

Patients with nervous system diseases, such as Parkinson's disease, who show relatively obvious disfluency symptoms are screened as subjects. The patient's basic information (name, height, weight, age, etc.), duration of illness, test time, symptoms, and medication before the experiment are recorded. The chief examiner explains the specific requirements of the experiment to the subject so that the subject fully understands its form. When the experiment begins, the distance between the microphone and the subject's lips is kept at about 10 cm.

The experimental equipment consists of a high-quality professional unidirectional microphone, a voice acquisition card, a computer, and monitoring-grade headphones.

Requirements for the room and facilities during the doctor's examination: the room should be quiet and contain no objects that could distract the patient. Two armless chairs and a training table should be provided. The chairs should be of a height that puts the examiner and the patient at the same level. To avoid distracting the patient, the patient's relatives or nursing staff should not stay in the room.

For voice acquisition and real-time system evaluation, a quiet room free of interference is chosen as the test site, with ambient noise below 45 dB.

Specific steps:

Speech from patients whose disfluency is caused by different conditions is collected, and a speech database organized by category is built.

An experimental vocabulary with words as the test objects is drawn up. The vocabulary suited to this experiment is formulated according to the "CRDS" strategy for voice disorder training proposed by Liu Qiaoyun, Huang Zhaoming, and others.

The patient's articulation function is evaluated automatically with equipment such as a computer in three respects: content, tone, and disorder type.

On the basis of the Speech SDK recognition engine, combined with a self-built recognition engine, the WD dysarthria illness classification method is studied.

Given the small sample size in the early stage, a speech recognition algorithm based on the hidden Markov model is used to achieve the best recognition process in the ensemble-average sense.

Speech spectrum characteristic parameters are extracted for different sexes and age groups, and the differences in pitch period, frequency, short-time spectrum, short-time energy, and short-time average magnitude between conditions are discussed. The differences between the disfluencies of various nervous system diseases are then analysed through indices such as the fundamental frequency, frequency perturbation quotient, and amplitude perturbation quotient (shimmer) of the voice samples.

The similarities and differences in disfluency between different types of patient are discussed by region in terms of prosodic structure (word length, morphology, and tone) and of phonemes and phonetic features (syllable, tone, consonant, vowel, lexeme, etc.); a speech corpus of regional dialects is built, and the differences between the dialects and Mandarin are studied.

The subject's test information is stored and, after a period of treatment, the subject is tested again and the results are compared to analyse how the condition has developed.
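The perturbation indices mentioned in the steps above can be illustrated with a simplified sketch. It computes the local cycle-to-cycle perturbation, the mean absolute difference between consecutive values divided by the mean value, for a list of pitch periods (jitter) or of per-cycle peak amplitudes (shimmer). This is a simpler measure than the multi-cycle smoothed perturbation quotients named in the text, and the extraction of periods and amplitudes from the waveform is assumed to have been done already; the function names are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(a - b) for a, b in zip(values[:-1], values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter(pitch_periods):
    """Local jitter: cycle-to-cycle variation of the pitch period."""
    return local_perturbation(pitch_periods)


def shimmer(peak_amplitudes):
    """Local shimmer: cycle-to-cycle variation of the peak amplitude."""
    return local_perturbation(peak_amplitudes)
```

Stored per session, these values give one way to compare a patient's recordings before and after a period of treatment.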

As shown in Fig. 3, an experimental comparison is first carried out on the pronunciation of the Chinese character for "west", testing parameters such as the prediction coefficients, the cepstrum, the LPC cepstrum, the short-time spectrum, the LPC spectrum estimate, the line spectral pair coefficients, and the spectrum estimate obtained from the line spectral pairs.

The invention has been described above by way of example with reference to the drawings. Its implementation is clearly not limited to the manner described: any improvement made using the inventive concept and technical scheme of the invention, and any direct application of that concept and scheme to other situations without improvement, falls within the protection scope of the invention.

Claims

1. A classification method for hepatolenticular degeneration disfluency based on compressed sensing, comprising the following steps:

proposing a sound preprocessing method for WD patients' speech signals;

obtaining the characteristic attributes of the speech characteristic parameters for WD patients' disfluency;

developing a compressed-sensing-based classification system for hepatolenticular degeneration disfluency that meets the requirements of practical clinical application.

2. The classification method for hepatolenticular degeneration disfluency based on compressed sensing according to claim 1, characterized in that the preprocessing method for WD patients' speech signals comprises the following steps: step 1, input and filtering of sound; step 2, compressed sensing of the speech signal; step 3, endpoint detection; step 4, pre-emphasis; step 5, windowing and framing.

3. The classification method for hepatolenticular degeneration disfluency based on compressed sensing according to claim 2, characterized in that the filtering in step 1 applies a band-pass filter to the speech signal, the upper and lower cutoff frequencies of the band-pass filter being f_H = 20 kHz and f_L = 60 Hz respectively.

4. The classification method for hepatolenticular degeneration disfluency based on compressed sensing according to claim 2, characterized in that step 3 detects the endpoints of valid speech by combining the autocorrelation maximum of the speech signal with its threshold-crossing rate: suppose the time-domain expression of the windowed speech signal is x(m), the n-th frame of the signal is x_n(m), and the frame length is N; the short-time autocorrelation function of the speech signal is then:

R_n(k) = Σ_{m=0}^{N-1-k} x_n(m)·x_n(m+k)    (2)

where 0 ≤ k ≤ K and K is the maximum number of delay points.

5. The classification method for hepatolenticular degeneration disfluency based on compressed sensing according to claim 2, characterized in that the purpose of step 4 is to boost the energy of the high-frequency components and so improve the signal-to-noise ratio, so that a uniform signal-to-noise ratio can be used when performing spectrum analysis and computing vocal tract parameters, which reduces the computational difficulty; a first-order digital filter is used as the pre-emphasis digital filter, raising the high-frequency components of the signal by 6 dB/octave:

H(z) = 1 - μz^-1    (3)

where the system function H(z) is a polynomial in z^-1 whose pole lies at the origin of the z-plane, and μ is the pre-emphasis coefficient.

6. The classification method for hepatolenticular degeneration disfluency based on compressed sensing according to claim 2, characterized in that in step 5 the window function, denoted ω(n), is the tool used to extract "speech frames" from continuous speech; its defining property is that it sets to zero every speech sample outside the region to be processed, which is what "framing" the speech signal means; there are two framing methods, contiguous framing and overlapping framing, of which overlapping framing is the one generally used; the windowed speech signal is expressed as:

s_ω(n) = s(n)·ω(n)    (4)

where ω(n) is the window function and s(n) is the original speech signal.

7. The classification method for hepatolenticular degeneration disfluency based on compressed sensing according to any one of claims 1 to 6, characterized in that the speech characteristic parameters for WD patients' disfluency are the short-time energy, the Mel cepstrum coefficients (MFCC), and the nonlinear characteristic parameters of the WD patient's speech.

8. A classification method for hepatolenticular degeneration disfluency based on compressed sensing, comprising the following steps:

drawing up an experimental vocabulary with words as the test objects;

automatically evaluating the patient's articulation function with equipment such as a computer in three respects: content, tone, and disorder type;

on the basis of the Speech SDK recognition engine, combined with a self-built recognition engine, developing a feasibility analysis system for automatic dysarthria assessment;

given the small sample size in the early stage, using a speech recognition algorithm based on the hidden Markov model to achieve the best recognition process in the ensemble-average sense;

extracting speech spectrum characteristic parameters for different sexes and age groups, discussing how pitch period, frequency, short-time spectrum, short-time energy, and short-time average magnitude differ between conditions, and then analysing the differences between the disfluencies of various nervous system diseases through indices such as the fundamental frequency, frequency perturbation quotient, and amplitude perturbation quotient (shimmer) of the voice samples;

discussing by region the similarities and differences in disfluency between different types of patient in terms of prosodic structure (word length, morphology, and tone) and of phonemes and phonetic features (syllable, tone, consonant, vowel, lexeme, etc.), building a speech corpus of regional dialects, and studying the differences between the dialects and Mandarin;

storing the disfluency patient's test information and, after a period of treatment, testing again and comparing the results to analyse how the condition has progressed.