CN108074585A - Voice abnormality detection method based on sound source characteristics - Google Patents

Voice abnormality detection method based on sound source characteristics

Info

Publication number
CN108074585A
Authority
CN
China
Prior art keywords
voice
glottis
sound source
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810126670.6A
Other languages
Chinese (zh)
Inventor
Yao Xiao (姚潇)
Bai Wensong (白文松)
Li Weiliang (李伟亮)
Xu Ning (徐宁)
Li Xu (李旭)
Jiang Aimin (蒋爱民)
Liu Xiaofeng (刘小峰)
Zhang Xuewu (张学武)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University filed Critical Changzhou Campus of Hohai University
Priority to CN201810126670.6A priority Critical patent/CN108074585A/en
Publication of CN108074585A publication Critical patent/CN108074585A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/51 - specially adapted for particular use, for comparison or discrimination
    • G10L25/03 - characterised by the type of extracted parameters
    • G10L25/27 - characterised by the analysis technique
    • G10L25/45 - characterised by the type of analysis window

Abstract

The invention discloses a voice abnormality detection method based on sound source characteristics, comprising the following steps: collecting voice data in real time through a sensor; pre-processing the obtained speech segments; for the voice data of the speech segments, obtaining the glottal wave signal by iterative adaptive inverse filtering; extracting characteristic parameters from the glottal wave signal, namely the normalized amplitude quotient and the glottal closing time ratio; feeding the extracted features into a trained SVM model for classification; obtaining the classification label, which is used to judge the speaker's state, outputting the speaker state label, and passing it to an execution module for feedback. The invention is characterized in that, for stressed speech produced under pressure, it breaks away from recognition methods that rely on the traditional linear speech production model and on acoustic feature parameters lacking physical meaning; instead it establishes a sound source estimation model and, using the inverse filtering technique of speech production, analyzes and extracts characteristic parameters based on human vocal fold vibration to detect abnormal speech.

Description

Voice abnormality detection method based on sound source characteristics
Technical field
The present invention relates to a voice abnormality detection method based on sound source characteristics, and belongs to the technical field of intelligent speech.
Background technology
Pressure is the body's natural reaction to physical, psychological or emotional stimulation. When we are subjected to such stimulation, the brain triggers the release of hormones and peptide substances into the body, causing a stress reaction. This sustained work-related tension is reflected in the vocal organs, leading to changes in a series of parameters such as voice frequency and speaking rate. These changes are of great significance in various fields of speech signal processing, such as stressed speech recognition and emotion recognition.
An important manifestation of pressure is the speaker's voice while speaking, and stress is a very important factor affecting speech production. When the surrounding environment or the speaker's own condition varies abnormally, or when the user is concentrating on a primary task and speech recognition merely assists other work, the task pressure present in this process places the speaker under stress. This strongly affects pronunciation and produces abnormal, altered speech; the abnormal state is often embodied in the speaker's voice, forming a speech signal under an abnormal pressure state.
However, stressed speech produced under pressure, especially under multi-task cognitive load, is acoustically difficult to discriminate: general acoustic features cannot classify it correctly and lack stability and robustness. Furthermore, in the production process of stressed speech, the sound source characteristics differ significantly from those of normal speech. Therefore, in the detection process, we use sound source characteristics to improve the reliability of stressed speech classification. Improving the labeling efficiency of stressed speech lays the foundation for strongly robust speech recognition systems.
Content of the invention
The problem to be solved by the present invention is to detect the pressure state from the perspective of the sound source of speech production, and a pressure detection method based on speech production modeling is proposed. The invention breaks away from recognition methods that rely on the traditional linear speech production model and on acoustic feature parameters lacking physical meaning; it establishes a sound source estimation model and, using the inverse filtering technique of speech production, analyzes and extracts characteristic parameters based on human vocal fold vibration to detect abnormal speech.
The technical scheme of the present invention is as follows:
A voice abnormality detection method based on sound source characteristics comprises the following steps:
(1) collecting voice data in real time through a sensor;
(2) distinguishing the speech segments and noise segments of the voice data by endpoint detection, so as to decide whether to carry out the subsequent speech signal processing;
(3) framing and windowing the voice data of the obtained speech segments, and applying high-frequency pre-emphasis to each frame;
(4) for the voice data of the speech segments, obtaining the glottal wave signal by iterative adaptive inverse filtering;
(5) extracting the characteristic parameters of the glottal wave: the normalized amplitude quotient and the glottal closing time ratio;
(6) feeding the extracted data into a trained SVM model for classification;
(7) obtaining the classification label, which is used to judge the speaker's state, outputting the speaker state label, and passing it to an execution module for feedback.
In step (3), the windowing applies a Hamming window to each speech frame.
In step (3), the high-frequency pre-emphasis boosts the high-frequency part of the signal with a first-order finite impulse response high-pass filter.
In step (4), the glottal wave signal is obtained as follows:
(a) the vocal tract model is established using iterative adaptive inverse filtering;
the iterative adaptive inverse filtering removes the influence of the glottal excitation from the spectrum of the original speech signal;
(b) the influence of the formants is then removed by inverse filtering;
that is, an acoustic model is established by linear predictive coding and the discrete all-pole model, and the glottal wave signal is finally obtained by inverse filtering.
In step (5), the characteristic parameter normalized amplitude quotient of the glottal wave is extracted as follows:
NAQ = AQ / T = f_ac / (dpeak · T)    (1)
where NAQ is the normalized amplitude quotient; T is the pitch period; AQ is the amplitude quotient, i.e. the ratio of the maximum amplitude of the glottal wave to the maximum negative peak of its first derivative:
AQ = f_ac / dpeak    (2)
where f_ac is the maximum peak value of the glottal wave and dpeak is the maximum negative peak of the first derivative of the glottal wave.
In step (5), the glottal closing time ratio is extracted as follows:
CPR = CP / O    (3)
where CPR is the glottal closing time ratio; CP is the duration of the glottal closed phase; O is the total glottal open time.
The advantageous effects achieved by the present invention:
Building on research into how the phonation physiological system changes under the influence of pressure, the present invention studies the intrinsic connection between physiological features and sound source parameters and verifies that glottal wave characteristics can reflect important factors of the pressure state, so that the obtained glottal wave parameters not only have theoretical grounding but also carry concrete physical meaning. By finding glottal wave parameters that describe the pressure-related sound source characteristics of the phonation system, the invention establishes the intrinsic connection between sound source characteristics and physiological characteristics; these features correlate with pressure-induced variation, indicate the vibration mode of the vocal folds and have physical meaning, and finally improve the precision and reliability of detecting abnormal speech states.
The present invention can be applied in a vehicle environment: the pressure state of the driver and passengers is judged by detecting their voice data, the state information is fed back to an execution module through a transmission device, and the execution module then automatically takes effective measures such as reminding the driver to be careful or notifying nearby vehicles via the Internet of Vehicles to take evasive action, so as to protect life and property.
Description of the drawings
Fig. 1 is the basic flow chart of the present invention;
Fig. 2 is the basic flow chart for obtaining the SVM classification model;
Fig. 3 is the iterative adaptive inverse filtering (IAIF) architecture established by the present invention;
Fig. 4 shows the ROC curves of the five parameters in Embodiment 1;
Fig. 5 lists the ROC parameter values of the five features in Embodiment 1, where AUC is the area under the curve, SE is the standard error and CL is the confidence interval;
Fig. 6 shows the average recognition rate of the classifiers obtained after 50 rounds of experiments in Embodiment 1.
Specific embodiment
The invention is further described below with reference to the accompanying drawings. The following embodiments are only used to clearly illustrate the technical solution of the present invention and are not intended to limit its protection scope.
As shown in Fig. 1, a voice abnormality detection method based on sound source characteristics comprises the following steps:
(1) collecting voice data in real time through a sensor;
(2) distinguishing the speech segments and noise segments of the voice data by endpoint detection, so as to decide whether to carry out the subsequent speech signal processing;
The present invention uses an endpoint detection method based on short-time energy and short-time zero-crossing rate to distinguish speech segments effectively. This is an existing, mature detection method and is not elaborated here.
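For illustration, a minimal sketch of such an energy plus zero-crossing-rate endpoint detector is given below; the frame length and the two thresholds are illustrative assumptions (a signal normalized to [-1, 1] is assumed), not values specified by the patent.

```python
import numpy as np

def endpoint_detect(signal, sr, frame_ms=25, energy_thr=1e-4, zcr_thr=0.25):
    """Flag frames as speech using short-time energy plus zero-crossing rate."""
    frame_len = int(sr * frame_ms / 1000)
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = float(np.mean(frame ** 2))                         # short-time energy
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)   # crossings per sample
        # Strong energy suggests voiced speech; weak energy with high ZCR suggests unvoiced speech.
        flags.append(energy > energy_thr or
                     (energy > 0.1 * energy_thr and zcr > zcr_thr))
    return flags  # True = speech frame, False = noise/silence frame
```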
(3) framing and windowing the voice data of the obtained speech segments, and applying high-frequency pre-emphasis to each frame;
Pre-emphasis: the average power spectrum of the speech signal is shaped by the glottal excitation and by mouth and nose radiation, and above roughly 800 Hz the high-frequency end rolls off at about 6 dB/oct (octave), so the higher the frequency, the smaller the corresponding component. Therefore, before analysis, the high-frequency part of the speech signal is boosted by a first-order finite impulse response high-pass filter.
Framing: since the speech signal is short-time stationary, it can be processed frame by frame. Macroscopically, a frame must be short enough that the signal within it is stationary, i.e. the frame length should be less than the length of a phoneme. At a normal speaking rate a phoneme lasts roughly 50-200 milliseconds, so the frame length is generally kept below 50 milliseconds. Microscopically, a frame must contain enough vibration periods, because the Fourier transform analyzes frequency and needs a sufficient number of repetitions to do so. The fundamental frequency of speech is around 100 Hz for male voices and around 200 Hz for female voices, corresponding to periods of 10 ms and 5 ms; since a frame must contain multiple periods, a frame length of at least 20 milliseconds is generally chosen.
Windowing: a Hamming window is applied to each speech frame; it offers good frequency resolution and also reduces spectral leakage, thereby reducing the influence of the Gibbs effect.
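The pre-emphasis, framing and windowing described above can be sketched as follows; the pre-emphasis coefficient of 0.97, the 25 ms frame length and the 10 ms frame shift are common illustrative choices rather than values fixed by the patent.

```python
import numpy as np

def preemphasis(signal, alpha=0.97):
    # First-order FIR high-pass filter: y[n] = x[n] - alpha * x[n-1]
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal, sr, frame_ms=25, shift_ms=10):
    """Split the signal into overlapping frames and apply a Hamming window."""
    frame_len = int(sr * frame_ms / 1000)
    shift = int(sr * shift_ms / 1000)
    window = np.hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, shift):
        frames.append(signal[start:start + frame_len] * window)
    return np.array(frames)

# Usage: frames = frame_and_window(preemphasis(x), sr=16000)
```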
(4) as shown in Fig. 3, for the voice data of the speech segments, obtaining the glottal wave signal by iterative adaptive inverse filtering;
(a) the vocal tract model is established using iterative adaptive inverse filtering (IAIF);
the iterative adaptive inverse filtering removes the influence of the glottal excitation from the spectrum of the original speech signal;
(b) the influence of the formants is then removed by inverse filtering (IF);
that is, an acoustic model is established by linear predictive coding (LPC) and the discrete all-pole model (DAP), and the glottal wave signal is finally obtained by inverse filtering (IF), as shown in Fig. 2.
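As a rough illustration of the inverse-filtering idea, the sketch below estimates a glottal wave with two LPC stages. It is a heavily simplified stand-in for the full IAIF/DAP procedure of Fig. 3, not the patented algorithm itself; the model orders and the final detrending step are assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def lpc_coeffs(x, order):
    """LPC via the autocorrelation method (solving the Toeplitz normal equations)."""
    r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))   # A(z) = 1 - sum_k a_k z^{-k}

def glottal_wave_estimate(frame, sr, vt_order=None, glottal_order=2):
    """Simplified two-stage inverse-filtering sketch (not the full IAIF algorithm)."""
    vt_order = vt_order or int(sr / 1000) + 2
    # Stage 1: remove a coarse estimate of the glottal spectral tilt.
    g = lpc_coeffs(frame, glottal_order)
    tilt_removed = lfilter(g, [1.0], frame)
    # Stage 2: estimate the vocal tract on the tilt-removed signal,
    # then inverse-filter the original frame to cancel the formants.
    vt = lpc_coeffs(tilt_removed, vt_order)
    dglottal = lfilter(vt, [1.0], frame)          # derivative-like glottal signal
    glottal = np.cumsum(dglottal)                  # integrate to approximate the glottal wave
    # Remove the linear trend introduced by the integration (an assumed post-processing step).
    return glottal - np.linspace(glottal[0], glottal[-1], len(glottal))
```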
(5) extracting the characteristic parameters of the glottal wave: the normalized amplitude quotient and the glottal closing time ratio;
Under work-pressure conditions, contraction of the vocal fold muscles randomizes the vocal fold vibration, which changes the airflow pattern within the glottis and causes the speech signal to vary. These changes in vocal fold behavior are reflected in the characteristics of the glottal wave, so the glottal wave can reflect work pressure to a certain extent. We use the normalized amplitude quotient (NAQ) and the glottal closing time ratio (CPR) to characterize the intrinsic properties of the glottal wave; the proposed features have concrete physical meaning and reflect the different vibration modes of the vocal folds during speech production.
Glottal wave characteristic parameter one: the normalized amplitude quotient, which mainly reflects the closing behavior of the vocal folds, is extracted as follows:
NAQ = AQ / T = f_ac / (dpeak · T)    (1)
where NAQ is the normalized amplitude quotient; T is the pitch period; AQ is the amplitude quotient, i.e. the ratio of the maximum amplitude of the glottal wave to the maximum negative peak of its first derivative:
AQ = f_ac / dpeak    (2)
where f_ac is the maximum peak value of the glottal wave and dpeak is the maximum negative peak of the first derivative of the glottal wave. Since the instantaneous moments of glottal closure and opening do not need to be measured, AQ is easy to obtain, but its value depends on the measured fundamental frequency (F0); therefore, in formula (1), AQ is normalized by the pitch period to obtain NAQ, which removes the dependence on the fundamental frequency measurement.
Glottal wave characteristic parameter two: the glottal closing time ratio (CPR). The CPR parameter reflects the proportion of the glottal closed phase relative to the total glottal open time, and manifests itself mainly as the skewness of the glottal signal.
The glottal closing time ratio is extracted as follows:
CPR = CP / O    (3)
where CPR is the glottal closing time ratio; CP is the duration of the glottal closed phase; O is the total glottal open time.
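Formula (3) can be sketched as below; deciding which samples belong to the closed phase requires a decision rule, and the simple amplitude threshold used here (10% of the peak) is an illustrative assumption, not part of the patent.

```python
import numpy as np

def closing_time_ratio(glottal, closed_thr=0.1):
    """CPR = CP / O, per formula (3): closed-phase duration over open time.
    A sample is treated as 'closed' when the baseline-removed glottal wave
    stays below closed_thr * its peak, an illustrative decision rule."""
    g = glottal - np.min(glottal)              # shift so the baseline is zero
    closed = g < closed_thr * np.max(g)
    cp = np.count_nonzero(closed)              # samples in the closed phase
    o = np.count_nonzero(~closed)              # samples in the open phase
    return cp / o if o > 0 else np.inf
```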
(6) feeding the extracted data into the trained SVM model for classification;
Support vector machines (SVM) have always played an important role in pattern recognition [8]; the so-called support vectors are the training sample points lying at the edge of the margin. SVM classifies using linear or nonlinear hyperplanes. It is built on the VC-dimension theory of statistical learning theory and on the principle of structural risk minimization: given limited sample information, it seeks the best compromise between model complexity (i.e. the learning precision on the given training samples) and learning ability (the ability to classify arbitrary samples without error), so as to obtain the best generalization ability. The support vector machine is essentially a two-class nonlinear classifier, which suits the particular characteristics of stressed speech recognition: (1) since the speaker is not under pressure at every moment while speaking, pressure appears as brief, transient episodes in continuous speech, so only a small number of samples can be labeled as speech changed under pressure, and stressed speech recognition is therefore usually a small-sample problem; (2) stressed speech recognition is a typical two-class recognition problem. We establish a speaker-dependent classification and recognition model based on SVM; since the number of samples available per subject speaker is relatively small, this is a typical small-sample problem, and in this situation the SVM model achieves relatively good recognition results.
In the present invention, the SVM classification model classifies speech under stressed and normal conditions, which evaluates the sensitivity of the sound source parameters in the proposed method to state changes and thereby verifies the validity of the proposed method.
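A minimal sketch of the two-class SVM stage follows, using scikit-learn as one possible implementation; the RBF kernel and its parameters are illustrative assumptions, since the patent does not fix them.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def train_stress_classifier(features, labels):
    """features: (n_samples, 2) array of [NAQ, CPR]; labels: 0 = normal, 1 = stressed."""
    clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0))
    clf.fit(features, labels)
    return clf

# Usage: clf = train_stress_classifier(X_train, y_train)
#        speaker_state = clf.predict(X_test)   # classification labels for step (7)
```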
(7) obtaining the classification label, which is used to judge the speaker's state, outputting the speaker state label, and passing it to the execution module for feedback.
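Putting the steps together, a minimal orchestration sketch might look as follows; the four callables passed in are hypothetical stand-ins for the endpoint detection, pre-processing, inverse filtering and feature extraction described above, not functions defined by the patent.

```python
import numpy as np

def detect_abnormal_speech(signal, sr, svm_model, detect_segments,
                           preprocess_frames, estimate_glottal_wave,
                           extract_features):
    """Chain steps (1)-(7); the four callables are hypothetical placeholders."""
    state_labels = []
    for segment in detect_segments(signal, sr):           # step (2): endpoint detection
        frames = preprocess_frames(segment, sr)            # step (3): pre-emphasis, framing, windowing
        feats = np.array([extract_features(                # step (5): NAQ and CPR
            estimate_glottal_wave(frame, sr), sr)           # step (4): inverse filtering
            for frame in frames])
        state_labels.extend(svm_model.predict(feats))      # step (6): SVM classification
    return state_labels                                     # step (7): labels for the execution module
```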
Embodiment 1
We used a database collected by Fujitsu, which contains speech samples of 11 speakers (4 male and 7 female). To simulate specific situations that generate psychological pressure, three different tasks were set for the speakers, who carried on a telephone conversation with an operator while performing them, in order to simulate pressure situations on the phone.
The three tasks involved are (A) high concentration (the speaker is required to complete tasks including solving logic puzzles and finding the differences between two pictures); (B) time pressure (the speaker is required to answer questions under time pressure); (C) risk taking (a risk-taking task is used to assess the speaker's desire for monetary gain). For each speaker there are four dialogues with different tasks. In two of the dialogues the speaker is required to complete the task within a limited time, while in the other dialogues there is no task and the speaker can chat freely.
The parts intercepted from the speech are the vowels /a/, /i/, /u/, /e/, /o/. These experiments are carried out for each speaker, and all results are determined per speaker. The experiments on the 11 selected subjects are conducted in a speaker-dependent setting; the number of samples depends on the speaker, and the total number of speech samples is 700.
In the present invention, the verification data all come from telephone communication data, in which 100 subjects (50 male, 50 female) took part. In the experiment, an operator chatted with each subject by phone, with an average of four dialogues per person and 10 minutes of conversation per dialogue, and the most natural voice communication data were recorded. Of the four dialogues, two are relaxed chats, while in the other two the subject is placed under different types of pressure, including: (1) multi-tasking; (2) time pressure; (3) risk taking; details are given in Table 1. The real speech data spoken by the subjects under pressure are recorded for verifying the validity of the pressure detection method.
Table 1
In order to verify the validity of the proposed method, the present invention uses the receiver operating characteristic (ROC) curve to evaluate the recognition performance of the different parameters, as shown in Fig. 4 and Fig. 5. The ROC curve is drawn from a series of different classification thresholds (cut-off values), with the true positive rate (sensitivity) as the ordinate and the false positive rate (1 - specificity) as the abscissa. The closer the ROC curve is to the upper-left corner and the larger the area under the curve (AUC), the better the recognition performance of the method and the higher its accuracy.
True positive rate (TPR): TPR = TP / (TP + FN)
False positive rate (FPR): FPR = FP / (FP + TN)
where TP denotes true positives, TN true negatives, FP false positives, and FN false negatives.
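These quantities and the AUC can be computed as in the following sketch, with scikit-learn as one possible implementation:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def evaluate_roc(y_true, scores):
    """y_true: 0/1 labels; scores: classifier decision values for the positive class."""
    fpr, tpr, thresholds = roc_curve(y_true, scores)   # FPR = FP/(FP+TN), TPR = TP/(TP+FN)
    auc = roc_auc_score(y_true, scores)                # area under the ROC curve
    return fpr, tpr, auc

# Usage with an SVM: fpr, tpr, auc = evaluate_roc(y_test, clf.decision_function(X_test))
```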
The present invention compares the proposed source model parameters with traditional parameters in terms of average recognition rate in pressure detection, showing that the method based on speech production modeling has a clear advantage among pressure detection methods and thus achieves the purpose of distinguishing the normal state from the abnormal state. Three traditional speech parameters, the fundamental frequency, the Mel-frequency cepstral coefficients and the parabolic spectral parameter (F0, MFCC, PSP), are used as the experimental comparison group.
In the classification stage, NAQ and CPR are combined into a two-dimensional vector to build the SVM model; 125 groups of samples are chosen as the training set and 125 groups as the test set. Speech samples of 7 different speakers (4 male, 3 female) are selected from the database in order to eliminate variation of the speech parameters caused by individual differences, while F0, MFCC and PSP are trained in the form of one-dimensional samples as the experimental comparison group. As shown in Fig. 6, the average recognition rate of each parameter in the SVM classification model is calculated after 50 rounds of experiments. It can be seen that, compared with the traditional parameters, the NAQ and CPR sound source features exhibit good recognition performance for abnormal speech under the abnormal state.
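The repeated evaluation described above could be sketched as follows; the 50/50 random split mirrors the 125/125 division in the text, while the use of train_test_split and the RBF kernel are illustrative choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def average_recognition_rate(X, y, rounds=50):
    """Average SVM accuracy over repeated random 50/50 train/test splits."""
    accs = []
    for seed in range(rounds):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, stratify=y, random_state=seed)
        clf = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
        clf.fit(X_tr, y_tr)
        accs.append(clf.score(X_te, y_te))
    return float(np.mean(accs))
```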
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the technical principles of the present invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. A voice abnormality detection method based on sound source characteristics, characterized by comprising the following steps:
(1) collecting voice data in real time through a sensor;
(2) distinguishing the speech segments and noise segments of the voice data by endpoint detection, so as to decide whether to carry out the subsequent speech signal processing;
(3) framing and windowing the voice data of the obtained speech segments, and applying high-frequency pre-emphasis to each frame;
(4) for the voice data of the speech segments, obtaining the glottal wave signal by iterative adaptive inverse filtering;
(5) extracting the characteristic parameters of the glottal wave: the normalized amplitude quotient and the glottal closing time ratio;
(6) feeding the extracted data into a trained SVM model for classification;
(7) obtaining the classification label, which is used to judge the speaker's state, outputting the speaker state label, and passing it to an execution module for feedback.
2. The voice abnormality detection method based on sound source characteristics according to claim 1, characterized in that: in step (3), the windowing applies a Hamming window to each speech frame.
3. The voice abnormality detection method based on sound source characteristics according to claim 1, characterized in that: in step (3), the high-frequency pre-emphasis boosts the high-frequency part of the signal with a first-order finite impulse response high-pass filter.
4. The voice abnormality detection method based on sound source characteristics according to claim 1, characterized in that: in step (4), the glottal wave signal is obtained as follows:
(a) the vocal tract model is established using iterative adaptive inverse filtering;
the iterative adaptive inverse filtering removes the influence of the glottal excitation from the spectrum of the original speech signal;
(b) the influence of the formants is then removed by inverse filtering;
that is, an acoustic model is established by linear predictive coding and the discrete all-pole model, and the glottal wave signal is finally obtained by inverse filtering.
5. The voice abnormality detection method based on sound source characteristics according to claim 1, characterized in that: in step (5), the characteristic parameter normalized amplitude quotient of the glottal wave is extracted as follows:
NAQ = AQ / T = f_ac / (dpeak · T)    (1)
where NAQ is the normalized amplitude quotient; T is the pitch period; AQ is the amplitude quotient, i.e. the ratio of the maximum amplitude of the glottal wave to the maximum negative peak of its first derivative:
AQ = f_ac / dpeak    (2)
where f_ac is the maximum peak value of the glottal wave and dpeak is the maximum negative peak of the first derivative of the glottal wave.
6. The voice abnormality detection method based on sound source characteristics according to claim 1, characterized in that: in step (5), the glottal closing time ratio is extracted as follows:
CPR = CP / O    (3)
where CPR is the glottal closing time ratio; CP is the duration of the glottal closed phase; O is the total glottal open time.
CN201810126670.6A 2018-02-08 2018-02-08 Voice abnormality detection method based on sound source characteristics Pending CN108074585A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810126670.6A CN108074585A (en) 2018-02-08 2018-02-08 Voice abnormality detection method based on sound source characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810126670.6A CN108074585A (en) 2018-02-08 2018-02-08 Voice abnormality detection method based on sound source characteristics

Publications (1)

Publication Number Publication Date
CN108074585A (en) 2018-05-25

Family

ID=62155229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810126670.6A Pending Voice abnormality detection method based on sound source characteristics

Country Status (1)

Country Link
CN (1) CN108074585A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112735386A (en) * 2021-01-18 2021-04-30 苏州大学 Voice recognition method based on glottal wave information
CN113824843A (en) * 2020-06-19 2021-12-21 大众问问(北京)信息科技有限公司 Voice call quality detection method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011149558A2 (en) * 2010-05-28 2011-12-01 Abelow Daniel H Reality alternate
CN102324229A (en) * 2011-09-08 2012-01-18 中国科学院自动化研究所 Method and system for detecting abnormal use of voice input equipment
CN103730130A (en) * 2013-12-20 2014-04-16 中国科学院深圳先进技术研究院 Detection method and system for pathological voice
US9338547B2 (en) * 2012-06-26 2016-05-10 Parrot Method for denoising an acoustic signal for a multi-microphone audio device operating in a noisy environment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011149558A2 (en) * 2010-05-28 2011-12-01 Abelow Daniel H Reality alternate
CN102324229A (en) * 2011-09-08 2012-01-18 中国科学院自动化研究所 Method and system for detecting abnormal use of voice input equipment
US9338547B2 (en) * 2012-06-26 2016-05-10 Parrot Method for denoising an acoustic signal for a multi-microphone audio device operating in a noisy environment
CN103730130A (en) * 2013-12-20 2014-04-16 中国科学院深圳先进技术研究院 Detection method and system for pathological voice

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Ning (李宁): "Research on pathological noise classification based on acoustic parameters and support vector machines", East China Normal University *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113824843A (en) * 2020-06-19 2021-12-21 大众问问(北京)信息科技有限公司 Voice call quality detection method, device, equipment and storage medium
CN113824843B (en) * 2020-06-19 2023-11-21 大众问问(北京)信息科技有限公司 Voice call quality detection method, device, equipment and storage medium
CN112735386A (en) * 2021-01-18 2021-04-30 苏州大学 Voice recognition method based on glottal wave information
CN112735386B (en) * 2021-01-18 2023-03-24 苏州大学 Voice recognition method based on glottal wave information

Similar Documents

Publication Publication Date Title
Hansen et al. Speaker recognition by machines and humans: A tutorial review
US8428945B2 (en) Acoustic signal classification system
CN109044396B (en) Intelligent heart sound identification method based on bidirectional long-time and short-time memory neural network
CN106941005A (en) A kind of vocal cords method for detecting abnormality based on speech acoustics feature
CN101923855A (en) Test-irrelevant voice print identifying system
Sahoo et al. Silence removal and endpoint detection of speech signal for text independent speaker identification
Vikram et al. Estimation of Hypernasality Scores from Cleft Lip and Palate Speech.
CN110265063A (en) A kind of lie detecting method based on fixed duration speech emotion recognition sequence analysis
Kim et al. Hierarchical approach for abnormal acoustic event classification in an elevator
Subhashree et al. Speech Emotion Recognition: Performance Analysis based on fused algorithms and GMM modelling
CN108074585A (en) A kind of voice method for detecting abnormality based on sound source characteristics
Whitehill et al. Whosecough: In-the-wild cougher verification using multitask learning
CN110415707B (en) Speaker recognition method based on voice feature fusion and GMM
Kalimoldayev et al. Voice verification and identification using i-vector representation
Thomas et al. Data-driven voice source waveform modelling
Islam et al. Neural-Response-Based Text-Dependent speaker identification under noisy conditions
Shofiyah et al. Voice recognition system for home security keys with mel-frequency cepstral coefficient method and backpropagation artificial neural network
Komlen et al. Text independent speaker recognition using LBG vector quantization
Godino-Llorente et al. Automatic detection of voice impairments due to vocal misuse by means of gaussian mixture models
Estrebou et al. Voice recognition based on probabilistic SOM
Arsikere et al. Speaker recognition via fusion of subglottal features and MFCCs
Pandiaraj et al. A confidence measure based—Score fusion technique to integrate MFCC and pitch for speaker verification
Warule et al. Empirical Mode Decomposition Based Detection of Common Cold Using Speech Signal
Iwok et al. Evaluation of Machine Learning Algorithms using Combined Feature Extraction Techniques for Speaker Identification
CN109243486A (en) A kind of winged acoustic detection method of cracking down upon evil forces based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180525