CN103928023B - A speech assessment method and system - Google Patents

A speech assessment method and system

Info

Publication number
CN103928023B
CN103928023B (application CN201410178813.XA; published as CN103928023A, granted as CN103928023B)
Authority
CN
China
Prior art keywords
voice
examination paper
scoring
scored
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410178813.XA
Other languages
Chinese (zh)
Other versions
CN103928023A (en)
Inventor
李心广
李苏梅
何智明
陈泽群
李婷婷
陈广豪
马晓纯
王晓杰
陈嘉华
徐集优
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Foreign Studies
Original Assignee
Guangdong University of Foreign Studies
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Foreign Studies
Priority to CN201410178813.XA
Publication of CN103928023A
Application granted
Publication of CN103928023B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a speech assessment method comprising the steps of: S1, recording an examinee's answer speech; S2, pre-processing the examinee's answer speech to obtain an answer speech corpus; S3, extracting the characteristic parameters of the answer speech corpus; S4, using a speech recognition method based on a hybrid HMM/ANN model to match the characteristic parameters of the answer speech corpus against a standard pronunciation template, thereby recognizing the content of the answer speech and giving a preliminary score; S5, if the preliminary score is below a threshold, taking the preliminary score as the final score, and otherwise scoring the sub-indices of accuracy, fluency, speech rate, rhythm, stress and intonation; S6, combining the sub-index scores to calculate the final score of the answer speech. The invention also discloses a speech assessment system. By using a speech recognition method based on a hybrid model, the present invention recognizes speech more accurately, and can objectively score, by category of evaluation criterion, spoken-examination answers stored as files after recording.

Description

A speech assessment method and system
Technical field
The present invention relates to speech recognition and assessment technology, and in particular to a speech assessment method and system.
Background technology
From the application standpoint, speech recognition technology is generally divided into two classes: speaker-dependent recognition and speaker-independent recognition. Speaker-dependent recognition is tailored to one specific person; briefly, it recognizes only that person's voice and is unsuitable for a wider population. Speaker-independent recognition, by contrast, can meet the recognition needs of different people and is suitable for large-scale application.
The IBM speech research group currently holds a leading position in large-vocabulary speech recognition. Bell Laboratories of AT&T has also begun a series of experiments on speaker-independent recognition, whose results established methods for constructing standard templates for speaker-independent speech recognition.
The major advances of this period include:
(1) the maturation and continual refinement of hidden Markov model (Hidden Markov Models, HMM) technology, which became the mainstream approach to speech recognition;
(2) in continuous speech recognition, beyond recognizing acoustic information, the use of various kinds of linguistic knowledge — word formation, syntax, semantics, dialogue context and so on — to help recognize and understand the speech; at the same time, language models based on statistical probability emerged in the speech recognition field;
(3) the rise of applied research on artificial neural networks in speech recognition. Most of this work uses multilayer perceptron networks trained with the back-propagation (BP) algorithm; there are also feed-forward networks, which are simple in structure, easy to implement and free of feedback signals, and feedback networks, whose stability and associative-memory capability are closely tied to the feedback connections between neurons. Artificial neural networks can learn complex classification boundaries, which clearly makes them effective for pattern classification.
In addition, continuous-speech dictation technology for personal use has gradually matured. The most representative systems here are IBM's ViaVoice and Dragon Systems' Dragon Dictate. These systems are speaker-adaptive: a new user need not train the whole vocabulary, and recognition accuracy improves continually with use.
In China, speech recognition has been developed by research institutions and universities in Beijing, such as the Institute of Acoustics and the Institute of Automation of the Chinese Academy of Sciences, Tsinghua University and Beijing Jiaotong University, and by Harbin Institute of Technology, the University of Science and Technology of China, Sichuan University and others. Many domestic speech recognition systems have now been developed successfully, each with its own strengths. In isolated-word large-vocabulary recognition, the most representative is the THED-919 speaker-dependent real-time speech recognition and understanding system jointly developed by the Department of Electronic Engineering of Tsinghua University and China Electronics Devices Corporation. In continuous speech recognition, the computer centre of Sichuan University implemented, on a microcomputer, a topic-constrained speaker-dependent continuous English-Chinese speech translation demonstration system. In speaker-independent recognition, the voice-controlled telephone directory enquiry system developed by the Department of Computer Science and Technology of Tsinghua University has been put into practical use.
In addition, iFLYTEK, China's largest provider of intelligent speech technology, released the world's first mobile-Internet intelligent speech interaction platform, the "iFLYTEK Speech Cloud", in 2010, announcing the arrival of the era of mobile-Internet speech dictation.
iFLYTEK has a long research record in intelligent speech technology and holds internationally leading results in Chinese speech synthesis, speech recognition, speech evaluation and related areas. Speech synthesis and speech recognition are the two key technologies needed to realize human-machine speech communication and to build speech systems that can both listen and speak. Automatic speech recognition (Auto Speech Recognize, ASR) aims to let the computer "understand" human speech by extracting the textual information it contains. Speech evaluation is a new frontier of intelligent speech processing, also called computer-assisted language learning (Computer Assisted Language Learning) technology: it automatically scores pronunciation, detects errors and provides corrective feedback. Voiceprint recognition, also known as speaker recognition (Speaker Recognition), extracts from the speech signal features that characterize the speaker's identity — such as the fundamental-frequency features reflecting glottal vibration and the spectral features reflecting oral-cavity size and vocal-tract length — and uses them to identify the speaker. Natural language has for thousands of years been an indispensable element of people's life, work and study, while the computer is one of the greatest inventions of the 20th century; how to use computers to process, and even understand, the natural languages that humans command, so that computers acquire the human abilities of listening, speaking, reading and writing, has always been a research topic actively pursued by institutions at home and abroad.
The content of the invention
The technical problem to be solved by the present invention is to provide a speech assessment method and system that can mark papers quickly and accurately and score examinees by objective marking criteria. The present invention combines the advantages of existing objective speech-quality evaluation models to obtain a better-performing speech recognition model and speech training model and a more accurate spoken-language scoring scheme, and can objectively score spoken answers stored as files through a system of multiple assessment indices. The invention is more stable and more efficient, lays a foundation for putting the research results into practice, and helps realize the goal of automatic marking of large-scale spoken-English examinations.
To solve the above technical problem, the invention provides a speech assessment method comprising the steps of:
S1, recording the examinee's answer speech;
S2, pre-processing the examinee's answer speech to obtain an answer speech corpus;
S3, extracting the characteristic parameters of the answer speech corpus;
S4, using a speech recognition method based on a hybrid HMM/ANN model to match the characteristic parameters of the answer speech corpus against a standard pronunciation template, thereby recognizing the content of the answer speech and giving a preliminary score;
S5, if the preliminary score is below a preset threshold, taking the preliminary score as the final score of the answer speech and marking the paper as a problem paper; if the preliminary score is above the preset threshold, scoring the answer speech on the sub-indices of accuracy, fluency, speech rate, rhythm, stress and intonation;
S6, weighting the sub-index scores to obtain the final score of the answer speech.
Further, step S0 precedes step S1 and specifically comprises the steps of:
S01, recording experts' standard pronunciation;
S02, pre-processing the standard pronunciation to obtain a standard pronunciation corpus;
S03, extracting the characteristic parameters of the standard pronunciation corpus;
S04, performing model training on the characteristic parameters of the standard pronunciation corpus to obtain the standard pronunciation template.
Further, the speech recognition method based on the hybrid HMM/ANN model in step S4 specifically comprises the steps of:
S41, building an HMM of the characteristic parameters of the answer speech corpus and obtaining the cumulative probabilities of all states in the HMM;
S42, feeding all the state cumulative probabilities to an ANN classifier as input features, which outputs the recognition result;
S43, matching the recognition result against the standard pronunciation template, thereby recognizing the content of the answer speech.
Further, the pre-processing in step S2 specifically comprises pre-emphasis, framing, windowing, noise reduction, endpoint detection and word segmentation, wherein the noise reduction specifically uses the blank segment of the speech as the noise baseline for de-noising the subsequent speech.
Further, the word segmentation specifically comprises the steps of:
S21, extracting the MFCC parameters of each phoneme in the speech and building an HMM for the corresponding phoneme;
S22, coarsely cutting the speech to obtain the effective speech segments;
S23, recognizing the words of the speech segments according to the phoneme HMMs, thereby recognizing the speech as a set of words.
Further, the characteristic-parameter extraction in step S3 specifically extracts MFCC characteristic parameters: the corpus obtained after pre-processing is passed through a fast Fourier transform, triangular-window filtering, a logarithm and a discrete cosine transform to obtain the MFCC characteristic parameters.
Further, the accuracy scoring in step S5 specifically comprises:
normalizing the speech sentence to be scored, by interpolation and decimation, to a length close to that of the standard pronunciation sentence; extracting the intensity curves of the sentence to be scored and of the standard sentence, using short-time energy as the feature; and scoring by comparing the degree of fit between the two intensity curves.
Further, the fluency scoring in step S5 specifically comprises:
cutting the speech to be scored into a front half and a back half, segmenting each half into words to obtain the effective speech segments, dividing the length of the effective speech segments in each half by the total length of the speech to be scored, and comparing the resulting values with the corresponding thresholds: if both exceed their thresholds, the speech is judged fluent; otherwise it is judged not fluent.
The speech-rate scoring specifically comprises: calculating the proportion of the voiced part of the speech to be scored within the total duration, and scoring the speech rate according to that proportion.
The rhythm scoring specifically comprises: calculating the rhythm of the speech to be scored with an improved dPVI parameter formula.
The stress scoring specifically comprises: on the basis of the normalized intensity curve, dividing stress units using a double threshold formed by a stress threshold and a non-stress threshold, together with stressed-vowel duration, as features, and pattern-matching the speech sentence to be scored against the standard pronunciation sentence with the DTW algorithm to score the stress.
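The DTW pattern matching used for the stress comparison can be sketched as follows; this is the textbook dynamic-time-warping recurrence applied to one-dimensional feature sequences (the function name and the absolute-difference local cost are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences.

    D[i, j] holds the minimal accumulated cost of aligning the first i
    elements of `a` with the first j elements of `b`; the patent applies
    DTW to the stress-unit features of the test and standard sentences.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessors
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

A small distance indicates that the stress pattern of the sentence to be scored aligns well with the standard sentence even when the two differ in tempo.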
The intonation scoring specifically comprises: extracting the formants of the speech to be scored and of the standard pronunciation, and scoring the intonation according to the degree of fit between the formant trend of the speech to be scored and that of the standard pronunciation.
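A minimal sketch of the trend-fit idea: resample the test formant trajectory onto the standard one's time grid and measure how well the two trends agree. The patent only speaks of a "degree of fit"; using Pearson correlation and mapping it linearly to a 0-100 score are assumed choices for illustration.

```python
import numpy as np

def intonation_score(f_test, f_std):
    """Score intonation by the fit of two formant trend curves.

    `f_test` and `f_std` are per-frame formant (or pitch) values; the
    test curve is linearly resampled to the standard curve's length.
    """
    x_std = np.linspace(0.0, 1.0, len(f_std))
    x_tst = np.linspace(0.0, 1.0, len(f_test))
    warped = np.interp(x_std, x_tst, f_test)   # time-normalised test curve
    r = np.corrcoef(warped, f_std)[0, 1]       # trend similarity, -1..1
    return 50.0 * (r + 1.0)                    # map to 0..100
```

An identical rising contour scores near 100; an inverted contour scores near 0.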
The invention also provides a speech assessment system, comprising:
a speech recording module for recording the examinee's answer speech;
a pre-processing module for pre-processing the examinee's answer speech to obtain an answer speech corpus;
a characteristic-parameter extraction module for extracting the characteristic parameters of the answer speech corpus;
a speech recognition module for matching, with a speech recognition method based on a hybrid HMM/ANN model, the characteristic parameters of the answer speech corpus against a standard pronunciation template, recognizing the content of the answer speech and giving a preliminary score;
a speech assessment module for performing accuracy, fluency, speech-rate, rhythm, stress and intonation scoring on answer speech whose preliminary score exceeds the set threshold;
a comprehensive scoring module for combining the accuracy, fluency, speech-rate, rhythm, stress and intonation scores to compute the final score of answer speech whose preliminary score exceeds the set threshold.
Implementing the present invention yields the following beneficial effects:
1. practical noise-reduction and word-segmentation methods are added to the pre-processing module, yielding a higher-quality speech corpus;
2. the speech recognition method based on the hybrid HMM/ANN model performs better and recognizes more accurately;
3. the multi-index analysis of speech rate, rhythm, stress and intonation makes the scoring of read-aloud questions more diverse than the original single index and the results more objective;
4. through the joint analysis of accuracy and fluency, the invention extends scoring from read-aloud questions only to objective scoring of non-read-aloud questions such as translation, question-and-answer and repetition questions, establishing a reasonable and complete speech scoring method and system that marks quickly and accurately and scores examinees by objective criteria;
5. the present invention is more stable, more efficient, practical and widely applicable; applied to the marking process of spoken-English tests, it markedly shortens marking time, raises processing efficiency and improves marking objectivity.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the speech assessment method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the specific steps of step S0;
Fig. 3 is a schematic flowchart of the specific steps of the pre-processing in Fig. 1;
Fig. 4 is a schematic flowchart of the specific steps of the word segmentation in Fig. 3;
Fig. 5 is a schematic flowchart of the specific steps of MFCC characteristic-parameter extraction;
Fig. 6 is a schematic flowchart of the specific steps of the speech recognition method based on the hybrid HMM/ANN model;
Fig. 7 is a structural diagram of the speech assessment system provided by an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art, without creative effort, on the basis of the embodiments of the present invention fall within the scope of protection of the invention.
An embodiment of the present invention provides a speech assessment method which, as shown in Fig. 1, comprises the steps of:
S1, recording the examinee's answer speech;
S2, pre-processing the examinee's answer speech to obtain an answer speech corpus;
S3, extracting the characteristic parameters of the answer speech corpus;
S4, using a speech recognition method based on a hybrid model of hidden Markov models (Hidden Markov Models, HMM) and artificial neural networks (Artificial Neural Networks, ANN) to match the characteristic parameters of the answer speech corpus against a standard pronunciation template, recognizing the content of the answer speech and giving a preliminary score;
S5, if the preliminary score is below a preset threshold, taking the preliminary score as the final score of the answer speech and marking the paper as a problem paper; if the preliminary score is above the preset threshold, scoring the answer speech on the sub-indices of accuracy, fluency, speech rate, rhythm, stress and intonation;
S6, weighting the sub-index scores to obtain the final score of the answer speech.
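The gating of step S5 and the weighted combination of step S6 can be sketched as follows. The patent states only that a below-threshold preliminary score becomes the final score and that sub-index scores are weighted; the equal weights, the threshold value and the function names are assumptions for illustration:

```python
SUB_INDICES = ["accuracy", "fluency", "rate", "rhythm", "stress", "intonation"]

def final_score(sub_scores, weights=None):
    """Weighted combination of the six sub-index scores (step S6)."""
    if weights is None:
        weights = {k: 1.0 / len(SUB_INDICES) for k in SUB_INDICES}  # equal weights assumed
    return sum(weights[k] * sub_scores[k] for k in SUB_INDICES)

def score_paper(preliminary, sub_scores, threshold=30.0):
    """Step S5 gating: a below-threshold paper keeps its preliminary
    score and is flagged as a problem paper; the threshold is assumed.

    Returns (final score, problem-paper flag).
    """
    if preliminary < threshold:
        return preliminary, True
    return final_score(sub_scores), False
```

With equal weights, uniform sub-scores of 80 yield a final score of 80, while a preliminary score of 10 is returned unchanged and flagged.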
Further, step S0 precedes step S1 and, as shown in Fig. 2, specifically comprises the steps of:
S01, recording experts' standard pronunciation;
here the standard pronunciation is recorded under controlled conditions by highly professional speakers, and its content corresponds to the content of the spoken-English examination;
S02, pre-processing the standard pronunciation to obtain a standard pronunciation corpus;
S03, extracting the characteristic parameters of the standard pronunciation corpus;
S04, performing model training on the characteristic parameters of the standard pronunciation corpus to obtain the standard pronunciation template.
Here, model training of the standard pronunciation means obtaining, from a large number of known patterns and according to a certain criterion, the model parameters that characterize the essential features of those patterns, i.e. the standard pronunciation template. Concretely, the training iteratively adjusts the system's template parameters from the initial construction data (including the state-transition matrix probabilities and the variances, means and weights of the Gaussian mixture models) so that the performance of the speech recognition system keeps approaching its optimal state. Since the standard pronunciation of professionals differs to some extent from examinees' speech, and the present invention scores natural speakers, the corpus will be extended from specific professionals to ordinary people and from a specific environment to everyday environments, covering speakers of different genders, ages and accents.
Each step is described in detail below.
1. Pre-processing
As shown in Fig. 3, the pre-processing in step S2 specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection and word segmentation. Its purpose is to eliminate the effects on signal quality of the speaker's vocal organs and of the recording equipment, providing high-quality parameters for speech feature extraction and thereby improving the quality of subsequent speech processing.
In the noise reduction, the blank segment of the speech is used as the noise baseline for de-noising the subsequent speech. Research shows that an examinee usually does not speak during a short interval at the start of the recording, and that this stretch is not blank but a recorded segment containing noise. By taking the audio of this segment as the noise baseline, the later recording can be de-noised and the noise interference of unvoiced segments eliminated.
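A minimal sketch of this idea in the style of spectral subtraction: the first few hundred milliseconds (assumed silent) supply the noise baseline, whose average magnitude spectrum is subtracted frame by frame from the rest of the recording. The leading-segment length, frame size and subtraction scheme are assumptions, not values from the patent:

```python
import numpy as np

def denoise(signal, sr, lead_ms=300, frame=256):
    """De-noise speech using its leading blank segment as the noise baseline."""
    n_lead = int(sr * lead_ms / 1000)
    noise = signal[:n_lead]
    # average magnitude spectrum of the noise-only frames
    noise_mag = np.mean(
        [np.abs(np.fft.rfft(noise[i:i + frame], frame))
         for i in range(0, len(noise) - frame, frame)], axis=0)
    out = np.copy(signal).astype(float)
    for i in range(0, len(signal) - frame, frame):
        spec = np.fft.rfft(out[i:i + frame], frame)
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)  # subtract, floor at zero
        out[i:i + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out
```

Because the subtracted baseline matches the recording's own noise, stationary background noise and the hiss of unvoiced segments are strongly attenuated.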
Word segmentation cuts a sentence into individual words or phrases, so that the computer can "understand" the examinee's utterance by recognizing them one by one, preparing for the later analysis of bonus and penalty factors and the final automatic scoring. As shown in Fig. 4, the word segmentation specifically comprises the steps of:
S21, extracting the Mel-frequency cepstral coefficient (Mel Frequency Cepstrum Coefficient, MFCC) parameters of each phoneme in the speech, and building an HMM for the corresponding phoneme;
S22, coarsely cutting the speech to obtain the effective speech segments;
The coarse cutting has two purposes: first, to reduce the amount of computation and hence the segmentation time; second, to increase the segmentation accuracy. It uses the double-threshold method to cut away the obviously blank stretches, but with relatively low thresholds, the aim being to retain the effective speech segments;
S23, recognizing the words of the speech segments according to the phoneme HMMs, thereby recognizing the speech as a set of words.
This word-segmentation method offers high recognition rate, high accuracy and small error: 1) the number of recognition templates is fixed, so the HMMs achieve very high accuracy, and no output-probability threshold needs to be set, which greatly improves the recognition rate; 2) segmentation yields the pronunciation of each word, which aids keyword matching and reduces the error introduced by matching whole words.
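The double-threshold coarse cut of step S22 can be sketched with short-time energy alone (a full implementation would typically add zero-crossing rate): a segment opens when the normalized frame energy crosses a high threshold and closes when it falls below a low one. The frame size and threshold values are assumptions:

```python
import numpy as np

def coarse_cut(signal, frame=200, hi=0.10, lo=0.02):
    """Double-threshold endpoint detection for the coarse cut.

    Returns (start, end) sample ranges of the effective speech segments.
    """
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame + 1, frame)]
    energy = np.array([np.mean(f ** 2) for f in frames])
    energy = energy / (energy.max() + 1e-12)          # normalise to [0, 1]
    segments, start = [], None
    for k, e in enumerate(energy):
        if start is None and e >= hi:
            start = k                                  # segment begins above `hi`
        elif start is not None and e < lo:
            segments.append((start * frame, k * frame))  # ends below `lo`
            start = None
    if start is not None:                              # speech runs to the end
        segments.append((start * frame, len(frames) * frame))
    return segments
```

Keeping `lo` and `hi` deliberately low, as the patent notes, errs on the side of keeping speech rather than cutting it.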
2. Characteristic-parameter extraction
The characteristic-parameter extraction in step S3 specifically extracts MFCC characteristic parameters. As shown in Fig. 5, the pre-processed corpus is passed through a fast Fourier transform, triangular-window filtering, a logarithm and a discrete cosine transform to obtain the MFCC characteristic parameters. MFCC parameters are used because they take the auditory properties of the human ear into account: the spectrum is converted into a non-linear spectrum on the Mel frequency scale and then transformed into the cepstral domain. Without any prior assumptions, they model the ear's hearing mathematically, using a bank of triangular filters placed densely in the low-frequency region to capture the spectral information of speech. In addition, MFCC parameters are robust against noise and spectral distortion, which improves the recognition performance of the system.
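The chain of Fig. 5 — FFT, triangular Mel filterbank, logarithm, DCT — can be sketched for a single frame as follows. The filter count and number of cepstral coefficients are common defaults, not values stated in the patent:

```python
import numpy as np

def mfcc(frame, sr, n_filt=26, n_ceps=13):
    """FFT -> triangular Mel filterbank -> log -> DCT for one speech frame."""
    mag = np.abs(np.fft.rfft(frame)) ** 2                 # power spectrum
    n_fft = len(frame)
    mel = lambda f: 2595 * np.log10(1 + f / 700.0)        # Hz -> Mel
    imel = lambda m: 700 * (10 ** (m / 2595.0) - 1)       # Mel -> Hz
    pts = imel(np.linspace(0, mel(sr / 2), n_filt + 2))   # filter edges in Hz
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fbank = np.zeros((n_filt, len(mag)))
    for j in range(n_filt):                               # triangular filters
        l, c, r = bins[j], bins[j + 1], bins[j + 2]
        for k in range(l, c):
            fbank[j, k] = (k - l) / max(c - l, 1)         # rising edge
        for k in range(c, r):
            fbank[j, k] = (r - k) / max(r - c, 1)         # falling edge
    logE = np.log(fbank @ mag + 1e-10)                    # log filterbank energies
    n = np.arange(n_filt)                                 # DCT-II basis
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (n + 0.5)) / n_filt)
    return dct @ logE
```

Because the Mel points are spaced uniformly on the Mel scale, the triangular filters crowd the low frequencies, mirroring the ear's resolution as the text describes.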
3. Speech-content recognition
Step S4 uses a speech recognition method based on a hybrid HMM/ANN model. The HMM approach requires prior statistical knowledge, has weak classification and decision ability for the speech signal, is structurally complex, and needs large numbers of training samples and much computation. ANNs, although strong in decision-making, still describe dynamic time signals unsatisfactorily, and neural-network speech recognition suffers from long training and recognition times. To overcome these respective weaknesses, the present invention organically combines HMMs, with their strong temporal modelling ability, and ANNs, with their strong classification ability, further improving the robustness and accuracy of speech recognition. This hybrid not only overcomes the overlap between pattern classes that HMMs alone cannot resolve, improving the discrimination of easily confused words, but also overcomes the ANN's restriction to fixed-length input patterns, eliminating complex time-alignment computation. Specifically, as shown in Fig. 6, the speech recognition method based on the hybrid HMM/ANN model in step S4 comprises the steps of:
S41, building an HMM of the characteristic parameters of the answer speech corpus and obtaining the cumulative probabilities of all states in the HMM;
S42, feeding all the state cumulative probabilities to an ANN classifier (specifically a self-organizing neural network) as input features, which outputs the recognition result;
S43, matching the recognition result against the standard pronunciation template, thereby recognizing the content of the answer speech.
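The state cumulative probabilities of step S41 are what the forward algorithm computes; a minimal log-space sketch follows. The interface is an assumption, and the ANN stage of S42 (which would consume these per-state scores as its input features) is not shown:

```python
import numpy as np

def state_scores(obs_loglik, logA, logpi):
    """Forward algorithm in log space.

    obs_loglik[t, i] is log p(o_t | state i); logA is the log transition
    matrix and logpi the log initial distribution. Returns the final
    cumulative log-probability of each HMM state for the sequence --
    the features the patent feeds to the ANN classifier.
    """
    alpha = logpi + obs_loglik[0]
    for t in range(1, len(obs_loglik)):
        m = alpha.max()                                   # log-sum-exp stabiliser
        alpha = obs_loglik[t] + m + np.log(np.exp(alpha - m) @ np.exp(logA))
    return alpha
```

Summing the exponentiated scores gives the sequence likelihood under the HMM, which is at most 1 (log at most 0) for proper probabilities.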
4. Speech assessment
In practice, some examinees cannot complete the spoken test well within the allotted time; their answer speech then contains large blanks or is unrecognizable, and we label such recordings as problem papers. Problem papers include blank recordings and unrecognizable recordings of all kinds, such as recordings in a language other than English or recordings with excessive noise. The purpose of step S4 is therefore not only to recognize what the examinee has read, but also to detect problem papers and, given the circumstances, assign them an appropriately low score; for the speech of problem papers there is no need to score accuracy, fluency, speech rate, rhythm, stress and intonation. Further speech assessment is carried out only when the preliminary score exceeds the preset threshold.
(1) The accuracy scoring in step S5 specifically comprises: normalizing the speech sentence to be scored, by interpolation and decimation, to a length close to that of the standard pronunciation sentence; extracting the intensity curves of the sentence to be scored and of the standard sentence, using short-time energy as the feature; and scoring by comparing the degree of fit between the two intensity curves.
The intensity of a sentence reflects how the speech signal changes over time. The loudness of stressed syllables shows up as energy intensity in the time domain, i.e. as high speech energy on those syllables. But different people take different times and use unequal vocal intensity for the same sentence, so directly template-matching the intensity curves of the sentence to be scored and the standard sentence would compromise the objectivity of the evaluation. The present invention therefore adapts the original technique into an intensity-curve extraction method based on the standard sentence: when the sentence to be scored is shorter than the standard sentence, its duration is supplemented by interpolation; when it is longer, its duration is adjusted by decimation; finally, the intensity curve of the sentence to be scored is normalized in intensity using the maximum-intensity point of the standard sentence's curve.
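The steps above can be sketched as follows: a short-time-energy intensity curve, time normalization of the test curve onto the standard grid (interpolation when shorter, decimation when longer), amplitude normalization against the standard curve's peak, and a fit score. Mapping the mean absolute difference to a 0-100 score is an assumed choice, not the patent's formula:

```python
import numpy as np

def intensity_curve(signal, frame=200):
    """Short-time energy per frame -- the 'intensity curve'."""
    return np.array([np.mean(signal[i:i + frame] ** 2)
                     for i in range(0, len(signal) - frame + 1, frame)])

def accuracy_score(test_curve, std_curve):
    """Fit of the normalised test intensity curve to the standard curve."""
    # time normalisation: resample the test curve onto the standard grid
    x_std = np.linspace(0.0, 1.0, len(std_curve))
    x_tst = np.linspace(0.0, 1.0, len(test_curve))
    warped = np.interp(x_std, x_tst, test_curve)
    # amplitude normalisation against the standard curve's peak
    warped = warped * (std_curve.max() / (warped.max() + 1e-12))
    diff = np.mean(np.abs(warped - std_curve)) / (std_curve.max() + 1e-12)
    return 100.0 * max(0.0, 1.0 - diff)
```

A curve identical to the standard scores 100; a decimated copy of the same curve still fits closely and scores high, which is exactly the invariance the length normalization is meant to buy.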
(2) The fluency scoring comprises the following steps: cut the speech to be scored into a front half and a back half, and apply word segmentation to each half to obtain the effective speech segments; divide the length of the effective speech segments in each half by the total length of the speech to be scored, and compare the resulting values with their corresponding thresholds. If both exceed their thresholds, the speech is judged fluent; otherwise it is judged not fluent.
For sentence-level fluency, the aim is to compute the smoothness of the sentence's delivery and, using the standard speech, to compute a prosody score for the pronunciation; fusing the two yields a sentence fluency diagnostic model. This sentence-level fluency scoring method can also be applied to passage-level fluency scoring. The method takes into account the speaker's smoothness while uttering the sentence and has a higher degree of correlation than traditional methods, so it is suitable for use in a speech assessment system.
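The front-half/back-half fluency decision of step (2) can be sketched as follows. This is illustrative Python; the 0.3 thresholds and the boolean per-frame voicing representation are assumptions, since the patent leaves the threshold values unspecified.

```python
def fluency_check(frames_voiced, front_thresh=0.3, back_thresh=0.3):
    """frames_voiced: list of booleans, True where the frame belongs to
    an effective speech segment.  The recording is split into front and
    back halves; each half's effective length, divided by the TOTAL
    length, must exceed its threshold for a 'fluent' verdict."""
    n = len(frames_voiced)
    front, back = frames_voiced[:n // 2], frames_voiced[n // 2:]
    front_ratio = sum(front) / n
    back_ratio = sum(back) / n
    return front_ratio > front_thresh and back_ratio > back_thresh
```

Dividing both halves by the total length (rather than each half's own length) means a long silent tail or head pulls its half's ratio below the threshold, which is exactly the disfluency the check is meant to catch.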
(3) The speech-rate scoring comprises the following steps: compute the ratio of the duration of the pronounced part of the speech to be scored to the total duration of the speech to be scored, and score the speech rate according to this ratio.
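The speech-rate ratio of step (3) might be mapped to a score as in the following sketch. The target band and the linear fall-off are assumed parameterizations, since the patent does not specify how the ratio is converted into a score.

```python
def speech_rate_score(voiced_duration, total_duration,
                      low=0.35, high=0.75):
    """Ratio of pronounced (voiced) time to total recording time,
    mapped to a 0-100 score.  The target band [low, high] and the
    linear penalty slope are assumptions, not taken from the patent."""
    ratio = voiced_duration / total_duration
    if low <= ratio <= high:
        return 100.0
    # linear fall-off outside the acceptable band
    gap = (low - ratio) if ratio < low else (ratio - high)
    return max(0.0, 100.0 - gap * 400.0)
```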
(4) The rhythm scoring comprises the following steps: compute the rhythm of the speech to be scored using an improved distinct Pairwise Variability Index (dPVI) formula. Exploiting the variability of speech-unit durations, the dPVI compares the syllable-unit segment durations of the standard speech sentence with those of the sentence to be scored, and the resulting parameter is used as a basis for objective evaluation and feedback guidance.
Here d denotes the durations of the speech-unit segments into which the sentence is divided (e.g., d_k is the duration of the k-th speech unit), m = min(number of units in the standard sentence, number of units in the sentence to be scored), and Len_Std is the duration of the standard speech sentence. Because the duration of the sentence to be scored has already been warped to match the standard sentence before the PVI computation, only Len_Std need be used as the normalization unit in the calculation.
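Using the symbols just defined, a formula consistent with the text is dPVI = (100 / Len_Std) · Σ_{k=1}^{m} |d_k^std − d_k^test|. The following sketch implements this reconstruction; the factor of 100 and the exact form are assumptions, as the patent's own formula is not reproduced in this text.

```python
def dpvi(std_durations, test_durations, len_std):
    """Improved dPVI: pairwise comparison of syllable-unit durations
    between the standard sentence and the sentence to be scored,
    normalized by the standard sentence duration Len_Std.  The exact
    formula is a reconstruction consistent with the symbols in the
    surrounding text, not a quotation of the patent."""
    m = min(len(std_durations), len(test_durations))
    diff = sum(abs(std_durations[k] - test_durations[k]) for k in range(m))
    return 100.0 * diff / len_std
```

A value of 0 means the unit durations match the standard perfectly; larger values indicate greater rhythmic deviation.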
(5) The stress scoring comprises the following steps: on the basis of the duration-warped intensity curves, delimit stress units using a double threshold, consisting of a stress threshold and a non-stress threshold, together with stressed-vowel duration as features; then apply a Dynamic Time Warping (DTW) algorithm to pattern-match the speech sentence to be scored against the standard speech sentence, realizing the stress score.
Stress refers to the syllables read with emphasis in a word, phrase or sentence. The basic principle of the DTW algorithm is dynamic time warping: the originally mismatched time spans of the test template and the reference template are aligned. Similarity is computed with the conventional Euclidean distance; with reference template R and test template T, the smaller the distance D[T, R], the higher the similarity. The drawback of the traditional DTW algorithm is that during template matching all frames carry the same weight and all templates must be matched, so the computational load is fairly heavy; in particular, when the number of templates grows quickly, the computation grows especially fast. The present invention therefore uses an improved DTW algorithm for the pattern matching between the speech sentence to be scored and the standard speech sentence: it remedies the drawback of the traditional algorithm by weighting each frame according to its importance, substantially reducing the computation and making the result more accurate.
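The frame-weighted DTW idea can be sketched as follows. This is illustrative Python; the per-frame weight vector is an assumed mechanism for weighting frames by importance, and uniform weights recover classical DTW with Euclidean frame distance.

```python
import numpy as np

def weighted_dtw(test, ref, weights=None):
    """DTW distance between two feature sequences (frames x dims).
    `weights` gives a per-frame importance for the reference template;
    with uniform weights this is classical DTW over Euclidean
    frame distances."""
    n, m = len(test), len(ref)
    if weights is None:
        weights = np.ones(m)
    INF = float("inf")
    D = np.full((n + 1, m + 1), INF)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = weights[j - 1] * np.linalg.norm(test[i - 1] - ref[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Setting a frame's weight near zero effectively excludes it from the match, which is one simple way to "give priority" to the informative (e.g. stressed) frames.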
(6) The intonation scoring comprises the following steps: extract the formants of the speech to be scored and of the standard speech, and score intonation according to the degree of fit between the variation trend of the formants of the speech to be scored and the variation trend of the formants of the standard speech.
Intonation is an important indicator of expressive ability in spoken English communication; it reflects the speaker's overall command of the language and is heard as the rise, fall and modulation of the voice.
In digital speech signal processing research, the formants of a speech signal are highly important parameters. A formant here refers to a region of relatively concentrated energy in the sound's spectrum; formants are not only a determining factor of timbre but also reflect the physical characteristics of the vocal tract (its resonant cavities). As sound passes through the resonant cavities it is filtered, so the energy at different frequencies is redistributed in the frequency domain: part is reinforced by the resonance of the cavities while another part is attenuated, and the reinforced frequencies appear as dense dark bands on a time-frequency spectrogram. Because the energy is unevenly distributed, the strong parts stand out like mountain peaks, hence the name formant. Formants are key features reflecting the resonance characteristics of the vocal tract; they represent the most direct source of pronunciation information, and listeners exploit formant information in speech perception, so formants are very important characteristic parameters in speech signal processing. Formants are the set of resonant frequencies produced when the quasi-periodic pulse excitation passes through the vocal tract. Formant parameters include formant frequency and bandwidth, and they are important parameters for distinguishing different vowels. Since formant information is contained in the spectral envelope, the key to formant extraction is estimating the natural speech spectral envelope; the maxima of the spectral envelope are generally taken to be the formants.
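LPC-based formant estimation is one common way to locate the maxima of the spectral envelope described above. The following is an illustrative Python sketch using a standard Levinson-Durbin recursion and root-finding; the patent does not state that this is its specific extraction method, and the model order and thresholds are assumptions.

```python
import numpy as np

def lpc_coeffs(frame, order=10):
    """Autocorrelation-method LPC coefficients via Levinson-Durbin."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / e
        new_a = a.copy()
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        a = new_a
        e *= (1.0 - k * k)
    return a

def formants(frame, fs=8000, order=10):
    """Formant candidates: angles of the LPC polynomial roots in the
    upper half-plane, converted to Hz; near-real and near-DC roots
    are discarded."""
    a = lpc_coeffs(frame * np.hamming(len(frame)), order)
    roots = [z for z in np.roots(a) if np.imag(z) > 0.01]
    freqs = sorted(np.angle(z) * fs / (2.0 * np.pi) for z in roots)
    return [f for f in freqs if f > 90.0]
```

Tracking these candidate frequencies frame by frame yields the formant trajectories whose variation trends are compared between the speech to be scored and the standard speech.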
The present invention also provides a speech assessment system, as shown in Fig. 7, comprising:
a voice recording module 101, for recording the examination paper voice of the examinee;
a preprocessing module 102, for preprocessing the examination paper voice of the examinee to obtain the examination paper voice corpus;
a characteristic parameter extraction module 103, for extracting the characteristic parameters of the examination paper voice corpus;
a speech recognition module 104, for performing feature matching between the characteristic parameters of the examination paper voice corpus and the standard speech template using the speech recognition method based on the HMM and ANN hybrid model, recognizing the content of the examination paper voice, and giving an initial score;
a speech assessment module 105, for performing accuracy scoring, fluency scoring, speech-rate scoring, rhythm scoring, stress scoring and intonation scoring on examination paper voices whose initial score is higher than the set threshold; and
a comprehensive scoring module 106, for combining the accuracy, fluency, speech-rate, rhythm, stress and intonation scores to obtain the final score of each examination paper voice whose initial score is higher than the set threshold.
The speech assessment system corresponds to the speech assessment method described above, so for the specific processing steps of each module refer to the steps of the speech assessment method; they are not repeated here.
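The threshold gate of step S5 and the weighted combination of step S6 can be sketched together as follows. This is illustrative Python; the equal default weights and the threshold value are assumptions, since the patent states only that the sub-index scores are weighted.

```python
SUB_INDICES = ["accuracy", "fluency", "rate", "rhythm", "stress", "intonation"]

def final_score(sub_scores, weights=None):
    """Weighted combination of the six sub-scores.  The equal default
    weights are an assumption; the patent only states that a weighted
    sum is used."""
    if weights is None:
        weights = {k: 1.0 / len(SUB_INDICES) for k in SUB_INDICES}
    return sum(weights[k] * sub_scores[k] for k in SUB_INDICES)

def score_paper(initial_score, sub_scores, threshold=60.0, weights=None):
    """Step S5/S6 gate: below the threshold the initial score is final
    (problem paper, no further assessment); otherwise the weighted
    sub-index combination gives the final score.  The threshold value
    60.0 is an assumed placeholder."""
    if initial_score < threshold:
        return initial_score  # problem paper
    return final_score(sub_scores, weights)
```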
Implementing the present invention has the following beneficial effects:
(1) the present invention adds practical noise-reduction and word-segmentation methods to the preprocessing module, yielding a higher-quality voice corpus;
(2) the speech recognition method based on the HMM and ANN hybrid model performs better and recognizes more accurately;
(3) through multi-target analysis of speech rate, rhythm, stress and intonation, the scoring indices are more diversified than the original read-aloud indices, and the results are more objective;
(4) through the dual analysis of accuracy and fluency, on top of the original capability of scoring only read-aloud questions, objective scoring of non-read-aloud questions such as translation questions, question-and-answer questions and repetition questions is realized, establishing a reasonable and complete speech scoring method and system that can mark papers quickly and accurately and score examinees by objective scoring criteria;
(5) the present invention is more stable, more efficient, practical and widely applicable; it can be applied to the marking process of spoken-English tests, markedly shortening marking time, improving the efficiency of the system's processing and also improving the objectivity of marking.
The above discloses only a preferred embodiment of the present invention, which certainly cannot be used to limit the scope of its claims; equivalent variations made according to the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims (6)

1. A speech assessment method, characterized by comprising the steps of:
S1, recording the examination paper voice of an examinee;
S2, preprocessing the examination paper voice of the examinee to obtain an examination paper voice corpus;
S3, extracting characteristic parameters of the examination paper voice corpus;
S4, performing feature matching between the characteristic parameters of the examination paper voice corpus and a standard speech template using a speech recognition method based on an HMM and ANN hybrid model, identifying the content of the examination paper voice, and giving an initial score;
S5, if the initial score is lower than a preset threshold, taking the initial score as the final score of the examination paper voice and labelling the examination paper voice as a problem paper; if the initial score is higher than the preset threshold, scoring the examination paper voice on the sub-indices of accuracy, fluency, speech rate, rhythm, stress and intonation;
S6, weighting the sub-index scores to obtain the final score of the examination paper voice;
wherein the accuracy scoring in step S5 comprises the steps of:
warping the speech sentence to be scored to a duration close to that of the standard speech sentence using an interpolation/decimation method; extracting the intensity curves of the speech sentence to be scored and of the standard speech sentence using short-time energy as the feature; and scoring according to the degree of fit between the intensity curves of the speech sentence to be scored and of the standard speech sentence;
the fluency scoring in step S5 comprises the steps of:
cutting the speech to be scored into a front half and a back half, and applying word segmentation to each half to obtain effective speech segments; dividing the length of the effective speech segments of each half by the total length of the speech to be scored, and comparing the resulting values with corresponding thresholds: if both exceed the corresponding thresholds, the speech is judged fluent; otherwise it is judged not fluent;
the examination paper voice is a recording of the examinee answering a translation question, a question-and-answer question or a repetition question;
the preprocessing in step S2 specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection and word segmentation, wherein the noise reduction comprises using the blank speech segment of the voice as the noise baseline to denoise the subsequent speech;
the word segmentation specifically comprises the steps of:
S21, extracting the MFCC parameters of each phoneme in the speech and building the HMM model of the corresponding phoneme;
S22, coarsely segmenting the speech to obtain effective speech segments;
S23, identifying the words of the speech segments according to the phoneme HMM models, thereby recognizing the speech as a set of words.
2. The speech assessment method of claim 1, characterized by further comprising, before step S1, a step S0 that specifically comprises the steps of:
S01, recording the standard speech of an expert;
S02, preprocessing the standard speech to obtain a standard speech corpus;
S03, extracting characteristic parameters of the standard speech corpus;
S04, performing model training on the characteristic parameters of the standard speech corpus to obtain the standard speech template.
3. The speech assessment method of claim 1, characterized in that the speech recognition method based on the HMM and ANN hybrid model in step S4 comprises the steps of:
S41, building HMM models of the characteristic parameters of the examination paper voice corpus and obtaining the cumulative probabilities of all states in the HMM models;
S42, processing all the state cumulative probabilities as the input features of an ANN classifier, thereby outputting the recognition result;
S43, performing feature matching between the recognition result and the standard speech template, thereby identifying the content of the examination paper voice.
4. The speech assessment method of claim 1, characterized in that the characteristic parameter extraction in step S3 specifically extracts MFCC characteristic parameters: the corpus obtained after preprocessing undergoes a fast Fourier transform, triangular-window filtering, logarithm taking and a discrete cosine transform to obtain the MFCC characteristic parameters.
5. The speech assessment method of claim 1, characterized in that the speech-rate scoring in step S5 comprises: computing the ratio of the duration of the pronounced part of the speech to be scored to the total duration of the speech to be scored, and scoring the speech rate according to the ratio;
the rhythm scoring comprises: computing the rhythm of the speech to be scored using an improved dPVI formula;
the stress scoring comprises: on the basis of the duration-warped intensity curves, delimiting stress units using a double threshold of a stress threshold and a non-stress threshold together with stressed-vowel duration as features, and applying a DTW algorithm to pattern-match the speech sentence to be scored against the standard speech sentence, realizing the stress score;
the intonation scoring comprises: extracting the formants of the speech to be scored and of the standard speech, and scoring intonation according to the degree of fit between the variation trend of the formants of the speech to be scored and the variation trend of the formants of the standard speech.
6. A speech assessment system, characterized by comprising:
a voice recording module, for recording the examination paper voice of an examinee;
a preprocessing module, for preprocessing the examination paper voice of the examinee to obtain an examination paper voice corpus;
a characteristic parameter extraction module, for extracting characteristic parameters of the examination paper voice corpus;
a speech recognition module, for performing feature matching between the characteristic parameters of the examination paper voice corpus and a standard speech template using a speech recognition method based on an HMM and ANN hybrid model, identifying the content of the examination paper voice, giving an initial score, and marking whether the voice is a problem paper;
a speech assessment module, for performing accuracy scoring, fluency scoring, speech-rate scoring, rhythm scoring, stress scoring and intonation scoring on non-problem examination paper voices whose initial score is higher than the preset threshold; and
a comprehensive scoring module, for combining the accuracy, fluency, speech-rate, rhythm, stress and intonation scores to obtain the final score of each examination paper voice whose initial score is higher than the set threshold;
wherein the accuracy scoring comprises the steps of:
warping the speech sentence to be scored to a duration close to that of the standard speech sentence using an interpolation/decimation method; extracting the intensity curves of the speech sentence to be scored and of the standard speech sentence using short-time energy as the feature; and scoring according to the degree of fit between the intensity curves of the speech sentence to be scored and of the standard speech sentence;
the fluency scoring comprises the steps of:
cutting the speech to be scored into a front half and a back half, and applying word segmentation to each half to obtain effective speech segments; dividing the length of the effective speech segments of each half by the total length of the speech to be scored, and comparing the resulting values with corresponding thresholds: if both exceed the corresponding thresholds, the speech is judged fluent; otherwise it is judged not fluent;
the examination paper voice is a recording of the examinee answering a translation question, a question-and-answer question or a repetition question;
the preprocessing of the examination paper voice specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection and word segmentation, wherein the noise reduction comprises using the blank speech segment of the voice as the noise baseline to denoise the subsequent speech;
the word segmentation specifically comprises the steps of:
S21, extracting the MFCC parameters of each phoneme in the speech and building the HMM model of the corresponding phoneme;
S22, coarsely segmenting the speech to obtain effective speech segments;
S23, identifying the words of the speech segments according to the phoneme HMM models, thereby recognizing the speech as a set of words.
CN201410178813.XA 2014-04-29 2014-04-29 A kind of speech assessment method and system Expired - Fee Related CN103928023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410178813.XA CN103928023B (en) 2014-04-29 2014-04-29 A kind of speech assessment method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410178813.XA CN103928023B (en) 2014-04-29 2014-04-29 A kind of speech assessment method and system

Publications (2)

Publication Number Publication Date
CN103928023A CN103928023A (en) 2014-07-16
CN103928023B true CN103928023B (en) 2017-04-05

Family

ID=51146222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410178813.XA Expired - Fee Related CN103928023B (en) 2014-04-29 2014-04-29 A kind of speech assessment method and system

Country Status (1)

Country Link
CN (1) CN103928023B (en)

Families Citing this family (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361896B (en) * 2014-12-04 2018-04-13 上海流利说信息技术有限公司 Voice quality assessment equipment, method and system
CN104361895B (en) * 2014-12-04 2018-12-18 上海流利说信息技术有限公司 Voice quality assessment equipment, method and system
CN104505103B (en) * 2014-12-04 2018-07-03 上海流利说信息技术有限公司 Voice quality assessment equipment, method and system
CN104464423A (en) * 2014-12-19 2015-03-25 科大讯飞股份有限公司 Calibration optimization method and system for speaking test evaluation
CN104485105B (en) * 2014-12-31 2018-04-13 中国科学院深圳先进技术研究院 A kind of electronic health record generation method and electronic medical record system
CN104732977B (en) * 2015-03-09 2018-05-11 广东外语外贸大学 A kind of online spoken language pronunciation quality evaluating method and system
CN104732352A (en) * 2015-04-02 2015-06-24 张可 Method for question bank quality evaluation
CN104810017B (en) * 2015-04-08 2018-07-17 广东外语外贸大学 Oral evaluation method and system based on semantic analysis
CN105989839B (en) * 2015-06-03 2019-12-13 乐融致新电子科技(天津)有限公司 Speech recognition method and device
CN105681920B (en) * 2015-12-30 2017-03-15 深圳市鹰硕音频科技有限公司 A kind of Network teaching method and system with speech identifying function
CN106971711A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of adaptive method for recognizing sound-groove and system
CN105608960A (en) * 2016-01-27 2016-05-25 广东外语外贸大学 Spoken language formative teaching method and system based on multi-parameter analysis
CN105632488A (en) * 2016-02-23 2016-06-01 深圳市海云天教育测评有限公司 Voice evaluation method and device
CN105654785A (en) * 2016-03-18 2016-06-08 上海语知义信息技术有限公司 Personalized spoken foreign language learning system and method
CN105825852A (en) * 2016-05-23 2016-08-03 渤海大学 Oral English reading test scoring method
CN106548673A (en) * 2016-10-25 2017-03-29 合肥东上多媒体科技有限公司 A kind of Teaching Management Method based on intelligent Matching
CN106531182A (en) * 2016-12-16 2017-03-22 上海斐讯数据通信技术有限公司 Language learning system
CN106710348A (en) * 2016-12-20 2017-05-24 江苏前景信息科技有限公司 Civil air defense interactive experience method and system
CN106652622B (en) * 2017-02-07 2019-09-17 广东小天才科技有限公司 Text training method and device
CN107221318B (en) * 2017-05-12 2020-03-31 广东外语外贸大学 English spoken language pronunciation scoring method and system
CN107293286B (en) * 2017-05-27 2020-11-24 华南理工大学 Voice sample collection method based on network dubbing game
CN107292496A (en) * 2017-05-31 2017-10-24 中南大学 A kind of work values cognitive system and method
CN107239897A (en) * 2017-05-31 2017-10-10 中南大学 A kind of personality occupation type method of testing and system
CN107230171A (en) * 2017-05-31 2017-10-03 中南大学 A kind of student, which chooses a job, is orientated evaluation method and system
CN107274738A (en) * 2017-06-23 2017-10-20 广东外语外贸大学 Chinese-English translation teaching points-scoring system based on mobile Internet
CN109214616B (en) 2017-06-29 2023-04-07 上海寒武纪信息科技有限公司 Information processing device, system and method
CN109426553A (en) 2017-08-21 2019-03-05 上海寒武纪信息科技有限公司 Task cutting device and method, Task Processing Unit and method, multi-core processor
WO2019001418A1 (en) 2017-06-26 2019-01-03 上海寒武纪信息科技有限公司 Data sharing system and data sharing method therefor
CN110413551B (en) 2018-04-28 2021-12-10 上海寒武纪信息科技有限公司 Information processing apparatus, method and device
CN107578778A (en) * 2017-08-16 2018-01-12 南京高讯信息科技有限公司 A kind of method of spoken scoring
CN107785011B (en) * 2017-09-15 2020-07-03 北京理工大学 Training method, device, equipment and medium of speech rate estimation model and speech rate estimation method, device and equipment
CN109697988B (en) * 2017-10-20 2021-05-14 深圳市鹰硕教育服务有限公司 Voice evaluation method and device
CN109727608B (en) * 2017-10-25 2020-07-24 香港中文大学深圳研究院 Chinese speech-based ill voice evaluation system
CN107818797B (en) * 2017-12-07 2021-07-06 苏州科达科技股份有限公司 Voice quality evaluation method, device and system
CN108428382A (en) * 2018-02-14 2018-08-21 广东外语外贸大学 It is a kind of spoken to repeat methods of marking and system
CN108429932A (en) * 2018-04-25 2018-08-21 北京比特智学科技有限公司 Method for processing video frequency and device
CN108831503B (en) * 2018-06-07 2021-11-19 邓北平 Spoken language evaluation method and device
CN109036429A (en) * 2018-07-25 2018-12-18 浪潮电子信息产业股份有限公司 A kind of voice match scoring querying method and system based on cloud service
CN108986786B (en) * 2018-07-27 2020-12-08 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) Voice interaction equipment rating method, system, computer equipment and storage medium
CN109147823A (en) * 2018-10-31 2019-01-04 河南职业技术学院 Oral English Practice assessment method and Oral English Practice assessment device
CN109493658A (en) * 2019-01-08 2019-03-19 上海健坤教育科技有限公司 Situated human-computer dialogue formula spoken language interactive learning method
CN111640452B (en) * 2019-03-01 2024-05-07 北京搜狗科技发展有限公司 Data processing method and device for data processing
CN109979484B (en) * 2019-04-03 2021-06-08 北京儒博科技有限公司 Pronunciation error detection method and device, electronic equipment and storage medium
CN110135492B (en) * 2019-05-13 2020-12-22 山东大学 Equipment fault diagnosis and abnormality detection method and system based on multiple Gaussian models
CN110211607A (en) * 2019-07-04 2019-09-06 山东中医药高等专科学校 A kind of English learning system based on sensing network
CN110600052B (en) * 2019-08-19 2022-06-07 天闻数媒科技(北京)有限公司 Voice evaluation method and device
CN111358428A (en) * 2020-01-20 2020-07-03 书丸子(北京)科技有限公司 Observation capability test evaluation method and device
CN111294468A (en) * 2020-02-07 2020-06-16 普强时代(珠海横琴)信息技术有限公司 Tone quality detection and analysis system for customer service center calling
CN111554324A (en) * 2020-04-01 2020-08-18 深圳壹账通智能科技有限公司 Intelligent language fluency identification method and device, electronic equipment and storage medium
CN111696524B (en) * 2020-04-21 2023-02-14 厦门快商通科技股份有限公司 Character-overlapping voice recognition method and system
CN111583961A (en) * 2020-05-07 2020-08-25 北京一起教育信息咨询有限责任公司 Stress evaluation method and device and electronic equipment
CN111612324B (en) * 2020-05-15 2021-02-19 深圳看齐信息有限公司 Multi-dimensional assessment method based on oral English examination
CN111599234A (en) * 2020-05-19 2020-08-28 黑龙江工业学院 Automatic English spoken language scoring system based on voice recognition
CN111612352B (en) * 2020-05-22 2024-06-11 北京易华录信息技术股份有限公司 Student expression capability assessment method and device
CN111816169B (en) * 2020-07-23 2022-05-13 思必驰科技股份有限公司 Method and device for training Chinese and English hybrid speech recognition model
CN112349300A (en) * 2020-11-06 2021-02-09 北京乐学帮网络技术有限公司 Voice evaluation method and device
CN112634692A (en) * 2020-12-15 2021-04-09 成都职业技术学院 Emergency evacuation deduction training system for crew cabins
CN112750465B (en) * 2020-12-29 2024-04-30 昆山杜克大学 Cloud language ability evaluation system and wearable recording terminal
CN113035238B (en) * 2021-05-20 2021-08-27 北京世纪好未来教育科技有限公司 Audio evaluation method, device, electronic equipment and medium
CN113571043B (en) * 2021-07-27 2024-06-04 广州欢城文化传媒有限公司 Dialect simulation force evaluation method and device, electronic equipment and storage medium
CN113807813A (en) * 2021-09-14 2021-12-17 广东德诚科教有限公司 Grading system and method based on man-machine conversation examination

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102800314A (en) * 2012-07-17 2012-11-28 广东外语外贸大学 English sentence recognizing and evaluating system with feedback guidance and method of system
CN103617799A (en) * 2013-11-28 2014-03-05 广东外语外贸大学 Method for detecting English statement pronunciation quality suitable for mobile device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102354495B (en) * 2011-08-31 2012-11-14 中国科学院自动化研究所 Testing method and system of semi-opened spoken language examination questions
CN103559894B (en) * 2013-11-08 2016-04-20 科大讯飞股份有限公司 Oral evaluation method and system


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of an Automatic Pronunciation Evaluation System; Meng Ping; China Master's Theses Full-text Database, Information Science and Technology; 2010-12-15 (No. 12); pp. 7-26, 33-40 *
Research on an Objective Evaluation System for English Sentences Considering Stress and Prosody; Li Xinguang et al.; Computer Engineering and Applications; 2013-04-15; Vol. 49 (No. 8); pp. 105-109 *
A Hybrid Model Combining HMM and Self-Organizing Neural Networks for Speech Recognition; Li Jingjiao et al.; Journal of Northeastern University; 1999-04-30; Vol. 20 (No. 2); pp. 144-147 *

Also Published As

Publication number Publication date
CN103928023A (en) 2014-07-16

Similar Documents

Publication Publication Date Title
CN103928023B (en) A kind of speech assessment method and system
US11322155B2 (en) Method and apparatus for establishing voiceprint model, computer device, and storage medium
CN106228977B (en) Multi-mode fusion song emotion recognition method based on deep learning
CN101064104B (en) Emotion voice creating method based on voice conversion
CN102800314B (en) English sentence recognizing and evaluating system with feedback guidance and method
Deshwal et al. Feature extraction methods in language identification: a survey
CN109119072A (en) Civil aviaton's land sky call acoustic model construction method based on DNN-HMM
CN104050965A (en) English phonetic pronunciation quality evaluation system with emotion recognition function and method thereof
US20010010039A1 (en) Method and apparatus for mandarin chinese speech recognition by using initial/final phoneme similarity vector
CN106548775A (en) A kind of audio recognition method and system
Razak et al. Quranic verse recitation recognition module for support in j-QAF learning: A review
CN109300339A (en) A kind of exercising method and system of Oral English Practice
Sinha et al. Acoustic-phonetic feature based dialect identification in Hindi Speech
Kandali et al. Vocal emotion recognition in five native languages of Assam using new wavelet features
Kanabur et al. An extensive review of feature extraction techniques, challenges and trends in automatic speech recognition
Liu et al. AI recognition method of pronunciation errors in oral English speech with the help of big data for personalized learning
CN102880906B (en) Chinese vowel pronunciation method based on DIVA nerve network model
Goyal et al. A comparison of Laryngeal effect in the dialects of Punjabi language
CN112133292A (en) End-to-end automatic voice recognition method for civil aviation land-air communication field
Sharma et al. Soft-Computational Techniques and Spectro-Temporal Features for Telephonic Speech Recognition: an overview and review of current state of the art
Bansal et al. Emotional Hindi speech: Feature extraction and classification
Hacioglu et al. Parsing speech into articulatory events
Rao et al. Robust features for automatic text-independent speaker recognition using Gaussian mixture model
Dalva Automatic speech recognition system for Turkish spoken language
Sinha et al. Spectral and prosodic features-based speech pattern classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170405

Termination date: 20200429

CF01 Termination of patent right due to non-payment of annual fee