CN100514446C - Pronunciation evaluating method based on voice identification and voice analysis - Google Patents

Pronunciation evaluating method based on voice identification and voice analysis

Info

Publication number
CN100514446C
CN100514446C
Authority
CN
China
Prior art keywords
energy
word
fundamental frequency
syllable
duration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2004100744450A
Other languages
Chinese (zh)
Other versions
CN1750121A (en)
Inventor
刘建 (Liu Jian)
赵庆卫 (Zhao Qingwei)
颜永红 (Yan Yonghong)
邵健 (Shao Jian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Original Assignee
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS, Beijing Kexin Technology Co Ltd filed Critical Institute of Acoustics CAS
Priority to CNB2004100744450A priority Critical patent/CN100514446C/en
Publication of CN1750121A publication Critical patent/CN1750121A/en
Application granted granted Critical
Publication of CN100514446C publication Critical patent/CN100514446C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The pronunciation evaluation method based on speech recognition and speech analysis includes the following steps: capturing the input raw speech signal, converting it to a digital signal, and splitting it into frames; extracting feature parameters from the speech frames; recognizing the input speech with a speech recognition engine to obtain segment information for each word and/or syllable, and computing the confidence of each word; and evaluating the pronunciation accuracy of each word and/or syllable of the input speech from these confidences. In addition, the duration, energy, and frequency information of each speech frame may be computed at the same time and compared with a standard-pronunciation library to obtain the similarity of each word and/or syllable; the similarity can then be combined with the confidence in a weighted sum to obtain the pronunciation accuracy. The invention greatly improves the precision of pronunciation evaluation.

Description

A pronunciation evaluation method based on speech recognition and speech analysis
Technical field
The present invention relates to a pronunciation evaluation method, and more particularly to a pronunciation evaluation method based on speech recognition and speech analysis.
Background technology
" say " it is a important step in the language learning, existing a lot of language teaching softwares on the market, such as, English study software, the form of teaching that these software adopted substantially all is " recording contrast ", that is to say that the repeat playing that they can only provide the student to pronounce and teacher's example is pronounced allows student oneself listen contrast different orchids wherein and corrects one's pronunciation.In fact, the teaching efficiency that such mode can play is very limited, because the common cacoepy of people, just because of the difference of itself can not listening between " accurately " and " inaccurate " pronunciation.
At present, the English study software of pronunciation evaluation function has also appearred having in market, but they have mostly just adopted speech recognition technology to judge what the user said, can only provide a general judge, can only show the user this in short read OK, and cacoepy on certain or certain several words just often in the said a word of user, this each word (word) thought in can only can not accurately showing in short to the whether accurate method of assessing in short OK, allow the user correct targetedly, thereby be difficult to satisfy actual demand.
Summary of the invention
The technical problem to be solved by the present invention is to provide a pronunciation evaluation method based on speech recognition and speech analysis that can analyze and evaluate the input speech word by word or character by character, thereby improving the accuracy of the assessment. To solve the above technical problem, the invention provides a pronunciation evaluation method based on speech recognition and speech analysis, comprising the following steps:
(a) capture the input raw speech signal, convert it to a digital signal, and split it into frames;
(b) extract feature parameters from the speech frames, computing the energy and the fundamental frequency of each speech frame, or only one of the two;
(c) recognize the input speech with a speech recognition engine to obtain segment information for each word (e.g., for English) or character (e.g., for Chinese);
(d) compare any one or any combination of the obtained frame energies, fundamental frequencies, and segment information with the corresponding standard-pronunciation information, and compute any one or any combination of the difference values of duration, energy, and fundamental frequency of each word or character.
In the above scheme, step (d) is further divided into the following steps:
(d1) obtain the duration of each syllable from the number of speech frames it contains; sum the energies and the fundamental frequencies of all frames in each syllable and divide by the number of frames to obtain the syllable's energy and fundamental frequency;
(d2) compare the duration, energy, and fundamental frequency of all syllables contained in each word or character with the corresponding standard-pronunciation information, and compute the difference values of duration, energy, and fundamental frequency of each word or character.
In the above scheme, step (d1) may further accumulate the durations, fundamental frequencies, and energies of all syllables of each word or character to obtain the duration, fundamental frequency, or energy of that word or character, and compare these with the corresponding standard-pronunciation information to obtain another group of difference values of duration, energy, and fundamental frequency for each word or character.
In the above scheme, the difference values of duration, energy, and fundamental frequency of each word or character in step (d2) are obtained by squaring the differences between the duration, energy, and fundamental frequency of each contained syllable and the corresponding standard-pronunciation values, and then averaging.
The present invention also solves the above technical problem with a pronunciation evaluation method based on speech recognition and speech analysis that can analyze and evaluate the input speech syllable by syllable, thereby improving the accuracy of the assessment. To this end, the invention provides a pronunciation evaluation method based on speech recognition and speech analysis, comprising the following steps:
(o) capture the input raw speech signal, convert it to a digital signal, and split it into frames;
(p) extract feature parameters from the speech frames, computing the energy and the fundamental frequency of each speech frame, or only one of the two;
(q) recognize the input speech with a speech recognition engine to obtain segment information accurate to each syllable;
(r) compare any one or any combination of the obtained frame energies, fundamental frequencies, and segment information with the corresponding standard-pronunciation information, and compute any one or any combination of the difference values of duration, energy, and fundamental frequency of each syllable.
In the above scheme, step (r) compares the duration, energy, and fundamental frequency of the speech frames contained in each syllable, or the further computed syllable-level duration, energy, and fundamental frequency, with the corresponding standard-pronunciation information to obtain the difference values of duration, energy, and fundamental frequency of each syllable.
The present invention does not require the user to compare recordings unaided, nor does it merely use speech recognition to determine what the user said. Instead, it first uses speech recognition to segment the user's pronunciation precisely, down to each word and each syllable of the sentence; it then uses the intensity, frequency, and prosodic information of the pronunciation to analyze exactly how the user's pronunciation differs from the standard, and finally tells the user (the learner) on which word and which syllable the problem lies and how to improve. This greatly improves the precision and effectiveness of pronunciation evaluation.
Description of drawings
Fig. 1 is a system block diagram of the pronunciation evaluation method of an embodiment of the invention;
Fig. 2 is an example of the recognition and segmentation result for a sample sentence;
Fig. 3 is a flowchart of the method of the first embodiment of the invention;
Fig. 4 is a flowchart of the method of the second embodiment of the invention.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.
First embodiment
Fig. 1 is the system block diagram of the pronunciation evaluation method of this embodiment. As shown in the figure, the system comprises a feature extraction module, a speech recognition and automatic alignment module, and an information fusion analysis module. The raw speech first enters the feature extraction module, which splits the signal into frames and computes per-frame data such as intensity, duration, and the fundamental-frequency curve. The raw signal is then transformed into MFCC features and passed to the speech recognition module, where the MFCC-transformed signal is recognized and automatically aligned against the current learning content and the standard acoustic model, yielding segment information accurate to each English word or Chinese character in the sentence; from this segment information the confidence of each word or syllable is then computed. Finally, the information fusion analysis module combines the intensity, duration, and fundamental-frequency data from the feature extraction module with the segment and confidence data from the recognition module, compares them against the learning-content sample library, and produces the final evaluation of the utterance, accurate to each word or character of the pronunciation.
Fig. 3 is the flowchart of this embodiment; as shown in the figure, the method comprises the following steps:
Step 100: the user reads a sentence from the learning content;
Step 101: acquire the raw speech signal, converting the analog signal of the user's pronunciation into digital samples;
Step 102: split the digital speech signal into frames, typically using a 25 ms analysis frame; after each frame is analyzed, the analysis window is shifted back by 10 ms and the process repeats until the whole signal has been processed;
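A minimal sketch of the framing in step 102, assuming a mono PCM signal held in a NumPy array; this is an illustration under those assumptions, not the patent's code:

```python
import numpy as np

def split_frames(signal: np.ndarray, sample_rate: int,
                 frame_ms: float = 25.0, hop_ms: float = 10.0) -> np.ndarray:
    """Split a 1-D signal into overlapping analysis frames (step 102)."""
    frame_len = int(sample_rate * frame_ms / 1000)   # samples in a 25 ms frame
    hop_len = int(sample_rate * hop_ms / 1000)       # 10 ms shift per frame
    if len(signal) < frame_len:                      # very short input: pad once
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    return np.stack([signal[i * hop_len : i * hop_len + frame_len]
                     for i in range(n_frames)])      # (n_frames, frame_len)
```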
Step 103: for each frame of the speech signal, compute the frame energy, the MFCC parameters, and the fundamental frequency;
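The per-frame features of step 103 can be illustrated as follows. The patent does not specify these algorithms, so the short-time energy and the naive autocorrelation-based fundamental-frequency estimate below are common stand-ins, not the patent's method; MFCC extraction would be delegated to a library (e.g., librosa.feature.mfcc).

```python
import numpy as np

def frame_energy(frame: np.ndarray) -> float:
    """Short-time energy of one frame (a log scale is also common)."""
    return float(np.sum(frame.astype(np.float64) ** 2))

def frame_pitch(frame: np.ndarray, sample_rate: int,
                f_min: float = 60.0, f_max: float = 400.0) -> float:
    """Naive F0 estimate: the autocorrelation peak within [f_min, f_max].
    Voiced/unvoiced detection is deliberately omitted."""
    x = frame.astype(np.float64) - frame.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags
    lag_lo = int(sample_rate / f_max)                  # shortest period of interest
    lag_hi = min(int(sample_rate / f_min), len(ac) - 1)
    lag = lag_lo + int(np.argmax(ac[lag_lo:lag_hi]))
    return sample_rate / lag                           # Hz
```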
Step 104: according to the current learning content, recognize the input speech with the speech recognition engine to obtain segment information for the sentence; in this embodiment this yields each word or character, and the segment information of each syllable the word contains. Fig. 2 shows an example of the recognition and segmentation result for a sample sentence.
Step 105: after the segment information is obtained, continue by computing the confidence of each word or character in the input speech;
In steps 104 and 105 above, both the segment information and the confidence are computed with the algorithm disclosed in the patent application "Speech recognition confidence evaluation method and system, and dictation device using the same" (application number 02148686.7), although other algorithms may also be used; the invention is not limited in this respect.
Step 106: from the segment information and the energy and fundamental frequency of each frame, compute the duration, energy, and fundamental frequency of each word or character in the input speech;
A word or character may consist of one or more consecutive syllables. This embodiment first computes the duration, energy, and fundamental frequency of each syllable, and then derives the duration, energy, and fundamental frequency of each word or character from the syllables it contains.
The duration of a syllable is simply the length of its pronunciation and can be read directly from the segment information: one frame corresponds to 10 ms, so the number of speech frames a syllable contains represents its duration. The energy and fundamental frequency of a syllable are obtained by accumulating and then averaging the energies and fundamental frequencies of all its speech frames. Denote the energy, fundamental frequency, and duration of frame i of the user's raw speech by eng(i), pitch(i), and dur(i), where i is the frame index, and those of syllable k by eng(k), pitch(k), and dur(k), where k is the syllable index. Then dur(k) equals the number of frames contained in syllable k; eng(k) is the sum of the frame energies eng(i) of syllable k divided by the number of frames dur(k); and pitch(k) is the sum of the frame fundamental frequencies pitch(i) of syllable k divided by dur(k).
In this embodiment, the duration, energy, and fundamental frequency of each word or character are obtained by adding up those of the syllables it contains. Denote the energy, fundamental frequency, and duration of word or character j of the sentence by eng_w(j), pitch_w(j), and dur_w(j), where j is the word or character index. Then dur_w(j) equals the sum of the durations dur(k) of all syllables of word or character j; eng_w(j) equals the sum of their energies eng(k); and pitch_w(j) equals the sum of their fundamental frequencies pitch(k).
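The syllable- and word-level aggregation just described can be sketched as follows; `frames_of` and `syllables_of` are hypothetical mappings (from a syllable to its frame indices, and from a word to its syllable ids) standing in for the segment information:

```python
from typing import Dict, List, Tuple

def syllable_stats(eng: List[float], pitch: List[float],
                   frames_of: Dict[int, List[int]]) -> Dict[int, Tuple[float, float, float]]:
    """dur(k) = frame count; eng(k) and pitch(k) = per-frame averages."""
    stats = {}
    for k, idx in frames_of.items():
        dur_k = len(idx)
        stats[k] = (float(dur_k),
                    sum(eng[i] for i in idx) / dur_k,
                    sum(pitch[i] for i in idx) / dur_k)
    return stats  # k -> (dur, eng, pitch)

def word_stats(syl_stats: Dict[int, Tuple[float, float, float]],
               syllables_of: Dict[int, List[int]]) -> Dict[int, Tuple[float, float, float]]:
    """dur_w(j), eng_w(j), pitch_w(j) = sums over the word's syllables."""
    return {j: tuple(sum(syl_stats[k][m] for k in ks) for m in range(3))
            for j, ks in syllables_of.items()}
```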
Step 107: compare the confidence, duration, energy, and fundamental-frequency information of each word or character with the corresponding standard-pronunciation information in the learning-content sample library, and compute the pronunciation validity of each word or character;
The specific algorithm of this embodiment is as follows:
1) First find the example pronunciation corresponding to this input in the learning-content sample library and obtain, for each computed syllable and word (character), one group of "standard" durations, fundamental frequencies, and energies, denoted dur0(k), eng0(k), pitch0(k) and dur_w0(j), eng_w0(j), pitch_w0(j) respectively. These standard values may be stored in advance, or computed in real time from the example sentence in the standard-pronunciation library.
2) Then compute the difference values of duration, energy, and fundamental frequency of each word or character j:
$\Delta dur(j) = \frac{1}{N}\sum_{k \in word(j)} \left(dur(k) - dur_0(k)\right)^2$
$\Delta eng(j) = \frac{1}{N}\sum_{k \in word(j)} \left(eng(k) - eng_0(k)\right)^2$
$\Delta pitch(j) = \frac{1}{N}\sum_{k \in word(j)} \left(pitch(k) - pitch_0(k)\right)^2$
where N is the number of syllables belonging to word(j), and Δdur(j), Δeng(j), and Δpitch(j) are respectively the duration, energy, and fundamental-frequency difference values of the single word or character j.
3) Next, fuse the duration, energy, and fundamental-frequency difference values and compute a similarity from them:
$a = w_1 \cdot \Delta dur(j) + w_2 \cdot \Delta eng(j) + w_3 \cdot \Delta pitch(j)$   (1)
$b = 0.5\,\left(w_1 x + w_2 y + w_3 z\right)$   (2)
where $x = |dur_{w0}(j) - dur_w(j)|^2$, $y = |eng_{w0}(j) - eng_w(j)|^2$, $z = |pitch_{w0}(j) - pitch_w(j)|^2$.
The scoring function is: $sigmo(x) = \arctan(x - 3.5)/\pi \times 100 + 50$   (3)
Substituting the results of formulas (1) and (2) into formula (3) gives the similarity:
$score_1 = sigmo\left(-\log(a + b)/2\right)$
where w1, w2, and w3 are the weights given to the different kinds of information and can be set individually, e.g., to 0.5, 0.3, and 0.2.
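Formulas (1)-(3) and the resulting similarity score1 can be sketched as below. Two points are my assumptions rather than the patent's statement: the reading of formula (2) with the 0.5 scaling the whole weighted sum, and the small epsilon that guards log(0) when the input exactly matches the standard.

```python
import math

def sigmo(x: float) -> float:
    """Scoring function (3): maps a log-difference onto roughly 0..100."""
    return math.atan(x - 3.5) / math.pi * 100 + 50

def word_similarity(syl, syl0, word, word0, w=(0.5, 0.3, 0.2)) -> float:
    """score1 for one word j.
    syl / syl0:   per-syllable (dur, eng, pitch) of the user / the standard;
    word / word0: word-level (dur_w, eng_w, pitch_w) of the user / the standard."""
    n = len(syl)
    deltas = [sum((s[m] - s0[m]) ** 2 for s, s0 in zip(syl, syl0)) / n
              for m in range(3)]                        # Δdur(j), Δeng(j), Δpitch(j)
    a = sum(wi * d for wi, d in zip(w, deltas))         # formula (1)
    b = 0.5 * sum(wi * (v0 - v) ** 2                    # formula (2)
                  for wi, v0, v in zip(w, word0, word))
    return sigmo(-math.log(a + b + 1e-10) / 2)          # guard log(0); my addition
```

Point 4) below then blends this score1 with the recognition confidence score2 into the final word-level validity.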
4) Finally, compute the pronunciation validity score of each word or character from its confidence and the similarity computed above.
Denote the confidence of word(j) obtained earlier by $score_2$; the final pronunciation validity of word(j) is then the weighted mean of $score_1$ and $score_2$:
$score = w_1 \cdot score_1 + w_2 \cdot score_2$
The two weights may be set equal, i.e., $w_1 = w_2 = 0.5$, or, of course, to different values.
Step 108: compute the validity of the whole sentence, which equals the mean of the validity scores of all the words or characters in the sentence.
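Step 108 reduces to a plain average; a one-line sketch, assuming `word_scores` holds the per-word validities from step 107:

```python
def sentence_validity(word_scores: list) -> float:
    """Whole-sentence validity: the plain mean of the per-word scores."""
    return sum(word_scores) / len(word_scores)
```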
Step 109: evaluate the pronunciation from the above results and feed the validity of the whole sentence and of each word or character back to the user.
Second embodiment
The main difference between this embodiment and the first is that this embodiment also assesses pronunciation validity at the syllable level, so as to point out more precisely where the user's pronunciation differs from the standard; in addition, this embodiment adopts a different scoring method.
This embodiment uses the same pronunciation evaluation system as the first embodiment; its method comprises the following steps:
Step 200: the user reads a sentence from the learning content;
Step 201: acquire the raw speech signal, converting the analog signal of the user's pronunciation into digital samples;
Step 202: split the digital speech signal into frames, typically using a 25 ms analysis frame; after each frame is analyzed, the analysis window is shifted back by 10 ms and the process repeats until the whole signal has been processed;
Step 203: for each frame of the speech signal, compute the frame energy, the MFCC parameters (other feature parameters may also be used), and the fundamental frequency;
Step 204: according to the current learning content, recognize the input speech with the speech recognition engine to obtain each word or character of the sentence and the segment information of each syllable;
Step 205: after the segment information is obtained, compute the confidence of each word or character and of each syllable it contains, still using the algorithm disclosed in patent application 02148686.7;
Step 206: compare the duration, energy, and fundamental-frequency information of each frame, dur(i), eng(i), pitch(i), with the corresponding standard-pronunciation information for that frame in the learning-content sample library, dur0(i), eng0(i), pitch0(i), and compute the similarity of duration, energy, and fundamental frequency for each syllable of the input speech:
First compute the duration, energy, and fundamental-frequency difference values Δdur(k), Δeng(k), Δpitch(k) of syllable k:
$\Delta dur(k) = \frac{1}{N}\sum_{i \in syllable(k)} \left(dur(i) - dur_0(i)\right)^2$
$\Delta eng(k) = \frac{1}{N}\sum_{i \in syllable(k)} \left(eng(i) - eng_0(i)\right)^2$
$\Delta pitch(k) = \frac{1}{N}\sum_{i \in syllable(k)} \left(pitch(i) - pitch_0(i)\right)^2$
where N is the number of frames belonging to syllable(k). Then compute the similarities of duration, energy, and fundamental frequency separately:
$score_a = sigmo\left(-\log\left(\Delta dur(k) + \left|dur_0(k) - dur(k)\right|/2\right)\right)$
$score_b = sigmo\left(-\log\left(\Delta eng(k) + \left|eng_0(k) - eng(k)\right|/2\right)\right)$
$score_c = sigmo\left(-\log\left(\Delta pitch(k) + \left|pitch_0(k) - pitch(k)\right|/2\right)\right)$
where the sigmo(x) function is the same as in the first embodiment.
Step 207: compute the pronunciation validity of each syllable contained in each word or character, as the weighted average of the syllable's confidence and its duration, energy, and fundamental-frequency similarities (a code sketch follows the formula below):
$score = w_1 \cdot score_1 + w_2 \cdot score_2 + w_3 \cdot score_3 + w_4 \cdot score_4$, where $score_1$ through $score_4$ denote the syllable's confidence and its duration, energy, and fundamental-frequency similarities, and w1, w2, w3, and w4 are the weights of the different kinds of information, which may be set, for example, to 0.25, 0.15, 0.10, and 0.5 respectively;
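A sketch of the syllable scoring in steps 206 and 207. Several readings are my assumptions rather than the patent's statement: the /2 in the similarity formulas applying only to the syllable-level absolute difference, the 0.5 weight going to the confidence, the epsilon guarding log(0), and the requirement that the user's and reference frame sequences be pre-aligned.

```python
import math

def sigmo(x: float) -> float:
    return math.atan(x - 3.5) / math.pi * 100 + 50

def syllable_validity(frames, frames0, syl, syl0, confidence,
                      w=(0.5, 0.25, 0.15, 0.10)) -> float:
    """frames / frames0: per-frame (dur, eng, pitch) of syllable k for the
    user / the standard, assumed already aligned to the same frame count
    (time alignment, e.g. DTW, is outside this sketch);
    syl / syl0: syllable-level (dur, eng, pitch);
    confidence: the syllable's recognition confidence from step 205."""
    n = len(frames)
    sims = []
    for m in range(3):  # duration, energy, fundamental frequency
        delta = sum((f[m] - f0[m]) ** 2 for f, f0 in zip(frames, frames0)) / n
        sims.append(sigmo(-math.log(delta + abs(syl0[m] - syl[m]) / 2 + 1e-10)))
    # weighted average of confidence and the three similarities; assigning
    # the 0.5 weight to the confidence is my assumption, not the patent's
    return w[0] * confidence + w[1] * sims[0] + w[2] * sims[1] + w[3] * sims[2]
```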
Step 208: compute the validity of each single word or character, determined as the average of the validities of all the syllables it contains; the method of the first embodiment may also be used;
Step 209: compute the validity of the whole sentence, which equals the mean of the validity scores of all the words or characters in the sentence;
Step 210: evaluate the pronunciation from the above results and feed the validity of the whole sentence and of each syllable back to the user.
Many variations of the invention are possible on the basis of the above embodiments. For example, the validity of each word or character above takes four kinds of information into account, namely confidence, duration, fundamental frequency, and energy; but duration, fundamental frequency, and energy may also be left out, or only one of them, or any combination of them, may be used.
As another example, two concrete methods of computing the validity of words or characters and of syllables have been given above, but this algorithm admits many variations: when the first embodiment computes the duration, fundamental frequency, and energy of a word or character and their difference values, it may skip the difference values of the contained syllables and compute directly from the frames the word or character contains; the methods of computing the similarity and the final validity can likewise take many forms; and so on.

Claims (4)

1. A pronunciation evaluation method based on speech recognition and speech analysis, comprising the following steps:
(a) capturing the input raw speech signal, converting it to a digital signal, and splitting it into frames;
(b) extracting feature parameters from the speech frames and computing the energy and the fundamental frequency of each speech frame, or only one of the two;
(c) recognizing the input speech with a speech recognition engine to obtain segment information for each word or character;
(d) comparing any one or any combination of the obtained frame energies, fundamental frequencies, and segment information with the corresponding standard-pronunciation information, and computing any one or any combination of the difference values of duration, energy, and fundamental frequency of each word or character; said step (d) being further divided into the following steps:
(d1) obtaining the duration of each syllable from the number of speech frames it contains, and obtaining the fundamental frequency and energy of each syllable by summing those of all its speech frames and dividing by the number of frames;
(d2) comparing the duration, energy, and fundamental frequency of all syllables contained in each word or character with the corresponding standard-pronunciation information, and computing the difference values of duration, energy, and fundamental frequency of each word or character.
2. The pronunciation evaluation method of claim 1, wherein step (d1) further comprises accumulating the durations, fundamental frequencies, and energies of all syllables of each word or character to obtain the duration, fundamental frequency, or energy of that word or character, and comparing these with the corresponding standard-pronunciation information to obtain another group of difference values of duration, energy, and fundamental frequency for each word or character.
3. The pronunciation evaluation method of claim 1, wherein the difference values of duration, energy, and fundamental frequency of each word or character in step (d2) are obtained by squaring the differences between the duration, energy, and fundamental frequency of each contained syllable and the corresponding standard-pronunciation values, and then averaging.
4. A pronunciation evaluation method based on speech recognition and speech analysis, comprising the following steps:
(o) capturing the input raw speech signal, converting it to a digital signal, and splitting it into frames;
(p) extracting feature parameters from the speech frames and computing the energy and the fundamental frequency of each speech frame, or only one of the two;
(q) recognizing the input speech with a speech recognition engine to obtain segment information accurate to each syllable;
(r) comparing the duration, energy, and fundamental frequency of the speech frames contained in each syllable, or the further computed syllable-level duration, energy, and fundamental frequency, with the corresponding standard-pronunciation information to obtain the difference values of duration, energy, and fundamental frequency of each syllable.
CNB2004100744450A 2004-09-16 2004-09-16 Pronunciation evaluating method based on voice identification and voice analysis Expired - Fee Related CN100514446C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2004100744450A CN100514446C (en) 2004-09-16 2004-09-16 Pronunciation evaluating method based on voice identification and voice analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2004100744450A CN100514446C (en) 2004-09-16 2004-09-16 Pronunciation evaluating method based on voice identification and voice analysis

Publications (2)

Publication Number Publication Date
CN1750121A CN1750121A (en) 2006-03-22
CN100514446C 2009-07-15

Family

ID=36605530

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100744450A Expired - Fee Related CN100514446C (en) 2004-09-16 2004-09-16 Pronunciation evaluating method based on voice identification and voice analysis

Country Status (1)

Country Link
CN (1) CN100514446C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092330A (en) * 2011-10-27 2013-05-08 宏碁股份有限公司 Electronic device and voice recognition method thereof

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101118745B (en) * 2006-08-04 2011-01-19 中国科学院声学研究所 Confidence degree quick acquiring method in speech identification system
CN101551947A (en) * 2008-06-11 2009-10-07 俞凯 Computer system for assisting spoken language learning
CN101645271B (en) * 2008-12-23 2011-12-07 中国科学院声学研究所 Rapid confidence-calculation method in pronunciation quality evaluation system
CN101950560A (en) * 2010-09-10 2011-01-19 中国科学院声学研究所 Continuous voice tone identification method
CN102999496A (en) * 2011-09-09 2013-03-27 北京百度网讯科技有限公司 Method for building requirement analysis formwork and method and device for searching requirement recognition
CN103390409A (en) * 2012-05-11 2013-11-13 鸿富锦精密工业(深圳)有限公司 Electronic device and method for sensing pornographic voice bands
CN104078050A (en) 2013-03-26 2014-10-01 杜比实验室特许公司 Device and method for audio classification and audio processing
CN104299612B (en) * 2014-11-10 2017-11-07 科大讯飞股份有限公司 The detection method and device of imitative sound similarity
CN105609114B (en) * 2014-11-25 2019-11-15 科大讯飞股份有限公司 A kind of pronunciation detection method and device
CN104485115B (en) * 2014-12-04 2019-05-03 上海流利说信息技术有限公司 Pronounce valuator device, method and system
CN104575490B (en) * 2014-12-30 2017-11-07 苏州驰声信息科技有限公司 Spoken language pronunciation evaluating method based on deep neural network posterior probability algorithm
CN105989836B (en) * 2015-03-06 2020-12-01 腾讯科技(深圳)有限公司 Voice acquisition method and device and terminal equipment
CN106157974A (en) * 2015-04-07 2016-11-23 富士通株式会社 Text recites quality assessment device and method
CN104952444B (en) * 2015-04-27 2018-07-17 桂林电子科技大学 A kind of Chinese's Oral English Practice method for evaluating quality that text is unrelated
CN106340295B (en) * 2015-07-06 2019-10-22 无锡天脉聚源传媒科技有限公司 A kind of receiving method and device of speech recognition result
CN107767863B (en) * 2016-08-22 2021-05-04 科大讯飞股份有限公司 Voice awakening method and system and intelligent terminal
CN106328168B (en) * 2016-08-30 2019-10-18 成都普创通信技术股份有限公司 A kind of voice signal similarity detection method
CN106847260B (en) * 2016-12-20 2020-02-21 山东山大鸥玛软件股份有限公司 Automatic English spoken language scoring method based on feature fusion
CN109697988B (en) * 2017-10-20 2021-05-14 深圳市鹰硕教育服务有限公司 Voice evaluation method and device
CN109697975B (en) * 2017-10-20 2021-05-14 深圳市鹰硕教育服务有限公司 Voice evaluation method and device
CN107767862B (en) * 2017-11-06 2024-05-21 深圳市领芯者科技有限公司 Voice data processing method, system and storage medium
CN108133706B (en) * 2017-12-21 2020-10-27 深圳市沃特沃德股份有限公司 Semantic recognition method and device
CN109147419A (en) * 2018-07-11 2019-01-04 北京美高森教育科技有限公司 Language learner system based on incorrect pronunciations detection
CN108961856A (en) * 2018-07-19 2018-12-07 深圳乐几科技有限公司 Verbal learning method and apparatus
CN109872726A (en) * 2019-03-26 2019-06-11 北京儒博科技有限公司 Pronunciation evaluating method, device, electronic equipment and medium
CN109859745A (en) * 2019-03-27 2019-06-07 北京爱数智慧科技有限公司 A kind of audio-frequency processing method, equipment and computer-readable medium
CN110085257A (en) * 2019-03-29 2019-08-02 语文出版社有限公司 A kind of rhythm automated decision system based on the study of national literature classics
CN111916108B (en) * 2020-07-24 2021-04-02 北京声智科技有限公司 Voice evaluation method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1293428A (en) * 2000-11-10 2001-05-02 清华大学 Information check method based on speed recognition

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1293428A (en) * 2000-11-10 2001-05-02 清华大学 Information check method based on speed recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on syllable-based tone recognition methods for continuous Chinese speech (基于音节的汉语连续语音声调识别方法研究). 钟金宏 (Zhong Jinhong). 2001 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092330A (en) * 2011-10-27 2013-05-08 宏碁股份有限公司 Electronic device and voice recognition method thereof
CN103092330B (en) * 2011-10-27 2015-11-25 宏碁股份有限公司 Electronic installation and speech identifying method thereof

Also Published As

Publication number Publication date
CN1750121A (en) 2006-03-22

Similar Documents

Publication Publication Date Title
CN100514446C (en) Pronunciation evaluating method based on voice identification and voice analysis
CN101710490B (en) Method and device for compensating noise for voice assessment
CN101751919B (en) Spoken Chinese stress automatic detection method
CN101740024B (en) Method for automatic evaluation of spoken language fluency based on generalized fluency
CN103065626B (en) Automatic grading method and automatic grading equipment for read questions in test of spoken English
TWI275072B (en) Pronunciation assessment method and system based on distinctive feature analysis
CN105741831B (en) A kind of oral evaluation method and system based on syntactic analysis
CN101246685B (en) Pronunciation quality evaluation method of computer auxiliary language learning system
CN102930866B (en) Evaluation method for student reading assignment for oral practice
CN103177733B (en) Standard Chinese suffixation of a nonsyllabic "r" sound voice quality evaluating method and system
CN109256152A (en) Speech assessment method and device, electronic equipment, storage medium
Das et al. Bengali speech corpus for continuous auutomatic speech recognition system
US6618702B1 (en) Method of and device for phone-based speaker recognition
CN103559892B (en) Oral evaluation method and system
CN106782603B (en) Intelligent voice evaluation method and system
CN102376182B (en) Language learning system, language learning method and program product thereof
JP2002544570A (en) Automated linguistic assessment using speech recognition modeling
CN105825852A (en) Oral English reading test scoring method
CN109377981B (en) Phoneme alignment method and device
CN101650886A (en) Method for automatically detecting reading errors of language learners
CN107240394A (en) A kind of dynamic self-adapting speech analysis techniques for man-machine SET method and system
CN107886968A (en) Speech evaluating method and system
CN109300339A (en) A kind of exercising method and system of Oral English Practice
CN111915940A (en) Method, system, terminal and storage medium for evaluating and teaching spoken language pronunciation
JPH0250198A (en) Voice recognizing system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090715

CF01 Termination of patent right due to non-payment of annual fee