CN101739870B - Interactive language learning system and method - Google Patents

Publication number
CN101739870B
Authority
CN
China
Prior art keywords
module
pronunciation
rhythm
mistake
learner
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2009101887026A
Other languages
Chinese (zh)
Other versions
CN101739870A (en)
Inventor
王岚
李崇国
陈金玉
蒙美玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN2009101887026A priority Critical patent/CN101739870B/en
Publication of CN101739870A publication Critical patent/CN101739870A/en
Application granted granted Critical
Publication of CN101739870B publication Critical patent/CN101739870B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention relates to an interactive language learning system and an interactive language learning method. The core of the system is a pronunciation and prosody detection module, which comprises a feature extraction module, a speech recognition module, a pronunciation evaluation module, a prosody detection module and a prosody evaluation module. The system judges the learner's speech input and feeds back the result in real time, so that the learner knows exactly which pronunciation errors were made, and it dynamically provides the content to be memorized by combining the feedback results with a memory curve. The learner can therefore improve his or her language level step by step in an interactive style of learning.

Description

Interactive language learning system and interactive language learning method
[Technical Field]
The present invention relates to an interactive language learning system and an interactive language learning method.
[Background Art]
Language learning is an important part of acquiring knowledge, and a growing number of people rely on language learning aids to learn faster and more efficiently. Rich learning content, interactive operation, personalized courses and ease of use are the inevitable direction in which language learning systems are developing.
A dictionary is in essence an assisted learning system. Its only medium is text, so although it helps with reading and writing, it gives no direct help with listening and speaking. With the continuing development of computer, multimedia and speech technologies, assisted learning systems that support listening, speaking, reading and writing to some degree, or in part, keep emerging. From electronic dictionaries to repeaters and point-reading machines, and on to learning software such as listening and writing software, the forms and functions of assisted learning systems have gradually been enriched.
The weakness of these systems is that they support listening, speaking, reading and writing only in part. They do not organically combine the stages of language learning, they lack real-time error judgment and feedback, and the learner merely receives content passively. Some of them do evaluate pronunciation quality, but what the learner finally gets is only a score or a grade, and such a score can hardly be accurate or authoritative. More importantly, what the learner cares about is the specific errors in his or her own pronunciation, that is, where exactly it is wrong. Such pronunciation evaluation systems can hardly give the learner that result, nor do they tell the learner how to correct the errors.
The prior art is therefore deficient and needs improvement.
[Summary of the Invention]
In view of the above problems, it is necessary to provide an interactive language learning system and an interactive language learning method that can feed back learning errors in real time and that offer interactive practice and interactive memorization.
An interactive language learning system comprises:
a voice acquisition module, configured to collect the learner's speech data;
a pronunciation and prosody detection module, configured to extract from the speech data the characteristic parameters used for pronunciation and prosody error detection, to further judge the errors and control the degree to which errors are displayed, and to obtain the final phoneme errors and prosody errors;
a data storage and statistics module, configured to record the phoneme errors and prosody errors, to give an overall evaluation of the learner's pronunciation based on these errors, and to feed the evaluation result back to an interactive module;
the interactive module, comprising a display interface that is used to display the phoneme errors and prosody errors, the overall evaluation of the learner's pronunciation and the help options, and to provide a pronunciation prompt.
The pronunciation and prosody detection module comprises:
a feature extraction module, configured to extract from the speech data the characteristic parameters used for pronunciation and prosody error detection;
a speech recognition module, configured to recognize the characteristic parameters on the basis of an acoustic model combined with a language model or a word network, and to obtain the word sequence, the phoneme sequence, the corresponding time boundaries and the likelihood values;
a pronunciation evaluation module, configured to align and compare the recognized phoneme sequence with the reference phonemes of the system, and to obtain the phoneme errors and help options;
a prosody detection module, configured to obtain the word stress pattern, the sentence intonation and the timing rhythm with a statistical model, using the characteristic parameters, the phoneme sequence, the time boundary information and the likelihood values;
a prosody evaluation module, configured to compare the word stress pattern, the sentence intonation and the timing rhythm with the reference pronunciation, and to obtain the prosody errors and help options.
Preferably, the speech data collected by the voice acquisition module comprises speech data obtained when the learner reads after a pronunciation prompt provided by the system and speech data obtained when the learner speaks according to a pronunciation scenario.
Preferably, the pronunciation evaluation module first uses a statistical method, combining the word sequence, phoneme sequence, time boundaries and likelihood values, to judge the content at the word level. If the content does not match, the system records a content error, the interactive module indicates that the content of the whole sentence does not meet the requirement, and the learner is asked to input the speech again. Otherwise the phonemes are examined to obtain the phoneme errors, which comprise insertion, deletion and substitution errors of phonemes within a word.
Preferably, the word stress pattern is judged syllable by syllable and comprises the position of the primary stressed syllable and the position of the secondary stressed syllable in the word; the sentence intonation is the sentence stress of the whole sentence, that is, the positions of the stressed syllables in the sentence, and it reflects the syllable-based fundamental frequency trend and intonation of the whole sentence; the timing rhythm is a judgment of speaking rate and duration.
Preferably, the pronunciation prompt of the interactive module is a pronunciation text, which is the learner's target learning content; or a reference pronunciation, which is the standard pronunciation produced by native speakers of the target language; or a pronunciation scenario, which is a scenario provided by the system in which the learner is required to speak.
Preferably, the interactive module further comprises an input interface, which is used to select a memory mode or learning content, or to exit the system. The display interface is further used to display the information fed back by the system, including audio and spelling information and the information fed back by the data storage and statistics module. The interactive module selects language learning material and prompts the learner by audio or by text: an audio prompt is the pronunciation, provided by the system, of the content to be memorized, and the learner is required to spell it and read after it; a spelling prompt is a text prompt, provided by the system, of the spelling content to be memorized, and the learner is required to spell it, which yields the spelling content.
The interactive language learning system further comprises a text acquisition module and a text spelling detection module. The text acquisition module is configured to collect the spelling content and obtain the input text. The text spelling detection module is configured to check the input text and to obtain the spelling errors by computing the edit distance between the input text and the model answer text.
The data storage and statistics module is further configured to record the spelling errors. It also includes a database, to which the specific error statistics are written in a timely manner; the database stores not only the learning records but also the learning content. Based on the current error records, the selected memory mode and the learning content stored in the database, the system selects and generates new learning content together with audio and spelling prompts and feeds them back to the interactive module, so that the next round of interactive learning begins; alternatively, the learner reselects the learning content according to the current learning progress, or exits the system.
Preferably, the spelling errors comprise substitution, insertion and deletion errors.
Preferably, the interactive module is further used to display a set of dialogue scenarios in the form of tasks. After a dialogue scenario is selected through the interactive module, subtasks appear, and the learner completes the task by performing interactive operations, pronouncing and spelling according to the information provided by the interactive module. The interactive language learning system further comprises a user interface and an operation discrimination module: the user interface is configured to collect the interactive operations, and the operation discrimination module is configured to judge whether the interactive operations meet the task requirements and to obtain the operation errors. The data storage and statistics module is further configured to record the operation errors, and the database also stores dialogue-related information. The interactive language learning system further comprises a dialogue scenario module, which dynamically generates a new dialogue scenario from the error statistics and the dialogue-related information output by the data storage and statistics module and displays it through the interactive module. Through the interactive module the learner can choose to enter a new round of learning or to stop learning.
Preferably, the interactive language learning system is implemented in one of a client/server mode, a browser/server mode, and a standalone mode based on an embedded system.
An interactive language learning method comprises: collecting the speech data obtained when the learner pronounces as required by the program; extracting from the speech data the characteristic parameters used for pronunciation and prosody error detection; recognizing the characteristic parameters on the basis of an acoustic model combined with a language model or a word network, and obtaining the word sequence, the phoneme sequence, the corresponding time boundaries and the likelihood values; aligning and comparing the phoneme sequence with the reference phonemes of the system to obtain the phoneme errors and help options; obtaining the word stress pattern, the sentence intonation and the timing rhythm with a statistical model, using the characteristic parameters, the phoneme sequence and the time boundary information; comparing the word stress pattern, the sentence intonation and the timing rhythm with the reference pronunciation to obtain the prosody errors and help options; and displaying the phoneme and prosody errors, the overall evaluation of the pronunciation and the help options, and providing a pronunciation prompt.
Preferably, the method further comprises the following steps: before the speech data is collected, outputting the material to be memorized in audio or text form and requiring the learner to pronounce and spell it; collecting the spelling content to be memorized to obtain the input text; checking the input text to obtain the spelling errors; compiling error statistics from the phoneme, prosody and spelling errors obtained, recording the specific phoneme errors, prosody errors and spelling errors, and giving an evaluation score and feedback information; displaying the evaluation score and feedback information; and receiving an instruction to select a memory mode or learning content, or to exit the program.
Preferably, the method further comprises the following steps: displaying a dialogue scenario, in which the learner pronounces, spells and performs interactive operations as the scenario requires; collecting the interactive operations; judging whether the interactive operations meet the task requirements to obtain the operation errors; compiling error statistics from the phoneme, prosody, spelling and operation errors obtained, recording the specific phoneme pronunciation, prosody, spelling and operation errors, and giving an evaluation score and feedback; and dynamically generating and displaying a new dialogue scenario.
The above interactive language learning system can judge the learner's speech input and feed back the result in real time. It performs phoneme-level pronunciation detection and word-level prosody detection on the learner's input audio, so that the learner can pinpoint the specific errors in his or her own pronunciation. It also dynamically provides the content to be memorized by combining the feedback results with a memory curve, so that the learner can improve his or her language level step by step in an interactive mode of learning.
[Description of the Drawings]
Fig. 1 is a schematic diagram of the first embodiment of the interactive language learning system.
Fig. 2 is a schematic diagram of the pronunciation and prosody detection module.
Fig. 3 is a schematic diagram of the second embodiment of the interactive language learning system.
Fig. 4 is a schematic diagram of the third embodiment of the interactive language learning system.
[Detailed Description]
The technical solution of the present invention and its other beneficial effects will become apparent from the following detailed description of specific embodiments, given with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the first embodiment of the interactive language learning system. The system comprises two major parts: a user-facing user side 11 and a data processing side 12 that performs background processing. The user side 11 provides the devices that capture the learner's behavior and the display interface, and comprises a voice acquisition module 112 and an interactive module 111. The data processing side 12 processes the data collected by the user side 11 and generates the information to be displayed, and comprises a pronunciation and prosody detection module 121 and a data storage and statistics module 122.
The voice acquisition module 112 is used to collect the learner's speech data. Silence detection is first performed on the collected speech: audio features such as energy and zero-crossing rate are computed to judge whether there is any speech input or whether the input is silent. If the input is judged to contain no speech or to be silent, the speech is collected again.
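The silence check described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function names, the frame length and all thresholds are assumptions chosen for the example, and a real system would tune them to its sampling rate and microphone.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_silent(samples, frame_len=160, energy_thresh=1e-4,
              zcr_thresh=0.5, min_voiced_frames=5):
    """Return True when too few frames look like speech.

    A frame counts as voiced when its energy is high and its
    zero-crossing rate is low (hypothetical thresholds).
    """
    voiced = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if (frame_energy(frame) > energy_thresh
                and zero_crossing_rate(frame) < zcr_thresh):
            voiced += 1
    return voiced < min_voiced_frames

# Example input: 0.2 s of a 200 Hz tone sampled at 8 kHz.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
```

When `is_silent` returns True, the caller would ask the learner to record again, as the paragraph above describes.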
The pronunciation and prosody detection module 121 extracts from the speech data the characteristic parameters used for pronunciation and prosody error detection, further judges the errors and controls the degree to which errors are displayed, and obtains the final phoneme errors and prosody errors.
The data storage and statistics module 122 records the content errors, phoneme errors and prosody errors, gives an overall evaluation of the learner's pronunciation based on these errors, and feeds the evaluation result back to the interactive module 111.
The interactive module 111 displays the content, phoneme and prosody errors, the overall evaluation of the pronunciation and the help options to the learner, and provides a pronunciation prompt in the form of a pronunciation text, a reference pronunciation or a pronunciation scenario. The pronunciation text is the learner's target learning content, such as a word, phrase or sentence. The reference pronunciation is the standard pronunciation produced by native speakers of the target language. The pronunciation scenario is a scenario provided by the system, for example meeting a friend on the street and greeting him or her, in which the learner is required to speak.
Fig. 2 is a schematic diagram of the pronunciation and prosody detection module. The pronunciation and prosody detection module 121 comprises a feature extraction module 202, a speech recognition module 203, a pronunciation evaluation module 204, a prosody detection module 205 and a prosody evaluation module 206.
The feature extraction module 202 extracts from the speech data the characteristic parameters used for pronunciation and prosody error detection, for example: perceptual linear prediction (PLP) coefficients; Mel-frequency cepstral coefficients (MFCC); the frame average energy, that is, the energy over all frames spanned by a vowel; the frame average fundamental frequency (pitch), that is, the fundamental frequency of all frames spanned by a vowel averaged over the number of frames it spans; and their forward and backward differential parameters, including the forward frame average energy difference, the backward frame average energy difference, the forward consonant frame average energy difference, the forward frame average fundamental frequency difference, the backward frame average fundamental frequency difference, the forward duration difference and the backward duration difference.
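To make the per-vowel features and their differential parameters concrete, the sketch below averages frame energy and pitch over each vowel span and then takes differences against the neighboring vowels. It rests on assumptions not stated in the text: vowel spans are given as half-open frame ranges, "forward" is read as "relative to the preceding vowel" and "backward" as "relative to the following vowel", and boundary vowels get a difference of zero. All names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def vowel_features(frame_energy, frame_pitch, vowel_spans):
    """Average energy and pitch over each vowel's frames.

    vowel_spans: list of (start_frame, end_frame), end exclusive.
    """
    feats = []
    for start, end in vowel_spans:
        feats.append({
            "energy": mean(frame_energy[start:end]),
            "pitch": mean(frame_pitch[start:end]),
            "duration": end - start,  # in frames
        })
    return feats

def add_differences(feats):
    """Append forward/backward differences to each vowel's features.

    Assumption: forward = current minus preceding vowel,
    backward = current minus following vowel.
    """
    out = []
    for i, f in enumerate(feats):
        row = dict(f)
        for key in ("energy", "pitch", "duration"):
            prev = feats[i - 1][key] if i > 0 else f[key]
            nxt = feats[i + 1][key] if i + 1 < len(feats) else f[key]
            row["fwd_" + key + "_diff"] = f[key] - prev
            row["bwd_" + key + "_diff"] = f[key] - nxt
        out.append(row)
    return out

# Toy input: six frames, three vowels of two frames each.
energy = [1.0, 1.0, 3.0, 3.0, 2.0, 2.0]
pitch = [100.0, 100.0, 120.0, 120.0, 110.0, 110.0]
rows = add_differences(vowel_features(energy, pitch, [(0, 2), (2, 4), (4, 6)]))
```

Feature rows of this kind are what a stress classifier could consume; the consonant energy difference mentioned in the text is omitted here for brevity.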
The speech recognition module 203 recognizes the characteristic parameters on the basis of an acoustic model combined with a language model or a word network, and obtains the word-level and phoneme-level sequences, the corresponding time boundaries and the corresponding likelihood values. An acoustic model based on hidden Markov models (HMM) can be used together with a pronunciation dictionary. The acoustic model is trained on speech, collected from native speakers of the target language, that covers all phonemes; the pronunciation dictionary contains not only the correct pronunciations but also possible mispronunciations. The language model or word network is a statistical model of occurrence probabilities at the word level. For speech data input when the learner reads after a prompt, the speech recognition module 203 can use forced alignment, recognizing against the pronunciation text, to obtain the word sequence and phoneme sequence together with the time boundaries and likelihood values. For speech data input when the learner speaks as required by a scenario, the speech recognition module can decode with the word network or the language model to obtain the word sequence and phoneme sequence together with the time boundaries.
The pronunciation evaluation module 204 first uses a statistical method, combined with the results of the speech recognition module 203, to judge the content at the word level. If the speech data obtained by reading after a pronunciation prompt is judged to differ from the word sequence of the reference pronunciation, or the speech data obtained by speaking in a pronunciation scenario is judged to differ from the content of the model answer, no phoneme-level judgment is made. Processing passes directly to the data storage and statistics module 122, which records a content error; the interactive module 111 indicates that the content of the whole sentence does not meet the requirement and asks the user to input the speech again. Otherwise a string alignment algorithm, for example a dynamic programming algorithm, is used to align and compare the phoneme sequence with the reference phonemes provided by the system and to evaluate the pronunciation. According to the configured feedback error precision, this yields the phoneme errors, which fall into three types, namely insertion, deletion and substitution of phonemes within a word, together with the help options.
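The alignment step can be illustrated with a standard edit-distance dynamic program followed by a backtrace that labels each mismatch. This is a sketch under simplifying assumptions: all three error types cost 1, the phoneme symbols are arbitrary strings, and the function name `align_phonemes` is hypothetical. The patent does not specify costs or how the feedback precision filters the result.

```python
def align_phonemes(reference, recognized):
    """Align two phoneme lists and list the differences.

    Returns (kind, reference_phoneme, recognized_phoneme) tuples,
    where kind is "substitution", "deletion" or "insertion".
    """
    n, m = len(reference), len(recognized)
    # cost[i][j]: edit distance between reference[:i] and recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = reference[i - 1] != recognized[j - 1]
            cost[i][j] = min(cost[i - 1][j - 1] + mismatch,  # match / substitute
                             cost[i - 1][j] + 1,             # learner dropped a phoneme
                             cost[i][j - 1] + 1)             # learner added a phoneme
    # Walk back from the end to recover one optimal alignment.
    errors = []
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and reference[i - 1] != recognized[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            if mismatch:
                errors.append(("substitution", reference[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("deletion", reference[i - 1], None))
            i -= 1
        else:
            errors.append(("insertion", None, recognized[j - 1]))
            j -= 1
    errors.reverse()
    return errors
```

For example, aligning the reference `["th", "ih", "ng", "k"]` with the recognized `["s", "ih", "ng", "k"]` reports a single substitution of "th" by "s", which is the kind of specific error the interactive module would display.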
The prosody detection module 205 performs word-level lexical stress pattern detection and prosody detection. It combines the results of the speech recognition module 203, that is, the phoneme sequence, the corresponding time boundary information and the likelihood values, with the frame average energy and frame average fundamental frequency obtained by the feature extraction module 202, and uses a statistical model provided by the system to obtain the word stress pattern, the sentence intonation and the timing rhythm of the spoken sentence. The statistical model can be a trained support vector machine (SVM), a neural network, a hidden Markov model (HMM), or the like. The word stress pattern is judged syllable by syllable and comprises the position of the primary stressed syllable and the position of the secondary stressed syllable in the word. The sentence intonation is the sentence stress of the whole sentence, that is, the positions of the stressed syllables in the sentence, based on the syllable-level fundamental frequency trend and intonation of the whole sentence. The timing rhythm is a judgment of speaking rate and duration.
The prosody evaluation module 206 compares the word stress pattern, the sentence intonation and the timing rhythm with the reference pronunciation. According to the configured feedback error precision, it obtains the errors in the word stress pattern together with correction help, and the prosody errors in sentence stress, sentence intonation and rhythm together with the help options.
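A comparison of this kind could look like the sketch below, which assumes the detection stage has already produced a stress label per syllable (1 for primary, 2 for secondary, 0 for unstressed) and a sentence duration in seconds. The label encoding, the function names and the 30 percent tolerance standing in for the "feedback error precision" are all assumptions made for illustration.

```python
def stress_errors(word, reference, detected):
    """Compare per-syllable stress labels with the reference pattern.

    Labels (assumed encoding): 1 = primary, 2 = secondary, 0 = unstressed.
    """
    if len(reference) != len(detected):
        return [{"word": word, "error": "syllable_count",
                 "expected": len(reference), "got": len(detected)}]
    errors = []
    for index, (ref, det) in enumerate(zip(reference, detected)):
        if ref != det:
            errors.append({"word": word, "error": "stress",
                           "syllable": index, "expected": ref, "got": det})
    return errors

def tempo_error(reference_seconds, learner_seconds, tolerance=0.3):
    """Flag a sentence spoken much slower or faster than the reference.

    tolerance is a hypothetical stand-in for the feedback error precision.
    """
    ratio = learner_seconds / reference_seconds
    if ratio > 1 + tolerance:
        return "too_slow"
    if ratio < 1 - tolerance:
        return "too_fast"
    return None

# The noun "record" is stressed on the first syllable, the verb on the second.
noun_vs_verb = stress_errors("record", [1, 0], [0, 1])
```

Each returned entry names the syllable concerned, so a help option can point the learner at the exact position to correct.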
Fig. 3 is a schematic diagram of the second embodiment of the interactive language learning system. It differs from the first embodiment in that a text acquisition module 113 is added to the user side 11 and a text spelling detection module 123 is added to the data processing side 12, and the functions of the interactive module 111 and of the data storage and statistics module 122, which are directly connected to these two modules, are extended accordingly.
The interactive module 111 comprises a display interface and an input interface. The display interface displays the information that the system feeds back to the learner, including audio and spelling information and the information fed back by the data storage and statistics module 122. The input interface is used to select a memory mode or learning content, or to exit the system. The interactive module 111 presents language learning material, selected by the learner or automatically by the system, such as a word, a phrase or a passage of text, to the learner in text or audio form for the purpose of language memorization. An audio prompt is the pronunciation, provided by the system, of the content to be memorized, and the learner is required to spell it and read after it. A spelling prompt is the spelling content to be memorized as provided by the system, such as some of the letters of a word or some of the words of a sentence. Following the prompt, the learner spells and reads aloud the content to be memorized at the same time, and thus memorizes it through pronunciation and spelling together.
The text acquisition module 113 collects the content to be memorized as spelled by the learner, and obtains the input text.
The text spelling detection module 123 checks the input text. By computing the edit distance (Levenshtein distance) between the input text and the model answer text, it obtains the specific spelling errors, such as substitution, insertion and deletion errors.
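The edit distance itself can be computed with the usual two-row dynamic program shown below. The function names and the normalized similarity score are illustrative additions; the text only states that the Levenshtein distance between the input and the model answer is used.

```python
def levenshtein(answer, typed):
    """Minimum number of substitutions, insertions and deletions
    needed to turn `typed` into `answer` (unit cost each)."""
    prev = list(range(len(typed) + 1))
    for i, ca in enumerate(answer, 1):
        cur = [i]
        for j, ct in enumerate(typed, 1):
            cur.append(min(prev[j] + 1,                # letter missing from typed
                           cur[j - 1] + 1,             # extra letter in typed
                           prev[j - 1] + (ca != ct)))  # match or substitution
        prev = cur
    return prev[-1]

def spelling_similarity(answer, typed):
    """Hypothetical score in [0, 1]: 1.0 means identical strings."""
    longest = max(len(answer), len(typed))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(answer, typed) / longest
```

For instance, `levenshtein("receive", "recieve")` is 2, because without a transposition operation the swapped letters count as two substitutions. Listing which letters were substituted, inserted or deleted would require a backtrace through the full table rather than the two rows kept here.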
The data storage and statistics module 122 compiles error statistics from the speech errors and spelling errors obtained, records the learner's specific phoneme pronunciation errors, prosody errors and spelling errors, and gives an evaluation score and feedback, which are displayed through the interactive module 111. The data storage and statistics module 122 includes a database, to which the specific error statistics are written in a timely manner. The database stores not only the learner's learning records but also the learning content, including the corresponding multimedia information and model answers. Based on the current user's errors, the selected memory mode and the learning content stored in the database, the system selects and generates new learning content together with audio and spelling prompts, so that the next round of interactive memorization begins. The learner can also reselect the learning content according to the current learning progress, or exit this subsystem.
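One way to picture how recorded errors could drive the choice of the next material is the sketch below. The patent does not give the formula of its memory curve, so everything here is an assumption: an exponential forgetting curve, a per-item stability that shrinks with the number of recorded errors, and record fields named `item`, `last_review`, `errors` and `base_stability`.

```python
import math

def retention(hours_since_review, stability_hours):
    """Assumed exponential forgetting curve: estimated recall probability."""
    return math.exp(-hours_since_review / stability_hours)

def next_items(records, now_hours, count=2):
    """Pick the items the learner is most likely to have forgotten.

    Each record is a dict with the hypothetical fields 'item',
    'last_review' (hours), 'errors' and 'base_stability' (hours).
    More recorded errors shorten the assumed stability.
    """
    def estimated_recall(rec):
        stability = rec["base_stability"] / (1 + rec["errors"])
        return retention(now_hours - rec["last_review"], stability)
    ranked = sorted(records, key=estimated_recall)
    return [rec["item"] for rec in ranked[:count]]

records = [
    {"item": "apple", "last_review": 0, "errors": 0, "base_stability": 24},
    {"item": "banana", "last_review": 0, "errors": 3, "base_stability": 24},
    {"item": "cherry", "last_review": 20, "errors": 0, "base_stability": 24},
]
chosen = next_items(records, now_hours=24)
```

With these sample records, the item with three recorded errors is scheduled first and the recently reviewed item is left for later, which matches the idea of feeding error statistics back into the next round of memorization.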
Fig. 4 is a schematic diagram of the third embodiment of the interactive language learning system. It differs from the second embodiment mainly in that a user interface 114 is added to the user side 11, an operation discrimination module 124 and a dialogue scenario module 125 are added to the data processing side 12, and the functions of the interactive module 111 and of the data storage and statistics module 122, which are directly connected to these three modules, are extended accordingly. The third embodiment combines language memorization with dialogue, so that all four elements of language learning, listening, speaking, reading and writing, are fully practiced. It ties the practice to specific scenes, so that the learner learns through dialogue how the language is used on specific occasions.
The interactive module 111 is the interface device facing the learner. It shows the learner a set of dialogue scenarios in the form of tasks, such as asking for directions, buying groceries, traveling and other scenes in which the language is used, in which a task specified by the system is to be completed. After the learner selects a dialogue scenario through this module, subtasks such as dialogue, spelling, reading after a prompt and making selections appear one after another. The learner completes the task by performing interactive operations and inputting speech and text according to the information given in the dialogue scenario.
The user interface 114 collects the learner's interactive operations with the system, for example controlling direction with the keyboard or making selections with the mouse, and obtains the learner's specific choice of dialogue content or answer.
The operation discrimination module 124 judges whether the learner's interactive operations meet the task requirements, and obtains the operation errors.
The data storage and statistics module 122 compiles error statistics from the speech errors, spelling errors and operation errors obtained, records the learner's specific phoneme pronunciation errors, prosody errors, spelling errors and operation errors, and gives an evaluation score, which is displayed through the interactive module 111. The data storage and statistics module 122 includes a database, to which the specific error statistics are written in a timely manner. The database stores the learner's learning records and the learning content, including the corresponding multimedia information and model answers, and also stores dialogue-related information such as dialogue scenario information and task information.
The dialogue scenario module 125 dynamically generates a new dialogue scenario from the error statistics output by the data storage and statistics module 122 and from the dialogue scenario and task information, and shows it to the learner through the interactive module 111. Through the interactive module 111 the learner can choose to enter a new round of dialogue scenario learning or to stop learning.
The above interactive language learning system can be implemented in several ways, for example in a network-based client/server mode, in a network-based browser/server mode, or in a standalone mode based on an embedded system.
Network-based client/server mode: the client is the learner's access terminal. It provides speech input, text input, audio playback and mouse and keyboard operation, and performs silence detection and feature extraction on the input audio as well as network transmission and dialogue scenario generation. The server performs mispronunciation detection on the input speech, word stress pattern detection, prosody detection, spelling checking, error feedback, help option feedback, dialogue scenario content generation, database operations, learning statistics and network transmission.
Network-based browser/server mode: the browser is the learner's access terminal. It provides speech input, text input, audio playback, mouse and keyboard operation, network transmission and the dialogue scenarios, and performs silence detection and feature extraction on the input audio through a plug-in. The server side comprises a data processing server and a Web server. The data processing server performs mispronunciation detection on the input speech, word stress pattern detection, prosody detection, spelling checking, error feedback, help option feedback, dialogue content generation, database operations, learning statistics and network transmission. The Web server is the access server for the browser, and data is transmitted directly between the browser and the data processing server.
Standalone mode based on an embedded system: speech input, text input, audio playback, audio silence detection, audio feature extraction, mispronunciation detection on the input speech, word stress pattern detection, prosody detection, spelling checking, error feedback, dialogue content generation, database operations and learning statistics are all performed within a single program framework.
The interactive language learning system described above provides an interactive language learning platform on which the learner can fully practice the four elements of language learning: listening, speaking, reading, and writing. It integrates the stages of language learning, offers a scenario-dialogue learning format with a high degree of freedom to raise the learner's interest and encourage active participation, and provides real-time error judgment and feedback.
The system performs real-time phone-level (Phone-level) mispronunciation (Mispronunciation) detection and prosody (Prosody) detection on the learner's input audio. Prosody detection comprises word-level lexical stress (Lexical stress) pattern detection with correction help, and prosody detection with imitation help. Phone-level mispronunciation detection performs phone-level speech recognition on the input speech and points out the specific phonemes that were pronounced incorrectly. Word-level lexical stress pattern detection and correction help performs word-level recognition on the phoneme sequence obtained from phone-level detection, identifies the stress pattern of each word, and reports the type of error relative to the correct stress pattern. Prosody detection and imitation help analyzes the sentence stress (Sentence Stress), rhythm (Rhythm), and intonation (Intonation) of the spoken sentence, compares them with the prosody of the standard-pronunciation sentence, and gives a prosody evaluation together with help options for imitating the standard-pronunciation sentence. The learner can thus identify exactly which parts of his or her pronunciation are wrong. The system also combines the feedback results with a memory curve to provide memorization content dynamically, so that the learner can improve language proficiency step by step.
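The word-level stress comparison described above can be illustrated with a small sketch. It assumes that a detector has already produced one stress label per syllable and only shows the comparison against the correct pattern. The label values (1 for primary, 2 for secondary, 0 for unstressed), the function names, and the error tuples are assumptions for this example, not the patent's representation.

```python
from typing import List, Optional, Tuple

PRIMARY, SECONDARY, UNSTRESSED = 1, 2, 0

# (error type, expected value, detected value)
StressError = Tuple[str, object, object]


def primary_position(pattern: List[int]) -> Optional[int]:
    """Return the 0-based index of the primary-stressed syllable, if any."""
    return pattern.index(PRIMARY) if PRIMARY in pattern else None


def compare_stress(reference: List[int], detected: List[int]) -> List[StressError]:
    """Compare a detected per-syllable stress pattern with the reference."""
    if len(reference) != len(detected):
        # Patterns of different length cannot be compared syllable by syllable.
        return [("syllable_count", len(reference), len(detected))]
    errors: List[StressError] = []
    ref_p, det_p = primary_position(reference), primary_position(detected)
    if ref_p != det_p:
        errors.append(("primary_stress", ref_p, det_p))
    ref_s = [i for i, s in enumerate(reference) if s == SECONDARY]
    det_s = [i for i, s in enumerate(detected) if s == SECONDARY]
    if ref_s != det_s:
        errors.append(("secondary_stress", ref_s, det_s))
    return errors


if __name__ == "__main__":
    # Three-syllable word with reference stress on the second syllable,
    # spoken with stress on the first syllable.
    print(compare_stress([0, 1, 0], [1, 0, 0]))
```

Each returned tuple names the error type with the expected and detected positions, which is the kind of information a correction help option could present to the learner.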
The embodiments above describe only several implementations of the present invention. Although the description is relatively specific and detailed, it should not be construed as limiting the scope of the claims. It should be noted that a person of ordinary skill in the art may make a number of variations and improvements without departing from the concept of the invention, and all of these fall within the scope of protection of the invention. The scope of protection of this patent shall therefore be determined by the appended claims.

Claims (12)

1. An interactive language learning system, characterized in that it comprises:
a voice acquisition module, configured to collect the learner's speech data;
a pronunciation and prosody detection module, configured to extract from the speech data the feature parameters used for pronunciation and prosody error detection, to further judge the errors and control the degree to which errors are displayed, and to obtain the final phoneme errors and prosody errors;
a data storage and statistics module, configured to record the phoneme errors and prosody errors, to give an overall evaluation of the learner's pronunciation based on these errors, and to feed the evaluation result back to the interactive module;
an interactive module comprising a display interface, the display interface being configured to display the phoneme errors and prosody errors, the overall evaluation of the learner's pronunciation, and help options, and to provide a pronunciation prompt;
the pronunciation and prosody detection module comprising:
a feature extraction module, configured to extract from the speech data the feature parameters used for pronunciation and prosody error detection;
a speech recognition module, configured to recognize the feature parameters based on an acoustic model combined with a language model or a speech network, obtaining a word sequence, a phoneme sequence, the corresponding time boundaries, and likelihood values;
a pronunciation evaluation module, configured to compare and align the recognized phoneme sequence with the system's reference phonemes, obtaining phoneme errors and help options;
a prosody detection module, configured to combine the feature parameters, phoneme sequence, time boundary information, and likelihood values and to apply a statistical model to obtain the lexical stress pattern, the whole-sentence intonation, and the temporal rhythm;
a prosody evaluation module, configured to compare the lexical stress pattern, whole-sentence intonation, and temporal rhythm with the reference pronunciation, obtaining prosody errors and help options.
2. The interactive language learning system according to claim 1, characterized in that the speech data collected by the voice acquisition module comprises speech data obtained when the learner reads along with the pronunciation prompt provided by the system or speaks according to a pronunciation scenario.
3. The interactive language learning system according to claim 1, characterized in that the pronunciation evaluation module first uses a statistical method, combining the word sequence, phoneme sequence, time boundaries, and likelihood values, to judge the content at the word level; if the content does not match, the system records a content error, indicates in the interactive module that the content of the whole sentence does not meet the requirement, and asks the learner to input the speech again; otherwise it performs phoneme detection and obtains the phoneme errors, which comprise insertion, deletion, and substitution errors of phonemes within a word.
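The insertion, deletion, and substitution errors named in claim 3 can be illustrated by aligning the recognized phoneme sequence with the reference sequence. The sketch below uses a standard minimum-edit-distance alignment with unit costs, which is one common way to do this and is not necessarily the alignment the patent uses. The function name, the tuple layout, and the phoneme symbols are assumptions for this example.

```python
from typing import List, Tuple

# (operation, reference phoneme, recognized phoneme)
Edit = Tuple[str, str, str]


def align_phonemes(reference: List[str], recognized: List[str]) -> List[Edit]:
    """Align two phoneme sequences and label each position as a match,
    substitution, deletion (learner dropped a phoneme), or insertion
    (learner added a phoneme)."""
    n, m = len(reference), len(recognized)
    # cost[i][j] = minimum edits turning reference[:i] into recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if reference[i - 1] == recognized[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,
                             cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1)
    # Trace back from the bottom-right corner to recover the operations.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and reference[i - 1] == recognized[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (0 if same else 1):
            edits.append(("match" if same else "substitution",
                          reference[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            edits.append(("deletion", reference[i - 1], ""))
            i -= 1
        else:
            edits.append(("insertion", "", recognized[j - 1]))
            j -= 1
    edits.reverse()
    return edits


if __name__ == "__main__":
    result = align_phonemes(["th", "ih", "ng", "k"], ["s", "ih", "ng", "k"])
    print([e for e in result if e[0] != "match"])
```

Filtering out the "match" entries leaves the list of phoneme errors that would be reported to the learner.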
4. The interactive language learning system according to claim 1, characterized in that the lexical stress pattern is judged in units of syllables and comprises the position of the primary stressed syllable and the position of the secondary stressed syllable in the word; the whole-sentence intonation is the sentence stress of the whole sentence, that is, the positions of the stressed syllables in the sentence, and it reflects the overall fundamental-frequency trend based on syllables and intonation; the temporal rhythm is a judgment of speaking rate and duration.
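The temporal rhythm judgment in claim 4 (speaking rate and duration) can be sketched from the syllable time boundaries that recognition provides. This is a hedged illustration only: the patent obtains the temporal rhythm with a statistical model, whereas the sketch uses a plain rate ratio. The function names and the 20% tolerance are assumptions.

```python
from typing import List, Tuple

# (start, end) of each syllable in seconds
Bounds = List[Tuple[float, float]]


def speaking_rate(bounds: Bounds) -> float:
    """Syllables per second over the span from the first start to the last end."""
    if not bounds:
        raise ValueError("at least one syllable is required")
    total = bounds[-1][1] - bounds[0][0]
    if total <= 0:
        raise ValueError("utterance duration must be positive")
    return len(bounds) / total


def judge_tempo(learner: Bounds, reference: Bounds, tolerance: float = 0.2) -> str:
    """Compare the learner's speaking rate with the reference pronunciation."""
    ratio = speaking_rate(learner) / speaking_rate(reference)
    if ratio > 1 + tolerance:
        return "too fast"
    if ratio < 1 - tolerance:
        return "too slow"
    return "ok"


if __name__ == "__main__":
    reference = [(0.0, 0.5), (0.5, 1.0)]
    learner = [(0.0, 1.0), (1.0, 2.0)]
    print(judge_tempo(learner, reference))
```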
5. The interactive language learning system according to claim 1, characterized in that the pronunciation prompt of the interactive module uses a pronunciation text, the pronunciation text being the learner's target learning content; or uses a reference pronunciation, the reference pronunciation being standard speech produced by a native speaker from a country of the target language; or uses a pronunciation scenario, the pronunciation scenario being a scenario provided by the system in which the learner is required to speak.
6. The interactive language learning system according to claim 1, characterized in that the interactive module further comprises an input interface, the input interface being used to select a memory mode, select learning content, or exit the system; the display interface is further used to display information fed back by the system, including audio and spelling information and the information fed back by the data storage and statistics module; the interactive module selects language learning material and prompts the learner by audio or by text; an audio prompt is a pronunciation, provided by the system, of the content to be memorized, and the learner is required to spell it and read along; a spelling prompt is a text prompt, provided by the system, for the spelling content to be memorized, and the learner is required to spell it, yielding the spelling content;
the interactive language learning system further comprises a text acquisition module and a text spelling detection module; the text acquisition module is used to collect the spelling content, obtaining an input text; the text spelling detection module is used to check the input text and obtains spelling errors by calculating the edit-distance similarity between the input text and the model answer text;
the data storage and statistics module is further used to record the spelling errors; the data storage and statistics module further comprises a database, to which the detailed error statistics are written promptly, and which stores not only learning records but also learning content; based on the current error record, the selected memory mode, and the learning content stored in the database, the system selects and generates new learning content together with audio and spelling prompts and feeds them back to the interactive module, thereby entering the next round of interactive learning; alternatively, the learner reselects the learning content according to current learning progress, or exits the system.
7. The interactive language learning system according to claim 6, characterized in that the spelling errors comprise substitution, insertion, and deletion errors.
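The edit-distance similarity between the input text and the model answer text (claims 6 and 7) can be illustrated as follows. The sketch computes the Levenshtein distance, counting each substitution, insertion, and deletion as one edit, and normalizes it by the longer string's length. The normalization and the function names are assumptions, since the patent does not give a formula.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion from a
                            curr[j - 1] + 1,            # insertion into a
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def spelling_similarity(input_text: str, answer: str) -> float:
    """Similarity in [0, 1]; 1.0 means the input equals the model answer."""
    longest = max(len(input_text), len(answer))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(input_text, answer) / longest


if __name__ == "__main__":
    print(edit_distance("recieve", "receive"))
    print(round(spelling_similarity("recieve", "receive"), 3))
```

Note that plain Levenshtein distance counts a transposition such as "ie" for "ei" as two edits, which is consistent with claim 7 listing only substitution, insertion, and deletion errors.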
8. The interactive language learning system according to claim 7, characterized in that the interactive module is further used to display a set of dialogue scenarios in the form of tasks; after a dialogue scenario is selected in the interactive module, subtasks appear, and the learner completes the tasks by performing interactive operations, pronouncing, and spelling according to the information provided by the interactive module;
the interactive language learning system further comprises a user interface and an operation judgment module; the user interface is used to collect the interactive operations; the operation judgment module is used to judge whether the interactive operations meet the task requirements, obtaining operation errors;
the data storage and statistics module is further used to record the operation errors, and the database also stores dialogue-related information;
the interactive language learning system further comprises a dialogue scenario module, which dynamically generates a new dialogue scenario based on the error statistics and dialogue-related information output by the data storage and statistics module and displays it through the interactive module; the learner can choose through the interactive module to enter a new round of learning or to exit.
9. The interactive language learning system according to claim 1, characterized in that the interactive language learning system is implemented in one of a client/server mode, a browser/server mode, and a stand-alone mode based on an embedded system.
10. An interactive language learning method, comprising:
collecting the speech data obtained when the learner pronounces as required by the program;
extracting from the speech data the feature parameters used for pronunciation and prosody error detection;
recognizing the feature parameters based on an acoustic model combined with a language model or a speech network, obtaining a word sequence, a phoneme sequence, the corresponding time boundaries, and likelihood values;
comparing and aligning the phoneme sequence with the system's reference phonemes, obtaining phoneme errors and help options;
combining the feature parameters, phoneme sequence, and time boundary information and applying a statistical model to obtain the lexical stress pattern, whole-sentence intonation, and temporal rhythm;
comparing the lexical stress pattern, whole-sentence intonation, and temporal rhythm with the reference pronunciation, obtaining prosody errors and help options;
displaying the phoneme errors and prosody errors, the overall evaluation of the pronunciation, and the help options, and providing a pronunciation prompt.
11. The interactive language learning method according to claim 10, characterized in that it further comprises the following steps:
before collecting the speech data, outputting the memorization material as audio or text and requiring the learner to pronounce and spell it;
collecting the spelling content to be memorized, obtaining an input text;
checking the input text, obtaining spelling errors;
compiling error statistics from the phoneme, prosody, and spelling errors obtained, recording the specific phoneme errors, prosody errors, and spelling errors, and giving an evaluation score and feedback information;
displaying the evaluation score and feedback information;
receiving an instruction to select a memory mode, select learning content, or exit the program.
12. The interactive language learning method according to claim 11, characterized in that it further comprises the following steps:
displaying a dialogue scenario, the learner pronouncing, spelling, and performing interactive operations as the dialogue scenario requires;
collecting the interactive operations;
judging whether the interactive operations meet the task requirements, obtaining operation errors;
compiling error statistics from the phoneme, prosody, spelling, and operation errors obtained, recording the specific phoneme, prosody, spelling, and operation errors, and giving an evaluation score and feedback;
dynamically generating a new dialogue scenario and displaying it.
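The error statistics and evaluation score steps in claims 11 and 12 can be illustrated with a short sketch. The claims do not say how the score is computed, so the penalty weights, the 100-point scale, and the function names below are purely hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter
from typing import Iterable

# Hypothetical penalty per error type; the patent does not define these values.
PENALTY = {"phoneme": 5, "prosody": 3, "spelling": 4, "operation": 2}


def tally_errors(errors: Iterable[str]) -> Counter:
    """Count the errors of each type recorded in one learning round."""
    counts = Counter(errors)
    unknown = set(counts) - set(PENALTY)
    if unknown:
        raise ValueError(f"unknown error types: {sorted(unknown)}")
    return counts


def evaluation_score(counts: Counter) -> int:
    """Map error counts to a score from 0 to 100 (assumed scale)."""
    penalty = sum(PENALTY[kind] * n for kind, n in counts.items())
    return max(0, 100 - penalty)


if __name__ == "__main__":
    round_errors = ["phoneme", "phoneme", "spelling"]
    counts = tally_errors(round_errors)
    print(dict(counts), evaluation_score(counts))
```

The per-type counts are what a data storage and statistics step would persist, and the same counts could drive the choice of the next dialogue scenario.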
CN2009101887026A 2009-12-03 2009-12-03 Interactive language learning system and method Active CN101739870B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101887026A CN101739870B (en) 2009-12-03 2009-12-03 Interactive language learning system and method

Publications (2)

Publication Number Publication Date
CN101739870A CN101739870A (en) 2010-06-16
CN101739870B true CN101739870B (en) 2012-07-04

Family

ID=42463295

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant