CN1851779A - Multi-language available deaf-mute language learning computer-aid method - Google Patents

Multi-language available deaf-mute language learning computer-aid method

Info

Publication number
CN1851779A
CN1851779A, CNA2006100607787A, CN200610060778A
Authority
CN
China
Prior art keywords
pronunciation
user
deaf
mute
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2006100607787A
Other languages
Chinese (zh)
Other versions
CN1851779B (en)
Inventor
黄中伟
杨磊
孙宏元
蒙山
徐�明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN2006100607787A priority Critical patent/CN1851779B/en
Priority to JP2009510256A priority patent/JP5335668B2/en
Priority to PCT/CN2006/001917 priority patent/WO2007134494A1/en
Publication of CN1851779A publication Critical patent/CN1851779A/en
Application granted granted Critical
Publication of CN1851779B publication Critical patent/CN1851779B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00: Teaching, or communicating with, the blind, deaf or mute
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00: Teaching not covered by other main groups of this subclass
    • G09B19/06: Foreign languages
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00: Teaching, or communicating with, the blind, deaf or mute
    • G09B21/009: Teaching or communicating with deaf persons
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00: Electrically-operated educational appliances
    • G09B5/06: Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B7/00: Electrically-operated teaching apparatus or devices working with questions and answers
    • G09B7/02: Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented, or wherein the machine gives an answer to the question presented by a student

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The method of the present invention comprises: the user selects the language and the speech unit to be learned; the display shows the pronunciation to be learned and the points to note; the speech receiver is started and receives the speech signal, which undergoes analog-to-digital conversion; the central processor extracts phonetic features and judges the accuracy of the pronunciation; and the display shows the pronunciation accuracy. Compared with the prior art, the present invention makes use of the hardware of a multimedia PC, combining computer graphics technology with multimedia speech technology. Because the intended users have different mother tongues, different written languages of instruction are used during the aided teaching process, effectively helping deaf-mutes of different countries learn the pronunciation of their native language.

Description

Multilingual computer-aided pronunciation learning method for deaf-mutes
Technical field
The present invention relates to a method of learning pronunciation with the aid of a computer, and in particular to a computer-aided learning method for the speech pronunciation of deaf-mutes with different mother tongues.
Background technology
Deaf-mutes are a particular group among the disabled. Outwardly they are no different from able-bodied people; only the loss of the ability to articulate normally has raised an almost insurmountable barrier between them and their social environment, so that the overwhelming majority of them end up at the bottom of society, living in poverty all their lives. To fundamentally change this situation, passive reliance on the understanding and concern of society and the support of governments is not enough; what is indispensable is that deaf-mutes actively acquire the ability to communicate with the people around them. At present many deaf-mutes have learned to use sign language, but few able-bodied people can understand or use it, so sign language largely confines their communication to other deaf-mutes. It follows that to improve the circumstances of deaf-mutes and give them the same capacity for life and work as able-bodied people, teaching them to pronounce words normally is indispensable.
Among the large deaf-mute population, the overwhelming majority are "deaf" people with intact vocal organs who fully possess the physiological conditions for producing speech. Having lost their hearing through congenital or acquired causes, however, they cannot use their auditory system to correct the sounds their vocal organs produce; the vocal organs gradually degenerate, the ability to speak is finally lost, and they become "deaf-mutes". Experience shows that for such "deaf" people, if the correctness of their pronunciation can be fed back at any time by some effective means, persistent training can fully help them recover the ability to speak. It has been reported that students at a certain Chinese training centre for deaf children, after several years of hard study under the concentrated guidance of their teachers, not only learned to speak like normal people but even learned to perform crosstalk and tongue twisters. Under the traditional approach, feedback and correction of a "deaf" person's pronunciation is done manually, which means that a parent or teacher must persevere for years, accompanying the deaf-mute day and night in one-to-one teaching exercises, before good results can be obtained.
Summary of the invention
The purpose of this invention is to provide a multilingual computer-aided pronunciation learning method for deaf-mutes. The technical problem it solves is to free parents and teachers from heavy, repetitive language-teaching work, and to give learners of pronunciation an easier opportunity to correct their own pronunciation, so that they learn to speak like normal people faster and more easily.
The present invention adopts the following technical solution, a multilingual computer-aided pronunciation learning method for deaf-mutes comprising the following steps: (1) the user selects the language to be learned according to his own needs; (2) the user selects and confirms the pronunciation unit to be learned; (3) the computer display shows the user the pronunciation to be learned and the points to note; (4) the computer's speech receiver is started and waits for the input of the user's speech signal; (5) the receiver picks up the speech signal produced by the user and performs analog-to-digital conversion; (6) the computer's central processor extracts the required phonetic features from the converted data; (7) the central processor judges the accuracy of the user's pronunciation; (8) the accuracy of the user's pronunciation is shown on the display.
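The eight steps above amount to a simple capture-analyze-report loop. A minimal sketch follows, in Python (the patent does not specify an implementation language); every function name here is a hypothetical placeholder, not something named in the patent.

```python
# Sketch of the eight-step training loop described above.
# capture_audio, extract_features and score are hypothetical
# placeholders standing in for the hardware and model components.

def run_lesson(language, unit, capture_audio, extract_features, score):
    """One pass through steps 1-8 for a chosen language and unit."""
    # Steps 1-3 (language/unit selection and the on-screen prompt) are
    # assumed done; `language` and `unit` carry the user's choices.
    samples = capture_audio()             # steps 4-5: record + A/D convert
    features = extract_features(samples)  # step 6: feature extraction
    accuracy = score(unit, features)      # step 7: accuracy judgement
    return accuracy                       # step 8: shown on the display
```

A real implementation would wire these placeholders to the sound card, the feature extractor and the HMM-based scorer that the description details below.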
When the central processor of the present invention extracts the required phonetic features from the analog-to-digital converted data, it obtains the digital speech data relevant to the user's pronunciation by endpoint detection.
In the method of the present invention, when the user is learning a monosyllabic pronunciation, the central processor computes MFCC parameters based on a μ-law cepstral analysis for the successive short-time speech frames, each parameter vector including the short-time signal energy and the first- and second-order difference components of the parameters; based on DHMM model parameters, the accuracy of the pronunciation is judged by the Viterbi algorithm. When the user is learning the pronunciation of multisyllables, words or sentences, the central processor judges the accuracy of the user's continuous speech by a conventional HMM method.
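The parameter vector described here (cepstral coefficients plus log short-time energy, extended with first- and second-order differences) can be sketched as follows. This is a plain mel-cepstrum computation; the patent's μ-law cepstral variant is not specified in detail, so standard log compression is used instead, and all sizes (8 kHz sampling, 20 mel bands, 12 coefficients) are illustrative assumptions.

```python
import numpy as np

def mfcc_with_deltas(frames, sr=8000, n_mels=20, n_ceps=12):
    """Per-frame cepstral vectors with log energy plus first- and
    second-order differences, as the method describes.  Standard
    mel/log analysis stands in for the unspecified mu-law variant."""
    n_fft = frames.shape[1]
    window = np.hamming(n_fft)
    spec = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2

    # Triangular mel filter bank between 0 Hz and Nyquist.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, spec.shape[1]))
    for m in range(1, n_mels + 1):
        lo, ctr, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, ctr):
            fbank[m - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - ctr, 1)

    logmel = np.log(spec @ fbank.T + 1e-10)
    # DCT-II basis gives the cepstral coefficients; keep n_ceps of them.
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    (2 * np.arange(n_mels) + 1) / (2.0 * n_mels)))
    ceps = logmel @ basis.T
    energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)[:, None]
    static = np.hstack([ceps, energy])

    def delta(x):  # two-frame central difference with edge padding
        padded = np.vstack([x[:1], x, x[-1:]])
        return (padded[2:] - padded[:-2]) / 2.0

    d1 = delta(static)
    return np.hstack([static, d1, delta(d1)])
```

Each frame thus yields 13 static values (12 cepstra plus energy) and their two difference orders, 39 values in all, which is the conventional HMM front-end layout.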
When the speech receiver of the present invention is started, a video capture device is started at the same time to capture the detailed features of the mouth shape while the user pronounces.
When the display of the present invention shows the user the pronunciation to be learned and the points to note, text and animated images are used to prompt the user with the pronunciation and mouth-shape features.
When the display of the present invention uses animated images to prompt the user with the pronunciation and mouth-shape features, it shows front and side animations of the mouth-shape changes and air flow, together with an articulation animation of the anatomical structure of the vocal organs, demonstrating the coordinated movement of the vocal organs.
Adopt the method for end-point detection when the central processing unit of computing machine of the present invention extracts required voice unit, realize importing the preliminary ruling of voice signal initial sum final position by signal calculated energy and zero-crossing rate.
When the speech receiver of the present invention picks up the speech signal produced by the user, a software pre-processing digital filter suppresses ambient noise and pre-emphasizes the high-frequency signal components.
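The pre-emphasis part of such a pre-processing filter is conventionally a first-order difference, y[n] = x[n] - a*x[n-1]; the coefficient 0.97 below is the customary choice in speech front-ends, not a value stated in the patent.

```python
import numpy as np

def preemphasize(x, alpha=0.97):
    """First-order high-pass pre-emphasis boosting high-frequency
    components before analysis.  alpha = 0.97 is a conventional
    value, assumed here rather than taken from the patent."""
    x = np.asarray(x, dtype=float)
    return np.concatenate(([x[0]], x[1:] - alpha * x[:-1]))
```

A constant (DC) input is strongly attenuated after the first sample, which is exactly the high-pass behaviour wanted.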
When the present invention shows the accuracy of the user's pronunciation on the display, a percentage is used to indicate the accuracy, and at the same time the video recorded during the user's pronunciation is replayed and compared with the standard-pronunciation video animation.
The pronunciation unit that the user of the present invention selects and confirms is a phonetic symbol, a word or a sentence.
Compared with the prior art, the present invention makes use of the hardware of a multimedia computer, effectively combining computer graphics technology with multimedia speech technology. Since the intended users come from different countries and have different mother tongues, different written languages of instruction are used in the aided teaching process, effectively helping deaf-mutes of countries such as Britain, the United States, Japan, France, Germany, Russia and Spain learn the pronunciation of their native language, and meeting the pronunciation-learning needs of deaf-mutes with different mother tongues.
Description of drawings
Fig. 1 is the main interface of the learning system of an embodiment of the invention.
Fig. 2 is the English pronunciation learning function selection interface of embodiment 1.
Fig. 3 is the International Phonetic Symbol selection interface of embodiment 1.
Fig. 4 is the pronunciation teaching interface for the International Phonetic Symbol [a:] of embodiment 1.
Fig. 5 is the test result and feedback diagram for the International Phonetic Symbol [a:] of embodiment 1.
Fig. 6 is the input interface for the English word "car" of embodiment 2.
Fig. 7 is the pronunciation learning interface for the English word "car" of embodiment 2.
Fig. 8 is the pronunciation teaching interface for the English sentence "This is my car" of embodiment 3.
Fig. 9 is the Japanese pronunciation learning function selection interface of embodiment 4.
Fig. 10 is the hiragana selection interface for "い" of embodiment 4.
Fig. 11 is the pronunciation teaching interface for the hiragana "い" of embodiment 4.
Fig. 12 is the test result and feedback diagram for the hiragana "い" of embodiment 4.
Fig. 13 is the French pronunciation learning function selection interface of embodiment 5.
Fig. 14 is the input interface for the French word "vélo" of embodiment 5.
Fig. 15 is the pronunciation learning interface for the French word "vélo" of embodiment 5.
Fig. 16 is the test result and feedback diagram for the French word "vélo" of embodiment 5.
Fig. 17 is the pronunciation learning interface for the French sentence "C'est mon vélo" of embodiment 6.
Embodiment
The present invention is described in further detail below in conjunction with the drawings and embodiments. The multilingual computer-aided pronunciation learning method for deaf-mutes of the present invention comprises the following steps:
1. The user selects the language to be learned according to his own needs.
2. The user selects and confirms the pronunciation unit to be learned; the unit is a phonetic symbol, a word or a sentence.
3. The computer display shows the user the pronunciation to be learned and the points to note, using text and animated images to prompt the pronunciation and mouth-shape features: front and side animations of the mouth-shape changes and air flow, and an articulation animation of the anatomical structure of the vocal organs, demonstrating their coordinated movement.
4. The computer's speech receiver is started and waits for the input of the user's speech signal; at the same time a video capture device is started to capture the detailed features of the user's mouth shape during pronunciation.
5. The receiver picks up the speech signal produced by the user; a software pre-processing digital filter suppresses ambient noise and pre-emphasizes the high-frequency components, and the signal is analog-to-digital converted.
6. The computer's central processor extracts the required phonetic features from the converted data, obtaining the digital speech data relevant to the user's pronunciation by endpoint detection: a preliminary decision on the start and end positions of the input speech signal is made by computing the signal energy and zero-crossing rate.
7. The central processor judges the accuracy of the user's pronunciation. For a monosyllabic pronunciation, it computes MFCC parameters based on a μ-law cepstral analysis for the successive short-time speech frames, each parameter vector including the short-time signal energy and the first- and second-order difference components of the parameters; based on DHMM model parameters, the accuracy is judged by the Viterbi algorithm. For multisyllables, words and sentences, the accuracy of the user's continuous speech is judged by a conventional HMM method.
8. The accuracy of the user's pronunciation is shown on the display as a percentage, and at the same time the video recorded during the user's pronunciation is replayed and compared with the standard-pronunciation video animation.
Embodiment 1: a deaf-mute from an English-speaking country learns the pronunciation of the International Phonetic Symbol [a:]. The computer used in this embodiment has an AMD 2500+ CPU, 1 GB of memory, a 160 GB Seagate SATA hard disk and a BenQ FP71G+ display; the sound card is the motherboard's integrated AC'97 chip; the multimedia speakers are Xfree XE233; the speech receiver is a Voiceao VA-800MV microphone; the video capture device is a Liangtian Camera-168 camera; the operating system is Microsoft Windows XP Professional, Version 2002, Service Pack 2; and the computer-aided learning software for deaf-mute pronunciation is software programmed according to the method of the invention: "Audio-Video Bimodal Pronunciation Learning System for Deaf-Mute", Version 1.0.
First, the multimedia computer on which the software programmed according to the method of the invention has been successfully installed is switched on, and the assisted learning software system is started. As shown in Fig. 1, after the learning system has detected that all the necessary hardware is correctly installed, it enters the main interface of the assisted learning system, which is in English. The main interface offers a language choice by which the user selects both the language whose pronunciation is to be learned and the written language of instruction used during teaching. In this embodiment the user is a deaf-mute from an English-speaking country, the language to be learned is English and the language of instruction is likewise English, so the user selects "English". The system then enters English pronunciation teaching with English as the language of instruction.
As shown in Fig. 2, after selecting "English" the user sees three English options: "Phonetic Symbol", "Common Word" and "Common Sentence". In this embodiment the user is learning the pronunciation of the International Phonetic Symbol [a:], so he clicks the "Phonetic Symbol" option on the screen with the left mouse button. As shown in Fig. 3, after entering this option all 48 International Phonetic Symbols are displayed on the computer screen.
As shown in Fig. 4, the user clicks the International Phonetic Symbol [a:] with the left mouse button and enters the learning interface for the symbol [a:]. The computer screen shows the symbol [a:] and its pronunciation prompt: "Open your mouth naturally and let your tongue off to pronounce. Bear in mind to lay your tongue as low as possible and keep the tip of your tongue away from teeth. Remember to low your chin and relax your tongue, then you can pronounce smoothly." At the same time the screen shows front and side animations of the mouth-shape changes and air flow of the standard demonstrator correctly pronouncing [a:], together with an articulation animation of the anatomical structure of the vocal organs.
The user clicks the "Prepare for test" button at the lower right of the screen; the screen then shows the real-time picture captured by the video capture device and prompts "Please adjust the position of your head correctly". The user adjusts the position and angle of his head so that the video capture device can stably and clearly capture his mouth shape and facial features during pronunciation. When the adjustment is finished, the user clicks the button again. From that click, the audio and video capture devices each record a signal 10 seconds long, and a ten-second countdown is shown on the computer screen. Within those ten seconds the user pronounces [a:] into the audio capture device. After ten seconds, the audio and video capture devices stop recording.
In learning the pronunciation of English phonetic symbols, learners generally lengthen the duration of a syllable deliberately in order to grasp its pronunciation characteristics accurately. The system therefore uses a DHMM (Duration HMM) method, which includes statistical information on pronunciation duration, to judge and assess the learner's pronunciation. The DHMM model contains not only first- and second-order statistical descriptions of the pronunciation states but also a statistical description of the duration of each pronunciation state, so the DHMM can effectively assess the accuracy of the current phonetic-symbol pronunciation.
In the system's judgement of a pronunciation, the CPU obtains the start-time and end-time positions of the user's pronunciation by an endpoint detection method based on an energy decision. The current continuous digital audio data stream is divided into short frames, with a certain overlap maintained between successive short-time frames. The digital audio data is windowed with a Hamming window and then passed through the software pre-processing digital filter, which suppresses ambient noise and pre-emphasizes the high-frequency components. To represent the features of the current pronunciation effectively, MFCC (Mel-Frequency Cepstral Coefficient) parameters based on μ-law cepstral analysis are computed for the successive short-time speech frames, each parameter vector including the short-time signal energy and the first- and second-order difference components of the parameters. The invention establishes a DHMM model bank for English phonetic-symbol pronunciation learning in the assisted learning system; based on its model parameters, the central processor searches the pronunciation state transition path of the current user's pronunciation with the Viterbi algorithm. During the iterative search the activation duration of each state is accumulated, and the accuracy of the user's current pronunciation is judged from the state-duration statistics in the DHMM model combined with the state transition and state output probabilities. In the present invention the learner's result is given on a centesimal (percentage) scale.
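The Viterbi state-path search with per-state duration bookkeeping described here can be sketched as follows, in log domain over generic model parameters; the system's trained DHMM bank is not published, so any values fed to this function are stand-ins.

```python
import numpy as np

def viterbi_path(log_init, log_trans, log_emit):
    """Most-likely state path through an HMM (Viterbi algorithm),
    plus the number of frames each state stayed active -- the
    duration statistic a DHMM combines with transition and output
    probabilities.  Returns (path, durations, best_log_score)."""
    T, N = log_emit.shape
    delta = np.full((T, N), -np.inf)        # best log score ending in state j
    back = np.zeros((T, N), dtype=int)      # argmax predecessor
    delta[0] = log_init + log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # (from, to)
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(N)] + log_emit[t]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):           # backtrack
        path.append(int(back[t][path[-1]]))
    path.reverse()
    durations = np.bincount(path, minlength=N)  # frames spent per state
    return path, durations, float(delta[-1].max())
```

For a two-state left-to-right model whose observations favour state 0 for three frames and state 1 for three frames, the decoded path switches exactly once and each state accumulates a duration of three frames.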
As shown in Fig. 5, when the speech evaluation is finished the score of the pronunciation is shown as a percentage at the top of the computer screen; this percentage indicates the degree of similarity between the user's pronunciation and the standard pronunciation and is used to evaluate the accuracy of the pronunciation quantitatively. At the same time, the video recording of the user's mouth-shape and facial changes while pronouncing [a:], captured by the video capture device, is replayed repeatedly on the screen. The user can also click the "Standard Video" button on the right of the screen; after clicking it, the display plays the standard mouth-shape video of the demonstrator pronouncing [a:]. The user can compare his own pronunciation mouth shape with the standard mouth shape and repeatedly correct his pronunciation according to this feedback.
At this point the screen offers three options at the bottom: "Try again", "Choose new section" and "Quit". If the user selects "Try again", the system carries out the supplementary pronunciation training of the International Phonetic Symbol [a:] again. If the user selects "Choose new section", the system returns to the function selection interface and waits for the user to select new pronunciation training content. If the user selects "Quit", the program exits and returns to the operating system.
Embodiment 2: a deaf-mute from an English-speaking country learns the pronunciation of the English word "car". The software, hardware and computer-aided learning software are the same as in embodiment 1, as is the way the learning system is started. As shown in Fig. 1, the user first enters the main interface and selects "English". As shown in Fig. 2, among the three options "Phonetic Symbol", "Common Word" and "Common Sentence" the user clicks "Common Word" with the left mouse button. As shown in Fig. 6, after entering this option the computer screen shows a dialog box below which is the English prompt "Please input the word". The user types "car" into the dialog box with the keyboard, presses the Enter key and enters the pronunciation learning interface for the English word "car".
As shown in Fig. 7, on entering the pronunciation learning interface for the word "car", the user sees the word "car" and its International Phonetic Symbol transcription [ka:] at the top of the screen, while front and side video of a real person's pronunciation is played at slow speed in the middle of the screen.
After the user has learned the pronunciation characteristics of the word "car", he clicks the "Prepare for test" button at the lower right of the screen; the screen then shows the real-time picture captured by the video capture device and prompts "Please adjust the position of your head correctly". The user adjusts the position and angle of his head so that the video capture device can stably and clearly capture his mouth shape and facial features during pronunciation. When the adjustment is finished, the user clicks the button again. From that click, the audio and video capture devices each record a signal 10 seconds long, and a ten-second countdown is shown on the computer screen. Within those ten seconds the user pronounces [ka:] into the audio capture device. After ten seconds, the audio and video capture devices stop recording.
Since the pronunciation of an English word is composed of several phonemes, the method of the invention evaluates the accuracy of word pronunciation with a continuous HMM method that does not include state-duration statistics. The speech signal is analog-to-digital converted by the sound card, and the CPU first divides the collected digital audio data stream into short frames; because the speech signal is short-time stationary, a certain overlap is maintained between successive short-time frames. The digital audio data is windowed with a Hamming window and then passed through the software pre-processing digital filter, which suppresses ambient noise and pre-emphasizes the high-frequency components. A preliminary decision on the start and end positions of the input speech signal within the successive short-time frames is made by computing parameters such as the signal energy and zero-crossing rate. In the preliminarily confirmed digital speech stream, MFCC parameters based on μ-law cepstral analysis are computed for the successive short-time frames, each parameter vector including the short-time signal energy and the first- and second-order difference components of the parameters. The assisted learning system stores a set of speaker-independent hidden Markov models (HMMs) covering all training scenarios. Based on this model set, the CPU applies the Viterbi optimisation algorithm to the feature vector sequence obtained from feature extraction. Since the content of the speech is known, the optimal state transition path of the learner's pronunciation is found by the Viterbi algorithm, and from the continuous-HMM output probability corresponding to this optimal path the learner's result is given on a centesimal scale. The subsequent operation is identical with embodiment 1.
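The description only says that the HMM output probability of the optimal path is turned into a centesimal score; one plausible, purely illustrative mapping (not the patented formula) is a clamped linear interpolation between a "floor" log-likelihood for a very poor attempt and a reference log-likelihood for a native-like one.

```python
def percent_score(log_likelihood, ref_loglik, floor_loglik):
    """Map a per-utterance HMM log-likelihood onto the 0-100 scale
    the system displays.  Both anchor values (floor and reference)
    are assumptions of this sketch, not values from the patent."""
    span = ref_loglik - floor_loglik
    frac = (log_likelihood - floor_loglik) / span
    return round(100 * min(1.0, max(0.0, frac)))  # clamp to [0, 100]
```

Scores above the reference clamp to 100 and scores below the floor clamp to 0, so the display always stays on the centesimal scale.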
Embodiment 3: a deaf-mute from an English-speaking country learns the pronunciation of the common sentence "This is my car". The software, hardware and computer-aided learning software of this embodiment are the same as in embodiment 1, as is the way the learning system is started. As shown in Fig. 1, the user first enters the main interface and selects "English". Among the three options "Phonetic Symbol", "Common Word" and "Common Sentence" the user clicks "Common Sentence" with the left mouse button. As shown in Fig. 9, after entering this option the computer screen shows a dialog box below which is the English prompt "Please input the sentence". The user types "This is my car" into the dialog box with the keyboard, presses the Enter key and enters the pronunciation learning interface for the English common sentence "This is my car".
As shown in Fig. 8, on entering the pronunciation learning interface for the common sentence "This is my car", the user sees the sentence "This is my car" and its International Phonetic Symbol transcription [ðis iz mai ka:] at the top of the screen. Mouth-shape video of a real person's pronunciation is played in the middle of the screen. The subsequent operation is identical with embodiment 2.
Embodiment 4: a deaf-mute from Japan learns the pronunciation of the Japanese hiragana "い". The software, hardware and computer-aided learning software of this embodiment are the same as in embodiment 1, as is the way the learning system is started.
As shown in Fig. 1, the user first enters the main interface and selects "Japanese". As shown in Fig. 9, among the five options "平仮名ひらがな" (hiragana), "片仮名カタカナ" (katakana), "単語" (words), "連語" (phrases) and "センテンス" (sentences), the user clicks the "平仮名ひらがな" option with the left mouse button. As shown in Fig. 10, after entering this option the user sees the fifty-sounds chart of hiragana displayed on the computer screen, giving all 51 hiragana in Japanese.
As shown in Figure 11, the user clicks the hiragana "い" with the left mouse button and enters the learning interface for "い". The screen shows the hiragana "い" together with a Japanese pronunciation hint for "い", roughly: "Open the mouth slightly, draw the corners of the mouth back, press the tip of the tongue toward the lower front teeth, and let the breath flow out." At the same time, the screen plays a standard-pronunciation demonstration of a speaker correctly pronouncing "い": front and side animations of the changing mouth shape and airflow, and an articulation animation of the anatomical structure of the vocal organs.
The user clicks the button at the lower right of the screen (a Japanese prompt asking the user to get ready), and the screen then shows the live picture captured by the video-capture device. The user adjusts the position and angle of his or her head so that the device can stably and clearly capture the mouth shape and facial features during pronunciation. When the adjustment is complete, the user clicks the button again. From that moment, the audio-capture and video-capture devices each record a 10-second audio or video signal, and a ten-second countdown is shown on the computer screen. Within those ten seconds, the user pronounces "い" toward the audio-capture device. After ten seconds, both devices stop recording.
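The 10-second recording inevitably contains silence before and after the utterance, so before any scoring the system must locate the speech itself; claim 7 of this patent describes doing so by computing short-time energy and zero-crossing rate. A minimal Python sketch of such an endpoint detector follows; the frame sizes and thresholds are illustrative assumptions, not values from the patent:

```python
import numpy as np

def endpoint_detect(signal, rate, frame_ms=25, hop_ms=10,
                    energy_ratio=0.1, zcr_thresh=0.25):
    """Locate the start and end of speech using short-time energy and
    zero-crossing rate.  Frame sizes and thresholds are illustrative."""
    frame = int(rate * frame_ms / 1000)
    hop = int(rate * hop_ms / 1000)
    n = 1 + max(0, (len(signal) - frame) // hop)
    energy = np.empty(n)
    zcr = np.empty(n)
    for i in range(n):
        w = signal[i * hop: i * hop + frame].astype(float)
        energy[i] = np.sum(w ** 2)                         # short-time energy
        zcr[i] = np.mean(np.abs(np.diff(np.sign(w))) > 0)  # zero-crossing rate
    # a frame counts as speech if it is loud, or moderately loud but
    # rich in zero crossings (typical of unvoiced fricatives)
    voiced = (energy > energy_ratio * energy.max()) | \
             ((energy > 0.02 * energy.max()) & (zcr > zcr_thresh))
    idx = np.flatnonzero(voiced)
    if idx.size == 0:
        return None  # no speech found
    return idx[0] * hop, min(len(signal), idx[-1] * hop + frame)

# toy signal: half a second of silence, a 200 Hz burst, silence again
rate = 8000
t = np.arange(rate // 2) / rate
sig = np.concatenate([np.zeros(rate // 2),
                      0.5 * np.sin(2 * np.pi * 200 * t),
                      np.zeros(rate // 2)])
start, end = endpoint_detect(sig, rate)
```

On real recordings the thresholds would need tuning to the microphone and room noise; the pre-processing filter of claim 8 would run before this step.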
To evaluate the accuracy of hiragana and katakana pronunciation in Japanese, the invention likewise uses the DHMM (Duration HMM) method, which incorporates statistics on pronunciation duration, to assess the learner's pronunciation. The specific evaluation process is identical to that of Embodiment 1. As shown in Figure 12, after the evaluation finishes, the score is shown at the top of the screen as a percentage; this percentage expresses the similarity between the user's pronunciation and the standard pronunciation and gives a quantitative judgment of accuracy. At the same time, the video of the user's mouth shape and facial features recorded while pronouncing "い" is replayed repeatedly on the screen. The user can also click the "standard demonstration" button on the right of the screen; the display then plays the standard-pronunciation video of the demonstrator pronouncing "い". The user can compare his or her own mouth shape with the standard one and use this feedback to correct the pronunciation repeatedly.
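The patent does not spell out the DHMM computation itself. One common way to incorporate duration statistics is a segmental Viterbi search that adds a Gaussian log-penalty for each state's dwell time, and the percentage shown to the user can then be derived from a likelihood ratio against the reference pronunciation. The sketch below illustrates this idea; the duration model, the left-to-right topology and the score-to-percentage mapping are all assumptions rather than the patent's exact algorithm:

```python
import numpy as np

def dhmm_score(obs_loglik, mean_dur, std_dur):
    """Segmental Viterbi score for a left-to-right model whose states are
    visited in order, with a Gaussian log-penalty on each state's duration
    (a simple stand-in for the DHMM's duration statistics).

    obs_loglik: (T, S) per-frame log-likelihood under each state.
    mean_dur, std_dur: per-state duration statistics, in frames."""
    T, S = obs_loglik.shape
    cum = np.vstack([np.zeros(S), np.cumsum(obs_loglik, axis=0)])
    NEG = -1e18
    # best[s, t]: best score with states 0..s-1 covering frames 0..t-1
    best = np.full((S + 1, T + 1), NEG)
    best[0, 0] = 0.0
    for s in range(S):
        for t in range(1, T + 1):
            for d in range(1, t + 1):     # state s occupies the last d frames
                prev = best[s, t - d]
                if prev <= NEG / 2:
                    continue
                pen = -0.5 * ((d - mean_dur[s]) / std_dur[s]) ** 2
                cand = prev + (cum[t, s] - cum[t - d, s]) + pen
                if cand > best[s + 1, t]:
                    best[s + 1, t] = cand
    return best[S, T]

def percent_score(test_ll, ref_ll):
    """Map a log-likelihood ratio to a 0-100 similarity percentage like the
    one shown on screen (this particular mapping is an assumption)."""
    return float(np.clip(100.0 * np.exp(min(0.0, test_ll - ref_ll)), 0.0, 100.0))

# toy two-state model: "good" matches the expected state sequence, "bad" does not
good = np.array([[0, -5]] * 3 + [[-5, 0]] * 3, dtype=float)
bad = np.array([[-5, 0]] * 6, dtype=float)
mean_dur, std_dur = np.array([3.0, 3.0]), np.array([1.0, 1.0])
good_ll = dhmm_score(good, mean_dur, std_dur)   # perfect fit
bad_ll = dhmm_score(bad, mean_dur, std_dur)     # poor fit
score = percent_score(bad_ll, good_ll)          # low percentage
```

The duration penalty is what distinguishes this from a plain Viterbi pass: a learner who holds a sound far too long or too short is penalized even if the spectral features match.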
After the test ends, three options appear at the bottom of the screen (in Japanese): "Repeat the previous test", "New content" and "Exit". If the user selects "Repeat the previous test", the system runs the supplementary pronunciation training for the hiragana "い" again. If the user selects "New content", the system returns to the main interface and waits for the user to choose new pronunciation training. If the user selects "Exit", the program quits and returns to the operating system.
Embodiment 5: a deaf-mute user from France learns the pronunciation of the French word "vélo". The hardware and computer-aided learning software used in this embodiment are identical to those of Embodiment 1, and the learning system is started in the same way.
As shown in Figure 1, the user first enters the system's main interface and selects "French". As shown in Figure 13, from the three options "Transcription Phonétique", "Mot d'usage Courant" and "Phrase d'usage Courant", the user clicks "Mot d'usage Courant" with the left mouse button. As shown in Figure 14, after entering this option the screen displays a dialog box with the French prompt "Importer le Mot". The user types "vélo" into the dialog box with the keyboard, presses the Enter key, and enters the pronunciation learning interface for the word "vélo".
As shown in Figure 15, after entering the learning interface for "vélo", the user sees the word "vélo" at the top of the screen together with its phonetic transcription [velo]. Front and side videos of a real person pronouncing the word play at slow speed in the middle of the screen.
After studying the pronunciation of "vélo", the user clicks the button "Préparer pour test" at the lower right of the screen. The screen then shows the live picture captured by the video-capture device together with the prompt "Corrigez la position de votre tête". The user adjusts the position and angle of his or her head so that the device can stably and clearly capture the mouth shape and facial features during pronunciation. When the adjustment is complete, the user clicks the button again. From that moment, the audio-capture and video-capture devices each record a 10-second audio or video signal, and a ten-second countdown is shown on the computer screen. Within those ten seconds, the user pronounces [velo] toward the audio-capture device. After ten seconds, both devices stop recording.
Because the pronunciation of a French word generally also consists of several phonemes, this assisted-learning system likewise evaluates the accuracy of word pronunciation with the continuous-HMM method, which does not include state-duration statistics. The specific evaluation process is identical to that of Embodiment 2.
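Scoring a multi-phoneme word with a continuous HMM typically means concatenating per-phoneme left-to-right models into one word-level model and computing the best-path (Viterbi) log-likelihood of the utterance. A toy sketch under that assumption (state counts, transition probabilities and the inter-phoneme link probability are illustrative, not taken from the patent):

```python
import numpy as np

def lr_phoneme(n_states, stay=0.5):
    """Log-domain transition matrix of one left-to-right phoneme model."""
    A = np.full((n_states, n_states), -np.inf)
    for i in range(n_states):
        A[i, i] = np.log(stay)
        if i + 1 < n_states:
            A[i, i + 1] = np.log(1.0 - stay)
    return A

def word_model(phoneme_As):
    """Concatenate per-phoneme models into one word-level HMM by placing
    them on the block diagonal and linking each model's exit state to the
    next model's entry state (the link probability is an assumed constant)."""
    S = sum(a.shape[0] for a in phoneme_As)
    A = np.full((S, S), -np.inf)
    off = 0
    for a in phoneme_As:
        n = a.shape[0]
        A[off:off + n, off:off + n] = a
        if off + n < S:
            A[off + n - 1, off + n] = np.log(0.5)
        off += n
    return A

def viterbi_loglik(log_A, log_B):
    """Best-path log-likelihood; the path must start in the first state
    and end in the last state of the word model."""
    T, S = log_B.shape
    delta = np.full(S, -np.inf)
    delta[0] = log_B[0, 0]
    for t in range(1, T):
        delta = np.max(delta[:, None] + log_A, axis=0) + log_B[t]
    return delta[-1]

# a two-phoneme word, two states per phoneme; frame t fits state t best
A = word_model([lr_phoneme(2), lr_phoneme(2)])
good = np.where(np.eye(4, dtype=bool), 0.0, -5.0)   # matching utterance
bad = good[::-1].copy()                             # phonemes in reverse order
good_ll = viterbi_loglik(A, good)
bad_ll = viterbi_loglik(A, bad)
```

Because no duration penalty is applied, this plain Viterbi pass is cheaper than the DHMM variant and fits the multi-phoneme case the text describes, where segment durations vary more freely.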
After the evaluation finishes, the score is shown at the top of the screen as a percentage; this percentage expresses the similarity between the user's pronunciation and the standard pronunciation and gives a quantitative judgment of accuracy. At the same time, the video of the user's mouth shape and facial features recorded while pronouncing "vélo" [velo] is replayed repeatedly on the screen. The user can also click the "manière d'agir de prononciation" button on the right of the screen; the display then plays the standard-pronunciation video of the demonstrator pronouncing "vélo" [velo]. The user can compare his or her own mouth shape with the standard one and use this feedback to correct the pronunciation repeatedly.
As shown in Figure 16, after the test ends, three options appear at the bottom of the screen: "répéter le contenu d'études", "choisir nouveau contenu d'études" and "quitter". If the user selects "répéter le contenu d'études", the system runs the supplementary pronunciation training for the word "vélo" again. If the user selects "choisir nouveau contenu d'études", the system returns to the main interface and waits for the user to choose new pronunciation training. If the user selects "quitter", the program quits and returns to the operating system.
Embodiment 6: a deaf-mute user from France learns the pronunciation of the French sentence "C'est mon vélo". The hardware and computer-aided learning software used in this embodiment are identical to those of Embodiment 1, and the learning system is started in the same way. As shown in Figure 1, the user first enters the system's main interface and selects "French". As shown in Figure 13, from the three options "Transcription Phonétique", "Mot d'usage Courant" and "Phrase d'usage Courant", the user clicks "Phrase d'usage Courant" with the left mouse button. After entering this option, the screen displays a dialog box with the French prompt "Importer la Phrase". The user types "C'est mon vélo" into the dialog box with the keyboard, presses the Enter key, and enters the pronunciation learning interface for the sentence "C'est mon vélo".
As shown in Figure 17, after entering the pronunciation learning interface for "C'est mon vélo", the user sees the sentence "C'est mon vélo" at the top of the screen together with its phonetic transcription [sε mɔ̃ velo]. A video of a real person's mouth pronouncing the sentence plays in the middle of the screen. The remaining steps are identical to those of Embodiment 5.
The present invention makes full use of the hardware of a multimedia computer (audio-capture device, video-capture device, display and so on) and organically combines computer graphics and image techniques, including still images and animation, with multimedia speech techniques, including speech recognition and speech evaluation. Because the intended users come from different countries and have different mother tongues, the system uses a different language of instruction in each teaching process, effectively helping deaf-mute users in Britain, the United States, Japan, France, Germany, Russia, Spain and other countries learn the pronunciation of their mother tongue, and thus meets the pronunciation-learning needs of deaf-mutes with different mother tongues. The computer-assisted pronunciation learning method of the invention is simple and easy to use: it runs on any multimedia computer equipped with an audio-capture device and a video-capture device. The accuracy of the learned pronunciation is quantitatively scored as a percentage, and scoring, video playback and other means fully feed back to the learner the differences between his or her articulation and the reference standard.
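The claims sketch a conventional speech front end: high-frequency pre-emphasis (claim 8), short-time framing, MFCC parameters plus short-time energy, and first- and second-order difference components (claim 3). A compact NumPy version of such a front end is shown below; the frame length, hop, filter count and pre-emphasis coefficient are standard textbook values, not values specified by the patent:

```python
import numpy as np

def mfcc_with_deltas(signal, rate, n_mfcc=12, frame=400, hop=160,
                     n_mels=26, nfft=512):
    """MFCC + short-time energy, with first- and second-order differences."""
    # high-frequency pre-emphasis (claim 8)
    x = np.append(signal[0], signal[1:] - 0.97 * signal[:-1]).astype(float)
    win = np.hamming(frame)
    # triangular mel filterbank
    def hz2mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel2hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = mel2hz(np.linspace(hz2mel(0.0), hz2mel(rate / 2.0), n_mels + 2))
    bins = np.floor((nfft + 1) * pts / rate).astype(int)
    fb = np.zeros((n_mels, nfft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # DCT-II matrix for the cepstral transform
    dctm = np.cos(np.pi * np.outer(np.arange(n_mfcc),
                                   np.arange(n_mels) + 0.5) / n_mels)
    n = 1 + max(0, (len(x) - frame) // hop)
    feats = np.empty((n, n_mfcc + 1))
    for i in range(n):
        w = x[i * hop: i * hop + frame] * win
        spec = np.abs(np.fft.rfft(w, nfft)) ** 2
        feats[i, :n_mfcc] = dctm @ np.log(fb @ spec + 1e-10)
        feats[i, n_mfcc] = np.log(np.sum(w ** 2) + 1e-10)  # short-time energy
    delta = np.gradient(feats, axis=0)    # first-order difference (claim 3)
    delta2 = np.gradient(delta, axis=0)   # second-order difference
    return np.hstack([feats, delta, delta2])

# example: one second of a 440 Hz tone sampled at 16 kHz
rate = 16000
sig = np.sin(2 * np.pi * 440 * np.arange(rate) / rate)
F = mfcc_with_deltas(sig, rate)
```

Each row of the returned matrix is one parameter vector of the kind claim 3 describes: cepstral coefficients, the frame's log energy, and their first- and second-order differences, which would then be fed to the DHMM or continuous-HMM scorer.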

Claims (10)

1. A multi-language computer-aided pronunciation learning method for deaf-mutes, comprising the following steps: (1) the user selects the language to be studied according to his or her own needs; (2) the user selects and confirms the pronunciation unit to be learned; (3) the computer display shows the user the pronunciation to be learned and the points to note; (4) the computer's audio-capture device is started and waits for the user's speech input; (5) the audio-capture device receives the speech signal produced by the user and performs analog-to-digital conversion; (6) the computer's central processing unit extracts the required speech features from the output of the analog-to-digital converter; (7) the central processing unit judges the accuracy of the user's pronunciation; (8) the accuracy of the user's pronunciation is shown on the display.
2. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 1, characterized in that: when the central processing unit extracts the required speech features from the output of the analog-to-digital converter, it obtains the digital speech data relevant to the user's pronunciation by endpoint detection.
3. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 2, characterized in that: when the user learns a monosyllabic pronunciation, the central processing unit computes Mel-frequency cepstral coefficient (MFCC) parameters by cepstral analysis of consecutive short-time speech frames, includes the signal's short-time energy and the first- and second-order difference components of the parameters in each parameter vector, and judges the accuracy of the pronunciation with the Viterbi algorithm based on DHMM model parameters; when the user learns the pronunciation of polysyllables, words or sentences, the central processing unit judges the accuracy of the user's continuous articulation by the conventional HMM method.
4. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 3, characterized in that: when the audio-capture device is started, the video-capture device is started at the same time to capture the detailed features of the mouth shape during the user's pronunciation.
5. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 4, characterized in that: when the display shows the user the pronunciation to be learned and the points to note, text and animated images are used to prompt the user with articulation and mouth-shape features.
6. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 5, characterized in that: when the display uses animated images to prompt the user with articulation and mouth-shape features, it shows front and side animations of the changing mouth shape and airflow, and an articulation animation of the anatomical structure of the vocal organs, to demonstrate the movement sequence of the vocal organs.
7. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 6, characterized in that: when the computer's central processing unit extracts the required speech unit, it uses endpoint detection, computing the signal's energy and zero-crossing rate to make a preliminary decision on the start and end positions of the input speech signal.
8. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 7, characterized in that: when the audio-capture device receives the speech signal produced by the user, a software pre-processing digital filter suppresses ambient noise and pre-emphasizes the high-frequency signal components.
9. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 8, characterized in that: when the accuracy of the user's pronunciation is shown on the display, the accuracy is displayed as a percentage, and at the same time the video recorded during the user's pronunciation is replayed for comparison with the standard-pronunciation video animation.
10. The multi-language computer-aided pronunciation learning method for deaf-mutes according to claim 9, characterized in that: the pronunciation unit that the user selects and confirms is a phonetic symbol, a word or a sentence.
CN2006100607787A 2006-05-16 2006-05-16 Multi-language available deaf-mute language learning computer-aid method Expired - Fee Related CN1851779B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN2006100607787A CN1851779B (en) 2006-05-16 2006-05-16 Multi-language available deaf-mute language learning computer-aid method
JP2009510256A JP5335668B2 (en) 2006-05-16 2006-07-31 Computer-aided pronunciation learning support method using computers applicable to various languages
PCT/CN2006/001917 WO2007134494A1 (en) 2006-05-16 2006-07-31 A computer auxiliary method suitable for multi-languages pronunciation learning system for deaf-mute

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2006100607787A CN1851779B (en) 2006-05-16 2006-05-16 Multi-language available deaf-mute language learning computer-aid method

Publications (2)

Publication Number Publication Date
CN1851779A true CN1851779A (en) 2006-10-25
CN1851779B CN1851779B (en) 2010-04-14

Family

ID=37133257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2006100607787A Expired - Fee Related CN1851779B (en) 2006-05-16 2006-05-16 Multi-language available deaf-mute language learning computer-aid method

Country Status (3)

Country Link
JP (1) JP5335668B2 (en)
CN (1) CN1851779B (en)
WO (1) WO2007134494A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102063903A (en) * 2010-09-25 2011-05-18 中国科学院深圳先进技术研究院 Speech interactive training system and speech interactive training method
CN101290720B (en) * 2008-06-17 2011-08-31 北京志诚卓盛科技发展有限公司 Visualized pronunciation teaching method and apparatus
CN102920432A (en) * 2012-10-16 2013-02-13 上海市闸北区民办小小虎幼稚园 Speech audition rehabilitation system and rehabilitation method based on phoneme matrix comparison technology
CN106354767A (en) * 2016-08-19 2017-01-25 语当先有限公司 Practicing system and method
CN109064532A (en) * 2018-06-11 2018-12-21 上海咔咖文化传播有限公司 The automatic shape of the mouth as one speaks generation method of cartoon role and device
CN110010123A (en) * 2018-01-16 2019-07-12 上海异构网络科技有限公司 English phonetic word pronunciation learning evaluation system and method

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN104157181B (en) * 2014-07-22 2017-07-28 雷青云 A kind of language teaching method and system
CN105261246B (en) * 2015-12-02 2018-06-05 武汉慧人信息科技有限公司 A kind of Oral English Practice error correction system based on big data digging technology
IT201800009607A1 (en) * 2018-10-19 2020-04-19 Andrea Previato System and method of help for users with communication disabilities
US11361677B1 (en) 2021-11-10 2022-06-14 King Abdulaziz University System for articulation training for hearing impaired persons

Family Cites Families (16)

Publication number Priority date Publication date Assignee Title
JPS60195584A (en) * 1984-03-16 1985-10-04 富士通株式会社 Enunciation training apparatus
DE69739545D1 (en) * 1996-10-02 2009-10-01 Stanford Res Inst Int METHOD AND SYSTEM FOR THE AUTOMATIC TEXT-INDEPENDENT EVALUATION OF THE LANGUAGE DIRECTORY
JP3448170B2 (en) * 1996-12-02 2003-09-16 山武コントロールプロダクト株式会社 Terminal device and host device used in vocal training machine and vocal training system
CN1095580C (en) * 1998-04-18 2002-12-04 茹家佑 Method for deaf-dumb voice learning dialogue and pronunciation synchronous feedback device
JP2000250402A (en) * 1999-03-01 2000-09-14 Kono Biru Kk Device for learning pronunciation of foreign language and recording medium where data for learning foreign language pronunciation are recorded
CN1282069A (en) * 1999-07-27 2001-01-31 中国科学院自动化研究所 On-palm computer speech identification core software package
JP2002072860A (en) * 2001-09-12 2002-03-12 Yasuhiko Nagasaka Multiple language learning supporting server device, terminal device and multiple language learning support system using these devices and multiple language learning supporting program
JP2003202800A (en) * 2001-12-29 2003-07-18 Keihin Tokushu Insatsu:Kk Implement for learning foreign language
JP2003228279A (en) * 2002-01-31 2003-08-15 Heigen In Language learning apparatus using voice recognition, language learning method and storage medium for the same
CN1530892A (en) * 2003-03-14 2004-09-22 毅 仇 Hearing sense recovering method and system for deaf children
JP2005024815A (en) * 2003-07-01 2005-01-27 Ryuichiro Yamazaki System, device, method, and program for language learning, and recording medium for recording the program
JP2005128242A (en) * 2003-10-23 2005-05-19 Ntt Docomo Inc Speech recognition device
CN1556496A (en) * 2003-12-31 2004-12-22 天津大学 Lip shape identifying sound generator
JP2005321443A (en) * 2004-05-06 2005-11-17 Ace:Kk Pronunciation learning support method, learner terminal, processing program, and recording medium with the program recorded thereon
CN100397438C (en) * 2005-11-04 2008-06-25 黄中伟 Method for computer assisting learning of deaf-dumb Chinese language pronunciation
CN1804934A (en) * 2006-01-13 2006-07-19 黄中伟 Computer-aided Chinese language phonation learning method

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN101290720B (en) * 2008-06-17 2011-08-31 北京志诚卓盛科技发展有限公司 Visualized pronunciation teaching method and apparatus
CN102063903A (en) * 2010-09-25 2011-05-18 中国科学院深圳先进技术研究院 Speech interactive training system and speech interactive training method
CN102063903B (en) * 2010-09-25 2012-07-04 中国科学院深圳先进技术研究院 Speech interactive training system and speech interactive training method
CN102920432A (en) * 2012-10-16 2013-02-13 上海市闸北区民办小小虎幼稚园 Speech audition rehabilitation system and rehabilitation method based on phoneme matrix comparison technology
CN102920432B (en) * 2012-10-16 2015-01-07 上海泰亿格康复服务有限公司 Speech audition rehabilitation system and rehabilitation method based on phoneme matrix comparison technology
CN106354767A (en) * 2016-08-19 2017-01-25 语当先有限公司 Practicing system and method
CN110010123A (en) * 2018-01-16 2019-07-12 上海异构网络科技有限公司 English phonetic word pronunciation learning evaluation system and method
CN109064532A (en) * 2018-06-11 2018-12-21 上海咔咖文化传播有限公司 The automatic shape of the mouth as one speaks generation method of cartoon role and device
CN109064532B (en) * 2018-06-11 2024-01-12 深圳市卡扑动漫设计有限公司 Automatic mouth shape generating method and device for cartoon character

Also Published As

Publication number Publication date
JP2009537850A (en) 2009-10-29
CN1851779B (en) 2010-04-14
JP5335668B2 (en) 2013-11-06
WO2007134494A1 (en) 2007-11-29

Similar Documents

Publication Publication Date Title
CN1851779B (en) Multi-language available deaf-mute language learning computer-aid method
CN100397438C (en) Method for computer assisting learning of deaf-dumb Chinese language pronunciation
US7280964B2 (en) Method of recognizing spoken language with recognition of language color
Kumar et al. Improving literacy in developing countries using speech recognition-supported games on mobile devices
CN1128435C (en) Speech recognition registration without textbook and without display device
Narayanan et al. Creating conversational interfaces for children
US6963841B2 (en) Speech training method with alternative proper pronunciation database
CN1804934A (en) Computer-aided Chinese language phonation learning method
US20070003913A1 (en) Educational verbo-visualizer interface system
CN116206496B (en) Oral english practice analysis compares system based on artificial intelligence
WO2022105472A1 (en) Speech recognition method, apparatus, and electronic device
Wang et al. A probe into spoken English recognition in English education based on computer-aided comprehensive analysis
CN101241656A (en) Computer assisted training method for mouth shape recognition capability
Dhanjal et al. An optimized machine translation technique for multi-lingual speech to sign language notation
TWI240875B (en) Method for interactive computer assistant language learning and system thereof
Hanson Computing technologies for deaf and hard of hearing users
Zheng et al. Improving the efficiency of dysarthria voice conversion system based on data augmentation
Harada et al. VoiceLabel: using speech to label mobile sensor data
CN114267325A (en) Method, system, electronic device and storage medium for training speech synthesis model
CN107203539B (en) Speech evaluating device of complex word learning machine and evaluating and continuous speech imaging method thereof
Abdo et al. Building Audio-Visual Phonetically Annotated Arabic Corpus for Expressive Text to Speech.
WO2022159983A1 (en) Systems and methods for mobile speech therapy
Cincarek et al. Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task
TW201411577A (en) Voice processing method of point-to-read device
Cai et al. Transcribing southern min speech corpora with a web-based language learning system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100414

Termination date: 20120516