CN1236928A - Computer aided Chinese intelligent education system and its implementation method - Google Patents

Computer aided Chinese intelligent education system and its implementation method

Info

Publication number
CN1236928A
Authority
CN
China
Prior art keywords
chinese
phoneme
feature
teaching
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN98101974A
Other languages
Chinese (zh)
Inventor
郭巧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

A computer-aided intelligent Chinese language education system consists of a multimedia computer with a microphone and CD-ROM drive, a computer multimedia teaching program for standard Mandarin (Putonghua), a speech phoneme template library with a standard-Mandarin feature kernel, a teacher sentence feature template library for standard Mandarin, a Chinese speech signal processing and recognition subsystem, a Chinese-character writing signal template library, and a writing evaluation subsystem. It automatically performs real-time quantitative evaluation of how well a learner speaks standard Mandarin and writes Chinese characters.

Description

Computer aided Chinese intelligent education system and its implementation
The present invention relates to a computer-aided intelligent Chinese education system and its implementation method, and belongs to the field of computer-aided instruction.
Existing methods of teaching standard Chinese are mainly the following: classroom teaching, distance teaching, television teaching, audio-recording teaching, video teaching, and computer multimedia teaching. These methods have the following problems: a. classroom teaching is usually limited by teaching conditions such as time, place, and the teacher's level; b. distance teaching, television teaching, audio-recording teaching, video teaching, and the computer multimedia Chinese teaching programs currently on the market are all insufficiently flexible, and cannot automatically analyze, online in real time and in a scientifically quantified way, the problems a learner has during learning (such as whether the pronunciation is correct, whether the strokes are written correctly, and whether the written result is correct). They therefore cannot give timely, targeted feedback. A search found the following related documents, all Chinese patents: "CN1101446A Computerized system for teaching speech", "CN2202949Y Chinese studying device", "CN1146820A talking phonics interactive learning device", and "CN2257944Y Chinese phonetic alphabet Received Pronunciation Teaching instrument". In teaching function and teaching mode, the content disclosed in these documents differs considerably from the present invention.
The purpose of this invention is to provide a computer-aided intelligent Chinese education system, and its implementation method, that can automatically analyze, online in real time and in a scientifically quantified way, the problems a learner has while learning Chinese. With this intelligent tutoring system, standard Mandarin can be taught to a consistent standard without being limited by time, place, or the teacher's level. Without a (human) teacher present, the system automatically evaluates the learner's progress in standard Mandarin (including the accuracy of pronunciation, the correctness of the writing process, and the correctness of the written result), and gives timely, effective suggestions for improvement aimed at each learner's specific problems.
The purpose of the invention is achieved by the following technical scheme: a multimedia computer with a microphone and CD-ROM drive; a computer multimedia teaching program for standard Mandarin; a speech phoneme template library with a standard-Mandarin feature kernel and its implementation method; a standard-Mandarin teacher sentence feature template library and its implementation method; a Chinese speech signal processing and recognition subsystem and its implementation method; a speech learning evaluation and suggestion subsystem and its implementation method; a writing signal processing and pattern recognition subsystem and its implementation method; a writing signal template library and its implementation method; and a writing evaluation and suggestion subsystem and its implementation method. The learner follows the teaching program on the computer, and the resulting speech or Chinese-character writing is input to the computer through the microphone or the mouse. If the input is a speech learning signal, it passes through the speech signal processing and recognition subsystem into the speech evaluation and suggestion subsystem. There it is matched and compared against the speech phoneme template library with the standard-Mandarin feature kernel and against the teacher sentence feature template library, which yields an evaluation of the learner's speech: whether it is standard Mandarin, Mandarin with an accent, or speech containing errors such as wrong, missing, or extra sounds. Suggestions for improvement are then given automatically for the particular problems the learner has in speech learning. If the input is a writing learning signal, it is processed by the writing signal pattern recognition subsystem and then enters the writing process and writing result evaluation and suggestion subsystem, where it is pattern-matched against the writing template library. This yields an evaluation of the learner's writing process and written result, and suggestions for improvement are given automatically for the particular problems the learner has in learning to write. This completes the computer-aided intelligent teaching of standard Mandarin.
Compared with existing Chinese teaching methods, the present invention has the following advantages:
a. It is not limited by teaching conditions such as time, place, and the teacher's level. Without a (human) teacher present, it automatically evaluates, online in real time and in a scientifically quantified way, how well a foreign student is learning standard Mandarin (including the accuracy of pronunciation, the correctness of the writing process, and the correctness of the written result), and gives timely, effective suggestions for improvement aimed at each learner's specific problems.
b. The invention establishes a speaker-independent, large-vocabulary speech phoneme feature template library that is suitable for continuous speech recognition and has a standard-Mandarin feature kernel, thereby effectively solving the template-library problem faced by a system that evaluates standard-Mandarin teaching results.
c. The invention establishes a standard-Mandarin teacher sentence feature template library. ("Sentence" is used here in a broad sense: it covers Hanyu Pinyin letters as well as toned syllables of Chinese characters and the words, sentences, or texts of standard Mandarin.) This template library consists of three parts: the standard-Mandarin feature kernel (phoneme) template library, the teaching-sentence phoneme label string template library, and its tone and prosody feature template library.
d. The invention evaluates speech learning automatically and online through the following process. Following the prepared computer-aided multimedia teaching program, the student uses the mouse to select the text to be learned (a Chinese syllable, character, word, sentence, and so on) and clicks the sound icon to play the teacher's standard-Mandarin pronunciation of it. The student then begins to practice the pronunciation. The computer processes the student's speech signal in real time and compares it with the speaker-independent, large-vocabulary Chinese speech phoneme template library with the standard-Mandarin feature kernel, giving the first-, second-, and third-choice label pairs of the phoneme in the kernel-domain or mixed-domain template library. (That is, each candidate phoneme label carries a variable result-state flag: 0 means the result belongs to the kernel domain and 1 means it belongs to the mixed domain. For example, p(1) means the recognition result is the phoneme p in the mixed domain.) This completes recognition of the learner's speech phoneme. Repeating this process gives the first-, second-, and third-choice phoneme feature label strings for the sentence being learned; the corresponding fundamental-frequency track is then computed. These are pattern-matched against the corresponding sentence label string template library and tone and prosody feature template library in the teacher sentence feature template library, which gives a quantitative evaluation of the degree of match (first, whether the phonemes match; second, whether they belong to the kernel domain; third, whether the tones are correct). The computer thus automatically evaluates not only the learner's Chinese in general (whether there are extra, missing, or wrong sounds, and whether the tones are correct) but also the learner's standard Mandarin (that is, how close the pronunciation is to standard Mandarin: accurate, basically accurate, or incorrect). The computer automatically outputs the evaluation and targeted suggestions for improvement.
e. In a mi-character grid (米字格) box with a side length of 3 cm given by the teaching program, the student follows the writing teacher's model character and demonstrated writing process and uses the mouse to practice writing (strokes and character form) on the computer screen. (Dynamic) pattern comparison with the writing teacher template library automatically determines how correct the student's writing process and written result are, and the system then automatically outputs a quantified writing evaluation and targeted suggestions for improvement.
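The label-pair notation in item d, such as p(1), and the three judgments it feeds can be sketched as follows. This is a minimal illustration, not the patented procedure: the names `Candidate`, `parse_label`, and `grade_phoneme` are hypothetical, only the first-choice candidate is used, and the mapping from the three judgments to "accurate", "basically accurate", and "incorrect" is an assumption, since the description does not spell it out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    phoneme: str
    domain: int  # 0 = kernel domain, 1 = mixed domain


def parse_label(text):
    """Parse a label pair such as 'p(1)' into a Candidate."""
    match = re.fullmatch(r"([A-Za-z]+)\(([01])\)", text)
    if match is None:
        raise ValueError(f"not a label pair: {text!r}")
    return Candidate(match.group(1), int(match.group(2)))


def grade_phoneme(expected, candidates, tone_ok):
    """Grade one phoneme from its ranked candidates (assumed mapping).

    Judgment 1: does the first choice match the expected phoneme?
    Judgment 2: does it fall in the kernel domain?
    Judgment 3: is the tone correct (decided elsewhere, passed in)?
    """
    top = candidates[0]
    if top.phoneme != expected or not tone_ok:
        return "incorrect"
    return "accurate" if top.domain == 0 else "basically accurate"
```

For example, `grade_phoneme("p", [parse_label("p(1)")], tone_ok=True)` returns "basically accurate" under this assumed mapping, because the phoneme matches but lies outside the kernel domain.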
The specific structure of the invention is shown in Figure 1.
Figure 1 is a schematic structural diagram of the computer-aided intelligent Chinese education system.
The components in the figure are: a multimedia computer with a microphone and CD-ROM drive 1; a computer multimedia teaching program for standard Mandarin 2; a speech phoneme template library with a standard-Mandarin feature kernel 3; a standard-Mandarin teacher sentence feature template library 4; a Chinese speech signal processing and recognition subsystem 5; a speech learning evaluation and suggestion subsystem 6; a writing signal processing and pattern recognition subsystem 7; a writing signal template library 8; and a writing evaluation and suggestion subsystem 9.
The invention is described in detail below with reference to the accompanying drawing:
(1) Build a speaker-independent (preferably more than 100 speakers), large-vocabulary (more than 5000 Chinese characters) standard Chinese speech corpus suitable for continuous speech recognition, covering as many sound-linking (juncture) phenomena as possible.
(2) Produce a standard-Mandarin teaching program videotape (containing the teaching pronunciation of at least two teachers, one male and one female).
(3) Use the computer to convert the standard-Mandarin teaching videotape into a speech teaching database and a writing (dynamic image) teaching database, respectively.
(4) Under MS-Windows 95, use the Visual C++ programming language (version 4.0 or later) for modular programming, and build the speaker-independent, large-vocabulary speech phoneme template library with a standard-Mandarin feature kernel for Chinese continuous speech recognition. It is implemented as follows. First, the teacher's standard-Mandarin teaching videotape is digitized separately as teaching (dynamic) images and as speech signals (in units of Hanyu Pinyin letters, characters, words, and sentences, with speech samples from at least two speakers, one male and one female) and stored in a standard format. For the multimedia (dynamic) image part, a database of standard-Mandarin teacher articulation teaching (dynamic multimedia images) is built; for the speech signal part, a standard-Mandarin teacher teaching corpus database is built. Next, the speaker-independent, large-vocabulary Chinese speech phoneme template library with the standard-Mandarin feature kernel is built, which first requires extracting speech feature parameters. The speech data in the standard-Mandarin teacher teaching corpus database is windowed (a Hamming window is used here). The original corpus database is sampled at 16 kHz with 16-bit quantization, mono; each frame contains 256 samples, the frame shift is 128 samples, and the window is about 16 ms long. An FFT is applied to the data in each frame, and the feature parameter vector of each frame is then computed. Each frame's feature vector is assigned by the voiced/unvoiced decision rule to the unvoiced or voiced phoneme feature vector library. This continues until a voiced/unvoiced boundary frame appears; all frame feature vectors belonging to one unvoiced phoneme or one voiced phoneme are then weighted and averaged to obtain the feature vector of that phoneme. Repeating this process yields, for each phoneme template, many standard-Mandarin feature vectors (male and female mixed), which form a feature vector group. So that this group reflects the features of the phoneme well, a Kohonen self-organizing neural network is used to cluster it, giving the feature vector kernel space domain of the phoneme. Repeating this process yields the kernel space domain templates for all standard-Mandarin speech phonemes, which are saved separately. In addition, applying a similar process to the speaker-independent, large-vocabulary Chinese corpus for continuous speech recognition yields the speaker-independent, large-vocabulary Chinese speech phoneme template library. First, the speech data in the original speech data files (the Chinese corpus database) is windowed (a Hamming window is used here). The original corpus database is sampled at 16 kHz with 16-bit quantization, mono; each frame contains 256 samples, the frame shift is 128 samples, and the window is about 16 ms long. An FFT is applied to the data in each frame, and the feature parameter vector of each frame is then computed. Each frame's feature vector is classified as an unvoiced or voiced phoneme feature vector by the voiced/unvoiced decision rule. This continues until a voiced/unvoiced (or silence) boundary frame appears; all frame feature vectors belonging to one unvoiced phoneme or one voiced phoneme (or silence) are then weighted and averaged to obtain the feature vector of that phoneme. Because this corpus is speaker-independent, large-vocabulary, and suited to continuous speech recognition (it covers as many juncture phenomena as possible), repeating the process yields many feature vectors (male and female mixed) for each phoneme template. The kernel feature vector group obtained from the standard-Mandarin corpus is merged with this new group to form a mixed feature vector group. So that the mixed group reflects the features of the phoneme well, the Kohonen self-organizing neural network is again used to cluster it, giving the mixed feature vector space domain of the phoneme.
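The front end of step (4) can be sketched as follows: Hamming-windowed framing with the stated 256-sample frames and 128-sample shift, an FFT per frame, averaging over one phoneme segment, and clustering with a small Kohonen map. This is a minimal sketch under assumptions. The description does not say which feature vector is derived from the FFT, so the log-magnitude spectrum is an assumption, as are the function names, the one-dimensional map, the node count, and the learning-rate and neighborhood schedules. The voiced/unvoiced decision rule is not specified in the text and is omitted.

```python
import numpy as np


def frame_features(signal, frame_len=256, frame_shift=128):
    """Hamming-window each frame and return its log-magnitude spectrum."""
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    frames = np.stack([signal[i * frame_shift:i * frame_shift + frame_len]
                       for i in range(n_frames)])
    spectra = np.abs(np.fft.rfft(frames * window, axis=1))
    return np.log1p(spectra)  # assumed feature; the text leaves it open


def phoneme_vector(features, weights=None):
    """Weighted average of the frame vectors of one phoneme segment."""
    return np.average(features, axis=0, weights=weights)


def train_som(vectors, n_nodes=4, epochs=50, seed=0):
    """Cluster phoneme vectors with a 1-D Kohonen map; return node weights."""
    rng = np.random.default_rng(seed)
    # Initialise the nodes from randomly chosen data vectors.
    start = rng.choice(len(vectors), n_nodes, replace=False)
    nodes = vectors[start].astype(float)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                       # decaying learning rate
        radius = max(n_nodes / 2 * (1.0 - frac), 0.5)  # shrinking neighborhood
        for v in vectors[rng.permutation(len(vectors))]:
            winner = np.argmin(np.linalg.norm(nodes - v, axis=1))
            dist = np.abs(np.arange(n_nodes) - winner)
            influence = np.exp(-(dist ** 2) / (2 * radius ** 2))
            nodes += lr * influence[:, None] * (v - nodes)
    return nodes
```

With these defaults, a 1024-sample signal gives 7 frames of 129 spectral bins each, and 256 samples at 16 kHz correspond to the 16 ms window stated in the text. Running `train_som` once on the teacher vectors and once on the merged teacher and corpus vectors would give the kernel and mixed domains, respectively.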
This space domain clearly contains the standard-Mandarin feature kernel template; in other words, the standard-Mandarin feature kernel template can be regarded as a subset of the feature vectors of the mixed speech phoneme template. (This is why we call it the "kernel" template.) For ease of searching, unvoiced and voiced sounds can be divided into two groups, giving two speech phoneme sub-libraries with the standard-Mandarin feature kernel. Together these two sub-libraries form the speaker-independent, large-vocabulary Chinese speech phoneme template library with the standard-Mandarin feature kernel that this intelligent tutoring system requires.
(5) Build the standard-Mandarin teacher sentence feature template library. Its implementation follows step (4) as a model. By a similar process, we can extract from the teacher's standard-Mandarin speech stream the feature vector strings of toned syllables, characters, words, sentences, and texts, and build the standard-Mandarin teaching sentence feature template library. It consists of three parts: the standard-Mandarin feature kernel phoneme template library, the teaching-sentence phoneme label string template library, and the teaching-sentence tone and prosody feature template library. The first part has already been completed. The second part is built as follows: for each syllable, character, word, sentence, or text spoken by the teacher, the corresponding phoneme label string template (including tone features) is obtained and stored in the teaching sentence feature template library. The third part, the teaching-sentence tone and prosody feature template library, consists of the set of fundamental-frequency track templates of the teaching sentences. In Chinese, the pattern of change of the fundamental frequency is the basic feature of a character's tone. Within a Chinese sentence, the tones of the characters appear as variation in the four tones and in prosody. It is therefore easy to see that the variation of the four tones and of prosody can be characterized in part by estimating the pitch period. Common methods of obtaining the pitch parameter include algorithms based on the short-time autocorrelation function, on the short-time AMDF, on homomorphic signal processing, and on linear predictive coding.
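Of the methods just listed, the short-time autocorrelation approach can be sketched as follows, together with the 60-900 Hz band limiting and the median smoothing of the pitch track that the description goes on to specify. This is a minimal sketch under assumptions: the function names are hypothetical, the band-pass filter is approximated by zeroing FFT bins (the text does not give a filter design), and the edge handling of the median filter, which repeats the end values, is a choice made here.

```python
import numpy as np


def bandpass_fft(frame, sample_rate, low_hz=60.0, high_hz=900.0):
    """Crude band-pass: zero all FFT bins outside [low_hz, high_hz]."""
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(frame))


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=900.0):
    """Fundamental frequency (Hz) at the strongest autocorrelation peak.

    The frame should span more than two pitch periods, as the text advises.
    """
    x = bandpass_fft(frame - np.mean(frame), sample_rate, f_min, f_max)
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0..N-1
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(x) - 1)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / lag


def median_smooth(track, half_width=1):
    """Median filter over 2L+1 points (L = half_width); edges are repeated."""
    track = np.asarray(track, dtype=float)
    padded = np.pad(track, half_width, mode="edge")
    windows = np.stack([padded[i:i + len(track)]
                        for i in range(2 * half_width + 1)])
    return np.median(windows, axis=0)
```

A single outlier such as the 400 in `[100, 100, 400, 100, 100]` is removed by `median_smooth`, while a step such as `[100, 100, 100, 200, 200, 200]` passes through unchanged, which is the property the description attributes to median smoothing.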
Here the short-time autocorrelation algorithm is chosen, mainly because its computational cost is small and its results are good. When computing the fundamental frequency, the position of the first maximum peak sometimes fails to coincide with the pitch period. We address this in two ways: first, the window length is made at least greater than two pitch periods, which gives better results; second, the speech signal is filtered with a band-pass filter with a pass band of 60-900 Hz, and the pitch is estimated from the autocorrelation function of the filtered signal. One problem remains. Whatever algorithm is used, the estimated pitch-period track cannot coincide exactly with the true track. In practice most segments agree, but in some local segments, or in one or a few regions, the pitch-period estimates depart from the normal track; such estimates are called outliers ("wild points"). Removing them requires a smoothing algorithm, the most common being median smoothing and linear smoothing. This work uses median smoothing: L samples are taken on each side of the point being smoothed, which together with that point form a group of (2L+1) sample values; these (2L+1) values are sorted by magnitude, and the middle value of the sorted sequence is taken as the smoother's output. L is generally 1 or 2, so the median smoothing window generally covers 3 to 5 samples. The advantage of median smoothing is that it removes a small number of outliers without destroying the step-like change between two smooth segments of the pitch-period track. In this way we obtain, for all standard-Mandarin teaching sentences, the phoneme feature vector kernel space domain template library, the teaching-sentence phoneme label string template library, and its tone and prosody feature template library. Together these three libraries constitute the standard-Mandarin (teacher) teaching sentence feature template library. Building the whole teaching sentence feature template library is equivalent to the speech teacher's "lesson preparation".
(6) Under MS-Windows 95, use the Visual C++ programming language (version 4.0 or later) to write the Chinese speech signal processing and recognition software module and the software module for evaluating learning outcomes. It is implemented as follows. The learner studies standard Mandarin on the computer, following the prepared computer-aided multimedia teaching program. First, following the learning schedule set by the program, the student selects the Chinese sentence (or syllable, character, word, and so on) to be learned and clicks the sound icon to play the teacher's standard-Mandarin pronunciation of it. The student then begins to practice the pronunciation (the learner's speech is sampled at 8 kHz with 8-bit quantization, mono), and the learner's speech stream is stored in a corresponding data file. Its feature parameters are extracted by the process described in (4). The speech data in the file is windowed (a Hamming window is used here); given the sampling frequency and quantization above, each frame should contain 128 samples, with a frame shift of 64 samples and a window of about 16 ms. An FFT is applied to the data in each frame, and the feature parameter vector of each frame is then computed. Each frame's feature vector is classified as an unvoiced or voiced phoneme feature vector by the voiced/unvoiced decision rule. This continues until a voiced/unvoiced (or silence) boundary frame appears; all frame feature vectors belonging to one unvoiced phoneme or one voiced phoneme (or silence) are then weighted and averaged to obtain the feature vector of that phoneme. Using a Kohonen neural network and a supervised learning algorithm, this vector is first compared with the Chinese speech phoneme feature template library with the standard-Mandarin feature kernel built in (4), giving the first-, second-, and third-choice label pairs of the phoneme in the kernel-domain or mixed-domain template library. (That is, each candidate phoneme label carries a variable result-state flag: 0 means the result belongs to the kernel domain and 1 means it belongs to the mixed domain. For example, b(1) means the recognition result is the phoneme b in the mixed domain.) This completes recognition of the learner's speech phoneme. Repeating this process gives the first-, second-, and third-choice phoneme feature label strings for the sentence being learned; the corresponding fundamental-frequency track is then computed. Pattern matching against the sentence label string template library and the tone and prosody feature template library in the teacher's teaching sentence feature template library then gives a quantitative evaluation of the degree of match (first, whether the phonemes match; second, whether they belong to the kernel domain; third, whether the tones are correct). This evaluates both the learner's Chinese in general (whether there are extra, missing, or wrong sounds, and whether the tones are correct) and the learner's standard Mandarin (that is, how close the pronunciation is to standard Mandarin: accurate, basically accurate, or incorrect). Finally, the computer automatically outputs the quantitative evaluation and targeted suggestions for improvement.
(7) Under MS-Windows 95, write the computer-aided multimedia teaching program for Chinese as a foreign language in Delphi. It includes Hanyu Pinyin teaching; explanation of standard-Mandarin characters, words, sentences, and texts; and teaching content for speech and writing.
Within the multimedia teaching software module, the software modules programmed in Visual C++ are called as needed. The teaching program has the following functions and features:
a. Hanyu Pinyin and Chinese characters, words, sentences, and texts are taught in the order of the lessons.
b. Each teaching unit covers four parts: listening, speaking, reading, and writing. After the learner finishes practicing "speaking" or "writing", the result is evaluated automatically in real time, and targeted suggestions for improving the learning are then given automatically.
c. While "listening" to the teacher, the learner can also watch the teacher's mouth movements on the screen.
d. During writing instruction, the learner can watch the teacher's writing process and written result on the screen.
e. To support the teaching, several Chinese learning games are provided for the learner to choose from.
f. Some lesson content is explained with animated demonstrations.
(8) Build the Chinese-character writing teacher teaching (dynamic) template library. It is implemented as follows. The Chinese characters used in teaching (regular script) are saved as graphic patterns in square boxes with a side length of 3 cm and a mi-character grid (米字格) background, and used as templates. The gray level of the grid lines of each 3 cm box is represented by -1, the screen resolution is 640 × 480, and the character's gray level is 0 or 1 (white or black, respectively). The Chinese character (regular script) template library built with these settings serves as the teacher's writing demonstration template library in this tutoring system. To give the tutoring system the ability to teach the writing process (such as strokes and stroke order), the whole writing process must be recorded in advance, digitized, and stored separately as required, building the (dynamic) writing template library. It is then played back (dynamically), as the teaching requires, at 24 frames per second.
(9) The writing process and written result are evaluated as follows. In the mi-character grid box with a side length of 3 cm given by the teaching program, the student follows the writing teacher's model character and demonstrated writing process and uses the mouse to practice writing (strokes and character form) on the computer screen. (Dynamic) pattern comparison with the (dynamic) writing teaching template library shows whether the student's writing process and written result are both correct, and the system then automatically gives a quantified writing evaluation and targeted suggestions for improvement.
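The comparison in step (9) can be illustrated with the 0/1 bitmaps described in step (8). The description does not state which similarity measure or stroke-order check the system uses, so both functions below are assumptions chosen for illustration: `shape_score` is an intersection-over-union of inked pixels, and `first_stroke_error` compares hypothetical stroke-name sequences.

```python
import numpy as np


def shape_score(student, template):
    """Intersection-over-union of the inked (value 1) pixels of two bitmaps."""
    s = np.asarray(student, dtype=bool)
    t = np.asarray(template, dtype=bool)
    union = np.logical_or(s, t).sum()
    if union == 0:
        return 1.0  # two empty boxes are treated as identical
    return float(np.logical_and(s, t).sum() / union)


def first_stroke_error(student_strokes, template_strokes):
    """Index of the first stroke that departs from the template order.

    Returns None when the two stroke sequences are identical.
    """
    for i, (s, t) in enumerate(zip(student_strokes, template_strokes)):
        if s != t:
            return i
    if len(student_strokes) != len(template_strokes):
        return min(len(student_strokes), len(template_strokes))
    return None
```

For the character 十 on a toy 3 × 3 grid, a student bitmap that omits the lowest pixel of the vertical stroke shares 4 of 5 inked pixels with the template, and writing the vertical stroke before the horizontal one is reported as an error at stroke index 0.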
The operating principle and process of the whole intelligent tutoring system are as follows. The learner follows the teaching program 2 on the computer 1, and the resulting speech (or Chinese-character writing) is input to the computer through the microphone (or the mouse). If the input is a speech learning signal, it passes through the speech signal processing and recognition subsystem 5 into the speech evaluation and suggestion subsystem 6. There it is matched and compared against the speech phoneme template library 3 with the standard-Mandarin feature kernel and the teacher sentence feature template library 4, which yields an evaluation of the learner's speech (whether it is standard Mandarin, Mandarin with an accent, or speech containing errors such as wrong, missing, or extra sounds), and suggestions for improvement are given automatically for the particular problems the learner has in speech learning. If the input is a writing learning signal, it is processed by the writing signal pattern recognition subsystem 7 and then enters the writing process and writing result evaluation and suggestion subsystem 9, where it is pattern-matched against the writing template library 8. This yields an evaluation of the learner's writing process and written result, and suggestions for improvement are given automatically for the particular problems the learner has in learning to write. This completes the computer-aided intelligent Chinese teaching task.
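The wrong, missing, and extra sounds mentioned above can be located by aligning the learner's recognized phoneme label string with the teacher's template string. The sketch below uses Python's standard `difflib.SequenceMatcher` for the alignment. This is a swapped-in technique chosen for illustration; the description only says that the strings are pattern-matched and does not name an alignment method. The function name `diagnose` and the Pinyin-style labels in the example are hypothetical.

```python
from difflib import SequenceMatcher


def diagnose(teacher, learner):
    """List (kind, teacher_part, learner_part) for each disagreement.

    kind is 'wrong' (substituted), 'missing' (in the teacher string only),
    or 'extra' (in the learner string only).
    """
    errors = []
    matcher = SequenceMatcher(a=teacher, b=learner, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            errors.append(("wrong", teacher[i1:i2], learner[j1:j2]))
        elif tag == "delete":
            errors.append(("missing", teacher[i1:i2], []))
        elif tag == "insert":
            errors.append(("extra", [], learner[j1:j2]))
    return errors
```

For instance, with the teacher string `["n", "i", "h", "ao"]` and the learner string `["n", "i", "ao"]`, `diagnose` reports one missing sound, `h`; identical strings give an empty list.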

Claims (1)

1. a computer aided Chinese intelligent education system and its implementation, mainly comprise a multimedia computer that has microphone and CD-ROM, phoneme of speech sound template base with Chinese standard mandarin feature kernel, Chinese standard mandarin teacher statement feature templates storehouse, Chinese speech signal is handled and recognition subsystem, Chinese speech study evaluation and recommendations subsystem, Chinese writing signal Processing and pattern-recognition subsystem thereof, Chinese writing signal templates storehouse, Chinese writing is estimated and the suggestion subsystem, and corresponding calculated machine multimedia Chinese standard mandarin educational programme, it is characterized in that:
A. The speech phoneme template library having a standard Mandarin feature kernel, and its implementation method: first, the teacher's standard Mandarin instructional videotape is digitized, with the teaching images and the speech signal processed separately; the speech signal is divided into units of pinyin, characters, words, and sentences, contains utterance samples from at least two speakers, one male and one female, and is stored in a standard format; for the dynamic multimedia images, a multimedia image database of the teacher's articulation during standard Mandarin teaching is established; for the speech signal, a standard Mandarin teacher teaching corpus database is established;
next, the speech feature parameters are extracted: the speech data in the teacher teaching corpus database are windowed; the original corpus is sampled at 16 kHz with 16-bit quantization, mono; each frame contains 256 samples, the frame shift is 128 samples, and the window length is about 16 ms; an FFT is applied to the data in each frame, and the feature parameter vector of each frame is then computed; each frame's feature parameter vector is assigned by a voiced/unvoiced decision rule to the unvoiced or the voiced phoneme feature parameter vector library; this continues until a voiced/unvoiced boundary frame appears, at which point all frame feature parameter vectors belonging to one unvoiced phoneme or one voiced phoneme (or silence) are combined by weighted averaging to give the feature parameter vector of that phoneme; by repeating this process, many standard Mandarin feature parameter vectors are obtained for one phoneme template, and they form a feature parameter vector group; so that this group reflects the features of the phoneme well, a Kohonen self-organizing neural network performs self-organizing clustering on it, which yields the feature parameter vector kernel space domain of that phoneme; repeating the process yields the kernel space domain templates of all standard Mandarin speech phonemes, which are stored separately;
by a similar procedure, using a speaker-independent, large-vocabulary Chinese corpus database intended for continuous speech recognition, a speaker-independent, large-vocabulary Chinese speech phoneme template library is obtained: first, the speech data in the original speech data files are windowed; the original corpus is sampled at 16 kHz with 16-bit quantization, mono; each frame contains 256 samples, the frame shift is 128 samples, and the window length is about 16 ms; an FFT is applied to the data in each frame, and the feature parameter vector of each frame is then computed; each frame's feature parameter vector is assigned by the voiced/unvoiced decision rule to the unvoiced or the voiced group of phoneme feature parameter vectors; this continues until a voiced/unvoiced boundary frame appears, at which point all frame feature parameter vectors belonging to one unvoiced phoneme or one voiced phoneme are combined by weighted averaging to give the feature parameter vector of that phoneme; because a speaker-independent, large-vocabulary Chinese corpus suited to continuous speech recognition is used here, repeating the process yields many feature parameter vectors for one phoneme template; the kernel feature parameter vector group obtained from the standard Mandarin corpus is combined with these new feature parameter vectors to form a mixed feature parameter vector group; so that the mixed group reflects the features of the phoneme well, the Kohonen self-organizing neural network again performs self-organizing clustering on the mixed group, which yields the mixed feature parameter vector space domain of that phoneme; this domain evidently contains the standard Mandarin feature kernel template, that is, the kernel template can be regarded as a subset of the feature parameter vectors of the mixed speech phoneme template;
for convenience of search, unvoiced and voiced sounds are divided into two groups, and two speech phoneme sub-template libraries having a standard Mandarin feature kernel are established; together these two sub-template libraries form the speaker-independent, large-vocabulary Chinese speech phoneme template library with a standard Mandarin feature kernel that this intelligent tutoring system requires;
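The framing and FFT step of element A can be sketched as follows. This is a minimal illustration, not the patent's implementation: the claim does not name the window function or define the feature vector, so the Hamming window, the use of a magnitude spectrum, the synthetic 1 kHz test tone, and every function name below are assumptions. Only the 16 kHz sampling rate, the 256-sample frame, and the 128-sample shift come from the claim.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16_000   # from the claim: 16 kHz corpus
FRAME_LEN = 256           # samples per frame (from the claim)
FRAME_SHIFT = 128         # samples between frame starts (from the claim)


def split_frames(signal, frame_len=FRAME_LEN, shift=FRAME_SHIFT):
    """Cut the signal into overlapping frames, dropping the incomplete tail."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, shift)]


def hamming(n):
    """Hamming window (an assumption; the claim does not name the window)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frame_spectrum(frame):
    """Magnitude spectrum of one windowed frame (bins 0 .. N/2)."""
    window = hamming(len(frame))
    spec = fft([s * w for s, w in zip(frame, window)])
    return [abs(c) for c in spec[:len(frame) // 2 + 1]]


# One second of a 1 kHz test tone stands in for corpus audio (hypothetical data).
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE_HZ)
        for n in range(SAMPLE_RATE_HZ)]
frames = split_frames(tone)
spectrum = frame_spectrum(frames[0])
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)

# Window length implied by the claim's numbers: 256 / 16000 s = 16 ms.
window_ms = 1000 * FRAME_LEN / SAMPLE_RATE_HZ
```

Each FFT bin here spans 16000 / 256 = 62.5 Hz, so the 1 kHz tone peaks in bin 16. The same arithmetic confirms the learner-side settings in element C: 128 samples at 8 kHz is also 16 ms.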
B. The standard Mandarin teacher sentence feature template library and its implementation method: first, feature parameter vector strings for tonal syllables, characters, words, sentences, and texts are extracted from the teacher's standard Mandarin speech stream, and the standard Mandarin teaching sentence feature template library is established; it consists of three parts, namely the standard Mandarin feature kernel phoneme template library, the teaching sentence phoneme label string template library, and the teaching sentence tone and prosody feature template library; the first part has already been completed in A.; the second part is implemented by obtaining, for each of the teacher's syllables, characters, words, sentences, or texts, the corresponding phoneme label string template and storing it in the teaching sentence feature template library; the third part, the teaching sentence tone and prosody feature template library, consists of a set of fundamental frequency trajectory templates for the teaching sentences;
the four tones and prosodic variation can be characterized by estimating the pitch period; methods for obtaining the pitch parameter generally include algorithms based on the short-time autocorrelation function, on the short-time average magnitude difference function (AMDF), on homomorphic signal processing, and on linear predictive coding; here the short-time autocorrelation algorithm is chosen, mainly because its computational cost is small and its results are good; when the fundamental frequency is computed, the position of the first maximum peak sometimes does not coincide with the pitch period, and this problem is addressed in two ways: first, the window length is made at least two pitch periods, which gives better results; second, the speech signal is filtered by a band-pass filter with a passband of 60 to 900 Hz, and the autocorrelation function of the filtered signal is used for pitch estimation;
one further problem remains: whichever algorithm is used, the estimated pitch period trajectory cannot coincide exactly with the true trajectory; in practice most segments agree, but in some local segments or regions one or a few pitch period estimates deviate from the normal trajectory, and such estimates are called outliers ("wild points"); removing them requires a smoothing algorithm, and median smoothing is adopted here: L samples are taken on each side of the point to be smoothed and, together with that point, form a group of 2L+1 sample values; these 2L+1 values are sorted by magnitude, and the middle value of the sorted sequence is taken as the output of the smoother; L is generally 1 or 2, so the median smoothing window generally covers 3 to 5 sample values; the advantage of median smoothing is that it removes a small number of outliers without destroying the step changes between two smooth segments of the pitch period trajectory;
in this way the phoneme feature vector kernel space domain template library, the teaching sentence phoneme label string template library, and the tone and prosody feature template library are obtained for all standard Mandarin teaching sentences; together these three template libraries constitute the standard Mandarin teacher teaching sentence feature template library;
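The pitch estimation and median smoothing described in element B can be sketched as follows. This is a simplified illustration under stated assumptions: the 60 to 900 Hz band-pass filter is not implemented, and the lag search is instead restricted to the lags that correspond to that frequency range; the 8 kHz rate, the 512-sample frame, the test signal, and all function names are hypothetical. Points within L samples of either end of the track are left unsmoothed, which is one possible edge treatment that the claim does not specify.

```python
import math


def autocorr_pitch_hz(frame, sample_rate, fmin=60.0, fmax=900.0):
    """Estimate F0 of one frame from the peak of its short-time autocorrelation.

    Only lags between sample_rate/fmax and sample_rate/fmin are searched,
    a stand-in for the band-pass filter named in the claim.
    Returns 0.0 when no positive autocorrelation peak is found.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


def median_smooth(track, L=1):
    """Replace each interior point by the median of its 2L+1 neighbourhood."""
    out = []
    for i, value in enumerate(track):
        if i < L or i >= len(track) - L:
            out.append(value)            # not enough neighbours: keep as is
        else:
            window = sorted(track[i - L:i + L + 1])
            out.append(window[L])        # middle of the 2L+1 sorted values
    return out


# A 200 Hz tone; the 512-sample frame is longer than two pitch periods
# even at the 60 Hz lower limit (2 * 8000 / 60 is about 267 samples).
SAMPLE_RATE_HZ = 8_000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE_HZ) for n in range(512)]
f0 = autocorr_pitch_hz(tone, SAMPLE_RATE_HZ)   # close to 200 Hz

# A pitch track with one outlier (400) and a genuine step (200 -> 150).
raw_track = [200, 200, 400, 200, 200, 150, 150, 150]
smoothed = median_smooth(raw_track, L=1)
# smoothed == [200, 200, 200, 200, 200, 150, 150, 150]
```

The smoothed track shows the property the claim relies on: the single outlier disappears, while the step from 200 to 150 stays sharp.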
C. The Chinese speech learning evaluation and suggestion subsystem and its implementation method: following the content of the prepared computer-aided multimedia teaching program and the learning schedule that the program sets, the computer selects the Chinese sentence to be learned; clicking the sound icon plays the teacher's standard Mandarin pronunciation of that sentence; the learner then begins to practice the pronunciation; here the learner's speech is sampled at 8 kHz with 8-bit quantization, mono, and the learner's speech stream is stored in a corresponding data file; its feature parameters are extracted by the procedure described in A.; given the chosen sampling frequency and quantization, each frame contains 128 samples, the frame shift is 64 samples, and the window length is about 16 ms; an FFT is applied to the data in each frame, and the feature parameter vector of each frame is then computed; each frame's feature parameter vector is assigned by the voiced/unvoiced decision rule to the unvoiced or the voiced phoneme feature parameter vectors; this continues until a voiced/unvoiced boundary frame appears, at which point all frame feature parameter vectors belonging to one unvoiced phoneme or one voiced phoneme are combined by weighted averaging to give the feature parameter vector of that phoneme;
using a Kohonen neural network and a supervised learning algorithm, this vector is first compared with the Chinese speech phoneme feature template library with a standard Mandarin feature kernel established in A., which gives the first-choice, second-choice, and third-choice label pairs of the phoneme in the kernel domain or mixed domain template library; that is, the phoneme label of each choice carries a result state flag, where 0 indicates that the result belongs to the kernel domain and 1 indicates that it belongs to the mixed domain; this completes recognition of the learner's speech phoneme; repeating the process gives the first-choice, second-choice, and third-choice phoneme feature label strings of the sentence being learned; the corresponding fundamental frequency trajectory is then computed;
by pattern matching against the sentence label string template library and the tone and prosody feature template library in the teacher's teaching sentence feature template library, a quantitative evaluation of the degree of match is obtained: first, whether the phonemes match; second, whether they belong to the kernel domain; third, whether the tones are correct; this evaluates the learner's overall performance in learning Chinese, including whether there are extra sounds, missing sounds, or wrong sounds and whether the tones are correct, and it also evaluates how close the learner's speech is to standard Mandarin, namely whether the pronunciation is accurate, basically accurate, or incorrect; finally the computer automatically outputs the quantitative evaluation of the learning result together with targeted suggestions for improvement;
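The Kohonen clustering from element A and the three-choice lookup with kernel/mixed flags from element C can be sketched together. This is a toy illustration, not the patent's algorithm: the claims do not give the network size, learning schedule, distance measure, or how the mixed domain is stored, so the 1-D map, the squared Euclidean distance, the separate kernel and mixed codebooks, the 2-D feature vectors, and all names below are assumptions. Only the flag values (0 for kernel domain, 1 for mixed domain) come from the claim.

```python
import math

KERNEL, MIXED = 0, 1   # result state flags from the claim


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(vectors, n_nodes=2, epochs=50, lr0=0.5, radius0=1.0):
    """Minimal 1-D Kohonen map; returns n_nodes codebook vectors."""
    # Deterministic initialisation from evenly spaced training vectors.
    step = max(1, len(vectors) // n_nodes)
    nodes = [list(vectors[(i * step) % len(vectors)]) for i in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 towards 0
        lr, radius = lr0 * frac, radius0 * frac
        for v in vectors:
            # Best-matching unit: the node closest to the sample.
            bmu = min(range(n_nodes), key=lambda j: sq_dist(nodes[j], v))
            for j in range(n_nodes):
                # Gaussian neighbourhood over the 1-D node index.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                nodes[j] = [a + lr * h * (b - a) for a, b in zip(nodes[j], v)]
    return nodes


def build_library(kernel_vectors, mixed_vectors, n_nodes=2):
    """List of (phoneme, flag, codebook_vector) entries."""
    library = []
    for flag, group in ((KERNEL, kernel_vectors), (MIXED, mixed_vectors)):
        for phoneme, vecs in group.items():
            library += [(phoneme, flag, node) for node in train_som(vecs, n_nodes)]
    return library


def top_choices(library, feature, k=3):
    """First, second, and third choice (phoneme, flag) pairs by distance."""
    choices = []
    for phoneme, flag, _ in sorted(library, key=lambda e: sq_dist(e[2], feature)):
        if (phoneme, flag) not in choices:
            choices.append((phoneme, flag))
        if len(choices) == k:
            break
    return choices


# Hypothetical 2-D features: teacher speech (kernel) and other speakers (mixed).
kernel_vectors = {"a": [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)],
                  "i": [(5.0, 5.0), (5.2, 5.0), (5.0, 5.2), (5.2, 5.2)]}
mixed_vectors = {"a": [(1.0, 1.0), (1.4, 1.0), (1.0, 1.4), (1.4, 1.4)],
                 "i": [(6.0, 6.0), (6.4, 6.0), (6.0, 6.4), (6.4, 6.4)]}
library = build_library(kernel_vectors, mixed_vectors)

standard = top_choices(library, (0.1, 0.1))   # first choice: ("a", KERNEL)
accented = top_choices(library, (1.3, 1.2))   # first choice: ("a", MIXED)
```

In this sketch a first choice flagged 0 would correspond to pronunciation close to the teacher's standard Mandarin, and a first choice flagged 1 to a recognisable but non-standard pronunciation of the same phoneme.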
D. The Chinese writing signal template library and its implementation method: the regular-script (kaishu) Chinese characters used for teaching are saved as templates in the form of graphic patterns, each drawn in a square box with a side length of 3 cm and a "米"-shaped guide-line background; in each such 3 cm box the gray level of the guide lines is represented by -1, the screen resolution is 640 × 480, and a character pixel value of 0 or 1 represents white or black respectively; the regular-script character template library built with these settings serves as the teacher's writing demonstration template library in this tutoring system; so that the tutoring system can teach the writing process, such as strokes and stroke order, the whole writing process is recorded in advance, digitized, and stored as required to build a dynamic writing template library; then, as teaching requires, it is played back dynamically at 24 frames per second;
E. The Chinese writing evaluation and suggestion subsystem and its implementation method: in the 3 cm "米"-shaped square box that the teaching program provides, the student practices writing on the computer screen with the mouse, following the teacher's model character and the demonstrated writing process; by dynamic pattern comparison with the dynamic writing teaching template library, the system determines whether the student's writing process and writing result are both correct, and then automatically gives a quantified writing evaluation and targeted suggestions for improvement;
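One way to turn the template comparison of elements D and E into a number is sketched below. The claims do not say how the quantified score is computed, so the Jaccard overlap of ink pixels used here is an assumption, as are the 5 × 5 bitmaps and all function names; stroke order, which the dynamic comparison would also cover, is not modelled. Only the pixel codes (-1 for guide lines, 0 for white, 1 for black) come from element D.

```python
GRID, WHITE, INK = -1, 0, 1   # pixel codes from element D


def ink_pixels(bitmap):
    """Set of (row, col) positions holding ink; guide lines and white are ignored."""
    return {(r, c)
            for r, row in enumerate(bitmap)
            for c, value in enumerate(row)
            if value == INK}


def overlap_score(template, attempt):
    """Jaccard overlap of ink pixels, in [0, 1] (an assumed metric)."""
    t, a = ink_pixels(template), ink_pixels(attempt)
    if not t and not a:
        return 1.0
    return len(t & a) / len(t | a)


# Hypothetical 5 x 5 template of the character "十" with diagonal guide lines.
template = [
    [GRID, WHITE, INK, WHITE, GRID],
    [WHITE, GRID, INK, GRID, WHITE],
    [INK, INK, INK, INK, INK],
    [WHITE, GRID, INK, GRID, WHITE],
    [GRID, WHITE, INK, WHITE, GRID],
]

# A student attempt whose horizontal stroke is too short.
attempt = [
    [WHITE, WHITE, INK, WHITE, WHITE],
    [WHITE, WHITE, INK, WHITE, WHITE],
    [WHITE, INK, INK, INK, WHITE],
    [WHITE, WHITE, INK, WHITE, WHITE],
    [WHITE, WHITE, INK, WHITE, WHITE],
]

score = overlap_score(template, attempt)   # 7 shared ink pixels out of 9
```

A real system would also need to tolerate small shifts and stroke-width differences, for example by normalising position before comparing; that is beyond what the claims describe.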
F. The computer multimedia standard Mandarin teaching program has the following functions and features:
(a). it teaches Chinese pinyin and Chinese characters, words, sentences, and texts in lesson order;
(b). each teaching unit comprises four parts: listening, speaking, reading, and writing; after the learner finishes a "speaking" or "writing" exercise, the learning result is evaluated automatically in real time, and targeted suggestions for improving study are then given automatically;
(c). while "listening" to the teacher's instruction, the learner can simultaneously see the teacher's mouth movements on the screen;
(d). during writing instruction, the learner can see the teacher's writing process and writing result on the screen;
(e). to support teaching, several Chinese learning games are provided for the learner to choose from;
(f). part of the lesson content is demonstrated and explained with animation;
the learner studies on the computer by following the teaching program, and enters the result of study, either speech or handwritten Chinese characters, into the computer through the microphone or the mouse; a speech learning signal passes through the speech signal processing and recognition subsystem into the speech evaluation and suggestion subsystem, where it is matched against the speech phoneme template library having a standard Mandarin feature kernel and against the teacher sentence feature template library, which gives an evaluation of the learner's speech, namely whether it is standard Mandarin, Mandarin with an accent, or speech containing errors such as wrong sounds, missing sounds, or extra sounds, and suggestions for improvement that target the learner's specific problems are given automatically; a writing learning signal is processed by the writing signal pattern recognition subsystem and then enters the writing process and writing result evaluation and suggestion subsystem, where it is pattern-matched against the writing template library, which gives an evaluation of the learner's writing process and writing result, and suggestions for improvement that target the learner's specific problems are given automatically; this completes the computer-aided intelligent teaching of standard Mandarin.
CN98101974A 1998-05-25 1998-05-25 Computer aided Chinese intelligent education system and its implementation method Pending CN1236928A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN98101974A CN1236928A (en) 1998-05-25 1998-05-25 Computer aided Chinese intelligent education system and its implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN98101974A CN1236928A (en) 1998-05-25 1998-05-25 Computer aided Chinese intelligent education system and its implementation method

Publications (1)

Publication Number Publication Date
CN1236928A true CN1236928A (en) 1999-12-01

Family

ID=5217052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN98101974A Pending CN1236928A (en) 1998-05-25 1998-05-25 Computer aided Chinese intelligent education system and its implementation method

Country Status (1)

Country Link
CN (1) CN1236928A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1331113C (en) * 2004-02-27 2007-08-08 雅马哈株式会社 Speech synthesizer,method and recording medium for speech recording synthetic program
CN100411011C (en) * 2005-11-18 2008-08-13 清华大学 Pronunciation quality evaluating method for language learning machine
CN107390872A (en) * 2017-07-24 2017-11-24 沙洲职业工学院 A kind of sound controlled computer
CN111105798A (en) * 2018-10-29 2020-05-05 宁波方太厨具有限公司 Equipment control method based on voice recognition
CN111192573A (en) * 2018-10-29 2020-05-22 宁波方太厨具有限公司 Equipment intelligent control method based on voice recognition
CN111105798B (en) * 2018-10-29 2023-08-18 宁波方太厨具有限公司 Equipment control method based on voice recognition
CN111192573B (en) * 2018-10-29 2023-08-18 宁波方太厨具有限公司 Intelligent control method for equipment based on voice recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication