CN1236928A

CN1236928A - Computer aided Chinese intelligent education system and its implementation method

Info

Publication number: CN1236928A
Application number: CN98101974A
Authority: CN
Inventors: 郭巧
Original assignee: Individual
Current assignee: Individual
Priority date: 1998-05-25
Filing date: 1998-05-25
Publication date: 1999-12-01

Abstract

A computer-aided intelligent Chinese language education system comprises a multimedia computer with a microphone and CD-ROM drive, a computer multimedia teaching program for Standard Mandarin (Putonghua), a speech phoneme template library with a Standard Mandarin feature kernel, a template library of Standard Mandarin teacher utterance features, a Chinese speech signal processing and recognition subsystem, a Chinese character writing signal template library, and a writing evaluation subsystem. It automatically performs real-time quantitative evaluation of a learner's results in speaking Standard Mandarin and writing Chinese characters.

Description

Computer aided Chinese intelligent education system and its implementation

The present invention relates to a computer-aided Chinese intelligent education system and a method of implementing it. It belongs to the field of computer-aided instruction.

Existing methods of teaching Standard Mandarin mainly comprise classroom instruction, distance teaching, television teaching, audio-recording teaching, video teaching and computer multimedia teaching. These methods have the following problems: a. classroom instruction is usually constrained by teaching conditions such as time, place and the teacher's level; b. distance teaching, television teaching, audio-recording teaching, video teaching and the computer multimedia Chinese teaching programs currently on the market are all insufficiently flexible: they cannot automatically analyze, online in real time and in a scientifically quantified way, the problems a learner has during study (for example, whether the pronunciation is correct, whether the strokes are written correctly, and whether the written result is correct). Consequently, they cannot give timely, targeted feedback. Related documents found by search are the Chinese patents "CN1101446A Computerized system for teaching speech", "CN2202949Y Chinese studying device", "CN1146820A talking phonics interactive learning device" and "CN2257944Y Chinese phonetic alphabet Received Pronunciation Teaching instrument". The content disclosed in these documents differs substantially from the present invention in both teaching function and teaching mode.

The purpose of the invention is to provide a computer-aided Chinese intelligent education system, and a method of implementing it, that can automatically analyze, online in real time and in a scientifically quantified way, the problems a learner has while studying Chinese. With this intelligent tutoring system, standardized teaching of Standard Mandarin can be carried out without being limited by time, place or the teacher's level. Without a (live) teacher present, the system automatically evaluates, online in real time and quantitatively, the results of a learner's study of Standard Mandarin (including the accuracy of pronunciation, the correctness of the writing process and the correctness of the written result), and promptly gives effective suggestions for improvement addressed to the specific problems of each learner.

The purpose of the invention is achieved by the following technical scheme, which comprises: a multimedia computer with a microphone and CD-ROM drive; a computer multimedia Standard Mandarin teaching program; a speech phoneme template library with a Standard Mandarin feature kernel, and its implementation method; a Standard Mandarin teacher utterance feature template library, and its implementation method; a Chinese speech signal processing and recognition subsystem, and its implementation method; a speech learning evaluation and suggestion subsystem, and its implementation method; a writing signal processing and pattern recognition subsystem, and its implementation method; a writing signal template library, and its implementation method; and a writing evaluation and suggestion subsystem, and its implementation method. The learner studies by following the teaching program on the computer, and the resulting speech or handwritten Chinese characters are input to the computer through the microphone or the mouse. If the input is a speech learning signal, it passes through the speech signal processing and recognition subsystem into the speech evaluation and suggestion subsystem, where it is matched against the speech phoneme template library with the Standard Mandarin feature kernel and against the teacher utterance feature template library. This yields an evaluation of the learner's speech, namely whether it is standard Mandarin, Mandarin with an accent, or speech containing errors such as wrong, missing or extra sounds, and suggestions for improvement are given automatically for the particular problems the learner has in speech learning. If the input is a writing learning signal, it is processed by the writing signal pattern recognition subsystem and then enters the writing process and writing result evaluation and suggestion subsystem, where it is pattern-matched against the writing template library. This yields an evaluation of the learner's writing process and written result, and suggestions for improvement are given automatically for the particular problems the learner has in learning to write. This completes the computer-aided intelligent teaching of Standard Mandarin.

Compared with existing Chinese teaching methods, the present invention has the following advantages:

a. It is not limited by teaching conditions such as time, place and the teacher's level. Without a (live) teacher present, it automatically evaluates, online in real time and quantitatively, the results of foreign students' study of Standard Mandarin (including the accuracy of pronunciation, the correctness of the writing process and the correctness of the written result), and promptly gives effective suggestions for improvement addressed to each learner's specific problems.

b. The invention establishes a speaker-independent, large-vocabulary speech phoneme feature template library that is suitable for continuous speech recognition and has a Standard Mandarin feature kernel, thereby effectively providing the template library that a system for evaluating Standard Mandarin teaching results requires.

c. The invention establishes a Standard Mandarin teacher utterance feature template library. (The notion of "utterance" here is broad: it covers Chinese pinyin, the toned syllables of Chinese characters, and Standard Mandarin words, sentences and texts.) This library consists of three parts: a Standard Mandarin feature kernel (phoneme) template library, a template library of phoneme label strings for the teaching utterances, and a template library of their tone and prosody features.

d. The invention realizes automatic online evaluation of speech learning by the following process. Following the prepared computer-aided multimedia teaching program, the student uses the mouse to select the text content to be learned (a Chinese syllable, character, word, sentence and so on) and clicks the sound icon to play the teacher's Standard Mandarin pronunciation of that utterance. The student then practises the pronunciation. The computer processes the student's speech signal in real time and compares it with the speaker-independent, large-vocabulary Chinese speech phoneme template library with the Standard Mandarin feature kernel, obtaining first-, second- and third-choice label pairs for the phoneme in the kernel-domain or mixed-domain template library. (Each candidate phoneme label carries a variable result-state flag: 0 means the result belongs to the kernel domain and 1 means it belongs to the mixed domain. For example, p(1) means that the recognition result is the phoneme p belonging to the mixed domain.) This completes recognition of the learner's speech phoneme. Repeating the process gives first-, second- and third-choice phoneme feature label strings for the utterance being learned, and the corresponding fundamental-frequency track is then computed. These are pattern-matched against the corresponding utterance label string template library and the tone and prosody feature template library within the teacher utterance feature template library, giving a quantitative measure of the degree of match (first, whether the phonemes match; second, whether they belong to the kernel domain; third, whether the tones are correct). The computer thus automatically evaluates not only how well the learner speaks Chinese (whether there are extra, missing or wrong sounds, and whether the tones are correct) but also how close the speech is to Standard Mandarin (pronunciation accurate, basically accurate, or incorrect). The computer automatically outputs the evaluation result together with targeted suggestions for improvement.

e. Inside a square frame with 3 cm sides and a "米"-shaped grid provided by the teaching program, the student uses the mouse to practise writing (strokes and character form) on the computer screen, following the writing teacher's model character and demonstrated writing process. By (dynamic) pattern comparison with the writing teacher template library, the system automatically judges how correct the student's writing process and written result are, and then automatically outputs a quantified writing evaluation and targeted suggestions for improvement.
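
The label-string comparison described in advantage d can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `evaluate_labels`, the use of `difflib.SequenceMatcher` for alignment, and the rule that maps errors and domain flags to the three grades are all assumptions made for illustration, and tone checking is left out.

```python
from difflib import SequenceMatcher


def evaluate_labels(teacher, learner):
    """Compare a learner's phoneme label string with the teacher's template.

    teacher: list of phoneme labels, e.g. ["n", "i", "h", "ao"].
    learner: list of (label, domain) pairs, where domain 0 means the kernel
             domain and 1 means the mixed domain, as in the p(1) notation.
    The grading rule below is a hypothetical mapping, not taken from the source.
    """
    learner_labels = [label for label, _ in learner]
    wrong, missing, extra = [], [], []
    matcher = SequenceMatcher(None, teacher, learner_labels, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":      # a different phoneme was produced: wrong sound
            wrong.append((teacher[i1:i2], learner_labels[j1:j2]))
        elif tag == "delete":     # a template phoneme is absent: missing sound
            missing.extend(teacher[i1:i2])
        elif tag == "insert":     # a phoneme not in the template: extra sound
            extra.extend(learner_labels[j1:j2])
    mixed = [label for label, domain in learner if domain == 1]
    if wrong or missing or extra:
        grade = "incorrect"
    elif mixed:
        grade = "basically accurate"   # phonemes match but lie outside the kernel
    else:
        grade = "accurate"
    return {"grade": grade, "wrong": wrong, "missing": missing,
            "extra": extra, "mixed_domain": mixed}
```

With the teacher template `["n", "i", "h", "ao"]`, a learner string in which `h` is recognized only in the mixed domain is graded "basically accurate", while a string that omits `h` is reported with `h` under `missing`.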

The concrete structure of the invention is shown in accompanying Figure 1.

Figure 1 is a schematic diagram of the structure of the computer-aided Chinese intelligent education system.

The components shown in the figure are: a multimedia computer 1 with a microphone and CD-ROM drive; a computer multimedia Standard Mandarin teaching program 2; a speech phoneme template library 3 with a Standard Mandarin feature kernel; a Standard Mandarin teacher utterance feature template library 4; a Chinese speech signal processing and recognition subsystem 5; a speech learning evaluation and suggestion subsystem 6; a writing signal processing and pattern recognition subsystem 7; a writing signal template library 8; and a writing evaluation and suggestion subsystem 9.

The invention is described in detail below with reference to the accompanying drawing:

(1) Build a speaker-independent (preferably more than 100 speakers), large-vocabulary (more than 5000 Chinese words) Standard Mandarin corpus suitable for continuous speech recognition, covering as many connected-speech phenomena as possible.

(2) Produce a Standard Mandarin teaching program videotape that includes the teaching pronunciation of at least two teachers, one male and one female.

(3) Using the computer, convert the Standard Mandarin teaching videotape into a speech teaching database and a writing (dynamic image) teaching database.

(4) Under MS-Windows 95, use the Visual C++ programming language (version 4.0 or later) for modular programming, and build the speaker-independent, large-vocabulary speech phoneme template library, with a Standard Mandarin feature kernel, for Chinese continuous speech recognition. It is implemented as follows.

First, the teacher's Standard Mandarin videotaped teaching program is digitized separately as teaching (dynamic) images and as speech signals (in units of pinyin, characters, words and sentences, with utterance samples from at least two speakers, one male and one female) and stored in a standard format. From the multimedia (dynamic) image part, a teaching database of (dynamic multimedia) images of the teacher's articulation is built; from the speech signal part, a Standard Mandarin teacher teaching corpus database is built.

Next, the speaker-independent, large-vocabulary Chinese speech phoneme template library with the Standard Mandarin feature kernel is built, which first requires extracting speech feature parameters. The speech data in the Standard Mandarin teacher teaching corpus database are windowed (a Hamming window is used here). The corpus has a sampling rate of 16 kHz, 16-bit quantization and a single channel; each frame holds 256 samples, the frame shift is 128 samples, and the window is about 16 ms long. An FFT is applied to each frame, and the feature parameter vector of each frame is then computed. Using a voiced/unvoiced decision rule, each frame's feature parameter vector is assigned to the unvoiced or the voiced phoneme feature parameter vector library. This continues until a voiced/unvoiced boundary frame appears; the feature parameter vectors of all frames belonging to one unvoiced or voiced phoneme are then combined in a weighted average to give the feature parameter vector for that phoneme. Repeating this process yields, for each phoneme template, many Standard Mandarin feature parameter vectors (male and female mixed), which form a feature parameter vector group. So that this group reflects the features of the phoneme well, a Kohonen self-organizing neural network is used to cluster it, giving the feature parameter vector kernel space domain for that phoneme. Repeating the process gives the feature parameter vector kernel space domain templates for all Standard Mandarin speech phonemes, which are stored separately.

In addition, by a similar process, the speaker-independent, large-vocabulary Chinese corpus for continuous speech recognition is used to obtain the speaker-independent, large-vocabulary Chinese speech phoneme template library. First, the speech data in the original speech data files (that is, the Chinese corpus) are windowed (a Hamming window is used here). The corpus has a sampling rate of 16 kHz, 16-bit quantization and a single channel; each frame holds 256 samples, the frame shift is 128 samples, and the window is about 16 ms long. An FFT is applied to each frame, and the feature parameter vector of each frame is then computed. Using the voiced/unvoiced decision rule, each frame's feature parameter vector is assigned to the unvoiced or the voiced group of phoneme feature parameter vectors. This continues until a voiced/unvoiced (or silence) boundary frame appears; the feature parameter vectors of all frames belonging to one unvoiced or voiced phoneme (or silence) are then combined in a weighted average to give the feature parameter vector for that phoneme. Because the speaker-independent, large-vocabulary Chinese corpus for continuous speech recognition is used here (covering as many connected-speech phenomena as possible), repeating the process yields many feature parameter vectors (male and female mixed) for each phoneme template. The kernel feature parameter vector group obtained from the Standard Mandarin corpus is merged with this new group to form a mixed feature parameter vector group. So that the mixed group reflects the features of the phoneme well, the Kohonen self-organizing neural network is applied again to cluster it, giving the mixed feature parameter vector space domain for that phoneme. This domain evidently contains the Standard Mandarin feature kernel template; that is, the kernel template can be regarded as a subset of the feature parameter vectors of the mixed speech phoneme template (which is why it is called the "kernel" template). For ease of searching, unvoiced and voiced sounds can be divided into two groups, giving two speech phoneme sub-template libraries with Standard Mandarin feature kernels. Together, these two sub-libraries form the speaker-independent, large-vocabulary Chinese speech phoneme template library with the Standard Mandarin feature kernel that this intelligent tutoring system requires.

(5) Build the Standard Mandarin teacher utterance feature template library. It is implemented as follows. By a process similar to (4), feature parameter vector strings for toned syllables, characters, words, sentences and texts are extracted from the teacher's Standard Mandarin speech stream, and the Standard Mandarin teaching utterance feature template library is built. It consists of three parts: the Standard Mandarin feature kernel phoneme template library, the teaching utterance phoneme label string template library, and the teaching utterance tone and prosody feature template library. The first part has already been completed. The second part is obtained as follows: for each syllable, character, word, sentence or text spoken by the teacher, the corresponding phoneme label string template (including tone features) is obtained and stored in the teaching utterance feature template library. The third part, the teaching utterance tone and prosody feature template library, consists of the set of fundamental-frequency track templates of the teaching utterances.

The pattern of change of the fundamental frequency is the basic characteristic of the tones of Chinese characters, and within a Chinese sentence the tones appear as variations of the four tones and of prosody. It is therefore easy to see that the variation of the four tones and of prosody can be characterized, in part, by estimating the pitch period. Common methods of obtaining the pitch parameter include algorithms based on the short-time autocorrelation function, on the short-time AMDF (average magnitude difference function), on homomorphic signal processing, and on linear predictive coding. The short-time autocorrelation algorithm is chosen here, mainly because its computational cost is small and its results are good. In computing the fundamental frequency, the position of the first maximum peak sometimes does not coincide with the pitch period. This is addressed in two ways: first, the window length is made at least two pitch periods, which gives better results; second, the speech signal is filtered with a band-pass filter with a 60-900 Hz passband, and the autocorrelation function of the filtered signal is used for pitch estimation.

One problem remains: whichever algorithm is used, the estimated pitch period track cannot coincide exactly with the true track. In practice most segments coincide, but in some local segments, or at one or a few points, the pitch estimate departs from the normal track; such points are called outliers ("wild points"). Various smoothing algorithms can be used to remove them, the most common being median smoothing and linear smoothing. Median smoothing is used here: L samples are taken on each side of the point being smoothed, forming together with that point a group of (2L+1) sample values; these are sorted by magnitude, and the middle value of the sorted sequence is the output of the smoother. L is generally 1 or 2, so the median smoothing window generally spans 3 to 5 sample values. The advantage of median smoothing is that it removes a small number of outliers without destroying step changes between two smooth segments of the pitch period track.

In this way, the phoneme feature vector kernel space domain template library, the teaching utterance phoneme label string template library and the tone and prosody feature template library for all the Standard Mandarin teaching utterances are obtained. Together, these three libraries constitute the Standard Mandarin (teacher) teaching utterance feature template library. Building the whole library corresponds to a speech teacher's "lesson preparation".

(6) Under MS-Windows 95, use the Visual C++ programming language (version 4.0 or later) to write the Chinese speech signal processing and recognition software module and the software module for evaluating learning results. They are implemented as follows. The student learns Standard Mandarin on the computer, following the prepared computer-aided multimedia teaching program. First, following the study schedule given by the program, the student selects the Chinese sentence (or syllable, character, word and so on) to be learned and clicks the sound icon to play the teacher's Standard Mandarin pronunciation of that utterance. The student then practises the pronunciation (the learner's speech is sampled here at 8 kHz with 8-bit quantization and a single channel), and the speech stream is stored in the corresponding data file. Its feature parameters are extracted by the process described in (4). The speech data in the file are windowed (a Hamming window is used here); with the sampling frequency and quantization set as above, each frame holds 128 samples, the frame shift is 64 samples, and the window is about 16 ms long. An FFT is applied to each frame, and the feature parameter vector of each frame is then computed. Using the voiced/unvoiced decision rule, each frame's feature parameter vector is assigned to the unvoiced or the voiced phoneme feature parameter vectors. This continues until a voiced/unvoiced (or silence) boundary frame appears; the feature parameter vectors of all frames belonging to one unvoiced or voiced phoneme (or silence) are then combined in a weighted average to give the feature parameter vector for that phoneme.

Using a Kohonen neural network with a supervised learning algorithm, this vector is first compared with the Chinese speech phoneme feature template library with the Standard Mandarin feature kernel built in (4), giving first-, second- and third-choice label pairs for the phoneme in the kernel-domain or mixed-domain template library. (Each candidate phoneme label carries a variable result-state flag: 0 means the result belongs to the kernel domain and 1 means it belongs to the mixed domain. For example, b(1) means that the recognition result is the phoneme b belonging to the mixed domain.) This completes recognition of the learner's speech phoneme. Repeating the process gives first-, second- and third-choice phoneme feature label strings for the utterance being learned, and the corresponding fundamental-frequency track is then computed. Pattern matching against the utterance label string template library and the tone and prosody feature template library in the teacher's teaching utterance feature template library gives a quantitative measure of the degree of match (first, whether the phonemes match; second, whether they belong to the kernel domain; third, whether the tones are correct). This provides both a rough evaluation of how the learner speaks Chinese (whether there are extra, missing or wrong sounds, and whether the tones are correct) and an evaluation of how close the speech is to Standard Mandarin (pronunciation accurate, basically accurate, or incorrect). Finally, the computer automatically outputs the quantitative evaluation result and targeted suggestions for improvement.

(7) Under MS-Windows 95, use the Delphi language to write the computer-aided multimedia teaching program for Chinese as a foreign language. It covers the teaching of Chinese pinyin, the explanation of Standard Mandarin characters, words, sentences and texts, and speech and writing instruction. The multimedia teaching software calls the software modules written in Visual C++ as required. The teaching program has the following functions and features: a. it teaches pinyin and Chinese characters, words, sentences and texts in lesson order; b. each teaching unit contains four parts, namely listening, speaking, reading and writing, and after the learner finishes the "speaking" and "writing" practice the learning result is evaluated automatically in real time, after which targeted suggestions for improving study are given automatically; c. while "listening" to the teacher, the learner can see the teacher's mouth movements on the screen; d. during writing instruction, the learner can see the teacher's writing process and written result on the screen; e. to support teaching, several Chinese learning games are provided for the learner to choose from; f. some lesson content is explained with accompanying animation.

(8) Build the Chinese character writing teacher teaching (dynamic) template library. It is implemented as follows. The teaching Chinese characters (regular script) are saved as graphic patterns in square frames with 3 cm sides and a "米"-shaped grid background, and are used as templates. In each such frame the gray level of the grid lines is represented by -1; the screen resolution is 640 × 480; and the font weight is 0 or 1 (white or black respectively). The Chinese character (regular script) template library built with these settings serves as the teacher's writing demonstration template library in this tutoring system. So that the system can teach the writing process (strokes, stroke order and so on), the whole writing process must be recorded in advance, digitized and stored as required, forming the (dynamic) writing template library. It is then played back (dynamically) at 24 frames per second as the teaching requires.

(9) The writing process and the written result are evaluated as follows. Inside the square frame with 3 cm sides and a "米"-shaped grid provided by the teaching program, the student uses the mouse to practise writing (strokes and character form) on the computer screen, following the writing teacher's model character and demonstrated writing process. (Dynamic) pattern comparison with the (dynamic) writing teaching template library shows whether the student's writing process and written result are both correct, and a quantified writing evaluation and targeted suggestions for improvement are then given automatically.
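
The pitch-track processing described in step (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented code: the function names are hypothetical, the 60-900 Hz band-pass filter is replaced here by simply restricting the autocorrelation lag search to the same range, and the track edges are handled by repeating the first and last values, which the source does not specify.

```python
def autocorr_pitch(frame, sample_rate, f_min=60.0, f_max=900.0):
    """Estimate the pitch (Hz) of one frame from its short-time autocorrelation.

    Assumption: the lag search is limited to periods between 1/f_max and
    1/f_min instead of band-pass filtering the signal first. The frame should
    span at least two pitch periods, as the description recommends.
    """
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(len(frame) - 1, int(sample_rate / f_min))
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else 0.0


def median_smooth(track, half_width=1):
    """Median-smooth a pitch track with a window of (2L + 1) values, L = half_width.

    Assumption: the track is padded by repeating its end values so that the
    output has the same length as the input.
    """
    if not track:
        return []
    padded = [track[0]] * half_width + list(track) + [track[-1]] * half_width
    smoothed = []
    for i in range(len(track)):
        window = sorted(padded[i:i + 2 * half_width + 1])
        smoothed.append(window[half_width])   # middle value of the sorted window
    return smoothed
```

For example, `median_smooth([200, 202, 400, 204, 206])` removes the isolated value 400, while a step such as `[100, 100, 100, 200, 200, 200]` passes through unchanged, which is the property the description attributes to median smoothing.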

The working principle and process of the whole intelligent tutoring system are as follows. The learner studies by following the teaching program 2 on the computer 1, and the resulting speech (or handwritten Chinese characters) is input to the computer through the microphone (or the mouse). If the input is a speech learning signal, it passes through the speech signal processing and recognition subsystem 5 into the speech evaluation and suggestion subsystem 6, where it is matched against the speech phoneme template library 3 with the Standard Mandarin feature kernel and against the teacher utterance feature template library 4. This yields an evaluation of the learner's speech (whether it is standard Mandarin, Mandarin with an accent, or speech containing errors such as wrong, missing or extra sounds), and suggestions for improvement are given automatically for the particular problems the learner has in speech learning. If the input is a writing learning signal, it is processed by the writing signal pattern recognition subsystem 7 and then enters the writing process and writing result evaluation and suggestion subsystem 9, where it is pattern-matched against the writing template library 8. This yields an evaluation of the learner's writing process and written result, and suggestions for improvement are given automatically for the particular problems the learner has in learning to write. This completes the computer-aided Chinese intelligent education task.
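
The first stage of speech signal processing, as set out in steps (4) and (6), splits the signal into overlapping frames, applies a Hamming window and takes an FFT. The Python sketch below illustrates that stage only. The function names are hypothetical, and because the description does not say which feature parameters are derived from the FFT, the magnitude spectrum is used here purely as a stand-in.

```python
import cmath
import math


def frame_duration_ms(frame_len, sample_rate):
    """Length of one frame in milliseconds."""
    return 1000 * frame_len / sample_rate


def hamming(n):
    """Symmetric Hamming window of n points."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def magnitude_spectra(samples, frame_len=256, hop=128):
    """Return one magnitude spectrum (bins 0..frame_len/2) per windowed frame."""
    window = hamming(frame_len)
    spectra = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        spectrum = fft([s * w for s, w in zip(frame, window)])
        spectra.append([abs(c) for c in spectrum[:frame_len // 2 + 1]])
    return spectra
```

With these definitions, both parameter sets quoted in the description give the same frame length: `frame_duration_ms(256, 16000)` for the corpus and `frame_duration_ms(128, 8000)` for the learner's speech each evaluate to 16.0 ms, and in both cases the frame shift is half a frame.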

Claims

1. A computer-aided Chinese intelligent education system and its implementation method, mainly comprising a multimedia computer with a microphone and CD-ROM drive, a speech phoneme template library with a Standard Mandarin feature kernel, a Standard Mandarin teacher utterance feature template library, a Chinese speech signal processing and recognition subsystem, a Chinese speech learning evaluation and suggestion subsystem, a Chinese writing signal processing and pattern recognition subsystem, a Chinese writing signal template library, a Chinese writing evaluation and suggestion subsystem, and a corresponding computer multimedia Standard Mandarin teaching program, characterized in that:

A. Speech phoneme template library with a Chinese standard Mandarin feature kernel, and its implementation: First, the teacher's standard Mandarin video teaching program is digitized, with the teaching images and the speech signal processed separately. The speech signal is organized in units of Chinese pinyin/character/word/sentence, contains utterance samples from at least two speakers, one male and one female, and is stored in a standard format. For the multimedia moving images, a multimedia image database of the standard Mandarin teacher's articulation process is set up; for the speech signal, a Chinese standard Mandarin teacher teaching corpus database is set up.

Next, the speech feature parameters are extracted. The speech signal in the standard Mandarin teacher teaching corpus database is windowed. The original corpus is sampled at 16 kHz with 16-bit quantization, mono; each frame contains 256 samples, the frame shift is 128 samples, and the window length is about 16 ms. An FFT is applied to the data in each frame, after which the feature parameter vector of each frame is computed. According to a voiced/unvoiced decision rule, the feature parameter vector of each frame is assigned to the unvoiced or the voiced phoneme feature parameter vector library. This continues until a voiced/unvoiced boundary frame appears; the feature parameter vectors of all frames belonging to one unvoiced phoneme or one voiced phoneme (or silence) are then combined by weighted averaging to give the feature parameter vector of that phoneme. Repeating this process yields many standard Mandarin feature parameter vectors for a given phoneme template, and these form a feature parameter vector group. So that this group reflects the characteristics of the phoneme well, a Kohonen self-organizing neural network is used to perform self-organizing clustering on it, which yields the feature parameter vector kernel space domain of that phoneme. Repeating the process yields the feature parameter vector kernel space domain templates of all standard Mandarin speech phonemes, which are saved separately.

Following a similar procedure, a speaker-independent large-vocabulary Chinese speech phoneme template library is obtained from a speaker-independent large-vocabulary Chinese corpus database intended for continuous speech recognition. The speech data in the original speech data files are windowed with the same parameters (16 kHz sampling, 16-bit quantization, mono, 256 samples per frame, a frame shift of 128 samples, a window length of about 16 ms); an FFT is applied to the data in each frame; and the feature parameter vector of each frame is computed. According to the voiced/unvoiced decision rule, the feature parameter vector of each frame is assigned to the unvoiced or the voiced group of phoneme feature parameter vectors. This continues until a voiced/unvoiced boundary frame appears; the feature parameter vectors of all frames belonging to one unvoiced phoneme or one voiced phoneme are then combined by weighted averaging to give the feature parameter vector of that phoneme. Because the corpus used here is a speaker-independent large-vocabulary corpus suited to continuous speech recognition, repeating the process yields many feature parameter vectors for a given phoneme template. The kernel feature parameter vector group obtained from the standard Mandarin corpus is combined with these new feature parameter vectors to form a mixed feature parameter vector group. So that this mixed group reflects the characteristics of the phoneme well, the Kohonen self-organizing neural network is applied again to perform self-organizing clustering on it, which yields the mixed feature parameter vector space domain of that phoneme. This domain evidently contains the standard Mandarin feature kernel template; in other words, the standard Mandarin feature kernel template can be regarded as a subset of the feature parameter vectors of the mixed speech phoneme template. For convenience of search, unvoiced and voiced sounds are divided into two groups, and two speech phoneme sub-template libraries with a standard Mandarin feature kernel are set up. Together, these two sub-template libraries form the speaker-independent large-vocabulary Chinese speech phoneme template library with a standard Mandarin feature kernel required by this intelligent teaching system.
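
The framing and FFT step described in A can be sketched as follows. This is only an illustration under stated assumptions: the description does not name the window shape or the exact feature parameter vector, so a Hamming window and a magnitude spectrum are assumed here, and the function names `fft`, `frame_features`, and `frame_duration_ms` are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frame_features(samples, frame_len=256, hop=128):
    """Split samples into overlapping frames and return one magnitude
    spectrum per frame (assumed feature; the source leaves it unspecified)."""
    # Hamming window: an assumption, not stated in the source.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        spectrum = fft(frame)
        # Keep the non-redundant half of the spectrum (bins 0 .. frame_len/2).
        features.append([abs(c) for c in spectrum[:frame_len // 2 + 1]])
    return features


def frame_duration_ms(frame_len, sample_rate):
    """Duration of one frame in milliseconds."""
    return 1000.0 * frame_len / sample_rate
```

With 256 samples per frame at 16 kHz, `frame_duration_ms(256, 16000)` gives the 16 ms window length stated in the text; 128 samples at 8 kHz gives the same duration.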

B. Chinese standard Mandarin teacher sentence feature template library and its implementation: First, feature parameter vector strings of toned syllables/characters/words/sentences/texts are extracted from the teacher's standard Mandarin speech stream, and a Chinese standard Mandarin teaching sentence feature template library is set up. It consists of three parts: the Chinese standard Mandarin feature kernel phoneme template library, the teaching sentence phoneme label string template library, and the teaching sentence tone and prosody feature template library. The first part has already been completed in A. The second part is implemented by obtaining, for each syllable/character/word/sentence/text spoken by the teacher, the corresponding phoneme label string template and storing it in the teaching sentence feature template library. The third part, the teaching sentence tone and prosody feature template library, consists of a set of fundamental frequency contour templates of the teaching sentences.

The four tones and the prosodic variation can be characterized by estimating the pitch period. Common methods for obtaining the pitch parameter include algorithms based on the short-time autocorrelation function, on the short-time average magnitude difference function (AMDF), on homomorphic signal processing, and on linear predictive coding. The short-time autocorrelation algorithm is chosen here, mainly because its computational cost is low and its results are good. When the fundamental frequency is computed, the position of the first maximum peak sometimes does not coincide with the pitch period. Two measures address this problem: first, the window length is made at least greater than two pitch periods, which gives better results; second, the speech signal is filtered with a band-pass filter with a 60-900 Hz passband, and the autocorrelation function of the filtered signal is used for pitch estimation.

One further problem remains. Whatever algorithm is used, the estimated pitch period track cannot coincide exactly with the true pitch period track. In practice the two agree over most segments, but in some local segments or regions one or a few pitch period estimates deviate from the normal track; these are called outliers ("wild points"). Removing them requires a smoothing algorithm, and median smoothing is used here. L sample values are taken on each side of the point to be smoothed and, together with that point, form a group of 2L+1 sample values. These 2L+1 values are sorted by magnitude, and the middle value of the sorted sequence is taken as the output of the smoother. L is generally 1 or 2, so the median smoothing window generally spans 3 to 5 sample values. The advantage of median smoothing is that it removes a small number of outliers without destroying the step changes between two smooth segments of the pitch period track.

The phoneme feature vector kernel space domain template library, the teaching sentence phoneme label string template library, and the tone and prosody feature template library of all Chinese standard Mandarin teaching sentences have thus been obtained. Together, these three template libraries constitute the standard Chinese Mandarin teacher teaching sentence feature template library.
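
The pitch estimation and median smoothing described in B can be sketched as follows. This is a minimal illustration, not the patented implementation: instead of applying the 60-900 Hz band-pass filter, the sketch simply restricts the autocorrelation lag search to that frequency range (a substitution made here for brevity), the edge handling in the smoother is an assumption, and the function names are hypothetical.

```python
import math


def autocorr_pitch(frame, sample_rate, f_min=60.0, f_max=900.0):
    """Estimate F0 (Hz) of one frame from the largest positive peak of its
    short-time autocorrelation, searching only lags that correspond to
    f_min..f_max. Returns 0.0 if no positive peak is found."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return sample_rate / best_lag if best_lag else 0.0


def median_smooth(track, L=1):
    """Median-smooth a pitch track with a window of 2L+1 values.
    The first and last L values are copied unchanged (assumed edge rule)."""
    out = list(track)
    for i in range(L, len(track) - L):
        window = sorted(track[i - L:i + L + 1])
        out[i] = window[L]  # middle of the 2L+1 sorted values
    return out
```

For example, `median_smooth([100, 100, 100, 300, 100, 100, 200, 200, 200])` removes the single outlier 300 while keeping the step from 100 to 200, which is the property the text attributes to median smoothing.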

C. Chinese speech learning evaluation and suggestion subsystem and its implementation: According to the prepared computer-aided multimedia teaching program content and the learning schedule set by the program, the computer selects the Chinese sentence to be learned; clicking the sound icon plays the teacher's standard Mandarin pronunciation of that sentence. The learner then begins to practice the pronunciation. The learner's speech is sampled at 8 kHz with 8-bit quantization, mono, and the learner's speech stream is stored in a corresponding data file. Its feature parameters are extracted by the process described in A. With the sampling frequency and quantization set as above, each frame contains 128 samples, the frame shift is 64 samples, and the window length is about 16 ms. An FFT is applied to the data in each frame, after which the feature parameter vector of each frame is computed. According to the voiced/unvoiced decision rule, the feature parameter vector of each frame is assigned to the unvoiced or the voiced phoneme feature parameter vectors. This continues until a voiced/unvoiced boundary frame appears; the feature parameter vectors of all frames belonging to one unvoiced phoneme or one voiced phoneme are then combined by weighted averaging to give the feature parameter vector of that phoneme.

Using a Kohonen neural network and a supervised learning algorithm, this vector is first compared with the Chinese speech phoneme feature template library with a standard Mandarin feature kernel set up in A, giving the first-choice, second-choice, and third-choice label pairs of the phoneme in the kernel-domain or mixed-domain template library. That is, the phoneme label of each choice carries a result state variable: 0 indicates that the result belongs to the kernel domain, and 1 indicates that it belongs to the mixed domain. This completes the recognition of the learner's speech phonemes. Repeating the process yields the first-choice, second-choice, and third-choice phoneme feature label strings of the sentence being learned. The corresponding fundamental frequency contour is then computed.

By pattern matching against the sentence label string template library and the tone and prosody feature template library in the teacher's teaching sentence feature template library, a quantitative evaluation of the degree of matching is obtained: first, whether the phonemes match; second, whether they belong to the kernel domain; third, whether the tones are correct. This evaluates the learner's general performance in Chinese, including whether there are extra sounds, missing sounds, or wrong sounds and whether the tones are correct, and also evaluates the learner's performance in standard Mandarin, that is, how close the speech is to standard Mandarin: pronunciation accurate, basically accurate, or incorrect. Finally, the computer automatically outputs the quantitative evaluation of the learning performance together with targeted suggestions for improvement.
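
The two matching steps described in C, picking three candidate labels with a kernel/mixed state flag and then comparing label strings, can be sketched as follows. This is a hedged illustration only: nearest-neighbour Euclidean distance stands in for matching against the trained Kohonen network, the alignment uses Python's `difflib.SequenceMatcher` rather than any method named in the description, and the data layout and function names are hypothetical.

```python
import math
from difflib import SequenceMatcher

KERNEL, MIXED = 0, 1  # result-state codes given in the text


def top3_candidates(vec, templates):
    """Return up to three (label, state) pairs, best match first.

    templates maps a phoneme label to a list of (vector, state) entries,
    where state is KERNEL or MIXED. Nearest-neighbour distance is an
    assumed stand-in for the Kohonen-network comparison.
    """
    scored = []
    for label, entries in templates.items():
        dist, state = min((math.dist(vec, v), s) for v, s in entries)
        scored.append((dist, label, state))
    scored.sort()
    return [(label, state) for _, label, state in scored[:3]]


def compare_label_strings(teacher, learner):
    """Align two phoneme label lists and count matching, wrong,
    missing and extra phonemes in the learner's string."""
    counts = {"match": 0, "wrong": 0, "missing": 0, "extra": 0}
    matcher = SequenceMatcher(None, teacher, learner, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        n_teacher, n_learner = i2 - i1, j2 - j1
        if tag == "equal":
            counts["match"] += n_teacher
        else:
            # Paired-off labels count as wrong; any surplus on one side
            # counts as missing (teacher side) or extra (learner side).
            paired = min(n_teacher, n_learner)
            counts["wrong"] += paired
            counts["missing"] += n_teacher - paired
            counts["extra"] += n_learner - paired
    return counts
```

A learner string that omits one phoneme, for example `["n", "i", "ao"]` against the teacher's `["n", "i", "h", "ao"]`, is reported as three matches and one missing sound.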

D. Chinese writing signal template library and its implementation: The regular-script Chinese characters used for teaching are saved as graphic patterns in a square frame with a side length of 3 cm and a 米-shaped guide-grid background, and these patterns are used as templates. In each such 3 cm frame, the gray level of the guide grid is represented by -1; the screen resolution is 640 × 480; and the character pixels take the value 0 or 1, denoting white or black respectively. The regular-script character template library set up in this way serves as the teacher's writing demonstration template library in this teaching system. So that the system can also teach the writing process, such as the strokes and the stroke order, the entire writing process is recorded in advance, digitized, and stored as required, setting up a dynamic writing template library. It is then played back dynamically at 24 frames per second, as the teaching requires.
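
One possible in-memory form of such a template can be sketched as follows. The sketch assumes that -1 marks guide-grid cells and that the guide grid consists of the border, the centre cross, and both diagonals; the grid size in cells is left as a parameter because the description gives the frame size only in centimetres. All names are hypothetical.

```python
GUIDE, WHITE, BLACK = -1, 0, 1  # cell codes taken from the text


def make_guide_grid(size):
    """Return a size x size cell matrix whose guide lines (border, centre
    cross, both diagonals: an assumed layout) are GUIDE, the rest WHITE."""
    mid = size // 2
    grid = [[WHITE] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            on_border = r in (0, size - 1) or c in (0, size - 1)
            on_cross = r == mid or c == mid
            on_diag = r == c or r + c == size - 1
            if on_border or on_cross or on_diag:
                grid[r][c] = GUIDE
    return grid


def ink(grid, cells):
    """Return a copy of grid with the given (row, col) cells set to BLACK."""
    out = [row[:] for row in grid]
    for r, c in cells:
        out[r][c] = BLACK
    return out


def frame_times(n_frames, fps=24):
    """Playback time in seconds of each recorded frame at the given rate."""
    return [i / fps for i in range(n_frames)]
```

A dynamic writing template could then be stored as a list of such grids, one per recorded frame, with `frame_times` giving the playback schedule at 24 frames per second.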

E. Chinese writing evaluation and suggestion subsystem and its implementation: In the 3 cm square frame with the 米-shaped guide grid provided by the teaching program, the student practices writing on the computer screen with the mouse, following the teacher's model character and demonstrated writing process. By dynamic pattern comparison with the dynamic writing teaching template library, the system determines whether the student's writing process and writing result are both correct, and then automatically gives a quantitative writing evaluation together with targeted suggestions for improvement.
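
The description does not specify how the dynamic pattern comparison is performed. Purely as an illustration, the sketch below scores the writing process by stroke order and the writing result by cell overlap. The stroke representation (each stroke as a set of grid cells), the Jaccard overlap measure, and the function names are all assumptions introduced here.

```python
def jaccard(a, b):
    """Overlap of two cell sets: |a & b| / |a | b| (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def evaluate_writing(template_strokes, student_strokes):
    """Return (order_score, result_score), each between 0 and 1.

    Both arguments are lists of strokes in writing order; a stroke is a
    set of (row, col) cells. order_score is the fraction of template
    strokes whose best-overlapping student stroke occupies the same
    position in the writing order; result_score is the overlap of the
    finished characters.
    """
    in_order = 0
    for k, t_stroke in enumerate(template_strokes):
        if not student_strokes:
            break
        best = max(range(len(student_strokes)),
                   key=lambda j: jaccard(t_stroke, student_strokes[j]))
        if best == k and jaccard(t_stroke, student_strokes[best]) > 0:
            in_order += 1
    order_score = in_order / len(template_strokes) if template_strokes else 1.0
    result_score = jaccard(set().union(*template_strokes),
                           set().union(*student_strokes))
    return order_score, result_score
```

For the character 十 written with the vertical stroke before the horizontal one, this sketch reports a full result score but a zero order score, which matches the distinction the text draws between the writing result and the writing process.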

F. The computer multimedia Chinese standard Mandarin teaching program includes the following functions and features:

(a). Chinese pinyin and Chinese characters/words/sentences/texts are taught in the order of the texts;

(b). Each teaching unit comprises four parts: listening, speaking, reading, and writing; after the learner finishes the "speaking" and "writing" exercises, the learning result is evaluated automatically in real time, and targeted suggestions for improving the learning are then given automatically;

(c). While "listening" to the teacher's instruction, the learner can at the same time see the teacher's mouth-shape movements on the screen;

(d). During writing instruction, the learner can see the teacher's writing process and writing result on the screen;

(e). To support the teaching, several Chinese learning games are provided for the learner to choose from;

(f). Part of the text content is demonstrated and explained with animation.

The learner follows the teaching program on the computer and inputs the spoken or handwritten learning result into the computer through the microphone or the mouse. If the input is a speech learning signal, it passes through the speech signal processing and recognition subsystem into the speech evaluation and suggestion subsystem. There it is matched against the speech phoneme template library with a standard Mandarin feature kernel and the teacher sentence feature template library, giving an evaluation of the learner's speech learning: whether the speech is standard Mandarin, Mandarin with an accent, or speech containing errors such as wrong sounds, missing sounds, or extra sounds. Suggestions for improving the speech learning are then given automatically according to the learner's specific problems. If the input is a writing learning signal, it is processed by the writing signal pattern recognition subsystem and then enters the writing process and writing result evaluation and suggestion subsystem, where it is pattern-matched against the writing template library, giving an evaluation of the learner's writing process and writing result. Suggestions for improving the writing are then given automatically according to the learner's specific problems. This completes the computer-aided intelligent teaching task for Chinese standard Mandarin.