CN1609828A

CN1609828A - System and method for synthesizing English words from basic phonemes

Info

Publication number: CN1609828A
Application number: CN 200310102475
Authority: CN
Inventors: 杨凰琳
Original assignee: Inventec Besta Co Ltd
Current assignee: Inventec Besta Co Ltd
Priority date: 2003-10-22
Filing date: 2003-10-22
Publication date: 2005-04-27

Abstract

The present invention provides a system and method for synthesizing the speech data of English words from basic phonemes. Each English word is segmented according to a basic phoneme set made up of several initial (consonant) phonemes and several final (vowel) phonemes, yielding the initial and final phonemes that constitute the word. The corresponding initial-phoneme and final-phoneme sound waves are then retrieved from a basic phoneme sound-wave set, and these sound waves are combined to produce the speech data of the word. Because only the initial and final phonemes are stored, memory space is saved.

Description

System and method for synthesizing the speech data of English words from basic phonemes

Technical field

The present invention relates to a system and method for synthesizing the speech data of English words, applied to electronic dictionaries. In particular, it relates to a system and method that divide an English word into several basic phonemes, obtain the sound waves corresponding to those basic phonemes, and combine them to synthesize the speech data of the word.

Background technology

Pronunciation in a real human voice has long been a main selling point of electronic dictionaries. To improve the competitiveness of their products in the market, manufacturers therefore all concentrate on improving the pronunciation function while keeping production costs low.

For English, recording the sound wave of every English word in a human voice consumes a considerable amount of the electronic dictionary's speech data memory. This approach is also limited by the output device, so costs cannot be reduced.

For this reason, speech synthesis approaches were subsequently developed to achieve pronunciation close to a human voice while saving speech data memory space and improving sound quality.

One synthesis approach determines the syllable phonemes from the phonetic symbols in the dictionary's English word list (wordlist), where a syllable phoneme has the form consonant + (r, l, w) + vowel + (m, n, r, l, 9) + tone. Before the speech data of an English word is synthesized, the word must first be divided into syllable phonemes; the sound waves of the corresponding syllable phonemes are then extracted from the original recording data and combined. With this approach, however, the word cannot be pronounced when a required syllable phoneme cannot be found in the original recording data.

Another synthesis approach records the sound wave of every syllable phoneme, covering all combinations of initials, finals and tones, and stores them in the speech data memory. Before the speech data of an English word is synthesized, the word must first be divided into syllable phonemes; the corresponding sound waves are then looked up among the recorded syllable phonemes and combined. The sound waves of all the syllable phonemes, however, consume a considerable amount of speech data memory: together with the aspirated phonemes and the phoneme table, about 2 MB or more of storage is needed in total.

Summary of the invention

To solve the above problems, the purpose of the present invention is to provide a system and method for synthesizing the speech data of English words from basic phonemes. An English word is segmented according to the basic phonemes, the sound waves corresponding to the resulting basic phonemes are obtained, and these sound waves are combined to give the speech data of the word, thereby saving storage space.

To achieve the above purpose, the present invention provides a system for synthesizing the speech data of English words from basic phonemes, comprising: a speech data memory, which stores a basic phoneme sound-wave set and a basic phoneme set that correspond to each other, wherein the basic phoneme sound-wave set includes several initial-phoneme sound waves and several final-phoneme sound waves, and the basic phoneme set includes several initial phonemes and several final phonemes; and a processing module, connected to the speech data memory, which segments an English word according to the basic phoneme set to obtain the one or more initial phonemes and one or more final phonemes that make up the word, retrieves the corresponding initial-phoneme and final-phoneme sound waves from the speech data memory according to those phonemes, and combines them into the speech data of the word.

The present invention also provides a method for synthesizing the speech data of English words from basic phonemes, comprising the following steps: segmenting an English word according to a basic phoneme set made up of several initial phonemes and several final phonemes, to obtain the one or more initial phonemes and one or more final phonemes that make up the word; obtaining, from a basic phoneme sound-wave set, the initial-phoneme sound waves and final-phoneme sound waves corresponding to the initial phonemes and final phonemes into which the word was segmented; and combining these initial-phoneme and final-phoneme sound waves to obtain the speech data of the word.

In other words, the disclosed system comprises a speech data memory and a processing module. The speech data memory stores a basic phoneme sound-wave set and a basic phoneme set that correspond to each other; the sound-wave set includes several initial-phoneme sound waves and several final-phoneme sound waves, and the phoneme set includes several initial phonemes and several final phonemes. The processing module, connected to the speech data memory, segments an English word according to the basic phoneme set to obtain the initial phonemes and final phonemes that make up the word, retrieves the corresponding sound waves from the speech data memory, and combines them into the speech data of the word. A considerable amount of storage space can thus be saved.

To give a further understanding of the purpose, structural features and functions of the present invention, a detailed description follows with reference to the drawings.

Description of drawings

Fig. 1 is a system architecture diagram of the present invention;

Fig. 2A, Fig. 2B and Fig. 2C are schematic diagrams of the sound waves of syllable phonemes; and

Fig. 3 is a flowchart of the present invention.

The reference numerals are as follows:

11 - recording device; 20 - electronic dictionary; 21 - speech data memory;

22 - processing module; 23 - sound generator;

Step 101 - segment an English word;

Step 102 - obtain the sound waves;

Step 103 - combine the sound waves into speech data;

Step 104 - pronounce the speech data of the English word.

Embodiment

As shown in Fig. 1, a recording device 11 outside the electronic dictionary 20 first records the sound wave of each syllable phoneme and then segments each of these sound waves according to the basic phoneme set. The basic phoneme set contains two kinds of phoneme, initial phonemes and final phonemes, which take the forms consonant + (r, l, w) + front half of the vowel + tone and back half of the vowel + (m, n, r, l, 9) + tone, respectively. The recording device 11 then transfers the initial-phoneme sound waves and final-phoneme sound waves obtained by segmenting the syllable sound waves to the speech data memory 21 of the electronic dictionary 20 for storage.
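The recording-side split described above can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `split_syllable_wave`, the representation of a sound wave as a list of samples, the two dictionaries standing in for the speech data memory 21, and the choice to cut exactly at the midpoint of the vowel are all hypothetical. The text itself only says that the initial phoneme ends with the front half of the vowel and the final phoneme begins with the back half.

```python
from typing import Dict, List, Tuple

Wave = List[int]  # hypothetical representation: a list of PCM samples


def split_syllable_wave(wave: Wave, vowel_start: int, vowel_end: int) -> Tuple[Wave, Wave]:
    """Cut one recorded syllable into an initial part and a final part.

    vowel_start and vowel_end are sample indices delimiting the vowel
    (assumed to be marked by hand or by some external tool).
    """
    if not 0 <= vowel_start < vowel_end <= len(wave):
        raise ValueError("vowel boundaries must lie inside the recording")
    cut = (vowel_start + vowel_end) // 2  # assumed cut point: middle of the vowel
    return wave[:cut], wave[cut:]


# Hypothetical stand-in for the speech data memory 21: two lookup tables.
initial_waves: Dict[str, Wave] = {}
final_waves: Dict[str, Wave] = {}

# Toy 10-sample "recording" of a syllable such as bim; the vowel spans samples 2..8.
bim = list(range(10))
initial_waves["bi"], final_waves["im"] = split_syllable_wave(bim, 2, 8)
```

Joining the two stored halves in order reproduces the original toy recording, which is the property the later synthesis step relies on.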

An initial phoneme consists of an initial consonant and a vowel. There are 38 initial consonants (p, k, t, s, f, S, 8, 7, pl, kl, pr, kr, tw, kw, h, hw, b, g, d, z, v, 5, G, 6, bl, gl, br, gr, dr, gw, dw, w, m, n, l, r, 9, j) and 14 vowels (l, i, W, X, a, C, u, U, V, E, 3, 2, e, o). Taking eight tones as an example, there are therefore 38 × 14 × 8 + 1 × 14 × 8 = 4368 initial phonemes, where the second term covers the case with no initial consonant.

A final phoneme consists of a vowel and a final consonant. There are 14 vowels (l, i, W, X, a, C, u, U, V, E, 3, 2, e, o) and 5 final consonants (m, n, l, r, 9). Taking eight tones as an example, there are therefore 14 × 5 × 8 + 14 × 1 × 8 = 672 final phonemes, where the second term covers the case with no final consonant.
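The two counts can be checked with a few lines of Python. The symbol lists are copied from the text as printed (they are in the patent's own phonetic encoding); the variable names are chosen here for illustration only.

```python
# Symbols copied from the text as printed.
INITIAL_CONSONANTS = [
    "p", "k", "t", "s", "f", "S", "8", "7", "pl", "kl", "pr", "kr", "tw", "kw",
    "h", "hw", "b", "g", "d", "z", "v", "5", "G", "6", "bl", "gl", "br", "gr",
    "dr", "gw", "dw", "w", "m", "n", "l", "r", "9", "j",
]
VOWELS = ["l", "i", "W", "X", "a", "C", "u", "U", "V", "E", "3", "2", "e", "o"]
FINAL_CONSONANTS = ["m", "n", "l", "r", "9"]
TONES = 8

# The "+ 1" accounts for the case with no consonant at all.
initial_count = (len(INITIAL_CONSONANTS) + 1) * len(VOWELS) * TONES
final_count = (len(FINAL_CONSONANTS) + 1) * len(VOWELS) * TONES

print(initial_count)  # 4368
print(final_count)    # 672
```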

The basic phoneme sound-wave set (the initial-phoneme sound waves and the final-phoneme sound waves) therefore occupies only about 500 KB of the speech data memory 21, saving a considerable amount of its storage space.
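A rough way to see where the saving comes from is to compare the number of recorded units in the two approaches. The sketch below is an assumption-laden estimate rather than a figure from the text: it supposes that the whole-syllable approach would record every combination of optional initial consonant, vowel, optional final consonant and tone, and it counts units, not bytes.

```python
VOWELS = 14
INITIAL_CONSONANTS = 38
FINAL_CONSONANTS = 5
TONES = 8

# Whole-syllable inventory, assuming every (optional initial consonant,
# vowel, optional final consonant, tone) combination is recorded.
whole_syllables = (INITIAL_CONSONANTS + 1) * VOWELS * (FINAL_CONSONANTS + 1) * TONES

# Split inventory: initial phonemes plus final phonemes, as counted in the text.
initial_phonemes = (INITIAL_CONSONANTS + 1) * VOWELS * TONES
final_phonemes = (FINAL_CONSONANTS + 1) * VOWELS * TONES
basic_phonemes = initial_phonemes + final_phonemes

print(whole_syllables)  # 26208
print(basic_phonemes)   # 5040
```

Under these assumptions the split inventory grows with the sum of the two consonant inventories rather than their product, which is consistent in direction with the storage reduction stated in the text.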

The disclosed system for synthesizing the speech data of English words from basic phonemes is applied to the electronic dictionary 20 and comprises the speech data memory 21, the processing module 22 and the sound generator 23. The speech data memory 21 stores the basic phoneme sound-wave set (several initial-phoneme sound waves and several final-phoneme sound waves) and the basic phoneme set (several initial phonemes and several final phonemes). The processing module 22 segments an English word according to the basic phoneme set stored in the speech data memory 21 to obtain the initial phonemes and final phonemes of the word. It then retrieves from the speech data memory 21 the initial-phoneme and final-phoneme sound waves corresponding to those phonemes and combines them to synthesize the speech data of the word. The sound generator 23 pronounces the speech data synthesized by the processing module 22.

Thus, when an English word needs to be pronounced, the processing module 22 first segments the word according to the basic phoneme set stored in the speech data memory 21 to obtain its initial phonemes and final phonemes. The processing module 22 then retrieves the corresponding initial-phoneme and final-phoneme sound waves from the speech data memory 21 and combines them to synthesize the speech data of the word. Finally, the processing module 22 transmits the speech data to the sound generator 23, which pronounces it.

In actual use, the recording device 11 first records the sound wave of each syllable phoneme and then segments it according to the basic phoneme set. Fig. 2A, Fig. 2B and Fig. 2C show the sound waveforms of the syllable phonemes bim, fi and fim. The sound waves of the syllable phonemes bim and fi are divided into the initial-phoneme sound waves bi and fi and the final-phoneme sound waves im and i, which are stored in the speech data memory 21. When fim needs to be pronounced, the processing module 22 divides fim into the initial phoneme fi and the final phoneme im according to the basic phoneme set stored in the speech data memory 21, retrieves the sound waves corresponding to fi and im from the speech data memory 21, combines them into the speech data of fim, and transmits it to the sound generator 23, which pronounces it.
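The label-level segmentation in this example (bim into bi and im, fi into fi and i, fim into fi and im) can be sketched as below. This is a simplified illustration, not the patent's algorithm: the function `split_syllable` is hypothetical, tones are ignored, each vowel is assumed to be a single symbol, and the longest-match rule for the initial consonant is an assumption. The symbol lists are copied from the description as printed.

```python
from typing import Tuple

# Initial consonants, longest first so that e.g. "pl" is tried before "p".
ONSETS = sorted(
    ["p", "k", "t", "s", "f", "S", "8", "7", "pl", "kl", "pr", "kr", "tw", "kw",
     "h", "hw", "b", "g", "d", "z", "v", "5", "G", "6", "bl", "gl", "br", "gr",
     "dr", "gw", "dw", "w", "m", "n", "l", "r", "9", "j"],
    key=len,
    reverse=True,
)
VOWELS = set("liWXaCuUVE32eo")   # one character per vowel symbol
CODAS = {"m", "n", "l", "r", "9"}  # final consonants


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Return (initial phoneme, final phoneme) labels for one syllable."""
    # Pick the longest initial consonant that is directly followed by a vowel.
    onset = next(
        (o for o in ONSETS
         if syllable.startswith(o) and syllable[len(o):len(o) + 1] in VOWELS),
        "",
    )
    rest = syllable[len(onset):]
    if not rest or rest[0] not in VOWELS:
        raise ValueError(f"no vowel found in {syllable!r}")
    vowel, coda = rest[0], rest[1:]
    if coda and coda not in CODAS:
        raise ValueError(f"unsupported ending {coda!r} in {syllable!r}")
    # The vowel appears in both halves: front half in the initial, back half in the final.
    return onset + vowel, vowel + coda


print(split_syllable("bim"))  # ('bi', 'im')
print(split_syllable("fi"))   # ('fi', 'i')
print(split_syllable("fim"))  # ('fi', 'im')
```

Note that fim never has to be recorded: its two halves come from the recordings of fi and bim.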

As shown in Fig. 3, the disclosed method for synthesizing the speech data of English words from basic phonemes is applied to an electronic dictionary and comprises the following steps:

First, segment an English word: the word is segmented according to a basic phoneme set made up of several initial phonemes and several final phonemes, yielding the one or more initial phonemes and one or more final phonemes that make up the word. This is step 101.

Obtain the sound waves: the initial-phoneme sound waves and final-phoneme sound waves corresponding to the initial phonemes and final phonemes obtained by segmenting the word are retrieved from a basic phoneme sound-wave set. This is step 102.

Combine the sound waves into speech data: the initial-phoneme sound waves and final-phoneme sound waves are combined to obtain the speech data of the English word. This is step 103.

Pronounce the speech data of the English word. This is step 104.
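Steps 101 to 104 can be put together in a short sketch. Everything named here is hypothetical and chosen for illustration: the function `synthesize`, the list-of-samples waveform type, the assumption that the word has already been transcribed into syllables, the toy lookup tables, and the `played` list that stands in for the sound generator.

```python
from typing import Callable, Dict, List, Tuple

Wave = List[int]  # hypothetical waveform type: a list of PCM samples


def synthesize(
    syllables: List[str],
    split: Callable[[str], Tuple[str, str]],
    initial_waves: Dict[str, Wave],
    final_waves: Dict[str, Wave],
) -> Wave:
    """Steps 101 to 103 for a word given as a list of syllables."""
    speech: Wave = []
    for syllable in syllables:
        initial, final = split(syllable)        # step 101: segment
        speech.extend(initial_waves[initial])   # step 102: look up initial wave
        speech.extend(final_waves[final])       # step 102: look up final wave
    return speech                               # step 103: the joined result


# Toy data, made up for illustration only.
toy_split = {"fim": ("fi", "im")}
initial_waves = {"fi": [1, 2]}
final_waves = {"im": [3, 4]}

speech = synthesize(["fim"], toy_split.__getitem__, initial_waves, final_waves)

played: List[int] = []
played.extend(speech)  # step 104: stand-in for sending the data to the sound generator
print(speech)  # [1, 2, 3, 4]
```

Passing the segmentation function and the two tables as parameters keeps the sketch independent of any particular phoneme inventory or storage format.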

The above is only a preferred embodiment of the present invention and is not intended to limit its scope of practice; all equivalent changes and modifications made in accordance with the claims of the present application fall within the scope of the claims of the present invention.

Claims

1. A system for synthesizing the speech data of an English word from basic phonemes, characterized by comprising:

a speech data memory, storing a basic phoneme sound-wave set and a basic phoneme set that correspond to each other, wherein the basic phoneme sound-wave set includes several initial-phoneme sound waves and several final-phoneme sound waves, and the basic phoneme set includes several initial phonemes and several final phonemes; and

a processing module, connected to the speech data memory, which segments an English word according to the basic phoneme set to obtain the one or more initial phonemes and one or more final phonemes that make up the English word, retrieves from the speech data memory the initial-phoneme sound waves and final-phoneme sound waves corresponding to the initial phonemes and final phonemes into which the English word was segmented, and combines them into the speech data of the English word.

2. The system for synthesizing the speech data of an English word from basic phonemes as claimed in claim 1, characterized by further comprising a sound generator, wherein the processing module transmits the speech data of the English word to the sound generator, which then pronounces it.

3. A method for synthesizing the speech data of an English word from basic phonemes, characterized by comprising the following steps:

segmenting an English word according to a basic phoneme set made up of several initial phonemes and several final phonemes, to obtain the one or more initial phonemes and one or more final phonemes that make up the English word;

obtaining, from a basic phoneme sound-wave set, an initial-phoneme sound wave and a final-phoneme sound wave corresponding to the initial phoneme and the final phoneme into which the English word was segmented; and

combining the initial-phoneme sound wave and the final-phoneme sound wave to obtain the speech data of the English word.

4. The method for synthesizing the speech data of an English word from basic phonemes as claimed in claim 3, characterized in that, after the step of combining the initial-phoneme sound wave and the final-phoneme sound wave to obtain the speech data of the English word, the method further comprises a step of pronouncing the speech data of the English word.