Summary of the invention
According to one aspect of the present invention, there is provided a method of selecting pronunciation identifiers, used to determine pronunciation waveforms, for text-to-speech synthesis, the method comprising:
(i) selecting a character string;
(ii) determining whether the character string is in a main dictionary;
(iii) splitting the character string into individual characters, the splitting being performed when the character string is not in the main dictionary;
(iv) searching a rule set to determine whether any of the individual characters has a pronunciation identifier identified in the rule set; and
(v) selecting a context-sensitive pronunciation identifier for each individual character that has a pronunciation identifier identified in the rule set, the context-sensitive pronunciation identifier being selected by applying a rule in the rule set to the individual character, wherein the applying includes determining the context of the individual character within a sentence or phrase.
Preferably, the determining step (ii) further comprises:
(vi) searching the rule set to determine whether the character string has a pronunciation identifier identified in the rule set, the searching being performed only when the character string is not in the main dictionary; and
(vii) if the character string has an identifier identified in the rule set, selecting a context-sensitive pronunciation identifier for the character string, the context-sensitive pronunciation identifier being selected by applying a rule in the rule set to the character string, wherein the applying includes determining the context of the character string within a sentence or phrase.
Preferably, the determining step (ii) further comprises:
(viii) checking whether the character string has an associated marker or control character identifying the character string, the checking being performed only when the character string is not in the main dictionary; and
(ix) when the character string has an associated marker or control character, selecting the formal pronunciation identifier for the character string in the main dictionary.
Preferably, the determining step (ii) further comprises:
(x) searching the rule set to determine whether the character string has a pronunciation identifier identified in the rule set; and
(xi) selecting a context-sensitive pronunciation identifier for a character string that has a pronunciation identifier identified in the rule set, the context-sensitive pronunciation identifier being selected by applying a rule in the rule set to the character string, the applying including determining the context of the character string within a sentence or phrase, wherein, when the character string has no pronunciation identifier identified in the rule set, the informal or default identifier identified by the main dictionary is selected as the pronunciation identifier of the character string.
Preferably, the method is further characterized in that at least some characters in the main dictionary have both formal and informal pronunciation identifiers.
Preferably, the selecting step (v) further comprises:
(xii) when an individual character has no pronunciation identifier identified in the rule set, searching a character pronunciation dictionary, the character pronunciation dictionary containing individual characters and corresponding pronunciation identifiers; and
(xiii) selecting a pronunciation identifier from the character pronunciation dictionary for each such individual character.
Preferably, the method further comprises performing speech synthesis for each selected pronunciation identifier.
Preferably, the speech synthesis is performed by using the pronunciation identifiers to select pronunciation waveforms in a pronunciation corpus.
Preferably, the method is performed on an electronic device. The electronic device may preferably include a wireless communication module for receiving the character string.
Preferably, the method comprises a prior step of segmenting a text string to provide the character string.
Embodiment
Referring to Fig. 1, there is illustrated an electronic device 100 in the form of a radio telephone, comprising a device processor 102 operatively coupled by a bus 103 to a user interface 104, which is typically a touch screen or a display screen and keypad. The electronic device 100 also has a pronunciation corpus 106, a speech synthesizer 110, a non-volatile memory 120, a read-only memory 118 and a wireless communication module 116, all operatively coupled to the processor 102 by the bus 103. The speech synthesizer 110 has an output coupled to drive a speaker 112. The corpus 106 contains representations of words or phonemes and the associated sampled, digitized and processed pronunciation waveforms PUW. In addition, as described below, the non-volatile memory 120 (memory module) provides text strings for text-to-speech (TTS) synthesis (the text may be received through the module 116 or by other means). The pronunciation corpus also contains pronunciation waveforms for clusters of identically transcribed words that represent phrases, together with the positions of the corresponding sampled, digitized pronunciation waveforms relative to natural phrase boundaries, as described below.
As will be apparent to a person skilled in the art, the wireless communication module 116, a radio frequency communications unit, is typically a combined receiver and transmitter having a common antenna. The communications unit 116 has a transceiver coupled to the antenna through a radio frequency amplifier. The transceiver is also coupled to a combined modulator/demodulator, which couples the communications unit 116 to the processor 102. In this embodiment, the non-volatile memory 120 (memory module) stores a user-programmable phonebook database Db, and the read-only memory 118 stores operating code (OC) for the device processor 102, the main dictionary (PLX), the special case dictionary (SCLX), the character pronunciation dictionary (CPLX), the knowledge-base rule set (KBRS) and code for performing the method described below.
Part of the main dictionary (PLX) is shown in Table 1. It comprises a word field (WF1), a formal/informal flag field and a word pronunciation identifier field (PIF1). The word field WF1 contains words of one or more characters, and for each word in WF1 a corresponding flag in the formal/informal flag field is either set or not set. The pronunciation identifier field PIF1 holds one or more corresponding pronunciation identifiers which, in practice, identify the pronunciations in the pronunciation corpus 106 that correspond to the word in WF1.
Referring to Table 1 below, if the flag in the formal/informal flag field is not set for a given word in the word field WF1, only one possible pronunciation is identified in the pronunciation identifier field PIF1. For example, in Table 1 the flag for the word "son" is not set, so only one possible (unique) pronunciation, "Ji(4)Zi(2)", is identified in PIF1. Conversely, if the flag is set for a given word in WF1, several possible pronunciations are identified in PIF1. For example, in Table 1 the flag for the word "not have" is set, so at least two possible pronunciations, "a) Mut(6)jau(2)" and "b) mou(5)", are identified in PIF1. Here "Mut(6)jau(2)" is the formal pronunciation, used in formal or business conversation, and "mou(5)" is the informal pronunciation, used in informal everyday conversation. Accordingly, as will be apparent to a person skilled in the art, all type "a)" pronunciation identifiers in PIF1 are formal or default pronunciation identifiers (Hok(6)zap(6); Ji(4)Zi(2); ngo(5)mun(4); Mut(6)jau(2), etc.), and all type "b)" pronunciation identifiers are informal identifiers (ngo(5)dei(6); mou(5), etc.).
Table 1: Part of the main dictionary PLX
Word (WF1) | Formal/informal flag | Pronunciation identifier field (PIF1)
Study | Not set | Hok(6)zap(6)
Son | Not set | a) Ji(4)Zi(2)
We | Set | a) ngo(5)mun(4) b) ngo(5)dei(6)
Not have | Set | a) Mut(6)jau(2) b) mou(5)
Squad leader | Not set | a) Baan(1)Zoeng(2)
Wavelength | Not set | a) Bo(1)Coeng(4)
Squad leader | Not set | a) baan(1)zoeng(2)
Wavelength | Not set | a) bo(1)coeng(4)
Batter | Not set | a) gik(1)kau(4)sau(2)
Ball | Not set | a) kau(4)
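The structure of Table 1 can be pictured as a keyed record. The following is only a minimal sketch under stated assumptions: the names `PlxEntry`, `PLX` and `candidate_identifiers` are hypothetical, English glosses stand in for the Chinese words, and the type a) entry is treated as the formal or default identifier, following the statement that "Mut(6)jau(2)" is the formal pronunciation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PlxEntry:
    """One row of the main dictionary (hypothetical representation)."""
    flag_set: bool                  # formal/informal flag field
    formal: str                     # type a) identifier: formal or default
    informal: Optional[str] = None  # type b) identifier, only when the flag is set


# English glosses are placeholders for the Chinese words of Table 1 (assumption).
PLX = {
    "study": PlxEntry(False, "Hok(6)zap(6)"),
    "we": PlxEntry(True, "ngo(5)mun(4)", "ngo(5)dei(6)"),
    "not have": PlxEntry(True, "Mut(6)jau(2)", "mou(5)"),
    "batter": PlxEntry(False, "gik(1)kau(4)sau(2)"),
}


def candidate_identifiers(word: str) -> List[str]:
    """Return every pronunciation identifier the dictionary holds for a word."""
    entry = PLX.get(word)
    if entry is None:
        return []                    # word is not in the main dictionary
    if not entry.flag_set:
        return [entry.formal]        # flag not set: a single, unique pronunciation
    return [entry.formal, entry.informal]
```

With this layout, a word whose flag is not set always yields exactly one candidate, which mirrors the "unique pronunciation" case described for Table 1.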
Referring to Table 2 below, part of the character pronunciation dictionary CPLX is shown. The character pronunciation dictionary CPLX comprises an individual character field (ICF) and a character pronunciation identifier field (PIF2). The individual character field ICF contains only individual characters, and the character pronunciation identifier field PIF2 holds one or more identifiers corresponding to each individual character in the ICF. For example, the individual character "son" in the ICF has only one corresponding character pronunciation identifier in PIF2, which identifies the unique pronunciation "Zi(2)" in the pronunciation corpus 106. Conversely, the individual character "once" in the ICF has two corresponding character pronunciation identifiers in PIF2, which identify two possible pronunciations in the pronunciation corpus 106. These two pronunciations are: (a) the first or default pronunciation "Zang(1)"; and (b) the second possible pronunciation "Cang(4)".
Table 2: Part of the character pronunciation dictionary CPLX
Character (ICF) | Pronunciation identifier (PIF2)
Learn | Hok(6)
Practise | Zap(6)
[untranslated character] | Dik(1)
Generous | Ge(3)
Youngster | a) Ji(4)
Son | a) Zi(2)
I | Ngo(5)
[untranslated character] | Mun(4)
Once | a) Zang(1); b) Cang(4)
Small pot with a handle and a spout for boiling water or herbal medicine | a) jiu(4)
Not yet | a) Mut(6); b) mou(5)
Long | a) Coeng(4); b) Zoeng(2)
Have | a) jau(2)
Gush | a) jung(2)
Be | a) Hi(4)
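A per-character lookup with a default can be sketched as follows. This is an illustration rather than the implementation: `CPLX` and `default_identifier` are hypothetical names, the keys are English glosses standing in for the Chinese characters, and the first listed identifier is taken to be the default, as the text states for "once".

```python
from typing import Dict, List, Optional

# Character -> ordered identifiers; the first entry is treated as the default.
CPLX: Dict[str, List[str]] = {
    "learn": ["Hok(6)"],
    "practise": ["Zap(6)"],
    "son": ["Zi(2)"],
    "once": ["Zang(1)", "Cang(4)"],
    "long": ["Coeng(4)", "Zoeng(2)"],
}


def default_identifier(character: str) -> Optional[str]:
    """Return the first (default) identifier, or None for an unknown character."""
    identifiers = CPLX.get(character)
    return identifiers[0] if identifiers else None
```

Keeping the identifiers in an ordered list means the default never has to be stored separately; any alternative reading is simply a later element.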
Referring to Tables 3A and 3B below, the special case dictionary SCLX is shown. It is used for words that are special to a dialect, for example place names (Table 3A) and family member names (Table 3B). As will be apparent to a person skilled in the art, the special case dictionary SCLX could be included in the main dictionary PLX, but it is kept separate for ease of maintenance and for flexibility across multiple dialects. In Table 3A, a place name character field PNCF contains the written Chinese characters of known place names (known for a specific dialect), and a special case pronunciation identifier (SCPI) field holds the pronunciation identifiers that identify which pronunciations in the pronunciation corpus 106 correspond to the words in the PNCF. Table 3B has a family member field FMF, which contains the written Chinese characters of commonly used family member names (known for a specific dialect), and likewise an SCPI field holding the pronunciation identifiers that identify which pronunciations in the pronunciation corpus 106 correspond to the words in the FMF.
Table 3A: Part of the special case dictionary SCLX for place names
Place name character field (PNCF) | Special case pronunciation identifier (SCPI)
Longan hole | Lung(4)ngaan(5)dung(2)
Yau Ma Tei | Jau(4)maa(4)dei(2)
Quarry Bay | Zak(1)jyu(4)cung(1)
Gush mouth | Cung(1)hau(2)
Sham Shui Po | Sam(1)seoi(2)bou(2)
Stream flower | Lau(4)faa(3)
Shahe | Saa(1)ho(2)
Macao | Ou(3)mun(2)
Sanyuanli | Saam(1)jyun(4)lei(2)
Fanyu | Pun(1)jyu(4)
Big pool | Daai(6)tong(2)
Gush | Cung(1)
Table 3B: Part of the special case dictionary SCLX for family member names
Family member field (FMF) | Special case pronunciation identifier (SCPI)
Mother | Aa(3)maa(1)
Grandmother | Maa(4)maa(4)
Grandfather | Je(4)je(2)
Elder brother | Go(4)go(1)
Elder sister | Ze(4)ze(1)
Younger sister | Mui(4)mui(2)
Younger brother | Dai(4)dai(2)
Daughter-in-law | Sam(1)pou(5)
Elder sister (of the family) | Gaa(1)ze(1)
Little sister | Sai(3)mui(2)
Little brother | Sai(3)lou(2)
Referring to Table 4 below, part of the knowledge-base rule set KBRS is shown. The KBRS comprises a knowledge-base character field (KBCF), which contains characters or groups of characters, and a rule field (RFLD), which contains the rules that determine the pronunciation of each character or group of characters in the KBCF.
Referring to Figs. 2A to 2C, there is illustrated a method 200 for selecting pronunciation identifiers, used to determine pronunciation waveforms, for text-to-speech synthesis. The method 200 is typically performed on the device 100. Preferably, the method 200 is invoked at a start step 210 by the user providing a command at the user interface 104, or it may be invoked automatically when the device 100 receives a character string (for example through the wireless communication module 116). After the start step 210, the method 200 performs a selecting step 220 to select a character string (CHS) from a string of characters. The character string is a subset of a sentence of characters and is typically selected from the text string in left-to-right order, which is also the order in which a person reads the text. Accordingly, as will be apparent to a person skilled in the art, the text string is segmented into one or more character strings (CHS) by dictionary-guided segmentation or by segmentation based on statistical rule analysis.
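The text does not say which segmentation algorithm is used. As one possible reading of "dictionary-guided" segmentation, the sketch below uses forward maximum matching, a common technique swapped in here purely for illustration. The function name `segment`, the `max_len` parameter and the single-letter stand-ins for Chinese characters are all assumptions.

```python
from typing import List, Set


def segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Split text left to right, preferring the longest string found in lexicon."""
    strings: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character always succeeds.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                strings.append(candidate)
                i += length
                break
    return strings


# Latin letters stand in for Chinese characters in this illustration.
example = segment("abcdex", {"ab", "abc", "de"})
```

Strings that the lexicon does not cover fall out as single characters, which matches the later stage in which unmatched strings are handled character by character.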
Table 4: Part of the knowledge-base rule set KBRS
Character (KBCF) | Rule (RFLD)
Not have | IF " " is to the left of "not have" AND "not have" is the last word in the sentence THEN => Mut(6)jau(2); ELSE => mou(5)
Once | IF "surname" or "called" or "is" is to the left of "once" THEN => zang(1); ELSE => cang(4)
Not want | IF "after" or "remember" or " " and a word relating to money is to the right of "not want" THEN => m(4)jiu(3); ELSE => m(4)hou(2)
How many | IF " " or "it" or "number" or "votes received" is to the left of "how many" THEN => do(1)siu(2); IF "have" or "having" or "also having" or "having" or "feeling" is to the right of "how many" THEN => do(1)siu(2); ELSE => gei(2)do(1)
This | IF "problem" or "target" or "is exactly" or "topic" or "problem" or "plan" or "speed" or "industry" or "key" is to the right of "this" THEN => nei(1)go(3); ELSE => gam(2)
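One way to picture the IF/THEN/ELSE rules of Table 4 is as small functions keyed by the character they govern. The sketch below is hypothetical throughout: the function names, the English glosses used as tokens, and the reading of "to the left of" and "to the right of" as the immediately adjacent token are assumptions, not details given in the text.

```python
from typing import Callable, Dict, List, Optional

Rule = Callable[[List[str], int], str]


def left_of(tokens: List[str], i: int) -> Optional[str]:
    return tokens[i - 1] if i > 0 else None


def right_of(tokens: List[str], i: int) -> Optional[str]:
    return tokens[i + 1] if i + 1 < len(tokens) else None


def rule_once(tokens: List[str], i: int) -> str:
    # "surname", "called" or "is" on the left selects zang(1), otherwise cang(4).
    return "zang(1)" if left_of(tokens, i) in {"surname", "called", "is"} else "cang(4)"


def rule_this(tokens: List[str], i: int) -> str:
    # A listed trigger word on the right selects nei(1)go(3), otherwise gam(2).
    triggers = {"problem", "target", "is exactly", "topic",
                "plan", "speed", "industry", "key"}
    return "nei(1)go(3)" if right_of(tokens, i) in triggers else "gam(2)"


KBRS: Dict[str, Rule] = {"once": rule_once, "this": rule_this}


def apply_rule(tokens: List[str], i: int) -> Optional[str]:
    """Apply the rule for tokens[i], or return None when the rule set has none."""
    rule = KBRS.get(tokens[i])
    return rule(tokens, i) if rule is not None else None
```

Returning `None` for characters without a rule lets the caller fall back to a dictionary default, which keeps the rule set limited to genuinely context-dependent characters.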
After step 220, the method performs a step 230 of searching the main dictionary, in which the words in the word field WF1 of the main dictionary PLX are searched for a match with the character string CHS. A test step 240 is then performed to determine whether an identical word matching the character string CHS has been found in the word field WF1 (that is, whether the character string CHS is in the main dictionary PLX). If no match is found (the character string is not in the main dictionary PLX), the method 200 performs a searching step 250 to search the special case dictionary SCLX.
If a test step 260 determines that the character string CHS is not in the special case dictionary SCLX, a splitting step 265 splits the character string CHS into individual characters (ICH) to create a set of the individual characters that make up the character string CHS. A step 270 of searching the knowledge-base rule set KBRS is then performed to determine whether any individual character ICH is in the KBRS (in other words, the search of step 270 determines whether an individual character has a pronunciation identifier identified in the rule set). If a test step 280 determines that no individual character ICH is in the KBRS, a selecting step 290 selects default pronunciation identifiers from the character pronunciation dictionary CPLX. That is, when an individual character has no pronunciation identifier identified in the rule set, the character pronunciation dictionary CPLX, which contains individual characters and corresponding pronunciation identifiers, is searched, and a pronunciation identifier is selected from it for each individual character. After step 290, speech synthesis is provided at step 400 by using the default pronunciation identifiers to address and select pronunciation waveforms in the corpus 106, and a signal is provided to the speaker 112 to render the synthesized speech.
If the test step 280 determines that one or more individual characters ICH are in the KBRS, the method 200 performs a determining step 370, which uses the KBRS to determine the context of each such individual character ICH. A selecting step 380 then selects a context-sensitive pronunciation identifier in the character pronunciation dictionary CPLX according to the context of each individual character ICH identified in the KBRS. Thus, a context-sensitive pronunciation identifier is used for each individual character that has a pronunciation identifier identified in the rule set, and it is selected by applying a rule in the rule set to the individual character, wherein the applying includes determining the context of the individual character within a sentence or phrase. Any other individual characters ICH that are not identified in the KBRS are simply given their default pronunciation identifiers.
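The per-character branch (steps 265 to 380, with step 290 as the fallback) can be sketched roughly as follows. Everything here is illustrative: `select_for_characters`, the sample `KBRS` and `CPLX` contents, and the choice to have a rule return the identifier directly are assumptions made to keep the example short.

```python
from typing import Callable, Dict, List, Optional


def rule_once(characters: List[str], i: int) -> str:
    # Hypothetical encoding of the Table 4 rule for "once" (adjacent left context).
    left = characters[i - 1] if i > 0 else None
    return "zang(1)" if left in {"surname", "called", "is"} else "cang(4)"


KBRS: Dict[str, Callable[[List[str], int], str]] = {"once": rule_once}
CPLX: Dict[str, List[str]] = {
    "learn": ["Hok(6)"],
    "practise": ["Zap(6)"],
    "once": ["Zang(1)", "Cang(4)"],
}


def select_for_characters(characters: List[str]) -> List[Optional[str]]:
    """Pick one identifier per individual character of a split character string."""
    identifiers: List[Optional[str]] = []
    for i, ch in enumerate(characters):
        rule = KBRS.get(ch)                          # steps 270 and 280
        if rule is not None:
            identifiers.append(rule(characters, i))  # steps 370 and 380
        else:
            entries = CPLX.get(ch)                   # step 290: default identifier
            identifiers.append(entries[0] if entries else None)
    return identifiers
```

Characters with a rule and characters without one can be mixed freely in the same string, since each position is resolved independently.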
After the selecting step 380, speech synthesis is provided at step 400 by using the selected pronunciation identifiers to address and select pronunciation waveforms in the corpus 106, and a signal is provided to the speaker 112 to render the synthesized speech.
Returning to step 260, if the test step 260 determines that the character string CHS is in the special case dictionary SCLX, a selecting step 300 is performed. The selecting step 300 selects the pronunciation identifier identified by the special case dictionary SCLX of Tables 3A and 3B. After the selecting step 300, speech synthesis is provided at step 400 by using the selected pronunciation identifier to address and select pronunciation waveforms in the corpus 106, and a signal is provided to the speaker 112 to render the synthesized speech.
Returning to step 240, if the test step 240 determines that the character string CHS is in the main dictionary PLX, the method 200 performs a further test step 310 to check whether the formal/informal flag is set. If the flag is not set, only one possible pronunciation identifier in the main dictionary matches the character string CHS. Therefore, at a selecting step 320, the unique pronunciation identifier identified by the main dictionary PLX is selected. Speech synthesis is then provided at step 400 by using the selected pronunciation identifier to address and select pronunciation waveforms in the corpus 106, and a signal is provided to the speaker 112 to render the synthesized speech.
If step 310 determines that the formal/informal flag is set, there are several possible pronunciation identifiers, and the method 200 must determine which one to use. A test step 330 therefore determines whether the character string CHS has an associated marker identifying the type of the character string (for example the title of a book, film or television drama). Such a marker is a control character. Markers include brackets such as "{...}", "(...)" and "<...>", and may also be quotation marks or special control characters such as "/.../", "|...|", "*...*" and "#...#", between which a group of characters containing the character string CHS is inserted. If there are one or more associated markers, a selecting step 340 is performed to select, for the character string, the formal pronunciation identifier identified by the main dictionary PLX. Alternatively, if there are no associated markers, a step 345 of searching the knowledge-base rule set is performed to determine whether the character string CHS is in the KBRS. If a test step 350 determines that the character string CHS is not in the KBRS, a selecting step 355 is performed to select, for the character string CHS, the informal pronunciation identifier identified by the main dictionary PLX. If, however, the test step 350 determines that the character string CHS is in the KBRS, a selecting step 360 is performed to select, for the character string CHS, the pronunciation identifier identified by the main dictionary PLX and the rules in the KBRS. Thus, if the character string CHS has a pronunciation identifier identified in the rule set, the selecting step 360 selects a context-sensitive pronunciation identifier for the character string by applying a rule in the rule set to the character string CHS, wherein the applying includes determining the context of the character string within a sentence or phrase.
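The decision made when the formal/informal flag is set (steps 330 to 360) can be sketched as below. This is a simplified illustration under several assumptions: `is_marked` and `select_flagged` are hypothetical names, marker detection only looks at the first occurrence of the string, and the sample rule is invented for the example and is not the rule of Table 4. The formal and informal identifiers follow the statement that "Mut(6)jau(2)" is formal and "mou(5)" is informal.

```python
from typing import Callable, Optional

# Opening and closing marker characters named in the description.
MARKER_PAIRS = [("{", "}"), ("(", ")"), ("<", ">"), ('"', '"'),
                ("/", "/"), ("|", "|"), ("*", "*"), ("#", "#")]


def is_marked(text: str, chs: str) -> bool:
    """Rough check that chs sits between a pair of marker characters in text."""
    pos = text.find(chs)
    if pos < 0:
        return False
    before, after = text[:pos], text[pos + len(chs):]
    for opener, closer in MARKER_PAIRS:
        if opener == closer:
            # Symmetric markers: an odd count before chs means one is still open.
            if before.count(opener) % 2 == 1 and closer in after:
                return True
        elif before.rfind(opener) > before.rfind(closer) and closer in after:
            return True
    return False


def select_flagged(chs: str, text: str, formal: str, informal: str,
                   rule: Optional[Callable[[str, str], str]] = None) -> str:
    if is_marked(text, chs):   # steps 330 and 340: marked strings take the formal form
        return formal
    if rule is None:           # steps 345, 350 and 355: no rule, informal form
        return informal
    return rule(text, chs)     # step 360: the rule decides from the context
```

A production version would need to handle nested markers and repeated occurrences of the same string; the point here is only the order of the three checks.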
After any one of steps 340, 355 or 360, speech synthesis is provided at step 400 by using the selected pronunciation identifier to address and select pronunciation waveforms in the corpus 106, and a signal is provided to the speaker 112 to render the synthesized speech. After the speech synthesis step 400, a test step 410 determines whether there are further character strings CHS to be processed. When no character strings remain to be processed, the method 200 terminates at an end step 420; otherwise the method 200 returns to the selecting step 220.
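The overall routing of method 200 can be summarized in a few lines. The sketch below only reports which step handles each character string. The names `route` and `process`, the dictionary layouts and the English glosses used as keys are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def route(chs: str, plx: Dict[str, dict], sclx: Dict[str, str]) -> int:
    """Return the step number of method 200 that handles the character string."""
    if chs in plx:                                # steps 230 and 240
        return 310 if plx[chs]["flag"] else 320   # further flag handling, or unique identifier
    if chs in sclx:                               # steps 250 and 260
        return 300                                # special case dictionary
    return 265                                    # split into individual characters


def process(strings: List[str], plx: Dict[str, dict],
            sclx: Dict[str, str]) -> List[Tuple[str, int]]:
    # Loop of steps 220 to 410: one routing decision per character string.
    return [(chs, route(chs, plx, sclx)) for chs in strings]


# Sample data using English glosses from Tables 1 and 3A as stand-ins.
plx = {"batter": {"flag": False}, "not have": {"flag": True}}
sclx = {"longan hole": "Lung(4)ngaan(5)dung(2)"}
routes = process(["batter", "not have", "longan hole", "is once"], plx, sclx)
```

The order of the checks matters: the main dictionary is consulted first, the special case dictionary second, and splitting into characters is the last resort.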
The present invention advantageously allows TTS based on text strings of Chinese characters, thereby providing synthesized speech in a Chinese dialect such as Cantonese. In essence, the invention selects a character string and determines whether the character string is in the main dictionary PLX. When the character string is not in the main dictionary, it is split into individual characters, and the knowledge-base rule set KBRS is searched to determine whether any individual character has a pronunciation identifier identified in the rule set. A context-sensitive pronunciation identifier is then selected for each individual character that has a pronunciation identifier identified in the KBRS. The context-sensitive pronunciation identifier is selected by applying a rule in the rule set to the individual character, wherein the applying includes determining the context of the individual character within a sentence or phrase. The KBRS is also used for character strings CHS that are identified in the main dictionary PLX. The special case dictionary SCLX and the markers used for books, television dramas and the like add further to the advantages of the invention. The invention therefore makes the pronunciation of a character or word depend on its context, on dialect place names and family member names, and on formal and informal dialect pronunciations.
To further illustrate the advantages of the invention, the following examples are provided.
Example 1: The character string CHS "batter" is selected at the selecting step 220. Step 240 then determines that "batter" is in the main dictionary PLX, and the test step 310 determines that the formal/informal flag is not set. Therefore, at step 320, the unique pronunciation gik(1)kau(4)sau(2) is identified for "batter".
Example 2: The character string CHS "not have" is selected at the selecting step 220. Step 240 then determines that "not have" is in the main dictionary PLX, and the test step 310 determines that the formal/informal flag is set. If step 330 determines that the character string CHS is enclosed by markers, such as "<... not have ...>", where the markers may also enclose other characters, the formal pronunciation identifier Mut(6)jau(2) is selected for "not have" at step 340. Alternatively, if there are no markers around the character string CHS, the knowledge-base rule set KBRS is searched at step 345 after step 330. If "not have" is not in the KBRS, the informal pronunciation identifier mou(5) is selected at step 355. If "not have" is in the KBRS, the rule in the rule field RFLD is used at step 360 to select either a) Mut(6)jau(2) or b) mou(5).
Example 3: The character string CHS "longan hole" is selected at the selecting step 220. Step 240 then determines that "longan hole" is not in the main dictionary PLX, and the test step 260 determines that "longan hole" is in the special case dictionary SCLX, so the pronunciation identifier Lung(4)ngaan(5)dung(2) is selected.
Example 4: The character string CHS "is once" is selected at the selecting step 220. Step 240 then determines that "is once" is not in the main dictionary PLX, and the test step 260 determines that "is once" is not in the special case dictionary SCLX, so the character string CHS is split into the two characters "is" and "once". Because "once" is identified in the knowledge-base character field KBCF, steps 370 and 380 are performed, so that the pronunciation identifier zang(1) is selected for "once" (because "is" is to the left of "once"), and "is" is given its default value as identified by the character pronunciation dictionary.
The detailed description provides preferred exemplary embodiments only and is not intended to limit the scope, applicability or configuration of the invention. Rather, the detailed description of the preferred exemplary embodiments provides those skilled in the art with an enabling description for implementing a preferred exemplary embodiment of the invention. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the appended claims.