CN100371928C - Selection of pronunciation designator for determining pronunciation wave-shape for text-to-speech conversion and synthesis - Google Patents

Selection of pronunciation designator for determining pronunciation wave-shape for text-to-speech conversion and synthesis

Info

Publication number
CN100371928C
Authority
CN
China
Prior art keywords
character string
voice identifier
character
pronunciation
identifier symbol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CNB2004100319757A
Other languages
Chinese (zh)
Other versions
CN1677488A (en)
Inventor
祖漪清
麦耘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nuance Communications Inc
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc filed Critical Motorola Inc
Priority to CNB2004100319757A priority Critical patent/CN100371928C/en
Publication of CN1677488A publication Critical patent/CN1677488A/en
Application granted granted Critical
Publication of CN100371928C publication Critical patent/CN100371928C/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Abstract

The present invention provides a method of selecting pronunciation designators for determining pronunciation waveforms for text-to-speech conversion and synthesis. In the method, a character string is selected (220), and it is then determined whether the character string is in a main dictionary (240). The character string is divided into individual characters (265) when it is not in the main dictionary. A rule set is searched (270) to determine whether the individual characters have pronunciation designators identified in the rule set. Context-sensitive pronunciation designators are then selected (380) for the individual characters that have pronunciation designators identified in the rule set, by applying rules in the rule set to those characters; applying the rules includes determining the context of the individual characters within a sentence or phrase. Finally, speech synthesis (400) is carried out according to the selected pronunciation designators.

Description

Selection of pronunciation identifiers for determining pronunciation waveforms for text-to-speech synthesis
Technical field
The present invention relates generally to text-to-speech (TTS) synthesis. The invention is particularly useful for, but not necessarily limited to, determining an appropriate synthesized pronunciation of a text segment of Chinese characters in a Chinese dialect such as Cantonese.
Background technology
Text-to-speech (TTS) conversion, often referred to as concatenative text-to-speech synthesis, allows an electronic device to receive an input text string and provide a converted representation of that string in the form of synthesized speech. However, a device that may be required to synthesize speech from a non-deterministic number of received text strings will have difficulty providing high-quality, realistic synthesized speech. This difficulty arises mainly because the pronunciation of each word or syllable to be synthesized (for Chinese characters or characters of similar languages) is context dependent and position dependent. For example, the pronunciation of a word at the end of a sentence (input text string) may be drawn out or lengthened. Even in the middle of a sentence, the pronunciation of the same word may be lengthened wherever emphasis is required.
When performing TTS, a text segment or text string is first converted into a phoneme stream with associated prosodic parameters. The phonemes and prosodic parameters are then used to select suitable waveforms from a corpus. However, when performing TTS on a text string of Chinese characters to provide synthesized speech in a Chinese dialect, a number of problems must be overcome to obtain a reasonably accurate synthesized rendering of the characters. First, there is not always a one-to-one mapping between Chinese characters and a specific target dialect (for example Cantonese), in which the pronunciation of a character or word depends on its context and its position in the sentence. Second, place names and family member names may not convert readily from Chinese characters to the target dialect. Third, dialects usually have both formal and informal pronunciations, so TTS synthesis must determine when to use the formal pronunciation and when to use the informal one.
In this specification, including the claims, the terms "comprises", "comprising" or similar terms are intended to mean a non-exclusive inclusion, such that a method or apparatus that comprises a list of elements does not include those elements only, but may well include other elements not listed.
Summary of the invention
According to one aspect of the invention, there is provided a method of selecting pronunciation identifiers for determining pronunciation waveforms for text-to-speech synthesis, the method comprising:
(i) selecting a character string;
(ii) determining whether the character string is in a main dictionary;
(iii) segmenting the character string into individual characters, the segmenting being performed when the character string is not in the main dictionary;
(iv) searching a rule set to determine whether the individual characters have pronunciation identifiers identified in the rule set; and
(v) selecting context-sensitive pronunciation identifiers for the individual characters that have pronunciation identifiers identified in the rule set, the context-sensitive pronunciation identifiers being selected by applying rules in the rule set to the individual characters, wherein the applying includes determining the context of the individual characters within a sentence or phrase.
Preferably, the determining step (ii) further comprises the steps of:
(vi) searching the rule set to determine whether the character string has a pronunciation identifier identified in the rule set, the searching being performed only when the character string is in the main dictionary; and
(vii) if its identifier is identified in the rule set, selecting a context-sensitive pronunciation identifier for the character string, the context-sensitive pronunciation identifier being selected by applying rules in the rule set to the character string, wherein the applying includes determining the context of the character string within a sentence or phrase.
Preferably, the determining step (ii) further comprises the steps of:
(viii) checking whether the character string has an associated marker or control character that identifies the character string, the checking being performed only when the character string is in the main dictionary; and
(ix) when the character string has an associated marker or control character, selecting the formal pronunciation identifier in the main dictionary for the character string.
Preferably, the determining step (ii) further comprises the steps of:
(x) searching the rule set to determine whether the character string has a pronunciation identifier identified in the rule set; and
(xi) selecting a context-sensitive pronunciation identifier for a character string that has its pronunciation identifier identified in the rule set, the context-sensitive pronunciation identifier being selected by applying rules in the rule set to the character string, the applying including determining the context of the character string within a sentence or phrase, and wherein, when the character string does not have its pronunciation identifier identified in the rule set, the pronunciation identifier selected for the character string is the informal or default identifier identified by the main dictionary.
Preferably, the method is further characterized in that at least some characters in the main dictionary have both formal and informal pronunciation identifiers.
Preferably, the selecting step (v) further comprises the steps of:
(xii) searching a character pronunciation dictionary when an individual character does not have a pronunciation identifier identified in the rule set, the character pronunciation dictionary comprising individual characters and corresponding pronunciation identifiers; and
(xiii) selecting a pronunciation identifier from the character pronunciation dictionary for each such individual character.
Preferably, the method further comprises the step of performing speech synthesis for each selected pronunciation identifier.
Preferably, the speech synthesis is performed by using the pronunciation identifiers to select pronunciation waveforms in a pronunciation corpus.
Preferably, the method is performed on an electronic device. The electronic device may suitably include a wireless communication module for receiving the character string.
Preferably, the method includes a prior step of segmenting a text string to provide the character string.
Description of drawings
In order that the invention may be readily understood and put into practical effect, reference will now be made to a preferred embodiment as illustrated in the accompanying drawings, in which:
Fig. 1 is a schematic block diagram of an electronic device in accordance with the invention; and
Figs. 2A to 2C illustrate a method 200 for selecting pronunciation identifiers used to determine pronunciation waveforms in the corpus of Fig. 1.
Embodiment
Referring to Fig. 1, there is illustrated an electronic device 100, in the form of a radio-telephone, comprising a device processor 102 operatively coupled by a bus 103 to a user interface 104, which is typically a touch screen or alternatively a display screen and keypad. The electronic device 100 also has a pronunciation corpus 106, a speech synthesizer 110, non-volatile memory 120, read-only memory 118 and a wireless communication module 116, all operatively coupled to the processor 102 by the bus 103. The speech synthesizer 110 has an output coupled to drive a speaker 112. The corpus 106 includes representations of words or phonemes and associated sampled, digitized and processed pronunciation waveforms PUW. As described below, the non-volatile memory 120 (memory module) provides text strings for text-to-speech (TTS) synthesis (the text may be received by module 116 or by other means). The pronunciation corpus also includes representative pronunciation waveforms arranged in clusters of identically transcribed words, in which phrases and their corresponding sampled, digitized pronunciation waveforms are positioned relative to natural phrase boundaries, as described below.
As will be apparent to a person skilled in the art, the radio-frequency communication unit 116 is typically a combined receiver and transmitter having a common antenna. The radio-frequency communication unit 116 has a transceiver coupled to the antenna via a radio-frequency amplifier. The transceiver is also coupled to a combined modulator/demodulator that couples the communication unit 116 to the processor 102. Also, in this embodiment the non-volatile memory 120 (memory module) stores a user-programmable phonebook database Db, and the read-only memory 118 stores operating code (OC) for the device processor 102, a main dictionary (PLX), a special case dictionary (SCLX), a character pronunciation dictionary (CPLX), a knowledge-base rule set (KBRS) and code for performing the method described below.
Part of the main dictionary (PLX) is shown in Table 1. It comprises a word field (WF1), a formal/informal flag field and a word pronunciation identifier field (PIF1). The word field WF1 contains words of one or more characters and, for each word in WF1, a corresponding flag in the formal/informal flag field is either set or not set. There are also one or more corresponding pronunciation identifiers in the word pronunciation identifier field PIF1; in practice, each pronunciation identifier identifies which pronunciation in the pronunciation corpus 106 corresponds to the word in WF1.
Referring to Table 1 below, if the flag in the formal/informal flag field is not set for a given word in the word field WF1, only one possible pronunciation is identified in the pronunciation identifier field PIF1. For example, in Table 1 the flag corresponding to the word "Son" is not set, so only one possible (unique) pronunciation, "Ji(4)Zi(2)", is identified in PIF1. Conversely, if the flag is set for a given word in WF1, several possible pronunciations are identified in PIF1. For example, in Table 1 the flag corresponding to the word "No" is set, so at least two possible pronunciations, "a) Mut(6)jau(2)" and "b) mou(5)", are identified in PIF1. Here, "Mut(6)jau(2)" is the formal pronunciation, used in formal or business conversation, and "mou(5)" is the informal pronunciation, used in informal everyday conversation. Accordingly, as will be apparent to a person skilled in the art, all type "a)" pronunciation identifiers in PIF1 are informal or default pronunciation identifiers (i.e. Hok(6)zap(6); Ji(4)Zi(2); etc.) and all type "b)" pronunciation identifiers are formal identifiers (i.e. ngo(5)dei(6); mou(5); etc.).
Table 1: Part of the main dictionary PLX
Word (WF1)          Formal/informal flag    Pronunciation identifier(s) (PIF1)
Study               Not set                 Hok(6)zap(6)
Son                 Not set                 a) Ji(4)Zi(2)
We                  Set                     a) ngo(5)mun(4); b) ngo(5)dei(6)
No                  Set                     a) Mut(6)jau(2); b) mou(5)
The squad leader    Not set                 a) Baan(1)Zoeng(2)
Wavelength          Not set                 a) Bo(1)Coeng(4)
The squad leader    Not set                 a) baan(1)zoeng(2)
Wavelength          Not set                 a) bo(1)coeng(4)
The batter          Not set                 a) gik(1)kau(4)sau(2)
Ball                Not set                 a) kau(4)
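By way of illustration only, the structure of Table 1 can be sketched as a small lookup table. The names PlxEntry, PLX and unique_identifier are hypothetical and do not appear in the specification, and the English glosses from Table 1 are used as stand-in keys for the original words. The sketch only encodes the relationship described above: an entry whose flag is not set carries exactly one identifier, and an entry whose flag is set carries several.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PlxEntry:
    """One row of Table 1 (hypothetical representation)."""
    flag_set: bool                 # formal/informal flag
    identifiers: Tuple[str, ...]   # PIF1 entries in table order: a), b), ...


# Keys are the English glosses of Table 1, standing in for the original words.
PLX: Dict[str, PlxEntry] = {
    "Study": PlxEntry(False, ("Hok(6)zap(6)",)),
    "Son": PlxEntry(False, ("Ji(4)Zi(2)",)),
    "We": PlxEntry(True, ("ngo(5)mun(4)", "ngo(5)dei(6)")),
    "No": PlxEntry(True, ("Mut(6)jau(2)", "mou(5)")),
    "The batter": PlxEntry(False, ("gik(1)kau(4)sau(2)",)),
}


def unique_identifier(word: str) -> Optional[str]:
    """Return the single identifier of an unflagged word, otherwise None."""
    entry = PLX.get(word)
    if entry is None or entry.flag_set:
        return None  # unknown word, or several candidate pronunciations
    return entry.identifiers[0]
```

A word with its flag set returns None here because choosing among its identifiers requires the marker and rule-set checks of method 200.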
Referring to Table 2 below, part of the character pronunciation dictionary CPLX is shown. The character pronunciation dictionary CPLX comprises an individual character field (ICF) and a character pronunciation identifier field (PIF2). The individual character field ICF contains individual characters only, and the character pronunciation identifier field PIF2 has one or more identifiers corresponding to each individual character in ICF. For example, the individual character "Son" in ICF has only one corresponding character pronunciation identifier in PIF2, identifying the unique pronunciation "Zi(2)" in the pronunciation corpus 106. Conversely, the individual character "Once" in ICF has two corresponding character pronunciation identifiers in PIF2, identifying two possible pronunciations in the pronunciation corpus 106. These two pronunciations are: (a) a first or default pronunciation "Zang(1)"; and (b) a second possible pronunciation "Cang(4)".
Table 2: Part of the character pronunciation dictionary CPLX
Character (ICF)                                                     Pronunciation identifier(s) (PIF2)
Learn                                                               Hok(6)
Practise                                                            Zap(6)
[untranslated character]                                            Dik(1)
Generous                                                            Ge(3)
Youngster                                                           a) Ji(4)
Son                                                                 a) Zi(2)
I                                                                   Ngo(5)
[untranslated character]                                            Mun(4)
Once                                                                a) Zang(1); b) Cang(4)
Small pot with a handle and a spout for boiling water or herbal medicine   a) jiu(4)
Not yet                                                             a) Mut(6); b) mou(5)
Long                                                                a) Coeng(4); b) Zoeng(2)
Have                                                                a) jau(2)
Gush                                                                a) jung(2)
Be                                                                  a) Hi(4)
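The default lookup that Table 2 supports can be sketched as follows. The names CPLX, default_identifier and default_identifiers are hypothetical, the keys are the English glosses from Table 2, and treating the first listed identifier as the default for every entry is an assumption generalized from the "Once" example, where "Zang(1)" is described as the first or default pronunciation.

```python
from typing import Dict, List

# A few rows of Table 2; keys are English glosses standing in for the characters.
CPLX: Dict[str, List[str]] = {
    "Learn": ["Hok(6)"],
    "Practise": ["Zap(6)"],
    "Son": ["Zi(2)"],
    "Once": ["Zang(1)", "Cang(4)"],
    "Long": ["Coeng(4)", "Zoeng(2)"],
    "Be": ["Hi(4)"],
}


def default_identifier(character: str) -> str:
    """Return the first listed identifier, assumed here to be the default."""
    return CPLX[character][0]


def default_identifiers(characters: List[str]) -> List[str]:
    """Look up the default identifier of each individual character in order."""
    return [default_identifier(c) for c in characters]
```

For instance, default_identifiers(["Learn", "Practise"]) yields the two identifiers listed for those characters in Table 2.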
Referring to Tables 3A and 3B below, the special case dictionary SCLX is shown. It is used for words that are special to a dialect, for example place names (Table 3A) and family member names (Table 3B). As will be apparent to a person skilled in the art, the special case dictionary SCLX could be included in the main dictionary PLX, but for ease of maintenance and for flexibility across multiple dialects it is kept separate from the main dictionary PLX. In Table 3A, a place name character field PNCF contains the Chinese character form of known place names (known to the specific dialect), and a special case pronunciation identifier field SCPI contains pronunciation identifiers that identify which pronunciation in the pronunciation corpus 106 corresponds to the word in the place name character field PNCF. Similarly, Table 3B has a family member name field FMF containing the Chinese character form of commonly used family member names (known to the specific dialect), and a special case pronunciation identifier field SCPI containing pronunciation identifiers that identify which pronunciation in the pronunciation corpus 106 corresponds to the word in the family member name field FMF.
Table 3A: Part of the special case dictionary SCLX for place names
Place name (PNCF)           Special case pronunciation identifier (SCPI)
The longan hole             Lung(4)ngaan(5)dung(2)
Yau Ma Tei                  Jau(4)maa(4)dei(2)
The Quarry fish is gushed   Zak(1)jyu(4)cung(1)
Gush mouth                  Cung(1)hau(2)
The deep water control      Sam(1)seoi(2)bou(2)
The stream flower           Lau(4)faa(3)
Shahe                       Saa(1)ho(2)
Macao                       Ou(3)mun(2)
Sanyuanli                   Saam(1)jyun(4)lei(2)
Fanyu                       Pun(1)jyu(4)
The big pool                Daai(6)tong(2)
Gush                        Cung(1)
Table 3B: Part of the special case dictionary SCLX for family member names
Family member name (FMF)    Special case pronunciation identifier (SCPI)
Mother                      Aa(3)maa(1)
Grandmother                 Maa(4)maa(4)
Grandfather                 Je(4)je(2)
The elder brother           Go(4)go(1)
Elder sister                Ze(4)ze(1)
Younger sister              Mui(4)mui(2)
Younger brother             Dai(4)dai(2)
Son's wife                  Sam(1)pou(5)
The elder sister of family  Gaa(1)ze(1)
Thin younger sister         Sai(3)mui(2)
Thin man                    Sai(3)lou(2)
Referring to Table 4 below, part of the knowledge-base rule set KBRS is shown. The KBRS comprises a knowledge-base character field (KBCF) and a rule field (RFLD); for each character or character group in the knowledge-base character field KBCF, the rule field RFLD identifies the pronunciation rule that applies to that character or character group.
Referring to Figs. 2A to 2C, there is illustrated the method 200 for selecting pronunciation identifiers used to determine pronunciation waveforms for text-to-speech synthesis; the method 200 is typically performed on the device 100. The method 200 is suitably invoked at a start step 210 by a user providing a command at the user interface 104, or it may be invoked automatically when the device 100 receives a character string (for example via the wireless communication module 116). After the start step 210, the method 200 performs a selecting step 220 for selecting a character string (CHS) from a text string of characters. A character string is a subset of a sentence of characters and is typically selected from the text string in left-to-right order, which is also the order in which a person reads the text. Accordingly, as will be apparent to a person skilled in the art, the text string is segmented into one or more character strings (CHS) by a dictionary-guided segmentation method or by a method based on statistical rule analysis.
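The specification does not say which dictionary-guided segmentation method is used. As one possible illustration, the sketch below uses greedy left-to-right longest matching (forward maximum matching), which is a common technique swapped in here as an assumption and not a method named in the text. The function segment, the parameter max_len and the placeholder tokens c1 to c6 are all hypothetical; each token stands for one character.

```python
from typing import List, Sequence, Set, Tuple


def segment(chars: Sequence[str],
            lexicon: Set[Tuple[str, ...]],
            max_len: int = 4) -> List[Tuple[str, ...]]:
    """Split a text string into character strings, scanning left to right
    and always taking the longest lexicon entry that matches."""
    strings: List[Tuple[str, ...]] = []
    i = 0
    while i < len(chars):
        # Try the longest candidate first and fall back to a single character.
        for n in range(min(max_len, len(chars) - i), 0, -1):
            candidate = tuple(chars[i:i + n])
            if n == 1 or candidate in lexicon:
                strings.append(candidate)
                i += n
                break
    return strings


# Hypothetical lexicon of multi-character words (one token per character).
LEXICON = {("c1", "c2"), ("c3", "c4", "c5")}
```

With this lexicon, segment(["c1", "c2", "c6", "c3", "c4", "c5"], LEXICON) produces three character strings: the two known words and the unmatched single character between them.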
Table 4: Part of the knowledge-base rule set KBRS
Character (KBCF)    Rule (RFLD)
No                  IF " " is to the left of "No" AND "No" is the last word in the sentence THEN => Mut(6)jau(2); ELSE => mou(5)
Once                IF "surname" or "called" or "Yes" is to the left of "Once" THEN => zang(1); ELSE => cang(4)
Not                 IF "after" or "remember" or " " and a word relating to money is to the right of "Not" THEN => m(4)jiu(3); ELSE => m(4)hou(2)
What                IF " " or "it" or "number" or "gained vote" is to the left of "What" THEN => do(1)siu(2); IF "have" or "having" or "also having" or "having" or "feeling" is to the right of "What" THEN => do(1)siu(2); ELSE => gei(2)do(1)
This                IF "problem" or "target" or "being exactly" or "topic" or "problem" or "plan" or "speed" or "industry" or "key" is to the right of "This" THEN => nei(1)go(3); ELSE => gam(2)
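The IF/THEN/ELSE rules of Table 4 can be sketched as small functions keyed by the character they govern. The names Rule, rule_once, rule_this, KBRS and apply_rule are hypothetical, the English glosses stand in for the original characters, and reading "to the left of" and "to the right of" as the immediately adjacent unit is an assumption, since the table does not state how far the context extends. Only the "Once" and "This" rules are shown, with a subset of the "This" triggers.

```python
from typing import Callable, Dict, List, Optional

# A rule receives the immediate left and right neighbours (None at a boundary).
Rule = Callable[[Optional[str], Optional[str]], str]


def rule_once(left: Optional[str], right: Optional[str]) -> str:
    # Table 4: "surname", "called" or "Yes" on the left selects zang(1).
    return "zang(1)" if left in {"Surname", "Called", "Yes"} else "cang(4)"


def rule_this(left: Optional[str], right: Optional[str]) -> str:
    # Table 4: certain words on the right select nei(1)go(3) (subset shown).
    triggers = {"Problem", "Target", "Topic", "Plan", "Speed", "Industry", "Key"}
    return "nei(1)go(3)" if right in triggers else "gam(2)"


KBRS: Dict[str, Rule] = {"Once": rule_once, "This": rule_this}


def apply_rule(units: List[str], index: int) -> Optional[str]:
    """Return the context-sensitive identifier for units[index],
    or None when the rule set has no rule for that unit."""
    rule = KBRS.get(units[index])
    if rule is None:
        return None
    left = units[index - 1] if index > 0 else None
    right = units[index + 1] if index + 1 < len(units) else None
    return rule(left, right)
```

Keeping each rule as a function of its neighbours means further rows of the rule set can be added without changing apply_rule.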
After step 220, the method performs a step 230 of searching the main dictionary, in which the words in the word field WF1 of the main dictionary PLX are searched for a match with the character string CHS. A test step 240 is then performed to determine whether an identical word matching the character string CHS was found in the word field WF1 (that is, whether the character string CHS is in the main dictionary PLX). If no match was found (the character string is not in the main dictionary PLX), the method 200 performs a searching step 250 for searching the special case dictionary SCLX.
If a test step 260 determines that the character string CHS is not in the special case dictionary SCLX, a segmenting step 265 divides the character string CHS into individual characters (ICH) to create a character set comprising the individual characters of the character string CHS. A step 270 of searching the knowledge-base rule set KBRS is then performed to determine whether the individual characters ICH are in the KBRS (in other words, the search of step 270 determines whether the individual characters have pronunciation identifiers identified in the rule set). If a test step 280 determines that none of the individual characters ICH is in the KBRS, a selecting step 290 simply selects default pronunciation identifiers from the character pronunciation dictionary CPLX. The selecting step is thus effected by searching the character pronunciation dictionary CPLX, which comprises individual characters and corresponding pronunciation identifiers, when the individual characters have no pronunciation identifiers identified in the rule set, and then selecting a pronunciation identifier from the character pronunciation dictionary for each individual character. After step 290, speech synthesis is provided at a step 400, in which the default pronunciation identifiers address and select pronunciation waveforms in the corpus 106, thereby providing a signal to the speaker 112 to render the synthesized speech.
If the test step 280 determines that one or more of the individual characters ICH are in the KBRS, the method 200 performs a determining step 370 that uses the KBRS to determine the context of each individual character ICH. Thereafter, a selecting step 380 selects context-sensitive pronunciation identifiers in the character pronunciation dictionary CPLX according to the context of each individual character ICH identified in the KBRS. Accordingly, context-sensitive pronunciation identifiers are used for the individual characters that have pronunciation identifiers identified in the rule set; they are selected by applying rules in the rule set to the individual characters, wherein the applying includes determining the context of the individual characters within a sentence or phrase. Other individual characters ICH that are not identified in the KBRS are simply assigned their default pronunciation identifiers.
After the selecting step 380, speech synthesis is provided at step 400, in which the selected pronunciation identifiers address and select pronunciation waveforms in the corpus 106, thereby providing a signal to the speaker 112 to render the synthesized speech.
Returning to step 260, if the test step 260 determines that the character string CHS is in the special case dictionary SCLX, a selecting step 300 is performed. The selecting step 300 selects the pronunciation identifier identified by the special case dictionary SCLX of Tables 3A and 3B. After the selecting step 300, speech synthesis is provided at step 400, in which the selected pronunciation identifier addresses and selects pronunciation waveforms in the corpus 106, thereby providing a signal to the speaker 112 to render the synthesized speech.
Returning to step 240, if the test step 240 determines that the character string CHS is in the main dictionary PLX, the method 200 performs a further test step 310 to check whether the formal/informal flag is set. If the flag is not set, there is only one possible pronunciation identifier in the main dictionary matching the character string CHS. Accordingly, at a selecting step 320, the unique pronunciation identifier identified by the main dictionary PLX is selected. Thereafter, speech synthesis is provided at step 400, in which the selected pronunciation identifier addresses and selects pronunciation waveforms in the corpus 106, thereby providing a signal to the speaker 112 to render the synthesized speech.
If step 310 determines that the formal/informal flag is set, there are several possible pronunciation identifiers. The method 200 must therefore determine which identifier to use, so a test step 330 determines whether the character string CHS has an associated marker identifying the type of the character string (that is, a book, film, television drama and so on). Such markers are control characters and include brackets such as "{ ... }", "( ... )" and "< ... >"; they may also be quotation marks or special control characters such as "/ ... /", "| ... |", "* ... *" and "# ... #", between which a group of characters that includes the character string CHS is inserted. If there are one or more associated markers, a selecting step 340 is performed, thereby selecting for the character string the formal pronunciation identifier identified by the main dictionary PLX. Alternatively, if there are no associated markers, a step 345 of searching the knowledge-base rule set is performed to determine whether the character string CHS is in the knowledge-base rule set KBRS. If a test step 350 determines that the character string CHS is not in the KBRS, a selecting step 355 is performed, thereby selecting (for the character string CHS) the informal pronunciation identifier identified by the main dictionary PLX. However, if the test step 350 determines that the character string CHS is in the KBRS, a selecting step 360 is performed, thereby selecting (for the character string CHS) the formal pronunciation identifier identified by the main dictionary PLX and the rules in the KBRS. Accordingly, if the character string CHS has a pronunciation identifier identified in the rule set, the selecting step 360 selects a context-sensitive pronunciation identifier for the character string by applying rules in the rule set to the character string CHS, wherein the applying includes determining the context of the character string within a sentence or phrase.
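The marker test of step 330 and the branch that follows it can be sketched as below. The names MARKER_PAIRS, has_marker and choose_step are hypothetical, and the sketch assumes that a marker pair encloses the character string when the string appears between an opening and a closing marker with no intervening closing marker. The function choose_step returns only the step number from Figs. 2A to 2C that would handle the character string, and deliberately does not pick an identifier.

```python
import re

# Marker pairs listed in the description (brackets, quotes, special characters).
MARKER_PAIRS = [("{", "}"), ("(", ")"), ("<", ">"), ('"', '"'),
                ("/", "/"), ("|", "|"), ("*", "*"), ("#", "#")]


def has_marker(text: str, chs: str) -> bool:
    """Return True if chs lies between any pair of markers in text."""
    for opener, closer in MARKER_PAIRS:
        inner = "[^" + re.escape(closer) + "]*"  # anything except the closer
        pattern = (re.escape(opener) + inner + re.escape(chs)
                   + inner + re.escape(closer))
        if re.search(pattern, text):
            return True
    return False


def choose_step(flag_set: bool, marked: bool, in_rule_set: bool) -> int:
    """Return the selecting step that applies to a string found in PLX."""
    if not flag_set:
        return 320   # unique identifier
    if marked:
        return 340   # formal identifier
    return 360 if in_rule_set else 355  # rule-based, else informal
```

For example, choose_step(True, has_marker("read <the No book> today", "No"), False) returns 340, because the character string sits inside a bracket pair.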
After any one of steps 340, 355 or 360, speech synthesis is provided at step 400, in which the selected pronunciation identifier addresses and selects pronunciation waveforms in the corpus 106, thereby providing a signal to the speaker 112 to render the synthesized speech. After the speech synthesis step 400, an end test step 410 determines whether there are further character strings CHS to be processed. When there are no more character strings to process, the method 200 terminates at an end step 420; otherwise the method 200 returns to the selecting step 220.
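The overall order of lookups in method 200 can be summarized in a short sketch. All names here (PLX, SCLX, CPLX, KBRS, GLOSS_CHARS, select_identifiers, run) are hypothetical, the English glosses stand in for the original characters, and GLOSS_CHARS replaces the character split of step 265, which with real Chinese text would simply be a split into characters. The identifier "DEFAULT_YES" is a placeholder, because the specification does not give the default identifier for "Yes". The marker and rule-set handling for flagged words (steps 330 to 360) is not modelled; such words are returned with all their candidates.

```python
from typing import Dict, List, Optional, Tuple

PLX: Dict[str, Tuple[bool, List[str]]] = {      # word -> (flag set, identifiers)
    "The batter": (False, ["gik(1)kau(4)sau(2)"]),
    "No": (True, ["Mut(6)jau(2)", "mou(5)"]),
}
SCLX: Dict[str, str] = {"The longan hole": "Lung(4)ngaan(5)dung(2)"}
CPLX: Dict[str, List[str]] = {"Yes": ["DEFAULT_YES"],   # placeholder value
                              "Once": ["Zang(1)", "Cang(4)"]}
GLOSS_CHARS: Dict[str, List[str]] = {"Being once": ["Yes", "Once"]}


def rule_once(left: Optional[str]) -> str:
    # Table 4 rule for "Once", reading "left" as the adjacent character.
    return "zang(1)" if left in {"Surname", "Called", "Yes"} else "cang(4)"


KBRS = {"Once": rule_once}


def select_identifiers(chs: str) -> Tuple[str, List[str]]:
    """Return (where the string was resolved, selected identifiers)."""
    if chs in PLX:                                   # steps 230-240
        flag_set, identifiers = PLX[chs]
        if not flag_set:
            return "PLX", [identifiers[0]]           # step 320
        return "PLX-flagged", list(identifiers)      # steps 330-360 omitted
    if chs in SCLX:                                  # steps 250-260
        return "SCLX", [SCLX[chs]]                   # step 300
    chars = GLOSS_CHARS[chs]                         # step 265
    selected = []
    for i, ch in enumerate(chars):
        rule = KBRS.get(ch)                          # steps 270-280
        if rule is not None:
            selected.append(rule(chars[i - 1] if i else None))  # steps 370-380
        else:
            selected.append(CPLX[ch][0])             # step 290
    return "characters", selected


def run(strings: List[str]) -> List[Tuple[str, List[str]]]:
    """Process each character string in turn (loop of steps 220 to 410)."""
    return [select_identifiers(s) for s in strings]
```

Calling run(["The batter", "The longan hole", "Being once"]) resolves the three strings through the main dictionary, the special case dictionary and the character-level path respectively.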
The present invention advantageously allows TTS based on text strings of Chinese characters, thereby providing synthesized speech in a Chinese dialect such as Cantonese. In essence, the invention selects a character string and determines whether the character string is in the main dictionary PLX. When the character string is not in the main dictionary, it is divided into individual characters, and the knowledge-base rule set KBRS is searched to determine whether the individual characters have pronunciation identifiers identified in the rule set. Thereafter, context-sensitive pronunciation identifiers are selected for the individual characters that have pronunciation identifiers identified in the knowledge-base rule set KBRS. The context-sensitive pronunciation identifiers are selected by applying rules in the rule set to the individual characters, wherein the applying includes determining the context of the individual characters within a sentence or phrase. The knowledge-base rule set KBRS is also used for character strings CHS that are identified in the main dictionary PLX. In addition, the special case dictionary SCLX and the markers used for books, television dramas and the like add to the advantages of the invention. The invention therefore addresses the dependence of the pronunciation of a character or word on its context, the pronunciation of dialect place names and family member names, and formal and informal dialect pronunciations.
To further illustrate the advantages of the invention, the following examples are provided.
Example 1: choose character string CHS " batter " choosing step 220, step 240 will be determined " batter " in main dictionary PLX then, and testing procedure 310 will be determined not to be provided with formally/unofficial sign.Therefore, in step 340, be unique pronunciation gik (1) Kau (4) sau (2) of " batter " sign.
Example 2: choose character string CHS and " do not have " choosing step 220, step 240 will be determined " not having " in main dictionary PLX then, and testing procedure 310 is determined to be provided with formally/unofficial sign.Therefore, around determining character string CHS, step 330 has mark, such as "<... do not have ...〉", wherein also can other character in the mark, be the formal voice identifier symbol of " not having " sign mou (5).Replacedly, if around character string CHS, do not have mark, so after step 330, at step 345 retrieval knowledge base rule collection KBRS.If " do not have " not in Knowledge Base rule set KBRS, will choose unofficial voice identifier symbol Mot (6) Jau (2) in step 355.But when " not having " in Knowledge Base rule set KBRS the time, the rule in step 360 service regeulations field RFLD is chosen a) Mot (6) Jau (2) or is chosen b) mou (5).
Example 3: choose character string CHS " longan hole " choosing step 220, step 240 will determine that " longan hole " be not in main dictionary PLX then, testing procedure 260 will be determined " longan hole " in special case dictionary SCLX, therefore choose voice identifier symbol Lung (4) ngaan (5) dung (2).
Example 4: choose character string CHS " being once " choosing step 220, step 240 will determine that " being once " be not in main dictionary PLX then, testing procedure 260 is determined " being once " not in special case dictionary SCLX, therefore this character string CHS is divided into two character "Yes" and " once ".Because " once " identifies in Knowledge Base character field KBCF, execution in step 370 and 380, thereby, and be given its default value of "Yes" by character pronunciation dictionary sign for " once " chooses voice identifier symbol zang (1) (because "Yes" is on the left side of " once ").
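The formal/informal branch walked through in Example 2 (steps 310 to 360) can be sketched as follows. This is a sketch under stated assumptions rather than the method as claimed: the `Entry` class, the function name, and the `prefers_formal` rule are hypothetical, and the rule's trigger (a "[news]" prefix) is invented purely for illustration, since the text does not say what the rule in the rule field RFLD tests.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Entry:
    """Main dictionary entry; `formal` is set only when the flag is set."""
    default: str                  # unique or informal designator
    formal: Optional[str] = None  # formal designator, if the entry has one


# English glosses stand in for the Chinese strings of Examples 1 and 2.
MAIN_DICTIONARY: Dict[str, Entry] = {
    "batter": Entry(default="gik (1) Kau (4) sau (2)"),
    "do not have": Entry(default="Mot (6) Jau (2)", formal="mou (5)"),
}

# A rule inspects the surrounding sentence and answers "use the formal form?".
ContextRule = Callable[[str], bool]


def prefers_formal(sentence: str) -> bool:
    """Hypothetical rule: the '[news]' trigger is an invented placeholder."""
    return sentence.startswith("[news]")


def select_for_entry(
    string: str,
    has_markup: bool,
    sentence: str,
    rule_set: Dict[str, ContextRule],
) -> str:
    entry = MAIN_DICTIONARY[string]
    if entry.formal is None:         # step 310: no formal/informal flag
        return entry.default         # step 340: unique pronunciation
    if has_markup:                   # step 330: markup surrounds the string
        return entry.formal
    rule = rule_set.get(string)      # step 345: search the rule set
    if rule is None:
        return entry.default         # step 355: informal designator
    # step 360: the rule decides between the two designators
    return entry.formal if rule(sentence) else entry.default
```

For instance, `select_for_entry("do not have", False, "", {})` falls through to the informal designator, while passing `has_markup=True` returns the formal one regardless of the rule set.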
The detailed description provides preferred exemplary embodiments only and is not intended to limit the scope, applicability, or configuration of the invention. Rather, the detailed description of the preferred exemplary embodiments provides those skilled in the art with an enabling description for implementing a preferred exemplary embodiment of the invention. It should be understood that various changes may be made in the function and arrangement of the elements without departing from the spirit and scope of the invention as set forth in the appended claims.

Claims (10)

1. A method of selecting pronunciation designators for determining pronunciation waveforms for text-to-speech synthesis, the method comprising:
(i) selecting a character string;
(ii) determining whether the character string is in a main dictionary;
(iii) dividing the character string into individual characters, the dividing being performed when the character string is not in the main dictionary;
(iv) searching a rule set to determine whether the individual characters have pronunciation designators identified in the rule set; and
(v) selecting context-sensitive pronunciation designators for the individual characters that have pronunciation designators identified in the rule set, the context-sensitive pronunciation designators being selected by applying rules in the rule set to the individual characters, wherein the applying includes determining the context of the individual characters within a sentence or phrase.
2. The method of claim 1, wherein the determining step (ii) further comprises the steps of:
(vi) searching the rule set to determine whether the character string has a pronunciation designator identified in the rule set, the searching being performed only when the character string is not in the main dictionary; and
(vii) if its designator is identified in the rule set, selecting a context-sensitive pronunciation designator for the character string, the context-sensitive pronunciation designator being selected by applying rules in the rule set to the character string, wherein the applying includes determining the context of the character string within a sentence or phrase.
3. The method of claim 2, wherein the determining step (ii) further comprises the steps of:
(viii) checking whether the character string has associated markup or control characters identifying the character string, the checking being performed only when the character string is not in the main dictionary; and
(ix) when the character string has associated markup or control characters, selecting a formal pronunciation designator for the character string from the main dictionary.
4. The method of claim 3, wherein the determining step (ii) further comprises the steps of:
(x) searching the rule set to determine whether the character string has a pronunciation designator identified in the rule set; and
(xi) selecting a context-sensitive pronunciation designator for the character string that has its pronunciation designator identified in the rule set, the context-sensitive pronunciation designator being selected by applying rules in the rule set to the character string, the applying including determining the context of the character string within a sentence or phrase, and wherein, when the character string does not have its pronunciation designator identified in the rule set, the informal or default designator identified by the main dictionary is selected as the pronunciation designator of the character string.
5. The method of claim 1, further characterized in that at least some characters in the main dictionary have both formal and informal pronunciation designators.
6. The method of claim 1, wherein the selecting step (v) further comprises the steps of:
(xii) when an individual character does not have a pronunciation designator identified in the rule set, searching a character pronunciation dictionary, the character pronunciation dictionary comprising individual characters and corresponding pronunciation designators; and
(xiii) selecting a pronunciation designator for each such individual character from the character pronunciation dictionary.
7. The method of claim 1, further comprising the step of performing speech synthesis for each selected pronunciation designator.
8. The method of claim 7, wherein the speech synthesis is performed by using the pronunciation designators to select pronunciation waveforms from a pronunciation corpus.
9. The method of claim 8, wherein the method is performed on an electronic device.
10. The method of claim 1, wherein the method comprises a prior step of segmenting a text string to provide the character string.
CNB2004100319757A 2004-03-31 2004-03-31 Selection of pronunciation designator for determining pronunciation wave-shape for text-to-speed conversion and synthesis Expired - Lifetime CN100371928C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2004100319757A CN100371928C (en) 2004-03-31 2004-03-31 Selection of pronunciation designator for determining pronunciation wave-shape for text-to-speed conversion and synthesis

Publications (2)

Publication Number Publication Date
CN1677488A CN1677488A (en) 2005-10-05
CN100371928C true CN100371928C (en) 2008-02-27

Family

ID=35049967

Country Status (1)

Country Link
CN (1) CN100371928C (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE135392C (en) *
JPH09237096A (en) * 1996-03-01 1997-09-09 Nippon Telegr & Teleph Corp <Ntt> Kanji (chinese character) explaining method and device
US6219646B1 (en) * 1996-10-18 2001-04-17 Gedanken Corp. Methods and apparatus for translating between languages
CN1379342A (en) * 2001-03-30 2002-11-13 株式会社东芝 Chinese language input translation processing device and Chinese language translation processing method

Also Published As

Publication number Publication date
CN1677488A (en) 2005-10-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: NIUANSI COMMUNICATION CO., LTD.

Free format text: FORMER OWNER: MOTOROLA INC.

Effective date: 20101008

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: ILLINOIS STATE, USA TO: DELAWARE STATE, USA

TR01 Transfer of patent right

Effective date of registration: 20101008

Address after: Delaware

Patentee after: NUANCE COMMUNICATIONS, Inc.

Address before: Illinois, USA

Patentee before: Motorola, Inc.

CX01 Expiry of patent term

Granted publication date: 20080227
