CN106856091A - Automatic broadcasting method and system for multi-language text - Google Patents

Info

Publication number
CN106856091A
Authority
CN
China
Prior art keywords
languages
report
language text
language
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611195723.7A
Other languages
Chinese (zh)
Inventor
原树旗
雷宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Intelligent Housekeeper Technology Co Ltd
Original Assignee
Beijing Intelligent Housekeeper Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Intelligent Housekeeper Technology Co Ltd
Priority to CN201611195723.7A
Publication of CN106856091A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an automatic broadcasting method and system for multi-language text. The method performs language identification on the multi-language text to be broadcast and obtains the text segments corresponding to each of the languages; marks the start and end points of each text segment to obtain language tags; and, according to the language tags, invokes the voice broadcast model corresponding to each language and broadcasts the text segments in turn. The system comprises a text identification module and a voice broadcast module. The invention is intelligent and flexible, identifies and broadcasts multi-language text quickly, accurately, and automatically, and avoids omitting language information when identifying multi-language text.

Description

Automatic broadcasting method and system for multi-language text
Technical field
The present invention relates to the field of speech conversion technology, and in particular to an automatic broadcasting method and system for multi-language text.
Background
Many devices now provide a TTS (Text to Speech: converting text into sound and playing it) function. Converting text into speech generally relies on a professional speech engine. The engine selects a timbre and a specific language and trains a voice on them; once training is complete, it can synthesize audio in that language. A given voice broadcast model therefore generally corresponds to one specific language. If the language being synthesized does not match the voice broadcast model, the text may be mispronounced or may not be voiced at all.
Consequently, if a passage contains both Chinese and English and a single voice broadcast model is used for all of it, the above problem occurs, so a different voice broadcast model must be selected for each part of the text according to its content.
At present this problem is generally solved by audio splicing: the Chinese voice broadcast model first synthesizes the Chinese audio, the English voice broadcast model then synthesizes the English audio, and the results are finally spliced into one audio clip. Take the text "The next stop is Wangjing soho, please get ready to get off". The Chinese voice broadcast model first synthesizes "The next stop is Wangjing", the English voice broadcast model synthesizes "soho", the Chinese voice broadcast model then synthesizes "please get ready to get off", and finally the three audio clips are spliced into one. This method is inflexible: with longer text, the workload becomes excessive and the splicing accuracy is low.
Summary of the invention
To address the defects of the prior art, the present invention provides an automatic broadcasting method and system for multi-language text. The method and system are intelligent and flexible, identify and broadcast multi-language text quickly, accurately, and automatically, and avoid omitting language information when identifying multi-language text.
To solve the above technical problem, the present invention provides the following technical solutions:
In one aspect, the invention provides an automatic broadcasting method for multi-language text, including:
performing language identification on the multi-language text to be broadcast, and obtaining the text segments corresponding to each of the languages;
marking the start and end points of each text segment to obtain language tags;
and, according to the language tags, invoking the voice broadcast model corresponding to each language and broadcasting the text segments in turn;
wherein each language tag includes the language of the current text segment and its broadcast sequence number.
Further, obtaining the text segments corresponding to each of the languages includes:
performing language identification on the multi-language text using a preset identification strategy, and dividing the multi-language text by language into multiple text segments of different languages;
marking the start and end points of each text segment, each mark including the language of the current text segment and its broadcast sequence number.
Further, performing language identification on the multi-language text using the preset identification strategy includes:
taking the first character of the multi-language text as the starting point and filtering each character in turn; when a current character is found whose character rule differs from that of the previous character, determining that the current character belongs to a different language from the previous character, and obtaining the language of the current character according to the language rules;
marking, between the current character and the previous character, the end mark of the previous character and the start mark of the current character.
Further, obtaining the language of the current character according to the language rules includes:
when the language of the current character is determined to be western characters according to the language rules, and the pinyin discrimination rules show that the current western character is Chinese pinyin, updating its language to Chinese pinyin characters;
wherein the language rules include the preset encoding rules of each language, and the pinyin discrimination rules include the initials and finals of pinyin, or permutations and combinations of the two.
Further, invoking the voice broadcast model corresponding to each language according to the language tags includes:
invoking, according to all the languages present in the multi-language text, the voice broadcast model corresponding to each of those languages;
having the voice broadcast models output, according to the language tags in the multi-language text, the broadcast speech corresponding to each text segment;
synthesizing the broadcast speech in the order of the broadcast sequence numbers in the language tags to obtain the voice information corresponding to the multi-language text;
and sending the voice information to a playback center for broadcasting.
Further, synthesizing the broadcast speech in the order of the broadcast sequence numbers in the marks includes:
storing the broadcast speech of each text segment, by language, in the mapping table corresponding to that language;
synthesizing the broadcast speech in the mapping tables in the order of the broadcast sequence numbers to obtain the voice information corresponding to the multi-language text.
Further, the method also includes:
obtaining the text information to be broadcast;
reading the text information and judging whether it contains more than one language;
if so, determining that the text information is multi-language text;
otherwise, directly invoking the voice broadcast model corresponding to the text information to broadcast it.
In another aspect, the invention also provides an automatic broadcasting system for multi-language text, including:
a text identification module, configured to perform language identification on the multi-language text to be broadcast, obtain the text segments corresponding to each of the different languages, and mark the start and end points of each text segment to obtain language tags;
a voice broadcast module, configured to invoke, according to the language tags, the voice broadcast model corresponding to each language and to broadcast the text segments in turn;
wherein each tag includes the language of the current text segment and its broadcast sequence number.
Further, the text identification module includes:
a text segment division unit, configured to perform language identification on the multi-language text using a preset identification strategy and to divide the multi-language text by language into multiple text segments of different languages;
a text segment marking unit, configured to mark the start and end points of each text segment, each mark including the language of the current text segment and its broadcast sequence number.
Further, the voice broadcast module includes:
a text segment speech output unit, configured to invoke, according to all the languages present in the multi-language text, the voice broadcast model corresponding to each of those languages, and to have the voice broadcast models output, according to the language tags in the multi-language text, the broadcast speech corresponding to each text segment;
a voice information synthesis unit, configured to synthesize the broadcast speech in the order of the broadcast sequence numbers in the language tags to obtain the voice information corresponding to the multi-language text;
a voice information sending unit, configured to send the voice information to a playback center for broadcasting.
As the above technical solutions show, in the automatic broadcasting method and system for multi-language text of the present invention, the method performs language identification on the multi-language text to be broadcast and obtains the text segments corresponding to each of the languages; marks the start and end points of each text segment to obtain language tags; and, according to the language tags, invokes the voice broadcast model corresponding to each language and broadcasts the text segments in turn. Multi-language text is thus identified and broadcast quickly and accurately. The process of identifying and tagging the languages of the text is reliable and accurate, and avoids omitting language information when identifying multi-language text. Multi-language text is broadcast automatically, the voice broadcast models are applied flexibly, manual workload is reduced, and time is saved. The order of synthesis is guaranteed, and single-language text is distinguished from multi-language text, which makes the method more intelligent and flexible.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a specific implementation of the automatic broadcasting method for multi-language text in embodiment one of the invention;
Fig. 2 is a flow diagram of a specific implementation of step 100 of the automatic broadcasting method in embodiment two of the invention;
Fig. 3 is a flow diagram of a specific implementation of step 300 of the automatic broadcasting method in embodiment three of the invention;
Fig. 4 is a flow diagram of a specific implementation of step 303 of the automatic broadcasting method in embodiment four of the invention;
Fig. 5 is a flow diagram of a specific implementation of the automatic broadcasting method including steps A01 to A04 in embodiment five of the invention;
Fig. 6 is a structural diagram of a specific implementation of the automatic broadcasting system for multi-language text in embodiment six of the invention;
Fig. 7 is a structural diagram of a specific implementation of the text identification module 10 of the automatic broadcasting system in embodiment seven of the invention;
Fig. 8 is a structural diagram of a specific implementation of the voice broadcast module 20 of the automatic broadcasting system in embodiment eight of the invention.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Embodiment one of the invention provides a specific implementation of the automatic broadcasting method for multi-language text. Referring to Fig. 1, the automatic broadcasting method includes the following:
Step 100: Perform language identification on the multi-language text to be broadcast, and obtain the text segments corresponding to each of the languages.
In step 100, when the text received for broadcast is multi-language text, broadcasting it by audio splicing would suffer from excessive workload and low splicing accuracy. The technical solution of the invention is therefore used: first obtain the languages involved in the multi-language text and the language of each part of its content; then, taking each point at which the language changes as a boundary, divide the multi-language text into multiple text segments. In other words, adjacent text segments are in different languages.
Step 200: Mark the start and end points of each text segment to obtain the language tags.
In step 200, the start and end points of each text segment obtained by the division are marked to obtain the language tags. A language tag includes at least the language of the current text segment and its broadcast sequence number, and may also include information such as the chapter or keywords of the current text segment, so that the segment's position can be found quickly if the sequence number is lost or mislabeled.
Step 300: According to the language tags, invoke the voice broadcast model corresponding to each language and broadcast the text segments in turn.
In step 300, the voice broadcast model corresponding to each language is invoked according to the language tags. A voice broadcast model may be a professional speech engine, which selects a timbre and a specific language, trains a voice on them, and after training can synthesize audio in that language. The voice broadcast models then broadcast the text segments in turn, in the order of the broadcast sequence numbers in the language tags.
As the above description shows, this embodiment selects the voice broadcast model automatically according to the content of the text, and identifies and broadcasts multi-language text quickly and accurately.
Embodiment two of the invention provides a specific implementation of step 100 of the above automatic broadcasting method. Referring to Fig. 2, step 100 includes the following:
Step 101: Perform language identification on the multi-language text using a preset identification strategy, and divide the multi-language text by language into multiple text segments of different languages.
In this step, each character is filtered in turn, starting from the first character of the multi-language text. When a current character is found whose character rule differs from that of the previous character, the current character is determined to belong to a different language from the previous character, its language is obtained according to the language rules, and the end mark of the previous character and the start mark of the current character are placed between the two. Obtaining the language according to the language rules includes: when the language of the current character is determined to be western characters according to the language rules, and the pinyin discrimination rules show that the current western character is Chinese pinyin, its language is updated to Chinese pinyin characters. The language rules include the preset encoding rules of each type of language, and the pinyin discrimination rules include the initials and finals of pinyin, or permutations and combinations of the two.
Step 102: Mark the start and end points of each text segment, each mark including the language of the current text segment and its broadcast sequence number.
In this step, the start and end points of each text segment can be marked directly by a computer program according to preset rules. The preset rules may number the text segments of each language separately, or number the text segments of all languages in a single sequence, to obtain the language tags. For example, suppose the text segments are in English, Chinese, and German, and appear in this order: "English segment 1, Chinese segment 2, Chinese segment 3, Chinese segment 4, English segment 5, German segment 6, English segment 7, German segment 8, Chinese segment 9". The language tags of these segments may then be "E1, C1, C2, C3, E2, G1, E3, G2, C4", or alternatively "E1, C2, C3, C4, E5, G6, E7, G8, C9", where E denotes English, C Chinese, and G German. As the above description shows, this embodiment gives a detailed process for identifying and tagging the languages of multi-language text; the process is reliable and accurate and avoids omitting language information when identifying multi-language text.
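The two numbering schemes in the example above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `tag_per_language` and `tag_global` and the `PREFIX` table are hypothetical and do not come from the patent.

```python
from collections import defaultdict

# Hypothetical one-letter prefixes, matching the E/C/G letters of the example.
PREFIX = {"english": "E", "chinese": "C", "german": "G"}

def tag_per_language(langs):
    """Number the segments of each language separately (E1, C1, C2, ...)."""
    counters = defaultdict(int)
    tags = []
    for lang in langs:
        counters[lang] += 1
        tags.append(f"{PREFIX[lang]}{counters[lang]}")
    return tags

def tag_global(langs):
    """Number all segments in one running sequence (E1, C2, C3, ...)."""
    return [f"{PREFIX[lang]}{i}" for i, lang in enumerate(langs, start=1)]

# Segment languages in the order given in the example.
ORDER = ["english", "chinese", "chinese", "chinese", "english",
         "german", "english", "german", "chinese"]
```

Only the second scheme gives every segment a unique position on its own; with the first scheme the original order has to be kept by some other means.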
Embodiment three of the invention provides a specific implementation of step 300 of the above automatic broadcasting method. Referring to Fig. 3, step 300 includes the following:
Step 301: According to all the languages present in the multi-language text, invoke the voice broadcast model corresponding to each of those languages.
In this step, the voice broadcast models may be provided by a TTS synthesis engine; for example, voice broadcast model LILI supports Chinese and voice broadcast model Allision supports English. A representative TTS product is TTSUU (Text-to-Speech Universal Utility), a text-reading program developed in China. It offers 20 levels of pitch adjustment and 20 levels of speaking-rate adjustment, determines pauses automatically from punctuation when reading aloud, and lets the user set a pause of arbitrary length at any position in the text. TTSUU can export text to WAV and MP3 files together with matching LRC and SMI synchronized lyric and subtitle files, and can also record to WAV and MP3 files. Besides helping students learn languages through functions such as switching speech engines, repeated reading, slowing down or speeding up reading, and raising or lowering the pitch, it provides nearly 30 speech engines covering Chinese, English, Japanese, Korean, German, French, Spanish, Portuguese, and Russian.
Step 302: The voice broadcast models output, according to the language tags in the multi-language text, the broadcast speech corresponding to each text segment.
Step 303: Synthesize the broadcast speech in the order of the broadcast sequence numbers in the language tags to obtain the voice information corresponding to the multi-language text.
Step 304: Send the voice information to the playback center for broadcasting.
As the above description shows, this embodiment broadcasts multi-language text automatically by means of the voice broadcast model corresponding to each language; the voice broadcast models are applied flexibly, manual workload is reduced, and time is saved.
Embodiment four of the invention provides a specific implementation of step 303 of the above automatic broadcasting method. Referring to Fig. 4, step 303 includes the following:
Step 303a: Store the broadcast speech of each text segment, by language, in the mapping table corresponding to that language.
Step 303b: Synthesize the broadcast speech in the mapping tables in the order of the broadcast sequence numbers to obtain the voice information corresponding to the multi-language text.
As the above description shows, this embodiment gives a detailed process for synthesizing the broadcast speech in the order of the broadcast sequence numbers in the marks to obtain the voice information of the multi-language text, which guarantees the order of synthesis.
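Steps 303a and 303b can be pictured with the following minimal sketch. It assumes that the broadcast sequence numbers are unique across all languages (the single running sequence) and uses plain byte strings as stand-ins for synthesized audio; real audio clips would need a proper audio container rather than raw concatenation. The names `store_by_language` and `merge_in_order` are hypothetical.

```python
def store_by_language(clips):
    """Step 303a (sketch): one table per language, keyed by broadcast sequence number.

    clips is an iterable of (language, sequence_number, audio_bytes).
    """
    tables = {}
    for lang, seq, audio in clips:
        tables.setdefault(lang, {})[seq] = audio
    return tables

def merge_in_order(tables):
    """Step 303b (sketch): join the clips of all tables in sequence-number order."""
    entries = [(seq, audio)
               for table in tables.values()
               for seq, audio in table.items()]
    entries.sort(key=lambda entry: entry[0])
    return b"".join(audio for _, audio in entries)

# Placeholder "audio" for a Chinese, English, Chinese sequence.
CLIPS = [("chinese", 1, b"A"), ("english", 2, b"B"), ("chinese", 3, b"C")]
```

Because the order is recovered from the sequence numbers alone, the clips may be synthesized and stored in any order before merging.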
Embodiment five of the invention provides a specific implementation of steps A01 to A04, which precede step 100 in the above automatic broadcasting method. Referring to Fig. 5, steps A01 to A04 include the following:
Step A01: Obtain the text information to be broadcast.
Step A02: Read the text information and judge whether it contains more than one language; if so, go to step A03, otherwise go to step A04.
Step A03: Determine that the current text information is multi-language text.
Step A04: Directly invoke the voice broadcast model corresponding to the text information so that this model broadcasts the text information.
As the above description shows, this embodiment judges whether text is multi-language text and so distinguishes single-language text from multi-language text, which makes the method more intelligent and flexible.
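The branch in steps A02 to A04 amounts to counting the distinct languages in the text. The sketch below assumes only two languages, Chinese (the code range 0x4e00-0x9fbb quoted later in the description) and English (ASCII letters), and ignores pinyin; `languages_in` and `route` are hypothetical names.

```python
def languages_in(text):
    """Collect the languages that occur in text (Chinese and English only)."""
    langs = set()
    for ch in text:
        if "\u4e00" <= ch <= "\u9fbb":
            langs.add("chinese")
        elif ch.isascii() and ch.isalpha():
            langs.add("english")
        # Digits, spaces and punctuation do not count as a language here.
    return langs

def route(text):
    """Step A02 (sketch): 'multi' leads to step A03, 'single' to step A04."""
    langs = sorted(languages_in(text))
    return ("multi" if len(langs) > 1 else "single", langs)
```

For instance, `route("Good morning.")` takes the single-language branch, while a string mixing Chinese characters with "soho" takes the multi-language branch.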
To further illustrate the solution, the invention also provides an application example of the automatic broadcasting method for multi-language text, as follows:
The voice broadcast model is switched automatically according to the content of the text. If a passage contains both Chinese and English, a suitable voice broadcast model is selected according to the text content so that it is voiced correctly. Specifically:
1. Filter the language categories in the text: find all types of written language in the text using methods such as regular expressions together with the character encoding.
2. Find the start and end positions of every run of each written language.
3. Obtain the list of corresponding voice broadcast models from the voice library according to language category.
4. Split the text into multiple text segments according to the start and end positions of each language.
5. Play the text segments one by one, each synthesized and played with its corresponding voice broadcast model.
The detailed process is as follows:
1. Filter the language categories in the text and find all types of written language in it. A passage may contain Chinese, English, and also pinyin. When screening language categories, pinyin consists of English letters but must be handled as Chinese; the TTS synthesis engine can automatically convert pinyin into Chinese text for reading.
Take the following text (Chinese apart from "dog" and "da meng"): "Hello, I am the robot Dameng dog, you can call me da meng". The fragments broadcast with the Chinese voice broadcast model are "Hello, I am the robot Dameng" and "you can call me da meng"; the fragment broadcast with the English voice broadcast model is "dog".
The languages in the text are separated as follows. Chinese characters are encoded in Unicode (the coding range of CJK unified ideographs is 0x4e00-0x9fbb), so a regular expression over this code range can determine whether a character is Chinese. English is identified directly by the letter ranges A-Z and a-z.
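A per-character check of this kind might look like the sketch below. It uses exactly the code range quoted above for Chinese and the letter ranges A-Z and a-z for English; the function name `classify_char` and the labels it returns are assumptions made for the illustration.

```python
import re

# Code range for Chinese as quoted in the description, and the basic Latin letters.
CHINESE_RE = re.compile(r"[\u4e00-\u9fbb]")
LATIN_RE = re.compile(r"[A-Za-z]")

def classify_char(ch):
    """Return 'chinese', 'western', or None for any other character.

    None covers spaces, digits and punctuation, including full-width
    punctuation such as the Chinese comma.
    """
    if CHINESE_RE.fullmatch(ch):
        return "chinese"
    if LATIN_RE.fullmatch(ch):
        return "western"
    return None
```

A character classified as "western" is not yet known to be English, since it may still be part of a pinyin syllable.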
After the English has been screened out, it is necessary to judge whether it is actually pinyin. Pinyin is made up of initials and finals; some finals can stand alone, while others must be combined with an initial to form a complete pinyin syllable. All of these combinations are stored in a database, and the screened English is matched against it. If it matches, it is considered pinyin and is handled as Chinese. This completes the language category screening.
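The pinyin test can be approximated without a database by checking whether a token is a stand-alone final or an initial followed by a final. The sketch below is deliberately simplified and rests on several assumptions: the tables are written out by hand, `v` stands in for ü, not every initial and final combination it accepts is a real syllable, syllables must be separated by spaces, and short English words such as "me" also pass. All names are hypothetical.

```python
# Two-letter initials come first so that "zh" is tried before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")
FINALS = {"a", "o", "e", "i", "u", "v", "ai", "ei", "ui", "ao", "ou", "iu",
          "ie", "ue", "ve", "er", "an", "en", "in", "un", "vn", "ang", "eng",
          "ing", "ong", "ia", "iao", "ian", "iang", "iong", "ua", "uo",
          "uai", "uan", "uang"}
# Finals that can form a syllable without an initial.
STANDALONE_FINALS = {"a", "o", "e", "ai", "ei", "ao", "ou",
                     "an", "en", "ang", "eng", "er"}

def is_pinyin_syllable(token):
    """True if token is a stand-alone final or an initial plus a final."""
    token = token.lower()
    if token in STANDALONE_FINALS:
        return True
    for initial in INITIALS:
        if token.startswith(initial) and token[len(initial):] in FINALS:
            return True
    return False

def is_pinyin(run):
    """True if every space-separated token of a western-letter run is a syllable."""
    tokens = run.split()
    return bool(tokens) and all(is_pinyin_syllable(t) for t in tokens)
```

With these tables "da meng" is accepted as pinyin, while "dog", "Good morning" and "adfgc" are rejected and would be handled as English.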
2. Find the start and end positions of each language. Starting from the first character, the characters are screened one by one using the rules of the first step. If the rule of the current character differs from that of the previous one, a language run is considered to have ended, and the start and end positions of that run are recorded. This continues until the last character.
With this logic, the start and end positions of all language fragments can be found, and the fragments of each language can be cut out according to those positions.
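The scan described above can be sketched as a single pass that records a boundary whenever the language changes. Two assumptions are made here that the description does not spell out: characters with no language of their own (spaces, digits, punctuation) stay in the current fragment, and pinyin detection is left out. The sample string is a Chinese rendering of the background's "next stop" sentence and is itself an assumption, since the text here is a translation; `split_by_language` is a hypothetical name.

```python
def classify_char(ch):
    """Chinese by the quoted code range, English by A-Z / a-z, else None."""
    if "\u4e00" <= ch <= "\u9fbb":
        return "chinese"
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "english"
    return None

def split_by_language(text):
    """Return a list of (language, start, end) tuples, with end exclusive."""
    segments = []
    current_lang = None
    start = 0
    for i, ch in enumerate(text):
        lang = classify_char(ch)
        if lang is None:
            continue  # neutral characters stay in the current fragment
        if current_lang is None:
            current_lang = lang  # first character that has a language
        elif lang != current_lang:
            segments.append((current_lang, start, i))  # close the previous run
            current_lang, start = lang, i
    if text:
        segments.append((current_lang, start, len(text)))
    return segments

SAMPLE = "下一站是望京soho，请注意下车"
```

Slicing `SAMPLE` with the returned positions gives three fragments; the full-width comma stays with "soho" because of the neutral-character assumption.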
3. Select a voice broadcast model for each language fragment. A TTS synthesis engine may have several voice libraries; each voice library has a different voice, and the languages each TTS supports also differ. After the system introduces a TTS synthesis engine, all languages supported by the current engine are mapped to their corresponding TTS synthesizers and stored in a mapping table. The mapping table is a locally stored file that records each TTS and the list of languages it supports. For example, if voice broadcast model LILI supports Chinese and voice broadcast model Allision supports English, the mapping table is stored in the following format:
Chinese->LILI
English->Allision
In the second step the passage was split into multiple fragments. Combining the first and second steps, the language of each fragment is known; the corresponding synthesizer is then looked up in the mapping table and used for synthesis, producing one speech fragment per text fragment.
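Reading the mapping table and looking up a voice for each fragment might be sketched as follows. The two table lines are the ones given above; the parsing function, the lookup function, and the idea of holding the file contents in a string are assumptions made for the illustration, and no real TTS engine is called.

```python
# Contents of the mapping file, in the "language->voice" format shown above.
MAPPING_TEXT = """\
Chinese->LILI
English->Allision
"""

def parse_mapping(text):
    """Turn 'language->voice' lines into a dictionary."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        language, _, voice = line.partition("->")
        table[language.strip()] = voice.strip()
    return table

def voices_for(fragments, table):
    """Pair each (language, text) fragment with the voice that should read it."""
    return [(table[language], text) for language, text in fragments]
```

A fragment whose language is missing from the table raises a `KeyError` here; a real system would need a fallback voice or an explicit error path.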
4. Play the speech fragments. The playback center is responsible for playing the synthesized speech. Following step 3, every language fragment is synthesized into a speech fragment by its voice broadcast model and sent to the playback center, where the fragments are played one after another, so that they sound like one complete utterance.
5th, for example, such as robot has the explanatory note of one section of self-introduction as follows:Hello, and I is cried up to sprouting, and I has A lot of abilities, also understand many English, and the pronunciation of English that good morning is:Good morning.I is very serious.
In this implementation method, directly this section of word can be input into, after this string literal is connected to, understood according to the first step, All of category of language in filtering word, takes Chinese and English to illustrate in this example, all of Chinese character is unified in computer Using Unicode codings, (coding that CJK unifies Chinese character is interval:0x4e00-0x9fbb), filtered since first Chinese character, see Whether the interval encoded at this, if illustrating it is Chinese, all of English, using ASSIC coded systems, if not In Chinese interval, then judge whether interval in the coding of English.
In this way, each individual character can be identified as Chinese or English. For the sentence "Hello, my name is Dameng. I have many skills and I also know a lot of English. The English pronunciation of 'good morning' is: Good morning.", applying this rule determines that everything up to and including "The English pronunciation of 'good morning' is:" is encoded as Chinese. When "G" is examined, its code is not in the Chinese character range 0x4e00-0x9fbb, so it is looked up in the ASCII codes, where it is found, and it is therefore considered an English character. The search continues through to the character "d". A space follows "d", which indicates that one English span has ended. This English span may be a pinyin string or an English word; if it is not pinyin, it is processed as an English word. Pinyin can be judged by the combination of initials and finals: a pinyin syllable consists of an initial plus a final, or of a final alone. If the span fits such a combination, it is pinyin; otherwise it is considered an English word. A string such as "adfgc", which is neither pinyin nor an English word, can be processed as an English word, and the English engine will simply read it out letter by letter. Through the first step, it can be determined that this passage contains two languages in total: Chinese and English.
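The pinyin rule above (initial + final, or a final alone) can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation: the initial and final tables are a simplified approximation of standard Hanyu Pinyin without tone marks, `v` stands in for `ü`, and the name `is_pinyin_syllable` is hypothetical. The check handles one syllable at a time and over-accepts, since short English words such as "me" also fit the pattern.

```python
INITIALS = ("zh ch sh b p m f d t n l g k h j q x r z c s y w").split()
FINALS = set(
    "a o e i u v ai ei ui ao ou iu ie ve ue er an en in un vn "
    "ang eng ing ong ia iao ian iang iong ua uo uai uan uang".split()
)


def is_pinyin_syllable(token):
    """Return True if token is a final, or an initial followed by a final."""
    token = token.lower()
    if token in FINALS:
        return True  # a final on its own, e.g. 'an'
    for initial in INITIALS:
        if token.startswith(initial) and token[len(initial):] in FINALS:
            return True  # initial + final, e.g. 'h' + 'ao'
    return False


for token in ["hao", "zhang", "an", "Good", "morning", "adfgc"]:
    print(token, is_pinyin_syllable(token))
```

Tokens that fail the check, including nonsense strings such as "adfgc", fall through to the English engine, which matches the handling described in the text.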
Once the language categories in the input have been determined, the second step confirms the start position and end position of each language. By the rule of the first step, the language category of each character can be confirmed. For "Hello, my name is Dameng. I have many skills and I also know a lot of English. The English pronunciation of 'good morning' is: Good morning.", judging from the start position, the first character is Chinese, and so is every character up to the one meaning "is". The next character is "G", which is not in the Chinese character code range, so the first fragment can be cut out as: "Hello, my name is Dameng. I have many skills and I also know a lot of English. The English pronunciation of 'good morning' is". By the same method, the English fragment is "Good morning." and the third fragment, in Chinese, is "I am very capable."
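The fragment cutting described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: characters with no language of their own (spaces, punctuation) stay in the fragment that is currently open, which matches the example where "Good morning." keeps its space and full stop; the names `classify_char` and `split_by_language` and the dictionary layout are hypothetical.

```python
CJK_START, CJK_END = 0x4E00, 0x9FBB  # Chinese range quoted in the text


def classify_char(ch):
    if CJK_START <= ord(ch) <= CJK_END:
        return "Chinese"
    if ch.isascii() and ch.isalpha():
        return "English"
    return None  # neutral character (space, punctuation, digit)


def split_by_language(text):
    """Split text into fragments carrying language, positions and sequence number.

    'start' is inclusive and 'end' is exclusive, as in Python slices.
    """
    fragments = []
    start, current = 0, None
    for i, ch in enumerate(text):
        lang = classify_char(ch)
        if lang is None:
            continue  # neutral characters stay in the current fragment
        if current is None:
            current = lang  # first character that carries a language
        elif lang != current:
            fragments.append({"seq": len(fragments) + 1, "lang": current,
                              "start": start, "end": i, "text": text[start:i]})
            start, current = i, lang
    if text:
        fragments.append({"seq": len(fragments) + 1, "lang": current,
                          "start": start, "end": len(text), "text": text[start:]})
    return fragments


for fragment in split_by_language("你好Good morning。我很厉害"):
    print(fragment)
```

Each fragment records its sequence number as it is cut, so the later synthesis step can restore the original order.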
The second step confirmed that the input contains two languages in total. The voice broadcast model for each language is obtained from the mapping table: LILI for Chinese and ALLISION for English. LILI synthesizes the audio for "Hello, my name is Dameng. I have many skills and I also know a lot of English. The English pronunciation of 'good morning' is", ALLISION synthesizes the audio for "Good morning.", and LILI synthesizes the audio for "I am very capable."
The playback center plays the audio fragments in order. The multiple speech fragments synthesized in the third step are sent to the playback center in sequence, and the final audible result is a single piece of audio: "Hello, my name is Dameng. I have many skills and I also know a lot of English. The English pronunciation of 'good morning' is Good morning. I am very capable." To the user, the leading Chinese is spoken by LILI, the voice switches automatically to ALLISION for the English, and the final Chinese is spoken by LILI again.
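The model lookup and in-order playback in this example can be sketched as follows. This is a minimal illustration in Python: `synthesize` is a hypothetical stand-in for a real TTS call and only returns a labelled string, and the `play` callback stands in for the playback center; neither name comes from the patent.

```python
VOICE_MODELS = {"Chinese": "LILI", "English": "ALLISION"}  # from the example


def synthesize(model, text):
    """Hypothetical stand-in for a TTS call; returns a labelled clip."""
    return f"[{model}] {text}"


def broadcast(fragments, play=print):
    """Synthesize each (language, text) fragment and play the clips in order."""
    played = []
    for language, text in fragments:
        model = VOICE_MODELS[language]  # KeyError if the language is unsupported
        clip = synthesize(model, text)
        play(clip)
        played.append(clip)
    return played


broadcast([
    ("Chinese", "你好"),
    ("English", "Good morning."),
    ("Chinese", "我很厉害。"),
])
```

Because the fragments are processed in list order, the listener hears LILI, then ALLISION, then LILI again, as in the example.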
As can be seen from the above description, this application example of the present invention achieves fast and accurate recognition and broadcasting of multi-language text. The process of language identification and marking is reliable and accurate, and avoids omitting language information when multi-language text is recognized. Multi-language text is broadcast automatically, the voice broadcast models are applied flexibly, manual workload is reduced, and time cost is saved. The order of the synthesis process is guaranteed, and single-language text is distinguished from multi-language text, which makes the method more intelligent and flexible.
Embodiment six of the present invention provides a specific implementation of an automatic broadcasting system for multi-language text. Referring to Fig. 6, the automatic broadcasting system specifically includes the following:
Text identification module 10, configured to perform language identification on the text information to be broadcast, obtain text segments in multiple different languages, and mark the start and end points of each text segment.
Voice broadcast module 20, configured to invoke the voice broadcast model corresponding to each language, so that the voice broadcast models broadcast the text segments in turn according to the corresponding marks in the multi-language text information, where each mark includes the language of the current text segment and a broadcast sequence number.
As can be seen from the above description, this embodiment of the present invention automatically selects the voice broadcast model according to the content of the text information, achieving fast and accurate recognition and broadcasting of multi-language text.
Embodiment seven of the present invention provides a specific implementation of the text identification module 10 in the above automatic broadcasting system. Referring to Fig. 7, the text identification module 10 specifically includes the following:
Text segment division unit 11, configured to perform language identification on the multi-language text information using a preset recognition strategy, and to divide the multi-language text information by language into text segments of multiple different languages.
Text segment identification unit 12, configured to mark the start and end points of each text segment, where each mark includes the language of the current text segment and a broadcast sequence number.
As can be seen from the above description, this embodiment of the present invention gives the specific process of language identification and marking for multi-language text. The process is reliable and accurate, and avoids omitting language information when multi-language text is recognized.
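The mark produced by the identification unit can be pictured as a small record. The following is a minimal sketch in Python, assuming that a division step has already produced consecutive (language, text) pairs; the class `SegmentMark`, the function `mark_segments`, and the choice of 1-based sequence numbers are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentMark:
    language: str  # language of the text segment
    sequence: int  # broadcast sequence number (1-based here, an assumption)
    start: int     # index of the first character of the segment
    end: int       # index one past the last character of the segment


def mark_segments(segments):
    """Attach marks to consecutive (language, text) pairs."""
    marks, pos = [], 0
    for seq, (language, text) in enumerate(segments, start=1):
        marks.append(SegmentMark(language, seq, pos, pos + len(text)))
        pos += len(text)
    return marks


for mark in mark_segments([("Chinese", "你好"), ("English", "Good morning.")]):
    print(mark)
```

Keeping the language and the sequence number together in one record lets the broadcast module pick the right voice model and still restore the original order.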
Embodiment eight of the present invention provides a specific implementation of the voice broadcast module 20 in the above automatic broadcasting system. Referring to Fig. 8, the voice broadcast module 20 specifically includes the following:
Text segment voice output unit 21, configured to invoke, according to all the languages present in the current multi-language text information, the voice broadcast models corresponding to all of those languages, so that the voice broadcast models output the broadcast voice of each text segment according to the corresponding marks in the multi-language text information.
Voice information synthesis unit 22, configured to synthesize the broadcast voices in the order of the broadcast sequence numbers in the marks, obtaining the voice information of the multi-language text information.
Voice information sending unit 23, configured to send the voice information of the multi-language text information to the playback center, so that the playback center broadcasts the voice information.
As can be seen from the above description, this embodiment of the present invention uses the voice broadcast model corresponding to each language to broadcast multi-language text automatically. The voice broadcast models are applied flexibly, manual workload is reduced, and time cost is saved.
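The ordering role of the broadcast sequence number can be sketched as follows. This is a minimal illustration in Python under stated assumptions: clips are raw byte strings that can simply be concatenated (real audio would need a shared format and proper container handling), and `merge_by_sequence` and `PlaybackCenter` are hypothetical names.

```python
def merge_by_sequence(clips):
    """Concatenate (sequence, audio_bytes) pairs in broadcast order."""
    ordered = sorted(clips, key=lambda item: item[0])
    return b"".join(audio for _, audio in ordered)


class PlaybackCenter:
    """Hypothetical playback center that just records what it receives."""

    def __init__(self):
        self.received = []

    def play(self, audio):
        self.received.append(audio)


# Clips may come back from different voice models in any order.
clips = [(2, b"EN"), (1, b"ZH1"), (3, b"ZH2")]
center = PlaybackCenter()
center.play(merge_by_sequence(clips))
print(center.received)
```

Sorting on the sequence number means the voice models can finish in any order without affecting what the listener hears.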
The above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An automatic broadcasting method for multi-language text, characterized by comprising:
performing language identification on the multi-language text to be broadcast, and obtaining the text segments corresponding to each of multiple languages;
marking the start and end points of each text segment respectively to obtain language marks;
and invoking, according to the language marks, the voice broadcast model corresponding to each language, and broadcasting the text segments in turn;
wherein each language mark includes the language of the current text segment and a broadcast sequence number.
2. The broadcasting method according to claim 1, characterized in that obtaining the text segments corresponding to the multiple languages comprises:
performing language identification on the multi-language text using a preset recognition strategy, and dividing the multi-language text by language into text segments of multiple different languages;
marking the start and end points of each text segment, each mark including the language of the current text segment and a broadcast sequence number.
3. The broadcasting method according to claim 2, characterized in that performing language identification on the multi-language text using the preset recognition strategy comprises:
taking the first character of the multi-language text as the starting point, filtering each character in turn, and, upon finding a current character whose character rule differs from that of the previous character, determining that the current character belongs to a different language from the previous character, and obtaining the language of the current character according to language rules;
marking, between the current character and the previous character, the end mark of the previous character and the start mark of the current character.
4. The broadcasting method according to claim 3, characterized in that obtaining the language of the current character according to the language rules comprises:
when the language of the current character is determined according to the language rules to be Western characters, if it is determined according to a pinyin discrimination rule that the current Western characters are Chinese pinyin, updating the language to Chinese pinyin characters;
wherein the language rules include preset encoding rules for each language, and the pinyin discrimination rule includes the initials and finals of pinyin, or permutations and combinations of the two.
5. The broadcasting method according to claim 1, characterized in that invoking, according to the language marks, the voice broadcast model corresponding to each language comprises:
invoking, according to all the languages present in the multi-language text, the voice broadcast model corresponding to each of those languages,
the voice broadcast models outputting, according to the corresponding language marks in the multi-language text, the broadcast voice corresponding to each text segment;
synthesizing the broadcast voices in the order of the broadcast sequence numbers in the language marks to obtain the voice information corresponding to the multi-language text;
and sending the voice information to a playback center for broadcasting.
6. The broadcasting method according to claim 5, characterized in that synthesizing the broadcast voices in the order of the broadcast sequence numbers in the marks comprises:
storing the broadcast voice corresponding to each text segment, by language, in the corresponding mapping table;
synthesizing the broadcast voices in the mapping tables in the order of the broadcast sequence numbers to obtain the voice information corresponding to the multi-language text.
7. The broadcasting method according to claim 1, characterized in that the method further comprises:
obtaining text information to be broadcast;
reading the text information, and judging whether the text information contains more than one language;
if so, determining that the text information is multi-language text;
otherwise, directly invoking the voice broadcast model corresponding to the text information to broadcast the text information.
8. An automatic broadcasting system for multi-language text, characterized by comprising:
a text identification module, configured to perform language identification on the multi-language text to be broadcast, obtain the text segments corresponding to multiple different languages, and mark the start and end points of each text segment respectively to obtain language marks;
a voice broadcast module, configured to invoke, according to the language marks, the voice broadcast model corresponding to each language, and to broadcast the text segments in turn;
wherein each mark includes the language of the current text segment and a broadcast sequence number.
9. The broadcasting system according to claim 8, characterized in that the text identification module comprises:
a text segment division unit, configured to perform language identification on the multi-language text using a preset recognition strategy, and to divide the multi-language text by language into text segments of multiple different languages;
a text segment identification unit, configured to mark the start and end points of each text segment, each mark including the language of the current text segment and a broadcast sequence number.
10. The broadcasting system according to claim 8, characterized in that the voice broadcast module comprises:
a text segment voice output unit, configured to invoke, according to all the languages present in the multi-language text, the voice broadcast model corresponding to each of those languages, the voice broadcast models outputting, according to the corresponding language marks in the multi-language text, the broadcast voice corresponding to each text segment;
a voice information synthesis unit, configured to synthesize the broadcast voices in the order of the broadcast sequence numbers in the language marks to obtain the voice information corresponding to the multi-language text;
a voice information sending unit, configured to send the voice information to a playback center for broadcasting.
CN201611195723.7A 2016-12-21 2016-12-21 The automatic broadcasting method and system of a kind of multi-language text Pending CN106856091A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611195723.7A CN106856091A (en) 2016-12-21 2016-12-21 The automatic broadcasting method and system of a kind of multi-language text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611195723.7A CN106856091A (en) 2016-12-21 2016-12-21 The automatic broadcasting method and system of a kind of multi-language text

Publications (1)

Publication Number Publication Date
CN106856091A true CN106856091A (en) 2017-06-16

Family

ID=59126865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611195723.7A Pending CN106856091A (en) 2016-12-21 2016-12-21 The automatic broadcasting method and system of a kind of multi-language text

Country Status (1)

Country Link
CN (1) CN106856091A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1731511A (en) * 2004-08-06 2006-02-08 摩托罗拉公司 Method and system for performing speech recognition on multi-language name
CN101090517A (en) * 2006-06-14 2007-12-19 李清隐 Global position mobile phone multi-language guide method and system
CN102280104A (en) * 2010-06-11 2011-12-14 北大方正集团有限公司 File phoneticization processing method and system based on intelligent indexing
CN102541860A (en) * 2010-12-14 2012-07-04 许德武 Internet access multilanguage intelligent identification method
CN104282302A (en) * 2013-07-04 2015-01-14 三星电子株式会社 Apparatus and method for recognizing voice and text
CN204614442U (en) * 2015-03-12 2015-09-02 尚飞科技(湖南)有限公司 A kind of papery text audio frequency and Play System
CN105096953A (en) * 2015-08-11 2015-11-25 东莞市凡豆信息科技有限公司 Voice recognition method capable of realizing multi-language mixed use
CN105427855A (en) * 2015-11-09 2016-03-23 上海语知义信息技术有限公司 Voice broadcast system and voice broadcast method of intelligent software
CN105845125A (en) * 2016-05-18 2016-08-10 百度在线网络技术(北京)有限公司 Speech synthesis method and speech synthesis device
CN105989833A (en) * 2015-02-28 2016-10-05 讯飞智元信息科技有限公司 Multilingual mixed-language text character-pronunciation conversion method and system
CN106228972A (en) * 2016-07-08 2016-12-14 北京光年无限科技有限公司 Multi-language text towards intelligent robot system mixes reads aloud method and system

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107134276A (en) * 2017-07-06 2017-09-05 大连华锐重工集团股份有限公司 A kind of programmable Intelligent voice broadcasting system and method
CN109509464A (en) * 2017-09-11 2019-03-22 珠海金山办公软件有限公司 It is a kind of text to be read aloud the method and device for being recorded as audio
CN109509464B (en) * 2017-09-11 2022-11-04 珠海金山办公软件有限公司 Method and device for recording text reading as audio
CN108681529B (en) * 2018-03-26 2022-01-25 山东科技大学 Multi-language text and voice generation method of flow model diagram
CN108681529A (en) * 2018-03-26 2018-10-19 山东科技大学 A kind of multi-language text and speech production method of procedural model figure
CN109388404A (en) * 2018-10-10 2019-02-26 北京智能管家科技有限公司 A kind of path coding/decoding method, device, computer equipment and storage medium
CN109388404B (en) * 2018-10-10 2022-10-18 北京如布科技有限公司 Path decoding method and device, computer equipment and storage medium
CN109981448A (en) * 2019-03-28 2019-07-05 联想(北京)有限公司 Information processing method and electronic equipment
CN109981448B (en) * 2019-03-28 2022-03-25 联想(北京)有限公司 Information processing method and electronic device
CN110133872A (en) * 2019-05-24 2019-08-16 中国人民解放军东部战区总医院 A kind of intelligent glasses can be realized multilingual intertranslation
CN111079725A (en) * 2019-05-27 2020-04-28 广东小天才科技有限公司 Method for distinguishing English from Pinyin and electronic equipment
CN111079725B (en) * 2019-05-27 2023-08-29 广东小天才科技有限公司 Method for distinguishing English from pinyin and electronic equipment
CN110797003A (en) * 2019-10-30 2020-02-14 合肥名阳信息技术有限公司 Method for displaying caption information by converting text into voice
CN111160044A (en) * 2019-12-31 2020-05-15 出门问问信息科技有限公司 Text-to-speech conversion method and device, terminal and computer readable storage medium
CN111312213A (en) * 2020-03-31 2020-06-19 广东美的制冷设备有限公司 Voice processing method and device of air conditioner, air conditioner and readable storage medium
CN111986649A (en) * 2020-08-28 2020-11-24 普强时代(珠海横琴)信息技术有限公司 Mixing acceleration synthesis method of TTS system
CN112259111A (en) * 2020-09-18 2021-01-22 惠州高盛达智显科技有限公司 Raspberry pie-based emergency broadcasting method and system
CN115580742A (en) * 2022-10-12 2023-01-06 广州市保伦电子有限公司 Sound-text synchronous broadcasting method and system
CN115580742B (en) * 2022-10-12 2023-05-16 广东保伦电子股份有限公司 Voice and text synchronous broadcasting method and broadcasting system
CN116032566A (en) * 2022-12-14 2023-04-28 平安银行股份有限公司 Voice broadcasting method and device of privacy protocol and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20170616