CN106228972B - Method and system for mixed read-aloud of multi-language text for an intelligent robot system - Google Patents
- Publication number
- CN106228972B CN201610537801.0A
- Authority
- CN
- China
- Prior art keywords
- paragraph
- language
- speech synthesis
- text
- intelligent robot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G10L13/047—Architecture of speech synthesisers
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method and system for mixed read-aloud of multi-language text for an intelligent robot system. The method includes: the intelligent robot side sends the acquired multi-language text that is to be read aloud in mixed form to a cloud server; the cloud server labels the types of the different speech synthesis engines according to the language types of the multi-language text and feeds the labeling result back to the intelligent robot side; and the intelligent robot side calls the corresponding speech synthesis engines, according to the fed-back information, to read the multi-language text aloud. The method addresses the low flexibility, high cost, and low accuracy of mixed read-aloud in the prior art, and improves the user experience.
Description
Technical field
The invention belongs to the field of intelligent robotics, and in particular relates to a method and system for mixed read-aloud of multi-language text for an intelligent robot system.
Background technique
With the wide use of intelligent robots, there are more and more applications that call for interleaved read-aloud of mixed multi-language content.
The voice output of an intelligent robot is mainly realized through text-to-speech (Text To Speech, TTS) technology.
Existing interleaved read-aloud of mixed multi-language content is mostly realized with a single TTS engine, as in the common case of mixed Chinese and English read-aloud.
The problem with this scheme is that, to realize mixed Chinese and English read-aloud, a TTS engine that supports both Chinese and English must be selected, and such mixed-language engines often produce cross-language reading errors, so the approach lacks flexibility. In addition, few language combinations are supported for mixed read-aloud: mixed Chinese-English TTS engines are common, but mixed Chinese-Russian or Chinese-Japanese TTS engines are rare. Engines that support mixed TTS read-aloud are also more expensive.
Summary of the invention
The first technical problem to be solved by the present invention is to provide a method for mixed multi-language read-aloud of multi-language text.
To solve the above technical problem, an embodiment of the present application first provides a method for mixed read-aloud of multi-language text for an intelligent robot system, including: the intelligent robot side sends the acquired multi-language text to be read aloud in mixed form to a cloud server; the cloud server labels the types of the different speech synthesis engines according to the language types of the multi-language text, and feeds the labeling result back to the intelligent robot side; and the intelligent robot side calls the corresponding speech synthesis engines, according to the fed-back information, to read the multi-language text aloud.
Preferably, the cloud server labeling the types of the different speech synthesis engines according to the language types of the multi-language text includes: dividing the text into at least one text segment according to the language types of the multi-language text; and, based on the language type of each text segment, labeling the type of the speech synthesis engine corresponding to that segment.
Preferably, the speech synthesis engine is a single-language speech synthesis engine.
Preferably, feeding the labeling result back to the intelligent robot side includes: packaging each text segment and the type of its corresponding speech synthesis engine into an array, where each text segment corresponds to one array element; and feeding the array back to the intelligent robot side.
Preferably, the intelligent robot side calling the corresponding speech synthesis engines according to the fed-back information to read the multi-language text aloud includes: reading each array element of the array in turn and parsing the array element; calling the corresponding speech synthesis engine according to the engine type labeled in the parsing result; and reading the multi-language text aloud with the called speech synthesis engine.
An embodiment of the present application also provides a device for mixed read-aloud of multi-language text for an intelligent robot system, including: a transmission module, located at the intelligent robot side, which sends the acquired multi-language text to be read aloud in mixed form to a cloud server; a labeling feedback module, located at the cloud server, which labels the types of the different speech synthesis engines according to the language types of the multi-language text and feeds the labeling result back to the intelligent robot side; and a read-aloud output module, located at the intelligent robot side, which calls the corresponding speech synthesis engines, according to the fed-back information, to read the multi-language text aloud.
Preferably, when labeling the types of the different speech synthesis engines according to the language types of the multi-language text, the labeling feedback module divides the text into at least one text segment according to the language types of the multi-language text and, based on the language type of each text segment, labels the type of the speech synthesis engine corresponding to that segment.
Preferably, the speech synthesis engine is a single-language speech synthesis engine.
Preferably, when feeding the labeling result back to the intelligent robot side, the labeling feedback module packages each text segment and the type of its corresponding speech synthesis engine into an array, where each text segment corresponds to one array element, and feeds the array back to the intelligent robot side.
Preferably, when calling the corresponding speech synthesis engines according to the fed-back information to read the multi-language text aloud, the read-aloud output module reads each array element of the array in turn and parses the array element; calls the corresponding speech synthesis engine according to the engine type labeled in the parsing result; and reads the multi-language text aloud with the called speech synthesis engine.
Compared with the prior art, one or more embodiments of the above scheme can have the following advantages or beneficial effects:
The multi-language text to be read aloud is segmented by language type, and a different single-language speech synthesis engine is called for each resulting text segment to complete the mixed multi-language read-aloud output. This addresses the low flexibility, high cost, and low accuracy of mixed read-aloud in the prior art, and improves the user experience.
Other advantages, objects, and features of the invention will be set forth to some extent in the description that follows, and to some extent will be apparent to those skilled in the art upon examination of what follows, or may be learned from practice of the invention. The objects and other advantages of the invention can be realized and obtained through the structure particularly pointed out in the following specification, claims, and drawings.
Brief description of the drawings
The drawings are provided for further understanding of the technical solution of the present application or of the prior art, and form part of the specification. The drawings that illustrate embodiments of the application serve, together with those embodiments, to explain the technical solution of the application, but do not limit it.
Fig. 1 is a flow diagram of the method for mixed read-aloud of multi-language text for an intelligent robot system according to the first embodiment of the invention;
Fig. 2 is a flow diagram of the method for mixed read-aloud of multi-language text for an intelligent robot system according to the second embodiment of the invention;
Fig. 3 is a flow diagram of the method for mixed read-aloud of multi-language text for an intelligent robot system according to the third embodiment of the invention;
Fig. 4 is a structural diagram of the device for mixed read-aloud of multi-language text for an intelligent robot system according to the fourth embodiment of the invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the drawings and examples, so that how the invention applies technical means to solve the technical problem and achieve the corresponding technical effects can be fully understood and implemented. Provided they do not conflict, the embodiments of the application and the features in them may be combined with one another, and the resulting technical solutions all fall within the scope of protection of the present invention.
First embodiment:
Fig. 1 is a flow diagram of the method for mixed read-aloud of multi-language text for an intelligent robot system according to the first embodiment of the invention. As shown in the figure, the method includes:
Step S110: the intelligent robot side sends the acquired multi-language text to be read aloud in mixed form to the cloud server.
Step S120: the cloud server labels the types of the different speech synthesis engines according to the language types of the multi-language text, and feeds the labeling result back to the intelligent robot side.
Step S130: the intelligent robot side calls the corresponding speech synthesis engines, according to the fed-back information, to read the multi-language text aloud.
In step S110, the intelligent robot side receives the multi-language text to be read aloud in mixed form; the text may be obtained from an internal database or input by the user through a receiving device on the intelligent robot side. In the embodiments of the present invention, the multi-language text is processed in the cloud server, so the intelligent robot side sends the multi-language text to be read aloud to the cloud server.
In step S120, the cloud server processes the received multi-language text. By analyzing the multi-language text, it obtains the language types contained in the text and labels the types of the different speech synthesis engines corresponding to those language types.
Assume the multi-language text of this embodiment contains text in at least two languages. In the prior art, such text is generally read aloud by calling a mixed TTS synthesis engine that supports the corresponding languages. In the embodiments of the present invention, by contrast, the content of the multi-language text is analyzed before any TTS synthesis engine is called to read it aloud. Specifically, the text is divided into at least one text segment according to the language types of the multi-language text, and the type of the TTS speech synthesis engine corresponding to each text segment is then labeled based on that segment's language type. After the division, each text segment contains only a single language type, so each segment can be read aloud by calling only a single-language TTS speech synthesis engine. Because a multi-language speech synthesis engine no longer needs to be called, this helps improve the accuracy of TTS speech synthesis and the quality of the read-aloud output.
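As an illustration of the labeling idea described above, the following minimal Python sketch labels a text segment with an engine type only when the segment holds a single language type. The per-character Unicode-range check, the language tags, and the engine type names such as `tts-zh` are assumptions made for this example; the patent does not specify how language types are detected or how engine types are named.

```python
from typing import Optional, Set

# Hypothetical mapping from a language tag to a single-language engine type.
ENGINE_TYPE = {"zh": "tts-zh", "en": "tts-en", "ja": "tts-ja", "ru": "tts-ru"}


def script_of(ch: str) -> Optional[str]:
    """Return a coarse language tag for one character, or None if it is neutral."""
    code = ord(ch)
    if 0x4E00 <= code <= 0x9FFF:   # CJK unified ideographs, treated here as Chinese
        return "zh"
    if 0x3040 <= code <= 0x30FF:   # Hiragana and Katakana
        return "ja"
    if 0x0400 <= code <= 0x04FF:   # Cyrillic, treated here as Russian
        return "ru"
    if ch.isascii() and ch.isalpha():
        return "en"
    return None                    # digits, spaces, punctuation


def language_types(text: str) -> Set[str]:
    """Collect the language tags that occur in the text."""
    tags = set()
    for ch in text:
        tag = script_of(ch)
        if tag is not None:
            tags.add(tag)
    return tags


def label_single_language(segment: str) -> Optional[str]:
    """Return the engine type for a single-language segment, else None."""
    tags = language_types(segment)
    if len(tags) != 1:
        return None                # mixed (or neutral) text needs further handling
    return ENGINE_TYPE[next(iter(tags))]
```

A range check like this is deliberately crude (Japanese text also uses ideographs, for example); a real system would likely substitute a proper language identifier behind the same interface.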
Further, in step S120, after the cloud server has divided the multi-language text into text segments and labeled the speech synthesis engines, it feeds the labeling result back to the intelligent robot side. Specifically, the cloud server packages each text segment obtained by the division, together with the type of its corresponding speech synthesis engine, into an array, where each text segment corresponds to one array element. The array made up of all the text segments of the multi-language text and their corresponding speech synthesis engine types is then fed back to the intelligent robot side.
In the embodiments of the present invention, returning the text segments and the corresponding speech synthesis engine types in the form of an array makes the subsequent lookup and read-aloud execution convenient, and helps improve the efficiency of TTS speech synthesis.
In step S130, the intelligent robot side receives the array fed back from the cloud server and reads the multi-language text aloud by parsing the array. Specifically, each array element in the fed-back array is read in turn and parsed. Parsing an array element yields the labeled speech synthesis engine type, and the corresponding speech synthesis engine is then called according to that type to read the multi-language text aloud.
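The robot-side loop of step S130 can be sketched as follows. The JSON payload layout, the field names `text` and `engine`, and the stand-in engines that merely record what they would speak are all assumptions for illustration; a real robot would bind each engine type to an actual single-language TTS engine.

```python
import json
from typing import Callable, Dict, List, Tuple


def read_aloud(payload: str, engines: Dict[str, Callable[[str], None]]) -> None:
    """Read each array element in turn and hand its text to the labeled engine."""
    for element in json.loads(payload):
        speak = engines[element["engine"]]   # look up the engine by its labeled type
        speak(element["text"])


# Stand-in engines: they record (language, text) instead of producing audio.
spoken: List[Tuple[str, str]] = []
engines = {
    "tts-zh": lambda text: spoken.append(("zh", text)),
    "tts-en": lambda text: spoken.append(("en", text)),
}

payload = json.dumps(
    [{"text": "你好，", "engine": "tts-zh"}, {"text": "world", "engine": "tts-en"}],
    ensure_ascii=False,
)
read_aloud(payload, engines)
```

An unknown engine type raises `KeyError` in this sketch; whether to skip such a segment or fall back to a default engine is a design choice the patent leaves open.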
Because the multi-language text has been segmented and each text segment contains only a single language type, the speech synthesis engines called according to the labeling information are single-language engines. Single-language speech synthesis engines are more stable than mixed speech synthesis engines and cost less. The embodiments of the present invention thus reduce the cost of TTS speech synthesis while helping increase its accuracy, improve read-aloud quality, and improve the user experience.
Second embodiment:
Fig. 2 is a flow diagram of the method for mixed read-aloud of multi-language text for an intelligent robot system according to the second embodiment of the invention. This embodiment further explains how the multi-language text is divided into different text segments; only the differences from the first embodiment are described in detail below.
In step S210, the intelligent robot side sends the original text to be read aloud in mixed form to the cloud server. This step performs the same operation as step S110 in the first embodiment and is not described again.
Next, the cloud server segments the multi-language text according to its language types. Specifically, text segments are divided according to the natural paragraphs of the multi-language text.
Each natural paragraph in the multi-language text is read in turn and analyzed to determine whether it contains a single language type, as shown in step S220 of Fig. 2. If the natural paragraph contains only one language type, the type of the corresponding speech synthesis engine is labeled according to that language type, as shown in step S230. If the natural paragraph contains two or more language types, the paragraph is divided further, as shown in step S240.
When a paragraph containing two or more language types is divided further, the text can be divided into sections of fixed length. Specifically, in step S240, a new paragraph whose length equals the preset paragraph length is first split off from the paragraph, so that the original paragraph is divided into two paragraphs. These two paragraphs are then examined in turn to judge whether each contains a single language type, as shown in step S250 of the figure.
If the paragraph split off according to the preset paragraph length contains only one language type, step S230 is executed and the type of the corresponding speech synthesis engine is labeled according to that language type.
If the paragraph split off according to the preset paragraph length still contains two or more language types, the process returns to step S240 and the paragraph is divided again according to a preset paragraph length. Note that the preset paragraph length used for the re-division is smaller than the one used for the previous division; the two are denoted the first preset paragraph length and the second preset paragraph length, respectively. If further divisions are needed, a third preset paragraph length, a fourth preset paragraph length, and so on denote the length criterion used for each division.
The paragraph that remains after the split-off paragraph is removed is then examined in the same way, which is not repeated here.
Note that, when a paragraph is being divided and the remaining undivided content at the end of the paragraph is shorter than the preset paragraph length, that remaining content is taken as one text segment, and each of the new paragraphs is then judged as to whether it contains a single language type.
In step S260, after the speech synthesis engine type of a paragraph has been labeled, it is judged whether that paragraph is the last paragraph. If it is not, the process returns to step S220 and continues with the next paragraph; if it is, the labeling process ends.
After the speech synthesis engine type of the last paragraph in the multi-language text has been labeled, the cloud server feeds the labeling result back to the intelligent robot side. The intelligent robot side calls the corresponding speech synthesis engines according to the fed-back information to read the multi-language text aloud.
The method of this embodiment divides the multi-language text with reference to its natural paragraph information. The boundaries of natural paragraphs are easy to determine, and because a natural paragraph generally contains only one language type, the method helps speed up segmentation while reducing the complexity of segmenting the text and labeling the speech synthesis engine types.
This embodiment can be used to divide multi-language text in which the different language types are distributed in a relatively complex way.
3rd embodiment:
Fig. 3 is a flow diagram of the method for mixed read-aloud of multi-language text for an intelligent robot system according to the third embodiment of the invention. This embodiment segments the multi-language text in a way that differs from the second embodiment; only the differences from the second embodiment are described in detail below.
As shown in Fig. 3, if the natural paragraph being examined is judged to contain a single language type, the process moves to step S330. In step S330, it is further judged whether the paragraph and the previous paragraph contain the same language type. If they do, the paragraph and the previous paragraph are merged into one paragraph, and the merged paragraph uses the same speech synthesis engine that was labeled for the previous paragraph. If they do not, the corresponding speech synthesis engine is labeled according to the language type the paragraph contains.
After the paragraphs are merged or the speech synthesis engine is labeled, it is judged whether the paragraph is the last paragraph. If it is not, the process returns to step S320 and continues with the next paragraph; if it is, the labeling process ends.
As also shown in Fig. 3, when the paragraph being examined is judged to contain more than one language type, the same operation as step S240 of the second embodiment is executed: a new paragraph whose length equals the preset paragraph length is split off from the paragraph, and the newly split-off paragraph becomes the paragraph under examination. Once the paragraph under examination is judged to contain only one language type, the process moves to step S330, that is, it enters the loop of the right-hand branch in Fig. 3.
Through the above operations, text segments can be divided along the true boundaries between the different language types in the multi-language text, which helps reduce the number of labeled items. Especially when the multi-language text is large, contains few language types, and those language types are distributed in a concentrated way, the method of this embodiment can significantly reduce the number of items finally labeled.
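The merging step of this embodiment amounts to collapsing adjacent segments that share an engine type. The following sketch assumes the segments arrive as `(text, engine_type)` pairs in reading order and that merged text is joined with an optional `joiner` string; both the pair layout and the `joiner` parameter are illustrative choices, not details from the patent.

```python
from typing import List, Tuple


def merge_adjacent(labeled: List[Tuple[str, str]],
                   joiner: str = "") -> List[Tuple[str, str]]:
    """Merge each segment into the previous one when both use the same engine type."""
    merged: List[Tuple[str, str]] = []
    for text, engine in labeled:
        if merged and merged[-1][1] == engine:
            previous_text, _ = merged[-1]
            merged[-1] = (previous_text + joiner + text, engine)  # reuse previous label
        else:
            merged.append((text, engine))                         # new language run
    return merged
```

For example, four labeled segments `你好` / `he` / `llo` / `世界` with engine types zh, en, en, zh collapse to three array elements, so the robot side makes three engine calls instead of four.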
In this embodiment, the labeling results for the speech synthesis engines are likewise returned in the form of an array. When the number of labeled items decreases, the number of array elements decreases accordingly, which simplifies the feedback result and facilitates data transmission. In addition, once the array recording the labeling results has been simplified, the operation of obtaining the corresponding speech synthesis engine from each array element and reading the text segment aloud is also simplified. This helps improve the efficiency of read-aloud output, reduces pauses and discontinuities during reading, and improves the user experience.
The methods of dividing text segments in the second and third embodiments above merely illustrate that segmenting multi-language text is practicable and do not limit the invention. Any method that can divide multi-language text that would otherwise require a mixed speech synthesis engine into multiple text segments that can be read aloud with single-language speech synthesis engines falls within the scope of the invention.
Fourth embodiment:
Fig. 4 is a structural diagram of the device for mixed read-aloud of multi-language text for an intelligent robot system according to the fourth embodiment of the invention. As shown in the figure, the system includes:
A transmission module 41, located at the intelligent robot side, which sends the acquired multi-language text to be read aloud in mixed form to the cloud server.
A labeling feedback module 42, located at the cloud server, which labels the types of the different speech synthesis engines according to the language types of the multi-language text and feeds the labeling result back to the intelligent robot side.
A read-aloud output module 43, located at the intelligent robot side, which calls the corresponding speech synthesis engines, according to the fed-back information, to read the multi-language text aloud.
Specifically, when labeling the types of the different speech synthesis engines according to the language types of the multi-language text, the labeling feedback module 42 divides the text into at least one text segment according to the language types of the multi-language text and, based on the language type of each text segment, labels the type of the speech synthesis engine corresponding to that segment.
When feeding the labeling result back to the intelligent robot side, the labeling feedback module 42 packages each text segment and the type of its corresponding speech synthesis engine into an array, where each text segment corresponds to one array element, and feeds the resulting array back to the intelligent robot side.
When calling the corresponding speech synthesis engines according to the fed-back information to read the multi-language text aloud, the read-aloud output module 43 reads each array element of the array in turn, parses the array element, calls the corresponding speech synthesis engine according to the engine type labeled in the parsing result, and reads the multi-language text aloud with the called speech synthesis engine.
Further, the labeling feedback module 42 may also segment the multi-language text using the different segmentation methods proposed in the second and third embodiments, which are not described again here.
The device for mixed read-aloud of multi-language text of this embodiment addresses the inflexibility of reading multi-language text aloud in the prior art. The system in this embodiment only needs to call multiple single-language speech synthesis engines to complete mixed multi-language read-aloud output, so the system is simple in structure and its cost is significantly reduced.
Because single-language speech synthesis engines are relatively mature and available in a wide variety, the device of this embodiment can support read-aloud output of text that could not be read aloud in the prior art for lack of a mixed speech synthesis engine, and so has wider application.
Those skilled in the art should understand that the modules or steps of the invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
Although the embodiments of the invention are disclosed as above, the content described is only an implementation adopted to facilitate understanding of the invention and is not intended to limit it. Any person skilled in the art to which the invention pertains may make any modification or change in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention shall still be subject to the scope defined by the appended claims.
Claims (8)
1. A method for mixed read-aloud of multi-language text for an intelligent robot system, comprising:
the intelligent robot side sending the acquired multi-language text to be read aloud in mixed form to a cloud server;
the cloud server labeling the types of different speech synthesis engines according to the language types of the multi-language text, and feeding the labeling result back to the intelligent robot side, which comprises:
reading each natural paragraph in the multi-language text in turn and judging whether the paragraph contains a single language type, which comprises:
step 1: if the natural paragraph contains only one language type, judging whether the language contained in the paragraph is the same as that contained in the previous paragraph; if the same, merging the paragraph and the previous paragraph into one paragraph, the merged paragraph using the same speech synthesis engine type labeled for the previous paragraph; if not the same, labeling the type of the corresponding speech synthesis engine according to the language type contained in the paragraph;
step 2: if the natural paragraph contains at least two language types, dividing the paragraph according to a first preset paragraph length, and judging in turn whether each paragraph obtained by the division contains a single language type; if a paragraph obtained by division according to the first preset paragraph length contains only one language type, returning to step 1; otherwise returning to step 2 and dividing that paragraph again according to a preset paragraph length, wherein the paragraph length used for the re-division is smaller than the first preset paragraph length used for the previous division;
the intelligent robot side calling the corresponding speech synthesis engine, according to the fed-back information, to read the multi-language text aloud.
2. the method according to claim 1, wherein the speech synthesis engine is the speech synthesis of single languages
Engine.
3. according to the method described in claim 2, it is characterized in that, described be back to intelligent robot for the result feedback of mark
End, comprising:
It is array by the type package of each text chunk and speech synthesis engine corresponding with this section of text, wherein each
Text chunk corresponds to an array element in array;
Array feedback is back to intelligent robot end.
4. The method according to claim 3, characterized in that the intelligent robot end calling the corresponding speech synthesis engine according to the fed-back information to read the multi-language text aloud comprises:
reading each array element of the array in turn and parsing the array element;
calling the corresponding speech synthesis engine according to the speech synthesis engine type marked in the parsing result;
reading the multi-language text aloud using the called speech synthesis engine.
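The robot-end steps of claim 4 can be sketched as a simple dispatch loop. The sketch assumes a JSON array whose elements carry hypothetical `text` and `engine` fields, and it stands in for real single-language speech synthesis engines with plain callables; the patent names no concrete engine or API, so `read_aloud` and the engine registry are illustrative only.

```python
import json
from typing import Callable, Dict, List, Tuple


def read_aloud(payload: str, engines: Dict[str, Callable[[str], None]]) -> None:
    """Read each array element in turn, parse it, and call the marked engine."""
    for element in json.loads(payload):
        text = element["text"]            # hypothetical field name
        engine_type = element["engine"]   # hypothetical field name
        engine = engines.get(engine_type)
        if engine is None:
            raise KeyError(f"no speech synthesis engine registered for {engine_type!r}")
        engine(text)


# Stub engines (assumption): they record what would be spoken instead of producing audio.
spoken_log: List[Tuple[str, str]] = []
stub_engines: Dict[str, Callable[[str], None]] = {
    "zh": lambda text: spoken_log.append(("zh", text)),
    "en": lambda text: spoken_log.append(("en", text)),
}
read_aloud('[{"text": "你好", "engine": "zh"}, {"text": "Hello", "engine": "en"}]', stub_engines)
```

Failing loudly on an unknown engine type is a design choice of this sketch; a deployed system might instead fall back to a default engine.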
5. A multi-language text mixed read-aloud system for an intelligent robot system, comprising:
a transmission module, located at the intelligent robot end, which sends the acquired multi-language text to be read aloud to a cloud server;
a marking feedback module, located at the cloud server, which marks the types of different speech synthesis engines according to the languages of the multi-language text and feeds the marked result back to the intelligent robot end, wherein the marking feedback module performs:
reading each natural paragraph of the multi-language text in turn and judging whether the paragraph is in a single language, including:
step one: if the natural paragraph contains only one language, judging whether the paragraph and the previous paragraph are in the same language; if so, merging the paragraph and the previous paragraph into one paragraph, the merged paragraph being marked with the same speech synthesis engine type as the previous paragraph; if not, marking the corresponding speech synthesis engine type according to the language contained in the paragraph;
step two: if the natural paragraph contains at least two languages, dividing the paragraph according to a first preset paragraph length and judging in turn whether each divided paragraph is in a single language; if a paragraph obtained by dividing according to the first preset paragraph length contains only one language, returning to step one; otherwise returning that paragraph to step two to be divided again according to a preset paragraph length, wherein the paragraph length used for the re-division is smaller than the first preset paragraph length used in the previous division;
a read-aloud output module, located at the intelligent robot end, which calls the corresponding speech synthesis engine according to the fed-back information to read the multi-language text aloud.
6. The system according to claim 5, characterized in that the speech synthesis engine is a single-language speech synthesis engine.
7. The system according to claim 6, characterized in that, when feeding the marked result back to the intelligent robot end, the marking feedback module packages each text segment and the type of the speech synthesis engine corresponding to that text segment into an array, wherein each text segment corresponds to one array element of the array, and feeds the array back to the intelligent robot end.
8. The system according to claim 7, characterized in that, when calling the corresponding speech synthesis engine according to the fed-back information to read the multi-language text aloud, the read-aloud output module reads each array element of the array in turn and parses the array element; calls the corresponding speech synthesis engine according to the speech synthesis engine type marked in the parsing result; and reads the multi-language text aloud using the called speech synthesis engine.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610537801.0A CN106228972B (en) | 2016-07-08 | 2016-07-08 | Method and system are read aloud in multi-language text mixing towards intelligent robot system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610537801.0A CN106228972B (en) | 2016-07-08 | 2016-07-08 | Method and system are read aloud in multi-language text mixing towards intelligent robot system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106228972A (en) | 2016-12-14 |
CN106228972B (en) | 2019-09-27 |
Family
ID=57519428
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610537801.0A Active CN106228972B (en) | 2016-07-08 | 2016-07-08 | Method and system are read aloud in multi-language text mixing towards intelligent robot system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106228972B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106856091A (en) * | 2016-12-21 | 2017-06-16 | 北京智能管家科技有限公司 | The automatic broadcasting method and system of a kind of multi-language text |
CN111079725B (en) * | 2019-05-27 | 2023-08-29 | 广东小天才科技有限公司 | Method for distinguishing English from pinyin and electronic equipment |
CN111179904B (en) * | 2019-12-31 | 2022-12-09 | 出门问问创新科技有限公司 | Mixed text-to-speech conversion method and device, terminal and computer readable storage medium |
CN112530406A (en) * | 2020-11-30 | 2021-03-19 | 深圳市优必选科技股份有限公司 | Voice synthesis method, voice synthesis device and intelligent equipment |
CN114267332B (en) * | 2021-11-29 | 2024-08-20 | 重庆长安汽车股份有限公司 | Voice wake-up word generalization method and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1929655A (en) * | 2006-09-28 | 2007-03-14 | 中山大学 | Mobile phone capable of realizing text and voice conversion |
US8244534B2 (en) * | 2007-08-20 | 2012-08-14 | Microsoft Corporation | HMM-based bilingual (Mandarin-English) TTS techniques |
US9483461B2 (en) * | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
CN104239323B (en) * | 2013-06-17 | 2019-03-26 | 腾讯科技(深圳)有限公司 | The method and device of data output |
US8768704B1 (en) * | 2013-09-30 | 2014-07-01 | Google Inc. | Methods and systems for automated generation of nativized multi-lingual lexicons |
- 2016-07-08: CN application CN201610537801.0A, patent CN106228972B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN106228972A (en) | 2016-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106228972B (en) | Method and system are read aloud in multi-language text mixing towards intelligent robot system | |
CN107451109B (en) | Report generation method and system | |
CN109871326A (en) | A kind of method and apparatus that script is recorded | |
CN105630524B (en) | Web page text analytic method, device and mobile terminal | |
CN110377462B (en) | Interface testing method and device and terminal equipment | |
CN103793432A (en) | Method and device for splitting database reading and writing | |
CN107665170B (en) | Flow testing method and device | |
CN103605525A (en) | Achievement method and device of desktop application program | |
CN103176996A (en) | Image search method based on image feature information and image search engine server based on image feature information | |
CN110381363A (en) | Video encoding/decoding method, device, server and storage medium | |
CN112083966B (en) | Method for loading remote vue component | |
CN107798025B (en) | Running and compiling method and device of stored process and database system | |
CN105022667B (en) | One kind being based on built-in browser CSS engine parallel methods | |
US9208250B2 (en) | System and method for dynamic modification of web page content to ensure consistent response time | |
CN113760242B (en) | Data processing method, device, server and medium | |
WO2017035938A1 (en) | Method and apparatus for switching bandwidth setting | |
KR20220133856A (en) | Test methods, devices, electronic equipment and storage media of electronic maps | |
CN106293862B (en) | A kind of analysis method and device of expandable mark language XML data | |
CN110109672B (en) | Analysis processing method and device for expression | |
CN109918391B (en) | Streaming transaction processing method and system | |
CN105487925A (en) | Data scanning method and device | |
CN109474643A (en) | Data analysis method, device, equipment and storage medium | |
CN112818183B (en) | Data synthesis method, device, computer equipment and storage medium | |
CN109960553A (en) | A kind of more window context rendering methods and system | |
CN111027196B (en) | Simulation analysis task processing method and device for power equipment and storage medium |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |