CN106557298A - Background dubbing output method and device for intelligent robot - Google Patents
Background dubbing output method and device for intelligent robot
- Publication number
- CN106557298A (application CN201610982284.8A)
- Authority
- CN
- China
- Prior art keywords
- background
- voice
- text data
- intelligent robot
- dubbing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 35
- 230000001960 triggered effect Effects 0.000 claims description 6
- 230000035807 sensation Effects 0.000 abstract description 2
- 230000015572 biosynthetic process Effects 0.000 description 6
- 230000008569 process Effects 0.000 description 6
- 230000033764 rhythmic process Effects 0.000 description 6
- 238000003786 synthesis reaction Methods 0.000 description 6
- 230000006870 function Effects 0.000 description 5
- 230000002452 interceptive effect Effects 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 238000006243 chemical reaction Methods 0.000 description 2
- 238000010606 normalization Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000002996 emotional effect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000000630 rising effect Effects 0.000 description 1
- 230000003595 spectral effect Effects 0.000 description 1
- 230000001360 synchronised effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Toys (AREA)
Abstract
The invention provides a background dubbing output method for an intelligent robot, comprising the following steps: judging the type of the voice content to be output; obtaining background dubbing audio data matching that type; and playing the background dubbing audio data while outputting the voice content. The background dubbing output method of the invention makes the user's experience of the machine converting text to voice more realistic; playing background dubbing gives listeners an immersive feeling and makes the expression more vivid.
Description
Technical field
The present invention relates to the field of intelligent robotics and, in particular, to a background dubbing output method and device for an intelligent robot.
Background technology
Current robot chat mainly works as follows: based on the result of the interaction, the computer uses TTS technology to convert the text to be output into voice and then plays it back. However, this chat interaction mode does not give the user a realistic experience. To give the user an immersive experience, a technical solution is needed that can continuously improve the interaction capability of the intelligent robot and thereby enhance the user experience.
The content of the invention
It is an object of the invention to provide a background dubbing output method for an intelligent robot that solves the above technical problem. The method of the invention comprises the following steps:
Judging the type of the voice content to be output;
Obtaining background dubbing audio data matching that type;
Playing the background dubbing audio data while outputting the voice.
According to the background dubbing output method for an intelligent robot of the invention, preferably, the background dubbing audio data is played while the voice is output and when a trigger condition is met, wherein the trigger condition includes the following cases:
When a specific statement input by the user is received, playback of the background dubbing is triggered;
Start and end times for automatically playing the background dubbing are set in the system;
The background dubbing is played at the moment the voice corresponding to the text data is played.
According to the background dubbing output method for an intelligent robot of the invention, preferably, in the step of judging the type of the voice content to be output, the type is judged according to the current application.
According to the background dubbing output method for an intelligent robot of the invention, preferably, the text data corresponding to the voice to be output is received through a dialog interface.
According to another aspect of the invention, a background dubbing output device for an intelligent robot is also provided. The device comprises the following units:
A text data receiving unit, which receives the text data corresponding to the voice to be output and analyzes the semantics of the text data;
A background dubbing search unit, which searches the database for matching background dubbing audio data according to the type of the semantic content represented by the text data;
An audio output unit, which plays the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met.
According to the background dubbing output device for an intelligent robot of the invention, preferably, in the audio output unit that plays the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met, the trigger condition includes the following cases:
When a specific statement input by the user is received, playback of the background dubbing is triggered;
Start and end times for automatically playing the background dubbing are set in the system;
The background dubbing is played at the moment the voice corresponding to the text data is played.
According to the background dubbing output device for an intelligent robot of the invention, preferably, the background dubbing search unit, which searches the database for matching background dubbing audio data according to the type of the semantic content represented by the text data, further includes a judging unit for judging the voice type corresponding to the text data to be output, so as to determine the matching background music.
According to the background dubbing output device for an intelligent robot of the invention, preferably, the text data corresponding to the voice to be output is received through a dialog interface.
The invention is advantageous in that implementing the method of the invention can greatly improve the interaction capability between the intelligent robot and humans, thereby enhancing the user experience. Specifically, the background dubbing output method of the invention makes the user's experience of the machine converting text to voice more realistic, and playing background dubbing gives people an immersive feeling and makes the expression more vivid.
Other features and advantages of the invention will be illustrated in the following description and will in part become apparent from the description or be understood by implementing the invention. The objects and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the description, the claims and the accompanying drawings.
Description of the drawings
The accompanying drawings provide a further understanding of the invention and constitute a part of the specification. Together with the embodiments of the invention they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is an overall flowchart of a background dubbing output method for an intelligent robot according to an embodiment of the invention;
Fig. 2 is a detailed flowchart of a background dubbing output method for an intelligent robot according to an embodiment of the invention;
Fig. 3 is a flowchart of the trigger process of a background dubbing output method for an intelligent robot according to an embodiment of the invention; and
Fig. 4 is a structural block diagram of a background dubbing output device for an intelligent robot according to an embodiment of the invention.
Specific embodiment
To make the objects, technical solutions and advantages of the invention clearer, the embodiments of the invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows the overall flow of background dubbing according to the invention.
In this method, text input is performed first; for example, the user's input is obtained through the robot's text scanner or the like, or through touching the screen. After the robot obtains the text input, text analysis and speech synthesis are carried out. Finally, voice output is performed; the output contains the result of the TTS process together with the selected background dubbing. The details of these techniques are discussed in detail below.
As shown in Fig. 2 which show a kind of overview flow chart of background towards intelligent robot with sound outputting method.
Method starts from step S101.In this step, system judges the type of voice content to be exported.In intelligence machine person to person
When interacting, it will usually the interactive instruction of receive user first, or when some conditions meet, actively send chat language
Sound.Robot system of the invention internally receives the corresponding text data of voice to be exported, and analyzes the text
The semanteme of data.For example, the semantic content for representing for obtaining text data by analysis is recitation of poems, children's stories etc..System root
It is marked with label according to the different classifications of different voice contents.According to for voice content mark come judge be poem or
Person's children's stories.
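A minimal sketch of this label-based judgment follows; the category labels and cue words are invented assumptions, since the patent does not specify how labels are assigned:

```python
# Hypothetical label-based type judgment (step S101). Categories carry
# labels; cue words found in the text decide which label applies.

CATEGORY_CUES = {
    "poem_recitation": ["verse", "stanza", "recite"],
    "childrens_story": ["once upon a time", "fairy", "prince"],
}

def judge_voice_type(text: str, default: str = "chat") -> str:
    lowered = text.lower()
    for label, cues in CATEGORY_CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return default

assert judge_voice_type("Once upon a time there was a fairy") == "childrens_story"
```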
Preferably, the analysis of the text data further comprises the following steps:
A text structure detection step, in which the structure of the input text is detected by punctuation handling, text normalization rules, word segmentation and part-of-speech tagging, pause processing and character-to-phoneme conversion;
A prosody generation step, in which parameters characterizing prosodic features are derived from the contextual information obtained by text analysis;
A unit selection step, in which, according to the phone string to be synthesized, its contextual information and the prosodic feature parameters, and following a specified criterion, an optimal group of speech units is selected from the corpus as synthesis units for waveform concatenation.
In one embodiment, system can receive the corresponding text data of voice to be exported by dialog interface.
In the TTS processing of the invention, the text must first be analyzed. To begin with, the system needs to recognize the words, segment them sensibly, and judge where pauses occur. Machine pronunciation also requires prosody generation. The parameters characterizing prosodic features include, for example, fundamental frequency, duration and energy; in the invention, the data used for prosody generation comes from the contextual information extracted by the text analysis part.
In the TTS process, unit selection is needed to pick the most suitable speech units for synthesis. Specifically, according to the pinyin string (phone string) to be synthesized, its contextual information and the prosodic information, and following a certain criterion, the system selects an optimal group of speech units from the corpus as synthesis units for waveform concatenation. The criterion is, in effect, to minimize the value of a cost function. The value of this cost function is affected by factors such as prosodic inconsistency, spectral differences, and mismatches with the context environment.
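As a hedged illustration of this criterion (the patent specifies neither features nor weights, so everything below is an assumption), the following sketch scores candidate units by a weighted sum of prosodic and context mismatch and picks the cheapest; production systems typically also include a join cost and search with dynamic programming rather than greedily:

```python
# Toy version of the unit-selection criterion: for a target phone, pick
# the corpus unit minimizing a weighted sum of prosodic mismatch and
# context mismatch. Features and weights are invented for illustration.

from dataclasses import dataclass

@dataclass
class Unit:
    phone: str
    f0: float        # fundamental frequency in Hz
    duration: float  # seconds
    context: str     # preceding phone in the corpus

def cost(u: Unit, tgt_f0: float, tgt_dur: float, tgt_ctx: str,
         w_prosody: float = 1.0, w_context: float = 0.5) -> float:
    prosody = abs(u.f0 - tgt_f0) / 100.0 + abs(u.duration - tgt_dur)
    context = 0.0 if u.context == tgt_ctx else 1.0
    return w_prosody * prosody + w_context * context

corpus = [Unit("a", 220.0, 0.12, "m"), Unit("a", 180.0, 0.20, "b")]
best = min((u for u in corpus if u.phone == "a"),
           key=lambda u: cost(u, 200.0, 0.15, "m"))
print(best)  # the first unit wins: closer prosody and matching context
```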
The last processing module of the TTS system is the waveform synthesis unit. When performing waveform synthesis, two strategies are generally adopted: splicing without prosody modification, and splicing with prosody modification.
The above outlines the text-to-speech processing of a TTS system. In the invention, however, the voice produced by TTS processing is not necessarily output directly; further processing follows.
As shown in Fig. 2 in step s 102, obtain and voice data is dubbed with the background of the type matching.When previous
The result obtained in step is that voice content is recitation of poems, then system can search the background matched with the poem in thesaurus
Music.For example, after intelligent robot is by further analyzing semanteme, after substantially having understood the style of poem, by setting
Labeling which is further marked.Then by search and the mark in mark word bank different in storage
Corresponding background is dubbed.For example the music of magnificence will be equipped with for the recitation of poems of bold and unconstrained group.For example, poem content is to eulogize ancestral
State, then by " love of the republic, I come up as snowflake day, red flag song, Long March symphony, Long March symphony, the army of volunteers are carried out
Song, the Five-Starred Red Flag (the national flag of the People's Republic of China), the Yellow River piano concerto, the sound in township, the sound in township, the sound in township, ten send Red Army to dub in background music, youth China dubs in background music, yellow
River work song, I and I motherland, Great Wall ballad, the Yellow River lead my hand, rivers and mountains is unlimited, climb snow mountain, same first song, the song in the Changjiang river "
Scan in the word bank constituted etc. class song.If poem content is singer's emotional affection township feelings, by " white hair real mother, big
Bie Shan, old father, the ballad of mother, mother, that be exactly me, Qianmen stall tea, dear Papa and Mama, sunset, in candle light
Mother, recall the south of the River, think of one's home, pray within thousand, the moon over a fountain " etc. class song constitute word bank in scan for.
The thesaurus storing background dubbing music can be built in many ways. For example, music word banks can be established according to the style of the melodies themselves: a violin theme word bank, a symphony word bank, a light music word bank, a classical Chinese music word bank, and so on, to suit voice content of many kinds.
After the matching background dubbing audio data has been found in the database according to the type of the semantic content represented by the text data, it is output.
In step S103, preferably, the system plays the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met.
The output content is thus matched with the background dubbing: the user hears both the machine-synthesized voice and pleasant, melodious background music, making the interactive experience much richer.
According to the background dubbing output method for an intelligent robot of the invention, preferably, as shown in Fig. 2, in the step of playing the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met, the trigger condition generally includes the following cases:
For example, when judging the background dubbing trigger condition, step S201 shown in Fig. 3, one case is that playback of the background dubbing is triggered only when the system receives a specific statement input by the user. In this case, the background dubbing is not output every time text-to-speech output occurs; a specific instruction from the user is needed to start it. When the user's specific statement is judged to be present, the background dubbing is triggered to play simultaneously with the speech text.
When the user's specific statement is judged to be absent, the system goes on to determine whether start and end times for automatically playing the background dubbing have been set, step S202. If so, the background dubbing is played synchronously with the speech text according to the preset start and end times.
When the system has not been set to play the background dubbing automatically, it goes on to determine whether a manual selection function is used to play the background dubbing, step S203. If so, the background dubbing is played synchronously with the speech text under manual selection.
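The following sketch condenses this cascade of checks (S201, then S202, then S203); the state fields and the example trigger phrase are assumptions, since the patent fixes only the order of the three checks:

```python
# Sketch of the Fig. 3 trigger cascade: S201 user statement ->
# S202 preset schedule -> S203 manual selection.

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TriggerState:
    user_statement: Optional[str] = None            # last user utterance
    schedule: Optional[Tuple[float, float]] = None  # (start, end) in seconds
    manual_selected: bool = False                   # manual play function

TRIGGER_PHRASES = {"play background music"}  # hypothetical trigger phrase

def should_play_dub(state: TriggerState, now: float) -> bool:
    if state.user_statement in TRIGGER_PHRASES:     # S201
        return True
    if state.schedule is not None:                  # S202
        start, end = state.schedule
        return start <= now <= end
    return state.manual_selected                    # S203

print(should_play_dub(TriggerState(schedule=(0.0, 30.0)), now=12.0))  # True
```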
Further, since speech text can be generated by different applications (for example, a poem can be generated by an application named "poem recitation", and a children's story by an application named "children's story"), in practical applications the type of the voice content to be output can be judged by determining which application is currently running.
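In code, this amounts to a small lookup from the running application to a content type; a minimal sketch, assuming hypothetical application names:

```python
# Minimal sketch of judging content type from the current application
# (see claim 3). Application names are invented for illustration.

APP_TO_TYPE = {
    "poem recitation": "poem_recitation",
    "children's story": "childrens_story",
}

def type_from_app(current_app: str) -> str:
    return APP_TO_TYPE.get(current_app, "chat")

assert type_from_app("poem recitation") == "poem_recitation"
```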
As the method for the present invention describes what is realized in computer systems.The computer system can for example be arranged
In the control core processor of robot.For example, method described herein can be implemented as what is can performed with control logic
Software, which is performed by the CPU in robot control system.Function as herein described can be implemented as being stored in non-transitory to be had
Programmed instruction set in shape computer-readable medium.When implemented in this fashion, the computer program includes one group of instruction,
When the group instruction is run by computer, which promotes computer to perform the method that can implement above-mentioned functions.FPGA can be temporary
When or be permanently mounted in non-transitory tangible computer computer-readable recording medium, for example ROM chip, computer storage,
Disk or other storage mediums.In addition to realizing except with software, logic as herein described can utilize discrete parts, integrated electricity
What road and programmable logic device (such as, field programmable gate array (FPGA) or microprocessor) were used in combination programmable patrols
Volume, or any other equipment being combined including them is embodying.All such embodiments are intended to fall under the model of the present invention
Within enclosing.
Therefore, according to another aspect of the invention, a background dubbing output device 300 for an intelligent robot is also provided, as shown in Fig. 4. The device includes the following units:
A text data receiving unit 301, which receives the text data corresponding to the voice to be output and analyzes the semantics of the text data;
A background dubbing search unit 302, which searches the database for matching background dubbing audio data according to the type of the semantic content represented by the text data;
An audio output unit 303, which plays the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met.
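A hypothetical composition of these three units in Python follows; the unit bodies are placeholder stand-ins for the text analysis, thesaurus search and audio output described above, not the patent's implementation:

```python
# Hypothetical composition of device 300 from Fig. 4.

class TextDataReceivingUnit:                     # unit 301
    def receive(self, text: str) -> dict:
        # Toy semantic analysis: tag everything as a poem recitation.
        return {"text": text, "semantic_type": "poem_recitation"}

class BackgroundDubSearchUnit:                   # unit 302
    def search(self, semantic_type: str) -> str:
        return {"poem_recitation": "majestic_theme.wav"}.get(
            semantic_type, "default.wav")

class AudioOutputUnit:                           # unit 303
    def play(self, text: str, dub: str, trigger_met: bool) -> str:
        suffix = f" with {dub}" if trigger_met else ""
        return f"speak {text!r}{suffix}"

class BackgroundDubOutputDevice:                 # device 300
    def __init__(self) -> None:
        self.receiver = TextDataReceivingUnit()
        self.searcher = BackgroundDubSearchUnit()
        self.output = AudioOutputUnit()

    def run(self, text: str, trigger_met: bool = True) -> str:
        data = self.receiver.receive(text)
        dub = self.searcher.search(data["semantic_type"])
        return self.output.play(data["text"], dub, trigger_met)

print(BackgroundDubOutputDevice().run("Moonlight before my bed"))
```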
According to the background dubbing output device 300 for an intelligent robot of the invention, preferably, in the audio output unit that plays the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met, the trigger condition includes the following cases:
When a specific statement input by the user is received, playback of the background dubbing is triggered;
Start and end times for automatically playing the background dubbing are set in the system;
The background dubbing is played at the moment the voice corresponding to the text data is played.
According to the background dubbing output device 300 for an intelligent robot of the invention, preferably, the background dubbing search unit, which searches the database for matching background dubbing audio data according to the type of the semantic content represented by the text data, further includes a judging unit for judging the voice type corresponding to the text data to be output, so as to determine the matching background music.
According to the background dubbing output device 300 for an intelligent robot of the invention, preferably, the text data corresponding to the voice to be output is received through a dialog interface.
According to the background dubbing output device 300 for an intelligent robot of the invention, preferably, the text data receiving unit 301 that analyzes the text data further includes the following units:
A text structure detection unit, which detects the structure of the input text by punctuation handling, text normalization rules, word segmentation and part-of-speech tagging, pause processing and character-to-phoneme conversion;
A prosody generation unit, which derives the parameters characterizing prosodic features from the contextual information obtained by text analysis;
A unit selection unit, which, according to the phone string to be synthesized, its contextual information and the prosodic feature parameters, and following a specified criterion, selects an optimal group of speech units from the corpus as synthesis units for waveform concatenation.
Through the embodiments of the invention, a computer and a person can communicate by language just as people do with each other. When the TTS output is played, the background dubbing is played at the same time; the two combine to make the computer's language output more realistic and appealing. Using background dubbing combined with TTS optimizes the listening experience: the output information is first converted into voice, background dubbing matching the TTS is then selected, and the background dubbing and the TTS are played together and conveyed to the listener. For example, when a poem is played via TTS, music matching the mood of the poem is played at the same time; the two match and combine, giving the listener an immersive feeling.
It should be understood that the disclosed embodiments of the invention are not limited to the specific structures, process steps or materials disclosed herein, but extend to their equivalents as understood by those of ordinary skill in the relevant art. It should also be understood that the terms used herein serve only to describe specific embodiments and are not intended to be limiting.
" one embodiment " or " embodiment " mentioned in specification means special characteristic, the structure for describing in conjunction with the embodiments
Or characteristic is included at least one embodiment of the present invention.Therefore, the phrase " reality that specification various places throughout occurs
Apply example " or " embodiment " same embodiment might not be referred both to.
Although embodiments of the invention are disclosed above, the content described is only an embodiment adopted to facilitate understanding of the invention and does not limit the invention. Any person skilled in the technical field to which the invention belongs may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention must still be defined by the appended claims.
Claims (8)
1. A background dubbing output method for an intelligent robot, characterized in that the method comprises the following steps:
Judging the type of the voice content to be output;
Obtaining background dubbing audio data matching the type;
Playing the background dubbing audio data while outputting the voice content.
2. The background dubbing output method for an intelligent robot as claimed in claim 1, characterized in that the background dubbing audio data is played while the voice is output and when a trigger condition is met, wherein the trigger condition includes the following cases:
When a specific statement input by the user is received, playback of the background dubbing is triggered;
Start and end times for automatically playing the background dubbing are set in the system;
The background dubbing is played at the moment the voice corresponding to the text data is played.
3. The background dubbing output method for an intelligent robot as claimed in claim 1, characterized in that, in the step of judging the type of the voice content to be output, the type of the voice content to be output is judged according to the current application.
4. The background dubbing output method for an intelligent robot as claimed in claim 1, characterized in that the text data corresponding to the voice to be output is received through a dialog interface.
5. A background dubbing output device for an intelligent robot, characterized in that the device comprises the following units:
A text data receiving unit, which receives the text data corresponding to the voice to be output and analyzes the semantics of the text data;
A background dubbing search unit, which searches the database for matching background dubbing audio data according to the type of the semantic content represented by the text data;
An audio output unit, which plays the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met.
6. The background dubbing output device for an intelligent robot as claimed in claim 5, characterized in that, in the audio output unit that plays the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met, the trigger condition includes the following cases:
When a specific statement input by the user is received, playback of the background dubbing is triggered;
Start and end times for automatically playing the background dubbing are set in the system;
The background dubbing is played at the moment the voice corresponding to the text data is played.
7. The background dubbing output device for an intelligent robot as claimed in claim 6, characterized in that the background dubbing search unit, which searches the database for matching background dubbing audio data according to the type of the semantic content represented by the text data, further includes a judging unit for judging the voice type corresponding to the text data to be output, so as to determine the matching background music.
8. The background dubbing output device for an intelligent robot as claimed in claim 7, characterized in that the text data corresponding to the voice to be output is received through a dialog interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610982284.8A CN106557298A (en) | 2016-11-08 | 2016-11-08 | Background dubbing output method and device for intelligent robot |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610982284.8A CN106557298A (en) | 2016-11-08 | 2016-11-08 | Background dubbing output method and device for intelligent robot |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106557298A (en) | 2017-04-05 |
Family
ID=58444684
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610982284.8A Pending CN106557298A (en) | 2016-11-08 | 2016-11-08 | Background towards intelligent robot matches somebody with somebody sound outputting method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106557298A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107437413A (en) * | 2017-07-05 | 2017-12-05 | 百度在线网络技术(北京)有限公司 | Voice broadcasting method and device |
CN107463626A (en) * | 2017-07-07 | 2017-12-12 | 深圳市科迈爱康科技有限公司 | Voice-controlled education method, mobile terminal, system and storage medium |
CN107731219A (en) * | 2017-09-06 | 2018-02-23 | 百度在线网络技术(北京)有限公司 | Speech synthesis processing method, device and equipment |
CN108242238A (en) * | 2018-01-11 | 2018-07-03 | 广东小天才科技有限公司 | Audio file generation method and device and terminal equipment |
CN109065018A (en) * | 2018-08-22 | 2018-12-21 | 北京光年无限科技有限公司 | Story data processing method and system for intelligent robot |
CN109241331A (en) * | 2018-09-25 | 2019-01-18 | 北京光年无限科技有限公司 | Story data processing method for intelligent robot |
CN109460548A (en) * | 2018-09-30 | 2019-03-12 | 北京光年无限科技有限公司 | Story data processing method and system for intelligent robot |
CN109543021A (en) * | 2018-11-29 | 2019-03-29 | 北京光年无限科技有限公司 | Story data processing method and system for intelligent robot |
CN109542389A (en) * | 2018-11-19 | 2019-03-29 | 北京光年无限科技有限公司 | Sound effect control method and system for the output of multi-modal story content |
CN111104544A (en) * | 2018-10-29 | 2020-05-05 | 阿里巴巴集团控股有限公司 | Background music recommendation method and equipment, client device and electronic equipment |
CN113779204A (en) * | 2020-06-09 | 2021-12-10 | 阿里巴巴集团控股有限公司 | Data processing method and device, electronic equipment and computer storage medium |
CN109522427B (en) * | 2018-09-30 | 2021-12-10 | 北京光年无限科技有限公司 | Intelligent robot-oriented story data processing method and device |
CN114189587A (en) * | 2021-11-10 | 2022-03-15 | 阿里巴巴(中国)有限公司 | Call method, device, storage medium and computer program product |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1737901A (en) * | 2004-08-16 | 2006-02-22 | 华为技术有限公司 | System and method for realizing a voice service incorporating background music |
CN104391980A (en) * | 2014-12-08 | 2015-03-04 | 百度在线网络技术(北京)有限公司 | Song generating method and device |
CN105709416A (en) * | 2016-03-14 | 2016-06-29 | 上海科睿展览展示工程科技有限公司 | Personalized dubbing method and system for multi-user operating game |
- 2016-11-08 CN CN201610982284.8A patent/CN106557298A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1737901A (en) * | 2004-08-16 | 2006-02-22 | 华为技术有限公司 | System and method for realizing a voice service incorporating background music |
CN104391980A (en) * | 2014-12-08 | 2015-03-04 | 百度在线网络技术(北京)有限公司 | Song generating method and device |
CN105709416A (en) * | 2016-03-14 | 2016-06-29 | 上海科睿展览展示工程科技有限公司 | Personalized dubbing method and system for multi-user operating game |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107437413A (en) * | 2017-07-05 | 2017-12-05 | 百度在线网络技术(北京)有限公司 | Voice broadcasting method and device |
CN107437413B (en) * | 2017-07-05 | 2020-09-25 | 百度在线网络技术(北京)有限公司 | Voice broadcasting method and device |
CN107463626A (en) * | 2017-07-07 | 2017-12-12 | 深圳市科迈爱康科技有限公司 | Voice-controlled education method, mobile terminal, system and storage medium |
CN107731219A (en) * | 2017-09-06 | 2018-02-23 | 百度在线网络技术(北京)有限公司 | Speech synthesis processing method, device and equipment |
CN108242238B (en) * | 2018-01-11 | 2019-12-31 | 广东小天才科技有限公司 | Audio file generation method and device and terminal equipment |
CN108242238A (en) * | 2018-01-11 | 2018-07-03 | 广东小天才科技有限公司 | Audio file generation method and device and terminal equipment |
CN109065018A (en) * | 2018-08-22 | 2018-12-21 | 北京光年无限科技有限公司 | Story data processing method and system for intelligent robot |
CN109065018B (en) * | 2018-08-22 | 2021-09-10 | 北京光年无限科技有限公司 | Intelligent robot-oriented story data processing method and system |
CN109241331A (en) * | 2018-09-25 | 2019-01-18 | 北京光年无限科技有限公司 | Story data processing method for intelligent robot |
CN109241331B (en) * | 2018-09-25 | 2022-03-15 | 北京光年无限科技有限公司 | Intelligent robot-oriented story data processing method |
CN109460548A (en) * | 2018-09-30 | 2019-03-12 | 北京光年无限科技有限公司 | Story data processing method and system for intelligent robot |
CN109460548B (en) * | 2018-09-30 | 2022-03-15 | 北京光年无限科技有限公司 | Intelligent robot-oriented story data processing method and system |
CN109522427B (en) * | 2018-09-30 | 2021-12-10 | 北京光年无限科技有限公司 | Intelligent robot-oriented story data processing method and device |
CN111104544A (en) * | 2018-10-29 | 2020-05-05 | 阿里巴巴集团控股有限公司 | Background music recommendation method and equipment, client device and electronic equipment |
CN109542389A (en) * | 2018-11-19 | 2019-03-29 | 北京光年无限科技有限公司 | Sound effect control method and system for the output of multi-modal story content |
CN109543021A (en) * | 2018-11-29 | 2019-03-29 | 北京光年无限科技有限公司 | Story data processing method and system for intelligent robot |
CN109543021B (en) * | 2018-11-29 | 2022-03-18 | 北京光年无限科技有限公司 | Intelligent robot-oriented story data processing method and system |
CN113779204A (en) * | 2020-06-09 | 2021-12-10 | 阿里巴巴集团控股有限公司 | Data processing method and device, electronic equipment and computer storage medium |
CN113779204B (en) * | 2020-06-09 | 2024-06-11 | 浙江未来精灵人工智能科技有限公司 | Data processing method, device, electronic equipment and computer storage medium |
CN114189587A (en) * | 2021-11-10 | 2022-03-15 | 阿里巴巴(中国)有限公司 | Call method, device, storage medium and computer program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106557298A (en) | Background dubbing output method and device for intelligent robot | |
CN108492817B (en) | Song data processing method based on virtual idol and singing interaction system | |
CN110782900B (en) | Collaborative AI storytelling | |
CN101064103B (en) | Chinese speech synthesis method and system based on syllable prosody constraint relationships | |
CN108962217A (en) | Speech synthesis method and related device | |
US20210158795A1 (en) | Generating audio for a plain text document | |
US10229669B2 (en) | Apparatus, process, and program for combining speech and audio data | |
US8027837B2 (en) | Using non-speech sounds during text-to-speech synthesis | |
Eide et al. | A corpus-based approach to <ahem/> expressive speech synthesis | |
CN108288468A (en) | Audio recognition method and device | |
CN103632663B (en) | HMM-based Mongolian speech synthesis front-end processing method | |
CN110782875B (en) | Voice rhythm processing method and device based on artificial intelligence | |
CN104391980A (en) | Song generating method and device | |
CN108305611B (en) | Text-to-speech method, device, storage medium and computer equipment | |
CN110782880A (en) | Training method and device of rhythm generation model | |
CN109492126B (en) | Intelligent interaction method and device | |
Ogden et al. | ProSynth: an integrated prosodic approach to device-independent, natural-sounding speech synthesis | |
CN112669815A (en) | Song customization generation method and corresponding device, equipment and medium | |
CN116917984A (en) | Interactive content output | |
CN106297766A (en) | Speech synthesis method and system | |
CN106292424A (en) | Music data processing method and device for anthropomorphic robot | |
TWI605350B (en) | Text-to-speech method and multilingual speech synthesizer using the method | |
TWI574254B (en) | Speech synthesis method and apparatus for electronic system | |
CN102970618A (en) | Video on demand method based on syllable identification | |
CN1331113C (en) | Speech synthesizer, method and recording medium for speech synthesis program | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170405 |