CN106557298A - Background dubbing output method and device for an intelligent robot - Google Patents

Background dubbing output method and device for an intelligent robot

Info

Publication number
CN106557298A
Authority
CN
China
Prior art keywords
background
voice
text data
intelligent robot
dubs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610982284.8A
Other languages
Chinese (zh)
Inventor
谢文静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guangnian Wuxian Technology Co Ltd
Original Assignee
Beijing Guangnian Wuxian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guangnian Wuxian Technology Co Ltd
Priority to CN201610982284.8A
Publication of CN106557298A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Toys (AREA)

Abstract

The invention provides a background dubbing output method for an intelligent robot, comprising the following steps: judging the type of the voice content to be output; obtaining background dubbing audio data matching the type; and playing the background dubbing audio data while outputting the voice content. The method makes the machine's text-to-speech output feel more realistic to the user: playing a background dub gives the listener a sense of immersion and makes the expression more vivid.

Description

Background dubbing output method and device for an intelligent robot
Technical field
The present invention relates to the field of intelligent robotics and, specifically, to a background dubbing output method and device for an intelligent robot.
Background art
In current robot chat, the computer mainly uses TTS (text-to-speech) technology to convert the text that the system is to output, as determined by the interaction result, into speech and then plays it back. However, this mode of chat interaction does not give the user a realistic experience. To give the user an immersive experience, a technical solution is needed that improves the interactive capability of the intelligent robot and thereby improves the user experience.
Summary of the invention
An object of the present invention is to provide a background dubbing output method for an intelligent robot that solves the above technical problem. The method comprises the following steps:
judging the type of the voice content to be output;
obtaining background dubbing audio data matching the type;
playing the background dubbing audio data while outputting the voice.
In the background dubbing output method for an intelligent robot according to the present invention, preferably, the background dubbing audio data is played while the voice is output and when a trigger condition is met, wherein the trigger condition includes the following cases:
when a specific statement input by the user is received, playback of the background dub is triggered;
start and stop times for automatically playing the background dub are set in the system;
the background dub is played when the voice corresponding to the text data is played.
In the background dubbing output method for an intelligent robot according to the present invention, preferably, in the step of judging the type of the voice content to be output, the type is judged according to the current application.
In the background dubbing output method for an intelligent robot according to the present invention, preferably, the text data corresponding to the voice to be output is received through a dialog interface.
According to another aspect of the present invention, a background dubbing output device for an intelligent robot is also provided, the device comprising the following units:
a text data receiving unit, configured to receive the text data corresponding to the voice to be output and to analyze the semantics of the text data;
a background dubbing search unit, configured to search a database for matching background dubbing audio data according to the type to which the semantic content represented by the text data belongs;
an audio output unit, configured to play the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met.
In the background dubbing output device for an intelligent robot according to the present invention, preferably, in the audio output unit that plays the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met, the trigger condition includes the following cases:
when a specific statement input by the user is received, playback of the background dub is triggered;
start and stop times for automatically playing the background dub are set in the system;
the background dub is played when the voice corresponding to the text data is played.
In the background dubbing output device for an intelligent robot according to the present invention, preferably, the background dubbing search unit, which searches the database for matching background dubbing audio data according to the type to which the semantic content represented by the text data belongs, further includes a judging unit configured to judge the voice type corresponding to the text data to be output so as to determine the matching background music.
In the background dubbing output device for an intelligent robot according to the present invention, preferably, the text data corresponding to the voice to be output is received through a dialog interface.
The invention is advantageous in that implementing the method can greatly improve the interaction capability between the intelligent robot and people, thereby improving the user experience. Specifically, the background dubbing output method of the invention makes the machine's text-to-speech output feel more realistic to the user: playing a background dub gives the listener a sense of immersion and makes the expression more vivid.
Other features and advantages of the present invention will be set forth in the following description and, in part, will become apparent from the description or be understood by implementing the invention. The objects and other advantages of the invention may be realized and obtained by the structure particularly pointed out in the description, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present invention and constitute a part of the specification. Together with the embodiments of the invention, they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is an overall flowchart of a background dubbing output method for an intelligent robot according to an embodiment of the present invention;
Fig. 2 is a detailed flowchart of a background dubbing output method for an intelligent robot according to an embodiment of the present invention;
Fig. 3 is a flowchart of the process for triggering the background dubbing output method for an intelligent robot according to an embodiment of the present invention; and
Fig. 4 is a structural block diagram of a background dubbing output device for an intelligent robot according to an embodiment of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows the overall flowchart of background dubbing according to the present invention.
In this method, text input is performed first; for example, the user's input is obtained through the robot's text scanner or the like, or through touch input on a screen. After the robot obtains the text input, it performs text analysis and speech synthesis. Finally, voice output is performed; the output contains both the result of the TTS processing and the selected background dub. The details of these techniques are discussed below.
Fig. 2 shows a detailed flowchart of a background dubbing output method for an intelligent robot. The method starts at step S101. In this step, the system judges the type of the voice content to be output. When an intelligent robot interacts with a person, it usually first receives the user's interaction instruction, or actively initiates chat speech when certain conditions are met. The robot system according to the invention internally receives the text data corresponding to the voice to be output and analyzes the semantics of the text data. For example, the analysis may find that the semantic content represented by the text data is a poetry recitation, a fairy tale, and so on. The system marks voice content with labels according to its category, and judges from the label whether the content is a poem or a fairy tale.
Preferably, analyzing the text data further includes the following steps:
a text structure detection step, which detects the structure of the input text according to punctuation marks, text normalization rules, word segmentation and part-of-speech tagging, pause processing, and character-to-pronunciation conversion;
a prosody generation step, which obtains parameters characterizing prosodic features from the contextual information obtained by text analysis;
a unit selection step, which, according to the phone string to be synthesized together with its contextual information and prosodic feature parameters, and following a specified criterion, selects an optimal group of speech units from the corpus as synthesis units for waveform concatenation.
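As a rough illustration of the text structure detection step, the sketch below normalizes whitespace and derives pause positions from punctuation only. Word segmentation, part-of-speech tagging, and pronunciation conversion are out of scope here. The pause lengths in `PAUSE_MS` and the function name `detect_structure` are assumptions, not values from the patent.

```python
import re

# Hypothetical pause lengths in milliseconds for each punctuation mark.
PAUSE_MS = {",": 200, ";": 300, ".": 500, "!": 500, "?": 500}


def detect_structure(text: str) -> list[tuple[str, int]]:
    """Split text into (phrase, pause_after_ms) pairs at punctuation marks."""
    # Text normalization: collapse runs of whitespace into single spaces.
    normalized = re.sub(r"\s+", " ", text.strip())
    # Each match is a run of non-punctuation plus an optional closing mark.
    pieces = re.findall(r"([^,;.!?]+)([,;.!?]?)", normalized)
    return [
        (phrase.strip(), PAUSE_MS.get(mark, 0))
        for phrase, mark in pieces
        if phrase.strip()
    ]
```

The resulting phrase and pause pairs are the kind of contextual information that a later prosody generation step could consume.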
In one embodiment, the system can receive the text data corresponding to the voice to be output through a dialog interface.
In the TTS processing of the present invention, the text must first be analyzed. To begin, the system needs to recognize the characters, segment the text into words sensibly, and decide where pauses belong. Machine speech also needs prosody. The parameters characterizing prosodic features include, for example, fundamental frequency, duration, and energy. In the present invention, the data used to generate prosody comes from the contextual information obtained in the text analysis stage.
During TTS processing, unit selection is needed to choose the most suitable speech units for synthesis. Specifically, according to the pinyin string (phone string) to be synthesized together with its contextual information and prosodic information, and following a certain criterion, the system selects an optimal group of speech units from the corpus as synthesis units for waveform concatenation. The criterion is in fact to minimize the value of a cost function. The value of this cost function is affected by factors such as prosodic inconsistency, spectral differences, and mismatch with the context.
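The cost-minimizing unit selection described above can be sketched as a small dynamic program. The patent does not define its cost function, so this example assumes a simplified one: the target cost is the pitch distance between a candidate unit and the desired prosody, and the join cost is the pitch jump between adjacent units. The `Unit` type, the `w_join` weight, and the function `select_units` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    phone: str
    pitch: float  # fundamental frequency in Hz


def select_units(targets, corpus, w_join=1.0):
    """Pick one corpus unit per (phone, desired_pitch) target, minimizing
    the summed target cost plus weighted join cost (assumed cost model)."""
    if not targets:
        return []
    # Candidate units for each target position.
    layers = [[u for u in corpus if u.phone == phone] for phone, _ in targets]
    if any(not layer for layer in layers):
        raise ValueError("corpus lacks a unit for some target phone")
    # best[i][j] = (cumulative cost, index of best predecessor in layer i-1)
    best = [[(abs(u.pitch - targets[0][1]), None) for u in layers[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in layers[i]:
            target_cost = abs(u.pitch - targets[i][1])
            cost, prev = min(
                (best[i - 1][k][0] + w_join * abs(u.pitch - p.pitch), k)
                for k, p in enumerate(layers[i - 1])
            )
            row.append((cost + target_cost, prev))
        best.append(row)
    # Backtrack from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(layers[i][j])
        j = best[i][j][1]
    return path[::-1]
```

A production system would add spectral and contextual terms to both costs; the dynamic program itself would stay the same.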
The last processing module of the TTS system is the waveform synthesis unit. Two strategies are generally used for waveform synthesis: one concatenates units without prosody modification, and the other applies prosody modification.
The above describes in general terms how the TTS system goes from text to speech. In the present invention, however, the speech produced by TTS processing is not necessarily output directly; further processing is required.
As shown in Fig. 2, in step S102, background dubbing audio data matching the type is obtained. If the result of the previous step is that the voice content is a poetry recitation, the system searches the library for background music that matches the poem. For example, after the intelligent robot has further analyzed the semantics and roughly understood the style of the poem, it marks the poem further using preset labels. It then searches the differently labeled sub-libraries in storage for the background dub corresponding to the label. For example, a poetry recitation in the bold and unconstrained style is paired with magnificent music. If the poem praises the motherland, the search is performed in a sub-library made up of songs such as "Love of the Republic", "I Come from the Sky Like a Snowflake", "Red Flag Song", "Long March Symphony", "March of the Volunteers", "The Five-Starred Red Flag", "Yellow River Piano Concerto", "Sound of the Hometown", "Ten Farewells to the Red Army" (background music version), "Young China" (background music version), "Yellow River Work Song", "My Motherland and I", "Ballad of the Great Wall", "The Yellow River Leads My Hand", "Boundless Rivers and Mountains", "Climbing the Snow Mountain", "The Same Song", and "Song of the Yangtze River". If the poem celebrates family affection and love of one's hometown, the search is performed in a sub-library made up of songs such as "White-Haired Mother", "Dabie Mountains", "Old Father", "Ballad of Mother", "Mother", "That Is Me", "Qianmen Stall Tea", "Dear Papa and Mama", "Sunset", "Mother in the Candlelight", "Recalling the South of the River", "Thinking of Home", "Pray Within Thousand", and "The Moon over a Fountain".
The library that stores background dubbing music can be built in many ways. For example, music sub-libraries can be built according to the style of the music itself, such as a violin sub-library, a symphony sub-library, a light music sub-library, a classical ancient-style sub-library, and so on, to suit voice content of many kinds.
After matching background dubbing audio data has been found in the database according to the type to which the semantic content represented by the text data belongs, it is output.
In step S103, preferably, the system plays the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met.
The output content thus matches the background dub: the user hears not only the machine-synthesized speech but also pleasant, melodious background music, which enriches the interactive experience.
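Playing the synthesized speech and the background dub at the same time amounts to mixing two audio streams. The sketch below assumes mono samples as floats in [-1.0, 1.0], a fixed background gain, and looping of the background when it is shorter than the speech; none of these details come from the patent, and `mix_with_background` is a hypothetical name.

```python
def mix_with_background(speech, background, gain=0.3):
    """Mix mono float samples; loop the background under the speech.

    Assumes both inputs share one sample rate and lie in [-1.0, 1.0].
    """
    if not background:
        return list(speech)
    mixed = []
    for i, sample in enumerate(speech):
        # Loop the background so it covers the whole utterance.
        bg = background[i % len(background)]
        # Attenuate the background, then clip to the valid range.
        mixed.append(max(-1.0, min(1.0, sample + gain * bg)))
    return mixed
```

Keeping the background gain well below 1.0 is a design choice so that the music stays behind the voice rather than masking it.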
In the background dubbing output method for an intelligent robot according to the present invention, preferably, as shown in Fig. 2, in the step of playing the background dubbing audio data while outputting the voice corresponding to the text data and when the trigger condition is met, the trigger condition generally includes the following cases:
For example, when judging the background dubbing trigger condition, as in step S201 shown in Fig. 3, one case is that playback of the background dub is triggered only when the system receives a specific statement input by the user. That is, in this case the background dub is not output automatically whenever the text-to-speech conversion is output; a specific instruction from the user is needed to start it. When it is judged that a specific user statement is present, the background dub is triggered to play together with the synthesized speech.
When it is judged that no specific user statement is present, the system goes on to judge whether start and stop times for automatic background dubbing playback have been set (step S202). If so, the background dub is played in synchronization with the synthesized speech according to the preset start and stop times.
When it is judged that the system has not set start and stop times for automatic playback, the system goes on to judge whether the manual selection function is used to play the background dub (step S203). If so, the background dub is played in synchronization with the synthesized speech under manual selection.
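The cascade of checks in steps S201 to S203 can be written as a short decision function. The `TriggerState` fields and the use of seconds for the scheduled window are assumptions made for this sketch; the patent describes only the order of the three checks.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TriggerState:
    user_phrase_received: bool = False                       # checked in S201
    scheduled_window: Optional[Tuple[float, float]] = None   # (start_s, end_s), S202
    manual_selection: bool = False                           # checked in S203


def should_play_background(state: TriggerState, now_s: float) -> bool:
    """Apply the S201 -> S202 -> S203 checks in order."""
    if state.user_phrase_received:            # S201: specific user statement
        return True
    if state.scheduled_window is not None:    # S202: preset start/stop times
        start_s, end_s = state.scheduled_window
        return start_s <= now_s <= end_s
    return state.manual_selection             # S203: manual selection
```

For example, `should_play_background(TriggerState(scheduled_window=(5.0, 20.0)), now_s=10.0)` evaluates to `True`, while the same state at `now_s=30.0` evaluates to `False`.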
Further, because different speech texts can be generated by different applications (for example, a poem can be generated by an application named "Poetry Recitation", and a fairy tale by an application named "Children's Stories"), in practice the type of the voice content to be output can be judged by determining which application is currently running.
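Judging the type from the running application reduces to a lookup. The mapping below uses the two application names mentioned in the text; the label strings and the function `type_from_application` are assumptions for illustration.

```python
# Application names follow the examples in the text; labels are assumed.
APP_TO_TYPE = {
    "Poetry Recitation": "poetry",
    "Children's Stories": "fairy_tale",
}


def type_from_application(app_name: str, default: str = "general") -> str:
    """Map the currently running application to a content type label."""
    return APP_TO_TYPE.get(app_name, default)
```

This path avoids semantic analysis entirely, which is why it can serve as a cheap first check before any text-based classification.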
The method of the present invention is described as being implemented in a computer system. The computer system may, for example, be provided in the control core processor of the robot. For example, the method described herein may be implemented as software executable with control logic, which is executed by the CPU in the robot control system. The functions described herein may be implemented as a set of program instructions stored in a non-transitory tangible computer-readable medium. When implemented in this way, the computer program comprises a set of instructions which, when run by a computer, cause the computer to perform a method capable of carrying out the above functions. Programmable logic may be temporarily or permanently installed in a non-transitory tangible computer-readable medium, such as a ROM chip, computer memory, a disk, or another storage medium. Besides being implemented in software, the logic described herein may be embodied using discrete components, integrated circuits, programmable logic used in combination with a programmable logic device (such as a field programmable gate array (FPGA) or a microprocessor), or any other device including any combination thereof. All such embodiments are intended to fall within the scope of the present invention.
Therefore, according to another aspect of the present invention, a background dubbing output device 300 for an intelligent robot is also provided, as shown in Fig. 4. The device includes the following units:
a text data receiving unit 301, configured to receive the text data corresponding to the voice to be output and to analyze the semantics of the text data;
a background dubbing search unit 302, configured to search a database for matching background dubbing audio data according to the type to which the semantic content represented by the text data belongs;
an audio output unit 303, configured to play the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met.
In the background dubbing output device 300 for an intelligent robot according to the present invention, preferably, in the audio output unit that plays the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met, the trigger condition includes the following cases:
when a specific statement input by the user is received, playback of the background dub is triggered;
start and stop times for automatically playing the background dub are set in the system;
the background dub is played when the voice corresponding to the text data is played.
In the background dubbing output device 300 for an intelligent robot according to the present invention, preferably, the background dubbing search unit, which searches the database for matching background dubbing audio data according to the type to which the semantic content represented by the text data belongs, further includes a judging unit configured to judge the voice type corresponding to the text data to be output so as to determine the matching background music.
In the background dubbing output device 300 for an intelligent robot according to the present invention, preferably, the text data corresponding to the voice to be output is received through a dialog interface.
In the background dubbing output device 300 for an intelligent robot according to the present invention, preferably, the text data receiving unit 301, which analyzes the text data, further includes the following units:
a text structure detection unit, configured to detect the structure of the input text according to punctuation marks, text normalization rules, word segmentation and part-of-speech tagging, pause processing, and character-to-pronunciation conversion;
a prosody generation unit, configured to obtain parameters characterizing prosodic features from the contextual information obtained by text analysis;
a unit selection unit, configured to select, according to the phone string to be synthesized together with its contextual information and prosodic feature parameters, and following a specified criterion, an optimal group of speech units from the corpus as synthesis units for waveform concatenation.
Through the embodiments of the present invention, computers and people can communicate through language as people do with one another. When TTS output is played, a background dub is played at the same time; combining the two makes the computer's speech output more realistic and engaging. To optimize the sound experience by combining a background dub with TTS, the output information is first converted into speech, a background dub matching the TTS output is then selected, and the two are combined and played together to the listener. For example, when TTS output of a poem is played, music that matches the mood of the poem is played at the same time; the match between the two gives the listener a sense of immersion.
It should be understood that the disclosed embodiments of the invention are not limited to the particular structures, process steps, or materials disclosed herein, but extend to equivalents of these features as would be understood by those of ordinary skill in the relevant art. It should also be understood that the terminology used herein serves only to describe specific embodiments and is not intended to be limiting.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Therefore, the phrases "one embodiment" or "an embodiment" appearing in various places throughout the specification do not necessarily all refer to the same embodiment.
Although embodiments of the invention are disclosed above, the content described is only an embodiment adopted to facilitate understanding of the invention and is not intended to limit it. Any person skilled in the technical field of the invention may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention shall still be as defined by the appended claims.

Claims (8)

1. A background dubbing output method for an intelligent robot, characterized in that the method comprises the following steps:
judging the type of the voice content to be output;
obtaining background dubbing audio data matching the type;
playing the background dubbing audio data while outputting the voice content.
2. The background dubbing output method for an intelligent robot according to claim 1, characterized in that the background dubbing audio data is played while the voice is output and when a trigger condition is met, wherein the trigger condition includes the following cases:
when a specific statement input by the user is received, playback of the background dub is triggered;
start and stop times for automatically playing the background dub are set in the system;
the background dub is played when the voice corresponding to the text data is played.
3. The background dubbing output method for an intelligent robot according to claim 1, characterized in that, in the step of judging the type of the voice content to be output, the type of the voice content to be output is judged according to the current application.
4. The background dubbing output method for an intelligent robot according to claim 1, characterized in that the text data corresponding to the voice to be output is received through a dialog interface.
5. A background dubbing output device for an intelligent robot, characterized in that the device comprises the following units:
a text data receiving unit, configured to receive the text data corresponding to the voice to be output and to analyze the semantics of the text data;
a background dubbing search unit, configured to search a database for matching background dubbing audio data according to the type to which the semantic content represented by the text data belongs;
an audio output unit, configured to play the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met.
6. The background dubbing output device for an intelligent robot according to claim 5, characterized in that, in the audio output unit that plays the background dubbing audio data while outputting the voice corresponding to the text data and when a trigger condition is met, the trigger condition includes the following cases:
when a specific statement input by the user is received, playback of the background dub is triggered;
start and stop times for automatically playing the background dub are set in the system;
the background dub is played when the voice corresponding to the text data is played.
7. The background dubbing output device for an intelligent robot according to claim 6, characterized in that the background dubbing search unit, which searches the database for matching background dubbing audio data according to the type to which the semantic content represented by the text data belongs, further includes a judging unit configured to judge the voice type corresponding to the text data to be output so as to determine the matching background music.
8. The background dubbing output device for an intelligent robot according to claim 7, characterized in that the text data corresponding to the voice to be output is received through a dialog interface.
CN201610982284.8A 2016-11-08 2016-11-08 Background dubbing output method and device for an intelligent robot Pending CN106557298A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610982284.8A CN106557298A (en) 2016-11-08 2016-11-08 Background dubbing output method and device for an intelligent robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610982284.8A CN106557298A (en) 2016-11-08 2016-11-08 Background dubbing output method and device for an intelligent robot

Publications (1)

Publication Number Publication Date
CN106557298A true CN106557298A (en) 2017-04-05

Family

ID=58444684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610982284.8A Pending CN106557298A (en) 2016-11-08 2016-11-08 Background dubbing output method and device for an intelligent robot

Country Status (1)

Country Link
CN (1) CN106557298A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107437413A (en) * 2017-07-05 2017-12-05 百度在线网络技术(北京)有限公司 Voice broadcast method and device
CN107463626A (en) * 2017-07-07 2017-12-12 深圳市科迈爱康科技有限公司 Voice-controlled education method, mobile terminal, system and storage medium
CN107731219A (en) * 2017-09-06 2018-02-23 百度在线网络技术(北京)有限公司 Speech synthesis processing method, device and equipment
CN108242238A (en) * 2018-01-11 2018-07-03 广东小天才科技有限公司 Audio file generation method and device and terminal equipment
CN109065018A (en) * 2018-08-22 2018-12-21 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109241331A (en) * 2018-09-25 2019-01-18 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method
CN109460548A (en) * 2018-09-30 2019-03-12 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109543021A (en) * 2018-11-29 2019-03-29 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109542389A (en) * 2018-11-19 2019-03-29 北京光年无限科技有限公司 Sound effect control method and system for the output of multi-modal story content
CN111104544A (en) * 2018-10-29 2020-05-05 阿里巴巴集团控股有限公司 Background music recommendation method and equipment, client device and electronic equipment
CN113779204A (en) * 2020-06-09 2021-12-10 阿里巴巴集团控股有限公司 Data processing method and device, electronic equipment and computer storage medium
CN109522427B (en) * 2018-09-30 2021-12-10 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and device
CN114189587A (en) * 2021-11-10 2022-03-15 阿里巴巴(中国)有限公司 Call method, device, storage medium and computer program product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1737901A (en) * 2004-08-16 2006-02-22 华为技术有限公司 System for realizing voice service to syncretize background music and its method
CN104391980A (en) * 2014-12-08 2015-03-04 百度在线网络技术(北京)有限公司 Song generating method and device
CN105709416A (en) * 2016-03-14 2016-06-29 上海科睿展览展示工程科技有限公司 Personalized dubbing method and system for multi-user operating game

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107437413A (en) * 2017-07-05 2017-12-05 百度在线网络技术(北京)有限公司 Voice broadcasting method and device
CN107437413B (en) * 2017-07-05 2020-09-25 百度在线网络技术(北京)有限公司 Voice broadcasting method and device
CN107463626A (en) * 2017-07-07 2017-12-12 深圳市科迈爱康科技有限公司 Voice-control educational method, mobile terminal, system and storage medium
CN107731219A (en) * 2017-09-06 2018-02-23 百度在线网络技术(北京)有限公司 Phonetic synthesis processing method, device and equipment
CN108242238B (en) * 2018-01-11 2019-12-31 广东小天才科技有限公司 Audio file generation method and device and terminal equipment
CN108242238A (en) * 2018-01-11 2018-07-03 广东小天才科技有限公司 Audio file generation method and device and terminal equipment
CN109065018A (en) * 2018-08-22 2018-12-21 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109065018B (en) * 2018-08-22 2021-09-10 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109241331A (en) * 2018-09-25 2019-01-18 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method
CN109241331B (en) * 2018-09-25 2022-03-15 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method
CN109460548A (en) * 2018-09-30 2019-03-12 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109460548B (en) * 2018-09-30 2022-03-15 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109522427B (en) * 2018-09-30 2021-12-10 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and device
CN111104544A (en) * 2018-10-29 2020-05-05 阿里巴巴集团控股有限公司 Background music recommendation method and equipment, client device and electronic equipment
CN109542389A (en) * 2018-11-19 2019-03-29 北京光年无限科技有限公司 Sound effect control method and system for the output of multi-modal story content
CN109543021A (en) * 2018-11-29 2019-03-29 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN109543021B (en) * 2018-11-29 2022-03-18 北京光年无限科技有限公司 Intelligent robot-oriented story data processing method and system
CN113779204A (en) * 2020-06-09 2021-12-10 阿里巴巴集团控股有限公司 Data processing method and device, electronic equipment and computer storage medium
CN113779204B (en) * 2020-06-09 2024-06-11 浙江未来精灵人工智能科技有限公司 Data processing method, device, electronic equipment and computer storage medium
CN114189587A (en) * 2021-11-10 2022-03-15 阿里巴巴(中国)有限公司 Call method, device, storage medium and computer program product

Similar Documents

Publication Publication Date Title
CN106557298A (en) Background dubbing output method and device for intelligent robot
CN108492817B (en) Song data processing method based on virtual idol and singing interaction system
CN110782900B (en) Collaborative AI storytelling
CN101064103B (en) Chinese voice synthetic method and system based on syllable rhythm restricting relationship
CN108962217A (en) Phoneme synthesizing method and relevant device
US20210158795A1 (en) Generating audio for a plain text document
US10229669B2 (en) Apparatus, process, and program for combining speech and audio data
US8027837B2 (en) Using non-speech sounds during text-to-speech synthesis
Eide et al. A corpus-based approach to <ahem/> expressive speech synthesis
CN108288468A (en) Audio recognition method and device
CN103632663B (en) A kind of method of Mongol phonetic synthesis front-end processing based on HMM
CN110782875B (en) Voice rhythm processing method and device based on artificial intelligence
CN104391980A (en) Song generating method and device
CN108305611B (en) Text-to-speech method, device, storage medium and computer equipment
CN110782880A (en) Training method and device of rhythm generation model
CN109492126B (en) Intelligent interaction method and device
Ogden et al. ProSynth: an integrated prosodic approach to device-independent, natural-sounding speech synthesis
CN112669815A (en) Song customization generation method and corresponding device, equipment and medium
CN116917984A (en) Interactive content output
CN106297766A (en) Phoneme synthesizing method and system
CN106292424A (en) Music data processing method and device for anthropomorphic robot
TWI605350B (en) Text-to-speech method and multilingual speech synthesizer using the method
TWI574254B (en) Speech synthesis method and apparatus for electronic system
CN102970618A (en) Video on demand method based on syllable identification
CN1331113C (en) Speech synthesizer, method and recording medium for speech recording synthetic program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170405