CN109036374B - Data processing method and device - Google Patents
- Publication number
- CN109036374B (application CN201810720403.1A)
- Authority
- CN
- China
- Prior art keywords
- voice
- story
- speech synthesis
- family member
- synthesis model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Abstract
An embodiment of the present application provides a data processing method and device. The method comprises: receiving a play request input by a user, the play request containing information on the content to be played and the type of the play request; converting the content to be played into speech using a speech synthesis model corresponding to the type of the play request, the speech synthesis model being an audio model built by analyzing and training on collected voice data of a child's family members; and playing the resulting speech. Because a family member's speech synthesis model is obtained for each play request type, and different play request types correspond to different scenes, the content to be played can be converted into a voice that both sounds like a family member and suits the current scene. The method can be applied to parent-child interaction and parent-child reading.
Description
Technical field
The embodiments of the present application relate to computer technology, and in particular to a data processing method and device.
Background
In the past two years, with the spread of artificial-intelligence interaction technology, intelligent robot products have developed rapidly. Among them, home service robots, and in particular companion robots for children, have become highly popular and have been released one after another.
Existing companion robots cannot produce a voice that is the same as, or similar to, a family member's, nor can they produce a voice that suits the scene in response to user requests made in different scenes.
Summary of the invention
The embodiments of the present application provide a data processing method and device, to overcome the technical problem in the prior art that a device cannot produce a voice that both sounds like a family member and suits the scene.
In a first aspect, an embodiment of the present application provides a data processing method, comprising:
receiving a play request input by a user, the play request containing information on the content to be played and the type of the play request;
converting the content to be played into speech using a speech synthesis model corresponding to the type of the play request, the speech synthesis model being an audio model built by analyzing and training on collected voice data of a child's family members, and the voice data being voice data of the family members in the scene corresponding to the type of the play request; and
playing the speech.
In one possible design, the type of the play request is a story play request, and the information on the content to be played includes story information for the story to be played.
Converting the content into speech using the speech synthesis model corresponding to the type of the play request comprises: converting the story content corresponding to the story information into speech using a first speech synthesis model to obtain story speech, where the first speech synthesis model is an audio model built by analyzing and training on collected first voice data of the child's family members, and the first voice data is voice data of the family members in a storytelling scene for the child.
Playing the speech comprises playing the story speech.
In one possible design, converting the story content corresponding to the story information into speech using the first speech synthesis model comprises: converting the story content into speech using a locally stored first speech synthesis model to obtain story speech.
Correspondingly, before receiving the play request input by the user, the method further comprises: receiving the first speech synthesis model sent by a cloud server.
In one possible design, converting the story content corresponding to the story information into speech using the first speech synthesis model comprises: sending the story content corresponding to the story information to a cloud server, so that the cloud server converts the story content into speech using the first speech synthesis model to obtain story speech; and receiving the story speech sent by the cloud server.
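The cloud-side conversion in this design can be sketched as a simple request/response round trip. The transport, endpoint, and payload shapes below are illustrative assumptions, since the patent specifies no protocol:

```python
# Sketch: the terminal sends story content to the cloud server, which runs
# the first speech synthesis model and returns story speech. FakeCloudServer
# and the JSON payload shape are assumptions for illustration only.
import json

class FakeCloudServer:
    """Stands in for the cloud server that holds the first synthesis model."""
    def convert(self, payload: str) -> str:
        req = json.loads(payload)
        # A real server would run the first speech synthesis model here.
        return json.dumps({"story_voice": f"<audio of: {req['story_content']}>"})

def play_story_via_cloud(server, story_content: str) -> str:
    """Terminal side: send the story content, receive the story speech."""
    payload = json.dumps({"story_content": story_content})
    reply = json.loads(server.convert(payload))
    return reply["story_voice"]  # the terminal then plays this speech
```

This split keeps the heavyweight model on the server; the locally-stored-model design above trades that off for offline playback.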
In one possible design, the story information for the story to be played comprises identification information of the story to be played, or text information of the story to be played.
In one possible design, the play request further includes selection information for the first speech synthesis models corresponding to the child's family members. Correspondingly, converting the story content corresponding to the story information into speech using the first speech synthesis model comprises: converting the story content corresponding to the story information into speech using the first speech synthesis model corresponding to the selection information.
In one possible design, the play request further includes selection information for the first speech synthesis model corresponding to the father and the first speech synthesis model corresponding to the mother. Correspondingly, converting the story content corresponding to the story information into speech using the first speech synthesis model comprises: converting the story content corresponding to the story information into speech using the first speech synthesis model corresponding to the selection information.
In one possible design, the type of the play request is a daily-speech play request, and the information on the content to be played includes text to be played.
Converting the content to be played into speech using the speech synthesis model corresponding to the type of the play request comprises: converting the text into speech using a second speech synthesis model to obtain daily speech, where the second speech synthesis model is an audio model built by analyzing and training on collected second voice data of the child's family members, and the second voice data is voice data of the family members in an ordinary-dialogue scene.
Playing the speech comprises playing the daily speech.
In one possible design, before converting the content into speech using the speech synthesis model corresponding to the type, the method further comprises:
collecting third voice data of the child's family members in an ordinary-dialogue scene, and first voice data of each family member in a storytelling scene for the child; and
sending the first voice data and the third voice data to a cloud server, so that the cloud server: performs cluster analysis on the third voice data to obtain second voice data corresponding to each family member; establishes a generic voice database corresponding to each family member and a personalized voice database corresponding to each family member; and, for each family member, trains on the second voice data contained in that member's generic voice database to obtain the member's second speech synthesis model, and trains on the first voice data contained in that member's personalized voice database to obtain the member's first speech synthesis model.
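The training pipeline in the design above can be sketched as follows. Clustering and model training are stubs standing in for real speaker clustering and speech-synthesis training, and every name here is illustrative rather than taken from the patent:

```python
# Sketch: cluster mixed ordinary-dialogue clips per family member, then train
# one model per member per scene (first = storytelling, second = daily).

def cluster_by_member(third_voice_data):
    """Group mixed daily-dialogue clips by family member.
    Stub: assumes each clip carries a member mark; the real system would
    recover this grouping by clustering voice features."""
    generic_db = {}
    for clip in third_voice_data:
        generic_db.setdefault(clip["member"], []).append(clip["audio"])
    return generic_db  # one generic voice database per member

def train_model(clips, scene):
    """Stub trainer standing in for speech-synthesis model training."""
    return {"scene": scene, "n_clips": len(clips)}

def build_member_models(third_voice_data, first_voice_data):
    """first_voice_data: member -> storytelling clips, collected directly.
    Returns each member's first (storytelling) and second (daily) models."""
    generic_db = cluster_by_member(third_voice_data)
    models = {}
    for member, daily_clips in generic_db.items():
        models[member] = {
            "first": train_model(first_voice_data.get(member, []), "story"),
            "second": train_model(daily_clips, "daily"),
        }
    return models
```

Note the asymmetry the design calls for: storytelling data is collected per member directly, while daily-dialogue data arrives mixed and must be clustered first.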
In a second aspect, an embodiment of the present application provides a data processing device, comprising:
a receiving module, configured to receive a play request input by a user, the play request containing information on the content to be played and the type of the play request;
a text-to-speech module, configured to convert the content to be played into speech using a speech synthesis model corresponding to the type of the play request, the speech synthesis model being an audio model built by analyzing and training on collected voice data of a child's family members, and the voice data being voice data of the family members in the scene corresponding to the type of the play request; and
a playing module, configured to play the speech.
In one possible design, the type of the play request is a story play request, and the information on the content to be played includes story information for the story to be played.
The text-to-speech module is specifically configured to convert the story content corresponding to the story information into speech using a first speech synthesis model to obtain story speech, where the first speech synthesis model is an audio model built by analyzing and training on collected first voice data of the child's family members, and the first voice data is voice data of the family members in a storytelling scene for the child.
The playing module is specifically configured to play the story speech.
In one possible design, the text-to-speech module is specifically configured to convert the story content corresponding to the story information into speech using a locally stored first speech synthesis model to obtain story speech; and the receiving module is further configured to receive the first speech synthesis model sent by a cloud server.
In one possible design, the device further comprises a sending module configured to send the story content corresponding to the story information to a cloud server, so that the cloud server converts the story content into speech using the first speech synthesis model to obtain story speech; and the receiving module is further configured to receive the story speech sent by the cloud server.
In one possible design, the story information for the story to be played comprises identification information of the story to be played, or text information of the story to be played.
In one possible design, the play request further includes selection information for the first speech synthesis models corresponding to the child's family members. Correspondingly, the text-to-speech module is specifically configured to convert the story content corresponding to the story information into speech using the first speech synthesis model corresponding to the selection information.
In one possible design, the play request further includes selection information for the first speech synthesis model corresponding to the father and the first speech synthesis model corresponding to the mother. Correspondingly, the text-to-speech module is specifically configured to convert the story content corresponding to the story information into speech using the first speech synthesis model corresponding to the selection information.
In one possible design, the type of the play request is a daily-speech play request, and the information on the content to be played includes text to be played. The text-to-speech module is specifically configured to convert the text into speech using a second speech synthesis model to obtain daily speech, where the second speech synthesis model is an audio model built by analyzing and training on collected second voice data of the child's family members, and the second voice data is voice data of the family members in an ordinary-dialogue scene. The playing module is specifically configured to play the daily speech.
In one possible design, the device further comprises:
a collecting module, configured to collect, before the text-to-speech module converts the content into speech using the speech synthesis model corresponding to the type, third voice data of the child's family members in an ordinary-dialogue scene and first voice data of each family member in a storytelling scene for the child; and
a sending module, configured to send the first voice data and the third voice data to a cloud server, so that the cloud server: performs cluster analysis on the third voice data to obtain second voice data corresponding to each family member; establishes a generic voice database and a personalized voice database corresponding to each family member; and, for each family member, trains on the second voice data contained in that member's generic voice database to obtain the member's second speech synthesis model, and trains on the first voice data contained in that member's personalized voice database to obtain the member's first speech synthesis model.
In a third aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the method described in the first aspect or any possible design of the first aspect is performed.
In a fourth aspect, an embodiment of the present application provides a data processing device comprising a processor and a memory, wherein the memory is configured to store a program and the processor is configured to execute the program stored in the memory; when the program is executed, the processor performs the method described in the first aspect or any possible design of the first aspect.
In the present application, users' voice data is collected per scene, and the voice data from each scene is trained to obtain a speech synthesis model corresponding to that scene. Thus, when a user inputs, through the terminal device, a play request of the type corresponding to a particular scene (different types of play requests correspond to different scenes), the content to be played can be converted into speech using the speech synthesis model corresponding to the type of the play request and then played. In other words, the data processing method of the present application can convert the content to be played into a voice that suits the current scene. Furthermore, since each speech synthesis model corresponds to a family member, the method can convert the content to be played into a voice that both sounds like a family member and suits the current scene, and can be applied to parent-child interaction and parent-child reading.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without any creative effort.
Fig. 1 is a system architecture diagram provided by an embodiment of the present application;
Fig. 2 is a first flowchart of the data processing method provided by an embodiment of the present application;
Fig. 3 is a second flowchart of the data processing method provided by an embodiment of the present application;
Fig. 4 is a third flowchart of the data processing method provided by an embodiment of the present application;
Fig. 5 is a first structural schematic diagram of the data processing device provided by an embodiment of the present application;
Fig. 6 is a second structural schematic diagram of the data processing device provided by an embodiment of the present application;
Fig. 7 is a structural schematic diagram of a terminal device provided by an embodiment of the present application.
Detailed description of the embodiments
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Fig. 1 is a system architecture diagram provided by an embodiment of the present application. Referring to Fig. 1, the system architecture of this embodiment includes a terminal device 11 and a cloud server 12. The terminal device 11 receives play requests from the user and collects the user's voice data; the cloud server 12 trains and stores speech synthesis models based on the user's voice data. After obtaining a speech synthesis model, the cloud server 12 may also send the model to the terminal device 11.
The data processing method of the embodiments of the present application is described in detail below using specific embodiments.
Fig. 2 is the first flowchart of the data processing method provided by an embodiment of the present application. As shown in Fig. 2, the method of this embodiment may include:
Step S101: receiving a play request input by a user, the play request containing information on the content to be played and the type of the play request;
Step S102: converting the content to be played into speech using a speech synthesis model corresponding to the type of the play request to obtain speech, where the speech synthesis model is an audio model built by analyzing and training on collected voice data of a child's family members, and the voice data is voice data of the family members in the scene corresponding to the type of the play request;
Step S103: playing the speech.
Specifically, the executing subject of this embodiment may be a terminal device, and the terminal device may be a children's story machine.
For step S101, the play request input by the user may include a story play request or a daily-speech play request. Understandably, if the user wants to listen to a story, a story play request is input; if another user sends a piece of text (which may be called text to be played) to the terminal device from their own terminal device, then when the user chooses to play that text, the user has in effect input a daily-speech play request.
A daily-speech play request is a request to play speech converted using the voice characteristics of an ordinary-dialogue scene. That is, story play requests and daily-speech play requests are two different types of play request, and each type corresponds to a scene.
For steps S102–S103, the content to be played is converted into speech using the speech synthesis model corresponding to the type of the play request, and the resulting speech is played.
Specifically, the voice characteristics appropriate for telling a story differ from those of ordinary dialogue. For example, storytelling needs emotion matching the story's scene and a slower speaking rate, whereas ordinary dialogue needs little extra emotion and a normal speaking rate. If the same speech synthesis model were used to convert the content to be played into speech for both story play requests and daily-speech play requests, the resulting speech could not suit the scene of one of the two request types: the storytelling scene or the ordinary-dialogue scene.
Therefore, in this embodiment, the content to be played is converted into speech using the speech synthesis model corresponding to the type of the play request contained in the request.
For example, if the type of the play request is a story play request, the content to be played is converted into speech using the first speech synthesis model corresponding to story play requests to obtain story speech, and the story speech is played. If the type of the play request is a daily-speech play request, the content to be played is converted into speech using the second speech synthesis model corresponding to daily-speech play requests to obtain daily speech, and the daily speech is played. The speech obtained in this way suits the scene corresponding to the play request, improving the user's experience.
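The type-to-model dispatch of steps S101–S103 can be sketched as follows. The class and variable names are illustrative, not from the patent, and the stub model simply tags text rather than producing audio:

```python
# Sketch of steps S101-S103: pick a synthesis model by play request type.

STORY = "story"
DAILY = "daily"

class SynthesisModel:
    """Stand-in for a trained speech synthesis model for one scene."""
    def __init__(self, scene):
        self.scene = scene
    def to_speech(self, text):
        # A real model would emit audio; here we just tag the text.
        return f"[{self.scene}] {text}"

# One model per request type, since each type corresponds to a scene.
MODELS = {
    STORY: SynthesisModel("storytelling"),
    DAILY: SynthesisModel("daily-dialogue"),
}

def handle_play_request(request_type, content):
    """S102: convert content with the model matching the request type."""
    model = MODELS[request_type]
    return model.to_speech(content)  # S103 would then play this speech
```

The point of the design is exactly this lookup: the request type, not the content, decides which trained model renders the speech.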
The terminal device may collect enough first voice data of person A in a storytelling scene and send it to the cloud server; the cloud server establishes a personalized database corresponding to person A, and analyzes and trains on the first voice data contained in that database to obtain the first speech synthesis model corresponding to person A. Likewise, the terminal device may collect enough first voice data of person B in a storytelling scene and send it to the cloud server; the cloud server establishes a personalized database corresponding to person B, and analyzes and trains on the first voice data contained in that database to obtain the first speech synthesis model corresponding to person B.
That is, first speech synthesis models corresponding to one or more people can be obtained on demand. If the cloud server obtains first speech synthesis models corresponding to multiple people, and the play request is a story play request, the play request further includes selection information for the first speech synthesis model, input by the user through the terminal device. If the user wants to hear A tell the story, the selection information contains A's mark. When playing the story, the terminal device or the cloud server converts the story text into speech using the first speech synthesis model corresponding to the selection information, that is, the first speech synthesis model corresponding to A.
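The selection step amounts to a lookup from the member mark in the request to that member's first model. A minimal sketch, with registry shape and marks assumed for illustration:

```python
# Sketch: the selection information in a story play request names whose
# first speech synthesis model converts the story text.

FIRST_MODELS = {
    "A": "first-model-of-A",  # e.g. the father's storytelling model
    "B": "first-model-of-B",  # e.g. the mother's storytelling model
}

def pick_first_model(play_request):
    """Return the first synthesis model named by the request's selection info."""
    mark = play_request["selection"]  # mark of the chosen family member
    return FIRST_MODELS[mark]
```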
Understandably, for a children's story machine the user is generally a child. So that the child can still hear a family member's voice when the parent or other family members are away — in other words, so that the story machine tells stories in a family member's voice — the terminal device can, for each family member, collect enough first voice data of that member in a storytelling scene and send it to the cloud server; the cloud server establishes the member's personalized voice database, and analyzes and trains on the first voice data to obtain the member's first speech synthesis model. That is, each family member corresponds to a first speech synthesis model.
To obtain the second speech synthesis models, the terminal device collects enough third voice data of the family members in an ordinary-dialogue scene; cluster analysis is performed on the third voice data to obtain the second voice data corresponding to each family member, and a generic voice database corresponding to each family member is established — that is, each family member corresponds to one generic voice database. For each family member, training is performed on the second voice data contained in that member's generic voice database to obtain the member's second speech synthesis model. That is, each family member corresponds to a second speech synthesis model.
If the cloud server obtains second speech synthesis models corresponding to multiple family members, and the play request is a daily-speech play request, the play request further includes selection information for the second speech synthesis model, input by the user through the terminal device. If the user wants to hear A's voice, the selection information contains A's mark, and the terminal device or the cloud server converts the text to be played into daily speech using the second speech synthesis model corresponding to the selection information, that is, the second speech synthesis model corresponding to A.
The cluster analysis may use a clustering algorithm such as the K-means clustering algorithm.
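Since the text names K-means as one usable algorithm, a minimal NumPy sketch of clustering utterance-level voice features into per-member groups follows. The feature representation (one vector per utterance) and the naive first-k initialization are assumptions for illustration; production systems would cluster speaker embeddings and use K-means++ initialization:

```python
import numpy as np

def kmeans(features, k, iters=20):
    """Minimal K-means over per-utterance voice feature vectors.

    features: (n, d) float array, one row per utterance.
    Returns (labels, centroids)."""
    # Naive deterministic init: the first k utterances.
    centroids = features[:k].astype(float).copy()
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Assign every utterance to its nearest centroid.
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned utterances.
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids

def assign(features, centroids):
    """Route later-collected utterances to the existing speaker clusters."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

`assign` mirrors the later step described below, where newly collected third voice data is routed to the already-labeled generic voice sets by nearest cluster.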
After the cloud server performs cluster analysis on the third voice data, multiple generic voice sets are obtained, each corresponding to one family member. The following modes can be used to add the corresponding family member's mark to each generic voice set:
One feasible mode: for each generic voice set, the cloud server sends a segment of speech from the set to the terminal device, so that the user can identify which family member it is. After identification, the user inputs the mark of the generic voice set through the terminal device, and the cloud server receives the mark sent by the terminal device.
Another feasible mode: for each generic voice set, the cloud server adds a pre-selected mark to the set, and sends the pre-selected mark together with a segment of voice data from the set to the terminal device for display, so that the user can judge whether the pre-selected mark is correct. If it is incorrect, the terminal device receives the correct mark input by the user and sends the correct mark to the cloud server; if it is correct, the user inputs a confirmation instruction.
Understandably, the cloud server may perform cluster analysis once third voice data of a preset duration has been collected. After the family members' marks have been added to the generic voice sets by the methods above, the terminal device continues to collect third voice data, and cluster analysis is used to assign the second voice data of the different family members contained in the new third voice data to the corresponding generic voice sets.
The cloud server can establish a generic voice database for each family member; the voice data in each family member's generic voice set is stored in the corresponding generic voice database, and the mark of a generic voice database is the mark of the corresponding family member.
In this embodiment, users' voice data is collected per scene, and the voice data from each scene is trained to obtain the speech synthesis model corresponding to that scene. Thus, when the user inputs, through the terminal device, a play request of the type corresponding to a particular scene (different types of play requests correspond to different scenes), the content to be played can be converted into speech using the speech synthesis model corresponding to the type of the play request and then played. In other words, the data processing method of this embodiment can convert the content to be played into a voice that suits the current scene. Furthermore, since each speech synthesis model corresponds to a family member, the method can convert the content to be played into a voice that both sounds like a family member and suits the current scene, and can be applied to parent-child interaction and parent-child reading.
The data processing methods corresponding to the different types of play requests are described below using specific embodiments.
Fig. 3 is the second flowchart of the data processing method provided by an embodiment of the present application. Referring to Fig. 3, the method of this embodiment includes:
Step S201: receive a story playing request input by a user, the story playing request containing information of the story to be played;
Step S202: perform voice conversion on the story content corresponding to the story information using a first speech synthesis model to obtain story voice, wherein the first speech synthesis model is an audio model established by analyzing and training the collected first voice data of a family member of the child; the first voice data is voice data of the family member's voice in a scene of telling stories to the child;
Step S203: play the story voice.
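Steps S201–S203 can be sketched as one function. This is a hedged sketch: the request fields, the story library, and the synthesis callables are all hypothetical stand-ins for the patent's components.

```python
def handle_story_request(request, models, library, play):
    """Steps S201-S203: resolve the story from the request's story
    information, synthesize it with the first (story-scene) model of
    the selected family member, then play the result."""
    story_id = request["story_info"]                 # identification info
    content = library[story_id]                      # look up story content
    synthesize = models[request.get("member", "father")]
    audio = synthesize(content)                      # S202: voice conversion
    play(audio)                                      # S203: playback
    return audio

# Stub models/library standing in for trained TTS and a story store
played = []
models = {"father": lambda text: f"[father-voice]{text}"}
library = {"radish": "Pulling the Radish..."}
out = handle_story_request({"story_info": "radish"}, models, library, played.append)
# out: "[father-voice]Pulling the Radish..."
```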
Specifically, the execution subject of this embodiment can be a terminal device, and the terminal device can be a companion robot, such as a children's story machine.
For step S201, the user inputs the story playing request through the terminal device, and this includes:
receiving a story playing request input by the user, the story playing request containing identification information of the story to be played; or,
receiving a story playing request input by the user, the story playing request containing text information of the story to be played.
When the terminal device is a children's story machine, the ways in which the user can input a story playing request include, but are not limited to, the following:
First way: the user directly presses the play-story button on the story machine to input the story playing request; the story machine then plays the story that should currently be played according to its preset order.
Second way: a story list is displayed on the story machine's screen, and the user inputs the story playing request by selecting the story to be played from the list; the request then contains the identification information of the story to be played.
Third way: an input box is displayed on the story machine's screen, and the user inputs the story playing request by typing the name of the story to be played into the input box; the request then contains the identification information of the story to be played.
Fourth way: an input box is displayed on the story machine's screen, and the user types the text of the story to be played into the input box; the request then contains the text information of the story to be played.
Fifth way: the story machine has a scan function; by scanning the text of the story to be played, it obtains the text information of the story, which constitutes inputting a story playing request containing that text information.
Sixth way: the user inputs the story playing request by voice; for example, the user speaks "Pulling the Radish", and the request then contains the identification information of the story to be played.
Further, if the cloud server has obtained first speech synthesis models corresponding to multiple family members, the story playing request further includes selection information for the first speech synthesis model corresponding to one of the family members. It will be understood that whichever family member's first speech synthesis model is selected to convert the story content into voice determines in whose voice the story is played.
For example, if the cloud server has obtained a first speech synthesis model corresponding to the father and a first speech synthesis model corresponding to the mother, the story playing request includes selection information choosing between the father's first speech synthesis model and the mother's first speech synthesis model.
It will be appreciated that, in this case, the ways in which the user can input the story playing request include, but are not limited to, the following:
First way: a story list is displayed on the story machine's screen; the user first selects the story to be played from the story list and then selects a family member from a family member selection list, thereby inputting the story playing request. The request then contains the identification information of the story to be played and the selection information for the family member (that is, the selection information for the first speech synthesis model).
Second way: at least two input boxes are displayed on the story machine's screen; the user inputs the name of the story to be played in the first input box and the appellation of the family member in the second input box, thereby inputting the story playing request; the appellation is the identifier of that family member's personalized voice database. The request then contains the identification information of the story to be played and the selection information for the first speech synthesis model.
Third way: at least two input boxes are displayed on the story machine's screen; the user inputs the text information of the story to be played in the first input box and the appellation of the family member in the second input box, thereby inputting the story playing request. The request then contains the text information of the story to be played and the selection information for the first speech synthesis model.
Fourth way: the story machine has a scan function; by scanning the text of the story to be played, it obtains the text information of the story, and the user inputs the appellation of the family member through an input box, thereby inputting the story playing request. The request then contains the text information of the story to be played and the selection information for the first speech synthesis model.
Fifth way: the user inputs the story playing request by voice; for example, the user speaks "listen to Dad tell Pulling the Radish", and the request then contains the identification information of the story to be played and the selection information for the first speech synthesis model.
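A spoken request like the fifth way carries both pieces of information at once. The parser below is purely illustrative — the phrasing rules, member names, and the "tells" keyword are assumptions, not part of the patent — but it shows how one utterance can yield both the model-selection info and the story identification info:

```python
MEMBERS = ("father", "mother")

def parse_voice_request(utterance):
    """Split a spoken request such as 'father tells Pulling the Radish'
    into selection info (which member's model) and story info.
    The grammar here is a hypothetical example."""
    words = utterance.split()
    member = words[0] if words and words[0] in MEMBERS else None
    rest = words[1:] if member else words
    if rest and rest[0] == "tells":       # drop the optional verb
        rest = rest[1:]
    return {"member": member, "story": " ".join(rest)}

req = parse_voice_request("father tells Pulling the Radish")
# req: {"member": "father", "story": "Pulling the Radish"}
```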
For step S202, performing voice conversion on the story content corresponding to the story information using the first speech synthesis model to obtain story voice comprises, in one possible embodiment:
performing voice conversion on the story content corresponding to the story information using a locally stored first speech synthesis model to obtain story voice.
In this embodiment, before the story playing request input by the user is received, the method further includes: receiving the first speech synthesis models sent by the cloud server; that is, each first speech synthesis model obtained by the cloud server is sent to the terminal device and stored there. After the terminal device receives the story playing request, it obtains the content of the story to be played (that is, the story content corresponding to the story information) according to the identification information or text information carried in the request, and then directly performs voice conversion on that story content using the locally stored speech synthesis model to obtain story voice.
If the story playing request contains selection information for the first speech synthesis model corresponding to a family member, then performing voice conversion on the story content using the locally stored first speech synthesis model comprises: performing voice conversion on the story content corresponding to the story information using the locally stored first speech synthesis model corresponding to the selection information, to obtain story voice.
In another possible embodiment, performing voice conversion on the story content corresponding to the story information using the first speech synthesis model to obtain story voice comprises:
sending the story content corresponding to the story information to the cloud server, so that the cloud server performs voice conversion on the story content using the first speech synthesis model to obtain story voice; and
receiving the story voice sent by the cloud server.
In this embodiment, the terminal device can obtain the content of the story to be played (that is, the story content corresponding to the story information) according to the identification information or text information carried in the story playing request input by the user, and then send that story content to the cloud server, so that the cloud server performs voice conversion on it using the first speech synthesis model to obtain story voice.
If the story playing request contains selection information for the first speech synthesis model corresponding to a family member, then receiving the story voice sent by the cloud server comprises: receiving the story voice obtained by the cloud server performing voice conversion on the story content using the first speech synthesis model corresponding to the selection information.
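The two embodiments — local conversion versus a cloud round-trip — can be sketched as one dispatch function. This is a hedged sketch: the callables stand in for the locally stored models and the cloud server's synthesis endpoint, which the patent does not specify at this level.

```python
def synthesize_story(content, selection, local_models=None, cloud=None):
    """If the selected first model is stored locally, convert on the
    device (first embodiment); otherwise send the story content to the
    cloud server and receive the story voice back (second embodiment)."""
    if local_models and selection in local_models:
        return local_models[selection](content)   # local conversion
    return cloud(selection, content)              # cloud round-trip

# Stubs standing in for a local model and the cloud server
local = {"mother": lambda t: f"[mother]{t}"}
fake_cloud = lambda sel, t: f"[cloud:{sel}]{t}"
a = synthesize_story("Once upon a time", "mother", local, fake_cloud)
b = synthesize_story("Once upon a time", "father", local, fake_cloud)
# a: "[mother]Once upon a time"   b: "[cloud:father]Once upon a time"
```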
In this embodiment, when the type of the playing request is a story playing request, voice conversion is performed on the story content using the first speech synthesis model corresponding to the story-telling scene of that type. The resulting story voice suits the current story-playing scene and is in a family member's voice, which realizes parent-child reading and improves the user's experience with the story machine.
Fig. 4 is a third flowchart of the data processing method provided by an embodiment of this application. Referring to Fig. 4, the method of this embodiment includes:
Step S301: receive a daily voice playing request input by a user, the daily voice playing request containing text to be played;
Step S302: perform voice conversion on the text using a second speech synthesis model to obtain daily voice, wherein the second speech synthesis model is an audio model established by analyzing and training the collected second voice data of a family member of the child; the second voice data is voice data of the family member's voice in a common dialogue scene;
Step S303: play the daily voice.
Specifically, when the execution subject of this embodiment is a story machine, for step S301: after a family member sends a piece of text to the story machine from his or her own terminal device, the story machine can display or issue a prompt indicating that text has been received and asking whether it should be played; if the user agrees, the user can input a daily voice playing request through the story machine.
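The prompt-then-play flow just described can be sketched as below. The prompt text, the confirmation callable, and the stub model are hypothetical; only the flow (receive text, confirm with the user, synthesize with the second model, play) comes from the text above.

```python
def daily_voice_flow(incoming_text, confirm, models, play, member="mother"):
    """S301 flow sketch: a family member sends text to the story machine;
    the machine prompts, and plays only if the user agrees."""
    if not confirm("Text received: play it?"):
        return None                              # user declined
    audio = models[member](incoming_text)        # second (daily-scene) model
    play(audio)
    return audio

played = []
models = {"mother": lambda t: f"[mother-daily]{t}"}
ok = daily_voice_flow("Dinner is ready", lambda prompt: True, models, played.append)
skipped = daily_voice_flow("ignored", lambda prompt: False, models, played.append)
# ok: "[mother-daily]Dinner is ready"   skipped: None
```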
Further, if the cloud server has obtained second speech synthesis models corresponding to multiple family members, the daily voice playing request further includes selection information for the second speech synthesis model corresponding to one of the family members. It will be understood that whichever family member's second speech synthesis model is selected determines in whose voice the text is played.
For example, if the cloud server has obtained a second speech synthesis model corresponding to the father and a second speech synthesis model corresponding to the mother, the daily voice playing request further includes selection information choosing between the father's second speech synthesis model and the mother's second speech synthesis model.
For step S302, performing voice conversion on the text using the second speech synthesis model to obtain daily voice comprises, in one possible embodiment:
performing voice conversion on the text using a locally stored second speech synthesis model to obtain daily voice.
In this embodiment, before the daily voice playing request input by the user is received, the method further includes: receiving the second speech synthesis models sent by the cloud server; that is, each second speech synthesis model obtained by the cloud server is sent to the terminal device and stored there. After receiving the daily voice playing request, the terminal device directly performs voice conversion on the text to be played using the locally stored second speech synthesis model to obtain daily voice.
If the daily voice playing request contains selection information for the second speech synthesis model corresponding to a family member, then performing voice conversion on the text to be played using the locally stored second speech synthesis model comprises: performing voice conversion on the text to be played using the locally stored second speech synthesis model corresponding to the selection information, to obtain daily voice.
In another possible embodiment, performing voice conversion on the text to be played using the second speech synthesis model to obtain daily voice comprises:
sending the text to be played to the cloud server, so that the cloud server performs voice conversion on the text to be played using the second speech synthesis model to obtain daily voice; and
receiving the daily voice sent by the cloud server.
In this embodiment, the terminal device can send the text to be played to the cloud server, so that the cloud server performs voice conversion on the text to be played using the second speech synthesis model to obtain daily voice.
If the daily voice playing request contains selection information for the second speech synthesis model corresponding to a family member, then receiving the daily voice sent by the cloud server comprises: receiving the daily voice obtained by the cloud server performing voice conversion on the text to be played using the second speech synthesis model corresponding to the selection information.
In this embodiment, for the daily voice playing request type, voice conversion is performed on the text to be played using the second speech synthesis model corresponding to the daily voice scene of that type. The resulting daily voice suits the current daily voice or common dialogue scene and is in a family member's voice, which realizes parent-child interaction and improves the user's experience with the story machine.
Fig. 5 is a first structural schematic diagram of the data processing device provided by an embodiment of this application. As shown in Fig. 5, the device of this embodiment may include: a receiving module 41, a voice conversion module 42 and a playing module 43;
the receiving module 41 is configured to receive a playing request input by a user, the playing request containing information of content to be played and the type of the playing request;
the voice conversion module 42 is configured to convert the content to be played into voice using a speech synthesis model corresponding to the type of the playing request; the speech synthesis model is an audio model established by analyzing and training collected voice data of a family member of a child, the voice data being voice data of the family member in a scene corresponding to the type of the playing request;
the playing module 43 is configured to play the voice.
The device of this embodiment can be used to execute the technical solutions of the above method embodiments; the realization principles and technical effects are similar and are not described again here.
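The three modules of Fig. 5 map naturally onto a small class. This is an illustrative sketch only — the class name, the callables, and the request shape are assumptions; the module numbering in the comments follows the figure.

```python
class DataProcessingDevice:
    """Mirror of Fig. 5: receiving module 41, voice conversion module 42,
    playing module 43, wired together as plain callables."""

    def __init__(self, models, speaker):
        self.models = models    # {request type: synthesis callable}
        self.speaker = speaker  # playback callable

    def receive(self, request):               # module 41
        return request["type"], request["content"]

    def convert(self, req_type, content):     # module 42
        return self.models[req_type](content)

    def play(self, voice):                    # module 43
        self.speaker(voice)

    def handle(self, request):
        req_type, content = self.receive(request)
        voice = self.convert(req_type, content)
        self.play(voice)
        return voice

spoken = []
dev = DataProcessingDevice({"story": lambda c: f"voice({c})"}, spoken.append)
out = dev.handle({"type": "story", "content": "radish"})
# out: "voice(radish)"
```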
In a possible design, the type of the playing request is a story playing request, and the information of the content to be played comprises information of the story to be played;
the voice conversion module 42 is specifically configured to:
perform voice conversion on the story content corresponding to the story information using a first speech synthesis model to obtain story voice, wherein the first speech synthesis model is an audio model established by analyzing and training the collected first voice data of the family member of the child; the first voice data is voice data of the family member's voice in a scene of telling stories to the child;
the playing module 43 is specifically configured to:
play the story voice.
In a possible design, the voice conversion module 42 is specifically configured to:
perform voice conversion on the story content corresponding to the story information using a locally stored first speech synthesis model to obtain story voice;
the receiving module 41 is further configured to: receive the first speech synthesis model sent by the cloud server.
In a possible design, the information of the story to be played comprises: identification information of the story to be played;
or, text information of the story to be played.
In a possible design, the playing request further comprises: selection information for the first speech synthesis model corresponding to each family member of the child;
correspondingly, the voice conversion module 42 is specifically configured to:
perform voice conversion on the story content corresponding to the story information using the first speech synthesis model corresponding to the selection information.
In a possible design, the playing request further comprises: selection information for the first speech synthesis model corresponding to the father and the first speech synthesis model corresponding to the mother;
correspondingly, the voice conversion module 42 is specifically configured to:
perform voice conversion on the story content corresponding to the story information using the first speech synthesis model corresponding to the selection information.
In a possible design, the type of the playing request is a daily voice playing request, and the information of the content to be played comprises text to be played;
the voice conversion module 42 is specifically configured to:
perform voice conversion on the text using a second speech synthesis model to obtain daily voice, wherein the second speech synthesis model is an audio model established by analyzing and training the collected second voice data of the family member of the child; the second voice data is voice data of the family member in a common dialogue scene;
the playing module 43 is specifically configured to:
play the daily voice.
The device of this embodiment can be used to execute the technical solutions of the above method embodiments; the realization principles and technical effects are similar and are not described again here.
Fig. 6 is a second structural schematic diagram of the data processing device provided by an embodiment of this application. As shown in Fig. 6, on the basis of the device structure shown in Fig. 5, the device of this embodiment may further include: a sending module 44 and a collection module 45;
the collection module 45 is configured to, before the voice conversion module converts the content into voice using the speech synthesis model corresponding to the type, collect third voice data of the family members of the child in a common dialogue scene and first voice data of each family member of the child in a scene of telling stories to the child;
the sending module 44 is configured to send the first voice data and the third voice data to a cloud server, so that the cloud server: performs cluster analysis on the third voice data to obtain second voice data corresponding to each family member, establishes a universal voice database corresponding to each family member, and establishes a personalized voice database corresponding to each family member; and, for each family member, trains on the second voice data contained in the family member's universal voice database to obtain the family member's second speech synthesis model, and trains on the first voice data contained in the family member's personalized voice database to obtain the family member's first speech synthesis model.
The sending module 44 is further configured to: send the story content corresponding to the story information to the cloud server, so that the cloud server performs voice conversion on the story content using the first speech synthesis model to obtain story voice;
correspondingly, the receiving module 41 is further configured to: receive the story voice sent by the cloud server.
The device of this embodiment can be used to execute the technical solutions of the above method embodiments; the realization principles and technical effects are similar and are not described again here.
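The server-side steps just listed — cluster the everyday (third) data into per-member second voice data, keep the story-scene (first) data per member, then train one model from each database — can be sketched end to end. The `assign` and `train` callables are hypothetical stand-ins for real clustering and TTS training code:

```python
def cloud_training_pipeline(third_voice_data, first_voice_data, assign, train):
    """Cloud-server sketch: build per-member universal (second) and
    personalized (first) databases, then train a second and a first
    speech synthesis model for each family member."""
    universal = assign(third_voice_data)     # {member: second voice data}
    personalized = first_voice_data          # {member: first voice data}
    models = {}
    for member in universal:
        models[member] = {
            "second_model": train(universal[member]),
            "first_model": train(personalized.get(member, [])),
        }
    return models

# Stub clustering/training to show the data flow
out = cloud_training_pipeline(
    third_voice_data=["u1", "u2"],
    first_voice_data={"father": ["s1"]},
    assign=lambda data: {"father": data},
    train=lambda samples: f"model({len(samples)})",
)
# out: {"father": {"second_model": "model(2)", "first_model": "model(1)"}}
```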
An embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it performs the methods in the above method embodiments.
Fig. 7 is a structural schematic diagram of the terminal device provided by an embodiment of this application. Referring to Fig. 7, the device of this embodiment includes a processor 71, a memory 72 and a communication bus 73, the communication bus 73 being used for connections among the components of the electronic device, wherein:
the memory 72 is configured to store a program;
the processor 71 is configured to execute the program stored in the memory; when the program is executed, the processor is configured to perform the methods in the above method embodiments.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments can be completed by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes: various media that can store program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solutions of this application. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some or all of the technical features; and such modifications or replacements do not depart the essence of the corresponding technical solutions from the scope of the technical solutions of the embodiments of this application.
Claims (18)
1. A data processing method, characterized by comprising:
receiving a playing request input by a user, the playing request containing information of content to be played and the type of the playing request;
converting the content to be played into voice using a speech synthesis model corresponding to the type of the playing request; the speech synthesis model being an audio model established by analyzing and training collected voice data of a family member of a child; the voice data being voice data of the family member in a scene corresponding to the type of the playing request;
playing the voice;
before the converting of the content into voice using the speech synthesis model corresponding to the type, further comprising:
collecting third voice data of the family members of the child in a common dialogue scene and first voice data of each family member of the child in a scene of telling stories to the child;
sending the first voice data and the third voice data to a cloud server, so that the cloud server: performs cluster analysis on the third voice data to obtain second voice data corresponding to each family member, establishes a universal voice database corresponding to each family member, and establishes a personalized voice database corresponding to each family member; and, for each family member, trains on the second voice data contained in the family member's universal voice database to obtain the family member's second speech synthesis model, and trains on the first voice data contained in the family member's personalized voice database to obtain the family member's first speech synthesis model.
2. The method according to claim 1, characterized in that:
the type of the playing request is a story playing request, and the information of the content to be played comprises information of the story to be played;
the converting of the content into voice using the speech synthesis model corresponding to the type of the playing request comprises:
performing voice conversion on the story content corresponding to the story information using the first speech synthesis model to obtain story voice, wherein the first speech synthesis model is an audio model established by analyzing and training the collected first voice data of the family member of the child; the first voice data is voice data of the family member in a scene of telling stories to the child;
the playing of the voice comprises:
playing the story voice.
3. The method according to claim 2, characterized in that the performing of voice conversion on the story content corresponding to the story information using the first speech synthesis model to obtain story voice comprises:
performing voice conversion on the story content corresponding to the story information using a locally stored first speech synthesis model to obtain story voice;
correspondingly, before the receiving of the playing request input by the user, further comprising:
receiving the first speech synthesis model sent by the cloud server.
4. The method according to claim 2, characterized in that the performing of voice conversion on the story content corresponding to the story information using the first speech synthesis model to obtain story voice comprises:
sending the story content corresponding to the story information to a cloud server, so that the cloud server performs voice conversion on the story content using the first speech synthesis model to obtain story voice;
receiving the story voice sent by the cloud server.
5. The method according to any one of claims 2 to 4, characterized in that the information of the story to be played comprises: identification information of the story to be played;
or, text information of the story to be played.
6. The method according to any one of claims 2 to 4, characterized in that the playing request further comprises: selection information for the first speech synthesis model corresponding to each family member of the child;
correspondingly, the performing of voice conversion on the story content corresponding to the story information using the first speech synthesis model comprises:
performing voice conversion on the story content corresponding to the story information using the first speech synthesis model corresponding to the selection information.
7. The method according to any one of claims 2 to 4, characterized in that the playing request further comprises: selection information for the first speech synthesis model corresponding to the father and the first speech synthesis model corresponding to the mother;
correspondingly, the performing of voice conversion on the story content corresponding to the story information using the first speech synthesis model comprises:
performing voice conversion on the story content corresponding to the story information using the first speech synthesis model corresponding to the selection information.
8. The method according to claim 1, characterized in that:
the type of the playing request is a daily voice playing request, and the information of the content to be played comprises text to be played;
the converting of the content to be played into voice using the speech synthesis model corresponding to the type of the playing request comprises:
performing voice conversion on the text to be played using the second speech synthesis model to obtain daily voice, wherein the second speech synthesis model is an audio model established by analyzing and training the collected second voice data of the family member of the child; the second voice data is voice data of the family member in a common dialogue scene;
the playing of the voice comprises:
playing the daily voice.
9. A data processing device, characterized by comprising:
a receiving module, configured to receive a playing request input by a user, the playing request containing information of content to be played and the type of the playing request;
a voice conversion module, configured to convert the content to be played into voice using a speech synthesis model corresponding to the type of the playing request; the speech synthesis model being an audio model established by analyzing and training collected voice data of a family member of a child, the voice data being voice data of the family member in a scene corresponding to the type of the playing request;
a playing module, configured to play the voice;
a collection module, configured to, before the voice conversion module converts the content into voice using the speech synthesis model corresponding to the type, collect third voice data of the family members of the child in a common dialogue scene and first voice data of each family member of the child in a scene of telling stories to the child;
a sending module, configured to send the first voice data and the third voice data to a cloud server, so that the cloud server: performs cluster analysis on the third voice data to obtain second voice data corresponding to each family member, establishes a universal voice database corresponding to each family member, and establishes a personalized voice database corresponding to each family member; and, for each family member, trains on the second voice data contained in the family member's universal voice database to obtain the family member's second speech synthesis model, and trains on the first voice data contained in the family member's personalized voice database to obtain the family member's first speech synthesis model.
10. The device according to claim 9, characterized in that:
the type of the playing request is a story playing request, and the information of the content to be played comprises information of the story to be played;
the voice conversion module is specifically configured to:
perform voice conversion on the story content corresponding to the story information using the first speech synthesis model to obtain story voice, wherein the first speech synthesis model is an audio model established by analyzing and training the collected first voice data of the family member of the child; the first voice data is voice data of the family member's voice in a scene of telling stories to the child;
the playing module is specifically configured to:
play the story voice.
11. The device according to claim 10, wherein the text conversion module is specifically configured to:
perform voice conversion on the story content corresponding to the story information using the locally stored first speech synthesis model, to obtain the story voice;
the receiving module is further configured to: receive the first speech synthesis model sent by the cloud server.
12. The device according to claim 10, wherein the sending module is further configured to: send the story content corresponding to the story information to the cloud server, so that the cloud server performs voice conversion on the story content using the first speech synthesis model to obtain the story voice;
the receiving module is further configured to: receive the story voice sent by the cloud server.
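Claims 11 and 12 describe two alternative delivery paths for the same conversion: either the trained model is downloaded from the cloud and synthesis runs on the device, or the story text is uploaded and the cloud returns finished audio. A minimal sketch, where `local_model` and `cloud` are hypothetical callables standing in for a stored TTS model and a cloud-server call:

```python
# Two synthesis paths, mirroring claim 11 (local model) and claim 12
# (cloud-side conversion). Callables stand in for the real model/RPC.

def get_story_voice(story_content, local_model=None, cloud=None):
    if local_model is not None:   # claim 11: model was received from the
        return local_model(story_content)  # cloud server and stored locally
    if cloud is not None:         # claim 12: upload text, receive audio
        return cloud(story_content)
    raise RuntimeError("no speech synthesis path available")
```

The local path trades model download and on-device compute for offline playback; the cloud path keeps the device thin but needs connectivity per request.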
13. The device according to any one of claims 10 to 12, wherein the information of the story to be played comprises:
identification information of the story to be played;
or, text information of the story to be played.
14. The device according to any one of claims 10 to 12, wherein the playing request further includes:
selection information for the first speech synthesis model corresponding to each family member of the child;
correspondingly, the text conversion module is specifically configured to:
perform voice conversion on the story content corresponding to the story information using the first speech synthesis model corresponding to the selection information.
15. The device according to any one of claims 10 to 12, wherein the playing request further includes:
selection information for the first speech synthesis model corresponding to the father and the first speech synthesis model corresponding to the mother;
correspondingly, the text conversion module is specifically configured to:
perform voice conversion on the story content corresponding to the story information using the first speech synthesis model corresponding to the selection information.
16. The device according to claim 9, wherein:
the type of the playing request is a daily voice playing request, and the information of the content to be played includes text to be played;
the text conversion module is specifically configured to:
perform voice conversion on the text using the second speech synthesis model, to obtain a daily voice, wherein the second speech synthesis model is a speech model established by analyzing and training the collected second voice data of the child's family member; the second voice data is voice data of the family member in a daily dialogue scene;
the playing module is specifically configured to:
play the daily voice.
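Claims 10, 14 and 16 together describe a dispatch: the playing-request type selects the model family (the first, storytelling model for story requests; the second, daily-speech model for daily voice requests), and the optional selection information picks the family member's voice. A minimal sketch, with all field names and the `models` layout hypothetical:

```python
# Request-type dispatch over the two per-member model families
# (claims 10, 14, 16). Strings stand in for real TTS inference.

def convert(request, models):
    member = request.get("selected_member", "mother")  # selection info (claims 14/15)
    if request["type"] == "story":
        model = models[member]["first_model"]    # storytelling voice
        content = request["story_text"]
    elif request["type"] == "daily":
        model = models[member]["second_model"]   # everyday dialogue voice
        content = request["text"]
    else:
        raise ValueError("unknown playing request type")
    return f"audio[{model}:{content}]"           # placeholder for synthesized audio
```

Keeping the two model families distinct per member is what lets the same device read stories in a parent's narration style yet answer everyday queries in that parent's ordinary speaking voice.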
17. A computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the method according to any one of claims 1 to 8 is performed.
18. A data processing device, comprising a processor and a memory, wherein:
the memory is configured to store a program;
the processor is configured to execute the program stored in the memory, and when the program is executed, the processor performs the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810720403.1A CN109036374B (en) | 2018-07-03 | 2018-07-03 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109036374A CN109036374A (en) | 2018-12-18 |
CN109036374B true CN109036374B (en) | 2019-12-03 |
Family
ID=65521587
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810720403.1A Active CN109036374B (en) | 2018-07-03 | 2018-07-03 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109036374B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW202009924A (en) * | 2018-08-16 | 2020-03-01 | 國立臺灣科技大學 | Timbre-selectable human voice playback system, playback method thereof and computer-readable recording medium |
CN110032355B (en) * | 2018-12-24 | 2022-05-17 | 阿里巴巴集团控股有限公司 | Voice playing method and device, terminal equipment and computer storage medium |
CN109903748A (en) * | 2019-02-14 | 2019-06-18 | 平安科技(深圳)有限公司 | A kind of phoneme synthesizing method and device based on customized sound bank |
CN110751940B (en) | 2019-09-16 | 2021-06-11 | 百度在线网络技术(北京)有限公司 | Method, device, equipment and computer storage medium for generating voice packet |
CN110600000B (en) * | 2019-09-29 | 2022-04-15 | 阿波罗智联(北京)科技有限公司 | Voice broadcasting method and device, electronic equipment and storage medium |
CN111696517A (en) * | 2020-05-28 | 2020-09-22 | 平安科技(深圳)有限公司 | Speech synthesis method, speech synthesis device, computer equipment and computer readable storage medium |
CN111816168A (en) * | 2020-07-21 | 2020-10-23 | 腾讯科技(深圳)有限公司 | Model training method, voice playing method, device and storage medium |
CN114024789A (en) * | 2021-10-15 | 2022-02-08 | 北京金茂绿建科技有限公司 | Voice playing method based on working mode and intelligent household equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104123857A (en) * | 2014-07-16 | 2014-10-29 | 北京网梯科技发展有限公司 | Device and method for achieving individualized touch reading |
CN104318813A (en) * | 2014-10-30 | 2015-01-28 | 天津侣途科技有限公司 | Child early education method and system based on mobile internet |
CN104992703A (en) * | 2015-07-24 | 2015-10-21 | 百度在线网络技术(北京)有限公司 | Speech synthesis method and system |
CN107464554A (en) * | 2017-09-28 | 2017-12-12 | 百度在线网络技术(北京)有限公司 | Phonetic synthesis model generating method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170256177A1 (en) * | 2016-03-01 | 2017-09-07 | International Business Machines Corporation | Genealogy and hereditary based analytics and delivery |
Also Published As
Publication number | Publication date |
---|---|
CN109036374A (en) | 2018-12-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109036374B (en) | Data processing method and device | |
CN105304080B (en) | Speech synthetic device and method | |
JP6876752B2 (en) | Response method and equipment | |
CN107633719B (en) | Anthropomorphic image artificial intelligence teaching system and method based on multi-language human-computer interaction | |
CN109272984A (en) | Method and apparatus for interactive voice | |
CN107833574A (en) | Method and apparatus for providing voice service | |
JP2019212288A (en) | Method and device for outputting information | |
CN106200886A (en) | A kind of intelligent movable toy manipulated alternately based on language and toy using method | |
CN108882101B (en) | Playing control method, device, equipment and storage medium of intelligent sound box | |
CN100585663C (en) | Language studying system | |
CN109119071A (en) | Training method and device of voice recognition model | |
CN204650422U (en) | A kind of intelligent movable toy manipulated alternately based on language | |
CN109710799B (en) | Voice interaction method, medium, device and computing equipment | |
CN104166547A (en) | Channel control method and device | |
WO2021197301A1 (en) | Auxiliary reading method and apparatus, storage medium, and electronic device | |
CN107959882B (en) | Voice conversion method, device, terminal and medium based on video watching record | |
CN107908743A (en) | Artificial intelligence application construction method and device | |
CN110349569A (en) | The training and recognition methods of customized product language model and device | |
CN111339881A (en) | Baby growth monitoring method and system based on emotion recognition | |
CN109460548B (en) | Intelligent robot-oriented story data processing method and system | |
CN107908709A (en) | Parent-child language chat interaction method, device and system | |
CN113205569B (en) | Image drawing method and device, computer readable medium and electronic equipment | |
CN101819797A (en) | Electronic device with interactive audio recording function and recording method thereof | |
CN116403583A (en) | Voice data processing method and device, nonvolatile storage medium and vehicle | |
JP4899383B2 (en) | Language learning support method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 2021-05-19
Address after: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing
Patentee after: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.
Patentee after: Shanghai Xiaodu Technology Co.,Ltd.
Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing
Patentee before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.