CN109065019B - Intelligent robot-oriented story data processing method and system - Google Patents

Intelligent robot-oriented story data processing method and system

Info

Publication number
CN109065019B
Authority
CN
China
Prior art keywords
story, data, voice-over, text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810981546.8A
Other languages
Chinese (zh)
Other versions
CN109065019A (en)
Inventor
贾志强 (Jia Zhiqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guangnian Wuxian Technology Co Ltd
Original Assignee
Beijing Guangnian Wuxian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guangnian Wuxian Technology Co Ltd
Priority to CN201810981546.8A
Publication of CN109065019A
Application granted
Publication of CN109065019B
Legal status: Active

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10: Prosody rules derived from text; Stress or intonation

Abstract

The invention discloses a story data processing method and system for an intelligent robot. The method comprises the following steps: acquiring story text data; parsing the story text data and identifying dialogue and voice-over in the story text; invoking a story data processing model and performing sound-effect processing on the dialogue and voice-over in the story text to generate dialogue and voice-over data with sound effects; and generating and outputting multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects. Compared with the prior art, the method and system can convert a story in text form into multimodal data that can be presented across multiple modalities, and can optimize the presentation of the dialogue and voice-over in the story in a targeted manner, thereby greatly improving the listener's experience when the story is told.

Description

Intelligent robot-oriented story data processing method and system
Technical Field
The invention relates to the computer field, and in particular to a story data processing method and system for an intelligent robot.
Background
In traditional daily life, reading text is the main way people appreciate literary works. In certain scenarios, however, people also appreciate literary works through sound, for example by listening to storytelling performances or recitations. Most commonly, children who cannot yet read enjoy stories by listening to others tell them.
With the continuous development of multimedia technology, more and more multimedia devices are entering daily human life. With the support of multimedia technology, the carrier of literary works in acoustic form, storytelling in particular, has gradually shifted to multimedia devices.
In general, storytelling with a multimedia device has meant that a person tells the story in advance and records an audio file, which the device then simply plays back. With the development of computer technology, and to obtain a sound source simply and conveniently, the prior art also converts text data directly into audio data. Manual recitation and recording then become unnecessary: given only the story text, a multimedia device can tell the story. However, direct text-to-speech conversion guarantees only a literal rendering of the text content and cannot reproduce the expressiveness of a human storyteller. Storytelling based on such text conversion is therefore dry and uninteresting, conveys only the literal meaning of the words, and yields a very poor user experience.
Disclosure of Invention
The invention provides a story data processing method for an intelligent robot, the method comprising the following steps:
acquiring story text data;
parsing the story text data and identifying dialogue and voice-over in the story text;
invoking a story data processing model and performing sound-effect processing on the dialogue and voice-over in the story text to generate dialogue and voice-over data with sound effects;
and generating and outputting multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects.
In an embodiment, the multimodal data further comprises intelligent robot action data, wherein:
corresponding intelligent robot action data is generated for the dialogue and voice-over in the story text.
In an embodiment, the method further comprises:
performing text-to-speech conversion on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects;
converting the text in the story text data other than the dialogue and voice-over into first voice data;
and fusing the dialogue and voice-over speech data with sound effects and the first voice data to generate story voice data.
In one embodiment, invoking a story data processing model to perform sound-effect processing on the dialogue and voice-over in the story text comprises the following steps:
performing text recognition on the story text, decomposing the story into content elements based on the text recognition result, and extracting story elements;
determining sound-effect characteristics matching the dialogue and voice-over according to the story elements corresponding to them;
and converting the dialogue and voice-over into dialogue and voice-over data with sound effects matching those sound-effect characteristics.
In an embodiment, the story elements corresponding to dialogue include the dialogue character, dialogue content, dialogue environment, and/or dialogue context references.
In an embodiment, the story elements corresponding to voice-over include voice-over content, voice-over environment, and/or voice-over context references.
The invention also proposes a storage medium on which program code implementing the method according to the invention is stored.
The invention also provides a story data processing system for an intelligent robot, the system comprising:
a text acquisition module configured to acquire story text data;
a text parsing module configured to parse the story text data and identify dialogue and voice-over in the story text;
a story data processing model library configured to store story data processing models;
a sound-effect processing module configured to invoke the story data processing model, perform sound-effect processing on the dialogue and voice-over in the story text, and generate dialogue and voice-over data with sound effects;
and a multimodal story data generation module configured to generate and output multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects.
In one embodiment, the multimodal story data generation module further comprises:
a first voice conversion unit configured to perform text-to-speech conversion on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects;
a second voice conversion unit configured to convert the text in the story text data other than the dialogue and voice-over into first voice data;
and a voice synthesis unit configured to fuse the dialogue and voice-over data with sound effects and the first voice data to generate story voice data.
The invention also provides an intelligent story machine, comprising:
an input acquisition module configured to collect the user's multimodal input and receive the user's story requirements;
the story data processing system described above, configured to acquire the corresponding story text data according to the user's story requirements and generate the multimodal data;
and an output module configured to output the multimodal data to the user.
Compared with the prior art, the method and system can convert a story in text form into multimodal data that can be presented across multiple modalities, and can optimize the presentation of the dialogue and voice-over in the story in a targeted manner, thereby greatly improving the listener's experience when the story is told.
Additional features and advantages of the invention will be set forth in the description that follows; some will also be apparent from the description or may be learned by practicing the invention. The objectives and other advantages of the invention may be realized and attained by the processes particularly pointed out in the written description and claims, as well as in the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention without limiting it. In the drawings:
FIGS. 1 and 2 are flow diagrams of methods according to embodiments of the invention;
FIGS. 3 and 4 are schematic diagrams of system structures according to embodiments of the invention;
FIGS. 5 and 6 are schematic diagrams of story machines according to embodiments of the invention.
Detailed Description
The following embodiments of the present invention are described in detail with reference to the accompanying drawings and examples, so that practitioners can fully understand how the invention applies technical means to solve technical problems and achieve technical effects, and can implement the invention accordingly. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments may be combined with each other, and the resulting technical solutions are all within the scope of the present invention.
To address the problems described in the background above, the invention provides a story data processing method for an intelligent robot. In this method, a story in text form is converted into multimodal data that can be presented across multiple modalities, improving the expressiveness of the story content.
Further, in practical application scenarios, when humans communicate with one another, different speakers sound different: every utterance carries the speaker's own vocal characteristics. Story text generally contains both dialogue, spoken by the characters in the story, and voice-over, which narrates around it. Therefore, in one embodiment, matched sound effects are added to the dialogue and voice-over in a targeted manner, so that their vocal expression is more real and vivid, improving the vividness of the storytelling and optimizing the user experience.
The detailed flow of a method according to an embodiment of the invention is described below with reference to the accompanying drawings. The steps shown in the flowcharts can be executed in a computer system containing, for example, a set of computer-executable instructions. Although a logical order of steps is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order than presented here.
As shown in FIG. 1, in one embodiment the method includes the following steps (a minimal sketch of the pipeline follows the list):
S110, acquiring story text data;
S120, parsing the story text data and identifying dialogue and voice-over in the story text;
S131, invoking a story data processing model;
S132, performing sound-effect processing on the dialogue and voice-over in the story text to generate dialogue and voice-over data with sound effects;
S140, generating and outputting multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects generated in step S132.
Further, in one embodiment, the dialogue and voice-over are output primarily as speech through TTS, so the final multimodal output includes the dialogue and voice-over data with sound effects converted into speech. Specifically, in one embodiment, text-to-speech conversion is performed on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects.
Further, to improve the vividness of the story presentation, in one embodiment the story is not told only by voice: the dialogue and voice-over are also displayed as text. Specifically, in one embodiment, the multimodal data includes dialogue and voice-over text data with sound effects.
Further, to enhance the vividness of the story presentation still more, in one embodiment the story is not limited to voice and/or text. Specifically, in an embodiment, the multimodal data generated in step S130 further includes intelligent robot action data, wherein:
corresponding intelligent robot action data is generated for the dialogue and voice-over in the story text.
In this way, when the intelligent robot tells a story, it can output the dialogue and voice-over data with sound effects while performing matching actions, greatly improving the vividness of the storytelling.
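As one hedged illustration of how such action data might be generated, the mapping below keys simple gestures off the segment kind produced by the earlier parsing sketch; the ActionFrame structure and the gesture names are invented for this example and are not specified by the patent.

```python
from dataclasses import dataclass

@dataclass
class ActionFrame:
    gesture: str       # symbolic gesture for the robot to perform
    duration_s: float  # rough time to hold the gesture

# Hypothetical default action per story segment kind.
DEFAULT_ACTIONS = {
    "dialogue": ActionFrame("face_listener_and_gesture", 2.0),
    "voiceover": ActionFrame("slow_head_sweep", 3.0),
    "narration": ActionFrame("idle_sway", 1.5),
}

def actions_for(segments) -> list:
    """Generate corresponding robot action data for each story segment."""
    return [DEFAULT_ACTIONS[seg.kind] for seg in segments]
```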
Further, the story text may contain content other than the dialogue and voice-over. In one embodiment, the text in the story text data other than the dialogue and voice-over is also converted into voice data and fused with the dialogue and voice-over data with sound effects. Specifically, the method further comprises (a fusion sketch follows these steps):
performing text-to-speech conversion on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects;
converting the text in the story text data other than the dialogue and voice-over into first voice data;
and fusing the dialogue and voice-over speech data with sound effects and the first voice data to generate story voice data.
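A minimal sketch of the fusion step, assuming each segment has already been synthesized to an audio file in story order; it uses pydub purely as an illustrative choice, since the patent does not prescribe any particular audio library.

```python
from pydub import AudioSegment  # assumed dependency for this sketch

def fuse_story_audio(ordered_clips) -> AudioSegment:
    """Fuse per-segment audio files into one story voice track.

    ordered_clips: (kind, path) pairs in story order, where "dialogue" and
    "voiceover" clips already carry their sound effects and "narration"
    clips are the first voice data.
    """
    story = AudioSegment.silent(duration=200)  # short lead-in
    for _kind, path in ordered_clips:
        clip = AudioSegment.from_file(path)
        # A brief pause between segments keeps the pacing natural.
        story += clip + AudioSegment.silent(duration=150)
    return story

# fuse_story_audio([("narration", "n1.wav"), ("dialogue", "d1.wav")]).export(
#     "story.wav", format="wav")
```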
Further, to ensure that the sound effects added to the dialogue and voice-over data improve the vividness of the story rather than degrade it through mismatched sound effects, in one embodiment the story text data is parsed to determine the story content, and the sound effects for the dialogue and voice-over are determined from the specific content of the story.
Specifically, in one embodiment, the story text data is parsed based on text recognition techniques, and parsing the story text data includes performing text recognition on the story text data to determine the story content.
Further, in view of how computers analyze text, in one embodiment the story text data is parsed by element decomposition. Specifically, the story is decomposed into content elements based on the text recognition result, and story elements are extracted, the story elements including the style, characters, and/or dialogue of the story.
Specifically, in an embodiment, invoking a story data processing model to perform sound-effect processing on the dialogue and voice-over in the story text includes the following steps (a matching sketch follows the list):
performing text recognition on the story text, decomposing the story into content elements based on the text recognition result, and extracting story elements;
determining sound-effect characteristics matching the dialogue and voice-over according to the story elements corresponding to them;
and converting the dialogue and voice-over into dialogue and voice-over data with sound effects matching those characteristics.
Specifically, in an embodiment, as shown in FIG. 2, the method includes the following steps:
S210, acquiring story text data;
S220, parsing the story text data;
S221, decomposing the story into content elements based on the text recognition result and extracting story elements;
S222, identifying dialogue and voice-over in the story text;
S230, invoking a story data processing model;
S231, determining sound-effect characteristics matching the dialogue and voice-over according to the story elements corresponding to them;
S232, converting the dialogue and voice-over into dialogue and voice-over data with sound effects matching those characteristics.
Specifically, in an embodiment, the parsing target is divided into several specific categories (the story elements), keywords are extracted for each story element, and the extracted keywords together with the story element tags are saved as the parsing result, as the sketch below illustrates.
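As a hedged illustration of saving keywords against story element tags, this sketch scans a text span for fixed per-category vocabularies; the categories and word lists are invented for the example, and a deployed system would use a proper extractor instead.

```python
# Illustrative keyword vocabularies per story element tag.
ELEMENT_VOCAB = {
    "character": ["wolf", "fox", "princess", "grandmother"],
    "environment": ["forest", "night", "castle", "storm"],
    "style": ["fairy tale", "fable", "adventure"],
}

def extract_elements(text: str) -> dict:
    """Extract keywords per story element tag; the tag-to-keywords mapping
    is saved as the parsing result."""
    lowered = text.lower()
    result = {}
    for tag, vocab in ELEMENT_VOCAB.items():
        hits = [word for word in vocab if word in lowered]
        if hits:
            result[tag] = hits
    return result

print(extract_elements("One stormy night the wolf crept through the forest."))
```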
Further, within the story text, the speakers, content, and/or background of the dialogue and voice-over may change as the story progresses. Therefore, in one embodiment, the sound effects for the dialogue and voice-over are determined from the story elements corresponding to each of them; specifically, a sound effect is determined separately for each sentence of dialogue and each sentence of voice-over.
In particular, in one embodiment, the story elements corresponding to dialogue include the dialogue character, dialogue content, dialogue environment, and/or dialogue context references.
In particular, in an embodiment, the story elements corresponding to voice-over include voice-over content, voice-over environment, and/or voice-over context references.
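These two element sets can be captured in small records such as the following sketch; the field names mirror the lists above, while the concrete types are assumptions of this illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DialogueElements:
    character: Optional[str] = None    # who is speaking
    content: Optional[str] = None      # what is said
    environment: Optional[str] = None  # scene in which it is said
    context_ref: Optional[str] = None  # reference to surrounding context

@dataclass
class VoiceoverElements:
    content: Optional[str] = None
    environment: Optional[str] = None
    context_ref: Optional[str] = None
```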
Further, based on the method of the present invention, the invention also provides a storage medium on which program code implementing the method is stored.
Furthermore, based on the method, the invention also provides a story data processing system for an intelligent robot.
Specifically, as shown in FIG. 3, in an embodiment the system includes:
a text acquisition module 310 configured to acquire story text data;
a text parsing module 320 configured to parse the story text data and identify dialogue and voice-over in the story text;
a story data processing model library 341 configured to store story data processing models;
a sound-effect processing module 340 configured to invoke a story data processing model, perform sound-effect processing on the dialogue and voice-over in the story text, and generate dialogue and voice-over data with sound effects;
and a multimodal story data generation module 330 configured to generate and output multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects.
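Read as a pipeline, modules 310-340 compose naturally; the StorySystem class below is this sketch's own invention, with the modules injected as callables so that the wiring, not any particular implementation, is what the example shows.

```python
class StorySystem:
    """Hypothetical composition of the modules in FIG. 3."""

    def __init__(self, acquire, parse, model_library, generate):
        self.acquire = acquire              # text acquisition module 310
        self.parse = parse                  # text parsing module 320
        self.model_library = model_library  # model library 341
        self.generate = generate            # multimodal generation module 330

    def process(self, source, model_name="default"):
        text = self.acquire(source)
        segments = self.parse(text)
        effect_model = self.model_library[model_name]  # sound-effect module 340
        return self.generate(effect_model(segments))

# Trivial stand-ins show the data flow end to end:
system = StorySystem(acquire=str, parse=str.splitlines,
                     model_library={"default": lambda segs: segs},
                     generate=lambda segs: {"speech": segs})
print(system.process('"Who goes there?"\nThe fox crept closer.'))
```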
Further, in an embodiment, the multimodal story data generation module 330 is further configured to generate corresponding intelligent robot action data for the dialogue and voice-over in the story text.
Further, in an embodiment, as shown in FIG. 4, the multimodal story data generation module 430 further includes:
a voice conversion unit 431 configured to convert the text in the story text data other than the dialogue and voice-over into first voice data;
a voice conversion unit 432 configured to perform text-to-speech conversion on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects;
and a voice synthesis unit 433 configured to fuse the dialogue and voice-over data with sound effects and the first voice data to generate story voice data.
Furthermore, based on the story data processing system, the invention also provides an intelligent story machine. Specifically, as shown in FIG. 5, in an embodiment the story machine includes:
an input acquisition module 510 configured to collect the user's multimodal input and receive the user's story requirements;
a story data processing system 520 configured to acquire the corresponding story text data according to the user's story requirements and generate the multimodal data;
and an output module 530 configured to output the multimodal data to the user.
Specifically, in one embodiment, the output module 530 includes a playing unit configured to play the dialogue and voice-over data with sound effects.
Specifically, as shown in FIG. 6, in an embodiment the story machine includes a smart device 610 and a cloud server 620, wherein:
the cloud server 620 includes a story data processing system 630, configured to invoke the capability interfaces of the cloud server 620 to acquire and parse story text data, and to generate and output multimodal data including the dialogue and voice-over data with sound effects. During data parsing, the story data processing system 630 calls the corresponding processing logic through each capability interface.
Specifically, in an embodiment, the capability interfaces of the cloud server 620 include a text recognition interface 621, a text-to-speech conversion interface 622, and a sound-effect synthesis interface 623.
The smart device 610 includes a human-machine interaction input/output module 611, a communication module 612, a playing module 613, and an action module 614.
The human-machine interaction input/output module 611 is configured to acquire the user's control instructions and determine the user's story-listening requirements.
The communication module 612 is configured to send the story-listening requirements acquired by the human-machine interaction input/output module 611 to the cloud server 620, and to receive multimodal data from the cloud server 620.
The playing module 613 is configured to play the dialogue and voice-over data with sound effects, or the story voice data, in the multimodal data.
The action module 614 is configured to perform corresponding actions according to the intelligent robot action data in the multimodal data.
Specifically, in a typical application scenario, the human-machine interaction input/output module 611 acquires the user's control instruction and determines the user's story-listening requirement.
The communication module 612 sends the story-listening requirement to the cloud server 620.
The cloud server 620 selects the corresponding story text data based on the story-listening requirement. The story data processing system in the cloud server 620 acquires and parses the story text data, then generates and outputs the multimodal data. The multimodal data comprises intelligent robot action data and story voice data, the story voice data including the dialogue and voice-over data with sound effects.
The communication module 612 receives the multimodal data sent by the cloud server 620.
The playing module 613 plays the story voice data in the received multimodal data.
The action module 614 performs the corresponding actions according to the intelligent robot action data in the multimodal data. A device-side sketch of this loop follows.
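The device-side control flow can be summarized in a few lines. Everything here (the endpoint URL, the JSON shape, and the play/perform callables standing in for modules 613 and 614) is hypothetical, illustrating the loop rather than any real protocol.

```python
import json
import urllib.request

CLOUD_URL = "https://cloud.example.com/story"  # hypothetical endpoint

def request_story(listening_requirement: dict) -> dict:
    """Communication module 612: send the user's story-listening requirement
    to the cloud server and receive the multimodal data in reply."""
    body = json.dumps(listening_requirement).encode("utf-8")
    req = urllib.request.Request(CLOUD_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def tell_story(multimodal: dict, play, perform) -> None:
    """Playing module 613 and action module 614: walk the multimodal data,
    playing each voice clip and performing the paired robot action."""
    for step in multimodal.get("steps", []):
        play(step["voice"])          # story voice data (with sound effects)
        perform(step.get("action"))  # intelligent robot action data, if any

# tell_story(request_story({"story": "Little Red Riding Hood"}),
#            play=print, perform=print)
```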
It is to be understood that the disclosed embodiments of the invention are not limited to the particular structures, process steps, or materials disclosed herein, but extend to equivalents thereof as would be understood by those of ordinary skill in the relevant arts. It is also to be understood that the terminology used herein is for describing particular embodiments only and is not intended to be limiting.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Thus, appearances of the phrase "an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.
Although the embodiments of the present invention have been described above, the description is intended only to aid understanding of the invention, not to limit it. The method of the invention admits various other embodiments, and those skilled in the art may make corresponding changes or modifications without departing from the spirit of the invention; all such changes and modifications are intended to fall within the scope of the appended claims.

Claims (7)

1. A story data processing method for an intelligent robot, characterized by comprising the following steps:
acquiring story text data;
parsing the story text data and identifying dialogue and voice-over in the story text, wherein the story text data is parsed by element decomposition: the story text data serving as the parsing target is divided into several corresponding categories according to preset story element types, keywords are then extracted for each category, and the extracted keywords together with the story element tags are saved as the parsing result;
invoking a story data processing model and performing sound-effect processing on the dialogue and voice-over in the story text to generate dialogue and voice-over data with sound effects;
generating and outputting multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects;
the method further comprising:
performing text-to-speech conversion on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects, wherein the story elements corresponding to voice-over include voice-over content, voice-over environment, and/or voice-over context references;
converting the text in the story text data other than the dialogue and voice-over into first voice data;
and fusing the dialogue and voice-over speech data with sound effects and the first voice data to generate story voice data.
2. The method of claim 1, wherein the multimodal data further comprises intelligent robot action data, wherein:
corresponding intelligent robot action data is generated for the dialogue and voice-over in the story text.
3. The method of claim 2, wherein invoking a story data processing model to perform sound-effect processing on the dialogue and voice-over in the story text comprises:
performing text recognition on the story text, decomposing the story into content elements based on the text recognition result, and extracting story elements;
determining sound-effect characteristics matching the dialogue and voice-over according to the story elements corresponding to them;
and converting the dialogue and voice-over into dialogue and voice-over data with sound effects matching those sound-effect characteristics.
4. The method of claim 3, wherein the story elements corresponding to dialogue include the dialogue character, dialogue content, dialogue environment, and/or dialogue context references.
5. A storage medium having stored thereon program code implementing the method of any one of claims 1-4.
6. A story data processing system for an intelligent robot, characterized in that the system comprises:
a text acquisition module configured to acquire story text data;
a text parsing module configured to parse the story text data and identify dialogue and voice-over in the story text, wherein the story text data is parsed by element decomposition: the story text data serving as the parsing target is divided into several corresponding categories according to preset story element types, keywords are then extracted for each category, and the extracted keywords together with the story element tags are saved as the parsing result;
a story data processing model library configured to store story data processing models;
a sound-effect processing module configured to invoke the story data processing model, perform sound-effect processing on the dialogue and voice-over in the story text, and generate dialogue and voice-over data with sound effects;
and a multimodal story data generation module configured to generate and output multimodal data matching the story text, the multimodal data including the dialogue and voice-over data with sound effects;
the multimodal story data generation module further comprising:
a first voice conversion unit configured to perform text-to-speech conversion on the dialogue and voice-over in the story text, combined with the sound-effect data, to generate dialogue and voice-over speech data with sound effects, wherein the story elements corresponding to voice-over include voice-over content, voice-over environment, and/or voice-over context references;
a second voice conversion unit configured to convert the text in the story text data other than the dialogue and voice-over into first voice data;
and a voice synthesis unit configured to fuse the dialogue and voice-over data with sound effects and the first voice data to generate story voice data.
7. An intelligent story machine, characterized in that the story machine comprises:
an input acquisition module configured to collect the user's multimodal input and receive the user's story requirements;
the story data processing system of claim 6, configured to acquire the corresponding story text data according to the user's story requirements and generate the multimodal data;
and an output module configured to output the multimodal data to the user.
CN201810981546.8A 2018-08-27 2018-08-27 Intelligent robot-oriented story data processing method and system Active CN109065019B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810981546.8A 2018-08-27 2018-08-27 CN109065019B (en) Intelligent robot-oriented story data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810981546.8A 2018-08-27 2018-08-27 CN109065019B (en) Intelligent robot-oriented story data processing method and system

Publications (2)

Publication Number Publication Date
CN109065019A CN109065019A (en) 2018-12-21
CN109065019B (en) 2021-06-15

Family

ID=64757210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810981546.8A Active CN109065019B (en) 2018-08-27 2018-08-27 Intelligent robot-oriented story data processing method and system

Country Status (1)

Country Link
CN (1) CN109065019B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390927B * 2019-06-28 2021-11-23 Beijing QIYI Century Science and Technology Co Ltd Audio processing method and device, electronic equipment and computer readable storage medium
CN111415650A * 2020-03-25 2020-07-14 Guangzhou Kugou Computer Technology Co Ltd Text-to-speech method, device, equipment and storage medium
CN113658577A * 2021-08-16 2021-11-16 Tencent Music Entertainment Technology (Shenzhen) Co Ltd Speech synthesis model training method, audio generation method, device and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101354840B * 2008-09-08 2011-09-28 Zhongzhi Ruide Technology (Beijing) Co Ltd Method and apparatus for performing voice reading control of electronic book
CN101694772B * 2009-10-21 2014-07-30 Beijing Vimicro Electronics Co Ltd Method for converting text into rap music and device thereof
JP2013072957A (en) * 2011-09-27 2013-04-22 Toshiba Corp Document read-aloud support device, method and program
US20140122082A1 (en) * 2012-10-29 2014-05-01 Vivotext Ltd. Apparatus and method for generation of prosody adjusted sound respective of a sensory signal and text-to-speech synthesis
CN106985137B * 2017-03-09 2019-11-08 Beijing Guangnian Wuxian Technology Co Ltd Multimodal interaction method and system for intelligent robot

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096932A * 2015-07-14 2015-11-25 Baidu Online Network Technology (Beijing) Co Ltd Voice synthesis method and apparatus of talking book
CN105894873A * 2016-06-01 2016-08-24 Beijing Guangnian Wuxian Technology Co Ltd Child teaching method and device for an intelligent robot

Also Published As

Publication number Publication date
CN109065019A (en) 2018-12-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant