CN117520502A - Information display method and device, electronic equipment and storage medium

Publication number
CN117520502A
CN117520502A (application CN202311474663.2A)
Authority
CN
China
Prior art keywords
information, theme, text, description, generating
Prior art date
Legal status
Pending
Application number
CN202311474663.2A
Other languages
Chinese (zh)
Inventor
闫铠
杜晶
雷凯翔
李奇
劳丰
程彧
Current Assignee
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd
Priority to CN202311474663.2A
Publication of CN117520502A
Status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/332 - Query formulation
    • G06F 16/3329 - Natural language query formulation or dialogue systems
    • G06F 16/338 - Presentation of query results
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks

Abstract

The embodiments of the present disclosure provide an information display method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring theme description information in response to an information display trigger operation; generating, based on the theme description information, theme extension information comprising at least a theme extension text that describes extension content associated with the theme description information; and generating target display information corresponding to the theme description information based on the theme extension text and displaying it, the target display information comprising at least text explanation audio of an avatar for the theme extension text. This technical solution addresses the problems in the related art that preparing candidate information requires heavy effort and that its limited content makes the displayed feedback inaccurate: by simply inputting theme description information, target display information carrying the avatar's text explanation audio can be fed back automatically, so that the theme extension information is displayed in greater detail and more vividly and clearly.

Description

Information display method and device, electronic equipment and storage medium
Technical Field
The embodiments of the present disclosure relate to computer application technology, and in particular to an information display method and apparatus, an electronic device, and a storage medium.
Background
With continuous technological progress and the rapid development of intelligent technology, more and more devices are equipped with human-machine interaction functions. In the related art, when a user inputs a communication question, the device typically responds by screening response information corresponding to the communication information from a large amount of pre-stored candidate response information and then displaying the screened response information to the user.
However, in implementing the related art, at least the following technical problems were found: a large amount of candidate response information must be prepared in advance, which requires heavy effort; and because communication questions are rich and varied, a uniformly prepared, limited set of candidate response information cannot accurately and naturally provide responses matching each question, making it difficult to meet users' differing needs.
Disclosure of Invention
The present disclosure provides an information display method and apparatus, an electronic device, and a storage medium, so as to accurately determine target display information and display it more vividly and clearly.
In a first aspect, an embodiment of the present disclosure provides an information display method, including:
acquiring theme description information in response to an information display trigger operation;
generating theme extension information based on the theme description information, wherein the theme extension information at least comprises theme extension text, and the theme extension text is text for describing extension content associated with the theme description information;
and generating target display information corresponding to the theme description information based on the theme extension text, and displaying the target display information, wherein the target display information at least comprises text explanation audio of the theme extension text by an avatar.
In a second aspect, embodiments of the present disclosure further provide an information display apparatus, including:
the information acquisition module is used for responding to the information display triggering operation and acquiring the theme description information;
the information expansion module is used for generating theme expansion information based on the theme description information, wherein the theme expansion information at least comprises theme expansion texts, and the theme expansion texts are texts for describing expansion contents associated with the theme description information;
And the information display module is used for generating target display information corresponding to the theme description information based on the theme expansion text and displaying the target display information, wherein the target display information at least comprises text explanation audio of the theme expansion text by an avatar.
In a third aspect, embodiments of the present disclosure further provide an electronic device, including:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the information presentation method as described in any of the embodiments of the present disclosure.
In a fourth aspect, the disclosed embodiments also provide a storage medium containing computer-executable instructions for performing the information presentation method according to any one of the disclosed embodiments when executed by a computer processor.
According to the technical solutions of the embodiments of the present disclosure, theme description information is acquired in response to an information display trigger operation; theme extension information is generated based on the theme description information, the theme extension information comprising at least a theme extension text that describes extension content associated with the theme description information, so that the theme description information is further and comprehensively expanded through the theme extension text; and target display information corresponding to the theme description information is generated based on the theme extension text and displayed, the target display information comprising at least text explanation audio of the avatar for the theme extension text. The target display information can thus display not only content corresponding to the theme description information alone but also content corresponding to the theme extension information, better meeting users' varied and rich needs and accurately determining the target display information. In addition, a large amount of candidate display information need not be determined in advance, which solves the prior-art problems of heavy preparation effort, limited content, and poor accuracy of the displayed feedback information; moreover, displaying through the avatar makes the target display information more vivid and clear.
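The three steps summarized above can be rendered as a simple pipeline. The sketch below is purely illustrative, not the patent's implementation: all function names, the "virtual_teacher" voice label, and the placeholder bodies are hypothetical stand-ins.

```python
# Illustrative sketch of the claimed three-step flow. All names and
# placeholder bodies are hypothetical, not taken from the patent.

def acquire_theme_description(trigger_input: str) -> str:
    # Step 1: acquire theme description information from the trigger
    # operation (here just normalized text; a real system might also
    # transcribe captured speech).
    return trigger_input.strip()

def generate_extension_text(theme_description: str) -> str:
    # Step 2: generate theme extension text describing extension content
    # associated with the theme description (placeholder for a model call).
    return f"Extended explanation of: {theme_description}"

def synthesize_explanation_audio(text: str, voice: str) -> bytes:
    # Step 3 (part): placeholder for a TTS call that renders the text in
    # the avatar's virtual voice; returns fake audio bytes for illustration.
    return f"[{voice}] {text}".encode("utf-8")

def present_information(trigger_input: str) -> dict:
    """Run the three claimed steps for one trigger operation."""
    theme_description = acquire_theme_description(trigger_input)
    extension_text = generate_extension_text(theme_description)
    audio = synthesize_explanation_audio(extension_text, voice="virtual_teacher")
    # Target display information: at least the avatar's explanation audio.
    return {"extension_text": extension_text, "explanation_audio": audio}
```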
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a flow chart of an information display method according to an embodiment of the disclosure;
fig. 2 is a flowchart of another information display method according to an embodiment of the disclosure;
fig. 3 is a flowchart of yet another information display method according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an information display device according to an embodiment of the disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that the modifiers "one" and "a plurality" mentioned in the present disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It will be appreciated that, before using the technical solutions disclosed in the embodiments of the present disclosure, the user should be informed, in an appropriate manner and in accordance with relevant laws and regulations, of the type, usage scope, usage scenarios, etc. of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly inform the user that the operation requested will require obtaining and using the user's personal information. Thus, the user can autonomously choose, according to the prompt information, whether to provide personal information to software or hardware such as an electronic device, application program, server, or storage medium that performs the operations of the technical solution of the present disclosure.
As an alternative but non-limiting implementation, in response to receiving an active request from a user, the prompt information may be sent to the user by way of, for example, a pop-up window, in which the prompt information may be presented as text. In addition, the pop-up window may carry a selection control for the user to choose, in a "consent" or "decline" manner, whether to provide personal information to the electronic device.
It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
It will be appreciated that the data (including but not limited to the data itself, the acquisition or use of the data) involved in the present technical solution should comply with the corresponding legal regulations and the requirements of the relevant regulations.
Before introducing the technical solution, an exemplary description of application scenarios may be given. The technical solution can be applied to a preset field to provide consulting services for users or to communicate with users. For example, in the field of education and teaching, knowledge presentation information may be generated and displayed for questions a user raises in preset subject categories. Based on the solution of this embodiment, corresponding target display information can be accurately determined for users' rich and varied communication content and displayed through an avatar, better meeting users' differing needs.
Fig. 1 is a schematic flowchart of an information display method provided by an embodiment of the present disclosure. This embodiment is applicable to situations in which target display information is generated and displayed based on acquired theme description information. The method may be performed by an information display apparatus, which may be implemented in the form of software and/or hardware; optionally, the apparatus may be implemented by an electronic device, and the electronic device may be a mobile terminal, a PC, a server, or the like.
As shown in fig. 1, the method of this embodiment may specifically include:
s110, responding to the information display triggering operation, and acquiring the theme description information.
The information display triggering operation is an operation for requesting information display. For example, the information presentation triggering operation may include an input operation of at least one of voice, text, and picture; and the method can also comprise a selection operation, a clicking operation and the like of the content displayed on the current interface.
In a specific implementation, the user may perform the information display trigger operation through an input device. In the field of education and teaching, for example, the user may complete the trigger operation by inputting question text and/or question voice to the input device, so that the corresponding answer is displayed; or the user may select, click, or otherwise operate on content in a teaching display interface through the input device to complete the trigger operation, so that the interface content corresponding to the selection or click is taken as the question and the corresponding answer is displayed.
In this embodiment, the topic description information reflects the communication content during communication and interaction with the user; for example, in a teaching scenario, the content of a student's question can be used as the topic description information. In a specific implementation, the topic description information may be obtained by detecting selected information on the interactive interface and using the selected information as the topic description information, where the selected information may include text information and picture information; the topic description information may also be acquired based on input from an external input device. Specifically, the topic description information is topic description text or topic description voice. In practical applications, obtaining the topic description information includes: acquiring topic description text input in a preset topic editing box; or acquiring topic description voice collected through a preset sound collection control.
The topic description text is text describing the communication content, and the topic description voice is speech describing the communication content. In practical applications, different input modes such as text input and voice input can be provided to facilitate user operation.
Specifically, a theme edit box may be generated in advance and displayed on the interactive interface. When text input in the topic editing box is detected, the input text can be determined to be topic description text, so that the acquisition operation of topic description information is completed. Or, the voice can be collected in real time through the voice collection control, and the collected voice is used as the theme description voice so as to complete the acquisition operation of the theme description information.
In order to ensure the accuracy and effectiveness of the acquired topic description text or topic description voice, when the topic description information is acquired, the information input confirmation operation can be performed on the topic description text or topic description voice so as to avoid acquiring the wrong topic description information.
For example, the manner of performing the information input confirmation operation on the theme description text may be: after an input text in a theme editing box is acquired, determining whether a preset confirmation key is clicked, and taking the input text as a theme description text under the condition that the preset confirmation key is clicked; or after detecting the input text in the theme editing box, determining whether the input text in the theme editing box is changed or not after a preset time interval, and taking the input text as the theme description text under the condition that the input text is not changed. The method for performing the information input confirmation operation on the topic description voice may be that when the input voice is detected, it is determined whether a preset topic description specified vocabulary is included in the input voice, and when the preset topic description specified vocabulary is included, the input voice is determined to be the topic description voice.
In the embodiment, the input topic description text or the collected topic description voice is used as the topic description information, so that various modes for acquiring the topic description information are provided, and the topic description information can be obtained for different input types, thereby providing convenience for information exchange with users.
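The two input-confirmation strategies described above (a settle interval for text input, and a preset vocabulary check for voice input) can be sketched as follows. This is an illustrative rendering only: the class, the settle threshold, and the trigger phrases are assumed values, not part of the patent.

```python
# Illustrative sketch of the two input-confirmation strategies; the settle
# interval and trigger phrases are assumed values.

class TopicTextConfirmer:
    """Confirm topic description text only once it has stopped changing
    for a preset settle interval (the 'unchanged after a preset time'
    strategy). Time is passed in explicitly to keep the logic testable."""

    def __init__(self, settle_seconds: float = 1.0):
        self.settle_seconds = settle_seconds
        self._text = ""
        self._last_edit = 0.0

    def on_edit(self, text: str, now: float) -> None:
        # Record the latest content of the topic editing box.
        self._text = text
        self._last_edit = now

    def confirmed_text(self, now: float):
        # Only release the text after the settle interval has elapsed.
        if self._text and (now - self._last_edit) >= self.settle_seconds:
            return self._text
        return None

def is_topic_description(transcript: str,
                         trigger_words=("please explain", "what is")) -> bool:
    # Voice variant: accept an utterance as topic description voice only if
    # it contains a preset topic-description vocabulary item.
    lowered = transcript.lower()
    return any(word in lowered for word in trigger_words)
```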
S120, generating theme extension information based on the theme description information, wherein the theme extension information at least comprises a theme extension text, and the theme extension text is a text for describing extension content associated with the theme description information.
For example, in the field of education and teaching, the extension content may be extended knowledge associated with the topic description information. For example, if the topic description information is topic description text whose content is the idiom "Jingwei fills the sea", the corresponding topic extension text may be content such as sentences made with the idiom, its paraphrase, and applied practice problems.
In a specific implementation, the theme extension information may be generated based on the theme description information by determining the topic description text corresponding to the topic description information, performing word segmentation on the topic description text, and generating the topic extension text based on the segmented words.
Alternatively, the implementation manner of generating the theme extension information based on the theme descriptive information may be: and determining a topic description text corresponding to the topic description information, determining a description keyword corresponding to the topic description text, and generating a topic extension text based on the description keyword.
In particular, the topic description information may be topic description text or topic description voice. When the topic description information is topic description text, the topic extension text may be generated based on the received topic description text. When the topic description information is topic description voice, in order to facilitate analysis of the topic description information and quickly determine the topic extension information, the topic description voice can be converted to text, and the description text obtained after conversion used as the topic description text. Optionally, this conversion from topic description voice to topic description text may be completed based on a speech-to-text service (automatic speech recognition).
In this embodiment, the description keywords are keywords extracted from the words obtained after segmenting the topic description text. For example, a description keyword may be a word of the topic description text remaining after stop words are removed. Further, topic extension text may be generated based on the description keywords. For example, when the description keywords are words such as "Jingwei fills the sea", "idiom", and "learn", the corresponding subject expanded text may cover different aspects of the idiom, such as sentence making and the idiom's origin.
The embodiment can generate the theme extended text by describing the keywords, thereby reducing the interference of unimportant words, reducing the workload of generating the theme extended text and being beneficial to improving the accuracy of generating the theme extended text.
Further, description keywords may be determined in the topic description text through a GPT (Generative Pre-trained Transformer) service. Extracting description keywords with GPT rather than the traditional rule-based (regular-expression) extraction makes it possible to determine the description keywords corresponding to the topic description text more accurately, which helps determine more natural and richer topic extension text based on the description keywords.
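As a rough stand-in for the keyword step, segmentation plus stop-word removal might look like the sketch below. The stop-word list is an illustrative assumption; the patent itself suggests a GPT-style service rather than this naive filter.

```python
# Naive keyword extraction: segment the topic text and drop stop words.
# The stop-word list is illustrative only.

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "about", "me", "tell"}

def extract_description_keywords(topic_text: str) -> list:
    # Crude segmentation: lowercase, strip simple punctuation, split on
    # whitespace, then remove stop words.
    tokens = topic_text.lower().replace("?", " ").replace(".", " ").split()
    return [t for t in tokens if t not in STOP_WORDS]
```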
In a specific implementation, generating the subject expanded text based on the descriptive keywords includes: inquiring a theme expansion text corresponding to the description keyword through a preset inquiry path; and/or inputting the description keywords into a text generation model to obtain the theme expansion text corresponding to the description keywords.
In this embodiment, generating the subject expanded text includes two ways:
the first way is to generate subject expanded text based on a preset query path. Specifically, before the subject expanded text is queried based on the preset query path, the storage path of each expanded text can be predetermined, and the query path of the subject expanded text corresponding to each keyword is determined based on the storage path and the corresponding relation between the expanded text and each keyword. When the theme expansion text is determined, a preset query path corresponding to the description keyword can be determined, the expansion text is queried through the preset query path, and the expansion text obtained by query is used as the theme expansion text corresponding to the description keyword.
The second way is to generate the subject expanded text based on a text generation model. The text generation model is a deep learning model trained on sample keywords and the expected output text corresponding to each sample keyword; for example, the deep learning model may be a natural language processing model. The sample keywords may be historical, real description keywords, and the expected output text is the expanded text one desires to display in association with those keywords. In practical application, a description keyword can be input into the text generation model, which outputs the subject expanded text corresponding to that keyword.
In the embodiment, the theme expansion text is determined by adopting the two modes, and the theme expansion text can be determined rapidly and accurately based on the description keywords.
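The two generation strategies above can be sketched together as follows. The lookup table stands in for the preset query path and its storage, and `generate_from_model` is a placeholder for a real model inference call; all contents are illustrative assumptions.

```python
# Sketch of the two strategies: (1) query pre-stored expansion text via a
# preset query path (a dict stands in for the storage), (2) fall back to a
# text-generation model. Contents are illustrative only.

EXPANSION_STORE = {
    "jingwei": ("Jingwei fills the sea: a fable about perseverance, with "
                "example sentences and the idiom's origin."),
}

def generate_from_model(keyword: str) -> str:
    # Placeholder for inference with a trained text-generation model.
    return f"Model-generated expansion for '{keyword}'."

def expansion_text(keyword: str) -> str:
    stored = EXPANSION_STORE.get(keyword)
    if stored is not None:
        return stored                        # strategy 1: preset query path
    return generate_from_model(keyword)      # strategy 2: text generation model
```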
S130, generating target display information corresponding to the theme description information based on the theme extension text, and displaying the target display information, wherein the target display information at least comprises text explanation audio of the virtual image aiming at the theme extension text.
It should be noted that the avatar may include a virtual voice, which can simulate different vocal characteristics; features such as timbre and language type of the virtual voice can be set for different character types and personalities to obtain different virtual voices. The avatar may also include a virtual character with a specific appearance, personality, and capabilities; the virtual character can play a role in the virtual world, and virtual characters with different images can be set according to different appearances, character types, capability characteristics, and the like. For example, the virtual characters corresponding to character types such as astronaut, teacher, and doctor have different appearances. In the teaching field, the image of a virtual teacher can be set to explain knowledge vividly and interact in questions and answers with users.
In this embodiment, to obtain more vivid and lively target display information, a virtual voice corresponding to the theme extension text may be predetermined; the explanation content of the text explanation audio is then determined based on the theme extension text, and text explanation audio reading the explanation content in the virtual voice is generated as the target display information. For example, the entire content of the theme extension text may be used as the explanation content, and text explanation audio reading it in the predetermined virtual voice generated and used as the target display information; in the teaching field, for example, the target display information may be courseware corresponding to the theme description information. Alternatively, the explanation content may be determined within the theme extension text according to a content display type specified by the user, with the text explanation audio obtained by combining the virtual voice and the explanation content and determined as the target display information. For example, when the theme description information concerns explaining a study question and the user-specified content display types include problem solving, the basic knowledge used, and exercise consolidation, other content in the theme extension text is removed, the text content corresponding to problem solving, the basic knowledge used, and exercise consolidation is extracted as the explanation content, and the text explanation audio is obtained based on this explanation content and the predetermined virtual voice.
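Selecting explanation content by user-specified content display types and then handing it to the audio step might be sketched as follows. The section names and the "virtual_teacher" voice label are assumptions, and the audio function is a placeholder for a real TTS call.

```python
# Sketch: pick explanation content from the expanded text, either all of it
# or only user-requested content display types, then synthesize audio.
# Section names and the voice label are illustrative only.

def select_explanation_content(sections: dict, requested_types=None) -> str:
    if not requested_types:
        # No preference given: explain the whole expanded text.
        return " ".join(sections.values())
    # Keep only the requested display types, dropping other content.
    return " ".join(sections[t] for t in requested_types if t in sections)

def build_explanation_audio(content: str, voice: str = "virtual_teacher") -> bytes:
    # Placeholder for the TTS step that reads the content in the avatar's
    # predetermined virtual voice; returns fake audio bytes for illustration.
    return f"<{voice}>{content}".encode("utf-8")
```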
Further, the target display information may also include a virtual character corresponding to the theme extension text, so that the text explanation audio and the virtual character together constitute the target display information. Displaying the theme extension text through a specific virtual character makes the display more vivid, increases the interest of the target display information, gives the user a stronger sense of imagery when viewing it, and helps the user understand the target display information more clearly and vividly.
In this embodiment, the theme expansion information further includes a theme expansion image; after determining the description keyword corresponding to the topic description text, the method further comprises the following steps: generating an image guidance text based on the description keywords, and generating a theme expansion image corresponding to the theme description information based on the image guidance text.
The theme extension image is an image describing extension content associated with the theme description information. For example, when the theme description information is the paraphrase of the idiom "Jingwei fills the sea", the theme extension image is an image of Jingwei and the sea; for example, a schematic illustration of "Jingwei fills the sea" from teaching materials can be used as the theme extension image.
It should be noted that, since the description keywords cannot fully and completely reflect the topic description information, generating the topic extension image directly from the description keywords may result in poor relevance between the topic extension image and the topic description information, or even a misleading topic extension image. To avoid determining a topic extension image that is poorly related to the topic description information, an image guidance text can be generated based on the description keywords, so that the topic extension image is obtained based on the image guidance text.
In specific implementation, the image features corresponding to the description keywords can be determined, the image guidance text is formed based on the determined image features, and the theme expansion image is generated based on a pre-built text-to-image service. The image guidance text is a text, obtained after learning the description keywords, that includes image feature descriptions corresponding to the description keywords.
For example, when the description keywords include words such as "sea", "Jingwei" and "idiom", it can be determined that the corresponding image features, when displayed in image form, include features such as the color and shape of the sea water and the appearance of Jingwei in the historical story; image description content of the description keywords is then generated based on these image features, and the image description content is used as the image guidance text.
In specific implementation, a corresponding theme expansion image can be obtained for each description keyword respectively, or one theme expansion image can be obtained jointly based on a plurality of description keywords. Furthermore, in order to achieve a higher degree of matching between the theme expansion image and the theme expansion text, the theme expansion text can be determined based on the description keywords and then itself used as the image guidance text, so as to obtain a theme expansion image matched with the theme expansion text.
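The keyword-to-guidance-text step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the keyword-to-feature table, function names, and the `text_to_image_service` mentioned in the comment are all hypothetical.

```python
# Hypothetical mapping from description keywords to image feature phrases;
# a real system would learn these features rather than hard-code them.
KEYWORD_IMAGE_FEATURES = {
    "sea": "deep blue sea water with rolling waves",
    "Jingwei": "a small mythical bird carrying twigs and pebbles",
    "idiom": "classical Chinese storybook illustration style",
}

def build_image_guidance_text(description_keywords):
    """Compose one guidance text from the image features of all keywords."""
    features = [KEYWORD_IMAGE_FEATURES[k]
                for k in description_keywords if k in KEYWORD_IMAGE_FEATURES]
    return ", ".join(features)

guidance = build_image_guidance_text(["sea", "Jingwei", "idiom"])
# A real system would now drive the pre-built text-to-image service, e.g.:
# theme_expansion_image = text_to_image_service.generate(guidance)
```

Joining all keyword features into one guidance text corresponds to obtaining one theme expansion image jointly from several description keywords; calling the builder per keyword would correspond to the per-keyword variant.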
In this embodiment, the theme expansion image is obtained through the generated image guidance text, which increases the accuracy and realism of the theme expansion image for the theme description information, and is beneficial to improving the relevance and matching degree between the theme expansion image and the theme description information, so as to better satisfy the requirements of users.
In order to improve the integrity and the richness of the target display information, generating the target display information corresponding to the theme description information based on the theme expansion text comprises the following steps: and generating text explanation audio of the virtual image aiming at the theme expansion text based on the theme expansion text, and generating target display information based on the text explanation audio and the theme expansion image.
Specifically, an avatar corresponding to the theme extended text may be predetermined; and determining the explanation content corresponding to the avatar based on the theme extension text, and generating text explanation audio for reading the explanation content through the avatar. In order to comprehensively and clearly interact with a user, the theme expansion image and text explanation audio can be combined, so that target display information is obtained.
Further, in order to avoid the situation that, when the target display information is displayed, the user can only listen to the content of the text explanation audio, the theme expansion text, the text explanation audio and the theme expansion image can be combined to obtain the target display information. When the target display information is displayed, the theme expansion text and the corresponding theme expansion image can be shown on the display interface while the text explanation audio is played, so that the target display information is displayed clearly and completely, and the user can grasp its content through both hearing and vision.
In addition, under the condition of a plurality of theme extension images, in order to improve the matching degree of the theme extension images and text explanation audios, the text explanation audios can be divided according to different explanation contents to obtain audio fragments; and determining the corresponding relation between the theme expansion image and the audio fragment, and combining the text explanation audio with the theme expansion image based on the corresponding relation to obtain target display information, so that the corresponding theme expansion image is displayed when the text explanation audio is explained to different contents.
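The fragment-to-image correspondence described above can be sketched as a simple timeline builder. This is an illustrative assumption, not the disclosed implementation; the fragment durations, field names, and image file names are invented for the example.

```python
def pair_fragments_with_images(audio_fragments, theme_images):
    """Build the correspondence so each theme expansion image is shown
    while its matching audio fragment plays."""
    assert len(audio_fragments) == len(theme_images)
    timeline = []
    start = 0.0
    for fragment, image in zip(audio_fragments, theme_images):
        timeline.append({
            "content": fragment["content"],
            "image": image,
            "start": start,
            "end": start + fragment["duration"],
        })
        start += fragment["duration"]
    return timeline

timeline = pair_fragments_with_images(
    [{"content": "definition", "duration": 8.0},
     {"content": "origin", "duration": 12.0}],
    ["definition.png", "origin.png"],
)
```

During playback, the renderer would switch to `timeline[i]["image"]` whenever the audio position enters the interval `[start, end)` of entry `i`.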
According to the embodiment, the text explanation audio is combined with the theme expansion image to obtain the target display information, so that the diversity and the interestingness of the target display information are improved, the interaction with the user can be performed more clearly and in detail, and the user can understand the theme expansion text better.
After the target display information is determined, the target display information can be displayed according to a preset typesetting template. The typesetting template specifies the layout of at least one of text, image and audio content, and can be preset based on the topic type of the topic description information. For example, if the target display information is teaching courseware, the typesetting template determines which part of the theme expansion text, which theme expansion image and which explanation content of the text explanation audio are placed on each page of the courseware. If the topic description information is to explain the idiom "Jingwei filling the sea", the theme expansion text includes content such as sentence making, the idiom's origin, its explanation and practice problems; these can be arranged on different pages of the courseware and displayed separately, with the corresponding theme expansion image placed alongside each piece of content, and the pages ordered according to the explanation sequence of the text explanation audio. In this way the target display information is presented according to the typesetting template, making its display smoother and more natural.
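One way to picture such a typesetting template is as a per-topic-type page plan. The sketch below is a hypothetical data structure for the idiom-teaching example; the page order, section names and fields are assumptions, not part of the disclosed scheme.

```python
# Hypothetical typesetting template keyed by topic type: each entry says what
# goes on each page of the teaching courseware.
COURSEWARE_TEMPLATE = {
    "idiom_teaching": [
        {"page": 1, "section": "explanation",       "image_slot": True},
        {"page": 2, "section": "idiom_origin",      "image_slot": True},
        {"page": 3, "section": "sentence_making",   "image_slot": True},
        {"page": 4, "section": "practice_problems", "image_slot": False},
    ],
}

def layout_courseware(topic_type, sections, images):
    """Place each content section (and its image, if the slot allows one)
    on the page assigned by the template."""
    pages = []
    for slot in COURSEWARE_TEMPLATE[topic_type]:
        name = slot["section"]
        pages.append({
            "page": slot["page"],
            "text": sections.get(name, ""),
            "image": images.get(name) if slot["image_slot"] else None,
        })
    return pages

pages = layout_courseware(
    "idiom_teaching",
    {"explanation": "E", "idiom_origin": "O",
     "sentence_making": "S", "practice_problems": "P"},
    {"explanation": "e.png"},
)
```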
According to the technical scheme of this embodiment, the theme description information is obtained in response to an information display trigger operation; theme expansion information is generated based on the theme description information, where the theme expansion information at least includes a theme expansion text, the theme expansion text being a text for describing expansion content associated with the theme description information, so that the theme description information is further and comprehensively expanded through the theme expansion text; and target display information corresponding to the theme description information is generated based on the theme expansion text and displayed, where the target display information at least includes text explanation audio of an avatar for the theme expansion text. In this way, the target display information can display not only the content corresponding to the theme description information but also the content corresponding to the theme expansion information, better meeting the user's requirements for variety and richness while determining the target display information accurately. In addition, a large amount of alternative display information does not need to be predetermined, which solves the problems in the related art that preparing alternative information involves a heavy workload and limited content, leading to poor accuracy of the displayed feedback information. Moreover, the target display information is displayed more vividly and clearly through the avatar.
Fig. 2 is a flowchart of another information display method according to an embodiment of the disclosure. The technical solution of the present embodiment further refines the process of generating the target display information on the basis of the foregoing embodiment. Optionally, generating the target presentation information corresponding to the topic description information based on the topic extension text includes: determining an avatar corresponding to the theme extended text, and generating text explanation audio and avatar action information corresponding to the avatar based on the theme extended text; and generating target display information corresponding to the theme description information based on the avatar, the avatar action information and the text explanation audio. Reference is made to the description of this example for a specific implementation. The technical features that are the same as or similar to those of the foregoing embodiments are not described herein.
As shown in fig. 2, the method of this embodiment may specifically include:
s210, responding to the information display triggering operation, and acquiring the theme description information.
S220, generating theme extension information based on the theme description information, wherein the theme extension information at least comprises a theme extension text, and the theme extension text is a text for describing extension content associated with the theme description information.
S230, determining the virtual image corresponding to the theme expansion text, and generating text explanation audio and image action information corresponding to the virtual image based on the theme expansion text.
Wherein the avatar includes a virtual sound and a virtual character.
To present the target presentation information more intuitively and with more interest, the virtual sound and the virtual character corresponding to the theme expansion text may be determined respectively. Specifically, the virtual sound and the virtual character can be determined based on aspects such as the topic type and the theme scene of the theme expansion text; for example, when the topic type of the theme expansion text is aerospace knowledge and the theme scene is explaining the satellite launch principle, the virtual sound can be set as a middle-aged male voice and the virtual character as an astronaut, thereby increasing the interest of displaying the target display information.
In a specific implementation, corresponding text explanation audio can be generated based on the determined virtual sound and the theme extended text; and generating the avatar action information of the virtual character based on the theme extended text. Wherein, different actions and gestures of the virtual character can be reflected by the image action information.
Specifically, the avatar action information of the virtual character may be determined based on the topic type of the theme expansion text and the vocabulary types in the theme expansion text. For example, if the theme scene is a song teaching scene, the song to be taught can be used as the theme expansion text, and the avatar action information of the virtual character can be determined based on the lyrics to be taught. If the song is a folk song, the avatar action information can be ethnic folk dance movements; if the song is a children's song, the avatar action information can be common children's movements.
Optionally, the manner of generating the avatar action information corresponding to the avatar based on the theme extended text may be: determining display state information corresponding to the avatar based on the theme extended text, and determining expression information and other action information corresponding to the avatar based on the display state information; mouth shape information corresponding to the avatar is generated based on the text interpretation audio, and face motion information corresponding to the avatar is generated based on the mouth shape information and the expression information.
Wherein the avatar action information includes facial action information and other action information; the other action information is action information of the parts of the avatar other than the face, for example, limb movements, torso movements, and the like. The presentation state information may include at least one of the avatar states "happy", "sad", "angry", "surprised", "fearful" and "disgusted". The mouth shape information is the shape the mouth takes when making different sounds.
In implementations, the presentation state information may be determined based on the theme scene described in the theme expansion text. For example, if the theme scene is a happy game interaction scene, the display state information is "happy"; if the theme scene is a sad story-telling scene, the display state information is "sad". Further, each theme expansion text can correspond to multiple kinds of display state information, and different display state information can be set for the text content of different parts of the theme expansion text, so that the avatar has rich expression information. For example, the theme expansion text is divided into portions of content, and for each divided portion it can be determined whether the text content includes state-related words, such as words representing laughing, crying, shouting and the like; if so, the display state information corresponding to each portion can be determined based on the state-related words, so that the display state information corresponding to the theme expansion text is determined sequentially according to the order of the text content within the theme expansion text.
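The state-related-word detection just described can be sketched as a per-segment lookup. The word lists below are illustrative assumptions; a real system would use a far richer lexicon or a classifier.

```python
# Hypothetical state-related word lists (the text's examples: laughing,
# crying, shouting, ...).
STATE_WORDS = {
    "happy": {"laugh", "smile", "cheer"},
    "sad": {"cry", "weep", "sob"},
    "angry": {"shout", "yell"},
}

def presentation_states(text_segments, default="neutral"):
    """Return one display state per text segment, in segment order,
    by matching state-related words in each segment."""
    states = []
    for segment in text_segments:
        words = set(segment.lower().split())
        matched = default
        for state, cues in STATE_WORDS.items():
            if words & cues:
                matched = state
                break
        states.append(matched)
    return states

states = presentation_states(["they laugh together", "she began to cry"])
```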
In this embodiment, expression information of the avatar may be determined based on the presentation state information. Specifically, the expression information of the avatar corresponding to the theme expansion text can be determined according to the first corresponding relation between the pre-established state information and the expression information and the determined display state information. For example, the correspondence between the state information and the expression information may include: "happy" corresponds to smiling expression, "sad" corresponds to crying expression, and "aversion" corresponds to frowning expression.
Further, other motion information of the avatar corresponding to the presentation state information may be determined based on a second correspondence between the preset state information and the other motion information. For example, the second correspondence may include that "happy" corresponds to a clapping action, that "anger" corresponds to a crossing waist action, and that "sad" corresponds to a low head action. Accordingly, corresponding other action information can be determined based on the display state information and the second corresponding relation.
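The first and second correspondences above amount to two lookup tables. The sketch mirrors the examples given in the text; treating them as Python dictionaries is an illustrative assumption about the implementation.

```python
# First correspondence: display state -> expression information.
EXPRESSION_BY_STATE = {
    "happy": "smiling",
    "sad": "crying",
    "disgusted": "frowning",
}
# Second correspondence: display state -> other action information.
ACTION_BY_STATE = {
    "happy": "clapping",
    "angry": "hands on hips",
    "sad": "lowered head",
}

def avatar_cues(state):
    """Resolve the expression and other-action information for a state,
    with neutral fallbacks for unmapped states."""
    return (EXPRESSION_BY_STATE.get(state, "neutral"),
            ACTION_BY_STATE.get(state, "standing"))
```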
In addition, in order to make the expressions and actions of the avatar better fit the text explanation audio when the target display information is displayed, coming closer to real interactive communication, the mouth shape information of the avatar can be determined. Specifically, each explanation pronunciation in the text explanation audio can be determined based on the text explanation audio, the explanation mouth shape corresponding to each explanation pronunciation is determined based on a predetermined correspondence between pronunciations and mouth shapes, and the mouth shape information is formed from these explanation mouth shapes. Specifically, the mouth shape information may be determined based on a lip synchronization (LipSync) technique.
The LipSync technology is used to synchronize the lip movements of the avatar with the voice. From the output audio, a set of pronunciation mouth shapes for animating the lips of the avatar is predicted; in order to improve the accuracy of the audio-driven facial animation, LipSync learns the mapping relationship between speech and phonemes using a neural network model. The input audio is converted into phonemes by the neural network model, each phoneme can correspond to a specific viseme, and the lip and facial gestures and expressions of the avatar are realized based on Unity integration technology.
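The phoneme-to-viseme step of such a pipeline can be sketched as below. In the real system the phoneme sequence comes from a neural network model; here both the phoneme labels (ARPAbet-style) and the viseme table are simplified assumptions.

```python
# Hypothetical, much-simplified phoneme -> viseme (mouth shape) table.
PHONEME_TO_VISEME = {
    "AA": "open_mouth",
    "IY": "wide_smile",
    "UW": "rounded_lips",
    "M":  "closed_lips",
    "F":  "teeth_on_lip",
}

def visemes_for_phonemes(phonemes):
    """Map each recognized phoneme to the mouth shape that animates the
    avatar's lips, with a resting shape for unknown phonemes."""
    return [PHONEME_TO_VISEME.get(p, "rest") for p in phonemes]

# e.g. audio for "ma" recognized as the phonemes ["M", "AA"]:
shapes = visemes_for_phonemes(["M", "AA"])
```

The resulting viseme sequence, time-stamped against the audio, is what drives the lip animation keyframes.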
Further, the mouth shape information and the expression information are combined to obtain facial motion information of the virtual image. It should be noted that, before the combination, it may be determined whether the mouth shape information and the expression information are matched, and in the case of matching, the mouth shape information and the expression information are combined. For example, when the mouth shape information is mouth opening and the corners of the mouth are raised, if the expression information is laughing, the mouth shape information and the expression information can be determined to be matched; if the expression information is crying, the expression information and the expression information are not matched.
In this manner, the expression information and other action information are determined from the display state information, and the mouth shape information is determined based on the text explanation audio; the avatar action information obtained by combining them therefore fits the content of the theme expansion text, enriching the content of the target display information and approaching a real interactive communication scene.
S240, generating target display information corresponding to the theme description information based on the virtual image, the image action information and the text explanation audio, and displaying the target display information, wherein the target display information at least comprises the text explanation audio of the virtual image aiming at the theme expansion text.
Specifically, the avatar and the avatar action information can be combined to obtain an animated figure corresponding to the avatar, and the animated figure and the text explanation audio are combined to obtain the target display information corresponding to the theme description information. In order to ensure that the sound, the avatar action information and the theme expansion text correspond to and match one another when the target display information is displayed, the matching relationship between different parts of the text explanation audio and the avatar action information can be determined; an alignment operation is then performed on the text explanation audio and the animated figure based on the matching relationship, and the aligned text explanation audio and animated figure are combined to obtain the target display information.
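The alignment operation can be sketched as pairing audio parts with animation clips and padding both to a common duration. The structure and timings are hypothetical; real alignment would operate on keyframes and audio samples.

```python
def align_audio_and_animation(audio_parts, animation_clips):
    """Pair each audio part with its matching animation clip and stretch or
    pad both tracks to the longer of the two durations."""
    aligned = []
    for audio, clip in zip(audio_parts, animation_clips):
        duration = max(audio["duration"], clip["duration"])
        aligned.append({
            "content": audio["content"],
            "duration": duration,  # both tracks are rendered at this length
        })
    return aligned

segments = align_audio_and_animation(
    [{"content": "intro", "duration": 5.0}],
    [{"content": "intro", "duration": 4.5}],
)
```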
According to the method, the target display information is obtained through the image action information, the virtual image and the text explanation audio, so that the target display information not only comprises the text explanation audio, but also can be matched with the virtual images with different shape actions, the target display information is more vivid and visual, and the user experience is improved.
In this embodiment, after displaying the target display information, the method further includes: and responding to the interactive triggering operation aiming at the target display information input, acquiring the interactive information, generating voice feedback information corresponding to the interactive information, and displaying the voice feedback information based on the virtual image.
After the target display information is displayed, in order to facilitate the user to carry out further communication and exchange on the target display information, the user experience is improved, and the interactive information can be acquired in response to the interactive triggering operation input on the target display information. The interactive triggering operation comprises at least one of a pause operation, a selection operation, a screenshot uploading operation, a voice output operation and a text input operation of the target display information.
After the interactive triggering operation is triggered, the interactive information corresponding to the interactive triggering operation can be obtained. The interactive information includes question information, answer information, and supplementary content information, for example. The voice feedback information for carrying out communication feedback on the interaction information can be determined. Illustratively, the voice feedback information includes answer information to the question, rating information to the answer, and extended supplementary information to the interactive information. The voice feedback information can be displayed according to the virtual sound of the virtual image and the virtual character corresponding to the target display information. Further, based on text content corresponding to the voice feedback information, the character action information corresponding to the virtual character can be determined, the character action information is combined with the virtual character, and the voice feedback information is displayed.
After the target display information is displayed, the acquired interactive information is subjected to voice feedback, so that the understanding of the target display information by the user is promoted, and the communication with the user is smoother.
Further, the method for generating the voice feedback information corresponding to the interaction information includes: and calling an objective function to generate voice feedback information corresponding to the interaction information by taking the interaction information as a reference.
The objective function is used for extracting interaction keywords corresponding to the interaction information and indicating a representation template corresponding to the voice feedback information corresponding to the interaction keywords. The target function is a registered calling function in a function calling mode of the GPT model. The interactive keywords are keywords obtained after keyword extraction is performed on the interactive information; the expression template is a template indicating the expression sequence and the expression rhythm of the voice feedback information.
Specifically, the interactive information can be used as an input parameter, and the interactive keywords in the interactive information are extracted by calling the objective function. And determining feedback texts corresponding to the interactive keywords based on the obtained interactive keywords, and generating voice feedback information corresponding to the feedback texts based on the feedback texts and the expression templates indicated by the objective functions.
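An objective function registered in a GPT-style function-calling mode is usually described by a JSON schema; the sketch below shows such a schema plus template rendering. The schema fields, function names and template are hypothetical, and a real system would send the schema to the model API rather than formatting the reply locally.

```python
# Hypothetical schema for the registered objective function: it asks the model
# to extract interaction keywords and indicate an expression template.
FEEDBACK_FUNCTION_SCHEMA = {
    "name": "generate_voice_feedback",
    "description": "Extract interaction keywords and pick an expression template.",
    "parameters": {
        "type": "object",
        "properties": {
            "interaction_keywords": {"type": "array", "items": {"type": "string"}},
            "expression_template": {"type": "string"},
        },
        "required": ["interaction_keywords", "expression_template"],
    },
}

def build_feedback(interaction_keywords, expression_template, feedback_text):
    """Render the feedback text into the expression template indicated by
    the objective function's output."""
    return expression_template.format(keywords=", ".join(interaction_keywords),
                                      answer=feedback_text)

message = build_feedback(["satellite", "orbit"],
                         "About {keywords}: {answer}",
                         "the satellite stays in orbit because ...")
```

The rendered `message` would then be synthesized into voice feedback information and spoken by the avatar.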
According to the embodiment, the user interaction requirement reflected in the interaction information can be known more accurately by calling the objective function, the voice feedback information corresponding to the interaction information is determined efficiently and accurately, the interaction information is fed back in time, and the user experience is improved.
The foregoing details of the embodiments corresponding to the information display method are described, so that, in order to make the technical solution of the method further clear to those skilled in the art, the technical solution is specifically described below in connection with an application scenario of generating courseware in the teaching process. Fig. 3 is a flowchart of another information display method provided by an embodiment of the present disclosure, and as shown in fig. 3, the method of the embodiment of the present disclosure may include the following steps:
1. The voice teaching information and/or text teaching information input by the user is received; when voice teaching information is received, text conversion is performed on it through the TTS service to obtain corresponding converted teaching information. The converted teaching information or the text teaching information is used as the theme description information.
2. And extracting keywords from the topic description information through the GPT model to obtain description keywords corresponding to the topic description information.
3. And inputting the description keywords into a pre-constructed deep learning model to obtain corresponding theme expansion text, and determining display state information of the theme expansion text, such as at least one of states of happiness, sadness, anger, surprise, fear, aversion and the like. Determining the expression and the action of the avatar through the display state information and the action capture editor (for example, AI Motion); and obtaining text explanation audio through the theme expansion text, determining a mouth shape corresponding to the text explanation audio based on the text explanation audio and lip synchronization (LipSync) technology, and combining mouth shape information and expressions to obtain facial action information. And, the theme extended text is converted into audio through the TTS service.
4. Generating image guidance text for answering the theme description information based on the description keyword, driving a text-to-picture service based on the image guidance text, and generating a theme expansion image.
5. The user end renders the avatar in the Unreal Engine by using digital human technology together with the facial action information and the other action information, and combines the theme expansion image, the theme expansion text and the text explanation audio to obtain the target display information, i.e., the courseware corresponding to the theme description information. The courseware is typeset based on the number of texts, the number of images and a preset typesetting template, and the typeset courseware is displayed.
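Steps 1-5 above can be condensed into one pipeline sketch. Every service here (keyword extraction, text expansion, TTS, text-to-image) is a hypothetical stub passed in as a callable, standing in for the real GPT, deep learning, TTS and text-to-image services.

```python
def generate_courseware(topic_description,
                        extract_keywords,   # step 2: GPT-style keyword extraction
                        expand_text,        # step 3: deep learning text expansion
                        synthesize_audio,   # step 3: TTS service
                        generate_image):    # step 4: text-to-image service
    """Run the courseware pipeline: keywords -> expansion text -> audio + image."""
    keywords = extract_keywords(topic_description)
    expansion_text = expand_text(keywords)
    return {
        "text": expansion_text,
        "audio": synthesize_audio(expansion_text),
        "image": generate_image(" ".join(keywords)),
    }

# Exercise the pipeline with stub services.
courseware = generate_courseware(
    "explain the idiom Jingwei filling the sea",
    lambda t: ["Jingwei", "sea", "idiom"],
    lambda ks: "Expanded lesson about " + ", ".join(ks),
    lambda txt: b"fake-wav-bytes",
    lambda prompt: "image:" + prompt,
)
```

Step 5 (avatar rendering and typesetting) would then consume this dictionary together with the avatar action information.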
According to the technical scheme of this embodiment, the description keywords in the theme description information are extracted by means of artificial intelligence, which ensures the accuracy of the description keywords and improves the richness and natural feel of the theme expansion text obtained based on them. In addition, this embodiment does not generate the theme expansion image with the traditional text-to-image scheme, which would yield weak relevance between image and text; instead, the image guidance text is generated based on the description keywords, and the theme expansion image is obtained based on the image guidance text, increasing the realism and accuracy of the theme expansion image. Furthermore, obtaining the mouth shape information through the lip synchronization technology allows the audio and mouth shapes to be aligned accurately, giving a better animation fusion effect.
Fig. 4 is a schematic structural diagram of an information display device according to an embodiment of the disclosure, as shown in fig. 4, where the device includes:
an information acquisition module 100, configured to acquire topic description information in response to an information presentation trigger operation;
an information expansion module 110, configured to generate theme expansion information based on the theme description information, where the theme expansion information includes at least a theme expansion text, and the theme expansion text is a text for describing expansion content associated with the theme description information;
an information display module 120, configured to generate target display information corresponding to the theme description information based on the theme expansion text, and to display the target display information, where the target display information at least includes text explanation audio of an avatar for the theme expansion text.
On the basis of the above-mentioned alternative solutions, optionally, the information expansion module 110 includes:
and the text determination sub-module is used for determining the topic description text corresponding to the topic description information, determining the description keyword corresponding to the topic description text and generating the topic expansion text based on the description keyword.
On the basis of the above-mentioned alternative solutions, optionally, the text determining submodule includes:
the text query unit is used for querying the theme expansion text corresponding to the description keyword through a preset query path; and/or,
and the keyword input unit is used for inputting the description keywords into a text generation model to obtain a theme expansion text corresponding to the description keywords, wherein the text generation model is a deep learning model trained based on the sample keywords and expected output texts corresponding to the sample keywords.
On the basis of the above-mentioned various optional technical solutions, optionally, the theme extension information further includes a theme extension image; the text determination submodule further includes:
And a guide text generation unit for generating an image guide text based on the description keywords and generating a theme expansion image corresponding to the theme description information based on the image guide text after determining the description keywords corresponding to the theme description text.
On the basis of the above-mentioned alternative solutions, optionally, the instruction text generating unit includes:
and the text explanation audio generation subunit is used for generating text explanation audio of the virtual image aiming at the theme expansion text based on the theme expansion text and generating target display information based on the text explanation audio and the theme expansion image.
On the basis of the above-mentioned alternative solutions, optionally, the information display module 120 includes:
the virtual image determining sub-module is used for determining virtual images corresponding to the theme expansion texts and generating text explanation audio and image action information corresponding to the virtual images based on the theme expansion texts;
and the target display information generation sub-module is used for generating target display information corresponding to the theme description information based on the virtual image, the image action information and the text explanation audio.
On the basis of the above-mentioned alternative solutions, optionally, the avatar action information includes facial action information and other action information, where the other action information is action information of other parts except the face of the avatar; an avatar determination sub-module comprising:
A display state information determining unit for determining display state information corresponding to the avatar based on the theme extended text, and determining expression information and other action information corresponding to the avatar based on the display state information;
and a mouth shape information generating unit for generating mouth shape information corresponding to the avatar based on the text interpretation audio, and generating face action information corresponding to the avatar based on the mouth shape information and the expression information.
On the basis of the above-mentioned alternative technical solutions, optionally, the topic description information is topic description text or topic description voice; the information acquisition module 100 includes:
the theme description text acquisition sub-module is used for acquiring the theme description text input in a preset theme editing box; or,
the theme description voice acquisition sub-module is used for acquiring the theme description voice acquired based on the preset sound acquisition control.
On the basis of the above-mentioned alternative solutions, optionally, the method further includes:
the interactive information acquisition module is used for responding to the interactive triggering operation input aiming at the target display information after the target display information is displayed, acquiring the interactive information, generating voice feedback information corresponding to the interactive information and displaying the voice feedback information based on the virtual image.
On the basis of the above-mentioned alternative technical solutions, optionally, the interactive information acquisition module includes:
the voice feedback information generation sub-module is used for calling an objective function to generate voice feedback information corresponding to the interaction information by taking the interaction information as a reference, wherein the objective function is used for extracting interaction keywords corresponding to the interaction information and indicating a representation template corresponding to the voice feedback information corresponding to the interaction keywords.
According to the above technical solution, theme description information is acquired in response to an information display triggering operation; theme extension information is generated based on the theme description information, where the theme extension information includes at least a theme extension text, i.e., a text describing extension content associated with the theme description information, so that the theme description information is further and comprehensively expanded through the theme extension text; target display information corresponding to the theme description information is then generated based on the theme extension text and displayed, where the target display information includes at least text explanation audio of the avatar for the theme extension text. In this way, the target display information not only presents the content corresponding to the theme description information itself but also presents content corresponding to the theme extension information, which better meets users' varied and rich requirements and allows the target display information to be determined accurately. Moreover, a large amount of alternative display information does not need to be predetermined, which solves the problem in the related art that preparing alternative information requires a heavy workload and yields limited content, resulting in poor accuracy of the displayed feedback information. In addition, presenting the target display information through the avatar makes the display more vivid, rich, and clear.
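The end-to-end flow summarized above (theme description → theme extension text → avatar explanation audio) can be sketched as follows. This is a minimal illustration only: every function name is a hypothetical placeholder, keyword extraction is deliberately naive, and speech synthesis is reduced to recording the text the avatar would speak.

```python
# Hypothetical end-to-end sketch of the flow summarized above. All names are
# illustrative placeholders, not part of the disclosed implementation.

def extract_keywords(theme_text: str) -> list[str]:
    # Naive keyword extraction: keep words longer than three characters.
    return [w.strip(".,?") for w in theme_text.split() if len(w.strip(".,?")) > 3]

def generate_extension_text(keywords: list[str]) -> str:
    # Placeholder for a preset query path or a text-generation model.
    return "Extended discussion of: " + ", ".join(keywords)

def present(theme_text: str) -> dict:
    keywords = extract_keywords(theme_text)
    extension = generate_extension_text(keywords)
    # A real system would synthesize explanation audio from `extension`;
    # here we only record which text would be spoken by the avatar.
    return {"extension_text": extension, "spoken_text": extension}

info = present("Solar eclipses and how they occur")
```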
It should be noted that each unit and module included in the above apparatus are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for convenience of distinguishing from each other, and are not used to limit the protection scope of the embodiments of the present disclosure.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring to fig. 5, a schematic configuration of an electronic device 200 (e.g., the terminal device or server in fig. 5) suitable for implementing embodiments of the present disclosure is shown. Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), and stationary terminals such as digital TVs and desktop computers. The electronic device shown in fig. 5 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 5, the electronic device 200 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 201, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage means 208 into a random access memory (RAM) 203. Various programs and data necessary for the operation of the electronic device 200 are also stored in the RAM 203. The processing means 201, the ROM 202, and the RAM 203 are connected to each other through a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
In general, the following devices may be connected to the I/O interface 205: input devices 206 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 207 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 208 including, for example, magnetic tape, hard disk, etc.; and a communication device 209. The communication means 209 may allow the electronic device 200 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 shows an electronic device 200 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 209, or from the storage means 208, or from the ROM 202. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 201.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The electronic device provided by the embodiment of the present disclosure and the information display method provided by the foregoing embodiment belong to the same inventive concept, and technical details not described in detail in the present embodiment may be referred to the foregoing embodiment, and the present embodiment has the same beneficial effects as the foregoing embodiment.
The present disclosure provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the information presentation method provided by the above embodiments.
It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: responding to the information display triggering operation, and acquiring theme description information; generating theme extension information based on the theme description information, wherein the theme extension information at least comprises theme extension text, and the theme extension text is text for describing extension content associated with the theme description information; and generating target display information corresponding to the theme description information based on the theme extension text, and displaying the target display information, wherein the target display information at least comprises text explanation audio of the theme extension text by an avatar.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, or a combination thereof, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the unit does not in any way constitute a limitation of the unit itself, for example the first acquisition unit may also be described as "unit acquiring at least two internet protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided an information presentation method, including:
responding to the information display triggering operation, and acquiring theme description information;
generating topic extension information based on the topic description information, wherein the topic extension information at least comprises topic extension text, and the topic extension text is text for describing extension content associated with the topic description information;
and generating target display information corresponding to the theme description information based on the theme extension text, and displaying the target display information, wherein the target display information at least comprises text explanation audio of the avatar aiming at the theme extension text.
According to one or more embodiments of the present disclosure, there is provided a method of example one, further comprising:
in some alternative implementations, a topic description text corresponding to the topic description information is determined, and a description keyword corresponding to the topic description text is determined, and a topic extension text is generated based on the description keyword.
According to one or more embodiments of the present disclosure, there is provided a method of example one, further comprising:
in some optional implementations, the topic extension text corresponding to the description keyword is queried through a preset query path; and/or,
And inputting the description keywords into a text generation model to obtain a theme expansion text corresponding to the description keywords, wherein the text generation model is a deep learning model trained based on the sample keywords and expected output texts corresponding to the sample keywords.
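The two generation paths just described, a preset query path and/or a trained text generation model, might be combined roughly as follows. The lookup table contents and the model stub are invented for illustration; a real system would query an actual knowledge source and invoke a trained deep-learning model.

```python
# Hypothetical illustration of the two generation paths described above.
# Table contents and the model stub are invented assumptions.

PRESET_QUERY_PATH = {
    "eclipse": "An eclipse occurs when one celestial body blocks light from another.",
}

def model_generate(keyword: str) -> str:
    # Stand-in for a deep-learning text generation model trained on
    # sample keywords and their expected output texts.
    return f"Generated explanation for '{keyword}'."

def topic_extension_text(keyword: str) -> str:
    # Prefer the preset query path; otherwise fall back to the model.
    return PRESET_QUERY_PATH.get(keyword) or model_generate(keyword)
```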
According to one or more embodiments of the present disclosure, example four provides the method of example one, further comprising:
in some alternative implementations, image guidance text is generated based on the descriptive keywords, and a theme extension image corresponding to the theme descriptive information is generated based on the image guidance text.
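Generating a theme extension image from description keywords, as described above, could be sketched like this. The prompt format is an assumption, and the text-to-image backend is replaced by a stub that merely records the guidance text it would receive.

```python
# Hypothetical sketch of image guidance text -> theme extension image.
# The prompt wording and the record returned are invented assumptions.

def build_image_guidance_text(keywords: list[str]) -> str:
    # Turn description keywords into a guidance (prompt) sentence.
    return "An illustrative image showing " + ", ".join(keywords) + "."

def generate_theme_extension_image(keywords: list[str]) -> dict:
    guidance = build_image_guidance_text(keywords)
    # A real system would pass `guidance` to a text-to-image model; here we
    # return a record describing the request instead of pixel data.
    return {"guidance_text": guidance, "image": None}
```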
According to one or more embodiments of the present disclosure, example five provides the method of example one, further comprising:
in some alternative implementations, text explanation audio of the avatar for the theme extended text is generated based on the theme extended text, and the target presentation information is generated based on the text explanation audio and the theme extended image.
According to one or more embodiments of the present disclosure, example six provides the method of example one, further comprising:
in some alternative implementations, an avatar corresponding to the theme extended text is determined, and text interpretation audio and avatar action information corresponding to the avatar are generated based on the theme extended text;
And generating target display information corresponding to the theme description information based on the avatar, the avatar action information and the text explanation audio.
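Assembling target display information from the avatar, its action information, and the text explanation audio might look like the following minimal sketch; the field names are illustrative assumptions, not part of the disclosure.

```python
# Minimal sketch of bundling avatar, action information, and explanation
# audio into one target-display record. Field names are hypothetical.

from dataclasses import dataclass

@dataclass
class TargetDisplayInfo:
    avatar_id: str            # which virtual image to render
    action_frames: list       # image action information over time
    explanation_audio: bytes  # synthesized text-explanation audio

def assemble_display_info(avatar_id: str, action_frames: list, audio: bytes) -> TargetDisplayInfo:
    # Bundle the three inputs together for the renderer.
    return TargetDisplayInfo(avatar_id, action_frames, audio)

info = assemble_display_info("host_01", ["wave", "nod"], b"\x00\x01")
```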
According to one or more embodiments of the present disclosure, example seven provides the method of example one, further comprising:
in some alternative implementations, presentation state information corresponding to the avatar is determined based on the theme extended text, and expression information and other action information corresponding to the avatar are determined based on the presentation state information;
mouth shape information corresponding to the avatar is generated based on the text interpretation audio, and face motion information corresponding to the avatar is generated based on the mouth shape information and the expression information.
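One plausible (hypothetical) reading of the mouth-shape step above is a phoneme-to-viseme mapping combined per frame with the expression information; the mapping table and data shapes below are invented, and a real system would first align phonemes to the explanation audio.

```python
# Hypothetical viseme mapping for the mouth-shape step. The table and the
# per-frame record format are invented assumptions for illustration.

PHONEME_TO_VISEME = {
    "a": "wide_open",
    "o": "round_open",
    "m": "closed",
    "f": "lower_lip_bite",
}

def mouth_shapes(phonemes: list[str]) -> list[str]:
    # Default to a neutral viseme for unmapped phonemes.
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

def facial_action(phonemes: list[str], expression: str) -> list[dict]:
    # Combine each frame's mouth shape with the overall expression.
    return [{"viseme": v, "expression": expression} for v in mouth_shapes(phonemes)]

frames = facial_action(["a", "m", "z"], "smile")
```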
According to one or more embodiments of the present disclosure, there is provided a method of example one, further comprising:
in some optional implementations, obtaining a theme description text input in a preset theme editing box; or,
and acquiring the theme description voice acquired based on the preset sound acquisition control.
According to one or more embodiments of the present disclosure, there is provided a method of example one, further comprising:
in some alternative implementations, in response to an interactive trigger operation input for the target presentation information, interactive information is acquired, voice feedback information corresponding to the interactive information is generated, and the voice feedback information is presented based on the avatar.
According to one or more embodiments of the present disclosure, there is provided a method of example one, further comprising:
in some optional implementations, the interactive information is taken as a reference, and an objective function is called to generate voice feedback information corresponding to the interactive information, wherein the objective function is used for extracting an interactive keyword corresponding to the interactive information and indicating a representation template corresponding to the voice feedback information corresponding to the interactive keyword.
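The objective function described above, which extracts an interaction keyword and then fills a representation template tied to that keyword, can be sketched as follows. The keyword list, templates, and fallback behavior are all invented for illustration.

```python
# Hypothetical sketch of the objective function: keyword extraction plus
# template selection. All keywords and template strings are invented.

KNOWN_KEYWORDS = ["price", "time", "weather"]
TEMPLATES = {
    "price": "The {kw} details are as follows.",
    "time": "Here is the schedule related to {kw}.",
    "weather": "Let me tell you about the {kw} conditions.",
}
DEFAULT_TEMPLATE = "Here is some information about {kw}."

def objective_function(interaction_text: str) -> str:
    # Extract the first known keyword appearing in the interaction text.
    kw = next((k for k in KNOWN_KEYWORDS if k in interaction_text.lower()), None)
    if kw is None:
        # No keyword found: fall back to echoing the whole request.
        return DEFAULT_TEMPLATE.format(kw=interaction_text.strip())
    return TEMPLATES[kw].format(kw=kw)

reply = objective_function("What is the price today?")
```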
According to one or more embodiments of the present disclosure, there is provided an information presentation apparatus, including:
the information acquisition module is used for responding to the information display triggering operation and acquiring the theme description information;
the information expansion module is used for generating theme expansion information based on the theme description information, wherein the theme expansion information at least comprises a theme expansion text, and the theme expansion text is a text for describing expansion content associated with the theme description information;
and the information display module is used for generating target display information corresponding to the theme description information based on the theme expansion text and displaying the target display information, wherein the target display information at least comprises text explanation audio of the virtual image aiming at the theme expansion text.
The foregoing description is merely of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the disclosure involved herein is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by mutually substituting the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (13)

1. An information display method, comprising:
responding to the information display triggering operation, and acquiring theme description information;
generating theme extension information based on the theme description information, wherein the theme extension information at least comprises theme extension text, and the theme extension text is text for describing extension content associated with the theme description information;
and generating target display information corresponding to the theme description information based on the theme extension text, and displaying the target display information, wherein the target display information at least comprises text explanation audio of the theme extension text by an avatar.
2. The information presentation method according to claim 1, wherein the generating theme extension information based on the theme description information includes:
And determining a topic description text corresponding to the topic description information, determining a description keyword corresponding to the topic description text, and generating a topic expansion text based on the description keyword.
3. The information presentation method according to claim 2, wherein the generating the subject expanded text based on the description keyword includes:
inquiring a theme expansion text corresponding to the description keyword through a preset inquiry path; and/or,
and inputting the description keywords into a text generation model to obtain a theme expansion text corresponding to the description keywords, wherein the text generation model is a deep learning model trained based on sample keywords and expected output texts corresponding to the sample keywords.
4. The information presentation method of claim 2, wherein the theme extension information further includes a theme extension image; after determining the description keywords corresponding to the topic description text, the method further comprises the following steps:
generating an image guidance text based on the description keywords, and generating a theme expansion image corresponding to the theme description information based on the image guidance text.
5. The information presentation method according to claim 4, wherein the generating target presentation information corresponding to the topic description information based on the topic extension text includes:
generating text explanation audio of the virtual image aiming at the theme expansion text based on the theme expansion text, and generating target display information based on the text explanation audio and the theme expansion image.
6. The information presentation method according to claim 1, wherein the generating target presentation information corresponding to the topic description information based on the topic extension text includes:
determining an avatar corresponding to the theme expansion text, and generating text explanation audio and avatar action information corresponding to the avatar based on the theme expansion text;
and generating target display information corresponding to the theme description information based on the virtual image, the image action information and the text explanation audio.
7. The information presentation method of claim 6, wherein the avatar motion information includes facial motion information and other motion information, the other motion information being motion information of parts of the avatar other than the face; the generating avatar action information corresponding to the avatar based on the theme extended text includes:
Determining display state information corresponding to the avatar based on the theme extended text, and determining expression information and other action information corresponding to the avatar based on the display state information;
and generating mouth shape information corresponding to the virtual image based on the text explanation audio, and generating face action information corresponding to the virtual image based on the mouth shape information and the expression information.
8. The information presentation method according to claim 1, wherein the topic description information is topic description text or topic description voice; the obtaining the topic description information comprises the following steps:
acquiring a theme description text input in a preset theme editing box; or,
and acquiring the theme description voice acquired based on the preset sound acquisition control.
9. The information presentation method according to claim 1, further comprising, after the presenting the target presentation information:
and responding to the interactive triggering operation aiming at the target display information input, acquiring interactive information, generating voice feedback information corresponding to the interactive information, and displaying the voice feedback information based on the virtual image.
10. The information presentation method of claim 9, wherein the generating the voice feedback information corresponding to the interactive information includes:
and calling an objective function to generate voice feedback information corresponding to the interaction information by taking the interaction information as a reference, wherein the objective function is used for extracting an interaction keyword corresponding to the interaction information and indicating a representation template corresponding to the voice feedback information corresponding to the interaction keyword.
11. An information display device, comprising:
the information acquisition module is used for responding to the information display triggering operation and acquiring the theme description information;
the information expansion module is used for generating theme expansion information based on the theme description information, wherein the theme expansion information at least comprises theme expansion texts, and the theme expansion texts are texts for describing expansion contents associated with the theme description information;
and the information display module is used for generating target display information corresponding to the theme description information based on the theme expansion text and displaying the target display information, wherein the target display information at least comprises text explanation audio of the theme expansion text by an avatar.
12. An electronic device, the electronic device comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the information presentation method of any of claims 1-10.
13. A storage medium containing computer executable instructions which, when executed by a computer processor, are for performing the information presentation method of any of claims 1-10.
CN202311474663.2A 2023-11-07 2023-11-07 Information display method and device, electronic equipment and storage medium Pending CN117520502A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311474663.2A CN117520502A (en) 2023-11-07 2023-11-07 Information display method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311474663.2A CN117520502A (en) 2023-11-07 2023-11-07 Information display method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117520502A true CN117520502A (en) 2024-02-06

Family

ID=89759980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311474663.2A Pending CN117520502A (en) 2023-11-07 2023-11-07 Information display method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117520502A (en)

Similar Documents

Publication Publication Date Title
CN110033659B (en) Remote teaching interaction method, server, terminal and system
WO2021114881A1 (en) Intelligent commentary generation method, apparatus and device, intelligent commentary playback method, apparatus and device, and computer storage medium
US10249207B2 (en) Educational teaching system and method utilizing interactive avatars with learning manager and authoring manager functions
US20230042654A1 (en) Action synchronization for target object
Cole et al. Perceptive animated interfaces: First steps toward a new paradigm for human-computer interaction
CN105632251B (en) 3D virtual teacher system and method with phonetic function
US11871109B2 (en) Interactive application adapted for use by multiple users via a distributed computer-based system
CN111541908A (en) Interaction method, device, equipment and storage medium
WO2022170848A1 (en) Human-computer interaction method, apparatus and system, electronic device and computer medium
CN111414506B (en) Emotion processing method and device based on artificial intelligence, electronic equipment and storage medium
KR20220129989A (en) Avatar-based interaction service method and apparatus
CN110046290B (en) Personalized autonomous teaching course system
CN112232066A (en) Teaching outline generation method and device, storage medium and electronic equipment
US20200027364A1 (en) Utilizing machine learning models to automatically provide connected learning support and services
Dhiman Artificial Intelligence and Voice Assistant in Media Studies: A Critical Review
CN113205569A (en) Image drawing method and device, computer readable medium and electronic device
CN115442495A (en) AI studio system
CN117520502A (en) Information display method and device, electronic equipment and storage medium
Divekar AI enabled foreign language immersion: Technology and method to acquire foreign languages with AI in immersive virtual worlds
CN113253838A (en) AR-based video teaching method and electronic equipment
WO2022196880A1 (en) Avatar-based interaction service method and device
CN109993671B (en) EAOG-based autonomous teaching design method and system applied to autonomous teaching
CN117591660B (en) Material generation method, equipment and medium based on digital person
Zikky et al. Utilizing Virtual Humans as Campus Virtual Receptionists
WO2023065963A1 (en) Interactive display method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination