WO2022121601A1 - A live broadcast interaction method, apparatus, device, and medium

A live broadcast interaction method, apparatus, device, and medium

Info

Publication number: WO2022121601A1
Authority: WO (WIPO, PCT)
Prior art keywords: live broadcast, scene, virtual object, information, data
Application number: PCT/CN2021/129508
Other languages: English (en), French (fr)
Inventor: 杨贺
Original Assignee: 北京字跳网络技术有限公司 (Beijing Zitiao Network Technology Co., Ltd.)
Application filed by 北京字跳网络技术有限公司
Priority application: JP2023534896A (published as JP2023553101A)
Publication of WO2022121601A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/21 Server components or server architectures
    • H04N 21/218 Source of audio or video content, e.g. local disk arrays
    • H04N 21/2187 Live feed
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/239 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H04N 21/2393 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests involving handling client requests
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N 21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N 21/4316 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N 21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F 2203/01 Indexing scheme relating to G06F3/01
    • G06F 2203/012 Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Definitions

  • The present disclosure relates to the field of live broadcast technology, and in particular, to a live broadcast interaction method, apparatus, device, and medium.
  • virtual objects can be used to replace live broadcasters for live broadcasts.
  • However, such virtual objects can usually only broadcast according to preset content; the audience can only watch passively and cannot decide what to watch, so the live broadcast effect is poor.
  • the present disclosure provides a live interactive method, apparatus, device and medium.
  • An embodiment of the present disclosure provides a live broadcast interaction method, which is applied to multiple viewer terminals entering a live broadcast room of a virtual object and includes: playing the video content of the virtual object in a first live broadcast scene on a live broadcast interface, and displaying interaction information from the multiple viewer terminals; and, in response to the interaction information meeting a trigger condition, playing the video content of the virtual object in a second live broadcast scene on the live broadcast interface, where the live broadcast scene is used to represent the live broadcast content type of the virtual object.
  • Embodiments of the present disclosure also provide a live broadcast interaction method, which is applied to a server and includes: receiving interaction information of multiple viewer terminals in the first live broadcast scene, and determining, based on the interaction information, whether a trigger condition for live broadcast scene switching is satisfied; and, if the trigger condition is satisfied, sending second video data corresponding to the second live broadcast scene to the multiple viewer terminals, where the live broadcast scene is used to represent the live broadcast content type of the virtual object in the live broadcast room.
  • Embodiments of the present disclosure also provide a live broadcast interaction apparatus, which is arranged on multiple viewer terminals entering the live broadcast room of a virtual object and includes:
  • a first live broadcast module configured to play the video content of the virtual object in the first live broadcast scene on the live broadcast interface, and display interactive information from the plurality of viewer terminals;
  • a second live broadcast module configured to play the video content of the virtual object in the second live broadcast scene on the live broadcast interface in response to the interaction information meeting a trigger condition, where the live broadcast scene is used to represent the live broadcast content type of the virtual object.
  • Embodiments of the present disclosure also provide a live broadcast interaction apparatus, which is arranged on a server and includes:
  • an information receiving module configured to receive interaction information of multiple viewer terminals in the first live broadcast scene, and determine whether a trigger condition for live broadcast scene switching is satisfied based on the interaction information;
  • a data sending module configured to send the second video data corresponding to the second live broadcast scene to the plurality of viewer terminals if the trigger condition is satisfied, wherein the live broadcast scene is used to represent the live broadcast content type of the virtual object in the live broadcast room.
  • An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to read the executable instructions from the memory and execute the instructions to implement the live broadcast interaction method provided by the embodiments of the present disclosure.
  • An embodiment of the present disclosure further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute the live interaction method provided by the embodiment of the present disclosure.
  • The technical solutions provided by the embodiments of the present disclosure have the following advantages: multiple viewer terminals entering the live broadcast room of a virtual object can play, on the live broadcast interface, the video content of the virtual object in the first live broadcast scene and display interaction information from the multiple viewer terminals; in response to the interaction information meeting the trigger condition, the video content of the virtual object in the second live broadcast scene is played on the live broadcast interface, where the live broadcast scene is used to represent the live broadcast content type of the virtual object. In this way, the virtual object can switch from broadcasting in the first live broadcast scene to broadcasting in the second live broadcast scene based on the audience's interaction information, realizing interaction between the virtual object and the audience in different live broadcast scenes, meeting the audience's varied interactive needs, improving the diversity and interest of virtual object live broadcasts, and enhancing the audience's interactive experience.
  • FIG. 1 is a schematic flowchart of a live broadcast interaction method according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a live broadcast interaction provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of another live broadcast interaction provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of another live interaction method according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a live interactive device according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of another live interactive device according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • the term “including” and variations thereof are open-ended inclusions, ie, “including but not limited to”.
  • the term “based on” is “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flowchart of a live interactive method according to an embodiment of the present disclosure.
  • The method can be executed by a live broadcast interaction apparatus, where the apparatus can be implemented by software and/or hardware and can generally be integrated into an electronic device. As shown in FIG. 1, the method is applied to multiple viewer terminals entering the live broadcast room of a virtual object and includes:
  • Step 101: Play the video content of the virtual object in the first live broadcast scene on the live broadcast interface, and display interactive information from multiple viewer terminals.
  • The virtual object can be a three-dimensional model pre-created based on artificial intelligence (AI) technology and set as a digital object controllable by a computer; the body movements and facial information of a real person can be captured through a motion capture device and a face capture device and used to drive the virtual object.
  • the specific types of virtual objects may include multiple types, and different virtual objects may have different appearances.
  • the virtual objects may specifically be virtual animals or virtual characters of different styles.
  • Through the combination of artificial intelligence technology and live video technology, virtual objects can replace real people in video live broadcasts.
  • the live interface refers to a page in the live room for displaying virtual objects, and the page may be a web page or a page in an application client.
  • The live broadcast scene is used to characterize the type of live broadcast content of the virtual object, and the virtual object may have a variety of live broadcast scenes, such as a scene of performing multimedia resources and a scene of replying to interaction information; the multimedia resources may include books to be read, songs to be sung, painting topics, and the like, which are not limited in detail.
  • the first live broadcast scene is a live broadcast scene in which a virtual object performs a multimedia resource
  • Playing the video content of the virtual object in the first live broadcast scene on the live broadcast interface may include: displaying, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed; and playing the video content of the virtual object performing a target multimedia resource, where the target multimedia resource is determined based on the trigger information for the multiple multimedia resources from the multiple viewer terminals.
  • the multimedia resource information to be performed may include books to be read, songs to be sung, and painting topics to be painted.
  • the first area is an area set in the live broadcast interface for displaying multimedia resource information of the multimedia resource to be performed, and supports the audience's triggering operation on the multimedia resource.
  • the trigger operation includes one or more of single click, double click, slide and voice command.
  • the terminal may receive multimedia resource information of multiple multimedia resources to be performed sent by the server, and display the multimedia resource information in the first area of the live broadcast interface.
  • Each terminal sends the trigger information about the multimedia resource from the viewer to the server, and the server can determine the target multimedia resource from multiple multimedia resources according to the trigger information.
  • the terminal can receive the video data of the target multimedia resource delivered by the server, and play the video content of the virtual object performing the multimedia resource on the live interface based on the video data.
  • In this way, the virtual object performs multimedia resources according to the audience's selection, and the audience can decide what to watch, which improves the audience's participation and further improves the live broadcast effect of the virtual object.
  • Playing the video content of the virtual object in the first live broadcast scene on the live broadcast interface may include: receiving first video data corresponding to the first live broadcast scene, where the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the first audio data matches the target multimedia resource; and playing, on the live broadcast interface based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene.
  • the first video data refers to data preconfigured by the server for implementing the virtual object to perform live broadcast in the first live broadcast scene
  • the first video data may include first scene data, first action data, and first audio data.
  • The background picture of the live broadcast room may include the background scene of the virtual object in the first live broadcast scene and the screen viewing angle; different live broadcast scenes may correspond to different background scenes and/or different display orientations.
  • the first motion data may be used to generate facial expressions and body movements of the virtual object in the first live broadcast scene.
  • the audio data matches the target multimedia resource among the multiple multimedia resources. For example, when the target multimedia resource is a singing song, the audio data is the audio of the singing song.
  • After detecting the viewer's trigger operation, the terminal can obtain the first video data corresponding to the first live broadcast scene from the server, generate the corresponding video content by decoding the first video data, and play, on the live broadcast interface, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene.
  • the terminal may receive multiple pieces of interactive information from multiple live viewers, and display the multiple pieces of interactive information on the live broadcast interface. The display position can be set according to the actual situation.
  • The background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes.
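  • As a concrete illustration (not part of the patent text), the following is a minimal Python sketch of how a viewer terminal might combine the three parts of the first video data into playable content; the data layout and all names (VideoData, play, and the string-based frames) are assumptions for the example, since the patent does not specify a payload format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VideoData:
    scene: str          # first scene data: background picture of the live broadcast room
    actions: List[str]  # first action data: per-frame expressions / body movements
    audio: List[str]    # first audio data: chunks matching the target multimedia resource

def play(video: VideoData) -> None:
    """Composite the virtual object's actions onto the scene background,
    frame by frame, while playing the matching audio chunk."""
    for pose, chunk in zip(video.actions, video.audio):
        # A real client would render the pose over the background and feed
        # the audio chunk to the speaker; printing stands in for that here.
        print(f"[scene={video.scene}] pose={pose} audio={chunk}")

play(VideoData(scene="reading_room",
               actions=["open_book", "turn_page"],
               audio=["chunk_0", "chunk_1"]))
```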
  • FIG. 2 is a schematic diagram of a live broadcast interaction provided by an embodiment of the present disclosure.
  • the figure shows a live broadcast interface of a virtual object 11 in a first live broadcast scene.
  • The figure shows a live picture of the virtual object 11 narrating a book; an electronic reader is placed in front of the virtual object 11, indicating that the virtual object 11 is narrating the book.
  • The upper left corner of the live broadcast interface in FIG. 2 also displays the avatar and name of the virtual object 11, which is named "Little A", as well as a follow button 12.
  • The bottom of the live broadcast interface in FIG. 2 also shows the interactive information sent by different users watching the virtual object's live broadcast, such as "This story is awesome" sent by user A (viewer A), the message sent by user B (viewer B), and "I'm coming to you" sent by user C (viewer C).
  • the bottom of the live broadcast interface also shows the editing area 13 for the current user to send interactive information and other function buttons, such as the selection button 14, the interactive button 15, and the activity and reward button 16 in the figure. Different function buttons have different functions.
  • Step 102: In response to the interactive information satisfying the trigger condition, play the video content of the virtual object in the second live broadcast scene on the live broadcast interface, where the live broadcast scene is used to represent the live broadcast content type of the virtual object.
  • the trigger condition refers to a condition for determining whether to switch the live broadcast scene based on the interactive information of the audience.
  • The trigger condition may include at least one of the following: the amount of interaction information reaches a preset threshold; the interaction information includes a first keyword; the number of second keywords in the interaction information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset marker point.
  • the above-mentioned preset threshold, the first keyword, the second keyword, the keyword threshold, the preset duration, and the preset marking point can all be set according to actual conditions.
  • playing the video content of the virtual object in the second live broadcast scene on the live broadcast interface includes: playing the video content of the virtual object replying to the interactive information on the live broadcast interface.
  • the second live broadcast scene is different from the above-mentioned first live broadcast scene, and refers to a live broadcast scene in which the virtual object replies to interactive information.
  • The terminal may receive reply audio data corresponding to one or more pieces of interaction information, jointly generate the reply video content based on the reply audio data and the second scene data and second action data of the virtual object in the second live broadcast scene, and play, on the live broadcast interface, the video content of the virtual object replying to the interaction information.
  • the virtual object replies to the target interactive information in the interactive information;
  • The live broadcast interaction method may further include: displaying, in a second area of the live broadcast interface, the target interaction information and the text information replying to the target interaction information.
  • the target interactive information is one or more pieces of interactive information that the server determines based on the preset scheme to be responded to from the multiple interactive information sent by the live audience.
  • The preset scheme can be set according to the actual situation. For example, the target interaction information may be determined based on the points of the live audience members who sent the interaction information; or interaction information matching preset keywords may be selected, where the preset keywords can be mined and extracted in advance from trending information or can be keywords related to the live broadcast content; or semantic recognition may be performed on the interaction information, and interaction information with similar meanings may be clustered to obtain several information sets, where the set containing the most interaction information represents the hottest topic of live audience interaction, and the interaction information corresponding to that set is used as the target interaction information.
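  • A minimal sketch of the clustering strategy just described, assuming word-overlap (Jaccard) similarity as a stand-in for the unspecified semantic-recognition step; the function names and the 0.5 threshold are illustrative only. The largest cluster is treated as the hottest topic, and one of its messages as the target interaction information.

```python
from typing import List

def similar(a: str, b: str, threshold: float = 0.5) -> bool:
    # Word-overlap similarity; a real system would use semantic recognition.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) >= threshold

def pick_target(messages: List[str], threshold: float = 0.5) -> str:
    clusters: List[List[str]] = []
    for msg in messages:
        for cluster in clusters:
            if similar(msg, cluster[0], threshold):
                cluster.append(msg)
                break
        else:
            clusters.append([msg])
    # The cluster holding the most interaction information is the hottest topic.
    return max(clusters, key=len)[0]

print(pick_target(["sing a song", "please sing a song", "hello"]))  # -> "sing a song"
```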
  • the text information for replying to the target interaction information refers to the reply text information determined by the server based on the corpus that matches the target interaction information.
  • the terminal may receive text information replying to the target interactive information, and display the target interactive information and the text information replying to the target interactive information in the second area of the live broadcast interface.
  • In the second live broadcast scene, the terminal can play, on the live broadcast interface, the video content of the virtual object replying to the interaction information, and display the current interaction information and the corresponding reply text information, so that the audience can know which viewer's interactive content the virtual object is replying to; this further enhances the depth of interaction between the audience and the virtual object and improves the interaction experience.
  • Playing the video content of the virtual object in the second live broadcast scene on the live broadcast interface may include: receiving second multimedia data corresponding to the second live broadcast scene, where the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interaction information; and playing, on the live broadcast interface based on the second multimedia data, the video content of the virtual object replying to the target interaction information in the second live broadcast scene.
  • the second video data refers to the data pre-configured by the server for implementing the virtual object to perform live broadcast in the second live broadcast scene.
  • the second video data may include the second scene data, the second action data and the second audio data.
  • The meanings of these fields are similar to those of the corresponding fields in the above-mentioned first video data and are not described in detail here; the difference is that the specific video data in the first live broadcast scene and the second live broadcast scene are different.
  • When the server determines, based on the interaction information, that the trigger condition is satisfied, it may send the second video data corresponding to the second live broadcast scene to the terminal.
  • After receiving the second video data, the terminal can generate the corresponding video content by decoding the second video data and play, on the live broadcast interface, the video content in which the virtual object replies to the target interaction information in the second live broadcast scene.
  • In the process of playing the video content in which the virtual object replies to the target interaction information in the second live broadcast scene, the terminal may also display interaction information from the multiple viewer terminals.
  • The background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes, and they can differ from the background pictures and actions in the first live broadcast scene.
  • FIG. 3 is a schematic diagram of another live broadcast interaction provided by an embodiment of the present disclosure.
  • the figure shows a live broadcast screen of the virtual object 11 in the process of replying to the interactive information in the second live broadcast scene.
  • Unlike FIG. 2, there is no electronic reader in front of the virtual object 11.
  • The bottom of the live broadcast interface also shows the interactive information sent by different users during the live chat, such as "I miss you" sent by user A (viewer A), "Hello" sent by user B (viewer B), and "let's chat" sent by user C (viewer C).
  • A second area 17 is also displayed on the live broadcast page in FIG. 3. The second area 17 can contain the interaction information of the current audience and the text of the virtual object's reply, so that the audience can know which viewer's interactive content the virtual object is replying to.
  • the interactive information in the figure is "let's chat for a while" sent by audience C, and the reply text of the virtual object is "it's too late, let's chat tomorrow".
  • the reply text corresponds to the reply audio data, which is consistent with the speech content of the virtual object when replying.
  • The actions of the virtual object 11 in FIG. 2 and FIG. 3 are different: in the first live broadcast scene of FIG. 2, the virtual object 11 rests its cheek on its left hand, while in the second live broadcast scene of FIG. 3, its left hand is raised and its right hand rests under its chin.
  • the above-mentioned first live broadcast scene is a live broadcast scene in which a virtual object performs multimedia resources
  • the second live broadcast scene is a live broadcast scene in which the virtual object replies to interactive information.
  • The settings of the first live broadcast scene and the second live broadcast scene can also be swapped; that is, the first live broadcast scene may be a live broadcast scene in which the virtual object replies to interaction information, and the second live broadcast scene may be a live broadcast scene in which the virtual object performs multimedia resources, which is not specifically limited.
  • the first live broadcast scene and the second live broadcast scene can be continuously alternated, so that the live broadcast scene of the virtual object is constantly switched.
  • In this way, the live broadcast of virtual objects in different live broadcast scenes can be realized, the live broadcast scenes can be switched according to the audience's selection, and the background pictures and actions of the virtual object in the live broadcast room can differ between live broadcast scenes, which satisfies the audience's varied interactive needs.
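  • The alternation between the two scenes can be pictured as a simple two-state toggle, as in the minimal sketch below; the scene names are illustrative, and a real system would drive the switch from the trigger-condition check described above rather than a fixed loop.

```python
# Two illustrative live broadcast scenes that alternate on each trigger event.
SCENES = ("perform_multimedia", "reply_interaction")

def next_scene(current: str) -> str:
    # Toggle between the first and second live broadcast scenes.
    return SCENES[1] if current == SCENES[0] else SCENES[0]

scene = "perform_multimedia"
for _ in range(3):          # three trigger events in a row
    scene = next_scene(scene)
    print("switched to:", scene)
```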
  • In the embodiments of the present disclosure, multiple viewer terminals entering the live broadcast room of the virtual object can play the video content of the virtual object in the first live broadcast scene on the live broadcast interface and display the interaction information from the multiple viewer terminals; in response to the interaction information meeting the trigger condition, the video content of the virtual object in the second live broadcast scene is played on the live broadcast interface, where the live broadcast scene is used to represent the live broadcast content type of the virtual object. The virtual object can thus switch from broadcasting in the first live broadcast scene to broadcasting in the second live broadcast scene based on the audience's interaction information, which realizes interaction between the virtual object and the audience in different live broadcast scenes, meets the audience's varied interactive needs, improves the diversity and interest of virtual object live broadcasts, and thereby improves the audience's interactive experience.
  • FIG. 4 is a schematic flowchart of another live broadcast interaction method provided by an embodiment of the present disclosure. On the basis of the foregoing embodiment, this embodiment further optimizes the foregoing live broadcast interaction method. As shown in FIG. 4, the method is applied to the server and includes:
  • Step 201: Receive interactive information of multiple viewer terminals in the first live broadcast scene, and determine whether the trigger condition for live broadcast scene switching is satisfied based on the interactive information.
  • the live broadcast scene is a scene used to characterize the type of live broadcast content of the virtual object, and the live broadcast scene of the virtual object may include multiple types.
  • For example, the live broadcast scene may include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interaction information;
  • the multimedia resources can include reading books, singing songs and painting topics, etc., which are not limited.
  • the interactive information refers to interactive text information sent by multiple viewers watching the live broadcast in the first live broadcast scenario through the terminal.
  • the server may receive interactive information sent by multiple viewer terminals in the first live broadcast scene, and determine whether the trigger condition for live broadcast scene switching is satisfied based on the interactive information and/or relevant information of the first live broadcast scene.
  • the trigger conditions may include that the number of interactive information reaches a preset threshold, the interactive information includes a first keyword, the number of second keywords in the interactive information reaches the keyword threshold, and the duration of the first live broadcast scene reaches a preset threshold At least one of the duration and the first live broadcast scene reaching a preset marker point.
  • the above-mentioned preset threshold, the first keyword, the second keyword, the keyword threshold, the preset duration, and the preset marking point can all be set according to actual conditions.
  • the first live broadcast scene is a live broadcast scene in which a virtual object performs a multimedia resource
  • In some embodiments, the live broadcast interaction method may further include: searching an audio database for first audio data matching the target multimedia resource, and searching a virtual object action database for first action data corresponding to the target multimedia resource, where the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene; determining first scene data based on the scene identifier of the first live broadcast scene, where the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene; combining the first action data, the first audio data, and the first scene data into the first video data corresponding to the first live broadcast scene; and sending the first video data to the multiple viewer terminals.
  • the audio database and the virtual object action database may be preset databases.
  • the target multimedia resource is one of multiple multimedia resources.
  • the scene identifier refers to an identifier used to distinguish different live broadcast scenes, and the server can set corresponding scene data for different live broadcast scenes in advance.
  • The server can search the audio database and the virtual object action database to determine the first audio data and first action data matching the target multimedia resource, and determine the corresponding first scene data based on the scene identifier of the first live broadcast scene; the server can then combine the first action data, the first audio data, and the first scene data to obtain the first video data and send the first video data to the multiple viewer terminals.
  • After receiving the first video data, the viewer terminal can generate the corresponding video content by decoding the first video data and play, on the live broadcast interface, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene.
  • the background picture of the live broadcast room and the action of the virtual object can be switched as the video content changes.
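  • A minimal sketch of this server-side assembly step, with in-memory dictionaries standing in for the audio database, the virtual object action database, and the scene table (whose real storage formats the patent does not specify); all keys and file names are made up for the example.

```python
# Stand-ins for the audio database, action database, and scene table.
AUDIO_DB = {"song_01": "song_01.pcm"}
ACTION_DB = {"song_01": ["wave", "dance"]}
SCENE_DB = {"scene_perform": "stage_background.png"}

def build_first_video_data(target_resource: str, scene_id: str) -> dict:
    """Look up the three components matching the target multimedia resource
    and scene identifier, and bundle them into the first video data."""
    return {
        "audio": AUDIO_DB[target_resource],     # first audio data
        "actions": ACTION_DB[target_resource],  # first action data
        "scene": SCENE_DB[scene_id],            # first scene data
    }

payload = build_first_video_data("song_01", "scene_perform")
# The payload would then be sent to every viewer terminal in the room.
print(payload)
```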
  • In some embodiments, the live broadcast interaction method may further include: receiving trigger information from multiple viewer terminals for multiple multimedia resources displayed in the first live broadcast scene; and determining a target multimedia resource from the multiple multimedia resources based on the trigger information.
  • the trigger information may be related information corresponding to the trigger operation of the viewer on the multimedia resource, for example, the trigger information may include the number of triggers, the trigger time, and the like.
  • the viewer terminal can display the multimedia resource information of multiple multimedia resources in the live broadcast interface, receive the triggering operation of the multimedia resource from the viewer, and send the triggering information of the multimedia resource to the server.
  • the server can determine the target multimedia resource from multiple multimedia resources, for example, the multimedia resource with the most trigger times can be determined as the target multimedia resource.
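  • For example, picking the most-triggered resource can be a simple tally, as in this sketch (the resource identifiers are illustrative):

```python
from collections import Counter
from typing import List

def pick_target_resource(trigger_events: List[str]) -> str:
    # Each event records which multimedia resource a viewer triggered;
    # the resource with the most triggers becomes the target.
    return Counter(trigger_events).most_common(1)[0][0]

print(pick_target_resource(["song_01", "book_02", "song_01"]))  # -> song_01
```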
  • Step 202: If the trigger condition is satisfied, send second video data corresponding to the second live broadcast scene to multiple viewer terminals, where the live broadcast scene is used to represent the live broadcast content type of the virtual object in the live broadcast room.
  • In some embodiments, whether the trigger condition is satisfied is determined in at least one of the following ways: if the number of similar interaction information items in the interaction information reaches a preset threshold, the trigger condition is satisfied, where similar interaction information is interaction information whose mutual similarity is greater than a similarity threshold; keywords are extracted from the interaction information and matched against the first keyword and/or the second keyword in a keyword database, and if the interaction information includes the first keyword and/or the number of second keywords in the interaction information reaches the keyword threshold, the trigger condition is satisfied; if the duration of the first live broadcast scene reaches the preset duration, the trigger condition is satisfied; and if the first live broadcast scene reaches the preset marker point, the trigger condition is satisfied.
  • the server can perform semantic recognition on the interaction information, and cluster the interaction information whose similarity is greater than the similarity threshold, which is called similar interaction information. If the number of similar interactive information reaches a preset threshold, it can be determined that the triggering condition for switching the live broadcast scene is satisfied. And/or, the server can extract keywords in the interactive information based on semantics, and match the keywords with the first keywords in the keyword database. If the matching is successful, it can be determined that the interactive information includes the first keywords, and Trigger conditions are met.
  • And/or, the server can match the keywords of the interaction information with the second keywords; each successful match increases the count of second keywords by one, and if the count of second keywords reaches the keyword threshold, it can be determined that the trigger condition is met.
  • the above-mentioned first keyword and second keyword may be keywords related to the second live broadcast scene.
  • the server can acquire the duration of the first live broadcast scene, and if the duration reaches the preset duration, it is determined that the trigger condition is satisfied. And/or, if the server determines that the first live broadcast scene has reached the preset marker point, it may determine that the trigger condition is satisfied.
  • The preset marker points may be set in advance according to the multimedia resources in the first live broadcast scene. For example, when the multimedia resource is a book to be read, the book can be semantically split into multiple paragraphs, and a preset marker point can be set at the end of each paragraph; for another example, when the multimedia resource is a song, the preset marker points may be set based on the attribute characteristics of the song.
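  • A minimal sketch combining the determination ways listed above into one check; the thresholds and keyword sets are placeholders, since the patent leaves them to be set according to actual conditions, and the similar-message count is simplified to a raw message count (the patent clusters similar messages first, as in the earlier clustering sketch).

```python
from typing import List, Set

def should_switch(messages: List[str],
                  first_keywords: Set[str],
                  second_keywords: Set[str],
                  elapsed_s: float,
                  at_marker_point: bool,
                  count_threshold: int = 50,
                  keyword_threshold: int = 10,
                  max_duration_s: float = 600.0) -> bool:
    words = [w for m in messages for w in m.lower().split()]
    second_hits = sum(1 for w in words if w in second_keywords)
    return (
        len(messages) >= count_threshold            # (similar) message count
        or any(w in first_keywords for w in words)  # first keyword present
        or second_hits >= keyword_threshold         # second-keyword count
        or elapsed_s >= max_duration_s              # scene duration reached
        or at_marker_point                          # preset marker point reached
    )

print(should_switch(["chat with us"], {"chat"}, {"talk"}, 120.0, False))  # True
```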
  • In some embodiments, the second video data is generated as follows: text information replying to the target interaction information is determined in a preset text library based on the target interaction information; the text information is converted into second audio data; second action data corresponding to the target interaction information is searched for in the virtual object action database, where the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene; second scene data is determined based on the scene identifier of the second live broadcast scene, where the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene; the second action data, the second audio data, and the second scene data are combined into the second video data corresponding to the second live broadcast scene; and the second video data is sent to the multiple viewer terminals.
  • searching the virtual object motion database for the second motion data corresponding to the target interaction information includes: identifying the emotional information fed back by the virtual object according to the target interaction information; searching the virtual object motion database for the second motion data corresponding to the emotional information.
  • Action data corresponding to different emotional information is preset in the virtual object action database; for example, a clapping action may correspond to a happy emotion, and a different action may correspond to an angry emotion.
  • the second video data can be generated based on the target interactive information.
  • Specifically, the server can determine the text information matching the target interaction information in the preset text library, convert the text information in real time into natural voice data of the virtual object through text-to-speech (TTS) technology to obtain the second audio data, then search the virtual object action database for the second action data corresponding to the emotional information represented by the target interaction information, and determine the second scene data based on the scene identifier of the second live broadcast scene.
  • the server can obtain the second video data by combining the second audio data, the second action data and the second scene data, and send the second video data to a plurality of viewer terminals.
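  • A minimal end-to-end sketch of this generation step; the reply library, the emotion classifier, and fake_tts are placeholders for the preset text library, the emotion-recognition step, and a real TTS engine, and all keys are illustrative. The reply text stays in the payload because it is also shown in the second area of the live broadcast interface.

```python
# Stand-ins for the preset text library, emotion-action mapping, and scene table.
REPLY_LIBRARY = {"let's chat for a while": "it's too late, let's chat tomorrow"}
EMOTION_ACTIONS = {"happy": ["clap"], "neutral": ["nod"], "angry": ["cross_arms"]}
SCENE_DB = {"scene_reply": "chat_background.png"}

def fake_tts(text: str) -> bytes:
    # Placeholder for real text-to-speech (TTS) synthesis.
    return text.encode("utf-8")

def classify_emotion(text: str) -> str:
    # Placeholder for the emotion recognized from the target interaction info.
    return "happy" if "chat" in text else "neutral"

def build_second_video_data(target_msg: str, scene_id: str) -> dict:
    """Pick the reply text, synthesize audio, choose actions by emotion, and
    bundle the result with the scene data into the second video data."""
    reply_text = REPLY_LIBRARY.get(target_msg, "thanks for the message")
    emotion = classify_emotion(target_msg)
    return {
        "audio": fake_tts(reply_text),        # second audio data
        "actions": EMOTION_ACTIONS[emotion],  # second action data
        "scene": SCENE_DB[scene_id],          # second scene data
        "reply_text": reply_text,             # shown in the second area
    }

print(build_second_video_data("let's chat for a while", "scene_reply"))
```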
  • the viewer terminal can generate corresponding video content by decoding the second video data, and play the video content in which the virtual object responds to the target interactive information in the second live broadcast scene in the live broadcast interface.
  • The background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes, and they can differ from the background pictures and actions in the first live broadcast scene.
  • The above descriptions, in which the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources and the second live broadcast scene is a live broadcast scene in which the virtual object replies to interaction information, are only examples; the settings of the first live broadcast scene and the second live broadcast scene can also be swapped.
  • the first live broadcast scene and the second live broadcast scene can be continuously alternated, so that the live broadcast scene of the virtual object is constantly switched.
  • the live interaction method may further include: sending target interaction information and text information replying to the target interaction information to multiple viewer terminals.
  • The server can determine the target interaction information from the multiple pieces of interaction information sent by live viewers based on a preset scheme, and the preset scheme can be set according to the actual situation. For example, the target interaction information can be determined based on the points of the live viewers who sent the interaction information; or interaction information matching preset keywords can be selected, where the preset keywords can be mined and extracted in advance from trending information or can be keywords related to the live broadcast content; or semantic recognition can be performed on the interaction information and interaction information with similar meanings clustered to obtain several information sets, where the set with the most interaction information represents the hottest topic of live audience interaction, and the interaction information corresponding to that set is used as the target interaction information.
  • The server can send the target interaction information and the text information replying to it to the viewer terminals, and each terminal can receive this information and display, in the second area of the live broadcast interface, the target interaction information and the text information replying to the target interaction information.
  • In the embodiments of the present disclosure, the server receives the interaction information of multiple viewer terminals in the first live broadcast scene and determines, based on the interaction information, whether the trigger condition for switching the live broadcast scene is satisfied; if the trigger condition is satisfied, the server sends the second video data corresponding to the second live broadcast scene to the multiple viewer terminals, where the live broadcast scene is used to represent the live broadcast content type of the virtual object in the live broadcast room. In this way, the live broadcast in the first live broadcast scene is switched to the live broadcast in the second live broadcast scene, which realizes interaction between the virtual object and the audience in different live broadcast scenes, meets the audience's varied interactive needs, improves the diversity and interest of virtual object live broadcasts, and thereby enhances the audience's interactive experience.
  • FIG. 5 is a schematic structural diagram of a live interactive device according to an embodiment of the present disclosure.
  • The apparatus may be implemented by software and/or hardware, and may generally be integrated into an electronic device. As shown in FIG. 5, the apparatus is arranged on multiple viewer terminals entering the live broadcast room of the virtual object and includes:
  • a first live broadcast module 301 configured to play the video content of the virtual object in the first live broadcast scene on the live broadcast interface, and display interactive information from the multiple viewer terminals;
  • a second live broadcast module 302 configured to play the video content of the virtual object in the second live broadcast scene on the live broadcast interface in response to the interaction information satisfying the trigger condition, where the live broadcast scene is used to represent the live broadcast content type of the virtual object.
  • the live broadcast scene includes a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  • the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources
  • the first live broadcast module 301 is specifically configured to: display, in the first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed; and play the video content of the virtual object performing the target multimedia resource, where the target multimedia resource is determined based on the trigger information for the multiple multimedia resources from the multiple viewer terminals.
  • the second live broadcast module 302 is specifically configured to play, on the live broadcast interface, the video content in which the virtual object replies to the interaction information.
  • In some embodiments, the trigger condition includes at least one of the following: the amount of interaction information reaches the preset threshold; the interaction information includes the first keyword; the number of second keywords in the interaction information reaches the keyword threshold; the duration of the first live broadcast scene reaches the preset duration; and the first live broadcast scene reaches the preset marker point.
  • the virtual object replies to the target interactive information in the interactive information;
  • the device further includes a replying module for:
  • the target interactive information and text information replying to the target interactive information are displayed in the second area of the live broadcast interface.
  • the first live broadcast module 301 is specifically used for:
  • receive first video data corresponding to the first live broadcast scene, where the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the first audio data matches the target multimedia resource; and play, on the live broadcast interface based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene.
  • the second live broadcast module is specifically used for:
  • receive second multimedia data corresponding to the second live broadcast scene, where the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interaction information; and play, on the live broadcast interface, the video content in which the virtual object replies to the target interaction information in the second live broadcast scene.
  • the live interactive device provided by the embodiment of the present disclosure can execute the live interactive method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
  • FIG. 6 is a schematic structural diagram of another live interactive device according to an embodiment of the present disclosure.
  • The apparatus may be implemented by software and/or hardware, and may generally be integrated into an electronic device. As shown in FIG. 6, the apparatus is set on the server and includes:
  • An information receiving module 401 configured to receive interactive information of multiple viewer terminals in a first live broadcast scene, and determine whether a trigger condition for live broadcast scene switching is satisfied based on the interactive information;
  • the data sending module 402 is configured to send the second video data corresponding to the second live broadcast scene to the multiple viewer terminals if the trigger condition is satisfied, wherein the live broadcast scene is used to represent the live broadcast content type of the virtual object in the live broadcast room.
  • the live broadcast scene includes a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  • the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources
  • the apparatus further includes a data determination module configured to: search the audio database for first audio data matching the target multimedia resource; search the virtual object action database for first action data corresponding to the target multimedia resource, where the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene; determine the first scene data based on the scene identifier of the first live broadcast scene; combine the first action data, the first audio data, and the first scene data into the first video data; and send the first video data to the multiple viewer terminals.
  • In some embodiments, the device further includes a resource determination module configured to: receive trigger information from the multiple viewer terminals for the multiple multimedia resources displayed in the first live broadcast scene; and determine the target multimedia resource from the multiple multimedia resources based on the trigger information.
  • In some embodiments, the device further includes a second data module configured to generate the second video data by: determining text information replying to the target interaction information in the preset text library; converting the text information into the second audio data; searching the virtual object action database for the second action data corresponding to the target interaction information; determining the second scene data based on the scene identifier of the second live broadcast scene; and combining the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene.
  • In some embodiments, the device further includes a reply information sending module configured to send the target interaction information and the text information replying to the target interaction information to the multiple viewer terminals.
  • In some embodiments, the device further includes a trigger condition module configured to determine that the trigger condition is satisfied in at least one of the following cases: the number of similar interaction information items reaches the preset threshold, where similar interaction information is interaction information whose similarity is greater than the similarity threshold; the interaction information includes the first keyword and/or the number of second keywords in the interaction information reaches the keyword threshold; the duration of the first live broadcast scene reaches the preset duration; or the first live broadcast scene reaches the preset marker point.
  • the live interactive device provided by the embodiment of the present disclosure can execute the live interactive method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Specifically, FIG. 7 shows a schematic structural diagram of an electronic device 500 suitable for implementing an embodiment of the present disclosure.
  • The electronic device 500 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), and an in-vehicle terminal (e.g., a car navigation terminal), and fixed terminals such as a digital TV and a desktop computer.
  • the electronic device shown in FIG. 7 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • The electronic device 500 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 501, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the electronic device 500.
  • the processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
  • An input/output (I/O) interface 505 is also connected to bus 504 .
  • The following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 507 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; storage devices 508 including, for example, a magnetic tape and a hard disk; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 7 shows the electronic device 500 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 509, or from the storage device 508, or from the ROM 502.
  • When the computer program is executed by the processing device 501, the above-mentioned functions defined in the live broadcast interaction method of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • The computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium; such a medium can send, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (for example, the Internet), and peer-to-peer networks (for example, ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: play, on a live broadcast interface, the video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals; and, in response to the interactive information meeting a trigger condition, play, on the live broadcast interface, the video content of the virtual object in a second live broadcast scene; wherein the live broadcast scene is used to represent the live content type of the virtual object.
  • alternatively, the computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met; and if the trigger condition is met, send second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein the live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in software or in hardware, and the name of a unit does not, under certain circumstances, constitute a limitation on the unit itself.
  • exemplary types of hardware logic components include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • more specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • the present disclosure provides a live broadcast interaction method, applied to multiple viewer terminals entering a live broadcast room of a virtual object, including: playing, on a live broadcast interface, the video content of the virtual object in a first live broadcast scene, and displaying interactive information from the multiple viewer terminals; and, in response to the interactive information meeting a trigger condition, playing, on the live broadcast interface, the video content of the virtual object in a second live broadcast scene, wherein the live broadcast scene is used to represent the live content type of the virtual object.
  • the live broadcast scene includes a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  • the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and playing the video content of the virtual object in the first live broadcast scene on the live broadcast interface includes: displaying, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed; and playing the video content of the virtual object performing a target multimedia resource, where the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
  • the playing the video content of the virtual object in the second live broadcast scene on the live broadcast interface includes:
  • the video content in which the virtual object replies to the interactive information is played on the live broadcast interface.
  • the trigger condition includes at least one of the following: the quantity of the interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point.
  • the virtual object replies to target interactive information in the interactive information; the method further includes:
  • the target interactive information and text information replying to the target interactive information are displayed in the second area of the live broadcast interface.
  • the playing the video content of the virtual object in the first live broadcast scene on the live broadcast interface includes:
  • receiving first video data corresponding to the first live broadcast scene, where the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the first audio data matches the target multimedia resource;
  • playing, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
  • the playing the video content of the virtual object in the second live broadcast scene on the live broadcast interface includes:
  • receiving second multimedia data corresponding to the second live broadcast scene, where the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information;
  • playing, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
  • the present disclosure provides a live broadcast interaction method, applied to a server, including: receiving interactive information from multiple viewer terminals in a first live broadcast scene, and determining, based on the interactive information, whether a trigger condition for live broadcast scene switching is met; and if the trigger condition is met, sending second video data corresponding to a second live broadcast scene to the multiple viewer terminals, wherein the live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
  • the live broadcast scene includes a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  • the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the method further includes: searching an audio database for first audio data matching the target multimedia resource, and searching a virtual object action database for first action data corresponding to the target multimedia resource, where the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene; determining first scene data based on the scene identifier of the first live broadcast scene, where the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene; combining the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene; and sending the first video data to the multiple viewer terminals.
  • the live broadcast interaction method provided by the present disclosure further includes: receiving, from the multiple viewer terminals, trigger information for the multiple multimedia resources displayed in the first live broadcast scene; and determining the target multimedia resource from the multiple multimedia resources based on the trigger information.
  • the second video data is generated by: determining, based on target interactive information, text information replying to the target interactive information in a preset text library; converting the text information into second audio data; searching a virtual object action database for second action data corresponding to the target interactive information, where the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene; determining second scene data based on the scene identifier of the second live broadcast scene, where the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene; combining the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene; and sending the second video data to the multiple viewer terminals.
  • searching the virtual object action database for the second action data corresponding to the target interactive information includes: identifying, from the target interactive information, the emotion information fed back by the virtual object; and searching the virtual object action database for the second action data corresponding to the emotion information.
  • the method further includes: sending the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
  • the trigger condition is determined in at least one of the following ways: if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, where similar interactive information is interactive information whose similarity is greater than a similarity threshold; extracting keywords from the interactive information and matching them against a first keyword and/or a second keyword in a keyword database, where the trigger condition is met if the interactive information includes the first keyword and/or the number of second keywords in the interactive information reaches a keyword threshold; if the duration of the first live broadcast scene reaches a preset duration, the trigger condition is met; and if the first live broadcast scene reaches a preset mark point, the trigger condition is met.
  • the present disclosure provides a live broadcast interaction apparatus, provided in multiple viewer terminals that enter a live broadcast room of a virtual object, including:
  • a first live broadcast module, configured to play, on a live broadcast interface, the video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals;
  • a second live broadcast module, configured to play, on the live broadcast interface, the video content of the virtual object in a second live broadcast scene in response to the interactive information meeting a trigger condition; wherein the live broadcast scene is used to represent the live content type of the virtual object.
  • the live broadcast scene includes a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  • the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the first live broadcast module is specifically configured to: display, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed; and play the video content of the virtual object performing a target multimedia resource, where the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
  • the second live broadcast module is specifically configured to: play, on the live broadcast interface, the video content in which the virtual object replies to the interactive information.
  • the trigger conditions include at least one of the following: the quantity of the interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point.
  • the virtual object replies to target interactive information in the interactive information; the device further includes a replying module for:
  • the target interactive information and text information replying to the target interactive information are displayed in the second area of the live broadcast interface.
  • the first live broadcast module is specifically used for:
  • receive first video data corresponding to the first live broadcast scene, where the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the first audio data matches the target multimedia resource;
  • play, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
  • the second live broadcast module is specifically used for:
  • receive second multimedia data corresponding to the second live broadcast scene, where the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information;
  • play, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
  • the present disclosure provides a live broadcast interaction apparatus, provided on a server, including:
  • an information receiving module, configured to receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
  • a data sending module, configured to send, if the trigger condition is met, second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein the live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
  • the live broadcast scene includes a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  • the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the apparatus further includes a data determination module configured to: search an audio database for first audio data matching the target multimedia resource, and search a virtual object action database for first action data corresponding to the target multimedia resource, where the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene; determine first scene data based on the scene identifier of the first live broadcast scene, where the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene; combine the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene; and send the first video data to the multiple viewer terminals.
  • the apparatus further includes a resource determination module configured to: receive, from the multiple viewer terminals, trigger information for the multiple multimedia resources displayed in the first live broadcast scene; and determine the target multimedia resource from the multiple multimedia resources based on the trigger information.
  • the apparatus further includes a second data module configured to: determine, based on target interactive information, text information replying to the target interactive information in a preset text library; convert the text information into second audio data; search the virtual object action database for second action data corresponding to the target interactive information, where the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene; determine second scene data based on the scene identifier of the second live broadcast scene, where the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene; combine the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene; and send the second video data to the multiple viewer terminals.
  • the second data module is configured to: identify, from the target interactive information, the emotion information fed back by the virtual object; and search the virtual object action database for the second action data corresponding to the emotion information.
  • the apparatus further includes a reply information sending module configured to: send the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
  • the apparatus further includes a trigger condition module configured to determine that: if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, where similar interactive information is interactive information whose similarity is greater than a similarity threshold; if the interactive information includes a first keyword from a keyword database and/or the number of second keywords in the interactive information reaches a keyword threshold after keyword extraction and matching, the trigger condition is met; if the duration of the first live broadcast scene reaches a preset duration, the trigger condition is met; and if the first live broadcast scene reaches a preset mark point, the trigger condition is met.
  • the present disclosure provides an electronic device, including:
  • a processor;
  • a memory for storing instructions executable by the processor;
  • the processor being configured to read the executable instructions from the memory and execute the instructions to implement any of the live broadcast interaction methods provided by the present disclosure.
  • the present disclosure provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to perform any of the live broadcast interaction methods provided by the present disclosure.

Abstract

Embodiments of the present disclosure relate to a live broadcast interaction method, apparatus, device, and medium. The method includes: multiple viewer terminals that enter the live broadcast room of a virtual object play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals; in response to the interactive information meeting a trigger condition, video content of the virtual object in a second live broadcast scene is played on the live broadcast interface, where a live broadcast scene is used to represent the live content type of the virtual object. With this technical solution, the virtual object can switch from live broadcasting in the first live broadcast scene to live broadcasting in the second live broadcast scene based on the viewers' interactive information, realizing interaction between the virtual object and viewers across different live broadcast scenes, meeting viewers' diverse interaction needs, increasing the diversity and interest of the virtual object's live broadcast, and thereby improving the viewers' interactive experience.

Description

Live broadcast interaction method, apparatus, device, and medium
This application claims priority to Chinese Patent Application No. 202011463601.8, filed on December 11, 2020 and entitled "Live broadcast interaction method, apparatus, device, and medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of live streaming technology, and in particular to a live broadcast interaction method, apparatus, device, and medium.
Background
With the continuous development of live streaming technology, watching live broadcasts has become an important entertainment activity in people's lives.
At present, a virtual object can be used to replace a real host for live broadcasting. However, such a virtual object can usually only broadcast according to preset content; viewers can only watch passively and cannot decide what to watch, so the live broadcast effect is poor.
Summary
To solve the above technical problem, or at least partially solve it, the present disclosure provides a live broadcast interaction method, apparatus, device, and medium.
An embodiment of the present disclosure provides a live broadcast interaction method, applied to multiple viewer terminals that enter the live broadcast room of a virtual object, including:
playing, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and displaying interactive information from the multiple viewer terminals;
in response to the interactive information meeting a trigger condition, playing, on the live broadcast interface, video content of the virtual object in a second live broadcast scene; wherein a live broadcast scene is used to represent the live content type of the virtual object.
An embodiment of the present disclosure further provides a live broadcast interaction method, applied to a server, including:
receiving interactive information from multiple viewer terminals in a first live broadcast scene, and determining, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
if the trigger condition is met, sending second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
An embodiment of the present disclosure further provides a live broadcast interaction apparatus, provided in multiple viewer terminals that enter the live broadcast room of a virtual object, including:
a first live broadcast module, configured to play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals;
a second live broadcast module, configured to play, on the live broadcast interface, video content of the virtual object in a second live broadcast scene in response to the interactive information meeting a trigger condition; wherein a live broadcast scene is used to represent the live content type of the virtual object.
An embodiment of the present disclosure further provides a live broadcast interaction apparatus, provided on a server, including:
an information receiving module, configured to receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
a data sending module, configured to send, if the trigger condition is met, second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
An embodiment of the present disclosure further provides an electronic device, including: a processor; a memory for storing instructions executable by the processor; the processor being configured to read the executable instructions from the memory and execute the instructions to implement the live broadcast interaction method provided by the embodiments of the present disclosure.
An embodiment of the present disclosure further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to perform the live broadcast interaction method provided by the embodiments of the present disclosure.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have the following advantages. In the live broadcast interaction solution provided by the embodiments of the present disclosure, multiple viewer terminals that enter the live broadcast room of a virtual object can play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene and display interactive information from the multiple viewer terminals; in response to the interactive information meeting a trigger condition, video content of the virtual object in a second live broadcast scene is played on the live broadcast interface, where a live broadcast scene is used to represent the live content type of the virtual object. With this technical solution, the virtual object can switch from live broadcasting in the first live broadcast scene to live broadcasting in the second live broadcast scene based on the viewers' interactive information, which realizes interaction between the virtual object and viewers across different live broadcast scenes, meets the viewers' diverse interaction needs, increases the diversity and interest of the virtual object's live broadcast, and thus improves the viewers' interactive experience.
Brief Description of the Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic flowchart of a live broadcast interaction method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a live broadcast interaction according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of another live broadcast interaction according to an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of another live broadcast interaction method according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a live broadcast interaction apparatus according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of another live broadcast interaction apparatus according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit its scope of protection.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit the steps shown. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and its variants are open-ended, meaning "including but not limited to". The term "based on" means "at least partially based on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the following description.
Note that the concepts "first", "second", and the like mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not used to limit the order of, or interdependence between, the functions performed by these apparatuses, modules, or units.
Note that the modifiers "a/an" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
FIG. 1 is a schematic flowchart of a live broadcast interaction method according to an embodiment of the present disclosure. The method may be performed by a live broadcast interaction apparatus, which may be implemented in software and/or hardware and may generally be integrated into an electronic device. As shown in FIG. 1, the method is applied to multiple viewer terminals that enter the live broadcast room of a virtual object and includes:
Step 101: playing, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and displaying interactive information from multiple viewer terminals.
Here, the virtual object may be a three-dimensional model created in advance based on artificial intelligence (AI) technology; it may be a controllable digital object configured on a computer, and a real person's body movements and facial information can be captured by motion capture devices and facial capture devices to drive the virtual object. There may be many specific types of virtual objects with different appearances; a virtual object may be a virtual animal or a virtual character in a different style. In the embodiments of the present disclosure, by combining artificial intelligence technology with live video technology, the virtual object can replace a real person for live video broadcasting.
The live broadcast interface refers to the page used to present the live broadcast room of the virtual object, which may be a web page or a page in an application client. A live broadcast scene is a scene used to represent the live content type of the virtual object, and the virtual object may have multiple live broadcast scenes. In the embodiments of the present disclosure, the live broadcast scenes may include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information; multimedia resources may include books to read, songs to sing, drawing topics, and the like, without specific limitation.
In the embodiments of the present disclosure, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and playing the video content of the virtual object in the first live broadcast scene on the live broadcast interface may include: displaying, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed; and playing video content of the virtual object performing a target multimedia resource, where the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
Since multimedia resources may include books to read, songs to sing, drawing topics, and the like, the multimedia resource information to be performed may include books to be read, songs to be sung, drawing topics to be drawn, and so on. The first area is an area of the live broadcast interface that displays the multimedia resource information of the resources to be performed and supports viewers' trigger operations on the multimedia resources, where a trigger operation includes one or more of a single click, a double click, a swipe, and a voice instruction.
Further, a terminal may receive the multimedia resource information of the multiple multimedia resources to be performed sent by the server and display it in the first area of the live broadcast interface. Each terminal sends the viewers' trigger information for the multimedia resources to the server, and the server may determine the target multimedia resource from the multiple multimedia resources according to the trigger information; for example, the multimedia resource triggered the most times may be determined as the target multimedia resource. The terminal may receive the video data of the target multimedia resource delivered by the server and, based on the video data, play on the live broadcast interface the video content of the virtual object performing the multimedia resource.
In the above solution, the virtual object performs the live broadcast scene of performing multimedia resources according to the viewers' choices, and viewers can decide what they watch, which increases their degree of participation and thus improves the live broadcast effect of the virtual object.
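As a concrete illustration of the selection rule just described (the most-triggered resource wins), the tally reduces to a counter over trigger records. The following is a minimal sketch, not part of the disclosure; the record fields viewer_id and resource_id are assumed names:

```python
from collections import Counter

# Hypothetical trigger records collected from viewer terminals.
triggers = [
    {"viewer_id": "u1", "resource_id": "song_01"},
    {"viewer_id": "u2", "resource_id": "book_02"},
    {"viewer_id": "u3", "resource_id": "song_01"},
]

def pick_target_resource(triggers):
    """Return the resource triggered the most times, per the embodiment's example."""
    counts = Counter(t["resource_id"] for t in triggers)
    resource_id, _ = counts.most_common(1)[0]
    return resource_id

assert pick_target_resource(triggers) == "song_01"
```

The same counting could just as well weight triggers by viewer points or recency; the disclosure only requires that the target be derived from the trigger information.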
In the embodiments of the present disclosure, playing the video content of the virtual object in the first live broadcast scene on the live broadcast interface may include: receiving first video data corresponding to the first live broadcast scene, where the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the audio data matches the target multimedia resource; and playing, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
The first video data refers to data pre-configured by the server for the virtual object's live broadcast in the first live broadcast scene and may include the first scene data, the first action data, and the first audio data. The scene corresponding to the background picture of the live broadcast room may include the background scene of the virtual object in the first live broadcast scene as well as a picture-perspective scene, where the picture perspective may be the viewing angle from which the virtual object is shot by different cameras, and scene images corresponding to different picture perspectives differ in display size and/or display orientation. The first action data may be used to generate the facial expressions and body movements of the virtual object in the first live broadcast scene. The audio data matches the target multimedia resource among the multiple multimedia resources; for example, when the target multimedia resource is a song, the audio data is the audio of that song.
In the embodiments of the present disclosure, after detecting a viewer's trigger operation on the virtual object, the terminal may obtain the first video data corresponding to the first live broadcast scene sent by the server, generate the corresponding video content by decoding the first video data, and play, on the live broadcast interface, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene. Moreover, while playing this video content, the terminal may receive multiple pieces of interactive information from multiple live viewers and display them on the live broadcast interface; the specific display position can be set according to the actual situation. Optionally, during playback, based on the first scene data and the first action data, the background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes.
By way of example, FIG. 2 is a schematic diagram of a live broadcast interaction according to an embodiment of the present disclosure. As shown in FIG. 2, the figure shows a live broadcast interface of a virtual object 11 in the first live broadcast scene, containing a live picture of the virtual object 11 reading a book; an e-reader is placed in front of the virtual object 11, indicating that the virtual object 11 is narrating a book. The upper left corner of the live broadcast interface in FIG. 2 also shows the avatar and name of the virtual object 11, the name being "Xiao A", as well as a follow button 12.
Referring to FIG. 2, the lower part of the live broadcast interface also shows interactive information sent by different users watching the virtual object's live broadcast, such as "This story is great" sent by user A (viewer A), "Hello" sent by user B (viewer B), and "I came to find you" sent by user C (viewer C). The bottom of the live broadcast interface also shows the editing area 13 where the current user sends interactive information, as well as other function buttons, such as the selection button 14, the interaction button 15, and the activities and rewards button 16; different function buttons have different functions.
Step 102: in response to the interactive information meeting a trigger condition, playing, on the live broadcast interface, video content of the virtual object in a second live broadcast scene; where a live broadcast scene is used to represent the live content type of the virtual object.
Here, the trigger condition refers to the condition used to determine, based on the viewers' interactive information, whether to switch the live broadcast scene. In the embodiments of the present disclosure, the trigger condition may include at least one of the following: the quantity of interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point. The preset threshold, first keyword, second keyword, keyword threshold, preset duration, and preset mark point can all be set according to the actual situation.
In the embodiments of the present disclosure, playing the video content of the virtual object in the second live broadcast scene on the live broadcast interface includes: playing, on the live broadcast interface, the video content in which the virtual object replies to the interactive information. The second live broadcast scene is different from the first live broadcast scene and refers to the live broadcast scene in which the virtual object replies to interactive information.
Specifically, the terminal may receive reply audio data corresponding to one or more pieces of interactive information, generate the reply video content based on the reply audio data together with the second scene data and second action data of the virtual object in the second live broadcast scene, and play, on the live broadcast interface, the video content in which the virtual object replies to the interactive information.
Optionally, the virtual object replies to target interactive information among the interactive information, and the live broadcast interaction method may further include: displaying, in a second area of the live broadcast interface, the target interactive information and text information replying to the target interactive information.
The target interactive information is one or more pieces of interactive information, determined by the server based on a preset scheme among the multiple pieces of interactive information sent by live viewers, that need to be replied to. The preset scheme can be set according to the actual situation; for example, the target interactive information may be determined based on the points of the live viewers who send the interactive information; or target interactive information matching preset keywords may be searched for, where the preset keywords may be mined and extracted in advance from trending information or may be keywords related to the live content; or semantic recognition may be performed on the interactive information, and interactive information with similar meanings may be clustered to obtain several information sets, where the set with the most interactive information is the hottest topic among the live viewers, and the interactive information of that set is taken as the target interactive information. The text information replying to the target interactive information refers to the reply text determined by the server, based on a corpus, to match the target interactive information. The terminal may receive the text information replying to the target interactive information and display, in the second area of the live broadcast interface, the target interactive information together with that reply text.
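The clustering scheme sketched above (group comments with similar meanings, take the largest group as the hottest topic) can be illustrated with a toy similarity function. A hedged sketch follows, using difflib string similarity as a stand-in for the semantic model the text leaves unspecified:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    # Stand-in for semantic similarity; the disclosure does not name a model.
    return SequenceMatcher(None, a, b).ratio()

def pick_target_comments(comments, sim_threshold=0.6):
    """Greedily cluster comments whose similarity to a cluster's first member
    exceeds the threshold, then return the largest cluster as the hottest topic."""
    clusters = []  # each cluster is a list of comments
    for c in comments:
        for cluster in clusters:
            if similarity(c, cluster[0]) > sim_threshold:
                cluster.append(c)
                break
        else:
            clusters.append([c])
    return max(clusters, key=len)

comments = ["sing a song", "please sing a song", "hello", "sing song please"]
print(pick_target_comments(comments))  # the singing-related cluster wins
```

A production system would replace both the similarity function and the greedy pass with proper semantic clustering; the shape of the computation, however, matches the scheme described in the paragraph above.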
In the above solution, in the second live broadcast scene, the terminal can play, on the live broadcast interface, the video content in which the virtual object replies to the interactive information, and display the current interactive information together with the corresponding reply text, so that viewers know which viewer's interaction the virtual object is replying to. This further deepens the interaction between viewers and the virtual object and improves the interactive experience.
In the embodiments of the present disclosure, playing the video content of the virtual object in the second live broadcast scene on the live broadcast interface may include: receiving second multimedia data corresponding to the second live broadcast scene, where the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information; and playing, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
The second video data refers to data pre-configured by the server for the virtual object's live broadcast in the second live broadcast scene and may include the second scene data, the second action data, and the second audio data; the meaning of each kind of data is similar to that of the corresponding data in the first video data described above and will not be described in detail here. The difference is that the specific video data differ between the first live broadcast scene and the second live broadcast scene.
In the embodiments of the present disclosure, when the server determines, based on the interactive information, that the trigger condition is met, it may send the second video data corresponding to the second live broadcast scene to the terminal. After receiving the second video data, the terminal can generate the corresponding video content by decoding it and play, on the live broadcast interface, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene. Moreover, while playing this video content, the terminal may also display interactive information from multiple viewer terminals. Optionally, during playback, based on the second scene data and the second action data, the background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes, but they may differ from the background picture and the actions of the virtual object in the first live broadcast scene.
By way of example, FIG. 3 is a schematic diagram of another live broadcast interaction according to an embodiment of the present disclosure. As shown in FIG. 3, the figure shows a live picture of the virtual object 11 replying to interactive information in the second live broadcast scene; compared with FIG. 2, the e-reader in front of the virtual object 11 is gone. The lower part of the live broadcast interface also shows interactive information sent by different users during the live chat, such as "I miss you" sent by user A (viewer A), "Hello" sent by user B (viewer B), and "Let's chat" sent by user C (viewer C).
The live broadcast page in FIG. 3 also shows a second area 17, which may contain the current interactive information of one viewer and the text of the virtual object's reply to it, so that viewers know which viewer's interaction the virtual object is replying to. In the figure, the interactive information is "Let's chat for a while" sent by viewer C, and the virtual object's reply text is "It's too late now, let's chat tomorrow". The reply text corresponds to the reply audio data and is consistent with what the virtual object says in the reply. Referring to FIGS. 2 and 3, the actions of the virtual object 11 differ: in the first live broadcast scene of FIG. 2, the virtual object 11 rests its chin on its left hand, while in the second live broadcast scene of FIG. 3, the virtual object 11 raises its left hand and rests its chin on its right hand.
It should be noted that, although above the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources and the second live broadcast scene is a live broadcast scene in which the virtual object replies to interactive information, the settings of the first and second live broadcast scenes can also be swapped; that is, the first live broadcast scene may be the scene in which the virtual object replies to interactive information, and the second live broadcast scene may be the scene in which the virtual object performs multimedia resources, without specific limitation. Moreover, the first and second live broadcast scenes can alternate continuously, so that the virtual object's live broadcast scene keeps switching.
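The alternation described in this paragraph amounts to a two-state machine. A toy sketch under that reading, with illustrative scene names only:

```python
# The room flips between a "perform" scene and a "reply" scene each time
# the trigger condition fires; either scene may serve as the starting state.
SCENES = ("perform_multimedia", "reply_to_comments")

class LiveRoom:
    def __init__(self):
        self.scene = SCENES[0]

    def on_trigger_condition_met(self):
        # Switch to the other scene; alternation can repeat indefinitely.
        self.scene = SCENES[1] if self.scene == SCENES[0] else SCENES[0]
        return self.scene

room = LiveRoom()
print(room.on_trigger_condition_met())  # reply_to_comments
print(room.on_trigger_condition_met())  # perform_multimedia
```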
In the embodiments of the present disclosure, live broadcasting of the virtual object in different live broadcast scenes can be realized; the live broadcast scene can be switched according to the viewers' choices, and the background picture of the live broadcast room and the actions of the virtual object can differ between live broadcast scenes, which meets the viewers' diverse interaction needs.
In the live broadcast interaction solution provided by the embodiments of the present disclosure, multiple viewer terminals that enter the live broadcast room of a virtual object can play, on the live broadcast interface, video content of the virtual object in a first live broadcast scene and display interactive information from the multiple viewer terminals; in response to the interactive information meeting a trigger condition, video content of the virtual object in a second live broadcast scene is played on the live broadcast interface, where a live broadcast scene is used to represent the live content type of the virtual object. With this technical solution, the virtual object can switch from live broadcasting in the first live broadcast scene to live broadcasting in the second live broadcast scene based on the viewers' interactive information, which realizes interaction between the virtual object and viewers across different live broadcast scenes, meets the viewers' diverse interaction needs, increases the diversity and interest of the virtual object's live broadcast, and thus improves the viewers' interactive experience.
FIG. 4 is a schematic flowchart of another live broadcast interaction method according to an embodiment of the present disclosure. On the basis of the above embodiments, this embodiment further optimizes the live broadcast interaction method. As shown in FIG. 4, the method is applied to a server and includes:
Step 201: receiving interactive information from multiple viewer terminals in a first live broadcast scene, and determining, based on the interactive information, whether a trigger condition for live broadcast scene switching is met.
Here, a live broadcast scene is a scene used to represent the live content type of the virtual object, and the virtual object may have multiple live broadcast scenes. In the embodiments of the present disclosure, the live broadcast scenes may include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information; multimedia resources may include books to read, songs to sing, drawing topics, and the like, without specific limitation. The interactive information refers to interactive text information sent, through their terminals, by multiple viewers watching the live broadcast in the first live broadcast scene.
Specifically, the server may receive the interactive information sent by multiple viewer terminals in the first live broadcast scene and determine, based on the interactive information and/or information related to the first live broadcast scene, whether the trigger condition for live broadcast scene switching is met. In the embodiments of the present disclosure, the trigger condition may include at least one of the following: the quantity of interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point. The preset threshold, first keyword, second keyword, keyword threshold, preset duration, and preset mark point can all be set according to the actual situation.
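Putting the listed conditions together, a server-side check might look like the following sketch; any one condition suffices, and all thresholds are illustrative placeholders rather than values from the disclosure:

```python
def trigger_met(comments, scene_started_at, now,
                first_keywords, second_keywords,
                count_threshold=50, keyword_threshold=10,
                max_duration=600.0, mark_point_reached=False):
    """Return True if any of the disclosed trigger conditions holds."""
    # Condition 1: the quantity of interactive information reaches a threshold.
    if len(comments) >= count_threshold:
        return True
    # Condition 2: the interactive information includes a first keyword.
    if any(kw in c for c in comments for kw in first_keywords):
        return True
    # Condition 3: occurrences of second keywords reach the keyword threshold.
    second_hits = sum(c.count(kw) for c in comments for kw in second_keywords)
    if second_hits >= keyword_threshold:
        return True
    # Condition 4: the first scene's duration reaches the preset duration.
    if now - scene_started_at >= max_duration:
        return True
    # Condition 5: the first scene has reached a preset mark point.
    return mark_point_reached
```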
In the embodiments of the present disclosure, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the live broadcast interaction method may further include: searching an audio database for first audio data matching the target multimedia resource, and searching a virtual object action database for first action data corresponding to the target multimedia resource, where the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene; determining first scene data based on the scene identifier of the first live broadcast scene, where the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene; combining the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene; and sending the first video data to the multiple viewer terminals.
Here, the audio database and the virtual object action database may be preset databases. The target multimedia resource is one of the multiple multimedia resources. A scene identifier is an identifier used to distinguish different live broadcast scenes, and the server may configure corresponding scene data for different live broadcast scenes in advance. The server may search the audio database and the virtual object action database to determine the first audio data and first action data matching the target multimedia resource, and determine the corresponding first scene data based on the scene identifier of the first live broadcast scene; the server may then combine the first action data, the first audio data, and the first scene data into the first video data and send the first video data to the multiple viewer terminals.
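The lookup-and-combine step can be pictured with plain dictionaries standing in for the audio database, the action database, and the per-scene configuration; the packaging into a dict stands in for real muxing and encoding, and all keys are assumed names:

```python
# Hypothetical preset databases keyed by resource id / scene identifier.
AUDIO_DB = {"song_01": b"<audio-bytes>"}
ACTION_DB = {"song_01": {"face": "smile", "body": "sway"}}
SCENE_DB = {"scene_perform": {"background": "stage.png"}}

def build_first_video_data(target_resource_id, scene_id):
    first_audio = AUDIO_DB[target_resource_id]    # matches the target resource
    first_action = ACTION_DB[target_resource_id]  # expressions and body movements
    first_scene = SCENE_DB[scene_id]              # live-room background picture
    # "Combining" here is just packaging; a real server would mux and encode.
    return {"scene": first_scene, "action": first_action, "audio": first_audio}

video_data = build_first_video_data("song_01", "scene_perform")
# send_to_viewers(video_data)  # delivery layer omitted from this sketch
```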
After receiving the first video data, a viewer terminal can generate the corresponding video content by decoding it and play, on the live broadcast interface, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene. During playback, based on the first scene data and the first action data, the background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes.
In the embodiments of the present disclosure, the live broadcast interaction method may further include: receiving, from the multiple viewer terminals, trigger information for the multiple multimedia resources displayed in the first live broadcast scene; and determining the target multimedia resource from the multiple multimedia resources based on the trigger information. The trigger information may be information related to the viewers' trigger operations on the multimedia resources; for example, it may include the number of triggers, the trigger time, and so on.
A viewer terminal can display the multimedia resource information of the multiple multimedia resources on the live broadcast interface, receive the viewers' trigger operations on the multimedia resources, and send the trigger information of the multimedia resources to the server. Upon receiving the trigger information, the server can determine the target multimedia resource from the multiple multimedia resources; for example, the multimedia resource triggered the most times may be determined as the target multimedia resource.
Step 202: if the trigger condition is met, sending second video data corresponding to a second live broadcast scene to the multiple viewer terminals; where a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
In the embodiments of the present disclosure, the trigger condition is determined in at least one of the following ways: if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, where similar interactive information is interactive information whose similarity is greater than a similarity threshold; keywords are extracted from the interactive information and matched against the first keyword and/or the second keyword in a keyword database, and if the interactive information includes the first keyword and/or the number of second keywords in the interactive information reaches the keyword threshold, the trigger condition is met; if the duration of the first live broadcast scene reaches the preset duration, the trigger condition is met; and if the first live broadcast scene reaches a preset mark point, the trigger condition is met.
Specifically, the server may perform semantic recognition on the interactive information and cluster interactive information whose similarity is greater than the similarity threshold, called similar interactive information. If the quantity of similar interactive information reaches the preset threshold, it can be determined that the trigger condition for switching the live broadcast scene is met. And/or, the server may extract keywords from the interactive information based on semantics and match the keywords against the first keyword in the keyword database; if the match succeeds, it can be determined that the interactive information includes the first keyword and that the trigger condition is met. And/or, the server may match the keywords of the interactive information against the second keyword; if the match succeeds, the count of the second keyword is increased by one, and if the count of the second keyword reaches the keyword threshold, it can be determined that the trigger condition is met. The first keyword and the second keyword may be keywords related to the second live broadcast scene.
And/or, the server may obtain the duration of the first live broadcast scene and, if the duration reaches the preset duration, determine that the trigger condition is met. And/or, if the server determines that the first live broadcast scene has reached a preset mark point, it can determine that the trigger condition is met. Preset mark points can be set in advance according to the multimedia resource in the first live broadcast scene; for example, when the multimedia resource is a book to read, the book can be semantically split into multiple reading passages, and a preset mark point can be set at the end of each text passage; for another example, when the multimedia resource is a song, preset mark points can be set based on the attributes of the song.
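For the book example, placing a mark point at the end of each passage can be sketched as follows; splitting on blank lines is a stand-in for the semantic segmentation the text does not specify:

```python
def mark_points_for_book(book_text: str):
    """Split a book into passages and place a mark point at each passage end,
    mirroring the example above."""
    passages = [p for p in book_text.split("\n\n") if p.strip()]
    points, offset = [], 0
    for p in passages:
        offset += len(p)
        points.append(offset)  # character offset of the passage end
        offset += 2            # account for the blank-line separator
    return points

print(mark_points_for_book("Chapter one...\n\nChapter two...\n\nThe end."))
# [14, 30, 40]
```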
In the embodiments of the present disclosure, the second video data is generated as follows: determining, based on the target interactive information, text information replying to the target interactive information in a preset text library; converting the text information into second audio data; searching the virtual object action database for second action data corresponding to the target interactive information, where the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene; determining second scene data based on the scene identifier of the second live broadcast scene, where the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene; combining the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene; and sending the second video data to the multiple viewer terminals.
Optionally, searching the virtual object action database for the second action data corresponding to the target interactive information includes: identifying, from the target interactive information, the emotion information fed back by the virtual object; and searching the virtual object action database for the second action data corresponding to the emotion information. Action data corresponding to different emotion information is preset in the virtual object action database, for example, a hand-clapping action for a happy emotion and a table-slapping action for an angry emotion.
Since the second live broadcast scene is the live broadcast scene in which the virtual object replies to interactive information, the second video data can be generated based on the target interactive information. Specifically, through semantic recognition and analysis, the server can determine, in the preset text library, the text information matching the target interactive information and convert that text into the virtual object's natural speech data in real time using text-to-speech (TTS) technology, obtaining the second audio data; it then determines, by searching the virtual object action database, the second action data corresponding to the emotion information represented by the target interactive information, and determines the second scene data based on the scene identifier of the second live broadcast scene; the server combines the second audio data, the second action data, and the second scene data to obtain the second video data and sends the second video data to the multiple viewer terminals.
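The reply pipeline just described (text lookup, TTS conversion, emotion-to-action lookup, then combination) can be sketched end to end. Every component below is a stand-in: the disclosure names the steps but no concrete text library, emotion classifier, or TTS engine, so all names and values here are illustrative:

```python
# Hypothetical preset reply library, emotion-to-action table, and scene data.
REPLY_LIBRARY = {"let's chat": "It's too late now, let's chat tomorrow."}
EMOTION_ACTIONS = {"happy": {"body": "clap"}, "angry": {"body": "slap_table"}}
SCENE_DB = {"scene_reply": {"background": "sofa.png"}}

def detect_emotion(text: str) -> str:
    # Placeholder classifier; a real system would use a trained model.
    return "happy"

def text_to_speech(text: str) -> bytes:
    # Stand-in for a real TTS engine producing the virtual object's voice.
    return text.encode("utf-8")

def build_second_video_data(target_comment: str, scene_id: str):
    reply_text = REPLY_LIBRARY.get(target_comment, "Thanks for the message!")
    second_audio = text_to_speech(reply_text)                 # TTS step
    second_action = EMOTION_ACTIONS[detect_emotion(target_comment)]
    second_scene = SCENE_DB[scene_id]
    return {"scene": second_scene, "action": second_action,
            "audio": second_audio, "reply_text": reply_text}

data = build_second_video_data("let's chat", "scene_reply")
```

The returned reply_text corresponds to the text shown in the second area of the live broadcast interface, consistent with the spoken reply.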
After receiving the second video data, a viewer terminal can generate the corresponding video content by decoding it and play, on the live broadcast interface, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene. Optionally, during playback, based on the second scene data and the second action data, the background picture of the live broadcast room and the actions of the virtual object can switch as the video content changes, but they may differ from the background picture and the actions of the virtual object in the first live broadcast scene.
It can be understood that the above arrangement, in which the first live broadcast scene is the scene where the virtual object performs multimedia resources and the second live broadcast scene is the scene where the virtual object replies to interactive information, is only an example; the settings of the first and second live broadcast scenes can be swapped, and the two scenes can alternate continuously so that the virtual object's live broadcast scene keeps switching.
In the embodiments of the present disclosure, the live broadcast interaction method may further include: sending the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
The server may determine the target interactive information among the multiple pieces of interactive information sent by live viewers based on a preset scheme, which can be set according to the actual situation; for example, the target interactive information may be determined based on the points of the live viewers who send the interactive information; or target interactive information matching preset keywords may be searched for, where the preset keywords may be mined and extracted in advance from trending information or may be keywords related to the live content; or semantic recognition may be performed on the interactive information, and interactive information with similar meanings may be clustered to obtain several information sets, where the set with the most interactive information is the hottest topic among the live viewers, and the interactive information of that set is taken as the target interactive information. The server may then send the target interactive information and the text information replying to it to the viewer terminals; a terminal may receive the reply text information and display, in the second area of the live broadcast interface, the target interactive information together with that reply text.
In the embodiments of the present disclosure, the server can receive interactive information from multiple viewer terminals in the first live broadcast scene and determine, based on the interactive information, whether the trigger condition for live broadcast scene switching is met; if the trigger condition is met, it sends the second video data corresponding to the second live broadcast scene to the multiple viewer terminals, where a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room. With this technical solution, when the server determines that the trigger condition for live broadcast scene switching is met, it can send the data of the second live broadcast scene to the viewer terminals so that they switch live broadcast scenes; the virtual object can switch from live broadcasting in the first live broadcast scene to live broadcasting in the second live broadcast scene based on the viewers' interactive information, which realizes interaction between the virtual object and viewers across different live broadcast scenes, meets the viewers' diverse interaction needs, increases the diversity and interest of the virtual object's live broadcast, and thus improves the viewers' interactive experience.
FIG. 5 is a schematic structural diagram of a live broadcast interaction apparatus according to an embodiment of the present disclosure. The apparatus may be implemented in software and/or hardware and may generally be integrated into an electronic device. As shown in FIG. 5, the apparatus is provided in multiple viewer terminals that enter the live broadcast room of a virtual object and includes:
a first live broadcast module 301, configured to play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals;
a second live broadcast module 302, configured to play, on the live broadcast interface, video content of the virtual object in a second live broadcast scene in response to the interactive information meeting a trigger condition; wherein a live broadcast scene is used to represent the live content type of the virtual object.
Optionally, the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
Optionally, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the first live broadcast module 301 is specifically configured to:
display, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed;
play video content of the virtual object performing a target multimedia resource, wherein the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
Optionally, the second live broadcast module 302 is specifically configured to:
play, on the live broadcast interface, video content in which the virtual object replies to the interactive information.
Optionally, the trigger condition includes at least one of the following: the quantity of the interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point.
Optionally, the virtual object replies to target interactive information among the interactive information; the apparatus further includes a reply module configured to:
display, in a second area of the live broadcast interface, the target interactive information and text information replying to the target interactive information.
Optionally, the first live broadcast module 301 is specifically configured to:
receive first video data corresponding to the first live broadcast scene, wherein the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the audio data matches the target multimedia resource;
play, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
Optionally, the second live broadcast module is specifically configured to:
receive second multimedia data corresponding to the second live broadcast scene, wherein the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information;
play, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
The live broadcast interaction apparatus provided by the embodiments of the present disclosure can perform the live broadcast interaction method provided by any embodiment of the present disclosure and has the corresponding functional modules and beneficial effects for performing the method.
FIG. 6 is a schematic structural diagram of another live broadcast interaction apparatus according to an embodiment of the present disclosure. The apparatus may be implemented in software and/or hardware and may generally be integrated into an electronic device. As shown in FIG. 6, the apparatus is provided on a server and includes:
an information receiving module 401, configured to receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
a data sending module 402, configured to send, if the trigger condition is met, second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
Optionally, the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
Optionally, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the apparatus further includes a data determination module configured to:
search an audio database for first audio data matching a target multimedia resource, and search a virtual object action database for first action data corresponding to the target multimedia resource, the first action data being used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene;
determine first scene data based on the scene identifier of the first live broadcast scene, the first scene data being used to represent the background picture of the live broadcast room in the first live broadcast scene;
combine the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene;
send the first video data to the multiple viewer terminals.
Optionally, the apparatus further includes a resource determination module configured to:
receive, from the multiple viewer terminals, trigger information for multiple multimedia resources displayed in the first live broadcast scene;
determine the target multimedia resource from the multiple multimedia resources based on the trigger information.
Optionally, the apparatus further includes a second data module configured to:
determine, based on target interactive information, text information replying to the target interactive information in a preset text library;
convert the text information into second audio data;
search the virtual object action database for second action data corresponding to the target interactive information, the second action data being used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene;
determine second scene data based on the scene identifier of the second live broadcast scene, the second scene data being used to represent the background picture of the live broadcast room in the second live broadcast scene;
combine the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene;
send the second video data to the multiple viewer terminals.
Optionally, the second data module is configured to:
identify, from the target interactive information, the emotion information fed back by the virtual object;
search the virtual object action database for the second action data corresponding to the emotion information.
Optionally, the apparatus further includes a reply information sending module configured to:
send the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
Optionally, the apparatus further includes a trigger condition module configured to determine that:
if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, wherein similar interactive information is interactive information whose similarity is greater than a similarity threshold;
keywords are extracted from the interactive information and matched against a first keyword and/or a second keyword in a keyword database, and if the interactive information includes the first keyword and/or the number of second keywords in the interactive information reaches a keyword threshold, the trigger condition is met;
if the duration of the first live broadcast scene reaches a preset duration, the trigger condition is met;
if the first live broadcast scene reaches a preset mark point, the trigger condition is met.
The live broadcast interaction apparatus provided by the embodiments of the present disclosure can perform the live broadcast interaction method provided by any embodiment of the present disclosure and has the corresponding functional modules and beneficial effects for performing the method.
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring now to FIG. 7, it shows a schematic structural diagram of an electronic device 500 suitable for implementing an embodiment of the present disclosure. The electronic device 500 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (for example, in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 7, the electronic device 500 may include a processing device (for example, a central processing unit, a graphics processor, etc.) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the electronic device 500. The processing device 501, the ROM 502, and the RAM 503 are connected to one another through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 507 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 508 including, for example, magnetic tape and hard disks; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device 500 with various devices, it should be understood that it is not required to implement or have all the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 509, or installed from the storage device 508, or installed from the ROM 502. When the computer program is executed by the processing device 501, the above functions defined in the live broadcast interaction method of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to an electric wire, an optical cable, RF (radio frequency), or any suitable combination of the foregoing.
In some implementations, the client and the server can communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or it may exist alone without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals; and, in response to the interactive information meeting a trigger condition, play, on the live broadcast interface, video content of the virtual object in a second live broadcast scene; wherein a live broadcast scene is used to represent the live content type of the virtual object.
Alternatively, the above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met; and if the trigger condition is met, send second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented in software or in hardware, and the name of a unit does not, under certain circumstances, constitute a limitation on the unit itself.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a live broadcast interaction method, applied to multiple viewer terminals that enter the live broadcast room of a virtual object, including:
playing, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and displaying interactive information from the multiple viewer terminals;
in response to the interactive information meeting a trigger condition, playing, on the live broadcast interface, video content of the virtual object in a second live broadcast scene; wherein a live broadcast scene is used to represent the live content type of the virtual object.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and playing, on the live broadcast interface, the video content of the virtual object in the first live broadcast scene includes:
displaying, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed;
playing video content of the virtual object performing a target multimedia resource, wherein the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, playing, on the live broadcast interface, the video content of the virtual object in the second live broadcast scene includes:
playing, on the live broadcast interface, video content in which the virtual object replies to the interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the trigger condition includes at least one of the following: the quantity of the interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the virtual object replies to target interactive information among the interactive information; and the method further includes:
displaying, in a second area of the live broadcast interface, the target interactive information and text information replying to the target interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, playing, on the live broadcast interface, the video content of the virtual object in the first live broadcast scene includes:
receiving first video data corresponding to the first live broadcast scene, wherein the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the audio data matches the target multimedia resource;
playing, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, playing, on the live broadcast interface, the video content of the virtual object in the second live broadcast scene includes:
receiving second multimedia data corresponding to the second live broadcast scene, wherein the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information;
playing, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
According to one or more embodiments of the present disclosure, there is provided a live broadcast interaction method, applied to a server, including:
receiving interactive information from multiple viewer terminals in a first live broadcast scene, and determining, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
if the trigger condition is met, sending second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the method further includes:
searching an audio database for first audio data matching a target multimedia resource, and searching a virtual object action database for first action data corresponding to the target multimedia resource, the first action data being used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene;
determining first scene data based on the scene identifier of the first live broadcast scene, the first scene data being used to represent the background picture of the live broadcast room in the first live broadcast scene;
combining the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene;
sending the first video data to the multiple viewer terminals.
According to one or more embodiments of the present disclosure, the live broadcast interaction method provided by the present disclosure further includes:
receiving, from the multiple viewer terminals, trigger information for multiple multimedia resources displayed in the first live broadcast scene;
determining the target multimedia resource from the multiple multimedia resources based on the trigger information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the second video data is generated by the following method:
determining, based on target interactive information, text information replying to the target interactive information in a preset text library;
converting the text information into second audio data;
searching a virtual object action database for second action data corresponding to the target interactive information, the second action data being used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene;
determining second scene data based on the scene identifier of the second live broadcast scene, the second scene data being used to represent the background picture of the live broadcast room in the second live broadcast scene;
combining the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene;
sending the second video data to the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, searching the virtual object action database for the second action data corresponding to the target interactive information includes:
identifying, from the target interactive information, the emotion information fed back by the virtual object;
searching the virtual object action database for the second action data corresponding to the emotion information.
According to one or more embodiments of the present disclosure, the live broadcast interaction method provided by the present disclosure further includes:
sending the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction method provided by the present disclosure, the trigger condition is determined in at least one of the following ways:
if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, wherein similar interactive information is interactive information whose similarity is greater than a similarity threshold;
extracting keywords from the interactive information and matching the keywords against a first keyword and/or a second keyword in a keyword database, wherein the trigger condition is met if the interactive information includes the first keyword and/or the number of second keywords in the interactive information reaches a keyword threshold;
if the duration of the first live broadcast scene reaches a preset duration, the trigger condition is met;
if the first live broadcast scene reaches a preset mark point, the trigger condition is met.
According to one or more embodiments of the present disclosure, there is provided a live broadcast interaction apparatus, including:
a first live broadcast module, configured to play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals;
a second live broadcast module, configured to play, on the live broadcast interface, video content of the virtual object in a second live broadcast scene in response to the interactive information meeting a trigger condition; wherein a live broadcast scene is used to represent the live content type of the virtual object.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the first live broadcast module is specifically configured to:
display, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed;
play video content of the virtual object performing a target multimedia resource, wherein the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the second live broadcast module is specifically configured to:
play, on the live broadcast interface, video content in which the virtual object replies to the interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the trigger condition includes at least one of the following: the quantity of the interactive information reaches a preset threshold; the interactive information includes a first keyword; the number of second keywords in the interactive information reaches a keyword threshold; the duration of the first live broadcast scene reaches a preset duration; and the first live broadcast scene reaches a preset mark point.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the virtual object replies to target interactive information among the interactive information; and the apparatus further includes a reply module configured to:
display, in a second area of the live broadcast interface, the target interactive information and text information replying to the target interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the first live broadcast module is specifically configured to:
receive first video data corresponding to the first live broadcast scene, wherein the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent the background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene, and the audio data matches the target multimedia resource;
play, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the second live broadcast module is specifically configured to:
receive second multimedia data corresponding to the second live broadcast scene, wherein the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent the background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information;
play, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
According to one or more embodiments of the present disclosure, there is provided a live broadcast interaction apparatus, including:
an information receiving module, configured to receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
a data sending module, configured to send, if the trigger condition is met, second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of the virtual object in the live broadcast room.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the apparatus further includes a data determination module configured to:
search an audio database for first audio data matching a target multimedia resource, and search a virtual object action database for first action data corresponding to the target multimedia resource, the first action data being used to represent the facial expressions and body movements of the virtual object in the first live broadcast scene;
determine first scene data based on the scene identifier of the first live broadcast scene, the first scene data being used to represent the background picture of the live broadcast room in the first live broadcast scene;
combine the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene;
send the first video data to the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the apparatus further includes a resource determination module configured to:
receive, from the multiple viewer terminals, trigger information for multiple multimedia resources displayed in the first live broadcast scene;
determine the target multimedia resource from the multiple multimedia resources based on the trigger information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the apparatus further includes a second data module configured to:
determine, based on target interactive information, text information replying to the target interactive information in a preset text library;
convert the text information into second audio data;
search the virtual object action database for second action data corresponding to the target interactive information, the second action data being used to represent the facial expressions and body movements of the virtual object in the second live broadcast scene;
determine second scene data based on the scene identifier of the second live broadcast scene, the second scene data being used to represent the background picture of the live broadcast room in the second live broadcast scene;
combine the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene;
send the second video data to the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the second data module is configured to:
identify, from the target interactive information, the emotion information fed back by the virtual object;
search the virtual object action database for the second action data corresponding to the emotion information.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the apparatus further includes a reply information sending module configured to:
send the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
According to one or more embodiments of the present disclosure, in the live broadcast interaction apparatus provided by the present disclosure, the apparatus further includes a trigger condition module configured to determine that:
if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, wherein similar interactive information is interactive information whose similarity is greater than a similarity threshold;
keywords are extracted from the interactive information and matched against a first keyword and/or a second keyword in a keyword database, and if the interactive information includes the first keyword and/or the number of second keywords in the interactive information reaches a keyword threshold, the trigger condition is met;
if the duration of the first live broadcast scene reaches a preset duration, the trigger condition is met;
if the first live broadcast scene reaches a preset mark point, the trigger condition is met.
According to one or more embodiments of the present disclosure, there is provided an electronic device, including:
a processor;
a memory for storing instructions executable by the processor;
the processor being configured to read the executable instructions from the memory and execute the instructions to implement any of the live broadcast interaction methods provided by the present disclosure.
According to one or more embodiments of the present disclosure, there is provided a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to perform any of the live broadcast interaction methods provided by the present disclosure.
The above description is only a description of preferred embodiments of the present disclosure and of the technical principles employed. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that these operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (23)

  1. A live broadcast interaction method, applied to multiple viewer terminals that enter a live broadcast room of a virtual object, comprising:
    playing, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and displaying interactive information from the multiple viewer terminals;
    in response to the interactive information meeting a trigger condition, playing, on the live broadcast interface, video content of the virtual object in a second live broadcast scene; wherein the trigger condition is a condition for determining, based on the interactive information, whether to switch the live broadcast scene, and a live broadcast scene is used to represent the live content type of the virtual object.
  2. The method according to claim 1, wherein the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  3. The method according to claim 2, wherein the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and playing, on the live broadcast interface, the video content of the virtual object in the first live broadcast scene comprises:
    displaying, in a first area of the live broadcast interface, multimedia resource information of multiple multimedia resources to be performed;
    playing video content of the virtual object performing a target multimedia resource, wherein the target multimedia resource is determined based on trigger information for the multiple multimedia resources from the multiple viewer terminals.
  4. The method according to claim 1, wherein playing, on the live broadcast interface, the video content of the virtual object in the second live broadcast scene comprises:
    playing, on the live broadcast interface, video content in which the virtual object replies to the interactive information.
  5. The method according to claim 1, wherein the trigger condition includes at least one of the following: the quantity of the interactive information reaches a preset threshold; the interactive information includes a first keyword; and the number of second keywords in the interactive information reaches a keyword threshold.
  6. The method according to claim 4, wherein the virtual object replies to target interactive information among the interactive information; and the method further comprises:
    displaying, in a second area of the live broadcast interface, the target interactive information and text information replying to the target interactive information.
  7. The method according to claim 6, wherein the target interactive information is determined based on points of the live viewers who send interactive information, or the target interactive information is obtained by matching against preset keywords.
  8. The method according to claim 3, wherein playing, on the live broadcast interface, the video content of the virtual object in the first live broadcast scene comprises:
    receiving first video data corresponding to the first live broadcast scene, wherein the first video data includes first scene data, first action data, and first audio data; the first scene data is used to represent a background picture of the live broadcast room in the first live broadcast scene, the first action data is used to represent facial expressions and body movements of the virtual object in the first live broadcast scene, and the audio data matches the target multimedia resource;
    playing, based on the first video data, the video content of the virtual object performing the target multimedia resource in the first live broadcast scene on the live broadcast interface.
  9. The method according to claim 4, wherein playing, on the live broadcast interface, the video content of the virtual object in the second live broadcast scene comprises:
    receiving second multimedia data corresponding to the second live broadcast scene, wherein the second multimedia data includes second scene data, second action data, and second audio data; the second scene data is used to represent a background picture of the live broadcast room in the second live broadcast scene, the second action data is used to represent facial expressions and body movements of the virtual object in the second live broadcast scene, and the second audio data is generated based on the target interactive information;
    playing, based on the second multimedia data, the video content in which the virtual object replies to the target interactive information in the second live broadcast scene on the live broadcast interface.
  10. The method according to claim 8 or 9, wherein the scene corresponding to the background picture of the live broadcast room includes a background scene of the virtual object in a live broadcast scene and a picture-perspective scene, the picture-perspective scene being the viewing angle from which the virtual object is shot by different cameras.
  11. A live broadcast interaction method, applied to a server, comprising:
    receiving interactive information from multiple viewer terminals in a first live broadcast scene, and determining, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
    if the trigger condition is met, sending second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of a virtual object in the live broadcast room.
  12. The method according to claim 11, wherein the live broadcast scenes include a live broadcast scene in which the virtual object performs multimedia resources and a live broadcast scene in which the virtual object replies to interactive information.
  13. The method according to claim 12, wherein the first live broadcast scene is a live broadcast scene in which the virtual object performs multimedia resources, and the method further comprises:
    searching an audio database for first audio data matching a target multimedia resource, and searching a virtual object action database for first action data corresponding to the target multimedia resource, the first action data being used to represent facial expressions and body movements of the virtual object in the first live broadcast scene;
    determining first scene data based on a scene identifier of the first live broadcast scene, the first scene data being used to represent a background picture of the live broadcast room in the first live broadcast scene;
    combining the first action data, the first audio data, and the first scene data into first video data corresponding to the first live broadcast scene;
    sending the first video data to the multiple viewer terminals.
  14. The method according to claim 13, further comprising:
    receiving, from the multiple viewer terminals, trigger information for multiple multimedia resources displayed in the first live broadcast scene;
    determining the target multimedia resource from the multiple multimedia resources based on the trigger information.
  15. The method according to claim 11, wherein the second video data is generated by:
    determining, based on target interactive information, text information replying to the target interactive information in a preset text library;
    converting the text information into second audio data;
    searching a virtual object action database for second action data corresponding to the target interactive information, the second action data being used to represent facial expressions and body movements of the virtual object in the second live broadcast scene;
    determining second scene data based on a scene identifier of the second live broadcast scene, the second scene data being used to represent a background picture of the live broadcast room in the second live broadcast scene;
    combining the second action data, the second audio data, and the second scene data into the second video data corresponding to the second live broadcast scene;
    sending the second video data to the multiple viewer terminals.
  16. The method according to claim 15, wherein searching the virtual object action database for the second action data corresponding to the target interactive information comprises:
    identifying, from the target interactive information, emotion information fed back by the virtual object;
    searching the virtual object action database for the second action data corresponding to the emotion information.
  17. The method according to claim 15, further comprising:
    sending the target interactive information and the text information replying to the target interactive information to the multiple viewer terminals.
  18. The method according to any one of claims 15 to 17, wherein the target interactive information is determined based on points of the live viewers who send interactive information, or the target interactive information is obtained by matching against preset keywords.
  19. The method according to claim 11, wherein the trigger condition is determined in at least one of the following ways:
    if the quantity of similar interactive information in the interactive information reaches a preset threshold, the trigger condition is met, wherein similar interactive information is interactive information whose similarity is greater than a similarity threshold;
    extracting keywords from the interactive information and matching the keywords against a first keyword and/or a second keyword in a keyword database, wherein the trigger condition is met if the interactive information includes the first keyword and/or the number of second keywords in the interactive information reaches a keyword threshold.
  20. A live broadcast interaction apparatus, provided in multiple viewer terminals that enter a live broadcast room of a virtual object, comprising:
    a first live broadcast module, configured to play, on a live broadcast interface, video content of the virtual object in a first live broadcast scene, and display interactive information from the multiple viewer terminals;
    a second live broadcast module, configured to play, on the live broadcast interface, video content of the virtual object in a second live broadcast scene in response to the interactive information meeting a trigger condition; wherein a live broadcast scene is used to represent the live content type of the virtual object.
  21. A live broadcast interaction apparatus, provided on a server, comprising:
    an information receiving module, configured to receive interactive information from multiple viewer terminals in a first live broadcast scene, and determine, based on the interactive information, whether a trigger condition for live broadcast scene switching is met;
    a data sending module, configured to send, if the trigger condition is met, second video data corresponding to a second live broadcast scene to the multiple viewer terminals; wherein a live broadcast scene is used to represent the live content type of a virtual object in the live broadcast room.
  22. An electronic device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    the processor being configured to read the executable instructions from the memory and execute the instructions to implement the live broadcast interaction method according to any one of claims 1 to 19.
  23. A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is used to perform the live broadcast interaction method according to any one of claims 1 to 19.
PCT/CN2021/129508 2020-12-11 2021-11-09 Live broadcast interaction method, apparatus, device, and medium WO2022121601A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2023534896A 2020-12-11 2021-11-09 Live streaming interaction method, apparatus, device, and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011463601.8 2020-12-11
CN202011463601.8A 2020-12-11 2020-12-11 Live broadcast interaction method, apparatus, device, and medium

Publications (1)

Publication Number Publication Date
WO2022121601A1 true WO2022121601A1 (zh) 2022-06-16

Family

ID=75233674

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/129508 WO2022121601A1 (zh) Live broadcast interaction method, apparatus, device, and medium 2020-12-11 2021-11-09

Country Status (3)

Country Link
JP (1) JP2023553101A (zh)
CN (1) CN112616063B (zh)
WO (1) WO2022121601A1 (zh)


Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112616063B (zh) 2020-12-11 2022-10-28 北京字跳网络技术有限公司 Live broadcast interaction method, apparatus, device, and medium
CN113115061B (zh) 2021-04-07 2023-03-10 北京字跳网络技术有限公司 Live broadcast interaction method and apparatus, electronic device, and storage medium
CN115379265B (zh) 2021-05-18 2023-12-01 阿里巴巴新加坡控股有限公司 Live broadcast behavior control method and apparatus for a virtual host
CN113286162B (zh) 2021-05-20 2022-05-31 成都威爱新经济技术研究院有限公司 Mixed-reality-based multi-camera live picture broadcasting method and system
CN115580753A (zh) 2021-06-21 2023-01-06 北京字跳网络技术有限公司 Interaction method, apparatus, device, and storage medium based on a multimedia work
CN113810729B (zh) 2021-09-16 2024-02-02 中国平安人寿保险股份有限公司 Live streaming atmosphere special effect matching method, apparatus, device, and medium
CN113965771A (zh) 2021-10-22 2022-01-21 成都天翼空间科技有限公司 VR live streaming user interactive experience system
CN116233382A (zh) 2022-01-07 2023-06-06 深圳看到科技有限公司 Scene-element-based three-dimensional scene interactive video generation method and apparatus
CN114125569B (zh) 2022-01-27 2022-07-15 阿里巴巴(中国)有限公司 Live streaming processing method and apparatus
CN114615514B (zh) 2022-03-14 2023-09-22 深圳幻影未来信息科技有限公司 Virtual human live streaming interaction system
CN115225948A (zh) 2022-06-28 2022-10-21 北京字跳网络技术有限公司 Live broadcast room interaction method, apparatus, device, and medium
CN115243096A (zh) 2022-07-27 2022-10-25 北京字跳网络技术有限公司 Live broadcast room display method and apparatus, electronic device, and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120224024A1 (en) * 2009-03-04 2012-09-06 Lueth Jacquelynn R System and Method for Providing a Real-Time Three-Dimensional Digital Impact Virtual Audience
US20150088622A1 (en) * 2012-04-06 2015-03-26 LiveOne, Inc. Social media application for a media content providing platform
CN106878820A (zh) * 2016-12-09 2017-06-20 北京小米移动软件有限公司 Live broadcast interaction method and apparatus
CN107750005A (zh) * 2017-09-18 2018-03-02 迈吉客科技(北京)有限公司 Virtual interaction method and terminal
CN107911724A (zh) * 2017-11-21 2018-04-13 广州华多网络科技有限公司 Live broadcast interaction method, apparatus, and system
CN110519611A (zh) * 2019-08-23 2019-11-29 腾讯科技(深圳)有限公司 Live broadcast interaction method and apparatus, electronic device, and storage medium
CN112616063A (zh) * 2020-12-11 2021-04-06 北京字跳网络技术有限公司 Live broadcast interaction method, apparatus, device, and medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018103516A1 (zh) * 2016-12-06 2018-06-14 腾讯科技(深圳)有限公司 Method for acquiring virtual resources of a virtual object, and client
CN107423809B (zh) * 2017-07-07 2021-02-26 北京光年无限科技有限公司 Multimodal interaction method and system for a virtual robot applied to a live video platform
CN111010589B (zh) * 2019-12-19 2022-02-25 腾讯科技(深圳)有限公司 Artificial-intelligence-based live streaming method, apparatus, device, and storage medium
CN111010586B (zh) * 2019-12-19 2021-03-19 腾讯科技(深圳)有限公司 Artificial-intelligence-based live streaming method, apparatus, device, and storage medium


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022664A (zh) * 2022-06-17 2022-09-06 云知声智能科技股份有限公司 Artificial-intelligence-based live streaming sales assistance method and apparatus
CN115866284A (zh) * 2022-11-28 2023-03-28 珠海南方数字娱乐公共服务中心 Product information live streaming management system and method based on virtual reality technology
CN115866284B (zh) 2022-11-28 2023-09-01 珠海南方数字娱乐公共服务中心 Product information live streaming management system and method based on virtual reality technology
CN116737936A (zh) * 2023-06-21 2023-09-12 圣风多媒体科技(上海)有限公司 Artificial-intelligence-based AI virtual character language library classification management system
CN116737936B (zh) 2023-06-21 2024-01-02 圣风多媒体科技(上海)有限公司 Artificial-intelligence-based AI virtual character language library classification management system

Also Published As

Publication number Publication date
CN112616063A (zh) 2021-04-06
CN112616063B (zh) 2022-10-28
JP2023553101A (ja) 2023-12-20

Similar Documents

Publication Publication Date Title
WO2022121601A1 (zh) 2022-06-16 Live broadcast interaction method, apparatus, device, and medium
US10210002B2 (en) 2019-02-19 Method and apparatus of processing expression information in instant communication
WO2022121557A1 (zh) 2022-06-16 Live broadcast interaction method, apparatus, device, and medium
WO2021114881A1 (zh) 2021-06-17 Intelligent commentary generation and playback methods, apparatus, device, and computer storage medium
US10659499B2 (en) 2020-05-19 Providing selectable content items in communications
WO2022121558A1 (zh) 2022-06-16 Live streaming singing method, apparatus, device, and medium
US11917344B2 (en) 2024-02-27 Interactive information processing method, device and medium
CN108847214B (zh) 2021-06-01 Voice processing method, client, apparatus, terminal, server, and storage medium
JP2019003604A (ja) 2019-01-10 Method, system, and program for content curation in video-based communications
US20220318306A1 (en) 2022-10-06 Video-based interaction implementation method and apparatus, device and medium
JP6337183B1 (ja) 2018-06-06 Text extraction device, comment posting device, comment posting support device, playback terminal, and context vector calculation device
CN110602516A (zh) 2019-12-20 Information interaction method and apparatus based on live video streaming, and electronic device
CN109600559B (zh) 2021-07-23 Video special effect adding method and apparatus, terminal device, and storage medium
CN113010698B (zh) Multimedia interaction method, information interaction method, apparatus, device, and medium
CN112653902A (zh) 2021-04-13 Speaker recognition method and apparatus, and electronic device
US20240121451A1 (en) 2024-04-11 Video processing method and apparatus, storage medium, and device
CN112738557A (zh) 2021-04-30 Video processing method and apparatus
CN114501064B (zh) Video generation method, apparatus, device, medium, and product
CN111158924A (zh) 2020-05-15 Content sharing method and apparatus, electronic device, and readable storage medium
JP2019008779A (ja) 2019-01-17 Text extraction device, comment posting device, comment posting support device, playback terminal, and context vector calculation device
CN110990632B (zh) Video processing method and apparatus
WO2023061229A1 (zh) 2023-04-20 Video generation method and device
US20230027035A1 (en) 2023-01-26 Automated narrative production system and script production method with real-time interactive characters
CN113301352A (zh) 2021-08-24 Automatic chatting during video playback
CN114793289B (zh) Display processing method for video information in a live broadcast room, terminal, server, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21902308

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023534896

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21902308

Country of ref document: EP

Kind code of ref document: A1