WO2022121592A1 - 一种直播互动方法及装置 (A live broadcast interaction method and device) - Google Patents

Publication number: WO2022121592A1
Application number: PCT/CN2021/129237
Authority: WIPO (PCT)
Prior art keywords: reply text, virtual object, live, data, video stream
Other languages: English (en), French (fr)
Inventor: 南天骄
Applicant: 北京字跳网络技术有限公司

Classifications

    • H04N21/2187 Live feed
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H04N21/2343 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/2355 Processing of additional data involving reformatting operations of additional data, e.g. HTML pages
    • H04N21/2393 Interfacing the upstream path of the transmission network involving handling client requests
    • H04N21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H04N21/4788 Supplemental services communicating with other users, e.g. chatting
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/7867 Retrieval of video data characterised by using metadata, using information manually generated, e.g. tags, keywords, comments, title and artist information
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 Semantic analysis
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L15/1822 Parsing for meaning understanding
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems

Definitions

  • the present invention relates to the technical field of human-computer interaction, and in particular, to a method and device for live broadcast interaction.
  • both the anchor side and the audience side are real-time online users.
  • the audience side user can input interactive information while watching the video generated by the anchor side user in real time, and the anchor side user can view the interactive information input by the audience side user and provide feedback on the interactive information.
  • live broadcasts are no longer limited to real-person live broadcasts; a virtual anchor can be generated based on artificial intelligence (AI) technology, and a virtual live broadcast can be conducted based on the virtual anchor.
  • the virtual host broadcasts the live broadcast according to the preset process and scene, and cannot interact with the audience in real time.
  • the present invention provides a live broadcast interaction method and device, which are used to solve the problem that real-time interaction with the audience cannot be performed during virtual live broadcast.
  • an embodiment of the present invention provides a live interactive method, including:
  • the obtaining feedback data according to the interaction information includes:
  • the first correspondence includes the correspondence between each emotion label and the facial expression data of the virtual object
  • the second correspondence includes the correspondence between each emotion label and the body motion data of the virtual object
  • the obtaining feedback data according to the interaction information includes:
  • the first feedback model is a model obtained by training a first algorithm model based on the sample interaction information and the facial expression data corresponding to the sample interaction information
  • the second feedback model is a model obtained by training a second algorithm model based on the sample interaction information and the body motion data corresponding to the sample interaction information.
  • the method further includes:
  • Audio data of the live video stream is generated based on the reply text and the timbre of the virtual object.
  • the obtaining, according to the interactive information, the reply text for replying to the interactive information includes:
  • the first reply text library includes reply text corresponding to at least one semantic
  • the second reply text library includes reply text corresponding to each semantic
  • the matching degree between each reply text in the first reply text library and the character of the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the character of the virtual object; the obtaining of the reply text based on the first reply text library, the second reply text library and the semantics of the interaction information includes:
  • judging whether the first reply text library contains a reply text corresponding to the semantics of the interaction information; if so, the reply text is acquired based on the first reply text library and the semantics of the interaction information; if not, the reply text is acquired based on the second reply text library and the semantics of the interaction information.
  • the method before acquiring the reply text based on the first reply text database, the second reply text database and the semantics of the interaction information, the method further includes:
  • receiving reply text corresponding to at least one semantic input by the user based on the character setting of the virtual object, and generating the first reply text library accordingly; obtaining reply text corresponding to each semantic based on big data, and generating the second reply text library according to the reply text obtained based on the big data.
  • the method further includes:
  • the reply text is sent to the terminal device.
  • the method further includes: acquiring mouth shape data of the virtual object according to the voice corresponding to the reply text;
  • the generating a live video stream based on the feedback data includes:
  • the live video stream is generated according to the feedback data and the mouth shape data.
  • an embodiment of the present invention provides a live interactive method, including:
  • the live video stream is a video stream generated by the live server based on feedback data corresponding to the interaction information, and the feedback data includes facial expression data of the virtual object and/or body motion data of the virtual object;
  • the live video stream is displayed in the live room of the virtual object.
  • the method further includes:
  • the audio data is audio data generated by the live broadcast server based on the reply text used to reply to the interactive information and the timbre of the virtual object;
  • the audio data is played in the live room of the virtual object.
  • the method further includes:
  • the reply text is displayed in the live room of the virtual object.
  • an embodiment of the present invention provides a live server, including:
  • a receiving unit configured to receive interactive information sent by the terminal device, where the interactive information is interactive information input by the user in the live broadcast room of the virtual object;
  • a processing unit configured to obtain feedback data according to the interaction information, where the feedback data includes facial expression data of the virtual object and/or body motion data of the virtual object;
  • a generating unit for generating a live video stream based on the feedback data
  • a sending unit configured to send the live video stream to the terminal device.
  • the processing unit is specifically configured to parse the interaction information to obtain an emotion label, obtain the facial expression data based on the first correspondence and the emotion label, and/or obtain the body motion data based on the second correspondence and the emotion label;
  • the emotion label is used to represent the emotion of the virtual object when the interaction information is fed back.
  • the first correspondence includes the correspondence between each emotion label and the facial expression data of the virtual object
  • the second correspondence includes the corresponding relationship between each emotional label and the body motion data of the virtual object.
  • the processing unit is specifically configured to acquire the facial expression data according to the interaction information and the first feedback model, and/or acquire the body motion data according to the interaction information and the second feedback model;
  • the first feedback model is a model obtained by training a first algorithm model based on the sample interaction information and the facial expression data corresponding to the sample interaction information
  • the second feedback model is a model obtained by training a second algorithm model based on the sample interaction information and the body motion data corresponding to the sample interaction information.
  • the processing unit is further configured to obtain a reply text for replying to the interactive information according to the interactive information;
  • the generating unit is further configured to generate audio data based on the reply text and the timbre of the virtual object;
  • the sending unit is further configured to send the audio data to the terminal device.
  • the processing unit is specifically configured to acquire the semantics of the interaction information, and obtain the reply text based on the first reply text library, the second reply text library and the semantics of the interaction information;
  • the first reply text library includes reply text corresponding to at least one semantic
  • the second reply text library includes reply text corresponding to each semantic
  • the matching degree between each reply text in the first reply text library and the character of the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the character of the virtual object;
  • the processing unit is specifically configured to judge whether the first reply text library contains the reply text corresponding to the semantics of the interaction information; if so, the reply text is acquired based on the first reply text library and the semantics of the interaction information; if not, the reply text is acquired based on the second reply text library and the semantics of the interaction information.
  • the processing unit is further configured to, before acquiring the reply text based on the first reply text library, the second reply text library and the semantics of the interaction information, receive reply text corresponding to at least one semantic input by the user based on the character setting of the virtual object; generate the first reply text library according to the reply text corresponding to the at least one semantic input by the user; obtain reply text corresponding to each semantic based on big data; and generate the second reply text library according to the reply text obtained based on the big data.
  • the sending unit is further configured to send the reply text to the terminal device.
  • the processing unit is further configured to acquire mouth shape data of the virtual object according to the voice corresponding to the reply text;
  • the generating unit is specifically configured to generate the live video stream according to the feedback data and the mouth shape data.
  • an embodiment of the present invention provides a terminal device, including:
  • a user input unit configured to receive interactive information input by the user in the live broadcast room of the virtual object
  • a sending unit configured to send the interactive information to the live server
  • a receiving unit configured to receive a live video stream sent by the live server, where the live video stream is a video stream generated by the live server based on feedback data corresponding to the interaction information, and the feedback data includes facial expression data of the virtual object and/or body motion data of the virtual object;
  • An output unit configured to display the live video stream in the live room of the virtual object.
  • the live video stream further includes audio data, and the audio data of the live video stream is audio data generated by the live server based on the reply text used for replying to the interaction information and the timbre of the virtual object.
  • the receiving unit is further configured to receive the reply text sent by the live server;
  • the output unit is further configured to display the reply text in the live broadcast room of the virtual object.
  • an embodiment of the present invention provides an electronic device, including: a memory and a processor, where the memory is used to store a computer program, and the processor is configured to, when the computer program is invoked, execute the live broadcast interaction method described in the first aspect or any optional implementation of the first aspect, or in the second aspect or any optional implementation of the second aspect.
  • an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the live broadcast interaction method described in the first aspect or any optional implementation of the first aspect, or in the second aspect or any optional implementation of the second aspect, is implemented.
  • When receiving interaction information that is sent by the terminal device and input by the user in the live broadcast room of the virtual object, the live interaction method provided by the embodiment of the present invention acquires, according to the interaction information, feedback data including facial expression data of the virtual object and/or body motion data of the virtual object, then generates a live video stream based on the feedback data, and sends the live video stream to the terminal device. Because the live interaction method provided by the embodiment of the present invention can obtain the facial expression data of the virtual object and/or the body motion data of the virtual object when receiving the interactive information input by the user in the live broadcast room, and generate a live video stream from the facial expression data and/or the body motion data of the virtual object, the interactive information input by the user is fed back through the facial expressions and/or body movements of the virtual object, thereby solving the problem that real-time interaction with the audience cannot be performed during virtual live broadcast.
  • FIG. 1 is a flowchart of steps of a live broadcast interaction method provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an image frame of a live video stream provided by an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a reply text provided by an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a live interactive device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a terminal device provided by an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention.
  • the terms "first" and "second" in the description and claims of the present invention are used to distinguish similar objects, rather than to describe a specific order of the objects.
  • first correspondence and the second correspondence are used for distinguishing different correspondences, rather than for describing a specific order of the correspondences.
  • words such as "exemplary" or "for example" are used to mean serving as an example, instance or illustration. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present invention should not be construed as being more preferred or more advantageous than other embodiments or designs. Rather, the use of words such as "exemplary" or "for example" is intended to present the related concepts in a specific manner.
  • the meaning of "plurality” refers to two or more.
  • An embodiment of the present invention provides a live broadcast interaction method.
  • the live broadcast interaction method includes the following steps S101 to S106:
  • a terminal device receives interactive information input by a user in a live broadcast room of a virtual object.
  • the terminal device in the embodiment of the present invention may be a mobile phone, a tablet computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart watch, a smart bracelet, or another type of terminal device; the embodiment of the present invention does not limit the type of the terminal device.
  • the virtual object in the embodiment of the present invention refers to a model of a virtual anchor generated manually or based on artificial intelligence technology.
  • the image of the virtual object is not limited in the embodiment of the present invention.
  • the virtual object may be a two-dimensional cartoon character, a real-person simulation image, etc.
  • the interactive information in this embodiment of the present invention may be text comment content, expressions in an expression package, voice, likes, virtual gifts, and the like.
  • the terminal device sends the interactive information to the live broadcast server.
  • the live server receives the interactive information sent by the terminal device.
  • the interactive information is interactive information input by the user in the live broadcast room of the virtual object.
  • the live server in the embodiment of the present invention may be any server interconnected with the terminal device, such as a cloud server, a desktop server, a rack server, a blade server, etc.
  • the live server acquires feedback data according to the interaction information.
  • the feedback data includes facial expression data of the virtual object and/or body motion data of the virtual object.
  • the facial expression data in this embodiment of the present invention may include: data used to represent expressions such as smile, laugh, cry, excitement, shyness, disgust, etc. of the virtual object
  • the body motion data may include: data used to represent actions of the virtual object such as nodding, shaking the head, applauding, dancing, etc.
  • the above step S103 (the live broadcast server obtains feedback data according to the interaction information) may include the following steps a and b:
  • Step a parsing the interactive information to obtain emotional tags.
  • the emotion tag is used to represent the emotion of the virtual object when the interaction information is fed back.
  • If the interactive information is text comment content, the content of the text comments can be understood based on natural language processing (NLP) technology to obtain emotion labels.
  • If the interactive information is not text, the interactive information can be converted into text content first, and then the emotion label can be obtained by understanding the converted text based on NLP technology.
  • For example, if the interactive information is "you are so annoying", the emotion label "anger" can be obtained by parsing the interactive information directly based on NLP technology.
  • For example, if the interactive information is a like, the interactive information can first be converted into the text content "You are great", and then the text converted from the interactive information is analyzed based on NLP technology to obtain the emotion label "joy".
  • the emotional tags in this embodiment of the present invention may include: joy, anger, sadness, fear, grievance, and the like.
  • Step b Acquire the facial expression data based on the first correspondence and the emotion label and/or acquire the body motion data based on the second correspondence and the emotion label.
  • the first correspondence includes the correspondence between each emotion label and facial expression data
  • the second correspondence includes the correspondence between each emotion label and body motion data
  • That is, the correspondence between each emotion label and the facial expression data of the virtual object and/or the correspondence between each emotion label and the body motion data of the virtual object is established in advance; after the interaction information is acquired, the interaction information is first parsed to obtain the emotion label, and then the facial expression data of the virtual object and/or the body motion data of the virtual object are acquired according to the emotion label and the pre-established correspondence.
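  • The sketch below is a minimal, illustrative Python rendering of steps a and b, assuming the correspondences are simple lookup tables keyed by emotion label; the emotion labels, expression/motion names, and the keyword-based parse_emotion helper are hypothetical placeholders for the NLP parsing described above, not part of the patent.

```python
# Illustrative sketch only: data formats and names are assumptions, not from the patent.

def parse_emotion(text: str) -> str:
    """Toy stand-in for NLP-based parsing of the interaction information (step a)."""
    if "beautiful" in text or "great" in text:
        return "joy"
    if "annoying" in text:
        return "anger"
    return "sadness"

# First correspondence: emotion label -> facial expression data of the virtual object
FIRST_CORRESPONDENCE = {
    "joy": {"expression": "smile"},
    "anger": {"expression": "pout"},
    "sadness": {"expression": "cry"},
}

# Second correspondence: emotion label -> body motion data of the virtual object
SECOND_CORRESPONDENCE = {
    "joy": {"motion": "applaud"},
    "anger": {"motion": "shake_head"},
    "sadness": {"motion": "lower_head"},
}

def get_feedback_data(interaction_text: str) -> dict:
    """Step b: look up expression and/or motion data by the parsed emotion label."""
    emotion_label = parse_emotion(interaction_text)
    return {
        "facial_expression": FIRST_CORRESPONDENCE.get(emotion_label),
        "body_motion": SECOND_CORRESPONDENCE.get(emotion_label),
    }

print(get_feedback_data("you are so beautiful"))
```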
  • the foregoing step S103 (the live broadcast server obtains feedback data according to the interaction information) includes:
  • the facial expression data is acquired according to the interaction information and the first feedback model and/or the body motion data is acquired according to the interaction information and the second feedback model.
  • the first feedback model is a model obtained by training a first algorithm model based on the sample interaction information and the facial expression data corresponding to the sample interaction information
  • the second feedback model is a model obtained by training a second algorithm model based on the sample interaction information and the body motion data corresponding to the sample interaction information.
  • the first algorithm model and the second algorithm model may be machine learning algorithm models such as a deep learning neural network model and a convolutional neural network model, and the first algorithm model and the second algorithm model may be the same or different.
  • the embodiments of the present invention do not limit the specific types of the first algorithm model and the second algorithm model.
  • the first feedback model and the second feedback model in the above embodiment may be two independent models, or may be two sub-models within the same model.
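  • As a rough illustration of how such a feedback model could be obtained, the sketch below trains a small text classifier that maps sample interaction information to an expression label; the training samples, labels, and the use of scikit-learn are assumptions for illustration only, since the patent does not fix the model family (a deep learning or convolutional neural network could be used instead). The second feedback model for body motion data would be trained analogously.

```python
# Hypothetical training sketch for the "first feedback model" (interaction -> expression).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Assumed sample interaction information and the expression label each sample maps to.
sample_interactions = ["you are so beautiful", "you are so annoying", "great stream today", "this is boring"]
expression_labels = ["smile", "pout", "smile", "cry"]

first_feedback_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
first_feedback_model.fit(sample_interactions, expression_labels)

# At serving time the live server feeds in the received interaction information and
# uses the predicted label to select the virtual object's facial expression data.
print(first_feedback_model.predict(["you are beautiful today"]))
```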
  • the live server generates a live video stream based on the feedback data.
  • For example, take the case where the interactive information is "you are so beautiful" and the feedback data includes the facial expression data of the virtual object and the body motion data of the virtual object.
  • In this case, the virtual object in the live video stream generated based on the feedback data gives feedback on the interactive information "you are so beautiful" through the facial expression "smile" and the body movement "applaud".
  • the live server sends the live video stream to the terminal device.
  • the terminal device receives the live video stream sent by the live server.
  • the live video stream received by the terminal device is a video stream generated by the live server based on the feedback data corresponding to the interaction information.
  • the feedback data includes facial expression data of the virtual object and/or body motion data of the virtual object.
  • the terminal device displays the live video stream in the live room of the virtual object.
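  • Putting the steps together, a highly simplified sketch of the server-side flow is given below; it reuses the get_feedback_data helper sketched earlier, and the rendering function and step mapping beyond S103 and S105 are placeholders with no real networking or video encoding.

```python
# Placeholder pipeline: only the ordering of the steps follows the description above.

def render_live_video(feedback_data: dict) -> bytes:
    """Stand-in for the rendering/encoding pipeline that animates the virtual object."""
    return repr(feedback_data).encode("utf-8")

def handle_interaction(interaction_info: str) -> bytes:
    feedback_data = get_feedback_data(interaction_info)   # step S103: obtain feedback data
    live_video_stream = render_live_video(feedback_data)  # step S105: generate the live video stream
    return live_video_stream                              # then sent to the terminal device

# handle_interaction("you are so beautiful") would be invoked when interaction
# information arrives from the terminal device.
```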
  • When receiving interaction information that is sent by the terminal device and input by the user in the live broadcast room of the virtual object, the live interaction method provided by the embodiment of the present invention acquires, according to the interaction information, feedback data including facial expression data of the virtual object and/or body motion data of the virtual object, then generates a live video stream based on the feedback data, and sends the live video stream to the terminal device. Because the live interaction method provided by the embodiment of the present invention can obtain the facial expression data of the virtual object and/or the body motion data of the virtual object when receiving the interactive information input by the user in the live broadcast room, and generate a live video stream from the facial expression data and/or the body motion data of the virtual object, the interactive information input by the user is fed back through the facial expressions and/or body movements of the virtual object, so the embodiment of the present invention can solve the problem that real-time interaction with the audience cannot be performed during virtual live broadcast.
  • the live interaction method provided by the embodiment of the present invention further includes:
  • the live server obtains, according to the interactive information, a reply text for replying to the interactive information
  • the live server generates audio data of the live video stream based on the reply text and the timbre of the virtual object.
  • the live video stream received by the terminal device also includes audio data
  • the audio data of the live video stream is audio data generated based on the reply text and the timbre of the virtual object.
  • the timbre of the virtual object can be preset by the developer.
  • the audio data of the live video stream is generated based on the reply text and the timbre of the virtual object, which may be: converting the reply text into voice through a speech synthesis (Text-To-Speech, TTS) technology, and then generating the audio data of the live video stream according to the converted voice.
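  • A minimal sketch of this TTS step is shown below, using the off-the-shelf pyttsx3 engine purely as a stand-in; the voice id that approximates the preset timbre of the virtual object, and the output path, are assumptions.

```python
# Illustrative only: the patent specifies TTS conversion but not a particular engine.
import pyttsx3

def synthesize_reply_audio(reply_text: str, voice_id: str, out_path: str = "reply.wav") -> str:
    engine = pyttsx3.init()
    engine.setProperty("voice", voice_id)      # select a voice approximating the virtual object's timbre
    engine.save_to_file(reply_text, out_path)  # render the reply text to an audio file
    engine.runAndWait()
    return out_path
```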
  • the method provided by the embodiment of the present invention further includes:
  • the live server acquires the mouth shape data of the virtual object according to the voice corresponding to the reply text.
  • the above step S105 (the live server generating the live video stream based on the feedback data) includes: the live server generating the live video stream according to the feedback data and the mouth shape data.
  • In the above embodiment, the live server also obtains the mouth shape data of the virtual object according to the voice corresponding to the reply text, and generates the live video stream according to the feedback data and the mouth shape data, so the above embodiment can make the mouth shape changes of the virtual object match the voice corresponding to the reply text, thereby making the virtual object more three-dimensional and vivid.
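  • The sketch below illustrates one way mouth shape data could be derived from the reply speech, assuming per-phoneme timing is available from the TTS alignment; the phoneme set, timings, and viseme names are hypothetical.

```python
# Toy phoneme-to-viseme mapping; a production system would take the phoneme timeline
# from the speech synthesizer's alignment output for the reply text.
PHONEME_TO_VISEME = {
    "a": "open_wide",
    "o": "round",
    "m": "closed",
    "s": "teeth",
}

def mouth_shape_track(phonemes_with_times):
    """phonemes_with_times: list of (phoneme, start_sec, end_sec) tuples."""
    track = []
    for phoneme, start, end in phonemes_with_times:
        track.append({"viseme": PHONEME_TO_VISEME.get(phoneme, "neutral"),
                      "start": start, "end": end})
    return track

# Example fragment of alignment data for the spoken reply text.
print(mouth_shape_track([("a", 0.00, 0.12), ("m", 0.12, 0.20), ("o", 0.20, 0.35)]))
```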
  • the implementation manner of acquiring the reply text for replying to the interactive information according to the interactive information may include the following steps I and II.
  • Step I the live server obtains the semantics of the interactive information.
  • the interactive information may be parsed based on the NLP technology, so as to obtain the semantics of the interactive information.
  • Step II the live server obtains the reply text based on the first reply text database, the second reply text database and the semantics of the interactive information.
  • the first reply text library includes reply text corresponding to at least one semantic
  • the second reply text library includes reply text corresponding to each semantic
  • the matching degree between each reply text in the first reply text library and the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the virtual object.
  • the method of constructing the first reply text library may be:
  • the user can set the reply text corresponding to one or more semantics according to the character setting of the virtual object, thereby constructing the first reply text library. For example, if the character of the virtual object is set to be "gentle", the user can set the reply text corresponding to the semantics of the interactive information "you are so annoying" to "I will be very sad if you say that". For another example, if the character of the virtual object is set to be "irritable", the user can set the reply text corresponding to the semantics of the interactive information "you are so annoying" to "You are so annoying, I'm mad".
  • the method of constructing the second reply text library may be:
  • the second reply text library is generated according to the reply text obtained based on the big data.
  • Specifically, reply texts corresponding to each semantic can be collected through big data, and the reply text corresponding to each semantic can be added to the second reply text library.
  • Since the reply text in the first reply text library is specially designed by the user based on the character setting of the virtual object, while the reply text in the second reply text library is collected from big data, the characteristics of the first reply text library are: the data may not be comprehensive, and reply texts corresponding to some semantics may not have been included or set, but each reply text has a high degree of matching with the virtual object.
  • The characteristics of the second reply text library are: the data is more comprehensive and can cover reply texts corresponding to almost all semantics, but some reply texts have a low degree of matching with the virtual object. That is, the matching degree between each reply text in the first reply text library and the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the virtual object.
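  • A minimal sketch of the two libraries is given below; the entries are illustrative placeholders (the first library hand-authored to fit a "gentle" character setting, the second aggregated from big data), not data from the patent.

```python
# Illustrative reply text libraries keyed by the semantics of the interaction information.
first_reply_text_library = {
    # semantics -> reply text designed for the virtual object's character setting
    "you are so annoying": "I will be very sad if you say that",
}

second_reply_text_library = {
    # semantics -> candidate reply texts collected from big data
    "you are so annoying": ["You are so annoying, I'm mad", "I am not annoying at all"],
    "you are so beautiful": ["thank you for your compliment", "you really have vision", "really?"],
}
```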
  • the above step II (the live server obtains the reply text based on the first reply text database, the second reply text database and the semantics of the interactive information) includes:
  • the live server judges whether the first reply text library contains the reply text corresponding to the semantics of the interactive information;
  • if so, the live server obtains the reply text based on the first reply text library and the semantics of the interaction information;
  • if not, the live server obtains the reply text based on the second reply text library and the semantics of the interaction information.
  • Since the matching degree between each reply text in the first reply text library and the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the virtual object, when obtaining the reply text corresponding to the semantics of the interaction information, it is first determined whether the first reply text library contains the reply text corresponding to the semantics of the interaction information. If the first reply text library contains the reply text corresponding to the semantics of the interaction information, the reply text corresponding to the semantics of the interaction information is acquired based on the first reply text library; if not, the reply text corresponding to the semantics of the interaction information is acquired based on the second reply text library.
  • For example, suppose the reply text for replying to interaction information with the semantics "you are so annoying" is to be obtained, the first reply text library includes the reply text "I will be very sad if you say that" input by the user based on the "gentle" character setting of the virtual object, and the second reply text library includes the reply texts "You are so annoying, I'm mad", "I am not annoying at all" and "I'd be sad if you say that" obtained from big data.
  • Since the first reply text library contains the reply text corresponding to the semantics "you are so annoying", the reply text is obtained based on the first reply text library and the semantics of the interaction information, and the finally obtained reply text is "I will be very sad if you say that".
  • As another example, suppose the reply text for replying to interaction information with the semantics "you are so beautiful" is to be obtained, the first reply text library does not include a reply text corresponding to the semantics "you are so beautiful", and the second reply text library includes the reply texts "thank you for your compliment", "you really have vision" and "really?" obtained from big data.
  • Since the first reply text library does not contain a reply text corresponding to the semantics "you are so beautiful" of the interaction information,
  • the reply text is obtained based on the second reply text library and the semantics of the interaction information.
  • Because the reply texts corresponding to the semantics "you are so beautiful" in the second reply text library include "thank you for your compliment", "you really have vision" and "really?", one of them can be selected as the reply text based on a preset rule, for example, selected randomly.
  • Since the matching degree between each reply text in the first reply text library and the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the virtual object,
  • it is first judged whether the first reply text library contains the reply text corresponding to the semantics of the interaction information; if so, the reply text is obtained from the first reply text library, so as to ensure the matching degree between the reply text and the virtual object. If the reply text corresponding to the semantics of the interaction information does not exist in the first reply text library, the reply text is obtained based on the second reply text library, so as to avoid failing to reply to the interaction information.
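  • The lookup order described above can be sketched as follows; the random choice among second-library candidates is only one example of a preset selection rule, and the function names are illustrative.

```python
import random
from typing import Dict, List, Optional

def get_reply_text(semantics: str,
                   first_library: Dict[str, str],
                   second_library: Dict[str, List[str]]) -> Optional[str]:
    """Prefer the first reply text library (higher matching degree with the virtual
    object); fall back to the second library so that no interaction goes unanswered."""
    if semantics in first_library:
        return first_library[semantics]
    candidates = second_library.get(semantics)
    if candidates:
        return random.choice(candidates)  # one possible preset rule: random selection
    return None

# Usage with the illustrative libraries sketched earlier:
# get_reply_text("you are so beautiful", first_reply_text_library, second_reply_text_library)
```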
  • the live interaction method provided by the embodiment of the present invention further includes:
  • the live server sends the reply text to the terminal device.
  • the terminal device receives the reply text sent by the live broadcast server, and displays the reply text in the live broadcast room of the virtual object.
  • For example, take the case where the interactive information is "You are so beautiful" and the reply text is "You really have vision".
  • the manner in which the terminal device displays the reply text in the live broadcast room of the virtual object may be: superimposing and displaying the reply text "You really have vision" 31 on the video playing interface of the live video stream.
  • an embodiment of the present invention further provides a live broadcast server and a terminal device.
  • the embodiments of the live broadcast server and the terminal device correspond to the foregoing method embodiments.
  • This device embodiment will not repeat the details of the foregoing method embodiments one by one, but it should be clear that the live broadcast server in this embodiment can correspondingly implement all the steps performed by the live broadcast server in the foregoing method embodiments, and the terminal device can correspondingly implement all the steps performed by the terminal device in the foregoing method embodiments.
  • FIG. 4 is a schematic structural diagram of a live broadcast server provided by an embodiment of the present invention.
  • the live broadcast server 400 provided by this embodiment includes:
  • the receiving unit 41 is configured to receive interactive information sent by the terminal device, where the interactive information is the interactive information input by the user in the live broadcast room of the virtual object;
  • a processing unit 42 configured to obtain feedback data according to the interaction information, where the feedback data includes facial expression data of the virtual object and/or body motion data of the virtual object;
  • a generating unit 43 configured to generate a live video stream based on the feedback data
  • the sending unit 44 is configured to send the live video stream to the terminal device.
  • the processing unit 42 is specifically configured to parse the interaction information to obtain an emotion label, obtain the facial expression data based on the first correspondence and the emotion label, and/or obtain the body motion data based on the second correspondence and the emotion label;
  • the emotion label is used to represent the emotion of the virtual object when the interaction information is fed back.
  • the first correspondence includes the correspondence between each emotion label and the facial expression data of the virtual object
  • the second correspondence includes the corresponding relationship between each emotional label and the body motion data of the virtual object.
  • the processing unit 42 is specifically configured to acquire the facial expression data according to the interaction information and the first feedback model, and/or acquire the body motion data according to the interaction information and the second feedback model;
  • the first feedback model is a model obtained by training a first algorithm model based on the sample interaction information and the facial expression data corresponding to the sample interaction information
  • the second feedback model is a model obtained by training a second algorithm model based on the sample interaction information and the body motion data corresponding to the sample interaction information.
  • the processing unit 42 is further configured to obtain a reply text for replying to the interactive information according to the interactive information;
  • the generating unit 43 is further configured to generate audio data based on the reply text and the timbre of the virtual object;
  • the sending unit 44 is further configured to send the audio data to the terminal device.
  • the processing unit 42 is specifically configured to acquire the semantics of the interaction information, and obtain the reply text based on the first reply text library, the second reply text library and the semantics of the interaction information;
  • the first reply text library includes reply text corresponding to at least one semantic
  • the second reply text library includes reply text corresponding to each semantic
  • the matching degree between each reply text in the first reply text library and the character of the virtual object is greater than or equal to the matching degree between each reply text in the second reply text library and the character of the virtual object;
  • the processing unit 42 is specifically configured to judge whether the first reply text library contains the reply text corresponding to the semantics of the interaction information; if so, the reply text is acquired based on the first reply text library and the semantics of the interaction information; if not, the reply text is acquired based on the second reply text library and the semantics of the interaction information.
  • the processing unit 42 is further configured to, before acquiring the reply text based on the first reply text library, the second reply text library and the semantics of the interaction information, receive reply text corresponding to at least one semantic input by the user based on the character setting of the virtual object; generate the first reply text library according to the reply text corresponding to the at least one semantic input by the user; obtain reply text corresponding to each semantic based on big data; and generate the second reply text library according to the reply text obtained based on the big data.
  • the sending unit 44 is further configured to send the reply text to the terminal device.
  • the processing unit 42 is further configured to acquire the mouth shape data of the virtual object according to the voice corresponding to the reply text;
  • the generating unit 43 is specifically configured to generate the live video stream according to the feedback data and the mouth shape data.
  • the live broadcast server provided in this embodiment can perform all the steps performed by the live broadcast server in the live broadcast interaction method provided by the above method embodiments, and the implementation principle and technical effect thereof are similar, and are not repeated here.
  • FIG. 5 is a schematic structural diagram of a terminal device provided by an embodiment of the present invention. As shown in FIG. 5 , the terminal device 500 provided by this embodiment includes:
  • the user input unit 51 is used for receiving the interactive information input by the user in the live broadcast room of the virtual object;
  • a sending unit 52 configured to send the interactive information to the live server
  • a receiving unit 53 configured to receive a live video stream sent by the live server, where the live video stream is a video stream generated by the live server based on feedback data corresponding to the interaction information, and the feedback data includes the virtual object facial expression data and/or body movement data of the virtual object;
  • the output unit 54 is configured to display the live video stream in the live room of the virtual object.
  • the live video stream further includes audio data, and the audio data of the live video stream is audio data generated by the live server based on the reply text used for replying to the interaction information and the timbre of the virtual object.
  • the receiving unit 53 is further configured to receive the reply text sent by the live server;
  • the output unit 54 is further configured to display the reply text in the live broadcast room of the virtual object.
  • the terminal device provided in this embodiment can execute each step performed by the terminal device in the live interactive method provided by the above method embodiments, and the implementation principle and technical effect thereof are similar, and are not repeated here.
  • FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
  • the electronic device provided by this embodiment includes: a memory 61 and a processor 62.
  • the memory 61 is used for storing a computer program; the processor 62 is configured to, when the computer program is invoked, execute the steps performed by the live broadcast server or the terminal device in the live broadcast interaction method provided by the above method embodiments.
  • the memory 61 can be used to store software programs and various data.
  • the memory 61 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function (such as a sound playback function, an image playback function, etc.), and the data storage area may store data created according to the use of the electronic device (such as audio data, a phone book, etc.).
  • In addition, the memory 61 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the processor 62 is the control center of the electronic device; it connects various parts of the entire electronic device by using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 61 and calling the data stored in the memory 61, so as to monitor the electronic device as a whole.
  • Processor 62 may include one or more processing units.
  • the electronic device may further include: a radio frequency unit, a network module, an audio output unit, a receiving unit, a sensor, a display unit, a user receiving unit, an interface unit, and a power supply and other components.
  • the structure of the electronic device described above does not constitute a limitation on the electronic device, and the electronic device may include more or less components, or combine some components, or arrange different components.
  • the electronic device includes, but is not limited to, a mobile phone, a tablet computer, and the like.
  • the radio frequency unit can be used for receiving and sending signals during sending and receiving of information or during a call. Specifically, after receiving downlink data from the base station, it is processed by the processor 62; in addition, the uplink data is sent to the base station.
  • a radio frequency unit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
  • the radio frequency unit can also communicate with the network and other devices through the wireless communication system.
  • Electronic devices provide users with wireless broadband Internet access through network modules, such as helping users send and receive e-mails, browse web pages, and access streaming media.
  • the audio output unit may convert audio data received by the radio frequency unit or the network module or stored in the memory 61 into audio signals and output as sound. Also, the audio output unit may also provide audio output related to a specific function performed by the electronic device (eg, call signal reception sound, message reception sound, etc.).
  • the audio output unit includes speakers, buzzers, and receivers.
  • the receiving unit is used to receive audio or video signals.
  • the receiving unit may include a graphics processor (Graphics Processing Unit, GPU) and a microphone, and the graphics processor processes image data of still pictures or videos obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
  • the processed image frames can be displayed on the display unit.
  • the image frames processed by the graphics processor may be stored in memory (or other storage medium) or transmitted via a radio frequency unit or a network module.
  • the microphone can receive sound and can process such sound into audio data.
  • the processed audio data can be converted into a format that can be transmitted to a mobile communication base station via a radio frequency unit for output in the case of a telephone call mode.
  • the electronic device also includes at least one sensor, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor includes an ambient light sensor and a proximity sensor, wherein the ambient light sensor can adjust the brightness of the display panel according to the brightness of the ambient light, and the proximity sensor can turn off the display panel and/or the backlight when the electronic device is moved to the ear .
  • the accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes), and can detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of the electronic device (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration), vibration recognition related functions (such as pedometer, tapping), etc.; the sensors may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which will not be repeated here.
  • the display unit is used to display information input by the user or information provided to the user.
  • the display unit may include a display panel, and the display panel may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), and the like.
  • the user receiving unit can be used for receiving inputted numerical or character information, and generating key signal input related to user setting and function control of the electronic device.
  • the user receiving unit includes a touch panel and other input devices.
  • a touch panel also known as a touch screen, collects user touch operations on or near it (such as a user's operations on or near the touch panel using a finger, stylus, or any suitable object or accessory).
  • the touch panel may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch orientation, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 62, and receives and executes commands sent by the processor 62.
  • the touch panel can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
  • the user receiving unit may also include other input devices.
  • other input devices may include, but are not limited to, physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be described herein again.
  • the touch panel can be overlaid on the display panel; when the touch panel detects a touch operation on or near it, the operation is transmitted to the processor 62 to determine the type of the touch event, and the processor 62 then provides the corresponding visual output on the display panel according to the type of the touch event.
  • in general, the touch panel and the display panel are used as two independent components to realize the input and output functions of the electronic device, but in some embodiments, the touch panel and the display panel can be integrated to realize the input and output functions of the electronic device, which is not specifically limited here.
  • the interface unit is an interface for connecting an external device with an electronic device.
  • external devices may include wired or wireless headset ports, external power (or battery charger) ports, wired or wireless data ports, memory card ports, ports for connecting devices with identification modules, audio input/output (I/O) ports, video I/O ports, headphone ports, and more.
  • the interface unit may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements in the electronic device, or may be used to transfer data between the electronic device and the external device.
  • the electronic device may also include a power supply (such as a battery) for supplying power to various components.
  • the power supply may be logically connected to the processor 62 through a power management system, so as to manage charging, discharging, and power consumption through the power management system.
  • Embodiments of the present invention further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the live-streaming interaction method provided by the foregoing method embodiments is implemented.
  • embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
  • Computer-readable media include both persistent and non-persistent, removable and non-removable storage media.
  • a storage medium can be implemented by any method or technology for storing information, and the information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

Abstract

本发明实施例提供了一种直播互动方法及装置,涉及人机交互技术领域。该方法包括:接收终端设备发送的互动信息,所述互动信息为用户在虚拟对象的直播间输入的互动信息;根据所述互动信息获取反馈数据,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;基于所述反馈数据生成直播视频流;向终端设备发送所述直播视频流。本发明实施例用于解决虚拟直播时无法与观众进行实时互动的问题。

Description

一种直播互动方法及装置
本申请要求于2020年12月11日提交中国国家知识产权局、申请号为202011458910.6、申请名称为“一种直播互动方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本发明涉及人机交互技术领域,尤其涉及一种直播互动方法及装置。
背景技术
近年来,随着流媒体技术的进步以及网络带宽的飞速增长,视频业务日趋火热,各种直播节目已逐渐应用于人们生活的方方面面。
观看直播时进行实时互动已成为一种潮流趋势。在传统直播方式中,主播侧和观众侧均为实时在线的用户,观众侧用户可以在观看主播侧用户实时生成的视频的过程中输入互动信息,主播侧用户则能够查看观众侧用户输入的互动信息,并对互动信息进行反馈。然而,随着直播类别的不断丰富,直播已不再局限于真人直播,而是可以基于人工智能(Artificial Intelligence,AI)技术生成虚拟主播,并基于虚拟主播进行虚拟直播。目前,虚拟直播时虚拟主播均是按照预设定的流程和场景进行直播,无法与观众进行实时互动。
发明内容
有鉴于此,本发明提供了一种直播互动方法及装置,用于解决虚拟直播时无法与观众进行实时互动的问题。
为了实现上述目的,本发明实施例提供技术方案如下:
第一方面,本发明的实施例提供一种直播互动方法,包括:
接收终端设备发送的互动信息,所述互动信息为用户在虚拟对象的直播间输入的互动信息;
根据所述互动信息获取反馈数据,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
基于所述反馈数据生成直播视频流;
向终端设备发送所述直播视频流。
作为本发明实施例一种可选的实施方式,所述根据所述互动信息获取反馈数据,包括:
解析所述互动信息获取情绪标签,所述情绪标签用于表征对所述互动信息进行反馈时所述虚拟对象的情绪;
基于第一对应关系和所述情绪标签获取所述面部表情数据和/或基于第二对应关系和所述情绪标签获取所述肢体动作数据;
其中,所述第一对应关系包括各情绪标签与所述虚拟对象的面部表情数据的对应关系,所述第二对应关系包括各情绪标签与所述虚拟对象的肢体动作数据的对应关系。
作为本发明实施例一种可选的实施方式,所述根据所述互动信息获取反馈数据,包括:
根据所述互动信息和第一反馈模型获取所述面部表情数据和/或根据所述互动信息和第二反馈模型获取所述肢体动作数据;
其中,所述第一反馈模型为基于样本互动信息和与所述样本互动信息对应的面部表情数据对第一算法模型进行训练获取的模型,所述第二反馈模型为基于样本互动信息和与所述样本互动信息对应的肢体动作数据对第二算法模型进行训练获取的模型。
作为本发明实施例一种可选的实施方式,所述方法还包括:
根据所述互动信息获取用于对所述互动信息进行回复的回复文本;
基于所述回复文本和所述虚拟对象的音色生成所述直播视频流的音频数据。
作为本发明实施例一种可选的实施方式,所述根据所述互动信息获取用于对所述互动信息进行回复的回复文本,包括:
获取所述互动信息的语义;
基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本;
其中,所述第一回复文本库包括至少一个语义对应的回复文本,所述第二回复文本库包括各语义对应的回复文本。
作为本发明实施例一种可选的实施方式，所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度；所述基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本，包括：
判断所述第一回复文本库中是否包含所述互动信息的语义对应的回复文本;
若是,则基于所述第一回复文本库和所述互动信息的语义获取所述回复文本;
若否,则基于所述第二回复文本库和所述互动信息的语义获取所述回复文本。
作为本发明实施例一种可选的实施方式,在基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本之前,所述方法还包括:
接收用户基于所述虚拟对象的性格设定输入的至少一个语义对应的回复文本;
根据用户输入的所述至少一个语义对应的回复文本生成所述第一回复文本库;
基于大数据获取各个语义类别对应的回复文本;
根据基于大数据获取的回复文本生成所述第二回复文本库。
作为本发明实施例一种可选的实施方式,所述方法还包括:
向所述终端设备发送所述回复文本。
作为本发明实施例一种可选的实施方式,所述方法还包括:
根据所述回复文本对应的语音获取所述虚拟对象的口型数据;
所述基于所述反馈数据生成直播视频流,包括:
根据所述反馈数据和所述口型数据生成所述直播视频流。
第二方面,本发明实施例提供一种直播互动方法,包括:
接收用户在虚拟对象的直播间输入的互动信息;
向直播服务器发送所述互动信息;
接收所述直播服务器发送的直播视频流，所述直播视频流为所述直播服务器基于所述互动信息对应的反馈数据生成的视频流，所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据；
在所述虚拟对象的直播间显示所述直播视频流。
作为本发明实施例一种可选的实施方式,所述方法还包括:
接收所述直播服务器发送的音频数据,所述音频数据为所述直播服务器基于用于对所述互动信息进行回复的回复文本和所述虚拟对象的音色生成的音频数据;
在所述虚拟对象的直播间播放所述音频数据。
作为本发明实施例一种可选的实施方式,所述方法还包括:
接收所述直播服务器发送的所述回复文本;
在所述虚拟对象的直播间显示所述回复文本。
第三方面,本发明实施例提供一种直播服务器,包括:
接收单元,用于接收终端设备发送的互动信息,所述互动信息为用户在虚拟对象的直播间输入的互动信息;
处理单元,用于根据所述互动信息获取反馈数据,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
生成单元,用于基于所述反馈数据生成直播视频流;
发送单元,用于向终端设备发送所述直播视频流。
作为本发明实施例一种可选的实施方式,所述处理单元,具体用于解析所述互动信息获取情绪标签,基于第一对应关系和所述情绪标签获取所述面部表情数据和/或基于第二对应关系和所述情绪标签获取所述肢体动作数据;
其中，所述情绪标签用于表征对所述互动信息进行反馈时所述虚拟对象的情绪；所述第一对应关系包括各情绪标签与所述虚拟对象的面部表情数据的对应关系，所述第二对应关系包括各情绪标签与所述虚拟对象的肢体动作数据的对应关系。
作为本发明实施例一种可选的实施方式，所述处理单元，具体用于根据所述互动信息和第一反馈模型获取所述面部表情数据和/或根据所述互动信息和第二反馈模型获取所述肢体动作数据；
其中,所述第一反馈模型为基于样本互动信息和与所述样本互动信息对应的面部表情数据对第一算法模型进行训练获取的模型,所述第二反馈模型为基于样本互动信息和与所述样本互动信息对应的肢体动作数据对第二算法模型进行训练获取的模型。
作为本发明实施例一种可选的实施方式,
所述处理单元,还用于根据所述互动信息获取用于对所述互动信息进行回复的回复文本;
所述生成单元,还用于基于所述回复文本和所述虚拟对象的音色生成音频数据;
所述发送单元，还用于向所述终端设备发送所述音频数据。
作为本发明实施例一种可选的实施方式,所述处理单元,具体用于获取所述互动信息的语义,基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本;
其中,所述第一回复文本库包括至少一个语义对应的回复文本,所述第二回复文本库包括各语义对应的回复文本。
作为本发明实施例一种可选的实施方式,所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度;
所述处理单元,具体用于判断所述第一回复文本库中是否包含所述互动信息的语义对应的回复文本;若是,则基于所述第一回复文本库和所述互动信息的语义获取所述回复文本;若否,则基于所述第二回复文本库和所述互动信息的语义获取所述回复文本。
作为本发明实施例一种可选的实施方式，所述处理单元，还用于在基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本之前，接收用户基于所述虚拟对象的性格设定输入的至少一个语义对应的回复文本；根据用户输入的所述至少一个语义对应的回复文本生成所述第一回复文本库；基于大数据获取各个语义类别对应的回复文本；根据基于大数据获取的回复文本生成所述第二回复文本库。
作为本发明实施例一种可选的实施方式,所述发送单元,还用于向所述终端设备发送所述回复文本。
作为本发明实施例一种可选的实施方式,
所述处理单元,还用于根据所述回复文本对应的语音获取所述虚拟对象的口型数据;
所述生成单元,具体用于根据所述反馈数据和所述口型数据生成所述直播视频流。
第四方面,本发明实施例提供一种终端设备,包括:
用户输入单元,用于接收用户在虚拟对象的直播间输入的互动信息;
发送单元,用于向直播服务器发送所述互动信息;
接收单元,用于接收所述直播服务器发送的直播视频流,所述直播视频流为所述直播服务器基于所述互动信息对应的反馈数据生成的视频流,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
输出单元,用于在所述虚拟对象的直播间显示所述直播视频流。
作为本发明实施例一种可选的实施方式,
所述直播视频流还包括音频数据,所述直播视频流的音频数据为所述直播服务器基于用于对所述互动信息进行回复的回复文本和所述虚拟对象的音色生成的音频数据。
作为本发明实施例一种可选的实施方式,
所述接收单元,还用于接收所述直播服务器发送的所述回复文本;
所述输出单元,还用于在所述虚拟对象的直播间显示所述回复文本。
第五方面,本发明实施例提供一种电子设备,包括:存储器和处理器,存储器用于存储计算机程序;处理器用于在调用计算机程序时执行第一方面或第一方面任一种可选的实施方式或第二方面或第二方面任一种可选的实施方式所述的直播互动方法。
第六方面,本发明实施例提供一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现第一方面或第一方面任一种可选的实施方式或第二方面或第二方面任一种可选的实施方式所述的直播互动方法。
本发明实施例提供的直播互动方法在接收终端设备发送的用户在虚拟对象的直播间输入的互动信息时,根据所述互动信息获取包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据的反馈数据,然后基于所述反馈数据生成直播视频流,并向终端设备发送所述直播视频流。由于本发明实施例提供的直播互动方法可以在接收到用户在直播间中输入互动信息时,获取虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据,并通过虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据生成直播视频流,从而通过虚拟对象的面部表情数据和/或肢体动作对用户输入的互动信息进行反馈,因此本发明实施例可以解决虚拟直播时无法与观众进行实时互动的问题。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本发明的实施例,并与说明书一起用于解释本发明的原理。
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,对于本领域普通技术人员而言,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为本发明实施例提供的直播互动方法的步骤流程图;
图2为本发明实施例提供的直播视频流的图像帧的示意图;
图3为本发明实施例提供的回复文本的示意图;
图4为本发明实施例提供的直播互动装置的结构示意图;
图5为本发明实施例提供的终端设备的结构示意图;
图6为本发明实施例提供的电子设备的硬件结构示意图。
具体实施方式
为了能够更清楚地理解本发明的上述目的、特征和优点，下面将对本发明的方案进行进一步描述。需要说明的是，在不冲突的情况下，本发明的实施例及实施例中的特征可以相互组合。
在下面的描述中阐述了很多具体细节以便于充分理解本发明,但本发明还可以采用其他不同于在此描述的方式来实施;显然,说明书中的实施例只是本发明的一部分实施例,而不是全部的实施例。
本发明的说明书和权利要求书中的术语“第一”和“第二”等是用于区别不同的对象，而不是用于描述对象的特定顺序。例如，第一对应关系和第二对应关系是用于区别不同的对应关系，而不是用于描述对应关系的特定顺序。
在本发明实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本发明实施例中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。此外,在本发明实施例的描述中,除非另有说明,“多个”的含义是指两个或两个以上。
本文中术语“和/或”,用于描述关联对象的关联关系,具体表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。
本发明实施例提供了一种直播互动方法,参照图1所示,该直播互动方法包括如下步骤S101至S106:
S101、终端设备接收用户在虚拟对象的直播间输入的互动信息。
本发明实施例中的终端设备,可以为手机、平板电脑、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本、个人数字助理(personal digital assistant,PDA)、智能手表、智能手环等终端设备,或者该终端设备还可以为其他类型的终端设备,本发明实施例对终端设备的类型不作限定。
本发明实施例中的虚拟对象是指基于人工或人工智能技术生成的虚拟主播的模型，本发明实施例中对虚拟对象的形象不做限定，示例性的，虚拟对象可以为二次元卡通人物、真人模拟形象等。
本发明实施例中的互动信息可以为文本评论内容、表情包中的表情、语音、点赞、赠送虚拟礼物等。
S102、终端设备向直播服务器发送所述互动信息。
对应的,直播服务器接收所述终端设备发送的互动信息。其中,所述互动信息为用户在虚拟对象的直播间输入的互动信息。
本发明实施例中的直播服务器可以为任意与终端设备互联的服务器。例如:云服务器、台式服务器、机架式服务器、机柜式服务器、刀片式服务器等。
S103、直播服务器根据所述互动信息获取反馈数据。
其中,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据。
示例性的,本发明实施例中的面部表情数据可以包括:用于表征虚拟对象微笑、大笑、大哭、兴奋、害羞、厌恶等表情的数据,肢体动作数据可以包括:用于表征虚拟对象点头、摇头、鼓掌、手舞足蹈等动作的数据。
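为便于理解，下面以一个示意性的数据结构草图说明反馈数据可能包含的内容。其中的字段名称与组织方式均为假设，仅作说明，并非本发明限定的实现方式。

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FacialExpressionData:
    """面部表情数据：以表情名称和表情参数示意（字段为假设）。"""
    expression: str                                                  # 例如"微笑"、"大笑"
    blendshape_weights: List[float] = field(default_factory=list)   # 表情混合权重

@dataclass
class BodyMotionData:
    """肢体动作数据：以动作名称和姿态关键帧示意（字段为假设）。"""
    motion: str                                                      # 例如"点头"、"鼓掌"
    keyframes: List[dict] = field(default_factory=list)             # 骨骼姿态关键帧

@dataclass
class FeedbackData:
    """反馈数据：包括面部表情数据和/或肢体动作数据。"""
    facial: Optional[FacialExpressionData] = None
    body: Optional[BodyMotionData] = None
```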
作为本发明实施例一种可选的实施方式,上述步骤S103(直播服务器根据所述互动信息获取反馈数据)可以包括如下步骤a和步骤b:
步骤a、解析所述互动信息获取情绪标签。
其中,所述情绪标签用于表征对所述互动信息进行反馈时所述虚拟对象的情绪。
具体的，当互动信息为文本内容时，可以基于自然语言处理（Natural Language Processing，NLP）技术理解文字评论的内容获取情绪标签；当互动信息为表情包中的表情、语音、点赞、赠送虚拟礼物等非文本内容时，可以先将互动信息转换为文本内容，然后再基于NLP技术理解转换得到的文本内容获取情绪标签。例如：当互动信息为“你好笨”时，直接基于NLP技术解析所述互动信息，获取情绪标签“愤怒”；再例如：当互动信息为点赞时，可先将互动信息转换为文本内容“你很棒”，然后再基于NLP技术解析互动信息转换得到的文本，获取情绪标签“喜悦”。
示例性的,本发明实施例中的情绪标签可以包括:喜悦、愤怒、悲伤、恐惧、委屈等。
步骤b、基于第一对应关系和所述情绪标签获取所述面部表情数据和/或基于第二对应关系和所述情绪标签获取所述肢体动作数据。
其中,所述第一对应关系包括各情绪标签与面部表情数据的对应关系,所述第二对应关系包括各情绪标签与肢体动作数据的对应关系。
即，预先建立各情绪标签与虚拟对象的面部表情数据的对应关系和/或各情绪标签与虚拟对象的肢体动作数据的对应关系，在获取互动信息后，先解析互动信息获取情绪标签，然后再根据情绪标签和预先建立的对应关系获取虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据。
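作为对上述步骤a和步骤b的原理示意，下面给出一个简化的Python草图。其中非文本互动的转换表、classify_emotion 的规则以及两个对应关系中的数据均为假设，仅用于说明“互动信息→情绪标签→反馈数据”的查询流程，并非本发明限定的实现。

```python
# 示意性草图：先由互动信息解析情绪标签（步骤a），再按预先建立的对应关系查询反馈数据（步骤b）
NON_TEXT_TO_TEXT = {
    "点赞": "你很棒",              # 非文本互动先转换为等价的文本内容
    "赠送虚拟礼物": "送你一个礼物",
}

def classify_emotion(text: str) -> str:
    """假设的情绪分类：实际可由 NLP 模型实现，这里用关键词规则近似。"""
    if any(word in text for word in ("笨", "讨厌")):
        return "愤怒"
    return "喜悦"   # 其余情况默认返回"喜悦"，仅为示意

FIRST_MAPPING = {    # 第一对应关系：情绪标签 -> 面部表情数据（内容为假设）
    "喜悦": {"expression": "微笑"},
    "愤怒": {"expression": "皱眉"},
}
SECOND_MAPPING = {   # 第二对应关系：情绪标签 -> 肢体动作数据（内容为假设）
    "喜悦": {"motion": "鼓掌"},
    "愤怒": {"motion": "摇头"},
}

def get_feedback_data(interaction: str) -> dict:
    text = NON_TEXT_TO_TEXT.get(interaction, interaction)   # 步骤a：必要时先转为文本
    label = classify_emotion(text)                            # 解析得到情绪标签
    return {                                                  # 步骤b：按对应关系取反馈数据
        "facial": FIRST_MAPPING.get(label),
        "body": SECOND_MAPPING.get(label),
    }

print(get_feedback_data("点赞"))     # {'facial': {'expression': '微笑'}, 'body': {'motion': '鼓掌'}}
print(get_feedback_data("你好笨"))   # {'facial': {'expression': '皱眉'}, 'body': {'motion': '摇头'}}
```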
作为本发明实施例一种可选的实施例,上述步骤S103(直播服务器根据所述互动信息获取反馈数据)包括:
根据所述互动信息和第一反馈模型获取所述面部表情数据和/或根据所述互动信息和第二反馈模型获取所述肢体动作数据。
其中,所述第一反馈模型为基于样本互动信息和与所述样本互动信息对应的面部表情数据对第一算法模型进行训练获取的模型,所述第二反馈模型为基于样本互动信息和与所述样本互动信息对应的肢体动作数据对第二算法模型进行训练获取的模型。
示例性的,第一算法模型和第二算法模型可以为深度学习神经网络模型、卷积神经网络模型等机器学习算法模型,且第一算法模型和第二算法模型可以相同,也可以不同。本发明实施例对第一算法模型和第二算法模型的具体类型不做限定。
还需要说明的是,上述实施例中的第一反馈模型和第二反馈模型可为两个独立的模型,也可以为模型中的两个子模型。
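作为基于反馈模型获取反馈数据的一个简化示意，下面的草图用 TF-IDF 特征加逻辑回归分类器代替上文提到的深度学习/卷积神经网络模型，样本互动信息与对应的面部表情数据标签均为假设，仅演示“样本互动信息＋对应反馈数据→训练反馈模型→预测”的基本流程。

```python
# 示意性草图：用样本互动信息训练"互动信息 -> 面部表情数据标签"的第一反馈模型。
# 为便于演示，用 TF-IDF + 逻辑回归代替深度神经网络；样本数据均为假设。
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

samples = ["你好漂亮", "主播真棒", "你好笨", "太讨厌了", "今天很开心", "气死我了"]
labels  = ["微笑",     "微笑",     "皱眉",   "皱眉",     "微笑",       "皱眉"]   # 对应的面部表情标签

first_feedback_model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 2)),  # 中文按字符 n-gram 提取特征
    LogisticRegression(max_iter=1000),
)
first_feedback_model.fit(samples, labels)                   # 基于样本互动信息及对应反馈数据训练

print(first_feedback_model.predict(["你真漂亮"]))           # 预期输出：['微笑']
```

第二反馈模型（互动信息到肢体动作数据）可按同样方式用对应的肢体动作标签训练，二者既可以是两个独立模型，也可以是同一模型中的两个子模型。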
S104、直播服务器基于所述反馈数据生成直播视频流。
示例性的，参照图2所示，图2中以互动信息为“你好漂亮”、反馈数据包括：所述虚拟对象的面部表情数据和所述虚拟对象的肢体动作数据为例示出。如图2所示的直播视频流的视频帧，当接收到互动信息“你好漂亮”时，基于所述反馈数据生成的直播视频流中，虚拟对象通过面部表情“微笑”以及肢体动作“鼓掌”对该互动信息进行反馈。
S105、直播服务器向终端设备发送所述直播视频流。
对应的，终端设备接收所述直播服务器发送的直播视频流。其中，所述直播视频流为所述直播服务器基于所述互动信息对应的反馈数据生成的视频流，所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据。
S106、终端设备在所述虚拟对象的直播间显示所述直播视频流。
本发明实施例提供的直播互动方法在接收终端设备发送的用户在虚拟对象的直播间输入的互动信息时,根据所述互动信息获取包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据的反馈数据,然后基于所述反馈数据生成直播视频流,并向终端设备发送所述直播视频流。由于本发明实施例提供的直播互动方法可以在接收到用户在直播间中输入互动信息时,获取虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据,并通过虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据生成直播视频流,从而通过虚拟对象的面部表情数据和/或肢体动作对用户输入的互动信息进行反馈,因此本发明实施例可以解决虚拟直播时无法与观众进行实时互动的问题。
作为本发明实施例一种可选的实施方式,在上述实施例的基础上,本发明实施例提供的直播互动方法还包括:
直播服务器根据所述互动信息获取用于对所述互动信息进行回复的回复文本;
直播服务器基于所述回复文本和所述虚拟对象的音色生成所述直播视频流的音频数据。
即，终端设备接收到的直播视频流还包括音频数据，且所述直播视频流的音频数据为所述直播服务器基于用于对所述互动信息进行回复的回复文本和所述虚拟对象的音色生成的音频数据。
具体的，虚拟对象的音色可以由开发人员预先设定。基于所述回复文本和所述虚拟对象的音色生成所述直播视频流的音频数据，可以为：通过语音合成（Text-To-Speech，TTS）技术，将回复文本转换为语音格式，然后再根据转换得到的语音生成所述直播视频流的音频数据。
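下面给出基于回复文本和虚拟对象音色生成音频数据的示意性草图。其中 tts_synthesize 为假设的 TTS 接口（实际可由任意语音合成引擎或服务实现），其参数、返回值以及此处返回的占位静音数据均为假设。

```python
# 示意性草图：基于回复文本与虚拟对象音色生成直播视频流的音频数据
def tts_synthesize(text: str, voice_id: str, sample_rate: int = 16000) -> bytes:
    """假设的 TTS 合成函数：应返回指定音色的 16-bit 单声道 PCM 音频，此处仅返回占位静音。"""
    duration_seconds = max(1, len(text) // 5)           # 按文本长度粗略估计时长，仅为示意
    return bytes(sample_rate * 2 * duration_seconds)    # 占位：对应长度的静音数据

def generate_reply_audio(reply_text: str, virtual_object_voice: str = "preset_voice_01") -> bytes:
    # 将回复文本按虚拟对象预设的音色合成为语音，作为直播视频流的音频数据
    return tts_synthesize(reply_text, voice_id=virtual_object_voice)

audio = generate_reply_audio("你真有眼光")
print(len(audio))   # 占位音频的字节数
```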
可选的，在上述实施例的基础上，本发明实施例提供的方法还包括：
直播服务器根据所述回复文本对应的语音获取所述虚拟对象的口型数据。
上述步骤S104（直播服务器基于所述反馈数据生成直播视频流），包括：直播服务器根据所述反馈数据和所述口型数据生成所述直播视频流。
由于上述实施例中直播服务器还会根据所述回复文本对应的语音获取所述虚拟对象的口型数据，并根据所述反馈数据和所述口型数据生成所述直播视频流，因此上述实施例可以在播放回复文本对应的语音时控制虚拟对象的口型变化，从而使虚拟对象更加立体生动。
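作为根据回复文本对应的语音获取口型数据的一个示意，下面的草图用音频短时能量近似每一视频帧的张口幅度；实际实现中也可以按音素到口型（viseme）的映射生成口型数据。此处的帧率、归一化阈值等参数均为假设。

```python
# 示意性草图：由回复文本对应的 16-bit 单声道 PCM 语音估计逐帧口型（张口幅度）数据
import math

def mouth_shape_frames(pcm: bytes, sample_rate: int = 16000, fps: int = 25) -> list:
    """按视频帧切分音频，返回每帧的张口幅度（0~1，阈值为假设）。"""
    samples_per_frame = sample_rate // fps
    frames = []
    for i in range(0, len(pcm) // 2, samples_per_frame):
        chunk = pcm[2 * i: 2 * (i + samples_per_frame)]
        ints = [int.from_bytes(chunk[j:j + 2], "little", signed=True)
                for j in range(0, len(chunk), 2)]
        rms = math.sqrt(sum(x * x for x in ints) / len(ints)) if ints else 0.0
        frames.append(min(rms / 8000.0, 1.0))   # 短时能量归一化为张口幅度
    return frames

print(mouth_shape_frames(bytes(16000 * 2)))      # 1 秒静音对应的逐帧张口幅度（全为 0.0）
```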
可选的,根据所述互动信息获取用于对所述互动信息进行回复的回复文本的实现方式可以包括如下步骤Ⅰ和步骤Ⅱ。
步骤Ⅰ、直播服务器获取所述互动信息的语义。
具体的，可以基于NLP技术对互动信息进行解析，从而获取所述互动信息的语义。
步骤Ⅱ、直播服务器基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本。
其中,所述第一回复文本库包括至少一个语义对应的回复文本,所述第二回复文本库包括各语义对应的回复文本。
可选的,第一回复文本库中各回复文本与对应的语义的匹配度大于或等于所述第二回复文本库中各回复文本与对应的语义的匹配度。
作为本发明实施例一种可选的实施方式,构建第一回复文本库的方式可以为:
接收用户基于所述虚拟对象的性格设定输入的至少一个语义对应的回复文本，并根据用户输入的所述至少一个语义对应的回复文本生成第一回复文本库。
即,用户可以根据对虚拟对象的性格设定设置一个或多个语义对应的回复文本,从而构建第一回复文本库。例如:虚拟对象的性格设定为“温柔”,则用户可以将互动信息“你好笨”的语义对应的回复文本设置为“你这样说我会很伤心的”。再例如:虚拟对象的性格设定为“暴躁”,则用户可以将互动信息“你好笨”的语义对应的回复文本设置为“你才笨,气死我了”。
作为本发明实施例一种可选的实施方式,构建第二回复文本库的方式可以为:
基于大数据获取各个语义类别对应的回复文本;
根据基于大数据获取的回复文本生成所述第二回复文本库。
即,可以通过大数据收集各个语义对应的文本回复,并将每一个语义对应的文本回复均添加到第二回复文本库中。
由于第一回复文本库中的回复文本为用户基于虚拟对象的性格设定而专门设计的回复文本，第二回复文本库中的回复文本是从大数据中收集获取的，因此第一回复文本库的特点为：数据可能不全面，一些语义对应的回复文本没有被收录或设定，但回复文本与虚拟对象的匹配度较高；第二回复文本库的特点为：数据更加全面，几乎可以涵盖全部语义对应的回复文本，但一些回复文本与虚拟对象的匹配度较低。即，所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度。
在上述实施例的基础上,上述步骤Ⅱ(直播服务器基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本),包括:
直播服务器判断所述第一回复文本库中是否包含所述互动信息的语义对应的回复文本；
若是，则直播服务器基于所述第一回复文本库和所述互动信息的语义获取所述回复文本；
若否，则直播服务器基于所述第二回复文本库和所述互动信息的语义获取所述回复文本。
即,所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度,且在获取互动信息的语义对应的回复文本时,先判断第一回复文本库中是否包含所述互动信息的语义对应的回复文本,若第一回复文本库中包含所述互动信息的语义与回复文本的对应关系,则基于所述第一回复文本库获取所述互动信息的语义对应的回复文本,若第一回复文本库中不包含所述互动信息的语义对应的回复文本,则基于所述第二回复文本库获取所述互动信息的语义对应的回复文本。
示例性的,以下以获取用于对语义为“你好笨”的互动信息进行回复的回复文本、第一回复文本库包括用户基于所述虚拟对象的性格设定“温柔”而输入的对应的回复内容“你这样说我会很伤心的”、第二回复文本库包括从大数据中获取的对应的回复内容“你才笨,气死我了”、“我一点也不笨”、“你这样说我会很伤心的”为例对上述实施例进行说明。由于第一回复文本库包含所述互动信息的语义“你好笨”对应的回复文本,因此基于所述第一回复文本库和所述互动信息的语义获取所述回复文本,最终获取的回复文本为“你这样说我会很伤心的”。
示例性的，以下以获取用于对语义为“你真漂亮”的互动信息进行回复的回复文本、第一回复文本库不包括互动信息的语义“你真漂亮”对应的回复文本、第二回复文本库包括从大数据中获取的对应的回复内容“谢谢你的夸奖”、“你真有眼光”、“真的吗”为例对上述实施例进行说明。由于第一回复文本库不包含所述互动信息的语义“你真漂亮”对应的回复文本，因此基于所述第二回复文本库和所述互动信息的语义获取所述回复文本；由于第二回复文本库中语义“你真漂亮”对应的回复文本包括“谢谢你的夸奖”、“你真有眼光”、“真的吗”3个，因此可以基于预设规则从中选取一个作为该语义的回复文本，例如：随机选取。
由于所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度，因此在获取回复文本时，首先判断第一回复文本库中是否包含所述互动信息的语义对应的回复文本，若第一回复文本库中包含与所述虚拟对象的匹配度较高的回复文本，则基于第一回复文本库获取回复文本，从而保证回复文本与虚拟对象的匹配度；若第一回复文本库中不包含互动信息的语义对应的回复文本，则基于第二回复文本库获取回复文本，从而避免无法对所述互动信息进行回复。
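下面的草图示意了上述“先查第一回复文本库、未命中再回退到第二回复文本库”的获取流程，其中两个文本库的内容以及两库均未命中时的默认回复均为假设，仅作原理演示。

```python
# 示意性草图：优先从第一回复文本库取回复文本，未命中时回退到第二回复文本库
import random

FIRST_REPLY_LIB = {   # 按虚拟对象性格设定人工编写：匹配度高但覆盖不全（内容为假设）
    "你好笨": ["你这样说我会很伤心的"],
}
SECOND_REPLY_LIB = {  # 基于大数据收集：覆盖面广（内容为假设）
    "你好笨": ["你才笨，气死我了", "我一点也不笨"],
    "你真漂亮": ["谢谢你的夸奖", "你真有眼光", "真的吗"],
}

def get_reply_text(semantic: str) -> str:
    candidates = FIRST_REPLY_LIB.get(semantic)          # 先判断第一回复文本库是否包含该语义
    if not candidates:
        candidates = SECOND_REPLY_LIB.get(semantic, ["感谢你的互动"])  # 否则回退到第二回复文本库
    return random.choice(candidates)                    # 同一语义有多条回复时按预设规则（此处随机）选取

print(get_reply_text("你好笨"))     # 你这样说我会很伤心的
print(get_reply_text("你真漂亮"))   # 从第二回复文本库中随机选取一条
```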
可选的，在上述实施例的基础上，本发明实施例提供的直播互动方法还包括：
直播服务器向所述终端设备发送所述回复文本。
对应的,终端设备接收所述直播服务器发送的所述回复文本,并在所述虚拟对象的直播间显示所述回复文本。
示例性的，参照图3所示，图3中以互动信息为“你好漂亮”、回复文本为“你真有眼光”为例示出。如图3所示，终端设备在所述虚拟对象的直播间显示所述回复文本的方式可以为：将回复文本“你真有眼光”31叠加显示于所述直播视频流的视频播放界面中。
基于同一发明构思,作为对上述方法的实现,本发明实施例还提供了一种直播服务器和一种终端设备,该直播服务器和终端设备实施例与前述方法实施例对应,为便于阅读,本装置实施例不再对前述方法实施例中的细节内容进行逐一赘述,但应当明确,本实施例中的直播服务器能够对应实现前述方法实施例中直播服务器所执行的全部步骤,终端设备能够对应实现前述方法实施例中终端设备所执行的全部步骤。
图4为本发明实施例提供的直播服务器的结构示意图,如图4所示,本实施例提供的直播服务器400包括:
接收单元41,用于接收终端设备发送的互动信息,所述互动信息为用户在虚拟对象的直播间输入的互动信息;
处理单元42,用于根据所述互动信息获取反馈数据,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
生成单元43,用于基于所述反馈数据生成直播视频流;
发送单元44,用于向终端设备发送所述直播视频流。
作为本发明实施例一种可选的实施方式,所述处理单元42,具体用于解析所述互动信息获取情绪标签,基于第一对应关系和所述情绪标签获取所述面部表情数据和/或基于第二对应关系和所述情绪标签获取所述肢体动作数据;
其中，所述情绪标签用于表征对所述互动信息进行反馈时所述虚拟对象的情绪；所述第一对应关系包括各情绪标签与所述虚拟对象的面部表情数据的对应关系，所述第二对应关系包括各情绪标签与所述虚拟对象的肢体动作数据的对应关系。
作为本发明实施例一种可选的实施方式,所述处理单元42,具体用于根据所述互动信息和第一反馈模型获取所述面部表情数据和/或根据所述互动信息和第二反馈模型获取所述肢体动作数据;
其中,所述第一反馈模型为基于样本互动信息和与所述样本互动信息对应的面部表情数据对第一算法模型进行训练获取的模型,所述第二反馈模型为基于样本互动信息和与所述样本互动信息对应的肢体动作数据对第二算法模型进行训练获取的模型。
作为本发明实施例一种可选的实施方式,
所述处理单元42,还用于根据所述互动信息获取用于对所述互动信息进行回复的回复文本;
所述生成单元43,还用于基于所述回复文本和所述虚拟对象的音色生成音频数据;
所述发送单元44,还用于向所述终端设备发送所述音频数据。
作为本发明实施例一种可选的实施方式,所述处理单元42,具体用于获取所述互动信息的语义,基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本;
其中,所述第一回复文本库包括至少一个语义对应的回复文本,所述第二回复文本库包括各语义对应的回复文本。
作为本发明实施例一种可选的实施方式，所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度；
所述处理单元42,具体用于判断所述第一回复文本库中是否包含所述互动信息的语义对应的回复文本;若是,则基于所述第一回复文本库和所述互动信息的语义获取所述回复文本;若否,则基于所述第二回复文本库和所述互动信息的语义获取所述回复文本。
作为本发明实施例一种可选的实施方式，所述处理单元42，还用于在基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本之前，接收用户基于所述虚拟对象的性格设定输入的至少一个语义对应的回复文本；根据用户输入的所述至少一个语义对应的回复文本生成所述第一回复文本库；基于大数据获取各个语义类别对应的回复文本；根据基于大数据获取的回复文本生成所述第二回复文本库。
作为本发明实施例一种可选的实施方式,所述发送单元44,还用于向所述终端设备发送所述回复文本。
作为本发明实施例一种可选的实施方式,
所述处理单元42,还用于根据所述回复文本对应的语音获取所述虚拟对象的口型数据;
所述生成单元43,具体用于根据所述反馈数据和所述口型数据生成所述直播视频流。
本实施例提供的直播服务器可以执行上述方法实施例提供的直播互动方法中直播服务器所执行的全部步骤,其实现原理与技术效果类似,此处不再赘述。
图5为本发明实施例提供的终端设备的结构示意图,如图5所示,本实施例提供的终端设备500包括:
用户输入单元51,用于接收用户在虚拟对象的直播间输入的互动信息;
发送单元52,用于向直播服务器发送所述互动信息;
接收单元53，用于接收所述直播服务器发送的直播视频流，所述直播视频流为所述直播服务器基于所述互动信息对应的反馈数据生成的视频流，所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据；
输出单元54,用于在所述虚拟对象的直播间显示所述直播视频流。
作为本发明实施例一种可选的实施方式,
所述直播视频流还包括音频数据,所述直播视频流的音频数据为所述直播服务器基于用于对所述互动信息进行回复的回复文本和所述虚拟对象的音色生成的音频数据。
作为本发明实施例一种可选的实施方式,
所述接收单元53,还用于接收所述直播服务器发送的所述回复文本;
所述输出单元54,还用于在所述虚拟对象的直播间显示所述回复文本。
本实施例提供的终端设备可以执行上述方法实施例提供的直播互动方法中终端设备所执行的各个步骤,其实现原理与技术效果类似,此处不再赘述。
基于同一发明构思,本发明实施例还提供了一种电子设备。图6为本发明实施例提供的电子设备的结构示意图,如图6所示,本实施例提供的电子设备包括:存储器61和处理器62,存储器61用于存储计算机程序;处理器62用于在调用计算机程序时执行上述方法实施例提供的直播互动方法中直播服务器或终端设备所执行的各步骤。
具体的,存储器61可用于存储软件程序以及各种数据。存储器61可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据电子设备的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器61可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
处理器62是电子设备的控制中心，利用各种接口和线路连接整个电子设备的各个部分，通过运行或执行存储在存储器61中的软件程序和/或模块，以及调用存储在存储器61中的数据，执行电子设备的各种功能和处理数据，从而对电子设备进行整体监控。处理器62可包括一个或多个处理单元。
此外,应当理解的是,本发明实施例提供的电子设备还可以包括:射频单元、网络模块、音频输出单元、接收单元、传感器、显示单元、用户接收单元、接口单元、以及电源等部件。本领域技术人员可以理解,上述描述出的电子设备的结构并不构成对电子设备的限定,电子设备可以包括更多或更少的部件,或者组合某些部件,或者不同的部件布置。在本发明实施例中,电子设备包括但不限于手机、平板电脑、笔记本电脑、掌上电脑、车载终端、可穿戴设备、以及计步器等。
其中,射频单元可用于收发信息或通话过程中,信号的接收和发送,具体的,将来自基站的下行数据接收后,给处理器62处理;另外,将上行的数据发送给基站。通常,射频单元包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器、双工器等。此外,射频单元还可以通过无线通信系统与网络和其他设备通信。
电子设备通过网络模块为用户提供了无线的宽带互联网访问,如帮助用户收发电子邮件、浏览网页和访问流式媒体等。
音频输出单元可以将射频单元或网络模块接收的或者在存储器61中存储的音频数据转换成音频信号并且输出为声音。而且,音频输出单元还可以提供与电子设备执行的特定功能相关的音频输出(例如,呼叫信号接收声音、消息接收声音等等)。音频输出单元包括扬声器、蜂鸣器以及受话器等。
接收单元用于接收音频或视频信号。接收单元可以包括图形处理器(Graphics Processing Unit,GPU)和麦克风,图形处理器对在视频捕获模式或图像捕获模式中由图像捕获装置(如摄像头)获得的静态图片或视频的图像数据进行处理。处理后的图像帧可以显示在显示单元上。经图形处理器处理后的图像帧可以存储在存储器(或其它存储介质)中或者经由射频单元或网络模块进行发送。麦克风可以接收声音,并且能够将这样的声音处理为音频数据。处理后的音频数据可以在电话通话模式的情况下转换为可经由射频单元发送到移动通信基站的格式输出。
电子设备还包括至少一种传感器,比如光传感器、运动传感器以及其他传感器。具体地,光传感器包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板的亮度,接近传感器可在电子设备移动到耳边时,关闭显示面板和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别电子设备姿态(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;传感器还可以包括指纹传感器、压力传感器、虹膜传感器、分子传感器、陀螺仪、气压计、湿度计、温度计、红外线传感器等,在此不再赘述。
显示单元用于显示由用户输入的信息或提供给用户的信息。显示单元可包括显示面板,可以采用液晶显示器(Liquid Crystal Display,LCD)、有机发光二极管(Organic Light-Emitting Diode,OLED)等形式来配置显示面板。
用户接收单元可用于接收输入的数字或字符信息,以及产生与电子设备的用户设置以及功能控制有关的键信号输入。具体地,用户接收单元包括触控面板以及其他输入设备。触控面板,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板上或在触控面板附近的操作)。触控面板可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器62,接收处理器62发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板。除了触控面板,用户接收单元还可以包括其他输入设备。具体地,其他输入设备可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆,在此不再赘述。
进一步的，触控面板可覆盖在显示面板上，当触控面板检测到在其上或附近的触摸操作后，传送给处理器62以确定触摸事件的类型，随后处理器62根据触摸事件的类型在显示面板上提供相应的视觉输出。一般情况下，触控面板与显示面板是作为两个独立的部件来实现电子设备的输入和输出功能，但是在某些实施例中，可以将触控面板与显示面板集成而实现电子设备的输入和输出功能，具体此处不做限定。
接口单元为外部装置与电子设备连接的接口。例如,外部装置可以包括有线或无线头戴式耳机端口、外部电源(或电池充电器)端口、有线或无线数据端口、存储卡端口、用于连接具有识别模块的装置的端口、音频输入/输出(I/O)端口、视频I/O端口、耳机端口等等。接口单元可以用于接收来自外部装置的输入(例如,数据信息、电力等等)并且将接收到的输入传输到电子设备中的一个或多个元件或者可以用于在电子设备和外部装置之间传输数据。
电子设备还可以包括给各个部件供电的电源(比如电池),可选的,电源可以通过电源管理系统与处理器62逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
本发明实施例还提供一种计算机可读存储介质，该计算机可读存储介质上存储有计算机程序，计算机程序被处理器执行时实现上述方法实施例提供的直播互动方法。
本领域技术人员应明白,本发明的实施例可提供为方法、系统、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质上实施的计算机程序产品的形式。
计算机可读介质包括永久性和非永久性、可移动和非可移动存储介质。存储介质可以由任何方法或技术来实现信息存储，信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括，但不限于相变内存（PRAM）、静态随机存取存储器（SRAM）、动态随机存取存储器（DRAM）、其他类型的随机存取存储器（RAM）、只读存储器（ROM）、电可擦除可编程只读存储器（EEPROM）、快闪记忆体或其他内存技术、只读光盘只读存储器（CD-ROM）、数字多功能光盘（DVD）或其他光学存储、磁盒式磁带、磁盘存储或其他磁性存储设备或任何其他非传输介质，可用于存储可以被计算设备访问的信息。根据本文中的界定，计算机可读介质不包括暂存电脑可读媒体（transitory media），如调制的数据信号和载波。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。
以上所述仅是本发明的具体实施方式，使本领域技术人员能够理解或实现本发明。对这些实施例的多种修改对本领域的技术人员来说将是显而易见的，本文中所定义的一般原理可以在不脱离本发明的精神或范围的情况下，在其它实施例中实现。因此，本发明将不会被限制于本文所述的这些实施例，而是要符合与本文所公开的原理和新颖特点相一致的最宽的范围。

Claims (16)

  1. 一种直播互动方法,其特征在于,包括:
    接收终端设备发送的互动信息,所述互动信息为用户在虚拟对象的直播间输入的互动信息;
    根据所述互动信息获取反馈数据,所述反馈数据包括:所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
    基于所述反馈数据生成直播视频流;
    向终端设备发送所述直播视频流。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述互动信息获取反馈数据,包括:
    解析所述互动信息获取情绪标签,所述情绪标签用于表征对所述互动信息进行反馈时所述虚拟对象的情绪;
    基于第一对应关系和所述情绪标签获取所述面部表情数据,和/或,基于第二对应关系和所述情绪标签获取所述肢体动作数据;
    其中,所述第一对应关系包括各情绪标签与面部表情数据的对应关系,所述第二对应关系包括各情绪标签与肢体动作数据的对应关系。
  3. 根据权利要求1所述的方法,其特征在于,所述根据所述互动信息获取反馈数据,包括:
    根据所述互动信息和第一反馈模型获取所述面部表情数据,和/或,根据所述互动信息和第二反馈模型获取所述肢体动作数据;
    其中,所述第一反馈模型为基于样本互动信息和与所述样本互动信息对应的面部表情数据对第一算法模型进行训练获取的模型,所述第二反馈模型为基于样本互动信息和与所述样本互动信息对应的肢体动作数据对第二算法模型进行训练获取的模型。
  4. 根据权利要求1-3中任一项所述的方法，其特征在于，所述方法还包括：
    根据所述互动信息获取用于对所述互动信息进行回复的回复文本;
    基于所述回复文本和所述虚拟对象的音色生成所述直播视频流的音频数据。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述互动信息获取用于对所述互动信息进行回复的回复文本,包括:
    获取所述互动信息的语义;
    基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本;
    其中,所述第一回复文本库包括至少一个语义对应的回复文本,所述第二回复文本库包括各语义对应的回复文本。
  6. 根据权利要求5所述的方法,其特征在于,所述第一回复文本库中各回复文本与所述虚拟对象的匹配度大于或等于所述第二回复文本库中各回复文本与所述虚拟角色的匹配度;所述基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本,包括:
    判断所述第一回复文本库中是否包含所述互动信息的语义对应的回复文本;
    若是,则基于所述第一回复文本库和所述互动信息的语义获取所述回复文本;
    若否,则基于所述第二回复文本库和所述互动信息的语义获取所述回复文本。
  7. 根据权利要求6所述的方法,其特征在于,在基于第一回复文本库、第二回复文本库以及所述互动信息的语义获取所述回复文本之前,所述方法还包括:
    接收用户基于所述虚拟对象的性格设定输入的至少一个语义对应的回复文本;
    根据用户输入的所述至少一个语义对应的回复文本生成所述第一回复文本库;
    基于大数据获取各个语义类别对应的回复文本;
    根据基于大数据获取的回复文本生成所述第二回复文本库。
  8. 根据权利要求4所述的方法,其特征在于,所述方法还包括:
    向所述终端设备发送所述回复文本。
  9. 根据权利要求4所述的方法,其特征在于,所述方法还包括:
    根据所述回复文本对应的语音获取所述虚拟对象的口型数据;
    所述基于所述反馈数据生成直播视频流,包括:
    根据所述反馈数据和所述口型数据生成所述直播视频流。
  10. 一种直播互动方法，其特征在于，包括：
    接收用户在虚拟对象的直播间输入的互动信息;
    向直播服务器发送所述互动信息;
    接收所述直播服务器发送的直播视频流,所述直播视频流为所述直播服务器基于所述互动信息对应的反馈数据生成的视频流,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
    在所述虚拟对象的直播间显示所述直播视频流。
  11. 根据权利要求10所述的方法,其特征在于,所述直播视频流还包括音频数据,所述直播视频流的音频数据为所述直播服务器基于用于对所述互动信息进行回复的回复文本和所述虚拟对象的音色生成的音频数据。
  12. 根据权利要求11所述的方法,其特征在于,所述方法还包括:
    接收所述直播服务器发送的所述回复文本;
    在所述虚拟对象的直播间显示所述回复文本。
  13. 一种直播服务器,其特征在于,包括:
    接收单元,用于接收终端设备发送的互动信息,所述互动信息为用户在虚拟对象的直播间输入的互动信息;
    处理单元,用于根据所述互动信息获取反馈数据,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
    生成单元,用于基于所述反馈数据生成直播视频流;
    发送单元,用于向终端设备发送所述直播视频流。
  14. 一种终端设备,其特征在于,包括:
    用户输入单元,用于接收用户在虚拟对象的直播间输入的互动信息;
    发送单元,用于向直播服务器发送所述互动信息;
    接收单元,用于接收所述直播服务器发送的直播视频流,所述直播视频流为所述直播服务器基于所述互动信息对应的反馈数据生成的视频流,所述反馈数据包括所述虚拟对象的面部表情数据和/或所述虚拟对象的肢体动作数据;
    输出单元,用于在所述虚拟对象的直播间显示所述直播视频流。
  15. 一种电子设备,其特征在于,包括:存储器和处理器,存储器用于存储计算机程序;处理器用于在调用计算机程序时执行权利要求1-12任一项所述的直播互动方法。
  16. 一种计算机可读存储介质,其特征在于,其上存储有计算机程序,计算机程序被处理器执行时实现权利要求1-12任一项所述的直播互动方法。
PCT/CN2021/129237 2020-12-11 2021-11-08 一种直播互动方法及装置 WO2022121592A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011458910.6A CN114630135A (zh) 2020-12-11 2020-12-11 一种直播互动方法及装置
CN202011458910.6 2020-12-11

Publications (1)

Publication Number Publication Date
WO2022121592A1 true WO2022121592A1 (zh) 2022-06-16

Family

ID=81896316

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/129237 WO2022121592A1 (zh) 2020-12-11 2021-11-08 一种直播互动方法及装置

Country Status (2)

Country Link
CN (1) CN114630135A (zh)
WO (1) WO2022121592A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116828246A (zh) * 2023-06-29 2023-09-29 中科智宏(北京)科技有限公司 一种数字人直播交互方法、系统、设备及存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115515016B (zh) * 2022-11-04 2023-03-31 广东玄润数字信息科技股份有限公司 一种可实现自交互回复的虚拟直播方法、系统及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109491564A (zh) * 2018-10-18 2019-03-19 深圳前海达闼云端智能科技有限公司 虚拟机器人的互动方法,装置,存储介质及电子设备
WO2019092590A1 (ru) * 2017-11-09 2019-05-16 ГИОРГАДЗЕ, Анико Тенгизовна Взаимодействие пользователей в коммуникационной системе при помощи множественного потокового вещания данных дополненной реальности
CN110413841A (zh) * 2019-06-13 2019-11-05 深圳追一科技有限公司 多态交互方法、装置、系统、电子设备及存储介质
CN111010586A (zh) * 2019-12-19 2020-04-14 腾讯科技(深圳)有限公司 基于人工智能的直播方法、装置、设备及存储介质
CN111010589A (zh) * 2019-12-19 2020-04-14 腾讯科技(深圳)有限公司 基于人工智能的直播方法、装置、设备及存储介质
CN111027425A (zh) * 2019-11-28 2020-04-17 深圳市木愚科技有限公司 一种智能化表情合成反馈交互系统及方法
CN111954063A (zh) * 2020-08-24 2020-11-17 北京达佳互联信息技术有限公司 视频直播间的内容显示控制方法及装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130022434A (ko) * 2011-08-22 2013-03-07 (주)아이디피쉬 통신단말장치의 감정 컨텐츠 서비스 장치 및 방법, 이를 위한 감정 인지 장치 및 방법, 이를 이용한 감정 컨텐츠를 생성하고 정합하는 장치 및 방법
CN110085229A (zh) * 2019-04-29 2019-08-02 珠海景秀光电科技有限公司 智能虚拟外教信息交互方法及装置
CN110688008A (zh) * 2019-09-27 2020-01-14 贵州小爱机器人科技有限公司 虚拟形象交互方法和装置
CN111026856A (zh) * 2019-12-09 2020-04-17 出门问问信息科技有限公司 一种智能交互方法、装置以及计算机可读储存介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019092590A1 (ru) * 2017-11-09 2019-05-16 ГИОРГАДЗЕ, Анико Тенгизовна Взаимодействие пользователей в коммуникационной системе при помощи множественного потокового вещания данных дополненной реальности
CN109491564A (zh) * 2018-10-18 2019-03-19 深圳前海达闼云端智能科技有限公司 虚拟机器人的互动方法,装置,存储介质及电子设备
CN110413841A (zh) * 2019-06-13 2019-11-05 深圳追一科技有限公司 多态交互方法、装置、系统、电子设备及存储介质
CN111027425A (zh) * 2019-11-28 2020-04-17 深圳市木愚科技有限公司 一种智能化表情合成反馈交互系统及方法
CN111010586A (zh) * 2019-12-19 2020-04-14 腾讯科技(深圳)有限公司 基于人工智能的直播方法、装置、设备及存储介质
CN111010589A (zh) * 2019-12-19 2020-04-14 腾讯科技(深圳)有限公司 基于人工智能的直播方法、装置、设备及存储介质
CN111954063A (zh) * 2020-08-24 2020-11-17 北京达佳互联信息技术有限公司 视频直播间的内容显示控制方法及装置

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116828246A (zh) * 2023-06-29 2023-09-29 中科智宏(北京)科技有限公司 一种数字人直播交互方法、系统、设备及存储介质
CN116828246B (zh) * 2023-06-29 2024-03-19 中科智宏(北京)科技有限公司 一种数字人直播交互方法、系统、设备及存储介质

Also Published As

Publication number Publication date
CN114630135A (zh) 2022-06-14

Similar Documents

Publication Publication Date Title
US10726836B2 (en) Providing audio and video feedback with character based on voice command
CN106874265B (zh) 一种与用户情绪匹配的内容输出方法、电子设备及服务器
WO2022121592A1 (zh) 一种直播互动方法及装置
WO2021083168A1 (zh) 视频分享方法及电子设备
CN107633098A (zh) 一种内容推荐方法及移动终端
CN110083319B (zh) 笔记显示方法、装置、终端和存储介质
CN110830362B (zh) 一种生成内容的方法、移动终端
CN111739517B (zh) 语音识别方法、装置、计算机设备及介质
CN109993821B (zh) 一种表情播放方法及移动终端
CN113365085B (zh) 一种直播视频生成方法及装置
CN108600079B (zh) 一种聊天记录展示方法及移动终端
CN112749956A (zh) 信息处理方法、装置及设备
CN107862059A (zh) 一种歌曲推荐方法及移动终端
CN113420177A (zh) 音频数据处理方法、装置、计算机设备及存储介质
CN110012172A (zh) 一种来电处理方法及终端设备
CN109947988B (zh) 一种信息处理方法、装置、终端设备及服务器
CN110062281B (zh) 一种播放进度调节方法及其终端设备
CN112165627A (zh) 信息处理方法、装置、存储介质、终端及系统
CN110880330A (zh) 音频转换方法及终端设备
CN212588503U (zh) 一种嵌入式音频播放装置
CN111416955B (zh) 一种视频通话方法及电子设备
CN108763514B (zh) 一种信息显示方法及移动终端
CN108958505B (zh) 一种显示候选信息的方法及终端
CN110764618A (zh) 一种仿生交互系统、方法及相应的生成系统和方法
CN112489619A (zh) 语音处理方法、终端设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21902299

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21902299

Country of ref document: EP

Kind code of ref document: A1