WO2021114708A1 - Method, apparatus, and computer device for implementing a multi-person video live broadcast service - Google Patents

Method, apparatus, and computer device for implementing a multi-person video live broadcast service

Info

Publication number
WO2021114708A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
live
frame
live broadcast
person
Prior art date
Application number
PCT/CN2020/109869
Other languages
English (en)
French (fr)
Inventor
唐自信
薛德威
Original Assignee
上海幻电信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海幻电信息科技有限公司 filed Critical 上海幻电信息科技有限公司
Priority to US17/783,630 priority Critical patent/US11889132B2/en
Publication of WO2021114708A1 publication Critical patent/WO2021114708A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2365Multiplexing of several video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2668Creating a channel for a dedicated end-user group, e.g. insertion of targeted commercials based on end-user profiles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip

Definitions

  • This application relates to the technical field of video processing, and in particular to a method, device, and computer equipment for realizing a multi-person video live broadcast service.
  • In the related art, the communication mode of the users watching a live broadcast in a live broadcast group is limited to sending text messages (i.e., bullet comments, or "barrage") in the live broadcast room, and the users cannot watch the live video of other users, so the ways in which live broadcast users can interact are relatively limited, resulting in a poor user experience.
  • This application provides a method for realizing a multi-person video live broadcast service, including:
  • acquiring the first video and the second video of the live broadcast host, and acquiring the third videos of the live broadcast members in the live broadcast group other than the live broadcast host, where the first video includes a real-time picture collected by a first camera device of the live broadcast host, the second video includes a video watched by the live broadcast host, and the third videos include real-time pictures collected by second camera devices of the other live broadcast members;
  • splicing the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream, wherein each frame of the multi-person live video stream contains a frame picture from the first video, a frame picture from the second video, and frame pictures from the third videos of the other live broadcast members;
  • the multi-person live video stream is sent to the live client corresponding to each live member for viewing by each live member.
  • face detection is performed on the first video and the third videos of the other live broadcast members through a detection thread, and after a face is detected, a rendering thread is used to render the detected face area.
  • when the video type of the second video is a cross-dressing video, synthesizing each frame of the multi-person live video stream includes:
  • the second to-be-combined video frame is spliced with the to-be-combined video frame that includes the sticker special effect and the to-be-combined video frame that does not include the sticker special effect to obtain a frame picture in the multi-person live video stream.
  • when the video type of the second video is a challenge video, synthesizing each frame of the multi-person live video stream includes:
  • the preset sticker effect is added to the video frame to be synthesized where the preset expression is recognized, and the sticker effect is not added to the video frame to be synthesized for which the preset expression is not recognized;
  • when the video type of the second video is a plot-guessing video, synthesizing each frame of the multi-person live video stream includes:
  • the method for realizing the multi-person video live broadcast service further includes:
  • the method for realizing the multi-person video live broadcast service further includes:
  • the present invention also provides a device for realizing a multi-person video live broadcast service, including:
  • the acquiring module is used to acquire the first video and the second video of the live broadcast host, and the third video of other live broadcast members in the live broadcast group except the live broadcast host, wherein the first video includes the live broadcast A real-time picture collected by a first camera device of the host, the second video includes a video watched by the live broadcaster, and the third video includes a real-time picture collected by a second camera device of the other live broadcast member;
  • the splicing module is used to splice the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream, wherein each frame picture of the multi-person live video stream includes the frame picture in the first video, the frame picture in the second video, and the frame pictures in the third videos of the other live broadcast members;
  • the sending module is used to send the multi-person live video stream to the live client corresponding to each live member for viewing by each live member.
  • the present application also provides a computer device.
  • the computer device includes a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer-readable instructions.
  • the present application also provides a computer-readable storage medium on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, the steps of the foregoing method are implemented.
  • In the above solution, the first video and the second video of the live broadcast host are obtained, together with the third videos of the other live broadcast members in the live broadcast group; the first video, the second video, and the third videos of the other live broadcast members are spliced to obtain a multi-person live video stream; and the multi-person live video stream is sent to the live broadcast client corresponding to each live broadcast member for viewing.
  • In this way, each live broadcast member can watch the live video pictures of all live broadcast members on the display interface of their live broadcast client, and can also watch a video together with the live broadcast host, which adds ways for live broadcast members to interact and enhances the user experience.
  • FIG. 1 is a framework diagram of an embodiment of a system for implementing the multi-person video live broadcast service described in this application;
  • FIG. 2 is a flowchart of an embodiment of a method for implementing a multi-person video live broadcast service described in this application;
  • FIG. 3 is a detailed flowchart of the steps of synthesizing each frame of the multi-person live video stream in an embodiment of the application;
  • FIG. 4 is a schematic diagram of frame pictures in a multi-person live stream obtained after splicing in an embodiment of the application;
  • FIG. 5 is a schematic diagram of frame pictures in a multi-person live stream obtained after splicing in another embodiment of the application;
  • FIG. 7 is a schematic diagram of frame pictures in a multi-person live stream obtained after splicing in another embodiment of the application.
  • FIG. 8 is a detailed flowchart of the steps of synthesizing each frame of the multi-person live video stream in another embodiment of the application.
  • FIG. 9 is a schematic diagram of frame pictures in a multi-person live stream obtained after splicing in another embodiment of the application.
  • FIG. 10 is a flowchart of another embodiment of a method for implementing a multi-person video live broadcast service according to this application.
  • FIG. 11 is a flowchart of another embodiment of a method for implementing a multi-person video live broadcast service according to this application.
  • FIG. 12 is a block diagram of an embodiment of an apparatus for implementing a multi-person video live broadcast service according to this application.
  • FIG. 13 is a schematic diagram of the hardware structure of a computer device for implementing a method for implementing a multi-person video live broadcast service provided by an embodiment of the application.
  • The terms first, second, third, etc. may be used in this disclosure to describe various information, but the information should not be limited to these terms; these terms are only used to distinguish information of the same type from each other.
  • For example, without departing from the scope of this disclosure, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information.
  • The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
  • Fig. 1 schematically shows a schematic diagram of an application environment of a method for implementing a multi-person video live broadcast service according to an embodiment of the present application.
  • the system of the application environment may include a live broadcast host terminal 10, a live broadcast participant terminal 20 (a live broadcast participant being a live broadcast member in the live broadcast group other than the live broadcast host), and a server 30.
  • the live broadcast host terminal 10 and the live broadcast participant terminal 20 are connected to the server 30 via wireless or wired connections and run corresponding application clients or web clients, through which they upload their live video streams to the server 30, thereby triggering the server 30 to execute the multi-person video live broadcast service implementation method.
  • the live broadcast host terminal 10 and the live broadcast participant terminal 20 may each be a PC, a mobile phone, an iPad, a tablet computer, a notebook computer, a personal digital assistant, and the like.
  • FIG. 2 is a schematic flowchart of a method for implementing a multi-person video live broadcast service according to an embodiment of the application. It can be understood that the flowchart in this method embodiment is not used to limit the order in which the steps are executed. The following is an exemplary description with the server as the execution body. It can be seen from the figure that the method for implementing the multi-person video live broadcast service provided in this embodiment includes:
  • Step S20: Obtain the first video and the second video of the live broadcast host, and obtain the third videos of the other live broadcast members in the live broadcast group, where the first video includes a real-time picture collected by a first camera device of the live broadcast host, the second video includes a video watched by the live broadcast host, and the third videos include real-time pictures collected by second camera devices of the other live broadcast members.
  • the live broadcast host and each live broadcast member in the live broadcast group except the live broadcast host upload the live video stream to the server through the live broadcast client.
  • the first camera device may be a camera device (such as a camera) built into the live broadcast host terminal, or a camera device external to the live broadcast host terminal.
  • the first video can be collected by the first camera device; it is the current live broadcast picture of the live broadcast host, that is, the real-time picture collected by the first camera device.
  • the first video includes the face of the live broadcaster.
  • the second video is a video that the live broadcast host wants to watch together with the other live broadcast members, played by the live broadcast host through the live broadcast host terminal.
  • the video may be a local video or a network video.
  • the second video can be of multiple types, such as a cross-dressing video, a challenge video, a plot-guessing video, and so on.
  • the second camera device may be a camera device (such as a camera) built into the terminal of the live broadcast participant (the other live broadcast member), or a camera device external to that terminal, through which the third video can be collected.
  • the third video is the current live broadcast picture of the live broadcast participant, that is, the real-time picture collected by the second camera device.
  • the third video includes the faces of live broadcast participants.
  • in this embodiment, a live broadcast group needs to be created by the live broadcast host so that multiple live broadcast participants can join it; the live broadcast host can then perform the multi-person live broadcast service.
  • Step S21: Splice the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream, wherein each frame of the multi-person live video stream includes the frame picture in the first video, the frame picture in the second video, and the frame pictures in the third videos of the other live broadcast members.
  • each frame of the synthesized multi-person live video stream includes the live video frame of each live broadcast member, so that each live broadcast member can see every member's live broadcast picture on the display interface of the live broadcast client.
  • the frames in the synthesized multi-person live video stream may be divided into display blocks corresponding to the number of live members according to the number of live members.
  • for example, if the live broadcast group has 6 live broadcast members, the frame pictures in the multi-person live video stream can be divided into 7 blocks: 6 blocks display the video frames of the live broadcast members, and the remaining block displays the frame pictures of the video watched by the live broadcast host. The effect for each live broadcast member watching the multi-person live video stream is to see the live pictures of all 6 live broadcast members on the display interface of the live broadcast client, together with the frame pictures of the video watched by the live broadcast host. A minimal layout sketch follows.
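  • As an illustration only (the application does not specify an implementation): the sketch below assumes OpenCV and NumPy are available, that every frame arrives as a BGR array, and that the 240x320 tile size and near-square row-major grid are arbitrary choices.

```python
import math

import cv2
import numpy as np

TILE_H, TILE_W = 240, 320  # per-block size; an illustrative assumption

def layout_blocks(watched_frame, member_frames):
    """Compose one frame of the multi-person live stream: one block for
    the watched video plus one block per live member (7 blocks for 6
    members)."""
    tiles = [cv2.resize(f, (TILE_W, TILE_H))
             for f in [watched_frame, *member_frames]]
    cols = math.ceil(math.sqrt(len(tiles)))          # near-square grid
    rows = math.ceil(len(tiles) / cols)
    canvas = np.zeros((rows * TILE_H, cols * TILE_W, 3), dtype=np.uint8)
    for i, tile in enumerate(tiles):
        r, c = divmod(i, cols)
        canvas[r * TILE_H:(r + 1) * TILE_H,
               c * TILE_W:(c + 1) * TILE_W] = tile
    return canvas
```

  For 6 members plus the watched video this yields a 3x3 grid with two empty blocks; a production layout would more likely use the fixed 7-block arrangement shown in the application's figures.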
  • during splicing, face detection may be performed on the first video and the third videos of the other live broadcast members through a detection thread, and after a face is detected, the detected face area is rendered by a rendering thread.
  • compared with using a single thread for video synthesis processing, using multiple threads for video synthesis processing can save video synthesis time and thus avoid video stuttering when watching the multi-person live video stream.
  • more threads may also be used to process the to-be-composited video frames, so as to obtain a multi-person live video stream.
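  • Below is a hypothetical sketch of such multi-threaded per-member processing, assuming OpenCV is available and using its bundled Haar cascade as a crude stand-in for the detection and rendering described above.

```python
import concurrent.futures as cf

import cv2

def detect_and_render(frame):
    """Detect faces in one member's frame and render the detected areas
    (here simply outlined); runs on a worker thread. A detector is
    created per call because a shared CascadeClassifier is not
    guaranteed to be thread-safe."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return frame

def process_member_frames(frames):
    # One task per member frame, so detection and rendering run in
    # parallel rather than serially on one compositing thread.
    with cf.ThreadPoolExecutor() as pool:
        return list(pool.map(detect_and_render, frames))
```

  OpenCV releases the GIL inside its C++ routines, so a thread pool genuinely shortens per-frame synthesis time here.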
  • in an exemplary embodiment, when the video type of the second video is a cross-dressing video, the synthesis of each frame of the multi-person live video stream includes:
  • Step S30: Determine whether the timestamp corresponding to the second to-be-synthesized video frame in the second video is a preset timestamp.
  • when the video type of the second video is a cross-dressing video, it is first determined whether the timestamp corresponding to the second to-be-synthesized video frame is a preset timestamp.
  • the preset timestamp may be one or more time points preset by the live broadcast host, or one or more time points preset by the system by default, which is not limited in the embodiments of the present application.
  • a cross-dressing video is a video for which, when the live broadcast users watch it to a specific time point, the server adds corresponding sticker special effects for each live broadcast user, so as to realize a costume change.
  • the second to-be-synthesized video frame is the current to-be-synthesized video frame in the second video. For example, if the first frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the first frame of the second video; if the third frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the third frame of the second video.
  • Step S31: If yes, perform face detection on the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members.
  • in a specific embodiment, a face recognition model can be used to perform face detection on the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members, where the face recognition model is an existing model for face detection and will not be described in detail in this embodiment.
  • the first to-be-synthesized video frame and the third to-be-synthesized video frame are the current to-be-synthesized video frames in the first video and the third video, respectively. For example, if the first frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the first frame of the first video and the third to-be-synthesized video frame is the first frame of the third video; if the third frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the third frame of the first video and the third to-be-synthesized video frame is the third frame of the third video.
  • if the timestamp is not a preset timestamp, the video frames can be synthesized directly without face detection.
  • Step S32: After a face is detected, a preset sticker special effect is added to the detected face to obtain a to-be-synthesized video frame containing the sticker special effect; no sticker special effect is added to a to-be-synthesized video frame in which no face is detected.
  • the sticker special effect may be a preset beauty sticker, eye shadow, gesture special effect, fun dressing special effect, and the like.
  • when adding sticker special effects, the same sticker special effect can be added to the face of each live broadcast member, different sticker special effects can be added to different members' faces, or a sticker special effect can be added to the faces of only one or more live broadcast members, which is not limited in this embodiment.
  • Step S33: Splice the second to-be-synthesized video frame with the to-be-synthesized video frames containing the sticker special effect and the to-be-synthesized video frames without the sticker special effect to obtain a frame picture in the multi-person live video stream.
  • specifically, the second to-be-synthesized video frame, the to-be-synthesized video frames containing the sticker special effect, and the to-be-synthesized video frames without the sticker special effect can be spliced to obtain a frame picture in the multi-person live video stream. Exemplarily, the frame pictures in the multi-person live stream obtained after splicing are shown in FIG. 4 or FIG. 5, where FIG. 4 is a schematic diagram of a synthesized frame picture after adding a sticker special effect to only one live broadcast user, and FIG. 5 is a schematic diagram of a synthesized frame picture after adding sticker special effects to all live broadcast users.
  • in this embodiment, when live broadcast users watch a cross-dressing video together, a preset sticker special effect is added to the faces of the live broadcast users, giving the users greater interest in watching videos together and improving the user experience. A sketch of this flow follows.
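  • Below is a minimal, hypothetical sketch of steps S30 to S32: the preset timestamps, the sticker image, and the Haar cascade used in place of the face recognition model are all illustrative assumptions, and a real implementation would match timestamps with a tolerance rather than exact equality.

```python
import cv2

PRESET_TIMESTAMPS_MS = {10_000, 25_000}  # hypothetical preset time points

def maybe_add_sticker(member_frame, sticker_bgr, watched_ts_ms):
    """If the watched video has reached a preset timestamp, detect faces
    in the member's frame and paste the sticker over each detected area."""
    if watched_ts_ms not in PRESET_TIMESTAMPS_MS:
        return member_frame  # S30: not a preset timestamp, splice directly
    detector = cv2.CascadeClassifier(               # S31: face detection
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(member_frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        # S32: overlay a preset sticker effect on the detected face area
        member_frame[y:y + h, x:x + w] = cv2.resize(sticker_bgr, (w, h))
    return member_frame
```

  Frames with no detected face pass through unchanged, matching step S32; the result is then spliced as in step S33.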
  • in an exemplary embodiment, when the video type of the second video is a challenge video, the synthesis of each frame of the multi-person live video stream includes:
  • Step S40: Perform expression recognition on the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members.
  • when the video type of the second video is a challenge video, an expression recognition model may be used to perform expression recognition on the first to-be-synthesized video frame and the third to-be-synthesized video frames of the other live broadcast members, where the expression recognition model may be an existing model for recognizing facial expressions and will not be described in detail in this embodiment.
  • a challenge video is a video for which, while watching, the live broadcast users can compete with (PK) one another.
  • for example, when the challenge video is a funny challenge video, the live broadcast users can compete to see who laughs first: when a user laughs, that user is eliminated. When the challenge video is a sad challenge video, the live broadcast users can compete to see who sheds tears first: when a user sheds tears, that user is eliminated.
  • the first to-be-synthesized video frame and the third to-be-synthesized video frame are the current to-be-synthesized video frames in the first video and the third video, respectively. For example, if the first frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the first frame of the first video and the third to-be-synthesized video frame is the first frame of the third video; if the third frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the third frame of the first video and the third to-be-synthesized video frame is the third frame of the third video.
  • Step S41: When the preset expression is recognized, a preset sticker special effect is added to the to-be-synthesized video frame in which the preset expression is recognized; no sticker special effect is added to a to-be-synthesized video frame in which the preset expression is not recognized.
  • the preset expression is a predetermined facial expression, such as a smiling face or a crying face.
  • the sticker special effect may be a preset sticker special effect used to indicate that the live broadcast user has failed a challenge, such as "out”.
  • Step S42: Splice the second to-be-synthesized video frame in the second video with the to-be-synthesized video frames containing the sticker special effect and the to-be-synthesized video frames without the sticker special effect to obtain a frame picture in the multi-person live video stream.
  • specifically, the second to-be-synthesized video frame, the to-be-synthesized video frames containing the sticker special effect, and the to-be-synthesized video frames without the sticker special effect can be spliced to obtain a frame picture in the multi-person live video stream. Exemplarily, the frame pictures in the multi-person live stream obtained after splicing are shown in FIG. 7.
  • the second to-be-synthesized video frame is the current to-be-synthesized video frame in the second video. For example, if the first frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the first frame of the second video; if the third frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the third frame of the second video.
  • in this embodiment, when live broadcast users watch a challenge video together, they can compete with one another, which improves the enjoyment of watching the video and the user experience. A sketch of the expression check follows.
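  • As a crude, hypothetical stand-in for the expression recognition model of step S40, the sketch below uses OpenCV's bundled smile cascade to approximate a "laugh challenge" check and stamps an "OUT" label (the sticker special effect of step S41) on a frame where a smile is found; a real system would use a proper expression recognition model.

```python
import cv2

def mark_if_smiling(member_frame):
    """S40/S41 sketch: if a smile is detected in the member's frame,
    stamp an elimination label on it; otherwise return it unchanged."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_smile.xml")
    gray = cv2.cvtColor(member_frame, cv2.COLOR_BGR2GRAY)
    # A high minNeighbors keeps this whole-frame smile check from firing
    # on noise; the tuning values are illustrative only.
    smiles = detector.detectMultiScale(gray, scaleFactor=1.7,
                                       minNeighbors=20)
    if len(smiles) > 0:
        cv2.putText(member_frame, "OUT", (10, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)
    return member_frame
```

  The marked and unmarked frames are then spliced with the second to-be-synthesized video frame as in step S42.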
  • in an exemplary embodiment, when the video type of the second video is a plot-guessing video, the synthesis of each frame of the multi-person live video stream includes:
  • Step S50: Detect whether a plot-guessing message sent by any live broadcast member in the live broadcast group is received.
  • when the video type of the second video is a plot-guessing video, it is detected whether a plot-guessing message sent by a live broadcast member is received; that is, while the live broadcast members are watching the video together, any live broadcast member can send a plot-guessing message inviting the members to guess how the plot of the jointly watched video will develop.
  • for example, the live broadcast host can initiate a plot-guessing message while the live broadcast members are watching, such as asking the users to guess how many eggs are used for the fried eggs shown in the video.
  • the plot-guessing message may be a message containing candidate answers, or a message containing only the guessing question without candidate answers.
  • Step S51: If yes, add the plot-guessing message to the second to-be-synthesized video frame in the second video.
  • after the plot-guessing message is received, it may be added to the second to-be-synthesized video frame in the second video; if no plot-guessing message is received, the frame pictures of the multi-person live video stream can be synthesized directly.
  • the second to-be-synthesized video frame is the current to-be-synthesized video frame in the second video. For example, if the first frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the first frame of the second video; if the third frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the third frame of the second video.
  • Step S52: Splice the second to-be-synthesized video frame containing the plot-guessing message with the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members to obtain a frame picture in the multi-person live video stream.
  • specifically, the second to-be-synthesized video frame containing the plot-guessing message can be spliced with the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members to obtain a frame picture in the multi-person live video stream. Exemplarily, the frame pictures in the multi-person live stream obtained after splicing are shown in FIG. 9.
  • the first to-be-synthesized video frame and the third to-be-synthesized video frame are the current to-be-synthesized video frames in the first video and the third video, respectively. For example, if the first frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the first frame of the first video and the third to-be-synthesized video frame is the first frame of the third video; if the third frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the third frame of the first video and the third to-be-synthesized video frame is the third frame of the third video.
  • in this embodiment, when live broadcast members watch a plot-guessing video together, any live broadcast member can send a plot-guessing message so that the members can guess the direction of the plot, which increases interaction between live broadcast members and improves the user experience. A sketch of the message overlay follows.
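  • The sketch below illustrates step S51 only: overlaying a received plot-guessing message on the watched video's frame before splicing. The drawing style is an arbitrary choice, and cv2.putText handles ASCII text only, so a production system would render the message with a proper text or UI layer.

```python
import cv2

def add_guess_message(second_frame, message):
    """S51 sketch: draw the plot-guessing message on the second
    to-be-synthesized video frame; the frame is then spliced as in S52."""
    # Dark banner behind the text for readability (illustrative styling).
    cv2.rectangle(second_frame, (0, 0), (second_frame.shape[1], 48),
                  (0, 0, 0), thickness=-1)
    cv2.putText(second_frame, message, (10, 32),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)
    return second_frame

# e.g. add_guess_message(frame, "Guess: how many eggs are used?")
```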
  • Step S22: Send the multi-person live video stream to the live broadcast client corresponding to each live broadcast member for viewing by each live broadcast member.
  • each live broadcast member includes the live broadcast host and other live broadcast members in the live broadcast group except the live broadcast host.
  • after the multi-person live video stream is synthesized, it is immediately sent to the live broadcast client corresponding to each live broadcast member, so that each live broadcast member can simultaneously watch the video pictures of every live broadcast member and the video watched by the live broadcast host.
  • in this embodiment, the first video and the second video of the live broadcast host are obtained, together with the third videos of the other live broadcast members in the live broadcast group; the first video, the second video, and the third videos of the other live broadcast members are spliced to obtain a multi-person live video stream; and the multi-person live video stream is sent to the live broadcast client corresponding to each live broadcast member for viewing.
  • in this way, each live broadcast member can watch the live video pictures of all live broadcast members on the display interface of their live broadcast client, and can also watch a video together with the live broadcast host, which adds ways for live broadcast members to interact and enhances the user experience.
  • FIG. 10 is a schematic flowchart of a method for implementing a multi-person video live broadcast service according to another embodiment of this application. As can be seen from the figure, the method provided in this embodiment includes:
  • Step S60: Obtain the first video and the second video of the live broadcast host, and obtain the third videos of the other live broadcast members in the live broadcast group, wherein the first video includes a real-time picture collected by the first camera device of the live broadcast host, the second video includes the video watched by the live broadcast host, and the third videos include real-time pictures collected by the second camera devices of the other live broadcast members.
  • Step S61: Splice the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream, wherein each frame of the multi-person live video stream includes the frame picture in the first video, the frame picture in the second video, and the frame pictures in the third videos of the other live broadcast members.
  • Step S62: Send the multi-person live video stream to the live broadcast client corresponding to each live broadcast member for viewing by each live broadcast member.
  • steps S60-S62 are the same as the steps S20-S22 in the foregoing embodiment, and will not be repeated in this embodiment.
  • Step S63: Distribute the multi-person live video stream to the CDN network.
  • specifically, after the multi-person live video stream is obtained, it can also be distributed to a CDN (Content Delivery Network).
  • the embodiment of the application distributes multi-person live video streams to the CDN network, so that other users can download the live video streams from the CDN network to play and watch as required, thereby expanding the types of videos that users can watch.
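  • As an illustrative sketch only (the application does not name a distribution mechanism or tool), relaying the composited stream to a CDN ingest point could be done with ffmpeg; the ingest URL below is hypothetical.

```python
import subprocess

def push_to_cdn(local_stream_path, ingest_url):
    """Relay the multi-person live stream to a CDN ingest endpoint,
    copying the already-encoded streams rather than re-encoding."""
    return subprocess.Popen([
        "ffmpeg",
        "-re",              # read input at its native frame rate
        "-i", local_stream_path,
        "-c", "copy",       # pass audio/video through untouched
        "-f", "flv",        # container expected by RTMP ingest
        ingest_url,
    ])

# e.g. push_to_cdn("composite.flv", "rtmp://ingest.example-cdn.com/live/room1")
```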
  • FIG. 11 is a schematic flowchart of a method for implementing a multi-person video live broadcast service according to another embodiment of this application.
  • the method for implementing a multi-person video live broadcast service provided in this embodiment includes:
  • Step S70: Obtain the first video and the second video of the live broadcast host, and obtain the third videos of the other live broadcast members in the live broadcast group, where the first video includes a real-time picture collected by the first camera device of the live broadcast host, the second video includes the video watched by the live broadcast host, and the third videos include real-time pictures collected by the second camera devices of the other live broadcast members.
  • Step S71: Splice the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream, wherein each frame of the multi-person live video stream includes the frame picture in the first video, the frame picture in the second video, and the frame pictures in the third videos of the other live broadcast members.
  • Step S72: Perform bit-rate reduction processing on the multi-person live video stream.
  • in this embodiment, before the multi-person live video stream is sent to the live broadcast client corresponding to each live broadcast member, bit-rate reduction processing can be performed on the stream, thereby reducing the network resources it occupies.
  • the code rate here is the bit rate, which refers to the number of bits transmitted per second. The higher the bit rate, the faster the data transmission speed.
  • Step S73: Send the multi-person live video stream after the bit-rate reduction processing to the live broadcast client corresponding to each live broadcast member for viewing by each live broadcast member.
  • in this way, the time taken to send the stream to the live broadcast client corresponding to each live broadcast member can be reduced, and a better live broadcast effect can be obtained. A sketch of the re-encoding step follows.
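  • A minimal sketch of the bit-rate reduction in step S72, assuming ffmpeg is available; the 800 kbps target and the rate-control flags are illustrative choices, not values from the application.

```python
import subprocess

def reduce_bitrate(src_path, dst_path, video_kbps=800):
    """Re-encode the multi-person live stream at a lower video bitrate
    so it occupies fewer network resources when sent to the clients."""
    subprocess.run([
        "ffmpeg", "-i", src_path,
        "-b:v", f"{video_kbps}k",          # target video bitrate
        "-maxrate", f"{video_kbps}k",      # cap the instantaneous rate
        "-bufsize", f"{2 * video_kbps}k",  # rate-control buffer
        "-c:a", "copy",                    # leave the audio untouched
        dst_path,
    ], check=True)
```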
  • FIG. 12 is a program module diagram of an embodiment of an apparatus 800 for implementing a multi-person video live broadcast service according to this application.
  • the device 800 for realizing a multi-person video live broadcast service includes a series of computer-readable instructions stored in a memory; when the computer-readable instructions are executed, the functions of the multi-person video live broadcast service of the various embodiments of the present application can be realized.
  • the multi-person video live broadcast service implementation apparatus 800 may be divided into one or more modules. For example, as shown in FIG. 12, the device 800 for realizing a multi-person video live broadcast service can be divided into an acquiring module 801, a splicing module 802, and a sending module 803, wherein:
  • the obtaining module 801 is configured to obtain the first video and the second video of the live broadcast host, and obtain the third videos of the other live broadcast members in the live broadcast group, wherein the first video includes a video collected by a first camera device, the second video includes a video watched by the live broadcast host, and the third video includes a video collected by a second camera device.
  • the live broadcast host and each live broadcast member in the live broadcast group except the live broadcast host upload the live video stream to the server through the live broadcast client.
  • the first camera device may be a camera device (such as a camera) built into the live broadcast host terminal, or a camera device external to the live broadcast host terminal.
  • the first video can be collected by the first camera device; it is the current live broadcast picture of the live broadcast host, that is, the real-time picture collected by the first camera device.
  • the first video includes the face of the live broadcaster.
  • the second video is a video that the live broadcast host wants to watch together with the other live broadcast members, played by the live broadcast host through the live broadcast host terminal.
  • the video may be a local video or a network video.
  • the second video can be of multiple types, such as a cross-dressing video, a challenge video, a plot-guessing video, and so on.
  • the second camera device may be a camera device (such as a camera) built into the terminal of the live participant (the other live member), or a camera device external to the terminal of the live participant, through which the third video can be collected
  • the third video is the current live broadcast picture of the live broadcast participant, that is, the real-time picture collected by the second camera device.
  • the third video includes the faces of live broadcast participants.
  • in this embodiment, a live broadcast group needs to be created by the live broadcast host so that multiple live broadcast participants can join it; the live broadcast host can then perform the multi-person live broadcast service.
  • the splicing module 802 is configured to splice the first video, the second video, and the third video of the other live broadcast members to obtain a multi-person live video stream, wherein the multi-person live video stream is Each frame of the video includes a frame of the first video, a frame of the second video, and a frame of the third video of the other live members.
  • each frame of the synthesized multi-person live video stream includes the live video frame of each live broadcast member, so that each live broadcast member can see every member's live broadcast picture on the display interface of the live broadcast client.
  • the frames in the synthesized multi-person live video stream may be divided into display blocks corresponding to the number of live members according to the number of live members.
  • for example, if the live broadcast group has 6 live broadcast members, the frame pictures in the multi-person live video stream can be divided into 7 blocks: 6 blocks display the video frames of the live broadcast members, and the remaining block displays the frame pictures of the video watched by the live broadcast host. The effect for each live broadcast member watching the multi-person live video stream is to see the live pictures of all 6 live broadcast members on the display interface of the live broadcast client, together with the frame pictures of the video watched by the live broadcast host.
  • during splicing, face detection may be performed on the first video and the third videos of the other live broadcast members through a detection thread, and after a face is detected, the detected face area is rendered by a rendering thread.
  • compared with using a single thread for video synthesis processing, using multiple threads for video synthesis processing can save video synthesis time and thus avoid video stuttering when watching the multi-person live video stream.
  • more threads may also be used to process the to-be-composited video frames, so as to obtain a multi-person live video stream.
  • in an exemplary embodiment, when the video type of the second video is a cross-dressing video, the splicing module 802 is further configured to determine whether the timestamp corresponding to the second to-be-synthesized video frame in the second video is a preset timestamp.
  • when the video type of the second video is a cross-dressing video, it is first determined whether the timestamp corresponding to the second to-be-synthesized video frame is a preset timestamp.
  • the preset timestamp may be one or more time points preset by the live broadcast host, or one or more time points preset by the system by default, which is not limited in the embodiments of the present application.
  • a cross-dressing video is a video for which, when the live broadcast users watch it to a specific time point, the server adds corresponding sticker special effects for each live broadcast user, so as to realize a costume change.
  • the second to-be-synthesized video frame is the current to-be-synthesized video frame in the second video. For example, if the first frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the first frame of the second video; if the third frame of the multi-person live video stream is currently being synthesized, the second to-be-synthesized video frame is the third frame of the second video.
  • the splicing module 802 is further configured to: if the timestamp corresponding to the second to-be-synthesized video frame in the second video is the preset timestamp, perform face detection on the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members.
  • in a specific embodiment, a face recognition model can be used to perform face detection on the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members, where the face recognition model is an existing model for face detection and will not be described in detail in this embodiment.
  • the first to-be-synthesized video frame and the third to-be-synthesized video frame are the current to-be-synthesized video frames in the first video and the third video, respectively. For example, if the first frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the first frame of the first video and the third to-be-synthesized video frame is the first frame of the third video; if the third frame of the multi-person live video stream is currently being synthesized, the first to-be-synthesized video frame is the third frame of the first video and the third to-be-synthesized video frame is the third frame of the third video.
  • if the timestamp is not a preset timestamp, the video frames can be synthesized directly without face detection.
  • the splicing module 802 is further configured to add a preset sticker special effect to each detected face after faces are detected, so as to obtain to-be-synthesized video frames containing the sticker special effect; no sticker special effect is added to a to-be-synthesized video frame in which no face is detected.
  • the sticker special effect may be a preset beauty sticker, eye shadow, gesture special effect, fun dressing special effect, and the like.
  • when adding sticker special effects, the same sticker special effect can be added to the face of each live broadcast member, different sticker special effects can be added to different members' faces, or a sticker special effect can be added to the faces of only one or more live broadcast members, which is not limited in this embodiment.
  • the splicing module 802 is further configured to splice the second to-be-synthesized video frame with the to-be-synthesized video frames containing the sticker special effect and the to-be-synthesized video frames without the sticker special effect, to obtain a frame picture in the multi-person live video stream.
  • specifically, the second to-be-synthesized video frame, the to-be-synthesized video frames containing the sticker special effect, and the to-be-synthesized video frames without the sticker special effect can be spliced to obtain a frame picture in the multi-person live video stream. Exemplarily, the frame pictures in the multi-person live stream obtained after splicing are shown in FIG. 4 or FIG. 5, where FIG. 4 is a schematic diagram of a synthesized frame picture after adding a sticker special effect to only one live broadcast user, and FIG. 5 is a schematic diagram of a synthesized frame picture after adding sticker special effects to all live broadcast users.
  • in this embodiment, when live broadcast users watch a cross-dressing video together, a preset sticker special effect is added to the faces of the live broadcast users, giving the users greater interest in watching videos together and improving the user experience.
  • in an exemplary embodiment, when the video type of the second video is a challenge video, the splicing module 802 is further configured to perform expression recognition on the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members.
  • when the video type of the second video is a challenge video, an expression recognition model may be used to perform expression recognition on the first to-be-synthesized video frame and the third to-be-synthesized video frames of the other live broadcast members, where the expression recognition model may be an existing model for recognizing facial expressions and will not be described in detail in this embodiment.
  • a challenge video is a video for which, while watching, the live broadcast users can compete with (PK) one another.
  • for example, when the challenge video is a funny challenge video, the live broadcast users can compete to see who laughs first: when a user laughs, that user is eliminated. When the challenge video is a sad challenge video, the live broadcast users can compete to see who sheds tears first: when a user sheds tears, that user is eliminated.
  • the splicing module 802 is further configured to add a preset sticker special effect to the to-be-synthesized video frame in which the preset expression is recognized; no sticker special effect is added to a to-be-synthesized video frame in which the preset expression is not recognized.
  • the preset expression is a predetermined facial expression, such as a smiling face or a crying face.
  • the sticker special effect may be a preset sticker special effect used to indicate that the live broadcast user has failed a challenge, such as "out”.
  • the splicing module 802 is further configured to splice the second to-be-synthesized video frame in the second video with the to-be-synthesized video frames containing the sticker special effect and the to-be-synthesized video frames without the sticker special effect, to obtain a frame picture in the multi-person live video stream.
  • specifically, the second to-be-synthesized video frame, the to-be-synthesized video frames containing the sticker special effect, and the to-be-synthesized video frames without the sticker special effect can be spliced to obtain a frame picture in the multi-person live video stream. Exemplarily, the frame pictures in the multi-person live stream obtained after splicing are shown in FIG. 7.
  • in this embodiment, when live broadcast users watch a challenge video together, they can compete with one another, which improves the enjoyment of watching the video and the user experience.
  • in an exemplary embodiment, when the video type of the second video is a plot-guessing video, the splicing module 802 is further configured to detect whether a plot-guessing message sent by any live broadcast member in the live broadcast group is received.
  • when the video type of the second video is a plot-guessing video, it is detected whether a plot-guessing message sent by a live broadcast member is received; that is, while the live broadcast members are watching the video together, any live broadcast member can send a plot-guessing message inviting the members to guess how the plot of the jointly watched video will develop.
  • for example, the live broadcast host can initiate a plot-guessing message while the live broadcast members are watching, such as asking the users to guess how many eggs are used for the fried eggs shown in the video.
  • the plot-guessing message may be a message containing candidate answers, or a message containing only the guessing question without candidate answers.
  • the splicing module 802 is further configured to, if a plot-guessing message sent by any live broadcast member in the live broadcast group is received, add the plot-guessing message to the second to-be-synthesized video frame in the second video.
  • after the plot-guessing message is received, it may be added to the second to-be-synthesized video frame in the second video; if no plot-guessing message is received, the frame pictures of the multi-person live video stream can be synthesized directly.
  • the splicing module 802 is further configured to splice the second to-be-synthesized video frame containing the plot-guessing message with the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members, to obtain a frame picture in the multi-person live video stream.
  • specifically, the second to-be-synthesized video frame containing the plot-guessing message can be spliced with the first to-be-synthesized video frame in the first video and the third to-be-synthesized video frames in the third videos of the other live broadcast members to obtain a frame picture in the multi-person live video stream.
  • the frame pictures in the multi-person live stream obtained after splicing are shown in FIG. 9.
  • in this embodiment, when live broadcast members watch a plot-guessing video together, any live broadcast member can send a plot-guessing message so that the members can guess the direction of the plot, which increases interaction between live broadcast members and improves the user experience.
  • the sending module 803 is configured to send the multi-person live video stream to the live broadcast client corresponding to each live broadcast member for viewing by each live broadcast member.
  • each live broadcast member includes the live broadcast host and other live broadcast members in the live broadcast group except the live broadcast host.
  • after the multi-person live video stream is synthesized, it is immediately sent to the live broadcast client corresponding to each live broadcast member, so that each live broadcast member can simultaneously watch the video pictures of every live broadcast member and the video watched by the live broadcast host.
  • in this embodiment, the first video and the second video of the live broadcast host are obtained, together with the third videos of the other live broadcast members in the live broadcast group; the first video, the second video, and the third videos of the other live broadcast members are spliced to obtain a multi-person live video stream; and the multi-person live video stream is sent to the live broadcast client corresponding to each live broadcast member for viewing.
  • in this way, each live broadcast member can watch the live video pictures of all live broadcast members on the display interface of their live broadcast client, and can also watch a video together with the live broadcast host, which adds ways for live broadcast members to interact and enhances the user experience.
  • the device 800 for realizing a multi-person video live broadcast service further includes a distribution module.
  • the distribution module is used to distribute the multi-person live video stream to the CDN network.
  • specifically, after the multi-person live video stream is obtained, it can also be distributed to a CDN (Content Delivery Network).
  • the embodiment of the application distributes multi-person live video streams to the CDN network, so that other users can download the live video streams from the CDN network to play and watch as required, thereby expanding the types of videos that users can watch.
  • the device 800 for realizing a multi-person video live broadcast service further includes a processing module.
  • the processing module is configured to perform bit rate reduction processing on the multi-person live video stream.
  • specifically, to prevent the live broadcast members from experiencing video stuttering when watching the multi-person live video stream, the stream can be subjected to bit rate reduction processing before it is sent to the live broadcast client corresponding to each live broadcast member, thereby reducing the network resources occupied by the multi-person live video stream.
  • the code rate here is the bit rate, which refers to the number of bits transmitted per second; the higher the bit rate, the faster the data transmission speed. For example, a 3 Mbit/s stream carries 3,000,000 bits per second, so halving the bit rate roughly halves the bandwidth each viewer's connection must sustain.
  • by performing bit rate reduction processing on the multi-person live video stream, the time taken to send the stream to the live broadcast client corresponding to each live broadcast member can be reduced, and a better live broadcast effect can be obtained.
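One hedged sketch of such bit rate reduction, assuming an ffmpeg binary is on the PATH; the target figures are illustrative values, not numbers taken from the application.

```python
import subprocess

def reduce_bitrate(input_url: str, output_url: str, video_kbps: int = 1500) -> None:
    """Re-encode the composed stream at a lower target bit rate before it
    is sent on to the live broadcast clients."""
    subprocess.run([
        "ffmpeg", "-i", input_url,
        "-c:v", "libx264", "-b:v", f"{video_kbps}k",
        "-maxrate", f"{video_kbps}k", "-bufsize", f"{2 * video_kbps}k",
        "-c:a", "aac", "-b:a", "96k",
        "-f", "flv", output_url,
    ], check=True)
```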
  • FIG. 13 schematically shows the hardware architecture of a computer device 2 suitable for implementing the method for implementing a multi-person video live broadcast service according to an embodiment of the present application.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • it can be a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers).
  • the computer device 2 at least includes, but is not limited to, a memory 901, a processor 902, and a network interface 903 that can be communicatively connected to one another via a system bus. Specifically:
  • the memory 901 includes at least one type of computer-readable storage medium.
  • the readable storage medium includes flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, and the like.
  • the memory 901 may be an internal storage module of the computer device 2, for example, the hard disk or memory of the computer device 2.
  • the memory 901 may also be an external storage device of the computer device 2, such as a plug-in hard disk equipped on the computer device 2, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc.
  • the memory 901 may also include both the internal storage module of the computer device 2 and its external storage device.
  • the memory 901 is generally used to store an operating system and various application software installed in the computer device 2, for example, program code of a method for implementing a multi-person video live broadcast service.
  • the memory 901 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 902 may be a central processing unit (Central Processing Unit, CPU for short), a controller, a microcontroller, a microprocessor, or other data processing chips.
  • the processor 902 is generally used to control the overall operation of the computer device 2, for example, to perform data interaction or communication-related control and processing with the computer device 2.
  • the processor 902 is configured to run program codes stored in the memory 901 or process data.
  • the network interface 903 may include a wireless network interface or a wired network interface, and the network interface 903 is generally used to establish a communication link between the computer device 2 and other computer devices.
  • the network interface 903 is used to connect the computer device 2 with an external terminal through a network, and establish a data transmission channel and a communication link between the computer device 2 and the external terminal.
  • the network can be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
  • FIG. 13 only shows a computer device with components 901 to 903, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the method for implementing a multi-person video live broadcast service stored in the memory 901 can also be divided into one or more program modules and executed by one or more processors (the processor 902 in this embodiment) to complete the present application.
  • the embodiment of the present application provides a computer-readable storage medium on which computer-readable instructions are stored, and the computer-readable instructions implement the following steps when executed by a processor:
  • acquiring the first video and the second video of the live broadcast host, and acquiring the third videos of the other live broadcast members in the live broadcast group except the live broadcast host, where the first video includes a real-time picture collected by a first camera device of the live broadcast host, the second video includes the video watched by the live broadcast host, and the third videos include real-time pictures collected by second camera devices of the other live broadcast members;
  • splicing the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream, wherein each frame picture of the multi-person live video stream contains the frame picture in the first video, the frame picture in the second video, and the frame pictures in the third videos of the other live broadcast members;
  • sending the multi-person live video stream to the live broadcast client corresponding to each live broadcast member for viewing by each live broadcast member.
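Read as pseudocode, these three steps form a per-frame loop. A minimal sketch of that loop, reusing the hypothetical compose_grid_frame helper from the earlier sketch:

```python
from typing import Iterable, Iterator, List
import numpy as np

def multi_person_stream(host_frames: Iterable[np.ndarray],
                        shared_frames: Iterable[np.ndarray],
                        member_feeds: List[Iterable[np.ndarray]]) -> Iterator[np.ndarray]:
    """Acquire one frame from every source, splice them, and yield the
    composed frame; a caller would encode each yielded frame and send it
    to every live broadcast member's client."""
    for host_f, shared_f, *member_fs in zip(host_frames, shared_frames, *member_feeds):
        # compose_grid_frame is the illustrative helper sketched earlier.
        yield compose_grid_frame(host_f, shared_f, list(member_fs))
```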
  • the computer-readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the computer-readable storage medium may be an internal storage unit of a computer device, such as a hard disk or memory of the computer device.
  • the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk equipped on the computer device, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc.
  • the computer-readable storage medium may also include both the internal storage unit and the external storage device of the computer device.
  • the computer-readable storage medium is generally used to store the operating system and various application software installed in the computer device, such as the program code of the method for implementing the multi-person video live broadcast service in the embodiment.
  • the computer-readable storage medium can also be used to temporarily store various types of data that have been output or will be output.
  • the device embodiments described above are merely illustrative; units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, they may be located in one place or distributed across at least two network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application. Those of ordinary skill in the art can understand and implement them without creative work.
  • each implementation manner can be implemented by means of software plus a general hardware platform, and of course, it can also be implemented by hardware.
  • Those of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments can be implemented by computer-readable instructions to instruct relevant hardware.
  • such programs can be stored in a computer-readable storage medium, and when a program is executed, it may include the flows of the above method embodiments.
  • the storage medium can be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present application discloses a method, an apparatus, a computer device, and a readable storage medium for implementing a multi-person video live broadcast service, belonging to the technical field of video processing. The method includes: acquiring a first video and a second video of a live broadcast host, and acquiring third videos of the other live broadcast members in the live broadcast group except the live broadcast host; splicing the first video, the second video, and the third videos of the other live broadcast members to obtain a multi-person live video stream; and sending the multi-person live video stream to the live broadcast client corresponding to each live broadcast member for viewing by each live broadcast member. The present application can enrich the ways in which live broadcast members interact and improve the user experience.

Description

多人视频直播业务实现方法、装置、计算机设备
本申请要求于2019年12月9日提交中国专利局、申请号为201911251118.0、发明名称为“多人视频直播业务实现方法、装置、计算机设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及视频处理技术领域,尤其涉及一种多人视频直播业务实现方法、装置、计算机设备。
背景技术
随着直播行业的快速发展,越来越多的用户喜爱观看直播。目前,直播群中的各个观看直播的用户的沟通方式仅限于在直播间中发送文字信息(即弹幕)进行交流,不能观看到其他用户的直播视频,从而使得各个直播用户之间的交互方式比较单一,造成用户的使用体验较差。
发明内容
有鉴于此,现提供一种多人视频直播业务实现方法、装置、计算机设备及计算机可读存储介质,以解决现有方法在进行视频直播时,不能观看到其他用户的直播视频的问题。
本申请提供了一种多人视频直播业务实现方法,包括:
获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
可选地,在进行拼接时,通过检测线程对所述第一视频以及所述其他直播成员的第三视频进行人脸检测,并在检测到人脸后,通过渲染线程对检测到的人脸区域进行渲染。
可选地,所述第二视频的视频类型为变装类视频,对所述多人直播视频流中的每一帧画面的合成包括:
判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设时间戳;
若是,则对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测;
在检测到人脸后,在检测到的人脸上添加预设的贴纸特效,以得到包含贴纸特效的待合成视频帧,其中,对未检测到人脸的待合成视频帧不添加贴纸特效;
将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
可选地,所述第二视频的视频类型为挑战类视频,对所述多人直播视频流中的每一帧画面的合成包括:
对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别;
在识别出预设表情时,在识别出预设表情的待合成视频帧中添加预设的贴纸特效,其中,对未识别出预设表情的待合成视频帧不添加贴纸特效;
将所述第二视频中的第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
可选地,所述第二视频的视频类型为猜剧情类视频,对所述多人直播视频流中的每一帧画面的合成包括:
检测是否接收到所述直播群中的任一直播成员发送的剧情猜测消息;
若是,则将所述剧情猜测消息添加至第二视频中的第二待合成视频帧中;
将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
可选地,所述多人视频直播业务实现方法还包括:
将所述多人直播视频流分发至CDN网络中。
可选地,所述多人视频直播业务实现方法还包括:
对所述多人直播视频流进行降码率处理。
本发明本申请还提供了一种多人视频直播业务实现装置,包括:
获取模块,用于获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
拼接模块,用于将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
发送模块,用于将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
本申请还提供了一种计算机设备,所述计算机设备,包括存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现上述方法的步骤。
本申请还提供了一种计算机可读存储介质,其上存储有计算机可读指令,所述计算机可读指令被处理器执行时实现上述方法的步骤。
上述技术方案的有益效果:
本申请实施例中,通过获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频;将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流;将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。本申请实施例中的多人视 频直播业务实现方法,各个直播成员可以在其直播客户端的显示界面上观看到所有直播成员的直播视频画面,并且还可以共同与直播主播同时观看视频,从而可以增加直播成员的交互方式,提升用户体验。
附图说明
图1为本申请所述多人视频直播业务实现的系统框架图的一种实施例的框架图;
图2为本申请所述的多人视频直播业务实现方法的一种实施例的流程图;
图3为本申请一实施方式中对所述多人直播视频流中的每一帧画面的合成的步骤细化流程图;
图4为本申请一实施方式中进行拼接后得到的多人直播流中的帧画面的示意图;
图5为本申请另一实施方式中进行拼接后得到的多人直播流中的帧画面的示意图;
图6为本申请另一实施方式中对所述多人直播视频流中的每一帧画面的合成的步骤细化流程图;
图7为本申请另一实施方式中进行拼接后得到的多人直播流中的帧画面的示意图;
图8为本申请另一实施方式中对所述多人直播视频流中的每一帧画面的合成的步骤细化流程图;
图9为本申请另一实施方式中进行拼接后得到的多人直播流中的帧画面的示意图;
图10为本申请所述的多人视频直播业务实现方法的另一种实施例的流程图;
图11为本申请所述的多人视频直播业务实现方法的另一种实施例的流程图;
图12为本申请所述的多人视频直播业务实现装置的一种实施例的模块图;
图13为本申请实施例提供的执行多人视频直播业务实现方法的计算机设备的硬件结构示意图。
具体实施方式
以下结合附图与具体实施例进一步阐述本申请的优点。
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。
在本公开使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本公开。在本公开和所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,本文中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。
应当理解,尽管在本公开可能采用术语第一、第二、第三等来描述各种信息,但这些信息不应限于这些术语。这些术语仅用来将同一类型的信息彼此区分开。例如,在不脱离本公开范围的情况下,第一信息也可以被称为第二信息,类似地,第二信息也可以被称为第一信息。取决于语境,如在此所使用的词语“如果”可以被解释成为“在……时”或“当……时”或“响应于确定”。
在本申请的描述中,需要理解的是,步骤前的数字标号并不标识执行步骤的前后顺序,仅用于方便描述本申请及区别每一步骤,因此不能理解为对本申请的限制。
图1示意性示出了根据本申请实施例的多人视频直播业务实现方法的应用环境示意图。在示例性的实施例中,该应用环境的系统可包括直播主播终端10、直播参与者(即直播群中除主播直播之外的直播成员)终端20、服务器30。其中,直播主播终端10、直播参与者终端20与服务器30形成无线或有线连接,且直播主播终端10、直播参与者终端20上具有相应的应用客户端或网页客户端,直播主播终端10、直播参与者终端2通过应用客户端或网页客户端将直播视频流上传至服务器30,从而可以触发服务器30执行多人视频直播业务实现方法。其中,直播主播终端10、直播参与者终端20可以为PC、手机、iPAD,平板电脑、笔记本电脑、个人数字助理等。
参阅图2,其为本申请一实施例的多人视频直播业务实现方法的流程示意图。可以理解,本方法实施例中的流程图不用于对执行步骤的顺序进行限定。下面以服务器为执行主体进行示例性描述,从图中可以看出,本实施例中所提供的多人视频直播业务实现方法包括:
步骤S20、获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面。
具体地,本申请实施例在进行多人直播业务时,直播主播以及直播群中除直播主播之外的各个直播成员分别通过其直播客户端将直播视频流上传至服务器。
其中,第一摄像装置可以为直播主播终端内置的摄像装置(比如摄像头),也可以为直播主播终端外接的摄像装置,通过该第一摄像装置可以采集到第一视频,该第一视频为直播主播当前的直播画面,即为通过该第一摄像装置采集的实时画面。优选地,该第一视频包括直播主播的人脸。所述直播主播观看的第二视频为直播主播通过直播主播终端播放的想同其他直播成员共同观看的视频,该视频可以为本地视频,也可以为网络视频。该第二视频的类型可以为多种,比如为变装类视频,挑战类视频、猜剧情类视频等。第二摄像装置可以为直播参与者(所述其他直播成员)终端内置的摄像装置(比如摄像头),也可以为直播参与者终端外接的摄像装置,通过该第二摄像装置可以采集到第三视频,该第三视频为直播参与者当前的直播画面,即通过该第二摄像装置采集到的实时画面。优选地,该第三视频包括直播参与者的人脸。
需要说明的是,本申请实施例在进行多人直播业务之前,需要先通过直播主播创建直播群,以供多个直播参与者加入到该直播群中。在多个直播参与者加入到该直播群之后,该直播主播可以进行多人直播业务。
步骤S21,将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面。
具体地,在进行拼接时,是一帧一帧进行的,首先,获取第一视频、第二视频以及其他直播成员的第三视频中的第一帧画面,然后将获取到的各个第一帧画面进行拼接以得到多人直播视频流中的第一帧画面。也就是说,合成后的多人直播视频流中的第一帧画面包括各个直播成员的直播视频帧画面,这样,每个直播成员即可在直播客户端的显示界面中看到各个直播成员的直播画面。
在一实施方式中,在进行拼接时,可以根据直播成员的数量将合成得到的多人直播视 频流中的帧画面分成与所述直播成员的数量相对应的显示块,比如,直播群中具有6个直播成员,则可以将多人直播视频流中的帧画面分成7块,其中6块帧画面用于显示直播成员的视频帧,另一块帧画面则用于显示直播主播观看的视频的帧画面,每个直播成员观看多人直播视频流的效果即是在直播客户端的显示界面上看到6个直播成员的直播画面,以及看到直播主播观看的视频的帧画面。
需要说明的是,在完成多人直播视频流中的第一帧画面的合成后,继续获取第一视频、第二视频以及其他直播成员的第三视频中的第二帧画面,然后将获取到的各个第二帧画面进行拼接以得到多人直播视频流中的第二帧画面。以此类推,直到完成多人直播视频流中的所有帧画面的合成。
在本申请一实施方式中,为了使得各个直播成员在观看多人直播视频流时,不会产生视频卡顿的效果,可以在进行拼接时,通过检测线程对第一视频以及所述其他直播成员的第三视频进行人脸检测,并在检测到人脸之后,通过渲染线程对检测到的人脸区域进行渲染。相比于采用一个线程进行视频的合成处理,本申请实施例中采用多线程进行视频的合成处理,可以节省视频的合成时间,进而可以避免在观看多人直播视频流时产生视频卡顿。
需要说明的是,在本申请其他实施方式中,在进行拼接时,也可以采用更多的线程对待合成的视频帧进行处理,以得到多人直播视频流。
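A minimal Python sketch of the detection-thread/render-thread split described in the two preceding paragraphs; the Haar cascade model, the queue size, and the callback are assumptions made for illustration, not details from the application.

```python
import queue
import threading
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
work_q: queue.Queue = queue.Queue(maxsize=8)

def detection_worker(frames):
    """Detection thread: find faces in each camera frame, hand them off."""
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        work_q.put((frame, faces))
    work_q.put(None)  # sentinel: no more frames

def render_worker(on_frame_ready):
    """Render thread: draw on each detected face region, then emit the frame."""
    while (item := work_q.get()) is not None:
        frame, faces = item
        for (x, y, w, h) in faces:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        on_frame_ready(frame)

# Running the two stages concurrently overlaps detection with rendering:
# threading.Thread(target=detection_worker, args=(camera_frames,)).start()
# threading.Thread(target=render_worker, args=(send_downstream,)).start()
```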
示例性的,在一实施方式中,参照图3,所述第二视频的视频类型为变装类视频,对所述多人直播视频流中的每一帧画面的合成包括:
步骤S30,判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设时间戳。
具体地,在第二视频的视频类型为变装类视频时,在对所述多人直播视频流中的每一帧画面的合成过程中,首先需要判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设的时间戳。在本实施例中,可以通过直接获取该第二待合成视频帧中的时间戳信息,然后将获取到的时间戳信息与预设时间戳进行比较的方式来判定第二待合成视频帧对应的时间戳与预设时间戳是否相同。在本申请实施例中,所述预设时间戳可以为直播主播预先设定的一个或多个时间点,也可以为系统预先默认设定的一个或者多个时间点,在本申请实施方式中不作限定。
其中,变装类视频指的是直播用户在观看该类视频时,在观看到视频播放到特定时间点时,服务器会为各个直播用户添加对应的贴纸特效,从而实现变装。
需要说明的是,所述第二待合成视频帧为所述第二视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第二待合成视频帧即为所述第二视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第二待合成视频帧即为所述第二视频中的第三帧画面。
步骤S31,若是,则对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测。
具体地,在判定出所述第二待合成视频对应的时间戳为预设时间戳时,则可以采用人脸识别模型对第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测,其中,所述人脸识别模型为现有的用于检测人脸的模型,在本实施了中不再赘述。
需要说明的是,所述第一待合成视频帧、所述第三待合成视频帧依次为所述第一视频中的当前待合成视频帧、所述第三视频中的当前待合成视频帧,比如,当前正在合成多人 视频直播流中的第一帧画面,则所述第一待合成视频帧即为所述第一视频中的第一帧画面,所述第三待合成视频帧即为所述第三视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第一待合成视频帧即为所述第一视频中的第三帧画面,所述第三待合成视频帧即为所述第三视频中的第三帧画面。
在本申请实施例中,在判定出所述第二待合成视频帧对应的时间戳不为预设时间戳时,则可以直接进行视频帧的合成,而不用进行人脸检测。
步骤S32,在检测到人脸后,在检测到的人脸上添加预设的贴纸特效,以得到包含贴纸特效的待合成视频帧,其中,对未检测到人脸的待合成视频帧不添加贴纸特效。
具体地,所述贴纸特效可以为预先设定的美妆贴纸、眼影、手势特效、趣味变装特效等。
需要说明的是,本申请实施例中,在添加贴纸特效时,可以在各个直播成员的人脸上添加相同的贴纸特效,也可以在各个直播成员的人脸上添加不同的贴纸特效,或者也可以仅仅在某一个或多个直播成员的人脸上添加贴纸特效,在本实施例中不作限定。
步骤S33,将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
具体地,在完成贴纸特效的添加之后,即可以将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接合成,以得到所述多人直播视频流中帧画面。示例性的,进行拼接后得到的多人直播流中的帧画面如图4或5所示,其中,图4为仅仅对一个直播用户进行添加贴纸特效后的合成帧画面的示意图,图5为对所有直播用户都进行添加贴纸特效后的合成帧画面的示意图。
本实施例中,通过在判定出主播直播观看的第二视频中的当前待合成视频帧对应的时间戳为预设时间戳时,在直播用户的人脸上添加预设的贴纸特效,从而可以使得直播用户在共同观看视频时,具有更大的观看兴趣,提高用户的使用体验。
在另一实施方式中,参照图6,所述第二视频的视频类型为挑战类视频,对所述多人直播视频流中的每一帧画面的合成包括:
步骤S40,对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别。
具体地,在第二视频的视频类型为挑战类视频时,在对所述多人直播视频流中的每一帧画面的合成过程中,首先需要对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别。在本申请实施例中,可以采用表情识别模型对所述第一待合成视频帧、其他直播成员的所述第三待合成视频帧进行表情识别,其中,该表情识别模型可以采用现有的用于识别表情的模型,在本实施例中不再赘述。
其中,所述挑战类视频指的是直播用户在观看该类视频时,各个直播用户可以进行挑战PK,比如,当该挑战类视频为搞笑类挑战视频时,直播用户之间可以挑战看谁先笑,当用户笑时,该用户即被淘汰;又比如,当该挑战类视频为悲伤类挑战视频时,直播用户之间可以挑战看谁先流眼泪,当用户流眼泪时,该用户即被淘汰。
需要说明的是,在本实例中,所述第一待合成视频帧、所述第三待合成视频帧依次为所述第一视频中的当前待合成视频帧、所述第三视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第一待合成视频帧即为所述第一视频中的第一帧画面,所述第三待合成视频帧即为所述第三视频中的第一帧画面;又比如,当前 正在合成多人视频直播流中的第三帧画面,则所述第一待合成视频帧即为所述第一视频中的第三帧画面,所述第三待合成视频帧即为所述第三视频中的第三帧画面。
步骤S41,在识别出预设表情时,在识别出预设表情的待合成视频帧中添加预设的贴纸特效,其中,对未识别出预设表情的待合成视频帧不添加贴纸特效。
具体地,所述预设表情为预先设定的表情,比如笑脸表情,或者哭脸表情等。所述贴纸特效可以为预先设定的用于表明直播用户挑战失败的贴纸特效,比如为“out”。
步骤S42,将所述第二视频中的第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
具体地,在完成贴纸特效的添加之后,即可以将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接合成,以得到所述多人直播视频流中帧画面。示例性的,进行拼接后得到的多人直播流中的帧画面如图7所示。
需要说明的是,所述第二待合成视频帧为所述第二视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第二待合成视频帧即为所述第二视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第二待合成视频帧即为所述第二视频中的第三帧画面。
本实施例中,通过在主播直播观看的第二视频为挑战类视频时,各个直播用户之间可以进行挑战,从而可以提升直播用户观看视频的乐趣,进而提升用户体验。
在另一实施方式中,参照图8,所述第二视频的视频类型为猜剧情类视频,对所述多人直播视频流中的每一帧画面的合成包括:
步骤S50,检测是否接收所述直播群中的任一直播成员发送的剧情猜测消息。
具体地,在第二视频的视频类型为猜剧情类视频时,在对所述多人直播视频流中的每一帧画面的合成过程中,首先需要检测是否接收到为直播群中的任一个直播成员发送的剧情猜测消息,即各个直播成员在共同观看视频时,其中,任意一个直播成员可以发出剧情猜测消息,以供各个直播成员对该共同观看视频的剧情走向进行猜测。比如,直播成员在观看一个煎鸡蛋的猜剧情视频时,在直播成员观看的过程中,直播人员可以发起一个剧情猜测消息,以让各个直播成员猜测视频中的煎鸡蛋的用户共用了几个鸡蛋。其中,该剧情猜测消息可以为一个包含候选答案的消息,也可以为不包含候选答案的消息,只包括猜测问题。
步骤S51,若是,则将所述剧情猜测消息添加至第二视频中的第二待合成视频帧中。
具体地,在接收到某一个直播成员发送的剧情猜测消息时,即可以将该剧情猜测消息添加至第二视频中的第二待合成视频帧中。
在本申请实施例中,在未接收到直播群中的任一直播成员发送的剧情猜测消息时,则可以直接进行多人直播视频流中的视频帧的合成。
需要说明的是,所述第二待合成视频帧为所述第二视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第二待合成视频帧即为所述第二视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第二待合成视频帧即为所述第二视频中的第三帧画面。
步骤S52,将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
具体地,在完成剧情猜测消息的添加之后,即可以将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接合成,以得到所述多人直播视频流中帧画面。示例性的,进行拼接后得到的多人直播流中的帧画面如图9所示。
需要说明的是,所述第一待合成视频帧、所述第三待合成视频帧依次为所述第一视频中的当前待合成视频帧、所述第三视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第一待合成视频帧即为所述第一视频中的第一帧画面,所述第三待合成视频帧即为所述第三视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第一待合成视频帧即为所述第一视频中的第三帧画面,所述第三待合成视频帧即为所述第三视频中的第三帧画面。
本实施例中,通过在主播直播观看的第二视频为猜剧情类视频时,任意一个直播成员可以发送剧情猜测消息,以让直播成员进行剧情走向的猜测,从而增加直播成员之间的互动,进而提升用户体验。
步骤S22,将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
具体地,所述各个直播成员包括所述直播主播以及直播群中除所述直播主播之外的其他直播成员。
本申请实施例中,在合成多人直播视频流后,立即将该多人直播视频流发送至各个直播成员对应的直播客户端中,以便各个直播成员可以观看到同时包含各个直播成员的视频画面以及直播主播观看的视频画面。
本申请实施例中,通过获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频;将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流;将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。本申请实施例中的多人视频直播业务实现方法,各个直播成员可以在其直播客户端的显示界面上观看到所有直播成员的直播视频画面,并且还可以共同与直播主播同时观看视频,从而可以增加直播成员的交互方式,提升用户体验。
进一步地,参阅图10,其为本申请另一实施例的多人视频直播业务实现方法的流程示意图,从图中可以看出,本实施例中所提供的多人视频直播业务实现方法包括:
步骤S60,获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面。
步骤S61,将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面。
步骤S62,将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
具体地,上述步骤S60-S62与上述实施例中的步骤S20-S22相同,在本实施例中不再赘述。
步骤S63,将所述多人直播视频流分发至CDN网络中。
具体地,在得到多人直播视频流之后,为了让其他用户也可以观看到该多人直播视频流,可以将该多人直播视频流分发至CDN网络(Content Delivery Network,内容分发网络)中。这样,其他用户即可以根据自己的需求从CDN网络中下载其喜欢观看的类型的多人直播视频流进行播放观看。
本申请实施例通过将多人直播视频流分发至CDN网络中,从而使得其他用户也可以根据需求从CDN网络中下载该直播视频流进行播放观看,从而可以扩充用户可以观看的视频的种类。
进一步地,参阅图11,其为本申请另一实施例的多人视频直播业务实现方法的流程示意图,从图中可以看出,本实施例中所提供的多人视频直播业务实现方法包括:
步骤S70,获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面。
步骤S71,将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面。
步骤S72,对所述多人直播视频流进行降码率处理。
具体地,为了使得各个直播成员在观看的多人直播视频流时,不会产生视频卡顿的效果,可以在多人直播视频流发送至各个直播成员对应的直播客户端前,对多人直播视频流进行降码率处理,从而减少多人直播视频流对网络资源的占用。这里的码率即是比特率,是指每秒传送的比特数,比特率越高,传送数据速度越快。
步骤S73,将经过降码率处理后的多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
本申请实施例中,通过对多人直播视频流进行降码率处理,从而可以减少发送至各个直播成员对应的直播客户端的耗时,进而可以获得更好的直播效果。
参阅图12所示,是本申请多人视频直播业务实现装置800一实施例的程序模块图。
本实施例中,所述多人视频直播业务实现装置800包括一系列的存储于存储器上的计算机可读指令,当该计算机可读指令被处理器执行时,可以实现本申请各实施例的多人视频直播业务实现功能。在一些实施例中,基于该计算机可读指令各部分所实现的特定的操作,多人视频直播业务实现装置800可以被划分为一个或多个模块。例如,在图12中,所述多人视频直播业务实现装置800可以被分割成获取模块801、拼接模块802及发送模块803。其中:
获取模块801,用于获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过第一摄像装置采集到的视频,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过第二摄像装置采集到的视频。
具体地,本申请实施例在进行多人直播业务时,直播主播以及直播群中除直播主播之外的各个直播成员分别通过其直播客户端将直播视频流上传至服务器。
其中,第一摄像装置可以为直播主播终端内置的摄像装置(比如摄像头),也可以为直 播主播终端外接的摄像装置,通过该第一摄像装置可以采集到第一视频,该第一视频为直播主播当前的直播画面,即为通过该第一摄像装置采集的实时画面。优选地,该第一视频包括直播主播的人脸。所述直播主播观看的第二视频为直播主播通过直播主播终端播放的想同其他直播成员共同观看的视频,该视频可以为本地视频,也可以为网络视频。该第二视频的类型可以为多种,比如为变装类视频,挑战类视频、猜剧情类视频等。第二摄像装置可以为直播参与者(所述其他直播成员)终端内置的摄像装置(比如摄像头),也可以为直播参与者终端外接的摄像装置,通过该第二摄像装置可以采集到第三视频,该第三视频为直播参与者当前的直播画面,即通过该第二摄像装置采集到的实时画面。优选地,该第三视频包括直播参与者的人脸。
需要说明的是,本申请实施例在进行多人直播业务之前,需要先通过直播主播创建直播群,以供多个直播参与者加入到该直播群中。在多个直播参与者加入到该直播群之后,该直播主播可以进行多人直播业务。
所述拼接模块802,用于将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面。
具体地,在进行拼接时,是一帧一帧进行的,首先,获取第一视频、第二视频以及其他直播成员的第三视频中的第一帧画面,然后将获取到的各个第一帧画面进行拼接以得到多人直播视频流中的第一帧画面。也就是说,合成后的多人直播视频流中的第一帧画面包括各个直播成员的直播视频帧画面,这样,每个直播成员即可在直播客户端的显示界面中看到各个直播成员的直播画面。
在一实施方式中,在进行拼接时,可以根据直播成员的数量将合成得到的多人直播视频流中的帧画面分成与所述直播成员的数量相对应的显示块,比如,直播群中具有6个直播成员,则可以将多人直播视频流中的帧画面分成7块,其中6块帧画面用于显示直播成员的视频帧,另一块帧画面则用于显示直播主播观看的视频的帧画面,每个直播成员观看多人直播视频流的效果即是在直播客户端的显示界面上看到6个直播成员的直播画面,以及看到直播主播观看的视频的帧画面。
需要说明的是,在完成多人直播视频流中的第一帧画面的合成后,继续获取第一视频、第二视频以及其他直播成员的第三视频中的第二帧画面,然后将获取到的各个第二帧画面进行拼接以得到多人直播视频流中的第二帧画面。以此类推,直到完成多人直播视频流中的所有帧画面的合成。
在本申请一实施方式中,为了使得各个直播成员在观看多人直播视频流时,不会产生视频卡顿的效果,可以在进行拼接时,通过检测线程对第一视频以及所述其他直播成员的第三视频进行人脸检测,并在检测到人脸之后,通过渲染线程对检测到的人脸区域进行渲染。相比于采用一个线程进行视频的合成处理,本申请实施例中采用多线程进行视频的合成处理,可以节省视频的合成时间,进而可以避免在观看多人直播视频流时产生视频卡顿。
需要说明的是,在本申请其他实施方式中,在进行拼接时,也可以采用更多的线程对待合成的视频帧进行处理,以得到多人直播视频流。
示例性的,在一实施方式中,所述第二视频的视频类型为变装类视频,所述拼接模块802,还用于判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设时间戳。
具体地,在第二视频的视频类型为变装类视频时,在对所述多人直播视频流中的每一帧画面的合成过程中,首先需要判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设的时间戳。在本实施例中,可以通过直接获取该第二待合成视频帧中的时间戳信息,然后将获取到的时间戳信息与预设时间戳进行比较的方式来判定第二待合成视频帧对应的时间戳与预设时间戳是否相同。在本申请实施例中,所述预设时间戳可以为直播主播预先设定的一个或多个时间点,也可以为系统预先默认设定的一个或者多个时间点,在本申请实施方式中不作限定。
其中,变装类视频指的是直播用户在观看该类视频时,在观看到视频播放到特定时间点时,服务器会为各个直播用户添加对应的贴纸特效,从而实现变装。
需要说明的是,所述第二待合成视频帧为所述第二视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第二待合成视频帧即为所述第二视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第二待合成视频帧即为所述第二视频中的第三帧画面。
所述拼接模块802,还用于若所述第二视频中的第二待合成视频帧对应的时间戳为所述预设时间戳,则对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测。
具体地,在判定出所述第二待合成视频对应的时间戳为预设时间戳时,则可以采用人脸识别模型对第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测,其中,所述人脸识别模型为现有的用于检测人脸的模型,在本实施了中不再赘述。
需要说明的是,所述第一待合成视频帧、所述第三待合成视频帧依次为所述第一视频中的当前待合成视频帧、所述第三视频中的当前待合成视频帧,比如,当前正在合成多人视频直播流中的第一帧画面,则所述第一待合成视频帧即为所述第一视频中的第一帧画面,所述第三待合成视频帧即为所述第三视频中的第一帧画面;又比如,当前正在合成多人视频直播流中的第三帧画面,则所述第一待合成视频帧即为所述第一视频中的第三帧画面,所述第三待合成视频帧即为所述第三视频中的第三帧画面。
在本申请实施例中,在判定出所述第二待合成视频帧对应的时间戳不为预设时间戳时,则可以直接进行视频帧的合成,而不用进行人脸检测。
所述拼接模块802,还用于在检测到人脸后,在检测到的人脸上添加预设的贴纸特效,以得到包含贴纸特效的待合成视频帧,其中,对未检测到人脸的待合成视频帧不添加贴纸特效。
具体地,所述贴纸特效可以为预先设定的美妆贴纸、眼影、手势特效、趣味变装特效等。
需要说明的是,本申请实施例中,在添加贴纸特效时,可以在各个直播成员的人脸上添加相同的贴纸特效,也可以在各个直播成员的人脸上添加不同的贴纸特效,或者也可以仅仅在某一个或多个直播成员的人脸上添加贴纸特效,在本实施例中不作限定。
所述拼接模块802,还用于将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
具体地,在完成贴纸特效的添加之后,即可以将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接合成,以得到所述多 人直播视频流中帧画面。示例性的,进行拼接后得到的多人直播流中的帧画面如图4或5所示,其中,图4为仅仅对一个直播用户进行添加贴纸特效后的合成帧画面的示意图,图5为对所有直播用户都进行添加贴纸特效后的合成帧画面的示意图。
本实施例中,通过在判定出主播直播观看的第二视频中的当前待合成视频帧对应的时间戳为预设时间戳时,在直播用户的人脸上添加预设的贴纸特效,从而可以使得直播用户在共同观看视频时,具有更大的观看兴趣,提高用户的使用体验。
在另一实施方式中,所述第二视频的视频类型为挑战类视频,所述拼接模块802,还用于对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别。
具体地,在第二视频的视频类型为挑战类视频时,在对所述多人直播视频流中的每一帧画面的合成过程中,首先需要对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别。在本申请实施例中,可以采用表情识别模型对所述第一待合成视频帧、其他直播成员的所述第三待合成视频帧进行表情识别,其中,该表情识别模型可以采用现有的用于识别表情的模型,在本实施例中不再赘述。
其中,所述挑战类视频指的是直播用户在观看该类视频时,各个直播用户可以进行挑战PK,比如,当该挑战类视频为搞笑类挑战视频时,直播用户之间可以挑战看谁先笑,当用户笑时,该用户即被淘汰;又比如,当该挑战类视频为悲伤类挑战视频时,直播用户之间可以挑战看谁先流眼泪,当用户流眼泪时,该用户即被淘汰。
所述拼接模块802,还用于在识别出预设表情时,在识别出预设表情的待合成视频帧中添加预设的贴纸特效,其中,对未识别出预设表情的待合成视频帧不添加贴纸特效。
具体地,所述预设表情为预先设定的表情,比如笑脸表情,或者哭脸表情等。所述贴纸特效可以为预先设定的用于表明直播用户挑战失败的贴纸特效,比如为“out”。
所述拼接模块802,还用于将所述第二视频中的第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
具体地,在完成贴纸特效的添加之后,即可以将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接合成,以得到所述多人直播视频流中帧画面。示例性的,进行拼接后得到的多人直播流中的帧画面如图7所示。
本实施例中,通过在主播直播观看的第二视频为挑战类视频时,各个直播用户之间可以进行挑战,从而可以提升直播用户观看视频的乐趣,进而提升用户体验。
在另一实施方式中,所述第二视频的视频类型为猜剧情类视频,所述拼接模块802,还用于检测是否接收所述直播群中的任一直播成员发送的剧情猜测消息。
具体地,在第二视频的视频类型为猜剧情类视频时,在对所述多人直播视频流中的每一帧画面的合成过程中,首先需要检测是否接收到为直播群中的任一个直播成员发送的剧情猜测消息,即各个直播成员在共同观看视频时,其中,任意一个直播成员可以发出剧情猜测消息,以供各个直播成员对该共同观看视频的剧情走向进行猜测。比如,直播成员在观看一个煎鸡蛋的猜剧情视频时,在直播成员观看的过程中,直播人员可以发起一个剧情猜测消息,以让各个直播成员猜测视频中的煎鸡蛋的用户共用了几个鸡蛋。其中,该剧情猜测消息可以为一个包含候选答案的消息,也可以为不包含候选答案的消息,只包括猜测问题。
所述拼接模块802,还用于若接收到所述直播群中的任一直播成员发送的剧情猜测消息,则将所述剧情猜测消息添加至第二视频中的第二待合成视频帧中。
具体地,在接收到某一个直播成员发送的剧情猜测消息时,即可以将该剧情猜测消息添加至第二视频中的第二待合成视频帧中。
在本申请实施例中,在未接收到直播群中的任一直播成员发送的剧情猜测消息时,则可以直接进行多人直播视频流中的视频帧的合成。
所述拼接模块802,还用于将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
具体地,在完成剧情猜测消息的添加之后,即可以将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接合成,以得到所述多人直播视频流中帧画面。示例性的,进行拼接后得到的多人直播流中的帧画面如图9所示。
本实施例中,通过在主播直播观看的第二视频为猜剧情类视频时,任意一个直播成员可以发送剧情猜测消息,以让直播成员进行剧情走向的猜测,从而增加直播成员之间的互动,进而提升用户体验。
发送模块803,用于将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
具体地,所述各个直播成员包括所述直播主播以及直播群中除所述直播主播之外的其他直播成员。
本申请实施例中,在合成多人直播视频流后,立即将该多人直播视频流发送至各个直播成员对应的直播客户端中,以便各个直播成员可以观看到同时包含各个直播成员的视频画面以及直播主播观看的视频画面。
本申请实施例中,通过获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频;将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流;将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。本申请实施例中的多人视频直播业务实现方法,各个直播成员可以在其直播客户端的显示界面上观看到所有直播成员的直播视频画面,并且还可以共同与直播主播同时观看视频,从而可以增加直播成员的交互方式,提升用户体验。
进一步地,在一实施方式中,所述多人视频直播业务实现装置800还包括分发模块。
所述分发模块,用于将所述多人直播视频流分发至CDN网络中。
具体地,在得到多人直播视频流之后,为了让其他用户也可以观看到该多人直播视频流,可以将该多人直播视频流分发至CDN网络(Content Delivery Network,内容分发网络)中。这样,其他用户即可以根据自己的需求从CDN网络中下载其喜欢观看的类型的多人直播视频流进行播放观看。
本申请实施例通过将多人直播视频流分发至CDN网络中,从而使得其他用户也可以根据需求从CDN网络中下载该直播视频流进行播放观看,从而可以扩充用户可以观看的视频的种类。
进一步地,在另一实施方式中,所述多人视频直播业务实现装置800还包括处理模块。
所述处理模块,用于对所述多人直播视频流进行降码率处理。
具体地,为了使得各个直播成员在观看的多人直播视频流时,不会产生视频卡顿的效果,可以在多人直播视频流发送至各个直播成员对应的直播客户端前,对多人直播视频流进行降码率处理,从而减少多人直播视频流对网络资源的占用。这里的码率即是比特率,是指每秒传送的比特数,比特率越高,传送数据速度越快。
本申请实施例中,通过对多人直播视频流进行降码率处理,从而可以减少发送至各个直播成员对应的直播客户端的耗时,进而可以获得更好的直播效果。
图13示意性示出了根据本申请实施例的适于实现多人视频直播业务实现方法的计算机设备2的硬件架构示意图。本实施例中,计算机设备2是一种能够按照事先设定或者存储的指令,自动进行数值计算和/或信息处理的设备。例如,可以是平板电脑、笔记本电脑、台式计算机、机架式服务器、刀片式服务器、塔式服务器或机柜式服务器(包括独立的服务器,或者多个服务器所组成的服务器集群)等。如图13所示,计算机设备2至少包括但不限于:可通过系统总线相互通信链接存储器901、处理器902、网络接口903。其中:
存储器901至少包括一种类型的计算机可读存储介质,可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,存储器901可以是计算机设备2的内部存储模块,例如该计算机设备2的硬盘或内存。在另一些实施例中,存储器901也可以是计算机设备2的外部存储设备,例如该计算机设备2上配备的插接式硬盘,智能存储卡(Smart Media Card,简称为SMC),安全数字(Secure Digital,简称为SD)卡,闪存卡(Flash Card)等。当然,存储器901还可以既包括计算机设备2的内部存储模块也包括其外部存储设备。本实施例中,存储器901通常用于存储安装于计算机设备2的操作系统和各类应用软件,例如多人视频直播业务实现方法的程序代码等。此外,存储器901还可以用于暂时地存储已经输出或者将要输出的各类数据。
处理器902在一些实施例中可以是中央处理器(Central Processing Unit,简称为CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器902通常用于控制计算机设备2的总体操作,例如执行与计算机设备2进行数据交互或者通信相关的控制和处理等。本实施例中,处理器902用于运行存储器901中存储的程序代码或者处理数据。
网络接口903可包括无线网络接口或有线网络接口,该网络接口903通常用于在计算机设备2与其他计算机设备之间建立通信链接。例如,网络接口903用于通过网络将计算机设备2与外部终端相连,在计算机设备2与外部终端之间的建立数据传输通道和通信链接等。网络可以是企业内部网(Intranet)、互联网(Internet)、全球移动通讯系统(Global System of Mobile communication,简称为GSM)、宽带码分多址(Wideband Code Division Multiple Access,简称为WCDMA)、4G网络、5G网络、蓝牙(Bluetooth)、Wi-Fi等无线或有线网络。
需要指出的是,图13仅示出了具有部件901~903的计算机设备,但是应理解的是,并不要求实施所有示出的部件,可以替代的实施更多或者更少的部件。
在本实施例中,存储于存储器901中的多人视频直播业务实现方法还可以被分割为一个或者多个程序模块,并由一个或多个处理器(本实施例为处理器902)所执行,以完成本申请。
本申请实施例提供了一种计算机可读存储介质,计算机可读存储介质其上存储有计算机可读指令,计算机可读指令被处理器执行时实现以下步骤:
获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
本实施例中,计算机可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,计算机可读存储介质可以是计算机设备的内部存储单元,例如该计算机设备的硬盘或内存。在另一些实施例中,计算机可读存储介质也可以是计算机设备的外部存储设备,例如该计算机设备上配备的插接式硬盘,智能存储卡(Smart Media Card,简称为SMC),安全数字(Secure Digital,简称为SD)卡,闪存卡(Flash Card)等。当然,计算机可读存储介质还可以既包括计算机设备的内部存储单元也包括其外部存储设备。本实施例中,计算机可读存储介质通常用于存储安装于计算机设备的操作系统和各类应用软件,例如实施例中的多人视频直播业务实现方法的程序代码等。此外,计算机可读存储介质还可以用于暂时地存储已经输出或者将要输出的各类数据。
以上所描述的装置实施例仅仅是示意性的,其中作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到至少两个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本申请实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。
通过以上的实施方式的描述,本领域普通技术人员可以清楚地了解到各实施方式可借助软件加通用硬件平台的方式来实现,当然也可以通过硬件。本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程是可以通过计算机可读指令来指令相关的硬件来完成,所述的程序可存储于一计算机可读取存储介质中,该程序在执行时,可包括如上述各方法的实施例的流程。其中,所述的存储介质可为磁碟、光盘、只读存储记忆体(Read-OnlyMemory,ROM)或随机存储记忆体(RandomAccessMemory,RAM)等。
最后应说明的是:以上各实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述各实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。

Claims (20)

  1. 一种多人视频直播业务实现方法,包括:
    获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
    将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
    将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
  2. 根据权利要求1所述的多人视频直播业务实现方法,在进行拼接时,通过检测线程对所述第一视频以及所述其他直播成员的第三视频进行人脸检测,并在检测到人脸后,通过渲染线程对检测到的人脸区域进行渲染。
  3. 根据权利要求1所述的多人视频直播业务实现方法,所述第二视频的视频类型为变装类视频,对所述多人直播视频流中的每一帧画面的合成包括:
    判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设时间戳;
    若是,则对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测;
    在检测到人脸后,在检测到的人脸上添加预设的贴纸特效,以得到包含贴纸特效的待合成视频帧,其中,对未检测到人脸的待合成视频帧不添加贴纸特效;
    将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  4. 根据权利要求1所述的多人视频直播业务实现方法,所述第二视频的视频类型为挑战类视频,对所述多人直播视频流中的每一帧画面的合成包括:
    对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别;
    在识别出预设表情时,在识别出预设表情的待合成视频帧中添加预设的贴纸特效,其中,对未识别出预设表情的待合成视频帧不添加贴纸特效;
    将所述第二视频中的第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  5. 根据权利要求1所述的多人视频直播业务实现方法,所述第二视频的视频类型为猜剧情类视频,对所述多人直播视频流中的每一帧画面的合成包括:
    检测是否接收到所述直播群中的任一直播成员发送的剧情猜测消息;
    若是,则将所述剧情猜测消息添加至第二视频中的第二待合成视频帧中;
    将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  6. 根据权利要求1至5任一项所述的多人视频直播业务实现方法,所述多人视频直播业务实现方法还包括:
    将所述多人直播视频流分发至CDN网络中。
  7. 根据权利要求1至6任一项所述的多人视频直播业务实现方法,所述多人视频直播业务实现方法还包括:
    对所述多人直播视频流进行降码率处理。
  8. 一种多人视频直播业务实现装置,包括:
    获取模块,用于获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
    拼接模块,用于将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
    发送模块,用于将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
  9. 一种计算机设备,所述计算机设备,包括存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现以下步骤:
    获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
    将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
    将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
  10. 根据权利要求9所述的计算机设备,所述第二视频的视频类型为变装类视频,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设时间戳;
    若是,则对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测;
    在检测到人脸后,在检测到的人脸上添加预设的贴纸特效,以得到包含贴纸特效的待合成视频帧,其中,对未检测到人脸的待合成视频帧不添加贴纸特效;
    将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  11. 根据权利要求9所述的计算机设备,所述第二视频的视频类型为挑战类视频,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合 成视频帧进行表情识别;
    在识别出预设表情时,在识别出预设表情的待合成视频帧中添加预设的贴纸特效,其中,对未识别出预设表情的待合成视频帧不添加贴纸特效;
    将所述第二视频中的第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  12. 根据权利要求9所述的计算机设备,所述第二视频的视频类型为猜剧情类视频,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    检测是否接收到所述直播群中的任一直播成员发送的剧情猜测消息;
    若是,则将所述剧情猜测消息添加至第二视频中的第二待合成视频帧中;
    将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  13. 根据权利要求9所述的计算机设备,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    将所述多人直播视频流分发至CDN网络中。
  14. 根据权利要求9所述的计算机设备,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    对所述多人直播视频流进行降码率处理。
  15. 一种计算机可读存储介质,其上存储有计算机可读指令,其特征在于:所述计算机可读指令被处理器执行时实现以下步骤:
    获取直播主播的第一视频与第二视频,以及获取直播群中除所述直播主播之外的其他直播成员的第三视频,其中,所述第一视频包括通过所述直播主播的第一摄像装置采集到的实时画面,所述第二视频包括所述直播主播观看的视频,所述第三视频包括通过所述其他直播成员的第二摄像装置采集到的实时画面;
    将所述第一视频、第二视频、以及所述其他直播成员的第三视频进行拼接,以得到多人直播视频流,其中,所述多人直播视频流中的每一帧画面均包含所述第一视频中的帧画面、所述第二视频中的帧画面、以及所述其他直播成员的第三视频中的帧画面;
    将所述多人直播视频流发送至各个直播成员对应的直播客户端中,以供所述各个直播成员观看。
  16. 根据权利要求15所述的计算机可读存储介质,所述第二视频的视频类型为变装类视频,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    判断所述第二视频中的第二待合成视频帧对应的时间戳是否为预设时间戳;
    若是,则对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行人脸检测;
    在检测到人脸后,在检测到的人脸上添加预设的贴纸特效,以得到包含贴纸特效的待合成视频帧,其中,对未检测到人脸的待合成视频帧不添加贴纸特效;
    将所述第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  17. 根据权利要求15所述的计算机可读存储介质,所述第二视频的视频类型为挑战类视频,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    对所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行表情识别;
    在识别出预设表情时,在识别出预设表情的待合成视频帧中添加预设的贴纸特效,其中,对未识别出预设表情的待合成视频帧不添加贴纸特效;
    将所述第二视频中的第二待合成视频帧与包含所述贴纸特效的待合成视频帧以及不包含贴纸特效的待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  18. 根据权利要求15所述的计算机可读存储介质,所述第二视频的视频类型为猜剧情类视频,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    检测是否接收到所述直播群中的任一直播成员发送的剧情猜测消息;
    若是,则将所述剧情猜测消息添加至第二视频中的第二待合成视频帧中;
    将包含所述剧情猜测消息的第二待合成视频帧与所述第一视频中的第一待合成视频帧、所述其他直播成员的第三视频中的第三待合成视频帧进行拼接,以得到所述多人直播视频流中帧画面。
  19. 根据权利要求15所述的计算机可读存储介质,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    将所述多人直播视频流分发至CDN网络中。
  20. 根据权利要求15所述的计算机可读存储介质,所述处理器执行所述计算机可读指令时还用于实现以下步骤:
    对所述多人直播视频流进行降码率处理。
PCT/CN2020/109869 2019-12-09 2020-08-18 多人视频直播业务实现方法、装置、计算机设备 WO2021114708A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/783,630 US11889132B2 (en) 2019-12-09 2020-08-18 Method and apparatus for implementing multi-person video live-streaming service, and computer device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911251118.0 2019-12-09
CN201911251118.0A CN113038287B (zh) 2019-12-09 2019-12-09 多人视频直播业务实现方法、装置、计算机设备

Publications (1)

Publication Number Publication Date
WO2021114708A1 true WO2021114708A1 (zh) 2021-06-17

Family

ID=76329457

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/109869 WO2021114708A1 (zh) 2019-12-09 2020-08-18 多人视频直播业务实现方法、装置、计算机设备

Country Status (3)

Country Link
US (1) US11889132B2 (zh)
CN (1) CN113038287B (zh)
WO (1) WO2021114708A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113824987A (zh) * 2021-09-30 2021-12-21 杭州网易云音乐科技有限公司 直播间首帧耗时的确定方法、介质、装置和计算设备
CN114827664A (zh) * 2022-04-27 2022-07-29 咪咕文化科技有限公司 多路直播混流方法、服务器、终端设备、系统及存储介质
WO2023245846A1 (zh) * 2022-06-21 2023-12-28 喻荣先 移动终端真实内容解析、内置反诈和欺诈判断系统及方法
CN117729375A (zh) * 2023-10-17 2024-03-19 书行科技(北京)有限公司 直播画面的处理方法、装置、计算机设备及存储介质

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705368A (zh) * 2021-08-09 2021-11-26 上海幻电信息科技有限公司 人脸表情迁移方法、装置以及计算机设备
CN115834923A (zh) * 2021-09-16 2023-03-21 艾锐势企业有限责任公司 用于视频内容处理的网络设备、系统和方法
CN113949891B (zh) * 2021-10-13 2023-12-08 咪咕文化科技有限公司 一种视频处理方法、装置、服务端及客户端
CN114125485B (zh) * 2021-11-30 2024-04-30 北京字跳网络技术有限公司 图像处理方法、装置、设备及介质
CN115396716B (zh) * 2022-08-23 2024-01-26 北京字跳网络技术有限公司 一种直播视频处理方法、装置、设备及介质
US20240170021A1 (en) * 2022-11-22 2024-05-23 Bobroud Holdings, Llc. Live mobile video capture from multiple sources
CN118118698A (zh) * 2022-11-30 2024-05-31 腾讯科技(深圳)有限公司 直播处理方法、装置、电子设备、存储介质及程序产品
CN116437157A (zh) * 2023-06-12 2023-07-14 北京达佳互联信息技术有限公司 直播数据显示方法、装置、电子设备及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110306325A1 (en) * 2010-06-10 2011-12-15 Rajesh Gutta Streaming video/audio from mobile phone to any device
CN105491393A (zh) * 2015-12-02 2016-04-13 北京暴风科技股份有限公司 多人视频直播业务的实现方法
US20160294890A1 (en) * 2015-03-31 2016-10-06 Facebook, Inc. Multi-user media presentation system
CN106162221A (zh) * 2015-03-23 2016-11-23 阿里巴巴集团控股有限公司 直播视频的合成方法、装置及系统
CN106341695A (zh) * 2016-08-31 2017-01-18 腾讯数码(天津)有限公司 直播间互动方法、装置及系统
CN106954100A (zh) * 2017-03-13 2017-07-14 网宿科技股份有限公司 直播方法及系统、连麦管理服务器
CN108235044A (zh) * 2017-12-29 2018-06-29 北京密境和风科技有限公司 一种实现多人直播的方法、装置和服务器

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2517469A4 (en) * 2009-12-22 2014-01-15 Vidyo Inc SYSTEM AND METHOD FOR INTERACTIVE SYNCHRONIZED VIDEO VISUALIZATION
CN105791958A (zh) * 2016-04-22 2016-07-20 北京小米移动软件有限公司 游戏直播方法及装置
CN107105315A (zh) * 2017-05-11 2017-08-29 广州华多网络科技有限公司 直播方法、主播客户端的直播方法、主播客户端及设备
WO2019118890A1 (en) * 2017-12-14 2019-06-20 Hivecast, Llc Method and system for cloud video stitching
CN108259989B (zh) * 2018-01-19 2021-09-17 广州方硅信息技术有限公司 视频直播的方法、计算机可读存储介质和终端设备

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110306325A1 (en) * 2010-06-10 2011-12-15 Rajesh Gutta Streaming video/audio from mobile phone to any device
CN106162221A (zh) * 2015-03-23 2016-11-23 阿里巴巴集团控股有限公司 直播视频的合成方法、装置及系统
US20160294890A1 (en) * 2015-03-31 2016-10-06 Facebook, Inc. Multi-user media presentation system
CN105491393A (zh) * 2015-12-02 2016-04-13 北京暴风科技股份有限公司 多人视频直播业务的实现方法
CN106341695A (zh) * 2016-08-31 2017-01-18 腾讯数码(天津)有限公司 直播间互动方法、装置及系统
CN106954100A (zh) * 2017-03-13 2017-07-14 网宿科技股份有限公司 直播方法及系统、连麦管理服务器
CN108235044A (zh) * 2017-12-29 2018-06-29 北京密境和风科技有限公司 一种实现多人直播的方法、装置和服务器

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113824987A (zh) * 2021-09-30 2021-12-21 杭州网易云音乐科技有限公司 直播间首帧耗时的确定方法、介质、装置和计算设备
CN113824987B (zh) * 2021-09-30 2024-04-30 杭州网易云音乐科技有限公司 直播间首帧耗时的确定方法、介质、装置和计算设备
CN114827664A (zh) * 2022-04-27 2022-07-29 咪咕文化科技有限公司 多路直播混流方法、服务器、终端设备、系统及存储介质
CN114827664B (zh) * 2022-04-27 2023-10-20 咪咕文化科技有限公司 多路直播混流方法、服务器、终端设备、系统及存储介质
WO2023245846A1 (zh) * 2022-06-21 2023-12-28 喻荣先 移动终端真实内容解析、内置反诈和欺诈判断系统及方法
CN117729375A (zh) * 2023-10-17 2024-03-19 书行科技(北京)有限公司 直播画面的处理方法、装置、计算机设备及存储介质

Also Published As

Publication number Publication date
CN113038287B (zh) 2022-04-01
CN113038287A (zh) 2021-06-25
US20230011255A1 (en) 2023-01-12
US11889132B2 (en) 2024-01-30

Similar Documents

Publication Publication Date Title
WO2021114708A1 (zh) 多人视频直播业务实现方法、装置、计算机设备
US11381739B2 (en) Panoramic virtual reality framework providing a dynamic user experience
US20200302179A1 (en) Method for labeling performance segment, video playing method, apparaus and system
WO2020083021A1 (zh) 视频录制方法、视频播放方法、装置、设备及存储介质
US10938725B2 (en) Load balancing multimedia conferencing system, device, and methods
US11778142B2 (en) System and method for intelligent appearance monitoring management system for videoconferencing applications
US10897637B1 (en) Synchronize and present multiple live content streams
US11025967B2 (en) Method for inserting information push into live video streaming, server, and terminal
US20140192136A1 (en) Video chatting method and system
US11863801B2 (en) Method and device for generating live streaming video data and method and device for playing live streaming video
US11451858B2 (en) Method and system of processing information flow and method of displaying comment information
WO2019134235A1 (zh) 一种直播互动方法、装置、终端设备及存储介质
WO2019114330A1 (zh) 一种视频播放方法、装置和终端设备
CN112738418B (zh) 视频获取方法、装置以及电子设备
WO2023035882A1 (zh) 视频处理方法、设备、存储介质和程序产品
WO2024001661A1 (zh) 视频合成方法、装置、设备和存储介质
CN112528052A (zh) 多媒体内容输出方法、装置、电子设备和存储介质
CN112492324A (zh) 数据处理方法及系统
CN112954452B (zh) 视频生成方法、装置、终端及存储介质
WO2021088973A1 (zh) 直播流显示方法、装置、电子设备及可读存储介质
CN112261422A (zh) 适用于广电领域的仿真远程直播流数据处理方法
US20240137619A1 (en) Bullet-screen comment display
US20240236436A9 (en) Bullet-screen comment display
CN112584084B (zh) 一种视频播放方法、装置、计算机设备和存储介质
KR102615377B1 (ko) 방송 체험 서비스의 제공 방법

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20900087

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20900087

Country of ref document: EP

Kind code of ref document: A1