WO2016150317A1 - Method, apparatus and system for synthesizing live video - Google Patents

Method, apparatus and system for synthesizing live video

Info

Publication number
WO2016150317A1
Authority
WO
WIPO (PCT)
Prior art keywords
video stream
video
terminal
play
server
Prior art date
Application number
PCT/CN2016/076374
Other languages
French (fr)
Chinese (zh)
Inventor
晏营
袁英灿
吴易明
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2016150317A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/475 End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data

Definitions

  • the present application relates to the field of video processing technologies, and in particular, to a method, device, and system for synthesizing live video.
  • the existing live channel is a live broadcast directly formed by a single video stream, and the user can directly watch the live video through the network.
  • with the development of computers and networks, traditional live broadcasts have been unable to meet the diverse needs of users, and people want more interactive ways of viewing video. For example, while a user is playing a game, some expert players hope to add game commentary or motion guidance to the game in progress and synthesize it with the game screen into a vivid game walkthrough, while other users also want to see how these expert players carry out that walkthrough.
  • the existing video synthesis method usually relies on software with media material editing functions installed on the terminal, which combines captured video, pictures and recorded audio into a dynamic video with sound.
  • video synthesized on a single terminal in this way cannot be shared directly with others, and thus cannot achieve the effect of a live video broadcast.
  • the purpose of the present application is to provide a method, a device and a system for synthesizing live video, which can add a user's interactive video to the currently played screen to form a live broadcast screen; synthesizing the video stream in the server produces a better result, and the user experience is good.
  • the present application provides a method for synthesizing a live video, the method comprising:
  • the second video stream is collected by the video capture device
  • the present application further provides a method for synthesizing a live video, the method comprising:
  • when the terminal plays the first video stream, receiving a second video stream transmitted by the terminal, where the second video stream is a video stream collected by the terminal through the video collection device;
  • the present application further provides a synthesizing device for a live video, the device comprising:
  • An acquiring unit configured to collect a second video stream when playing the first video stream
  • a transmitting unit, configured to transmit the second video stream collected by the collecting unit to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream;
  • a receiving unit configured to receive the third video stream sent by the server
  • a processing unit configured to parse the third video stream received by the receiving unit, form a play screen of the third video stream, and play a play screen of the third video stream.
  • the present application further provides a synthesizing device for a live video, the device comprising:
  • a receiving unit, configured to receive, when the terminal plays the first video stream, a second video stream transmitted by the terminal, where the second video stream is a video stream collected by the terminal through the video collection device;
  • a processing unit, configured to merge the second video stream received by the receiving unit with the first video stream to form a live third video stream;
  • a transmitting unit configured to transmit the third video stream formed by the processing unit to the terminal.
  • the present application further provides a system for synthesizing live video, the system comprising: a server and a terminal with a video capture device;
  • when the terminal plays the first video stream, the terminal collects the second video stream through the video capture device;
  • the server merges the second video stream with the first video stream being played to form a live third video stream;
  • the terminal parses the third video stream to form a play screen of the third video stream, and plays a play screen of the third video stream.
  • the video capture device is used to collect the user's interaction behavior with the currently played picture, and the collected video stream is transmitted to the server, so the user's interactive video can be added to the currently played picture to form a live broadcast picture, with good real-time performance and a good user experience; and because the video stream is synthesized in the server, the live video stream is of better quality and the picture is clearer.
  • FIG. 1 is a schematic diagram of a system for synthesizing live video according to an embodiment of the present application
  • FIG. 2 is a flowchart of a method for synthesizing live video on a terminal side according to an embodiment of the present disclosure
  • FIG. 3 is a flowchart of a method for synthesizing live video on a server side according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of a synthesized live broadcast screen according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a device for synthesizing live video according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a device for synthesizing live video according to an embodiment of the present application.
  • the method and device for synthesizing live video are applicable to a terminal that has a video capture device and is capable of network connection, or to a terminal that can be connected to an external video capture device and is capable of network connection, for example, terminal devices such as a television with a camera, a computer, a pad, or a mobile phone, which can connect to a cloud server through a network cable or a wireless network to communicate with the cloud server.
  • the system includes a terminal 1 and a server 2 with a video capture device 11.
  • the server 2 may be a cloud server, and the terminal 1 and the server 2 are connected through a network.
  • the terminal 1 plays the video stream transmitted by the server, and the user can interact with the playback screen of the terminal 1.
  • the terminal 1 collects the user's interactive video stream through the video collection device 11 and transmits it to the server 2; after video synthesis is performed in the server 2, the synthesized live video stream is played on the terminal 1.
  • a method for synthesizing a live video according to an embodiment of the present application includes:
  • when the terminal plays the first video stream, the terminal collects the second video stream through the video collection device.
  • the second video stream is a video data stream that is collected by the terminal through the video capture device in real time.
  • the video capture device collects a user's interaction behavior with the play screen of the first video stream to form a second video stream.
  • it can also be other video content collected by real-time shooting by a device such as a camera.
  • the video capture device includes a camera, a video camera, or the like.
  • the camera may be a camera on a mobile terminal such as a mobile phone or a tablet computer connected to the television, or may be the camera of a still camera, a video recorder, or a similar device.
  • the interaction behavior includes a voice interaction behavior and an action interaction behavior.
  • the collecting, by the video capture device, of the user's interaction behavior with the play screen of the first video stream includes: collecting, by a camera, the user's action interaction behavior with the play screen of the first video stream; and collecting, by a microphone, the user's voice interaction data on the first video stream.
  • the interaction behavior includes an action interaction behavior.
  • the collecting, by the video capture device, the interaction behavior of the user on the play screen of the first video stream includes: collecting, by the camera, the action interaction behavior of the user on the played screen.
  • the second video stream is transmitted to a server, so that the server merges with the first video stream being played by using the second video stream to form a live third video stream.
  • the third video stream is a video data stream after the video being played in the server is combined with the video captured by the terminal through a camera or the like. After receiving the third video stream, the terminal may form a play screen.
  • after receiving the third video stream, the terminal processes the third video stream according to an existing video codec scheme, obtains a play picture of the third video stream, and plays the play picture of the third video stream on the display of the terminal.
  • the playback screen at this time includes the original video screen and the video screen captured by the terminal.
  • the live broadcast screen can be seen immediately in the user's own terminal. When other users choose to watch the video on the network, they can also see the live broadcast screen.
  • because the video synthesis is processed on the cloud server, the user can simply select the interactive live broadcast mode through his or her own video capture device and broadcast the video live, without having to buy professional equipment, which is very simple and convenient. Moreover, the video picture synthesized on the cloud server side has a higher pixel count and a better effect.
  • the method further includes: receiving, by the input control device, an input control operation of the user on the first video stream.
  • the input control device includes a gamepad, a keyboard, a mouse or a somatosensory camera.
  • the terminal processes the received input control operation accordingly. For example, when the user uses the gamepad to move left or right, the terminal can move the video screen being played to the left or right.
  • the method further includes: storing, by the terminal, the third video stream; and, when an operation of playing the third video stream is received, parsing the third video stream, forming a play picture of the third video stream, and playing the play picture of the third video stream.
  • the terminal stores the third video stream;
  • when an operation of playing the third video stream is received, the third video stream is parsed, a play picture of the third video stream is formed, and the play picture of the third video stream is played.
  • the user may also choose to store the file formed by the third video stream on a website or in a cloud storage space so that other users can watch or order the video.
  • the file formed by the third video stream can also be stored on the server.
  • a method for synthesizing a live video according to an embodiment of the present application includes:
  • the second video stream is a video data stream collected by the terminal through the video capture device in real time.
  • the second video stream is a video stream formed by the terminal collecting, through the video collection device, the user's interaction with the first video stream.
  • after receiving the second video stream from the terminal in S201, the server combines, by means of codec technology, the second video stream with the first video stream stored in the server that is being played on the terminal, to form the third video stream.
  • merging the second video stream with the first video stream that is being played to form the live third video stream may include: embedding a play window in a play screen of the first video stream; and adding, according to a time identifier of the second video stream, the play screen of the second video stream to the play window, where the play screen of the play window and the play screen of the first video stream have the same time identifier, to form the third video stream.
  • before the third video stream is transmitted to the terminal in S203, the method further includes: compressing the formed play picture of the third video stream, and transmitting the compressed third video stream to the terminal. In this way, the amount of data transmitted in the network is reduced, and the response speed is fast.
  • the user is playing a cloud game on a local terminal and has chosen to upload the game video stream (i.e., the first video stream) in real time over the network.
  • a cloud game refers to a game whose video stream is stored on the cloud server. The user can then use a local camera to capture how he or she plays the game, collecting both actions and voice, to form a second video stream, and upload the second video stream to the cloud server in real time through the network.
  • the cloud server combines the first video stream and the second video stream by means of codec technology to form a live program of one video stream, that is, the third video stream.
  • the cloud server transmits the synthesized third video stream to the local terminal.
  • FIG. 4 is a schematic diagram of a synthesized live broadcast screen provided by an embodiment of the present application, where a user can view a synthesized live video program on a screen of a local terminal.
  • the live video synthesis method of the present application can be used in many application scenarios.
  • scenarios similar to the above example of a user playing a game may further include: a teacher may also use the live video synthesis system provided by the present application to create a live classroom, and so on; the specific processing is similar and is not repeated here.
  • in the method for synthesizing live video provided by the embodiments of the present application, a live channel is formed by combining the client and the cloud: the video stream of user interaction collected by the terminal is transmitted to the server, and synthesizing the video stream in the server produces a better result.
  • the user's interactive video can be added to the currently played screen to form a live broadcast screen, and the user experience is good.
  • FIG. 5 is a schematic diagram of a device for synthesizing a live video according to an embodiment of the present disclosure.
  • the device for synthesizing a live video of the present application includes: an acquisition unit 301, a transmission unit 302, a receiving unit 303, and a processing unit 304.
  • the collecting unit 301 is configured to collect the second video stream when the first video stream is played.
  • the second video stream is a video stream formed by collecting interaction behaviors of the user on the play screen of the first video stream.
  • the transmitting unit 302 is configured to transmit the second video stream collected by the collecting unit 301 to the server 2, so that the server 2 merges the second video stream with the first video stream being played to form a live third video stream.
  • the receiving unit 303 is configured to receive the third video stream sent by the server 2.
  • the processing unit 304 is configured to parse the third video stream received by the receiving unit 303, form a play screen of the third video stream, and play a play screen of the third video stream.
  • the interaction behavior includes a voice interaction behavior and an action interaction behavior.
  • the collecting unit 301 includes a camera and a microphone; the camera collects the action interaction behavior of the user on the play screen of the first video stream, and the microphone collects the voice interaction data of the user on the first video stream.
  • the interaction behavior includes an action interaction behavior.
  • the collecting unit 301 includes a camera that collects the action interaction behavior of the user on the playing screen of the first video stream.
  • the synthesizing device of the live video further includes: an input control unit, configured to receive an input control operation of the first video stream by the user.
  • the processing unit 304 processes the input control operations received by the input control unit accordingly. For example, when the user performs an operation of moving left or right using the game pad, the processing unit 304 moves the video screen being played to the left or right.
  • the synthesizing device of the live video further includes: a storage unit, configured to store the third video stream after the receiving unit 303 receives the third video stream sent by the server.
  • when the receiving unit 303 receives an operation of playing the third video stream, the processing unit 304 parses the third video stream to form a play screen of the third video stream, and plays the play screen of the third video stream.
  • the functions of the foregoing units may correspond to the processing steps of the method for synthesizing the live video described in detail in FIG. 2, and details are not described herein again.
  • FIG. 6 is a schematic diagram of a device for synthesizing a live video according to an embodiment of the present disclosure.
  • the device for synthesizing a live video of the present application includes: a receiving unit 401, a processing unit 402, and a transmission unit 403.
  • the receiving unit 401 is configured to receive a second video stream that is transmitted by the terminal when the terminal plays the first video stream, where the second video stream is a video stream that is collected by the terminal through the video collection device.
  • the second video stream is a video stream formed by the terminal collecting, through the video collection device, the user's interaction with the first video stream.
  • the processing unit 402 is configured to merge the second video stream received by the receiving unit 401 with the first video stream to form a live third video stream.
  • the transmitting unit 403 is configured to transmit the third video stream formed by the processing unit 402 to the terminal.
  • the processing unit 402 specifically includes: an embedded subunit and a merged subunit.
  • the embedded subunit is configured to embed a play window in a play screen of the first video stream.
  • the merging subunit is configured to add, according to a time identifier of the second video stream, a play screen of the second video stream to the play window, where the play screen of the play window and the play screen of the first video stream have the same time identifier, to form the third video stream.
  • the processing unit 402 further includes: a compression subunit.
  • the compression subunit is configured to compress the formed play picture of the third video stream before the transmission unit 403 transmits the third video stream to the terminal.
  • the transmitting unit 403 transmits the third video stream compressed by the compression subunit to the terminal.
  • the functions of the foregoing units may correspond to the processing steps of the method for synthesizing the live video described in detail in FIG. 3, and details are not described herein again.
  • the video capture device is used to collect the user's interaction behavior with the currently played picture, and the collected video stream is transmitted to the server, so the user's interactive video can be added to the currently played picture to form a live broadcast picture, with good real-time performance and a good user experience; and because the video stream is synthesized in the server, the live video stream is of better quality and the picture is clearer.
  • the steps of a method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, a software module executed by a processor, or a combination of both.
  • the software module can be placed in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present application relates to a method, apparatus and system for synthesizing a live video. The method comprises: collecting a second video stream by means of a video capture device when a first video stream is played; transmitting the second video stream to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream; receiving the third video stream sent by the server; and parsing the third video stream, forming a play picture of the third video stream, and playing the play picture of the third video stream. In the present application, a user's interactive video can be added to the currently played picture to form a live picture; synthesizing the video stream in the server produces a better result, and the user experience is good.

Description

Method, Device and System for Synthesizing Live Video
This application claims priority to Chinese Patent Application No. 201510127721.3, filed on March 23, 2015 and entitled "Method, Device and System for Synthesizing Live Video", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a method, device, and system for synthesizing live video.
Background
An existing live channel is a live broadcast formed directly from a single video stream, and a user can watch the live video directly over the network. However, with the development of computers and networks, traditional live broadcasts can no longer meet users' diverse needs. People want more interactive ways of viewing video. For example, while a user is playing a game, some expert players want to add game commentary or motion guidance to the game in progress and synthesize it with the game screen into a vivid game walkthrough, while other users want to see how those expert players operate in that walkthrough.
An existing video synthesis method usually relies on software with media material editing functions installed on a terminal, which combines captured video, pictures and recorded audio into a dynamic video with sound. Video synthesized on a single terminal in this way cannot be shared directly with others, and therefore cannot achieve the effect of a live video broadcast.
Summary
The purpose of the present application is to provide a method, device and system for synthesizing live video, which can add a user's interactive video to the currently played picture to form a live broadcast picture; synthesizing the video stream in the server produces a better result and a good user experience.
The present application provides a method for synthesizing a live video, the method comprising:
when a first video stream is played, collecting a second video stream through a video capture device;
transmitting the second video stream to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream;
receiving the third video stream sent by the server;
parsing the third video stream, forming a play picture of the third video stream, and playing the play picture of the third video stream.
In another aspect, the present application further provides a method for synthesizing a live video, the method comprising:
when a terminal plays a first video stream, receiving a second video stream transmitted by the terminal, the second video stream being a video stream collected by the terminal through a video capture device;
merging the second video stream with the first video stream to form a live third video stream;
transmitting the third video stream to the terminal.
In another aspect, the present application further provides an apparatus for synthesizing a live video, the apparatus comprising:
a collecting unit, configured to collect a second video stream when a first video stream is played;
a transmitting unit, configured to transmit the second video stream collected by the collecting unit to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream;
a receiving unit, configured to receive the third video stream sent by the server;
a processing unit, configured to parse the third video stream received by the receiving unit, form a play picture of the third video stream, and play the play picture of the third video stream.
In another aspect, the present application further provides an apparatus for synthesizing a live video, the apparatus comprising:
a receiving unit, configured to receive, when a terminal plays a first video stream, a second video stream transmitted by the terminal, the second video stream being a video stream collected by the terminal through a video capture device;
a processing unit, configured to merge the second video stream received by the receiving unit with the first video stream to form a live third video stream;
a transmitting unit, configured to transmit the third video stream formed by the processing unit to the terminal.
In another aspect, the present application further provides a system for synthesizing live video, the system comprising a server and a terminal with a video capture device;
when playing a first video stream, the terminal collects a second video stream through the video capture device;
the terminal transmits the second video stream to the server;
the server merges the second video stream with the first video stream being played to form a live third video stream;
the terminal receives the third video stream sent by the server;
the terminal parses the third video stream, forms a play picture of the third video stream, and plays the play picture of the third video stream.
In the method and apparatus for synthesizing live video provided by the embodiments of the present application, a video capture device collects the user's interaction behavior with the currently played picture, and the collected video stream is transmitted to the server, so the user's interactive video can be added to the currently played picture to form a live broadcast picture, with good real-time performance and a good user experience; at the same time, because the video stream is synthesized in the server, the live video stream is of better quality and the picture is clearer.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. It is obvious that the drawings described below are only some embodiments of the present application, and those of ordinary skill in the art may obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of a system for synthesizing live video according to an embodiment of the present application;
FIG. 2 is a flowchart of a terminal-side method for synthesizing live video according to an embodiment of the present application;
FIG. 3 is a flowchart of a server-side method for synthesizing live video according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a synthesized live broadcast picture according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an apparatus for synthesizing live video according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an apparatus for synthesizing live video according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, not all of them. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The method and apparatus for synthesizing live video provided by the embodiments of the present application are applicable to a terminal that has a video capture device and is capable of network connection, or to a terminal that can be connected to an external video capture device and is capable of network connection, for example, terminal devices such as a television with a camera, a computer, a pad, or a mobile phone; these terminal devices can connect to a cloud server through a network cable or a wireless network to communicate with the cloud server.
FIG. 1 is a schematic diagram of a system for synthesizing live video provided by an embodiment of the present application. As shown in FIG. 1, the system includes a terminal 1 with a video capture device 11 and a server 2. The server 2 may be a cloud server, and the terminal 1 and the server 2 are connected through a network. The terminal 1 plays a video stream transmitted by the server, and the user can interact with the play picture of the terminal 1. The terminal 1 collects the user's interactive video stream through the video capture device 11 and transmits it to the server 2; after video synthesis is performed in the server 2, the synthesized live video stream is played on the terminal 1. The method for synthesizing live video provided by the present application is described in detail below with reference to FIG. 2 and FIG. 3.
FIG. 2 is a flowchart of a method for synthesizing live video provided by an embodiment of the present application. As shown in FIG. 2, the method for synthesizing live video of this embodiment includes:
S101. When playing a first video stream, a terminal collects a second video stream through a video capture device.
The second video stream is a video data stream collected by the terminal in real time through the video capture device; for example, it includes a second video stream formed by collecting, through the video capture device, the user's interaction behavior with the play picture of the first video stream. Of course, it may also be other video content captured in real time by a device such as a camera.
The video capture device includes a camera, a video camera, or the like. The camera may be a camera on a mobile terminal such as a mobile phone or a tablet computer connected to a television, or may be the camera of a still camera, a video recorder, or a similar device.
The interaction behavior includes voice interaction behavior and action interaction behavior. Collecting, through the video capture device, the user's interaction behavior with the play picture of the first video stream includes: collecting, through a camera, the user's action interaction behavior with the play picture of the first video stream; and collecting, through a microphone, the user's voice interaction data on the first video stream.
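For illustration only, a minimal terminal-side capture sketch is given below; it is not part of the application and assumes OpenCV for the camera and the third-party sounddevice package for the microphone, with a local webcam at index 0, a default microphone, and the durations and file names all being illustrative assumptions:

```python
# Hedged sketch: capture the user's action (camera) and voice (microphone)
# interaction as the "second video stream". Libraries and parameters are
# illustrative assumptions, not details taken from the application.
import wave

import cv2
import sounddevice as sd

DURATION_S = 10        # capture length in seconds (assumption)
SAMPLE_RATE = 44100    # microphone sample rate (assumption)
FPS = 25               # camera frame rate (assumption)

def capture_second_stream(video_path="second_stream.avi",
                          audio_path="second_stream.wav"):
    cam = cv2.VideoCapture(0)  # camera of the terminal
    width = int(cam.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cam.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(video_path, cv2.VideoWriter_fourcc(*"XVID"),
                             FPS, (width, height))

    # Non-blocking microphone recording covering the whole capture window.
    audio = sd.rec(int(DURATION_S * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=1, dtype="int16")

    for _ in range(DURATION_S * FPS):   # action interaction behavior
        ok, frame = cam.read()
        if not ok:
            break
        writer.write(frame)

    sd.wait()                           # voice interaction data
    cam.release()
    writer.release()

    with wave.open(audio_path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)             # int16 samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(audio.tobytes())

if __name__ == "__main__":
    capture_second_stream()
```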
Alternatively, the interaction behavior includes action interaction behavior. In that case, collecting, through the video capture device, the user's interaction behavior with the play picture of the first video stream includes: collecting, through a camera, the user's action interaction behavior with the played picture.
S102. Transmit the second video stream to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream.
S103. Receive the third video stream sent by the server.
The third video stream is the video data stream obtained after the video that is being played, as stored in the server, is combined with the video captured by the terminal through the camera or a similar device. After receiving the third video stream, the terminal can form a play picture.
S104. Parse the third video stream, form the play picture of the third video stream, and play the play picture of the third video stream.
After receiving the third video stream, the terminal processes the third video stream according to an existing video codec scheme, obtains the play picture of the third video stream, and plays the play picture of the third video stream on the display of the terminal. The play picture at this time includes the original video picture and the video picture captured by the terminal. The live broadcast picture can be seen immediately on the user's own terminal, and when other users choose to watch this video over the network, they can also see the live broadcast picture.
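As a hedged illustration of this playback step (not taken from the application), the sketch below decodes and displays a composed stream with OpenCV; the stream URL is a hypothetical placeholder:

```python
# Hedged sketch: decode and display the composed "third video stream".
# The stream URL is a hypothetical example, not defined in the application.
import cv2

THIRD_STREAM_URL = "rtmp://example-cloud-server/live/third_stream"  # assumption

def play_third_stream(url=THIRD_STREAM_URL):
    """Parse the third video stream and render its play picture frame by frame."""
    cap = cv2.VideoCapture(url)          # OpenCV delegates decoding to FFmpeg
    while True:
        ok, frame = cap.read()           # one decoded play picture
        if not ok:
            break
        cv2.imshow("live broadcast", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):   # let the user quit
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    play_third_stream()
```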
In this embodiment of the present application, because the video synthesis is processed on the cloud server, the user can simply select the interactive live broadcast mode through his or her own video capture device and go live, without having to buy professional equipment, which is very simple and convenient. Moreover, the video picture synthesized on the cloud server side has a higher pixel count and a better effect.
Optionally, after the first video stream is played, the method further includes: receiving, through an input control device, the user's input control operation on the first video stream. The input control device includes a gamepad, a keyboard, a mouse, or a somatosensory camera.
The terminal processes the received input control operation accordingly. For example, when the user uses the gamepad to move left or right, the terminal can move the video picture being played to the left or to the right.
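A small illustrative sketch of handling such an input control operation follows; it is not from the application, and keyboard keys read through OpenCV stand in for the gamepad it names:

```python
# Hedged sketch: map a left/right input control operation onto the picture
# being played. Keyboard keys substitute for the gamepad for brevity.
import cv2
import numpy as np

def pan(frame, offset_x):
    """Shift the play picture horizontally by offset_x pixels."""
    h, w = frame.shape[:2]
    shift = np.float32([[1, 0, offset_x], [0, 1, 0]])  # affine translation
    return cv2.warpAffine(frame, shift, (w, h))

def play_with_input_control(video_path="first_stream.mp4"):  # hypothetical file
    cap = cv2.VideoCapture(video_path)
    offset_x = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imshow("play picture", pan(frame, offset_x))
        key = cv2.waitKey(30) & 0xFF
        if key == ord("a"):        # "move left" control operation
            offset_x -= 20
        elif key == ord("d"):      # "move right" control operation
            offset_x += 20
        elif key == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```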
Optionally, after receiving the third video stream sent by the server, the method further includes: storing, by the terminal, the third video stream; and, when an operation of playing the third video stream is received, parsing the third video stream, forming the play picture of the third video stream, and playing the play picture of the third video stream. In this way, the user can replay or play this video on demand on the terminal, which is flexible and convenient.
The user may also choose to store the file formed by the third video stream on a website or in a cloud storage space, so that other users can replay the video or play it on demand. Of course, the file formed by the third video stream may also be stored on the server.
FIG. 3 is a flowchart of a method for synthesizing live video provided by an embodiment of the present application. As shown in FIG. 3, the method for synthesizing live video of this embodiment includes:
S201. When a terminal plays a first video stream, receive a second video stream transmitted by the terminal.
The second video stream is a video data stream collected by the terminal in real time through a video capture device. For example, the second video stream is a video stream formed by the terminal collecting, through the video capture device, the user's interaction with the first video stream.
S202. Merge the second video stream with the first video stream to form a live third video stream.
After receiving the second video stream from the terminal in S201, the server combines, by means of codec technology, the second video stream with the first video stream stored in the server that is being played on the terminal, to form the third video stream.
Specifically, merging the second video stream with the first video stream being played to form the live third video stream may include: embedding a play window in the play picture of the first video stream; and adding, according to a time identifier of the second video stream, the play picture of the second video stream to the play window, where the play picture of the play window and the play picture of the first video stream have the same time identifier, to form the third video stream.
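For illustration, a minimal sketch of this picture-in-picture merge is shown below; it assumes both streams are available to the server as sequences of (time identifier, frame) pairs and uses OpenCV, all of which are assumptions rather than details of the application:

```python
# Hedged sketch: compose the "third video stream" by embedding the second
# stream's play picture into a play window of the first stream's play picture,
# pairing frames that carry the same time identifier.
import cv2

WINDOW_SCALE = 0.25   # play window is ~1/4 of the picture width (assumption)
MARGIN = 16           # offset of the play window from the corner (assumption)

def embed_play_window(first_frame, second_frame):
    """Return one composed frame of the third video stream."""
    h, w = first_frame.shape[:2]
    win_w = int(w * WINDOW_SCALE)
    win_h = int(win_w * second_frame.shape[0] / second_frame.shape[1])
    small = cv2.resize(second_frame, (win_w, win_h))
    out = first_frame.copy()
    # Embed the play window in the bottom-right corner of the play picture
    # (assumes the window fits inside the first-stream picture).
    out[h - win_h - MARGIN:h - MARGIN, w - win_w - MARGIN:w - MARGIN] = small
    return out

def merge_streams(first_frames, second_frames):
    """first_frames / second_frames: iterables of (time identifier, frame)."""
    second_by_ts = dict(second_frames)
    for ts, frame in first_frames:
        overlay = second_by_ts.get(ts)   # frame with the same time identifier
        yield (ts, embed_play_window(frame, overlay)
               if overlay is not None else frame)
```

The point of the sketch is that a second-stream frame is embedded only into the first-stream frame carrying the same time identifier, so the two play pictures stay synchronized.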
S203. Transmit the third video stream to the terminal.
Optionally, before the third video stream is transmitted to the terminal in S203, the method further includes: compressing the formed play picture of the third video stream, and transmitting the compressed third video stream to the terminal. In this way, the amount of data transmitted in the network is reduced, and the response speed is fast.
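As an illustrative sketch only, per-frame JPEG encoding below stands in for whatever compression the server might apply before transmission; the application does not specify a codec:

```python
# Hedged sketch: compress each composed play picture before sending it to the
# terminal, and recover it on the terminal side. Per-frame JPEG is a stand-in
# for a real video encoder, which the application does not specify.
import cv2
import numpy as np

JPEG_QUALITY = 70  # lower quality means less data on the network (assumption)

def compress_play_picture(frame):
    """Server side: return the compressed bytes of one play picture."""
    ok, buf = cv2.imencode(".jpg", frame,
                           [cv2.IMWRITE_JPEG_QUALITY, JPEG_QUALITY])
    if not ok:
        raise RuntimeError("failed to compress play picture")
    return buf.tobytes()

def decompress_play_picture(data):
    """Terminal side: recover the play picture from the compressed bytes."""
    return cv2.imdecode(np.frombuffer(data, dtype=np.uint8), cv2.IMREAD_COLOR)
```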
As an example, a user is playing a cloud game on a local terminal and has chosen to upload the game video stream (that is, the first video stream) in real time over the network. Here, a cloud game refers to a game whose video stream is stored on a cloud server. The user can then use a local camera to capture how he or she plays the game, collecting both actions and voice, to form a second video stream, and upload the second video stream to the cloud server in real time through the network. The cloud server combines the first video stream and the second video stream by means of codec technology to form a live program of one video stream, that is, the third video stream. The cloud server then transmits the synthesized third video stream to the local terminal. The user can watch the synthesized live video program on the screen of the local terminal, and other users in the network can also choose to watch this user's live video program on demand. FIG. 4 is a schematic diagram of a synthesized live broadcast picture provided by an embodiment of the present application; the user can watch the synthesized live video program on the screen of the local terminal.
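The application does not prescribe a transport for this real-time upload; as an assumption-laden sketch, the code below pipes captured camera frames into an FFmpeg process that pushes them to a hypothetical RTMP ingest URL on the cloud server:

```python
# Hedged sketch: upload the captured second video stream to the cloud server
# in real time. Assumes the ffmpeg binary is installed and the server exposes
# a hypothetical RTMP ingest URL; neither is specified by the application.
import subprocess

import cv2

INGEST_URL = "rtmp://example-cloud-server/live/second_stream"  # assumption
FPS = 25

def upload_second_stream(url=INGEST_URL):
    cam = cv2.VideoCapture(0)
    width = int(cam.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cam.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # ffmpeg reads raw BGR frames from stdin, encodes them, and pushes to RTMP.
    ffmpeg = subprocess.Popen(
        ["ffmpeg", "-y",
         "-f", "rawvideo", "-pix_fmt", "bgr24",
         "-s", f"{width}x{height}", "-r", str(FPS), "-i", "-",
         "-c:v", "libx264", "-preset", "veryfast", "-f", "flv", url],
        stdin=subprocess.PIPE)

    try:
        while True:
            ok, frame = cam.read()
            if not ok:
                break
            ffmpeg.stdin.write(frame.tobytes())   # one raw frame upstream
    finally:
        cam.release()
        ffmpeg.stdin.close()
        ffmpeg.wait()
```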
The live video synthesis method of the present application can be used in many application scenarios. Scenarios similar to the above example of a user playing a game may further include: a teacher may also use the live video synthesis system provided by the present application to create a live classroom, and so on; the specific processing is similar and is not repeated here.
In the method for synthesizing live video provided by the embodiments of the present application, a live channel is formed by combining the client and the cloud: the video stream of user interaction collected by the terminal is transmitted to the server, synthesizing the video stream in the server produces a better result, and the user's interactive video can be added to the currently played picture to form a live broadcast picture, giving a good user experience.
The above is a detailed description of the method for synthesizing live video provided by the embodiments of the present application. The apparatus for synthesizing live video provided by the present application is described in detail below.
FIG. 5 is a schematic diagram of an apparatus for synthesizing live video provided by an embodiment of the present application. As shown in FIG. 5, the apparatus for synthesizing live video of the present application includes: a collecting unit 301, a transmitting unit 302, a receiving unit 303 and a processing unit 304.
The collecting unit 301 is configured to collect a second video stream when a first video stream is played.
The second video stream is a video stream formed by collecting the user's interaction behavior with the play picture of the first video stream.
The transmitting unit 302 is configured to transmit the second video stream collected by the collecting unit 301 to the server 2, so that the server 2 merges the second video stream with the first video stream being played to form a live third video stream.
The receiving unit 303 is configured to receive the third video stream sent by the server 2.
The processing unit 304 is configured to parse the third video stream received by the receiving unit 303, form the play picture of the third video stream, and play the play picture of the third video stream.
Optionally, the interaction behavior includes voice interaction behavior and action interaction behavior. The collecting unit 301 includes a camera and a microphone; the camera collects the user's action interaction behavior with the play picture of the first video stream, and the microphone collects the user's voice interaction data on the first video stream.
Optionally, the interaction behavior includes action interaction behavior. The collecting unit 301 includes a camera, and the camera collects the user's action interaction behavior with the play picture of the first video stream.
Optionally, the apparatus for synthesizing live video further includes an input control unit configured to receive the user's input control operation on the first video stream. The processing unit 304 processes the input control operation received by the input control unit accordingly. For example, when the user performs an operation of moving left or right using the gamepad, the processing unit 304 moves the video picture being played to the left or to the right.
Optionally, the apparatus for synthesizing live video further includes a storage unit configured to store the third video stream after the receiving unit 303 receives the third video stream sent by the server. When the receiving unit 303 receives an operation of playing the third video stream, the processing unit 304 parses the third video stream to form the play picture of the third video stream, and plays the play picture of the third video stream.
The functions of the above units may correspond to the processing steps of the method for synthesizing live video described in detail with reference to FIG. 2, and are not repeated here.
FIG. 6 is a schematic diagram of an apparatus for synthesizing live video provided by an embodiment of the present application. As shown in FIG. 6, the apparatus for synthesizing live video of the present application includes: a receiving unit 401, a processing unit 402 and a transmitting unit 403.
The receiving unit 401 is configured to receive, when a terminal plays a first video stream, a second video stream transmitted by the terminal, the second video stream being a video stream collected by the terminal through a video capture device.
The second video stream is a video stream formed by the terminal collecting, through the video capture device, the user's interaction with the first video stream.
The processing unit 402 is configured to merge the second video stream received by the receiving unit 401 with the first video stream to form a live third video stream.
The transmitting unit 403 is configured to transmit the third video stream formed by the processing unit 402 to the terminal.
Optionally, the processing unit 402 specifically includes an embedding subunit and a merging subunit.
The embedding subunit is configured to embed a play window in the play picture of the first video stream.
The merging subunit is configured to add, according to a time identifier of the second video stream, the play picture of the second video stream to the play window, where the play picture of the play window and the play picture of the first video stream have the same time identifier, to form the third video stream.
Optionally, the processing unit 402 further includes a compression subunit. The compression subunit is configured to compress the formed play picture of the third video stream before the transmitting unit 403 transmits the third video stream to the terminal. The transmitting unit 403 transmits the third video stream compressed by the compression subunit to the terminal.
The functions of the above units may correspond to the processing steps of the method for synthesizing live video described in detail with reference to FIG. 3, and are not repeated here.
In the method, apparatus and system for synthesizing live video provided by the embodiments of the present application, a video capture device collects the user's interaction behavior with the currently played picture, and the collected video stream is transmitted to the server, so the user's interactive video can be added to the currently played picture to form a live broadcast picture, with good real-time performance and a good user experience; and because the video stream is synthesized in the server, the live video stream is of better quality and the picture is clearer.
A person skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be placed in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above further describe the purpose, technical solutions and beneficial effects of the present application in detail. It should be understood that the above description covers only specific embodiments of the present application and is not intended to limit the scope of protection of the present application; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall be included within the scope of protection of the present application.

Claims (17)

  1. A method for synthesizing a live video, wherein the method comprises:
    when a first video stream is played, collecting a second video stream through a video capture device;
    transmitting the second video stream to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream;
    receiving the third video stream sent by the server;
    parsing the third video stream, forming a play picture of the third video stream, and playing the play picture of the third video stream.
  2. The method according to claim 1, wherein the second video stream comprises a second video stream formed by collecting, through the video capture device, a user's interaction behavior with a play picture of the first video stream.
  3. The method according to claim 2, wherein the interaction behavior comprises voice interaction behavior and action interaction behavior;
    the collecting, through the video capture device, of the user's interaction behavior with the play picture of the first video stream comprises:
    collecting, through a camera, the user's action interaction behavior with the play picture of the first video stream; and collecting, through a microphone, the user's voice interaction data on the first video stream.
  4. The method according to claim 2, characterized in that the interaction behavior comprises an action interaction behavior;
    the collecting, through the video capture device, of the user's interaction behavior with the play screen of the first video stream comprises:
    collecting, through a camera, the user's action interaction behavior with the played screen.
  5. The method according to claim 1, characterized in that, after receiving the third video stream sent by the server, the method further comprises:
    storing the third video stream;
    when an operation of playing the third video stream is received, parsing the third video stream to form a play screen of the third video stream, and playing the play screen of the third video stream.
  6. A method for synthesizing live video, characterized in that the method comprises:
    when a terminal plays a first video stream, receiving a second video stream transmitted by the terminal, the second video stream being a video stream collected by the terminal through a video capture device;
    merging the second video stream with the first video stream to form a live third video stream;
    transmitting the third video stream to the terminal.
  7. The method according to claim 6, characterized in that the second video stream is a video stream formed by a user interacting with the first video stream, as collected by the terminal through the video capture device.
  8. The method according to claim 6, characterized in that merging the second video stream with the first video stream being played to form the live third video stream specifically comprises:
    embedding a play window in the play screen of the first video stream;
    adding the play screen of the second video stream to the play window according to the time identifier of the second video stream, where the play screen in the play window and the play screen of the first video stream have the same time identifier, to form the third video stream.
  9. The method according to claim 6, characterized in that, before transmitting the third video stream to the terminal, the method further comprises:
    compressing the formed play screen of the third video stream, and transmitting the compressed third video stream to the terminal.
  10. A device for synthesizing live video, characterized in that the device comprises:
    a collecting unit, configured to collect a second video stream while a first video stream is being played;
    a transmitting unit, configured to transmit the second video stream collected by the collecting unit to a server, so that the server merges the second video stream with the first video stream being played to form a live third video stream;
    a receiving unit, configured to receive the third video stream sent by the server;
    a processing unit, configured to parse the third video stream received by the receiving unit, form a play screen of the third video stream, and play the play screen of the third video stream.
  11. The device according to claim 10, characterized in that the second video stream comprises a second video stream formed from the collected interaction behavior of a user with the play screen of the first video stream;
    the interaction behavior comprises a voice interaction behavior and an action interaction behavior; the collecting unit comprises a camera and a microphone, the camera collecting the user's action interaction behavior with the play screen of the first video stream, and the microphone collecting the user's voice interaction data with respect to the first video stream;
    or, the interaction behavior comprises an action interaction behavior; the collecting unit comprises a camera, the camera collecting the user's action interaction behavior with the play screen of the first video stream.
  12. The device according to claim 10, characterized in that the device further comprises:
    a storage unit, configured to store the third video stream after the receiving unit receives the third video stream sent by the server;
    when the receiving unit receives an operation of playing the third video stream, the processing unit parses the third video stream, forms a play screen of the third video stream, and plays the play screen of the third video stream.
  13. A device for synthesizing live video, characterized in that the device comprises:
    a receiving unit, configured to receive, when a terminal plays a first video stream, a second video stream transmitted by the terminal;
    a processing unit, configured to merge the second video stream received by the receiving unit with the first video stream to form a live third video stream;
    a transmitting unit, configured to transmit the third video stream formed by the processing unit to the terminal.
  14. The device according to claim 13, characterized in that the second video stream is a video stream formed by a user interacting with the first video stream, as collected by the terminal through a video capture device.
  15. The device according to claim 13, characterized in that the processing unit specifically comprises:
    an embedding subunit, configured to embed a play window in the play screen of the first video stream;
    a merging subunit, configured to add the play screen of the second video stream to the play window according to the time identifier of the second video stream, where the play screen in the play window and the play screen of the first video stream have the same time identifier, to form the third video stream.
  16. The device according to claim 13, characterized in that the processing unit further comprises:
    a compression subunit, configured to compress the formed play screen of the third video stream before the transmitting unit transmits the third video stream to the terminal;
    the transmitting unit transmitting the third video stream compressed by the compression subunit to the terminal.
  17. A system for synthesizing live video, characterized in that the system comprises: a server and a terminal with a video capture device;
    when playing a first video stream, the terminal collects a second video stream through the video capture device;
    the terminal transmits the second video stream to the server;
    the server merges the second video stream with the first video stream being played to form a live third video stream;
    the terminal receives the third video stream sent by the server;
    the terminal parses the third video stream to form a play screen of the third video stream, and plays the play screen of the third video stream.
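For illustration only, the terminal-side steps recited in claim 1 (collect the second stream, upload it, then receive and play the merged third stream) could look roughly like the Python sketch below. The HTTP transport, the server address, and the endpoint paths `/upload_second_stream` and `/third_stream` are assumptions made for the sketch; the claim does not prescribe any particular protocol or API.

```python
# Illustrative terminal-side flow for claim 1 (not part of the claims).
# Assumed libraries: opencv-python, requests. Assumed server API: hypothetical
# HTTP endpoints /upload_second_stream and /third_stream.
import cv2
import requests

SERVER = "http://example-server:8080"  # hypothetical server address

def capture_and_upload_second_stream(n_frames=100):
    """While the first video stream plays locally, collect the second video
    stream from the video capture device and transmit it to the server."""
    cam = cv2.VideoCapture(0)  # camera as the video capture device
    for _ in range(n_frames):
        ok, frame = cam.read()
        if not ok:
            break
        ok, jpg = cv2.imencode(".jpg", frame)  # encode one captured frame
        if ok:
            requests.post(f"{SERVER}/upload_second_stream",
                          data=jpg.tobytes(),
                          headers={"Content-Type": "image/jpeg"})
    cam.release()

def receive_and_play_third_stream(url=f"{SERVER}/third_stream"):
    """Receive the merged (third) video stream from the server, parse it into
    play screens, and play them."""
    merged = cv2.VideoCapture(url)  # OpenCV can open network stream URLs
    while True:
        ok, frame = merged.read()
        if not ok:
            break
        cv2.imshow("live third video stream", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    merged.release()
    cv2.destroyAllWindows()
```

A production system would more likely push the second stream over a streaming protocol such as RTMP rather than per-frame HTTP posts; the sketch only mirrors the order of operations recited in the claim.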
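Claim 3 collects the user's action interaction through a camera and the voice interaction through a microphone. A minimal sketch, assuming the `opencv-python` and `sounddevice` libraries (a choice not made by the claim), might be:

```python
# Illustrative capture of claim 3's two interaction channels (not part of the claims).
# Assumed libraries: opencv-python for the camera, sounddevice for the microphone.
import cv2
import sounddevice as sd

def capture_interaction(seconds=5, fps=25, samplerate=44100):
    """Collect the user's action interaction (camera) and voice interaction
    (microphone) over the same time span."""
    # Non-blocking microphone recording: voice interaction data.
    audio = sd.rec(int(seconds * samplerate), samplerate=samplerate, channels=1)

    # Meanwhile, grab camera frames: action interaction behavior.
    cam = cv2.VideoCapture(0)
    frames = []
    for _ in range(int(seconds * fps)):
        ok, frame = cam.read()
        if not ok:
            break
        frames.append(frame)
    cam.release()

    sd.wait()  # wait for the audio recording to finish
    return frames, audio  # both feed into the second video stream
```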
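The merging step of claim 8 (embedding a play window into the first stream's play screen and pairing frames that share a time identifier) can be illustrated as a picture-in-picture overlay. In the sketch below, the window size, the top-right placement, and the dictionary-based frame lookup are all assumptions; the claim fixes none of these details.

```python
# Illustrative merge for claim 8 (not part of the claims): embed a play window
# for the second stream into the first stream's play screen, pairing frames
# that carry the same time identifier.
# Assumed library: opencv-python. Window size/position are arbitrary choices.
import cv2

def embed_play_window(first_frame, second_frame, win_w=320, win_h=180, margin=10):
    """Overlay the second stream's frame as a small play window in the
    top-right corner of the first stream's frame (picture-in-picture)."""
    small = cv2.resize(second_frame, (win_w, win_h))
    h, w = first_frame.shape[:2]
    x, y = w - win_w - margin, margin  # assumed top-right placement
    merged = first_frame.copy()
    merged[y:y + win_h, x:x + win_w] = small  # embed the play window
    return merged

def form_third_stream(first_frames, second_frames):
    """first_frames / second_frames: lists of (time_identifier, frame) pairs.
    Frames with identical time identifiers are merged; the result is the
    frame sequence of the live third video stream."""
    second_by_time = {t: f for t, f in second_frames}
    third = []
    for t, frame in first_frames:
        if t in second_by_time:  # same time identifier on both streams
            third.append((t, embed_play_window(frame, second_by_time[t])))
        else:
            third.append((t, frame))  # no interaction frame at this instant
    return third
```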
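Claim 9 compresses the third stream's play screen before it is transmitted to the terminal. Per-frame JPEG encoding is shown below purely as one possible realization; the claim does not name a codec, and a real deployment would more likely use an inter-frame codec such as H.264.

```python
# Illustrative compression for claim 9 (not part of the claims): compress each
# play screen of the third video stream before transmitting it to the terminal.
# Per-frame JPEG encoding is an assumed codec choice; the claim names none.
import cv2

def compress_play_screen(frame, quality=70):
    """Return the compressed bytes of one play-screen frame of the third
    video stream, ready for transmission to the terminal."""
    ok, buf = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, quality])
    if not ok:
        raise RuntimeError("frame compression failed")
    return buf.tobytes()
```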
PCT/CN2016/076374 2015-03-23 2016-03-15 Method, apparatus and system for synthesizing live video WO2016150317A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510127721.3 2015-03-23
CN201510127721.3A CN106162221A (en) 2015-03-23 2015-03-23 Method, apparatus and system for synthesizing live video

Publications (1)

Publication Number Publication Date
WO2016150317A1 (en) 2016-09-29

Family

ID=56977808

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/076374 WO2016150317A1 (en) 2015-03-23 2016-03-15 Method, apparatus and system for synthesizing live video

Country Status (2)

Country Link
CN (1) CN106162221A (en)
WO (1) WO2016150317A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106534618B (en) * 2016-11-24 2020-05-12 广州爱九游信息技术有限公司 Method, device and system for realizing pseudo field explanation
CN107278374B (en) * 2016-12-01 2020-01-03 深圳前海达闼云端智能科技有限公司 Interactive advertisement display method, terminal and smart city interactive system
CN106604047A (en) * 2016-12-13 2017-04-26 天脉聚源(北京)传媒科技有限公司 Multi-video-stream video direct broadcasting method and device
CN106658037A (en) * 2016-12-13 2017-05-10 天脉聚源(北京)传媒科技有限公司 Live video method and apparatus of multiple video streams
CN107317815A (en) * 2017-07-04 2017-11-03 上海鋆创信息技术有限公司 A kind of method and device, storage medium and the terminal of video superposition
CN109660853B (en) * 2017-10-10 2022-12-30 腾讯科技(北京)有限公司 Interaction method, device and system in live video
CN107920274B (en) * 2017-10-27 2020-08-04 优酷网络技术(北京)有限公司 Video processing method, client and server
CN108055577A (en) * 2017-12-18 2018-05-18 北京奇艺世纪科技有限公司 A kind of live streaming exchange method, system, device and electronic equipment
CN108449632B (en) * 2018-05-09 2021-04-02 福建星网视易信息系统有限公司 Method and terminal for real-time synthesis of singing video
CN108965746A (en) * 2018-07-26 2018-12-07 北京竞业达数码科技股份有限公司 Image synthesizing method and system
CN109327731B (en) * 2018-11-20 2021-05-11 福建海媚数码科技有限公司 Method and system for synthesizing DIY video in real time based on karaoke
CN113115108A (en) * 2018-12-20 2021-07-13 聚好看科技股份有限公司 Video processing method and computing device
CN109618178A (en) * 2019-01-21 2019-04-12 北京奇艺世纪科技有限公司 A kind of live broadcasting method, apparatus and system
CN110636321A (en) * 2019-09-30 2019-12-31 北京达佳互联信息技术有限公司 Data processing method, device, system, mobile terminal and storage medium
CN110662082A (en) * 2019-09-30 2020-01-07 北京达佳互联信息技术有限公司 Data processing method, device, system, mobile terminal and storage medium
CN113038287B (en) * 2019-12-09 2022-04-01 上海幻电信息科技有限公司 Method and device for realizing multi-user video live broadcast service and computer equipment
CN114449179B (en) * 2020-10-19 2024-05-28 海信视像科技股份有限公司 Display device and image mixing method
CN112291579A (en) * 2020-10-26 2021-01-29 北京字节跳动网络技术有限公司 Data processing method, device, equipment and storage medium
CN112788358B (en) * 2020-12-31 2022-02-18 腾讯科技(深圳)有限公司 Video live broadcast method, video sending method, device and equipment for game match
CN113115073B (en) * 2021-04-12 2022-09-02 宜宾市天珑通讯有限公司 Image coding method, system, device and medium capable of reducing interference
CN116033189B (en) * 2023-03-31 2023-06-30 卓望数码技术(深圳)有限公司 Live broadcast interactive video partition intelligent control method and system based on cloud edge cooperation
CN117082461A (en) * 2023-08-09 2023-11-17 中移互联网有限公司 Method, device and storage medium for transmitting 5G message in audio/video call

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1696923A (en) * 2004-05-10 2005-11-16 北京大学 Networked, multimedia synchronous composed storage and issuance system, and method for implementing the system
US8301596B2 (en) * 2010-01-15 2012-10-30 Hulu Llc Method and apparatus for providing supplemental video content for third party websites
US8667054B2 (en) * 2010-07-12 2014-03-04 Opus Medicus, Inc. Systems and methods for networked, in-context, composed, high resolution image viewing
CN103856787A (en) * 2012-12-04 2014-06-11 上海文广科技(集团)有限公司 Commentary video passing-back live system based on public network and live method of commentary video passing-back live system based on public network

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107968948A (en) * 2016-10-19 2018-04-27 北京新唐思创教育科技有限公司 Online Video playback method and system
CN106375347A (en) * 2016-11-18 2017-02-01 上海悦野健康科技有限公司 Tourism live broadcast platform based on virtual reality
CN106658205A (en) * 2016-11-22 2017-05-10 广州华多网络科技有限公司 Studio video streaming synthesis control method, device and terminal equipment
CN107005721A (en) * 2016-11-22 2017-08-01 广州市百果园信息技术有限公司 Direct broadcasting room pushing video streaming control method and corresponding server and mobile terminal
WO2018094556A1 (en) * 2016-11-22 2018-05-31 广州市百果园信息技术有限公司 Live-broadcasting room video stream pushing control method, and corresponding server and mobile terminal
CN107005721B (en) * 2016-11-22 2020-07-24 广州市百果园信息技术有限公司 Live broadcast room video stream push control method, corresponding server and mobile terminal
CN106658205B (en) * 2016-11-22 2020-09-04 广州华多网络科技有限公司 Live broadcast room video stream synthesis control method and device and terminal equipment
US11616990B2 (en) 2016-11-22 2023-03-28 Guangzhou Baiguoyuan Information Technology Co., Ltd. Method for controlling delivery of a video stream of a live-stream room, and corresponding server and mobile terminal
CN113949895A (en) * 2017-02-16 2022-01-18 脸谱公司 Method and system for transmitting video clips of viewer response
CN112929681A (en) * 2021-01-19 2021-06-08 广州虎牙科技有限公司 Video stream image rendering method and device, computer equipment and storage medium
CN112929681B (en) * 2021-01-19 2023-09-05 广州虎牙科技有限公司 Video stream image rendering method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN106162221A (en) 2016-11-23

Similar Documents

Publication Publication Date Title
WO2016150317A1 (en) Method, apparatus and system for synthesizing live video
WO2019205872A1 (en) Video stream processing method and apparatus, computer device and storage medium
CN105025327B (en) A kind of method and system of mobile terminal live broadcast
CN109327741B (en) Game live broadcast method, device and system
US20190124371A1 (en) Systems, methods and computer software for live video/audio broadcasting
WO2019114330A1 (en) Video playback method and apparatus, and terminal device
CN106358050A (en) Android based audio and video streaming push method and device as well as Android based audio and video streaming playing method and device
CN112019905A (en) Live broadcast playback method, computer equipment and readable storage medium
JP7290260B1 (en) Servers, terminals and computer programs
CN110113621A (en) Playing method and device, storage medium, the electronic device of media information
CN111147911A (en) Video clipping method and device, electronic equipment and storage medium
CN105704399A (en) Playing method and system for multi-picture television program
EP3099069B1 (en) Method for processing video, terminal and server
CN109040818B (en) Audio and video synchronization method, storage medium, electronic equipment and system during live broadcasting
WO2017092433A1 (en) Method and device for video real-time playback
US10721500B2 (en) Systems and methods for live multimedia information collection, presentation, and standardization
CN106060609B Method and device for obtaining pictures
CN109862385B (en) Live broadcast method and device, computer readable storage medium and terminal equipment
CN108124183A Method for one-to-many video streaming with synchronized audio and video acquisition
CN107426487A (en) A kind of panoramic picture recorded broadcast method and system
JP5205900B2 (en) Video conference system, server terminal, and client terminal
WO2017092435A1 (en) Method and device for audio/video real-time transmission, transmission stream packing method, and multiplexer
CN116708867B (en) Live broadcast data processing method, device, equipment and storage medium
KR20170083422A (en) Method for broadcast contents directly using mobile device and the system thereof
US11317035B1 (en) Method and system for synchronized playback of multiple video streams over a computer network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16767694

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16767694

Country of ref document: EP

Kind code of ref document: A1