WO2018095146A1 - Live broadcast room video stream synthesis method, apparatus, and terminal device - Google Patents

Live broadcast room video stream synthesis method, apparatus, and terminal device


Publication number
WO2018095146A1
WO2018095146A1 PCT/CN2017/105017 CN2017105017W WO2018095146A1 WO 2018095146 A1 WO2018095146 A1 WO 2018095146A1 CN 2017105017 W CN2017105017 W CN 2017105017W WO 2018095146 A1 WO2018095146 A1 WO 2018095146A1
Authority
WO
WIPO (PCT)
Prior art keywords
video stream
terminal
server
stream
anchor
Prior art date
Application number
PCT/CN2017/105017
Inventor
余蒙
于川
徐光兴
吴昊
苏庆辉
陆锦铃
郭业翔
Original Assignee
广州华多网络科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州华多网络科技有限公司
Publication of WO2018095146A1
Priority to US16/419,022 (US20190273955A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23424 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/254 Management at additional data server, e.g. shopping server, rights management server
    • H04N21/2541 Rights Management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2665 Gathering content from different sources, e.g. Internet and satellite
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/27 Server based end-user applications
    • H04N21/274 Storing end-user multimedia data in response to end-user request, e.g. network recorder
    • H04N21/2743 Video hosting of uploaded data from client
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462 Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4622 Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/632 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing using a connection between clients on a wide area network, e.g. setting up a peer-to-peer communication via Internet for retrieving video segments from the hard-disk of other client devices

Definitions

  • The present invention relates to the field of network live broadcast technology, and in particular to a method, an apparatus, and a terminal device for synthesizing the video stream of a live broadcast room.
  • Current live broadcast platforms include platforms for mobile terminals and platforms for PCs.
  • A user may need to broadcast content from multiple mobile terminals, from multiple PCs, or from a mobile terminal and a PC at the same time. For example, a user starts a live broadcast on a PC and then wants to temporarily move the broadcast to another scene.
  • In that case the anchor user is often required to create a new live broadcast room on the mobile terminal to start an outdoor broadcast, and the viewing users in the original live broadcast room must enter the new room to watch it. The outdoor broadcast is only temporary.
  • After the outdoor broadcast stops, the anchor returns to the PC to broadcast live, and the viewing users have to re-enter the live broadcast room of the anchor's PC.
  • This process is cumbersome for both the anchor user and the viewing users, gives neither a good experience, and damages the reputation of the live broadcast platform, resulting in user loss.
  • The primary object of the present invention is to provide a method and an apparatus for synthesizing the video stream of a live broadcast room.
  • Another object of the present invention is to provide a terminal device that performs the above live broadcast room video stream synthesis method.
  • The present invention provides a method for synthesizing the video stream of a live broadcast room, including the following steps:
  • the locally collected first video stream and the second video stream are combined into a third video stream, which is uploaded to the server so that the server pushes the third video stream to each user in the live broadcast room.
  • The pre-protocol information is represented in the form of a two-dimensional code, a feature password, or a link.
  • The process of receiving the second video stream, the process of collecting the local first video stream, and the process of synthesizing the third video stream work in parallel with the process of uploading the third video stream.
  • The first, second, and third video streams each include an image stream and an audio stream.
  • The third video stream includes the image stream of at least one of the first video stream and the second video stream, and further includes at least one of the two audio streams.
  • The second terminal initiates a connection request to the anchor user terminal according to the pre-protocol information and maintains a long connection with the anchor user terminal.
  • The present invention provides a live broadcast room video stream synthesizing apparatus, including:
  • an output module configured to output pre-protocol information including the feature information of the live broadcast room and the identity information of the anchor, in response to the anchor user's operation instruction to enable multi-source live broadcast;
  • a configuration module configured to configure the anchor user terminal as a server terminal and to receive a connection request initiated by the second terminal according to the pre-protocol information;
  • a receiving module configured to receive, in response to the connection request and after the anchor user confirms the connection with at least one second terminal, the second video stream collected by the second terminal;
  • a synthesizing module configured to synthesize the locally collected first video stream and the second video stream into a third video stream and upload it to the server, so that the server pushes the third video stream to each user in the live broadcast room.
  • The pre-protocol information is represented as a two-dimensional code, a feature password, a link, or the like.
  • The process of receiving the second video stream, the process of collecting the local first video stream, and the process of synthesizing the third video stream work in parallel with the process of uploading the third video stream.
  • The first, second, and third video streams each include an image stream and an audio stream.
  • The third video stream includes the image stream of at least one of the first video stream and the second video stream, and further includes at least one of the two audio streams.
  • The present invention further provides a terminal device comprising a processor and a memory, the processor being configured to invoke a program stored in the memory to perform any step of the live broadcast room video stream synthesis method.
  • The present invention has the following advantages:
  • The present invention configures the anchor terminal as a server terminal that receives the video stream sent directly to it by the second terminal; the anchor terminal then synthesizes all the video streams into the video stream that is pushed to the live broadcast room.
  • The video stream seen by the viewing users is consistent with the video stream on the anchor terminal, which keeps the video streams synchronized.
  • Because the anchor terminal is configured as a server terminal, the video stream collected by the second terminal is sent directly to the anchor terminal without passing through the server of the live broadcast platform. The live broadcast platform server only needs to receive the final video stream uploaded by the anchor terminal and push it to each viewing user in the live broadcast room.
  • This avoids a large number of second terminals uploading video streams to the server, reducing network bandwidth usage and the pressure on the live platform server.
  • The video stream pushed to the live broadcast room is compatible with the various live video playback front ends (PC, mobile, and web), so there is no need to adapt different video stream parsing protocols for each front end, which fully frees the front end.
  • FIG. 1 is a schematic flowchart of an embodiment of the live broadcast room video stream synthesis method according to the present invention.
  • FIG. 2 is a schematic diagram of two implementations of multi-source live broadcast according to the present invention.
  • FIG. 3 is a schematic diagram of representing the pre-protocol information in the form of a two-dimensional code according to the present invention.
  • FIG. 4 is a schematic diagram of representing the pre-protocol information in the form of a feature password according to the present invention.
  • FIG. 5 is a schematic diagram of representing the pre-protocol information in the form of a link according to the present invention.
  • FIG. 6 is a schematic diagram of the third video stream after two second video streams have been selected for connection according to the present invention.
  • FIG. 7 is a schematic diagram of an embodiment of the live broadcast room video stream synthesizing apparatus according to the present invention.
  • As used herein, "terminal" and "terminal device" include both devices that have only a wireless signal receiver without transmitting capability and devices that have receiving and transmitting hardware.
  • Such devices may include: cellular or other communication devices with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include radio frequency receivers, pagers, Internet/intranet access, web browsers, notepads, calendars, and/or GPS (Global Positioning System) receivers; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver.
  • A terminal may be portable, transportable, installed in a vehicle (air, sea, and/or land), or adapted and/or configured to operate locally and/or to run in distributed form at any other location on Earth and/or in space.
  • The "terminal" and "terminal device" used herein may also be a communication terminal, an Internet terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback functions, or a device such as a smart TV or a set-top box.
  • The remote network device used herein includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers.
  • Here the cloud is composed of a large number of computers or network servers based on cloud computing, a type of distributed computing in which a group of loosely coupled computers forms a super virtual computer.
  • Communication between the remote network device, the terminal device, and the WNS server can be implemented by any communication method, including but not limited to mobile communication based on 3GPP, LTE, and WIMAX; computer network communication based on the TCP/IP and UDP protocols; and short-range wireless transmission based on Bluetooth and infrared transmission standards.
  • The user interface/operation interface of the present invention generally refers to a display interface that can be used to send control instructions to the smart terminal. It may be, for example, an option in a settings page of the Android system (or a button added by the application, the same below), an option in the notification bar or in an interactive page called up from the desktop, or an option in a page constructed by an active component of the application. Although some exemplary embodiments of the invention have been shown above, the scope of the invention is defined by the claims and their equivalents.
  • The live broadcast room has the following meanings: (1) a virtual space (or virtual room) created on the webcast platform.
  • The live broadcast room is generally created by the anchor client and connected to multiple viewing clients; that is, the room contains the anchor and a plurality of viewers.
  • A viewing client located in the virtual space can watch the live content of the anchor client, and the user of the anchor client and the users of the viewing clients, as well as the users of the viewing clients among themselves, can interact with one another.
  • (2) An instant messaging platform that gathers users together in groups, such as a video conferencing system, where a user logs in to the client to enter a group and exists in the group as a group member.
  • One group contains multiple group members.
  • A user can join or leave a group freely. Within the group, various interactions such as text, voice, and video are possible.
  • FIG. 1 is a schematic flowchart of an embodiment of the live broadcast room video stream synthesis method according to the present invention, which includes the following steps.
  • The implementation of the method depends on a function module or plug-in of the live video client.
  • The live video client includes live video software on the PC and a live video application on the mobile terminal.
  • The function module or plug-in is executable code within the software/application; the present invention does not specifically limit its implementation form.
  • Step S100: In response to the anchor user's operation instruction to enable multi-source live broadcast, output pre-protocol information including the feature information of the live broadcast room and the identity information of the anchor.
  • As shown in FIG. 2, there are two implementations of multi-source live broadcast. (1) Multiple video streams are acquired simultaneously with multiple shooting devices, and the streams are then synthesized on the anchor terminal and pushed, under the anchor's identity, as the video stream of the live broadcast room.
  • For example, the anchor initially broadcasts from a PC and then needs to temporarily move the broadcast to another scene.
  • A mobile terminal can be used to collect the video stream of the other scene, which is then merged with the video stream of the PC to realize multi-source live broadcast.
  • (2) The first video stream is collected by the anchor terminal; multiple users are then connected and the second video streams collected by those users' second terminals are received; the multiple video streams are then synthesized on the anchor terminal and pushed, under the anchor's identity, as the video stream of the live broadcast room, realizing multi-source live broadcast.
  • The pre-protocol information includes the feature information of the live broadcast room and the identity information of the anchor, and further includes an authorization token or authentication information for establishing a dedicated link between the anchor terminal and the second terminal. It is processed and stored in encrypted form.
  • The pre-protocol information must be parsed by the corresponding authorized application to obtain the live broadcast room feature information and the anchor identity information.
  • The pre-protocol information is defined by each live broadcast platform and can be obtained once an application authorized by that platform recognizes the form in which it is represented.
  • The feature information of the live broadcast room is the channel ID of the room, which uniquely identifies the room.
  • The anchor identity information is the UID of the anchor user, which establishes the user's identity as the anchor.
  • The pre-protocol information is represented in the form of a two-dimensional code, a feature password, or a link.
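The pre-protocol information in link form, and its later verification by the platform, might be sketched as follows. This is only an illustration under stated assumptions: the field names, `BASE_URL`, `PLATFORM_SECRET`, and both function names are hypothetical, and an HMAC signature over a JSON payload stands in for the unspecified encryption scheme (it protects integrity, not confidentiality).

```python
import base64
import hashlib
import hmac
import json
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values: the text does not specify a key scheme or a URL layout.
PLATFORM_SECRET = b"demo-platform-secret"
BASE_URL = "https://live.example.com/join"


def _sign(payload: bytes) -> str:
    """Hex HMAC-SHA256 of the payload under the platform secret."""
    return hmac.new(PLATFORM_SECRET, payload, hashlib.sha256).hexdigest()


def make_pre_protocol_link(channel_id: str, anchor_uid: str, token: str) -> str:
    """Pack channel ID, anchor UID, and authorization token into a signed link."""
    body = json.dumps(
        {"channel_id": channel_id, "anchor_uid": anchor_uid, "token": token},
        sort_keys=True,
    ).encode("utf-8")
    query = urlencode({
        "p": base64.urlsafe_b64encode(body).decode("ascii"),
        "sig": _sign(body),
    })
    return f"{BASE_URL}?{query}"


def parse_pre_protocol_link(link: str) -> Optional[dict]:
    """Return the decoded fields, or None if the link is malformed or tampered with."""
    params = parse_qs(urlparse(link).query)
    try:
        body = base64.urlsafe_b64decode(params["p"][0])
        signature = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return None
    # Constant-time comparison of the received and recomputed signatures.
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(body).encode("utf-8")):
        return None
    return json.loads(body)
```

The same payload could be rendered as a two-dimensional code or a feature password; only the outer encoding would change.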
  • FIG. 3 is a schematic diagram of representing the pre-protocol information in the form of a two-dimensional code.
  • A two-dimensional code records data symbol information using black and white figures distributed on a two-dimensional plane according to a predetermined geometric pattern.
  • Its encoding makes use of the "0" and "1" bit streams that form the basis of a computer's internal logic, using geometric shapes corresponding to binary values to represent text, numerical values, and other information. The information contained in the two-dimensional code can be read by an image input device or a photoelectric scanning device.
  • The two-dimensional code shares several properties of bar code technology: each code system has its own specific character set, each character occupies a certain width, and the code has a certain check function.
  • The pre-protocol information may also be represented in the form of a feature password, as shown in FIG. 4, or in the form of a link, as shown in FIG. 5. Regardless of the form, the application authorized by each live broadcast platform must obtain the pre-protocol information according to the preset protocol before performing subsequent operations.
  • Step S200: Configure the anchor user terminal as a server terminal and receive a connection request initiated by the second terminal according to the pre-protocol information.
  • The live video client configures the anchor user terminal as a server for receiving the video stream of the second terminal.
  • This embodiment therefore also refers to the anchor user terminal as the server terminal.
  • The anchor user sends the pre-protocol information output by the live client to other users/terminal devices, so that those second terminals initiate connection requests to the server terminal according to the pre-protocol information.
  • The live broadcast platform server verifies the pre-protocol information. If the verification succeeds, the connection request is regarded as legitimate, and the anchor user terminal and the second terminal then establish a dedicated communication link so that the two parties can transfer data directly.
  • The second terminal and the anchor user terminal maintain the communication link between them as a long connection.
  • A long connection means that the client stays connected to the server for a long time.
  • Because the anchor user terminal is configured as a server terminal, the long connection here means that the live video client of the second terminal stays connected to the server terminal for a long time.
  • Parties that maintain a long connection can send multiple data packets continuously over one connection.
  • The server has a timeout limit: if the connection is inactive (no data is transmitted) for a certain period, it is automatically disconnected, so the client needs to send heartbeat detection packets to the server at intervals to maintain the long connection.
  • Therefore, in this embodiment, while the long connection between the anchor user terminal and the second terminal is held, if no data packets are being transmitted, the live video client of the second terminal needs to send heartbeat packets to the anchor user terminal at intervals to maintain the long connection.
  • Long connections are suitable for point-to-point communication.
  • The invention can greatly reduce the bandwidth consumed on the live platform server and relieve the pressure on it.
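The heartbeat behavior of the long connection described in step S200 can be illustrated with a small time-stepped simulation. This is a sketch only: the class name, the 10-second heartbeat interval, and the 30-second timeout are assumptions chosen for illustration, and the link is assumed to be lossless.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LongConnection:
    heartbeat_interval: float = 10.0  # assumed seconds between heartbeats
    idle_timeout: float = 30.0        # assumed server-side inactivity limit
    last_sent: float = 0.0            # second terminal: time of last packet sent
    last_received: float = 0.0        # server terminal: time of last packet received

    def on_send(self, now: float) -> None:
        self.last_sent = now

    def on_receive(self, now: float) -> None:
        self.last_received = now

    def heartbeat_due(self, now: float) -> bool:
        """True when the client has been silent for a full heartbeat interval."""
        return now - self.last_sent >= self.heartbeat_interval

    def timed_out(self, now: float) -> bool:
        """True when the server has heard nothing for longer than the timeout."""
        return now - self.last_received > self.idle_timeout


def simulate_idle_link(conn: LongConnection, duration: int,
                       send_heartbeats: bool) -> Optional[int]:
    """Step through `duration` seconds with no media packets flowing.

    Returns the second at which the server drops the link, or None if it survives.
    """
    for now in range(1, duration + 1):
        if send_heartbeats and conn.heartbeat_due(now):
            conn.on_send(now)     # second terminal emits a heartbeat packet
            conn.on_receive(now)  # anchor (server) terminal receives it
        if conn.timed_out(now):
            return now
    return None
```

With heartbeats enabled the idle link stays alive for the whole simulation; without them the server side drops it once the timeout has elapsed.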
  • Step S300: In response to the connection request, after the anchor user confirms the connection with at least one second terminal, receive the second video stream collected by the second terminal.
  • The live video client of the anchor user responds to the connection request and displays, on the live video interface, the one or more second terminals that are holding a long connection.
  • Limited by the characteristics of long connections and the size of the live video interface, the number of second terminals holding a long connection with the anchor user terminal is preferably no more than five. After the anchor user confirms the connection with one or more of the second terminals, each confirmed second terminal transmits its second video stream to the server terminal.
  • At this point the two parties no longer need to exchange heartbeat detection packets, and the server terminal (i.e., the anchor user terminal) receives the second video stream collected by the second terminal.
  • The anchor user can cancel receiving the second terminal's video stream at any time on the operation interface of the anchor user terminal.
  • The second terminal then stops sending the second video stream to the server terminal and, after a specified time, again sends heartbeat detection packets to the server terminal to maintain the long connection.
  • The anchor user can also disconnect the long connection with the second terminal. When the specified time is reached after the disconnection and the second terminal again sends a heartbeat packet to the server terminal, the server terminal gives no feedback, and the long connection between the second terminal and the server terminal is thereby broken.
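The connection life cycle in step S300 (hold a long connection, confirm, cancel, disconnect) can be modeled as a small state machine on the anchor side. The sketch below is illustrative only; the class, method, and state names are hypothetical, and the cap of five follows the preferred upper bound stated in the text.

```python
from enum import Enum


class LinkState(Enum):
    CONNECTED = "connected"        # long connection held, heartbeats only
    STREAMING = "streaming"        # second video stream is being received
    DISCONNECTED = "disconnected"  # anchor dropped the long connection


MAX_SECOND_TERMINALS = 5  # preferred upper bound given in the text


class AnchorServerTerminal:
    """Tracks the second terminals attached to the anchor (server) terminal."""

    def __init__(self) -> None:
        self.links = {}

    def _is_active(self, terminal_id: str) -> bool:
        state = self.links.get(terminal_id, LinkState.DISCONNECTED)
        return state is not LinkState.DISCONNECTED

    def accept(self, terminal_id: str) -> bool:
        """Hold a long connection for a verified request, if a slot is free."""
        if self._is_active(terminal_id):
            return True
        active = sum(1 for tid in self.links if self._is_active(tid))
        if active >= MAX_SECOND_TERMINALS:
            return False
        self.links[terminal_id] = LinkState.CONNECTED
        return True

    def confirm(self, terminal_id: str) -> None:
        """Anchor confirms the connection: start receiving the second stream."""
        if self.links.get(terminal_id) is LinkState.CONNECTED:
            self.links[terminal_id] = LinkState.STREAMING

    def cancel(self, terminal_id: str) -> None:
        """Anchor cancels reception: fall back to heartbeat-only long connection."""
        if self.links.get(terminal_id) is LinkState.STREAMING:
            self.links[terminal_id] = LinkState.CONNECTED

    def disconnect(self, terminal_id: str) -> None:
        """Anchor drops the long connection entirely."""
        if terminal_id in self.links:
            self.links[terminal_id] = LinkState.DISCONNECTED

    def on_heartbeat(self, terminal_id: str) -> bool:
        """Return True if the heartbeat is acknowledged, False if it gets no feedback."""
        return self._is_active(terminal_id)
```

A second terminal whose heartbeat returns `False` would treat the long connection as broken, matching the no-feedback case described above.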
  • Step S400: Synthesize the locally collected first video stream and the second video stream into a third video stream and upload it to the server, so that the server pushes the third video stream to each user in the live broadcast room.
  • After the video stream collected by the second terminal is received, the locally collected first video stream and the second video stream are combined into a third video stream and uploaded to the server, so that the server pushes the third video stream to each user in the live broadcast room.
  • The first, second, and third video streams each include an image stream and an audio stream.
  • The third video stream includes the image stream of at least one of the first video stream and the second video stream, and further includes at least one audio stream. For example, when the anchor terminal receives one second video stream, the video stream on the anchor user terminal includes the image stream and audio stream of the first video stream and the image stream and audio stream of the second video stream.
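The composition rule for the third video stream (at least one image stream and at least one audio stream drawn from the first and second streams) can be expressed as a small selection function. This is a sketch under assumptions: the types, the selection-by-name interface, and the string placeholders for media are hypothetical, and real compositing of frames and audio mixing is out of scope.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class VideoStream:
    name: str
    image_stream: str  # placeholder for the picture track
    audio_stream: str  # placeholder for the sound track


@dataclass(frozen=True)
class ThirdStream:
    images: Tuple[str, ...]
    audios: Tuple[str, ...]


def compose_third_stream(first: VideoStream,
                         seconds: Iterable[VideoStream],
                         use_image: Set[str],
                         use_audio: Set[str]) -> ThirdStream:
    """Select image and audio tracks by stream name and bundle them.

    Raises ValueError if the selection leaves no image or no audio track,
    since the third stream needs at least one of each.
    """
    sources = [first, *seconds]
    images = tuple(s.image_stream for s in sources if s.name in use_image)
    audios = tuple(s.audio_stream for s in sources if s.name in use_audio)
    if not images:
        raise ValueError("third stream needs at least one image stream")
    if not audios:
        raise ValueError("third stream needs at least one audio stream")
    return ThirdStream(images=images, audios=audios)
```

For instance, the anchor could show both pictures while keeping only the local microphone by selecting both names for images and only the first stream's name for audio.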
  • FIG. 6 is a schematic diagram of the third video stream after two second video streams have been selected for connection according to the present invention.
  • The process of receiving the second video stream, the process of collecting the local first video stream, and the process of synthesizing the third video stream work in parallel with the process of uploading the third video stream.
  • Parallel work means that two or more jobs, of the same or different nature, are performed at the same time or within the same time interval.
  • Parallel work takes three forms: (1) time overlap: adjacent processes are staggered in time and take turns using the parts of the same set of hardware; (2) resource sharing: multiple users take turns using the same set of resources in a certain time sequence to improve resource utilization; (3) resource duplication: hardware resources are duplicated to improve reliability and performance.
  • In this embodiment, parallel operation means that the anchor terminal can collect the first video stream while receiving the second video stream, merge the collected first video stream and the received second video stream into a third video stream at the same time, and upload the third video stream to the server at the same time, ensuring that the live video streams stay synchronized.
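  • Correspondingly, as shown in FIG. 7, the present invention further provides an apparatus for synthesizing a video stream of a live broadcast room, including:
  • The output module 100 is configured to output, in response to the operation instruction by which the anchor user enables multi-source live broadcasting, pre-protocol information containing the feature information of the live broadcast room and the identity information of the anchor.
  • The user clicks the "I am the anchor" button shown in FIG. 2, logs in with the user identity information of the live broadcast platform and configures information such as the title and cover image of the live broadcast room. The user can then start a multi-source live broadcast as the anchor, which triggers the operation instruction that enables multi-source live broadcasting. The output module 100 then outputs the pre-protocol information, containing the feature information of the live broadcast room and the identity information of the anchor, in the two-dimensional code form of FIG. 3, the feature password form of FIG. 4 or the link form of FIG. 5.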
  • the configuration module 200 is configured to configure the anchor user terminal as a server terminal and receive a connection request initiated by the second terminal according to the pre-protocol information.
  • After the anchor user enables the multi-source live broadcast, the configuration module 200 configures the anchor user terminal as a server to receive the video stream of the second terminal.
  • To distinguish it from the server of the live broadcast platform, this embodiment also calls the anchor user terminal the server terminal.
  • At the same time, the anchor user sends the pre-protocol information output by the live broadcast client to other users/terminal devices, so that these second terminals initiate connection requests to the server terminal according to the pre-protocol information.
  • The receiving module 300 is configured to receive, in response to the connection request and after the anchor user confirms the connection with at least one second terminal, the second video stream collected by the second terminal.
  • The receiving module 300 responds to the connection request and displays, on the live video interface, the one or more second terminals holding a long connection. Once the anchor user selects one or more of these second terminals and confirms the connection, the second terminal transmits the second video stream to the server terminal, the two sides no longer need to exchange heartbeat detection packets, and the receiving module 300 receives the second video stream collected by the second terminal.
  • The synthesizing module 400 is configured to synthesize the locally collected first video stream and the second video stream into a third video stream and upload it to the server, so that the server pushes the third video stream to each user in the live broadcast room.
  • After the anchor user confirms the connection with at least one second terminal, the video stream collected by the second terminal is received, and at the same time the synthesizing module 400 synthesizes the locally collected first video stream and the second video stream into a third video stream and uploads it to the server, so that the server pushes the third video stream to each user in the live broadcast room.
  • The video streams include an image stream and an audio stream.
  • The third live video stream includes the image stream of at least one of the first live video stream and the second live video stream, or an image stream synthesized from both, and further includes the audio stream of at least one of the two.
  • In this embodiment, the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel.
  • Specifically, for this embodiment, parallel work means that while the receiving module 300 receives the second video stream, the first video stream can be collected simultaneously, the synthesizing module 400 simultaneously merges the collected first video stream and the received second video stream into a third video stream, and the third video stream is simultaneously uploaded to the server, which keeps the live video stream synchronized.
  • In addition, the present invention further provides a terminal device including a memory and a processor. The memory is configured to store candidate intermediate data and result data generated while the foregoing method is carried out, and the processor is configured to invoke and execute any step of the live broadcast room video stream synthesis method stored in the memory in program form.
  • The present invention configures the anchor terminal as a server terminal that receives the video streams which the second terminals collect and send directly to the anchor terminal. The anchor terminal then synthesizes all the video streams into the video stream pushed to the live broadcast room, so the video stream seen by the viewers in the room is consistent with the video stream on the anchor terminal, which keeps the video stream synchronized.
  • At the same time, because the anchor terminal is configured as a server terminal, the video stream collected by the second terminal is sent directly to the anchor terminal without passing through the server of the live broadcast platform.
  • The live broadcast platform server only needs to receive the final video stream that the anchor terminal uploads for the live broadcast room and push it to each viewing user in the room. This avoids a large number of second terminals uploading video streams to the server, which reduces network bandwidth usage and lowers the load on the live broadcast platform server.
  • Furthermore, the video stream pushed to the live broadcast room is suitable for all kinds of live video playback front ends (PC, mobile and web), so the various front ends no longer need to adapt to different video stream parsing protocols, which frees the front ends completely.

Abstract

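The present invention relates to the field of network live broadcasting, and specifically discloses a method, an apparatus and a terminal device for synthesizing a video stream of a live broadcast room. The method comprises the steps of: in response to an operation instruction by which an anchor user enables multi-source live broadcasting, outputting pre-protocol information containing feature information of the live broadcast room and identity information of the anchor; configuring the anchor user terminal as a server terminal and receiving a connection request initiated by a second terminal according to the pre-protocol information; in response to the connection request, after the anchor user confirms the connection with at least one second terminal, receiving a second video stream collected by the second terminal; and synthesizing a locally collected first video stream and the second video stream into a third video stream and uploading it to a server, so that the server pushes the third video stream to each user in the live broadcast room. By configuring the anchor terminal as a server terminal and synthesizing there the video stream pushed to the live broadcast room, the invention reduces bandwidth usage, lowers the load on the live broadcast platform server, and frees the live video playback front ends.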

Description

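Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room
[Technical Field]
The present invention relates to the technical field of network live broadcasting, and in particular to a method, an apparatus and a terminal device for synthesizing a video stream of a live broadcast room.
[Background]
With the development of Internet technology and smart mobile terminal devices, Internet products of all kinds have brought convenience and entertainment to people's work and life. In recent years, live broadcast platforms for live video have emerged one after another, and live video gives people a more real-time social experience. Current live broadcast platforms include platforms for mobile terminals and platforms for PCs.
Because the demands on live video are diverse, a user may need to broadcast content from several mobile terminals at once, from several PCs at once, or from a mobile terminal and a PC at once. For example, a user starts broadcasting on a PC and then wants to move the scene outdoors temporarily and use a mobile terminal to broadcast what is happening there. In the prior art, the anchor user usually has to create a new live broadcast room on the mobile terminal to start the outdoor broadcast, and the viewers of the original room must enter the new room before they can see it. Because the outdoor broadcast is temporary, the anchor returns to the PC once it ends, and the viewers must then re-enter the anchor's PC live broadcast room. This procedure is cumbersome for both the anchor user and the viewing users, gives neither a good experience, damages the reputation of the live broadcast platform, and causes user churn.
How to support a multi-source live broadcast mode in which several terminals broadcast at the same time is therefore a problem that urgently needs solving in the field of network live video. How to maximize the utilization of network bandwidth and reduce the load on the live broadcast platform server in that mode likewise needs to be considered and solved.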
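[Summary of the Invention]
The primary object of the present invention is to provide a method and an apparatus for synthesizing a video stream of a live broadcast room.
Another object of the present invention is to provide a terminal device that executes the above method.
To achieve these objects, the present invention adopts the following technical solutions:
In a first aspect, the present invention provides a method for synthesizing a video stream of a live broadcast room, comprising the following steps:
in response to an operation instruction by which an anchor user enables multi-source live broadcasting, outputting pre-protocol information containing feature information of the live broadcast room and identity information of the anchor;
configuring the anchor user terminal as a server terminal and receiving a connection request initiated by a second terminal according to the pre-protocol information;
in response to the connection request, after the anchor user confirms the connection with at least one second terminal, receiving a second video stream collected by the second terminal;
synthesizing a locally collected first video stream and the second video stream into a third video stream and uploading it to a server, so that the server pushes the third video stream to each user in the live broadcast room.
Further, the pre-protocol information is represented in the form of a two-dimensional code, a feature password or a link.
Further, the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel.
Specifically, the first, second and third video streams each include an image stream and an audio stream; the third video stream includes the image stream of at least one of the first video stream and the second video stream, and further includes the audio stream of at least one of the two.
Further, the second terminal initiates a connection request to the anchor user terminal according to the pre-protocol information and maintains a long connection with the anchor user terminal.
In a second aspect, the present invention provides an apparatus for synthesizing a video stream of a live broadcast room, comprising:
an output module, configured to output, in response to an operation instruction by which an anchor user enables multi-source live broadcasting, pre-protocol information containing feature information of the live broadcast room and identity information of the anchor;
a configuration module, configured to configure the anchor user terminal as a server terminal and to receive a connection request initiated by a second terminal according to the pre-protocol information;
a receiving module, configured to receive, in response to the connection request and after the anchor user confirms the connection with at least one second terminal, a second video stream collected by the second terminal;
a synthesizing module, configured to synthesize a locally collected first video stream and the second video stream into a third video stream and upload it to a server, so that the server pushes the third video stream to each user in the live broadcast room.
Further, the pre-protocol information is represented in the form of a two-dimensional code, a feature password, a link or the like.
Further, the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel.
Specifically, the first, second and third video streams each include an image stream and an audio stream; the third video stream includes the image stream of at least one of the first video stream and the second video stream, and further includes the audio stream of at least one of the two.
In a third aspect, the present invention further provides a terminal device comprising a processor and a memory, the processor being configured to invoke and execute any step of the live broadcast room video stream synthesis method stored in the memory in program form.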
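Compared with the prior art, the present invention has the following advantages:
(1) The present invention configures the anchor terminal as a server terminal that receives the video streams which the second terminals collect and send directly to the anchor terminal. The anchor terminal then synthesizes all the video streams into the video stream pushed to the live broadcast room, so the video stream seen by the viewers in the room is consistent with the video stream on the anchor terminal, which keeps the video stream synchronized.
(2) At the same time, because the anchor terminal is configured as a server terminal, the video stream collected by the second terminal is sent directly to the anchor terminal without passing through the server of the live broadcast platform. The live broadcast platform server only needs to receive the final video stream that the anchor terminal uploads for the live broadcast room and push it to each viewing user in the room. This avoids a large number of second terminals uploading video streams to the server, which reduces network bandwidth usage and lowers the load on the live broadcast platform server.
(3) Furthermore, the video stream pushed to the live broadcast room is suitable for all kinds of live video playback front ends (PC, mobile and web), so the various front ends no longer need to adapt to different video stream parsing protocols, which frees the front ends completely.
Obviously, the above description of the advantages of the present invention is general. More advantages are described in the embodiments that follow, and those skilled in the art can reasonably find many other advantages of the present invention from the content disclosed herein.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.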
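[Brief Description of the Drawings]
FIG. 1 is a schematic flowchart of an embodiment of the method of the present invention for synthesizing a video stream of a live broadcast room;
FIG. 2 is a schematic diagram of two implementations of multi-source live broadcasting according to the present invention;
FIG. 3 is a schematic diagram of pre-protocol information represented in the form of a two-dimensional code according to the present invention;
FIG. 4 is a schematic diagram of pre-protocol information represented in the form of a feature password according to the present invention;
FIG. 5 is a schematic diagram of pre-protocol information represented in the form of a link according to the present invention;
FIG. 6 is a schematic diagram of the third video stream after two second video streams have been selected and connected according to the present invention;
FIG. 7 is a schematic diagram of an embodiment of the apparatus of the present invention for synthesizing a video stream of a live broadcast room.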
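[Detailed Description]
The present invention is further described below with reference to the accompanying drawings and exemplary embodiments, in which the same reference numerals refer to the same parts throughout. In addition, detailed descriptions of known techniques are omitted where they are unnecessary for showing the features of the invention.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in this specification indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Furthermore, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The phrase "and/or" as used herein includes all or any unit and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It should also be understood that terms such as those defined in general-purpose dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, will not be interpreted in an idealized or overly formal sense.
Those skilled in the art will understand that "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver with no transmitting capability, and devices that have receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, a pager, Internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea and/or land), or adapted and/or configured to operate locally and/or in distributed form at any other location on the earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an Internet access terminal or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device) and/or a mobile phone with music/video playback functions, or a device such as a smart TV or a set-top box.
Those skilled in the art will understand that the remote network device referred to herein includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud formed by multiple servers. Here, the cloud is formed by a large number of computers or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers. In embodiments of the present invention, the remote network device, the terminal device and the WNS server may communicate by any means of communication, including but not limited to mobile communication based on 3GPP, LTE or WIMAX, computer network communication based on the TCP/IP or UDP protocols, and short-range wireless transmission based on Bluetooth or infrared transmission standards.
Those skilled in the art will understand that the user interface/operation interface referred to in the present invention generally means a display interface that can be used to send control instructions to a smart terminal. For example, it may be an option (or a button, added there by the application; the same applies below) in a settings page of the Android system, an option in a notification bar or interactive page called up from the desktop, or an option in a page constructed by an active component of the application. Although some exemplary embodiments of the present invention have been shown above, those skilled in the art will understand that changes may be made to these exemplary embodiments without departing from the principle or spirit of the present invention, the scope of which is defined by the claims and their equivalents.
Those skilled in the art should understand that "application", "application program", "application software" and similar expressions as used in the present invention denote the same concept, well known to those skilled in the art: computer software suitable for electronic operation, organically constructed from a series of computer instructions and related data resources. Unless otherwise specified, this naming is not restricted by the type or level of programming language, nor by the operating system or platform on which the software runs. Naturally, such concepts are not restricted to any form of terminal either.
Live broadcast room: the live broadcast room described in the present invention has the following meanings. ① A virtual space (or virtual room) created on a network live broadcast platform. A live broadcast room is generally created by an anchor client and connected to multiple viewing clients, so that it contains an anchor and multiple viewers. Viewing clients in the virtual space can watch the live content of the anchor client, and the user of the anchor client and the users of the viewing clients, as well as the users of the viewing clients among themselves, can interact by voice, pictures, text or electronic gifts. ② An instant messaging platform that aggregates users in groups, such as a video conferencing system. A user enters a group by logging in to a client and exists in the group as a group member; one group contains multiple group members, users can join or leave a group freely, and within a group they can interact by text, voice, video and other means.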
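In a first aspect, FIG. 1 is a schematic flowchart of an embodiment of the method of the present invention for synthesizing a video stream of a live broadcast room, which comprises the following steps:
The method is implemented by a functional module or plug-in of a live video client. Live video clients include live video software on a PC and live video applications on a mobile terminal. The functional module or plug-in is executable code within the corresponding software/application, and its specific form of implementation does not limit the present invention.
Step S100: in response to an operation instruction by which the anchor user enables multi-source live broadcasting, output pre-protocol information containing feature information of the live broadcast room and identity information of the anchor.
Specifically, FIG. 2 shows two implementations of multi-source live broadcasting. ① Several capture devices, possibly of different kinds, collect several video streams at the same time, and the anchor terminal, acting under the anchor's identity, synthesizes them into the video stream pushed to the live broadcast room. For example, the anchor starts broadcasting from a PC and then wants to move the scene temporarily; a mobile phone can then collect the video stream of the other scene, which is merged with the PC video stream to achieve a multi-source live broadcast. ② The anchor terminal collects the first video stream, connects to several users and receives the second video streams collected by those users' second terminals, and then, acting under the anchor's identity, synthesizes these video streams on the anchor terminal into the video stream pushed to the live broadcast room, thereby achieving a multi-source live broadcast. The user clicks the "I am the anchor" button shown in FIG. 2, logs in with the user identity information of the live broadcast platform and configures information such as the title and cover image of the live broadcast room, after which the user can start a multi-source live broadcast as the anchor.
Specifically, the pre-protocol information contains the feature information of the live broadcast room and the identity information of the anchor, and also contains an authorization token or authentication information for establishing a dedicated link between the anchor terminal and the second terminal. This information is processed by encryption and stored in the pre-protocol information, and only a correspondingly authorized application can parse the pre-protocol information and obtain the feature information of the live broadcast room and the identity information of the anchor. In general, the pre-protocol information is defined by each live broadcast platform itself, and it can be obtained only after an application authorized by that platform has recognized the form in which it is represented. The feature information of the live broadcast room is the channel ID of the room, which identifies a unique live broadcast room; the identity information of the anchor is the UID of the anchor user, which establishes the user's identity as the anchor.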
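The packing and verification of pre-protocol information can be sketched as follows. This is only an illustration under stated assumptions: the specification does not define the encoding or the encryption scheme, so the sketch substitutes an HMAC signature over a JSON payload for the unspecified encryption, and the names `PLATFORM_KEY`, `encode_pre_protocol` and `decode_pre_protocol` are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret held only by platform-authorized applications.
PLATFORM_KEY = b"demo-key-not-for-production"


def encode_pre_protocol(channel_id: str, anchor_uid: str, token: str) -> str:
    """Pack room, anchor and link-authorization data into one signed string."""
    payload = json.dumps(
        {"channel_id": channel_id, "anchor_uid": anchor_uid, "token": token},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(PLATFORM_KEY, payload, hashlib.sha256).digest()
    # The 32-byte signature is prepended so the receiver can split it off.
    return base64.urlsafe_b64encode(signature + payload).decode("ascii")


def decode_pre_protocol(text: str) -> dict:
    """Verify the signature and return the fields, or raise ValueError."""
    raw = base64.urlsafe_b64decode(text.encode("ascii"))
    signature, payload = raw[:32], raw[32:]
    expected = hmac.new(PLATFORM_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("pre-protocol information failed verification")
    return json.loads(payload.decode("utf-8"))
```

The resulting string could then be rendered as a two-dimensional code, shown as a feature password, or appended to a link; which of these carries it does not change the verification step.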
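Specifically, the pre-protocol information is represented in the form of a two-dimensional code, a feature password or a link.
FIG. 3 is a schematic diagram of pre-protocol information represented in the form of a two-dimensional code. A two-dimensional code records data symbol information in a black-and-white pattern of specific geometric figures distributed on a two-dimensional plane according to a predetermined rule. In its encoding it makes clever use of the "0" and "1" bit stream concept that forms the basis of a computer's internal logic, using a number of geometric shapes corresponding to binary values to represent text, numbers and other information, and the information contained in a two-dimensional code can be read with an image input device or a photoelectric scanning device. Two-dimensional codes share several properties of barcode technology: each code system has its own character set, each character occupies a predetermined width, and the code provides a certain verification function.
In addition, the pre-protocol information can also be represented as a feature password as shown in FIG. 4, or as a link as shown in FIG. 5. Whatever form is used, an application authorized by the live broadcast platform must extract the pre-protocol information from it according to a preset protocol before subsequent operations are carried out.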
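Step S200: configure the anchor user terminal as a server terminal and receive a connection request initiated by a second terminal according to the pre-protocol information.
After the anchor user enables the multi-source live broadcast, the live video client configures the anchor user terminal as a server to receive the video stream of the second terminal. To distinguish it from the server of the live broadcast platform, this embodiment also calls the anchor user terminal the server terminal. At the same time, the anchor user sends the pre-protocol information output by the live broadcast client to other users/terminal devices, so that these second terminals initiate connection requests to the server terminal according to the pre-protocol information.
Specifically, after the second terminal initiates a connection request to the anchor user terminal according to the pre-protocol information, the live broadcast platform server verifies the pre-protocol information. If verification succeeds, the connection request is treated as legitimate, and a dedicated communication link is established between the anchor user terminal and the second terminal so that the two can exchange data directly. The second terminal and the anchor user terminal then keep this communication link alive as a long connection. A long connection means that a client stays connected to a server for a long time. In this embodiment, because the anchor user terminal is configured as the server terminal, the long connection means that the live video client of the second terminal stays connected to the server terminal (that is, the anchor user terminal) for a long time. The two parties to a long connection can send multiple data packets in succession over one connection. Servers generally impose a timeout: if a connection is inactive (no data transmitted) for a certain time, the server disconnects it automatically, so the client needs to send the server a heartbeat detection packet at intervals to keep the long connection alive. In this embodiment, therefore, if no data packets are being sent while the long connection between the anchor user terminal and the second terminal is held, the live video client of the second terminal must send a heartbeat packet to the anchor user terminal at intervals to keep the long connection alive. Long connections are suited to point-to-point communication; in the field of live video, the present invention can greatly reduce the bandwidth used by the live broadcast platform server and lower its load.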
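The heartbeat rule described above can be sketched as a small idle-timeout tracker. This is a minimal illustration, not the implementation of the specification: the class `LongConnection`, the function `heartbeat_due` and the timeout and interval parameters are hypothetical, and the clock is injectable only so that the behaviour can be checked without waiting.

```python
import time


class LongConnection:
    """Tracks activity on one long connection, as seen by the server terminal."""

    def __init__(self, idle_timeout: float, clock=time.monotonic):
        self.idle_timeout = idle_timeout  # seconds of silence tolerated
        self._clock = clock
        self._last_seen = clock()

    def on_packet(self) -> None:
        """Record any inbound packet: media data or a heartbeat."""
        self._last_seen = self._clock()

    def is_expired(self) -> bool:
        """True once the peer has been silent longer than the timeout."""
        return self._clock() - self._last_seen > self.idle_timeout


def heartbeat_due(last_sent: float, now: float, interval: float) -> bool:
    """Client side: send a heartbeat when nothing was sent for `interval`."""
    return now - last_sent >= interval
```

Choosing a heartbeat interval shorter than the server's idle timeout keeps an otherwise silent connection from being dropped.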
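Step S300: in response to the connection request, after the anchor user confirms the connection with at least one second terminal, receive the second video stream collected by the second terminal.
The anchor user's live video client responds to the connection request and displays, on the live video interface, the one or more second terminals that are holding a long connection. Given the characteristics of long connections and the size of the live video interface, the number of second terminals holding a long connection with the anchor user terminal is preferably at most five. Once the anchor user selects one or more of these second terminals and confirms the connection, the second terminal transmits the second video stream to the server terminal, the two sides no longer need to exchange heartbeat detection packets, and the server terminal (that is, the anchor user terminal) receives the second video stream collected by the second terminal. In addition, the anchor user can at any time cancel reception of the second terminal's video stream on the operation interface of the anchor user terminal. The second terminal then stops sending the second video stream to the server terminal and, after the specified time has elapsed, again sends heartbeat detection packets to the server terminal to keep the long connection with it. Furthermore, the anchor user can also disconnect the long connection with the second terminal: once the specified time has elapsed after the disconnection, the second terminal again sends a heartbeat detection packet to the server terminal, the server terminal gives no feedback, and the second terminal then drops its long connection with the server terminal.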
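The transitions in step S300 (waiting on heartbeats, confirmed and streaming, cancelled, disconnected) can be sketched as a small state table kept on the anchor terminal. The class and method names are hypothetical; only the limit of five second terminals comes from the text.

```python
from enum import Enum

MAX_SECOND_TERMINALS = 5  # the preferred upper bound stated in the text


class State(Enum):
    WAITING = "waiting"      # long connection held by heartbeat packets
    STREAMING = "streaming"  # confirmed; second video stream is flowing


class AnchorSessions:
    """Second terminals currently known to the server terminal."""

    def __init__(self):
        self._sessions = {}

    def accept(self, terminal_id: str) -> bool:
        """Admit a verified connection request if there is room for it."""
        if terminal_id in self._sessions:
            return True
        if len(self._sessions) >= MAX_SECOND_TERMINALS:
            return False
        self._sessions[terminal_id] = State.WAITING
        return True

    def confirm(self, terminal_id: str) -> None:
        """Anchor selects the terminal: start receiving its stream."""
        self._require(terminal_id)
        self._sessions[terminal_id] = State.STREAMING

    def cancel(self, terminal_id: str) -> None:
        """Anchor stops receiving; the terminal falls back to heartbeats."""
        self._require(terminal_id)
        self._sessions[terminal_id] = State.WAITING

    def disconnect(self, terminal_id: str) -> None:
        """Anchor drops the long connection entirely."""
        self._sessions.pop(terminal_id, None)

    def needs_heartbeat(self, terminal_id: str) -> bool:
        return self._sessions.get(terminal_id) is State.WAITING

    def streaming(self) -> list:
        return sorted(t for t, s in self._sessions.items() if s is State.STREAMING)

    def _require(self, terminal_id: str) -> None:
        if terminal_id not in self._sessions:
            raise KeyError(terminal_id)
```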
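Step S400: synthesize the locally collected first video stream and the second video stream into a third video stream and upload it to the server, so that the server pushes the third video stream to each user in the live broadcast room.
After the anchor user confirms the connection with at least one second terminal, the video stream collected by the second terminal is received; at the same time, the locally collected first video stream and the second video stream are synthesized into a third video stream and uploaded to the server, so that the server pushes the third video stream to each user in the live broadcast room. The first, second and third video streams each include an image stream and an audio stream; the third video stream includes the image stream of at least one of the first and second video streams, and further includes the audio stream of at least one of the two. For example, when the anchor terminal receives one second video stream, the video streams on the anchor user terminal comprise the image stream and audio stream of one first video stream and the image stream and audio stream of one second video stream. When they are synthesized into the third video stream, then according to the anchor user's choice either one of the two audio streams is selected, or the volume of one audio stream is adjusted so that it does not interfere with playback of the other. FIG. 6 is a schematic diagram of the third video stream after two second video streams have been selected and connected according to the present invention.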
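The audio choice described above (select one audio stream, or lower the volume of one so that the other stays audible) can be sketched as a gain-weighted mix. This assumes both audio streams are already decoded to 16-bit PCM samples at the same sample rate, which the specification does not state; the function name `mix_audio` and its parameters are hypothetical.

```python
def mix_audio(first, second, first_gain=1.0, second_gain=1.0):
    """Mix two equal-rate 16-bit PCM sample sequences into one.

    A gain of 0.0 drops that stream entirely (the "select one" case);
    a gain between 0 and 1 lowers its volume under the other stream.
    """
    length = max(len(first), len(second))
    mixed = []
    for i in range(length):
        a = first[i] if i < len(first) else 0
        b = second[i] if i < len(second) else 0
        sample = int(round(a * first_gain + b * second_gain))
        # Clamp to the signed 16-bit range to avoid wrap-around distortion.
        mixed.append(max(-32768, min(32767, sample)))
    return mixed
```

For instance, `mix_audio(first, second, second_gain=0.0)` keeps only the anchor's own audio, while `second_gain=0.5` keeps the second stream audible at reduced volume.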
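In this embodiment, the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel. Parallel work means completing two or more jobs of the same or different nature at the same moment or within the same time interval. It has the following characteristics: ① time overlap: adjacent processing steps are staggered in time and take turns using the parts of the same set of hardware; ② resource sharing: multiple users take turns using the same set of resources in a certain time order, which improves resource utilization; ③ resource duplication: hardware resources are provisioned redundantly to improve hardware reliability and performance. Specifically, for this embodiment, parallel work means that while receiving the second video stream the anchor terminal can simultaneously collect the first video stream, simultaneously merge the collected first video stream and the received second video stream into a third video stream, and simultaneously upload the third video stream to the server, which keeps the live video stream synchronized.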
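One way to picture the four processes working in parallel is a pipeline of threads connected by queues. The sketch below is an assumption-laden illustration: frames are plain Python values, the stage functions are hypothetical stand-ins for capture, reception, synthesis and upload, and unbounded queues are used for simplicity where a real client would bound them and drop late frames.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def source(frames, out_q):
    """Stands in for local capture or for receiving a second stream."""
    for frame in frames:
        out_q.put(frame)
    out_q.put(STOP)


def synthesize(first_q, second_q, out_q):
    """Pair one frame from each input and emit a combined frame."""
    while True:
        a, b = first_q.get(), second_q.get()
        if a is STOP or b is STOP:
            out_q.put(STOP)
            return
        out_q.put((a, b))


def upload(in_q, uploaded):
    """Stands in for pushing the third stream to the platform server."""
    while True:
        frame = in_q.get()
        if frame is STOP:
            return
        uploaded.append(frame)


def run_pipeline(first_frames, second_frames):
    first_q, second_q, third_q = queue.Queue(), queue.Queue(), queue.Queue()
    uploaded = []
    stages = [
        threading.Thread(target=source, args=(first_frames, first_q)),
        threading.Thread(target=source, args=(second_frames, second_q)),
        threading.Thread(target=synthesize, args=(first_q, second_q, third_q)),
        threading.Thread(target=upload, args=(third_q, uploaded)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return uploaded
```

Each stage starts work on the next frame as soon as one is available, so capture, reception, synthesis and upload overlap in time rather than running one after another.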
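Correspondingly, as shown in FIG. 7, the present invention further provides an apparatus for synthesizing a video stream of a live broadcast room, comprising:
Output module 100: configured to output, in response to an operation instruction by which the anchor user enables multi-source live broadcasting, pre-protocol information containing feature information of the live broadcast room and identity information of the anchor.
The user clicks the "I am the anchor" button shown in FIG. 2, logs in with the user identity information of the live broadcast platform and configures information such as the title and cover image of the live broadcast room. The user can then start a multi-source live broadcast as the anchor, which triggers the operation instruction that enables multi-source live broadcasting. The output module 100 then outputs the pre-protocol information, containing the feature information of the live broadcast room and the identity information of the anchor, in the two-dimensional code form of FIG. 3, the feature password form of FIG. 4 or the link form of FIG. 5.
Configuration module 200: configured to configure the anchor user terminal as a server terminal and to receive a connection request initiated by a second terminal according to the pre-protocol information.
After the anchor user enables the multi-source live broadcast, the configuration module 200 configures the anchor user terminal as a server to receive the video stream of the second terminal. To distinguish it from the server of the live broadcast platform, this embodiment also calls the anchor user terminal the server terminal. At the same time, the anchor user sends the pre-protocol information output by the live broadcast client to other users/terminal devices, so that these second terminals initiate connection requests to the server terminal according to the pre-protocol information.
Receiving module 300: configured to receive, in response to the connection request and after the anchor user confirms the connection with at least one second terminal, the second video stream collected by the second terminal.
The receiving module 300 responds to the connection request and displays, on the live video interface, the one or more second terminals holding a long connection. Once the anchor user selects one or more of these second terminals and confirms the connection, the second terminal transmits the second video stream to the server terminal, the two sides no longer need to exchange heartbeat detection packets, and the receiving module 300 receives the second video stream collected by the second terminal.
Synthesizing module 400: configured to synthesize the locally collected first video stream and the second video stream into a third video stream and upload it to the server, so that the server pushes the third video stream to each user in the live broadcast room.
After the anchor user confirms the connection with at least one second terminal, the video stream collected by the second terminal is received, and at the same time the synthesizing module 400 synthesizes the locally collected first video stream and the second video stream into a third video stream and uploads it to the server, so that the server pushes the third video stream to each user in the live broadcast room. The video streams include an image stream and an audio stream; the third live video stream includes the image stream of at least one of the first live video stream and the second live video stream, or an image stream synthesized from both, and further includes the audio stream of at least one of the two.
In this embodiment, the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel. Specifically, for this embodiment, parallel work means that while the receiving module 300 receives the second video stream, the first video stream can be collected simultaneously, the synthesizing module 400 simultaneously merges the collected first video stream and the received second video stream into a third video stream, and the third video stream is simultaneously uploaded to the server, which keeps the live video stream synchronized.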
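In addition, the present invention further provides a terminal device comprising a memory and a processor. The memory is configured to store candidate intermediate data and result data generated while the above method is carried out, and the processor is configured to invoke and execute any step of the live broadcast room video stream synthesis method stored in the memory in program form.
The present invention configures the anchor terminal as a server terminal that receives the video streams which the second terminals collect and send directly to the anchor terminal. The anchor terminal then synthesizes all the video streams into the video stream pushed to the live broadcast room, so the video stream seen by the viewers in the room is consistent with the video stream on the anchor terminal, which keeps the video stream synchronized. At the same time, because the anchor terminal is configured as a server terminal, the video stream collected by the second terminal is sent directly to the anchor terminal without passing through the server of the live broadcast platform. The live broadcast platform server only needs to receive the final video stream that the anchor terminal uploads for the live broadcast room and push it to each viewing user in the room. This avoids a large number of second terminals uploading video streams to the server, which reduces network bandwidth usage and lowers the load on the live broadcast platform server. Furthermore, the video stream pushed to the live broadcast room is suitable for all kinds of live video playback front ends (PC, mobile and web), so the various front ends no longer need to adapt to different video stream parsing protocols, which frees the front ends completely.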
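The bandwidth argument can be made concrete with a small calculation. It is a rough sketch under two assumptions not stated in the specification: every stream has the same bitrate, and the alternative being compared is that the anchor and every second terminal upload separately to the platform server. The function name and parameters are hypothetical.

```python
def platform_ingest_kbps(second_terminals: int, bitrate_kbps: int,
                         synthesize_on_anchor: bool) -> int:
    """Upstream bandwidth arriving at the platform server for one room."""
    if synthesize_on_anchor:
        streams = 1                      # only the synthesized third stream
    else:
        streams = 1 + second_terminals   # anchor stream plus every second stream
    return streams * bitrate_kbps
```

With five second terminals at 2000 kbps each, the platform server would ingest 12000 kbps if every terminal uploaded separately, against 2000 kbps when the anchor terminal uploads only the synthesized stream.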
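Although some exemplary embodiments of the present invention have been shown above, those skilled in the art will understand that changes may be made to these exemplary embodiments without departing from the principle or spirit of the present invention, the scope of which is defined by the claims and their equivalents.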

Claims (10)

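  1. A method for synthesizing a video stream of a live broadcast room, characterized by comprising the following steps:
    in response to an operation instruction by which an anchor user enables multi-source live broadcasting, outputting pre-protocol information containing feature information of the live broadcast room and identity information of the anchor;
    configuring the anchor user terminal as a server terminal and receiving a connection request initiated by a second terminal according to the pre-protocol information;
    in response to the connection request, after the anchor user confirms the connection with at least one second terminal, receiving a second video stream collected by the second terminal;
    synthesizing a locally collected first video stream and the second video stream into a third video stream and uploading it to a server, so that the server pushes the third video stream to each user in the live broadcast room.
  2. The method according to claim 1, characterized in that the pre-protocol information is represented in the form of a two-dimensional code, a feature password or a link.
  3. The method according to claim 1, characterized in that the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel.
  4. The method according to claim 1, characterized in that the first, second and third video streams each include an image stream and an audio stream, and the third video stream includes the image stream of at least one of the first video stream and the second video stream and further includes the audio stream of at least one of the two.
  5. The method according to claim 1, characterized in that the second terminal initiates a connection request to the anchor user terminal according to the pre-protocol information and maintains a long connection with the anchor user terminal.
  6. An apparatus for synthesizing a video stream of a live broadcast room, characterized by comprising:
    an output module, configured to output, in response to an operation instruction by which an anchor user enables multi-source live broadcasting, pre-protocol information containing feature information of the live broadcast room and identity information of the anchor;
    a configuration module, configured to configure the anchor user terminal as a server terminal and to receive a connection request initiated by a second terminal according to the pre-protocol information;
    a receiving module, configured to receive, in response to the connection request and after the anchor user confirms the connection with at least one second terminal, a second video stream collected by the second terminal;
    a synthesizing module, configured to synthesize a locally collected first video stream and the second video stream into a third video stream and upload it to a server, so that the server pushes the third video stream to each user in the live broadcast room.
  7. The apparatus according to claim 6, characterized in that the pre-protocol information is represented in the form of a two-dimensional code, a feature password, a link or the like.
  8. The apparatus according to claim 6, characterized in that the process of receiving the second video stream, the process of collecting the local first video stream, the process of synthesizing the third video stream and the process of uploading the third video stream work in parallel.
  9. The apparatus according to claim 6, characterized in that the first, second and third video streams each include an image stream and an audio stream, and the third video stream includes the image stream of at least one of the first video stream and the second video stream and further includes the audio stream of at least one of the two.
  10. A terminal device, characterized by comprising a processor and a memory, the processor being configured to invoke and execute the steps of the method according to any one of claims 1 to 5, stored in the memory in program form.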
PCT/CN2017/105017 2016-11-22 2017-09-30 Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room WO2018095146A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/419,022 US20190273955A1 (en) 2016-11-22 2019-05-22 Method, device and terminal apparatus for synthesizing video stream of live streaming room

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611049302.3A CN106792245B (zh) 2016-11-22 2016-11-22 Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room
CN201611049302.3 2016-11-22

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/419,022 Continuation US20190273955A1 (en) 2016-11-22 2019-05-22 Method, device and terminal apparatus for synthesizing video stream of live streaming room

Publications (1)

Publication Number Publication Date
WO2018095146A1 true WO2018095146A1 (zh) 2018-05-31

Family

ID=58910512

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/105017 WO2018095146A1 (zh) 2016-11-22 2017-09-30 Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room

Country Status (3)

Country Link
US (1) US20190273955A1 (zh)
CN (1) CN106792245B (zh)
WO (1) WO2018095146A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814732A (zh) * 2020-07-23 2020-10-23 上海优扬新媒信息技术有限公司 Identity verification method and apparatus
EP4017013A4 (en) * 2019-08-14 2022-11-16 Beijing Dajia Internet Information Technology Co., Ltd. METHOD AND DEVICE FOR OPENING A VIDEO IMAGE IN A CHAT ROOM AND ELECTRONIC DEVICE AND STORAGE MEDIA

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
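CN106792245B (zh) * 2016-11-22 2018-04-20 广州华多网络科技有限公司 Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room
CN107547910A (zh) * 2017-08-24 2018-01-05 深圳依偎控股有限公司 Method and apparatus for live broadcasting based on multiple devices
CN108055577A (zh) * 2017-12-18 2018-05-18 北京奇艺世纪科技有限公司 Live broadcast interaction method, system, apparatus and electronic device
CN108156501A (zh) * 2017-12-29 2018-06-12 北京安云世纪科技有限公司 Method, system and mobile terminal for dynamically synthesizing video data
CN108449623B (zh) * 2018-03-27 2021-07-27 卓米私人有限公司 Control method for grabbing an object, server and target client
CN108600239A (zh) * 2018-05-01 2018-09-28 北京学易科技有限公司 Data synthesis method and apparatus, client, and server
CN108833931A (zh) * 2018-06-28 2018-11-16 南京曼殊室信息科技有限公司 Interactive live broadcast platform using 3D holographic images for remote locations
CN108924641A (zh) * 2018-07-16 2018-11-30 北京达佳互联信息技术有限公司 Live broadcast method, apparatus, computer device and storage medium
CN108900920B (zh) * 2018-07-20 2020-11-10 广州虎牙信息科技有限公司 Live broadcast processing method, apparatus, device and storage medium
CN108965904B (zh) * 2018-09-05 2021-08-06 阿里巴巴(中国)有限公司 Volume adjustment method for a live broadcast room, and client
CN109525460B (zh) * 2018-11-26 2020-10-13 视联动力信息技术股份有限公司 Method and apparatus for monitoring video networking number resources
CN109640171A (zh) * 2018-12-07 2019-04-16 北京微播视界科技有限公司 Multimedia information synthesis method, electronic device and computer-readable storage medium
CN112788349B (zh) * 2019-11-01 2022-10-04 上海哔哩哔哩科技有限公司 Data stream pushing method, system, computer device and readable storage medium
CN111131908B (zh) * 2019-12-19 2021-12-28 广州方硅信息技术有限公司 Method, apparatus, device and storage medium for receiving a voice gift
CN111263220B (zh) * 2020-01-15 2022-03-25 北京字节跳动网络技术有限公司 Video processing method, apparatus, electronic device and computer-readable storage medium
CN111355973B (zh) * 2020-03-09 2021-10-15 北京达佳互联信息技术有限公司 Data playback method, apparatus, electronic device and storage medium
CN111355974A (zh) * 2020-03-12 2020-06-30 广州酷狗计算机科技有限公司 Method, apparatus, system, device and storage medium for processing virtual gift giving
CN111541906B (zh) * 2020-04-22 2022-07-05 广州酷狗计算机科技有限公司 Data sending method, apparatus, computer device and storage medium
CN112235589B (zh) * 2020-10-13 2022-07-12 中国联合网络通信集团有限公司 Network live broadcast identification method, edge server, computer device and storage medium
CN112333460B (zh) * 2020-11-02 2021-10-26 腾讯科技(深圳)有限公司 Live broadcast management method, computer device and readable storage medium
CN112423012B (zh) * 2020-11-18 2023-05-09 青岛华升联信智慧科技有限公司 Multi-level load live broadcast method
CN112738540B (zh) * 2020-12-25 2023-09-05 广州虎牙科技有限公司 Multi-device live broadcast switching method, apparatus, system, electronic device and readable storage medium
CN115086689B (zh) * 2021-03-15 2024-04-05 阿里巴巴创新公司 Virtual live broadcast management method, electronic device and computer storage medium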
US11917214B1 (en) 2022-10-20 2024-02-27 Lemon Inc. Methods and systems for live streaming recommended content

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8910208B2 (en) * 2009-12-07 2014-12-09 Anthony Hartman Interactive video system
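CN104581221A (zh) * 2014-12-25 2015-04-29 广州酷狗计算机科技有限公司 Live video broadcasting method and apparatus
CN105306468A (zh) * 2015-10-30 2016-02-03 广州华多网络科技有限公司 Method for real-time sharing of synthesized video data, and anchor client thereof
CN105407384A (zh) * 2014-09-15 2016-03-16 上海天脉聚源文化传媒有限公司 Method, apparatus and system for identifying media playback content using a two-dimensional code
CN105959719A (zh) * 2016-06-27 2016-09-21 徐文波 Live video broadcasting method, apparatus and system
CN106792245A (zh) * 2016-11-22 2017-05-31 广州华多网络科技有限公司 Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room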

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
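CN201623797U (zh) * 2010-04-30 2010-11-03 第一视频通信传媒有限公司 Multi-picture network live broadcast system
GB2526245A (en) * 2014-03-04 2015-11-25 Microsoft Technology Licensing Llc Sharing content
CN105530535A (zh) * 2014-09-29 2016-04-27 中兴通讯股份有限公司 Method and system for real-time interaction among multiple people watching a video
CN104639905B (zh) * 2015-02-03 2016-07-06 广西智询信息科技有限公司 Wireless video monitoring method suitable for live broadcasting on iOS mobile terminals
US9774571B2 (en) * 2015-03-10 2017-09-26 Microsoft Technology Licensing, Llc Automatic provisioning of meeting room device
KR102013054B1 (ko) * 2015-04-10 2019-08-21 천종윤 Method and system for outputting a performance and generating performance content
CN105472368A (zh) * 2015-11-25 2016-04-06 深圳凯澳斯科技有限公司 Stereoscopic live video system for cluster terminals
CN105704502B (zh) * 2016-01-19 2018-11-20 丁一 Live video interaction method and apparatus
CN105763883B (zh) * 2016-02-19 2018-09-28 锐达互动科技股份有限公司 Live broadcast method based on a recording-and-broadcasting device and a cloud platform
CN105847263A (zh) * 2016-03-31 2016-08-10 乐视控股(北京)有限公司 Live video broadcasting method, apparatus and system
CN105847913B (zh) * 2016-05-20 2019-05-31 腾讯科技(深圳)有限公司 Method, mobile terminal and system for controlling live video broadcasting
WO2017219347A1 (zh) * 2016-06-24 2017-12-28 北京小米移动软件有限公司 Live broadcast display method, apparatus and system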

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8910208B2 (en) * 2009-12-07 2014-12-09 Anthony Hartman Interactive video system
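CN105407384A (zh) * 2014-09-15 2016-03-16 上海天脉聚源文化传媒有限公司 Method, apparatus and system for identifying media playback content using a two-dimensional code
CN104581221A (zh) * 2014-12-25 2015-04-29 广州酷狗计算机科技有限公司 Live video broadcasting method and apparatus
CN105306468A (zh) * 2015-10-30 2016-02-03 广州华多网络科技有限公司 Method for real-time sharing of synthesized video data, and anchor client thereof
CN105959719A (zh) * 2016-06-27 2016-09-21 徐文波 Live video broadcasting method, apparatus and system
CN106792245A (zh) * 2016-11-22 2017-05-31 广州华多网络科技有限公司 Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room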

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4017013A4 (en) * 2019-08-14 2022-11-16 Beijing Dajia Internet Information Technology Co., Ltd. METHOD AND DEVICE FOR OPENING A VIDEO IMAGE IN A CHAT ROOM AND ELECTRONIC DEVICE AND STORAGE MEDIA
CN111814732A (zh) * 2020-07-23 2020-10-23 上海优扬新媒信息技术有限公司 Identity verification method and apparatus
CN111814732B (zh) * 2020-07-23 2024-02-09 度小满科技(北京)有限公司 Identity verification method and apparatus

Also Published As

Publication number Publication date
CN106792245B (zh) 2018-04-20
CN106792245A (zh) 2017-05-31
US20190273955A1 (en) 2019-09-05

Similar Documents

Publication Publication Date Title
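WO2018095146A1 (zh) Method, apparatus and terminal device for synthesizing a video stream of a live broadcast room
WO2018095174A1 (zh) Method, apparatus and terminal device for controlling synthesis of a video stream of a live broadcast room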
US11616990B2 (en) Method for controlling delivery of a video stream of a live-stream room, and corresponding server and mobile terminal
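CN110597774B (zh) File sharing method, system, apparatus, computing device and terminal device
WO2018121014A1 (zh) Video playback control method, apparatus and terminal device
CN108259813B (zh) Multifunctional screen transmission apparatus, system and method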
US9118729B2 (en) Method for sharing resource of a videoconference using a video conferencing system
WO2015117513A1 (zh) Video conference control method and system
EP3996355B1 (en) Method for transferring media stream and user equipment
US9756096B1 (en) Methods for dynamically transmitting screen images to a remote device
US20180014063A1 (en) Method and Apparatus for Accessing a Terminal Device Camera to a Target Device
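WO2019080222A1 (zh) Data transmission method and apparatus for a mobile terminal, and mobile terminal
CN105763905A (zh) Method and system for sharing a camera between a mobile phone and a television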
US20110096699A1 (en) Media pipeline for a conferencing session
US20220116746A1 (en) Special effect synchronization method, device, and storage medium
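KR100628322B1 (ko) Access mediator system for mediating broadcasting and communication convergence services through a non-communication device
CN111092898A (zh) Message transmission method and related device
CN101022500A (zh) Method for transmitting video images from a mobile phone to a computer via Bluetooth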
US11290685B2 (en) Call processing method and gateway
US20040193675A1 (en) Method for supporting a personal wireless network
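CN103581607A (zh) Method for delivering a video stream to a local endpoint host using a remote camera device
CN113726534A (zh) Conference control method, apparatus, electronic device and storage medium
WO2012103670A1 (zh) Method, apparatus and system for providing and processing multimedia data based on a cloud service
JP2014072600A (ja) Conference server, communication method, computer program and remote conference system
CN103248860A (zh) Method for adaptively adjusting video transmission bandwidth, and corresponding apparatus and system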

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17874845

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17874845

Country of ref document: EP

Kind code of ref document: A1