CN108900859B - Live broadcasting method and system

Live broadcasting method and system

Info

Publication number
CN108900859B
CN108900859B (application CN201810943503.0A)
Authority
CN
China
Prior art keywords
terminal
video
audio
frame
video frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810943503.0A
Other languages
Chinese (zh)
Other versions
CN108900859A (en)
Inventor
吕现广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kugou Computer Technology Co Ltd
Original Assignee
Guangzhou Kugou Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kugou Computer Technology Co Ltd
Priority to CN201810943503.0A
Publication of CN108900859A
Application granted
Publication of CN108900859B
Legal status: Active

Classifications

    All classifications fall under H (Electricity) > H04 (Electric communication technique) > H04N (Pictorial communication, e.g. television) > H04N 21/00 (Selective content distribution, e.g. interactive television or video on demand [VOD]):
    • H04N 21/2187: Live feed (source of audio or video content for distribution servers)
    • H04N 21/4122: Client peripherals receiving signals from specially adapted client devices, e.g. an additional display device such as a video projector
    • H04N 21/4305: Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H04N 21/439: Processing of audio elementary streams
    • H04N 21/44016: Processing of video elementary streams involving splicing one content stream with another, e.g. for substituting a video clip
    • H04N 21/44227: Monitoring of local network, e.g. connection or bandwidth variations; detecting new devices in the local network
    • H04N 21/8547: Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses a live broadcast method and system, belonging to the technical field of information processing. In the method, when a first terminal is in a dual-stream live broadcast mode and in a mic-connected (co-streaming) state, the first terminal composites a first video frame captured by itself with a second video frame captured by a third terminal connected to it to obtain a composite video, mixes a first audio frame captured by itself with a second audio frame captured by the third terminal to obtain mixed audio, and sends a first data packet containing the composite video and the mixed audio to a second terminal and to a streaming media server.

Description

Live broadcasting method and system
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a live broadcast method and system.
Background
In current internet live broadcasting, an anchor can perform dual-stream live broadcasting, meaning the anchor uses two terminals that each capture one video stream; for convenience of later description, the two terminals are called a first terminal and a second terminal. The first terminal is a landscape terminal, that is, the aspect ratio of its display screen is greater than 1, so the video it captures has a landscape resolution; the second terminal is a portrait terminal, that is, the aspect ratio of its display screen is less than 1, so the video it captures has a portrait resolution. The first terminal and the second terminal each send the video they capture to a streaming media server, and after receiving both videos the streaming media server pushes the first terminal's video to viewers holding landscape terminals and the second terminal's video to viewers holding portrait terminals.
In this live broadcasting method, when a viewer acting as a co-host connects to the anchor's first terminal through a third terminal, the first terminal can receive the video captured by the third terminal, composite it with its own captured video, and send the result to the streaming media server. However, since the first terminal's video has a landscape resolution, the composite video also has a landscape resolution, so after receiving it the streaming media server can only push it to other viewers holding landscape terminals. Similarly, if the viewer connects to the anchor's second terminal through the third terminal, the second terminal composites its own video with the third terminal's video to obtain a composite video at a portrait resolution, which the streaming media server can only push to other viewers holding portrait terminals. That is, in the related art the viewer's third terminal can connect to only one of the anchor's two terminals, so either the viewers holding portrait terminals or the viewers holding landscape terminals cannot watch the composite video. It is therefore desirable to provide a live broadcasting method that ensures that, during co-streaming, viewers holding portrait terminals and viewers holding landscape terminals can both watch the composite video of the anchor and the co-host.
Disclosure of Invention
The embodiments of the application provide a live broadcasting method and system that, during co-streaming in a dual-stream live broadcast, can simultaneously provide viewers holding landscape terminals and viewers holding portrait terminals with co-streamed video matching the aspect ratio of their respective display screens. The technical scheme is as follows:
In a first aspect, a live broadcast method is provided, the method including:
a first terminal detecting whether it is currently in a dual-stream live broadcast mode and whether it is currently in a mic-connected state;
if it is currently in the dual-stream live broadcast mode and in the mic-connected state, the first terminal acquiring a first video frame and a first audio frame it is currently capturing, and acquiring a second video frame and a second audio frame currently captured by a third terminal connected to it;
compositing the first video frame and the second video frame to obtain a composite video, mixing the first audio frame and the second audio frame to obtain mixed audio, and sending a first data packet containing the composite video and the mixed audio to a second terminal and to a streaming media server;
when the second terminal receives the first data packet, processing the first data packet to obtain a processed video matching the aspect ratio of the second terminal's display screen;
and the second terminal displaying the processed video and sending a second data packet containing the processed video and the mixed audio to the streaming media server.
Optionally, the detecting whether the terminal is currently in the dual-stream live broadcast mode and in the mic-connected state includes:
detecting whether a current dual-stream live broadcast variable is a first value, and whether a current mic-connection variable is a second value;
and if the dual-stream live broadcast variable is the first value and the mic-connection variable is the second value, determining that the terminal is currently in the dual-stream live broadcast mode and in the mic-connected state.
Optionally, the first terminal is a landscape terminal, and the compositing of the first video frame and the second video frame includes:
determining a first boundary line and a second boundary line parallel to the height direction of the first video frame, where the distance from the first boundary line to a first edge of the first video frame equals the distance from the second boundary line to a second edge of the first video frame, and the first edge and the second edge are both parallel to the height direction of the first video frame;
cropping, from the first video frame, a first video picture lying between the first boundary line and the second boundary line, the width of the first video picture being smaller than the width of the first video frame and the height of the first video picture being equal to the height of the first video frame;
and if the height of the second video frame is the same as that of the first video frame, splicing the second video frame onto the side of the first video picture where the first edge or the second edge lies.
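For illustration only, this crop-and-splice step can be sketched in Python as follows, treating frames as NumPy arrays of shape height x width x 3. The quarter-width boundary lines mirror the example given in the detailed description below; the function name and the fixed crop are assumptions, not the patented implementation.

```python
import numpy as np

def composite_landscape(first_frame: np.ndarray, second_frame: np.ndarray) -> np.ndarray:
    """Crop the first video picture between two boundary lines parallel to the
    height direction, then splice the co-host's frame onto one side."""
    height, width, _ = first_frame.shape
    # Boundary lines one quarter of the width in from each edge, so the
    # retained first video picture is the middle half of the landscape frame.
    left, right = width // 4, width - width // 4
    first_picture = first_frame[:, left:right, :]
    if second_frame.shape[0] != height:
        raise ValueError("second frame must match the first frame's height")
    # Splice the second frame on the side where the second edge lies.
    return np.concatenate([first_picture, second_frame], axis=1)
```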
Optionally, the processing of the first data packet includes:
acquiring the composite video in the first data packet, and shrinking the composite video so that its width equals the width of the second terminal's display screen;
according to the height of the second terminal's display screen, splicing a first blank picture onto the side of the scaled video where its first edge lies, and a second blank picture onto the side where its second edge lies;
where the first edge and the second edge are parallel to the width direction of the scaled video, the height of the first blank picture is the same as that of the second blank picture, and the sum of the heights of the first blank picture, the second blank picture, and the scaled video equals the height of the second terminal's display screen;
and filling the first blank picture and the second blank picture of the spliced video with a background color, the filled video being taken as the processed video.
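A minimal sketch of this portrait-side processing, assuming OpenCV for the scaling and padding; the helper name, the black background, and the library choice are illustrative assumptions:

```python
import cv2
import numpy as np

def letterbox_for_portrait(composite: np.ndarray, screen_w: int, screen_h: int,
                           bg=(0, 0, 0)) -> np.ndarray:
    """Shrink the composite video so its width equals the portrait screen's
    width, then splice equal blank pictures above and below, filled with a
    background color. Assumes the scaled height does not exceed screen_h."""
    h, w, _ = composite.shape
    scaled_h = screen_w * h // w                  # keep the aspect ratio
    scaled = cv2.resize(composite, (screen_w, scaled_h))
    pad = screen_h - scaled_h                     # total blank height
    top, bottom = pad // 2, pad - pad // 2        # first/second blank pictures
    return cv2.copyMakeBorder(scaled, top, bottom, 0, 0,
                              cv2.BORDER_CONSTANT, value=bg)
```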
Optionally, the composite video and the mixed audio carry the same timestamp, and the processed video carries the timestamp of the composite video.
Optionally, the displaying of the processed video includes:
extracting the timestamp from the processed video;
recording the current system time, and determining the display time of the processed video based on the frame interval of the video frames and the current system time;
if the display time is later than the time indicated by the timestamp, displaying the processed video immediately;
and if the display time is earlier than the time indicated by the timestamp, delaying the display of the processed video.
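For illustration, the display-time check could be sketched as follows; the 30 fps frame interval and the render callback are assumptions:

```python
import time

FRAME_INTERVAL = 1.0 / 30   # assumed frame interval for a 30 fps stream

def display_when_due(frame, timestamp: float, render) -> None:
    # Display time = current system time plus one frame interval.
    display_time = time.time() + FRAME_INTERVAL
    if display_time >= timestamp:
        render(frame)                           # already due: show immediately
    else:
        time.sleep(timestamp - display_time)    # too early: delay the display
        render(frame)
```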
Optionally, the method further comprises:
if the first terminal detects that it is currently in the dual-stream live broadcast mode but not in the mic-connected state, the first terminal acquiring a first video frame and a first audio frame it is currently capturing;
the first terminal sending an audio data packet containing the first audio frame to the second terminal, and sending a third data packet containing the first video frame and the first audio frame to the streaming media server, the first audio frame carrying a first timestamp;
when the second terminal receives the audio data packet, the second terminal acquiring the first audio frame in the audio data packet, acquiring a third video frame it is currently capturing, and recording the capture time of the third video frame;
determining a second timestamp of the third video frame based on the capture time of the third video frame and a time offset between the system time of the first terminal and the system time of the second terminal;
displaying the third video frame based on the second timestamp;
and sending a fourth data packet containing the first audio frame and the third video frame to the streaming media server, the third video frame carrying the second timestamp.
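The second-timestamp computation reduces to shifting the local capture time by the measured clock offset, sketched here with assumed names:

```python
def second_timestamp(capture_time: float, offset: float) -> float:
    """offset = (first terminal's clock) - (second terminal's clock), as
    measured by the timing handshake described below. Shifting the capture
    time by it places the third video frame on the first terminal's timeline,
    so it lines up with the first audio frame's timestamps."""
    return capture_time + offset
```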
Optionally, the method further comprises:
the second terminal sending at least one timing request packet to the first terminal, and storing, for each timing request packet, its request packet sequence number together with its sending time, where the sending time is the system time of the second terminal when the packet was sent and the request packet sequence number identifies the packet;
when the first terminal receives a target timing request packet, sending the second terminal a target timing response packet for it, where the target timing request packet is any one of the at least one timing request packet, and the target timing response packet carries the sequence number of the target timing request packet and the current first system time of the first terminal;
when the second terminal receives the target timing response packet, the second terminal recording the second system time of the second terminal at the moment of reception;
the second terminal, based on the request packet sequence number carried by the target timing response packet, retrieving the corresponding sending time from the stored correspondence, obtaining a third system time;
and determining the time offset between the system time of the first terminal and the system time of the second terminal based on the first system time, the second system time, and the third system time.
Optionally, the method further comprises:
if it is detected that the terminal is currently neither in the dual-stream live broadcast mode nor in the mic-connected state, acquiring a first video frame and a first audio frame currently captured by the first terminal;
and playing the first video frame, and sending a single-stream live broadcast data packet containing the first video frame and the first audio frame to the streaming media server.
Optionally, the header of the first data packet carries a mic-connection identifier used to notify the second terminal that the first terminal is currently in the mic-connected state.
In a second aspect, a live broadcast system is provided, including: a first terminal, a second terminal, a third terminal, a streaming media server, and a mic-connection server;
the first terminal is configured to detect whether it is currently in a dual-stream live broadcast mode and whether it is currently in a mic-connected state; if it is currently in both, to acquire a first video frame and a first audio frame it is currently capturing and a second video frame and a second audio frame currently captured by the third terminal connected to it; to composite the first video frame with the second video frame, mix the first audio frame with the second audio frame, and send a first data packet containing the composite video and the mixed audio to the second terminal and the streaming media server;
the second terminal is configured to receive the first data packet, acquire the composite video and the mixed audio it contains, and process the composite video to obtain a processed video matching the aspect ratio of the second terminal's display screen; the second terminal displays the processed video and sends a second data packet containing the processed video and the mixed audio to the streaming media server;
the third terminal is configured to send the second video frame and the second audio frame it is currently capturing to the mic-connection server;
the mic-connection server is configured to receive the second video frame and the second audio frame and send them to the first terminal;
the streaming media server is configured to receive the first data packet and the second data packet, and to send the first data packet or the second data packet to terminals other than the first terminal and the second terminal.
Optionally, the first terminal is further configured to acquire the first video frame and the first audio frame it is currently capturing if it detects that it is currently in the dual-stream live broadcast mode but not in the mic-connected state; to send an audio data packet containing the first audio frame to the second terminal; and to send a third data packet containing the first video frame and the first audio frame to the streaming media server, the first audio frame carrying a first timestamp;
the second terminal is further configured to receive the audio data packet, acquire the first audio frame in it, acquire a third video frame it is currently capturing, and record the capture time of the third video frame; to determine a second timestamp of the third video frame based on that capture time and the time offset between the system time of the first terminal and the system time of the second terminal; to display the third video frame based on the second timestamp; and to send a fourth data packet containing the first audio frame and the third video frame to the streaming media server, the third video frame carrying the second timestamp;
the streaming media server is further configured to receive the third data packet and the fourth data packet, and to send the third data packet or the fourth data packet to terminals other than the first terminal and the second terminal.
In a third aspect, a terminal is provided, where the terminal includes:
a processor;
a memory for storing processor-executable instructions;
wherein, when the terminal is a first terminal, the processor is configured to perform the steps performed by the first terminal in the first aspect, and when the terminal is a second terminal, the processor is configured to perform the steps performed by the second terminal in the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, having instructions stored thereon, which when executed by a processor, implement the steps of any of the methods of the first aspect described above.
The beneficial effects of the technical scheme provided by the embodiments of the application include at least the following: when in the dual-stream live broadcast mode and in the mic-connected state, the first terminal composites the first video frame it captures with the second video frame captured by the third terminal connected to it, mixes the first audio frame it captures with the second audio frame captured by the third terminal, and sends a co-streaming data packet containing the resulting composite video and mixed audio to the second terminal and the streaming media server. From this packet the second terminal can generate a processed video that contains the third terminal's video picture and matches the aspect ratio of its own display screen, and push the processed video and the mixed audio to the streaming media server. The streaming media server can then push video containing both the anchor and the co-host to viewers holding portrait terminals and viewers holding landscape terminals alike, so that in the dual-stream live broadcast mode both groups of viewers can watch the composite video of the anchor and the co-host.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a system architecture diagram related to a live broadcast method provided in an embodiment of the present application;
fig. 2 is a flowchart of a live broadcasting method provided in an embodiment of the present application;
fig. 3 is a flowchart of a live broadcasting method provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of a terminal for live broadcasting provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Before explaining the embodiments of the present application in detail, an application scenario of the embodiments of the present application will be described.
Currently, in internet live broadcasting, viewers can watch an anchor's video on a landscape terminal, such as a notebook or desktop computer, whose display aspect ratio is greater than 1, or on a portrait terminal, such as a smartphone or tablet computer, whose display aspect ratio is less than 1. To meet the viewing needs of viewers using different terminals, the anchor can perform dual-stream live broadcasting through two terminals: a landscape terminal captures landscape-resolution video and sends it to the streaming media server, while a portrait terminal captures portrait-resolution video and sends it to the streaming media server. The streaming media server can then push portrait-resolution video to viewers on portrait terminals and landscape-resolution video to viewers on landscape terminals. During dual-stream live broadcasting, the portrait terminal is usually a mobile terminal whose device performance is generally weaker than the landscape terminal's, so the landscape terminal captures audio and the portrait terminal does not. The live broadcasting method provided by the embodiments of the application can be applied to the scenario in which, during dual-stream live broadcasting, the landscape terminal connects to a viewer's terminal for co-streaming, so that viewers with terminals of different aspect ratios can all watch video that includes both the anchor and the co-host.
Next, the live broadcast system provided in an embodiment of the present application is described. As shown in fig. 1, the system may include a first terminal 101, a second terminal 102, a third terminal 103, a fourth terminal 104, a streaming media server 105, and a mic-connection server 106.
The first terminal 101 may be a terminal whose display aspect ratio is greater than 1, that is, a landscape terminal, or a terminal whose display aspect ratio is less than 1, that is, a portrait terminal. When the first terminal 101 is a landscape terminal, it may capture landscape-resolution video containing the anchor and audio containing the anchor's voice; when it is a portrait terminal, it may capture portrait-resolution video containing the anchor and audio containing the anchor's voice. When not co-streaming, the first terminal 101 may send the captured video and audio to the streaming media server 105 and send the captured audio to the second terminal 102. When co-streaming, the first terminal 101 may send its captured video and audio to the mic-connection server 106, receive the third terminal 103's video and audio forwarded by that server, composite its own video with the third terminal 103's video to obtain a composite video, mix its own audio with the third terminal 103's audio to obtain mixed audio, and send the composite video and the mixed audio to the streaming media server 105 and the second terminal 102.
When the first terminal 101 is a landscape terminal, the second terminal 102 is a portrait terminal and captures portrait-resolution video containing the anchor; when the first terminal 101 is a portrait terminal, the second terminal 102 is a landscape terminal and captures landscape-resolution video containing the anchor. In addition, the second terminal 102 may receive the audio sent by the first terminal 101, or the composite video and the mixed audio sent by the first terminal 101. On receiving only audio, the second terminal 102 processes the received audio and sends the processed audio together with its own captured portrait-resolution video to the streaming media server 105. On receiving composite video and mixed audio, it processes the composite video and sends the mixed audio and the processed video to the streaming media server 105.
The third terminal 103 may be the terminal, among the terminals of the many viewers, that co-streams with the first terminal 101. The third terminal 103 may capture video containing the co-host and audio containing the co-host's voice, send the captured video and audio to the mic-connection server 106, and receive the anchor's video and audio forwarded by the mic-connection server 106.
There may be multiple fourth terminals 104; they are the terminals of the viewers other than the co-host, and they receive the video and audio pushed by the streaming media server.
The streaming media server 105 is configured to receive the audio and video sent by the first terminal 101 and by the second terminal 102, and to push to each of the fourth terminals 104 the video and audio appropriate to the aspect ratio of that terminal's display screen.
The mic-connection server 106 is configured to receive the self-captured audio and landscape-resolution video sent by the first terminal 101 and the self-captured audio and video sent by the third terminal 103, to forward the third terminal's audio and video to the first terminal 101, and to forward the first terminal 101's audio and video to the third terminal 103.
The live broadcast method provided in the embodiments of the present application is explained in detail below.
Fig. 2 is a flowchart of a live broadcasting method according to an embodiment of the present application. The method can be applied to the first terminal in the foregoing system architecture, and referring to fig. 2, the method includes the following steps:
step 201: the first terminal detects whether the first terminal is in a double-current live broadcast mode currently or not and detects whether the first terminal is in a microphone connecting state currently or not.
The dual-stream live broadcast mode refers to that a first terminal and a second terminal respectively push one path of video to a streaming media server, wherein the aspect ratio of one path of video pushed by the first terminal is greater than 1, and the aspect ratio of the other path of video pushed by the second terminal is less than 1.
The second terminal is a terminal performing double-stream live broadcast with the first terminal.
Step 202: if the first terminal is detected to be in a double-stream live broadcast mode currently and the first terminal is detected to be in a microphone connecting state currently, the first terminal acquires a first video frame and a first audio frame which are acquired by the first terminal currently, and acquires a second video frame and a second audio frame which are acquired by a third terminal connected with the first terminal currently.
Step 203: and synthesizing the first video frame and the second video frame to obtain a synthesized video, mixing the first audio frame and the second audio frame to obtain a mixed audio, and sending a first data packet containing the synthesized video and the mixed audio to the second terminal and the streaming media server.
Step 204: and when the second terminal receives the first data packet sent by the first terminal, processing the first data packet to obtain audio mixing audio and a processed video which accords with the aspect ratio of a display screen of the second terminal.
Step 205: the second terminal displays the processed video and sends a second data packet containing the processed video and the mixed audio to the streaming media server.
In the embodiment of the application, when the dual-stream live broadcasting mode is in the continuous-broadcasting state, the first terminal can synthesize a first video frame acquired by the first terminal and a second video frame acquired by a third terminal connected with the first terminal, mix a first audio frame acquired by the first terminal and a second audio frame acquired by the third terminal, send a continuous-broadcasting data packet containing synthesized video and mixed-broadcasting audio obtained by synthesis to the second terminal, so that the second terminal can generate processed video which contains a video picture of the third terminal and accords with the aspect ratio of a display screen of the second terminal through the continuous-broadcasting data packet, and push the processed video and the mixed-broadcasting audio to the streaming media server, so that the streaming media server can push continuous-broadcasting video which accords with the aspect ratio of the display screen of each terminal and contains a main broadcaster and a continuous-broadcasting person to a viewer who holds a vertical screen terminal and a horizontal screen terminal simultaneously, therefore, under the double-stream live broadcast mode, both audience users holding the horizontal screen terminal and audience users holding the vertical screen terminal can watch the live broadcast and the live broadcast of the live broadcast.
As can be seen from the introduction of the foregoing system architecture, the first terminal may be a horizontal screen terminal or a vertical screen terminal, and considering that the device performance of the normal horizontal screen terminal is higher than that of the vertical screen terminal and the traffic cost of the vertical screen terminal is higher, in this embodiment of the present application, the explanation is mainly given by taking the first terminal for connecting the microphone as the horizontal screen terminal and the second terminal for performing the dual-stream live broadcast with the first terminal as the vertical screen terminal. This does not constitute a limitation of the first terminal and the second terminal.
Fig. 3 is a flowchart of a live broadcasting method provided in an embodiment of the present application, and as shown in fig. 3, the method includes the following steps:
step 301: the first terminal detects whether the current terminal is in a double-stream live broadcast mode.
In the embodiment of the application, a double-stream live broadcast variable used for indicating whether the double-stream live broadcast variable is currently in a double-stream live broadcast mode may be stored in the first terminal, when a value of the double-stream live broadcast variable is a first numerical value, the first terminal may determine that the double-stream live broadcast variable is currently in the double-stream live broadcast mode, and otherwise, the first terminal may determine that the double-stream live broadcast variable is not currently in the double-stream live broadcast mode.
It should be noted that the initial value of the dual-stream live broadcast variable may be a third value, that is, when the live broadcast starts, the first terminal is not in the dual-stream live broadcast mode. The method comprises the steps that a first terminal can detect whether a double-stream pairing request sent by a second terminal and used for requesting double-stream live broadcast is received or not in real time after live broadcast starts, when the first terminal receives the double-stream pairing request sent by the second terminal, the first terminal is paired with the second terminal, and after pairing is successful, the first terminal can set a value of a stored double-stream live broadcast variable to be a first numerical value.
Optionally, in the embodiment of the application, after the first terminal and the second terminal are successfully paired, the second terminal may further determine the time offset between its own system time and the first terminal's system time by sending a timing request packet to the first terminal, so as to achieve time synchronization in the subsequent dual-stream live broadcast.
For example, after pairing succeeds, the second terminal may send the first terminal a timing request packet carrying the second terminal's system time at the moment of sending. After receiving it, the first terminal may send the second terminal a timing response packet carrying both the system time carried in the timing request packet and the first terminal's current system time, that is, the first system time. The second terminal may record the second system time at which it receives the timing response packet, and determine the time offset between the two terminals' system times based on the first system time, the second system time, and its own system time when it sent the timing request packet.
Specifically, the second terminal may compute a first time deviation between the second system time and the sending time carried in the timing request packet; this deviation is in fact the round-trip time between the second terminal and the first terminal, and half of it is taken as the one-way transmission time between them. Adding the one-way time to the first system time yields a third system time, and the deviation between the second system time and this third system time is the time offset between the system time of the first terminal and the system time of the second terminal.
Optionally, in one possible case, a timing request packet sent by the second terminal may be lost, or, for other reasons, the deviation between the second terminal's system time when the timing request packet was sent and the second system time when the timing response packet is received may be too large. Half of that deviation then no longer accurately represents the one-way transmission time between the first terminal and the second terminal, and in that case the timing request packet cannot be used to determine the offset between the two terminals' system times.
Therefore, to ensure the offset between the two terminals' system times can be determined more accurately, the second terminal may send a timing request packet at fixed intervals. To distinguish the timing request packets, each may carry a request packet sequence number identifying it, and the second terminal may store the correspondence between each packet's sequence number and the second terminal's system time when the packet was sent. Whenever the first terminal receives a timing request packet, it may treat it as the target timing request packet and send the second terminal a target timing response packet for it, carrying the sequence number of the target timing request packet and the current first system time of the first terminal.
When the second terminal receives a target timing response packet, it may record the second system time at the moment of reception and read the first system time and the request packet sequence number carried in the packet. Using that sequence number, it retrieves from the stored correspondence the system time at which the corresponding target timing request packet was sent, that is, the third system time. The second terminal may then determine the first time deviation between the third system time and the second system time. If the first time deviation is less than or equal to a fourth value, it can be used to determine the one-way transmission time: the second terminal takes half of the first time deviation as the one-way time, adds the one-way time to the first system time to obtain a fourth system time, and takes the deviation between the second system time and the fourth system time as the time offset between the system time of the first terminal and the system time of the second terminal.
If the first time deviation is greater than the fourth value, it cannot be used to determine the one-way transmission time between the first terminal and the second terminal, so the target timing response packet cannot be used to determine the offset between the two terminals' system times. In that case the second terminal may wait for the next target timing response packet and, on receiving it, process it in the same way.
Optionally, if the second terminal has received as many timing response packets as the timing request packets it sent and the offset between the two terminals' system times still cannot be determined from the last received timing response packet, time synchronization between the first terminal and the second terminal is deemed to have failed, and a message prompting that the anchor cannot perform dual-stream live broadcasting may be displayed on the second terminal.
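Putting this handshake together, a minimal sketch of the second terminal's side follows; send_request, wait_response, the response fields, the 50 ms validity threshold standing in for the "fourth value", and the attempt count are all assumptions:

```python
import time

MAX_RTT = 0.050                  # assumed "fourth value"; longer round trips are unusable
sent_at: dict[int, float] = {}   # request packet sequence number -> sending time

def estimate_offset(send_request, wait_response, attempts: int = 8):
    """Return (first terminal clock - second terminal clock), or None when
    time synchronization fails after all attempts."""
    for seq in range(attempts):
        sent_at[seq] = time.time()      # sending time, stored per sequence number
        send_request(seq)
        resp = wait_response()          # assumed to carry resp.seq and resp.first_system_time
        t2 = time.time()                # second system time (moment of reception)
        t3 = sent_at[resp.seq]          # third system time, looked up by sequence number
        rtt = t2 - t3                   # first time deviation = round-trip time
        if rtt <= MAX_RTT:              # usable: half the round trip is the one-way time
            fourth_system_time = resp.first_system_time + rtt / 2
            return fourth_system_time - t2
    return None                         # synchronization failed: prompt the anchor
```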
If this step determines that the first terminal is currently in the dual-stream live broadcast mode, the first terminal may execute step 302. If it is not in the dual-stream live broadcast mode, the first terminal may acquire the first video frame and the first audio frame it is currently capturing, display the first video frame, encode the first video frame and the first audio frame, encapsulate them according to a streaming media protocol into a single-stream live broadcast data packet, and push the single-stream live broadcast data packet to the streaming media server.
Step 302: if the first terminal is currently in the dual-stream live broadcast mode, it detects whether it is currently in a mic-connected state.
If it is currently in the dual-stream live broadcast mode, the first terminal may further detect whether it is currently in the mic-connected state. The first terminal may store a mic-connection variable indicating whether it is currently in that state. If the variable's value is a second value, the first terminal may determine that it is currently in the mic-connected state; otherwise it may determine that it is not.
Note that when the first terminal starts the live broadcast, the initial value of the mic-connection variable is a fifth value. When the anchor wants to co-stream with one of the viewers, the anchor can trigger the first terminal to send a mic-connection request to the co-host's third terminal through the mic-connection server, and if the first terminal then receives a message returned by the mic-connection server indicating that the connection succeeded, it may set the mic-connection variable to the second value. Alternatively, the first terminal may receive a mic-connection request sent by the co-host's third terminal through the mic-connection server and return a mic-connection response to the server; when it then receives the server's message indicating that the connection succeeded, it may set the mic-connection variable to the second value. Throughout the co-streaming session the variable keeps the second value, and after co-streaming ends the first terminal may set it back to the fifth value.
In the embodiments of the present application, the first value and the second value may be the same or different, as may the third value and the fifth value; however, the first value differs from the third value, and the second value differs from the fifth value.
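For concreteness, one assignment of these values consistent with the constraints above is sketched below; the concrete numbers are assumptions, not taken from the disclosure:

```python
DUAL_STREAM_ON = 1    # "first value":  dual-stream live broadcast mode active
DUAL_STREAM_OFF = 0   # "third value":  initial value when the broadcast starts
MIC_ON = 1            # "second value": mic-connected state active
MIC_OFF = 0           # "fifth value":  initial value when the broadcast starts

def branch(dual_stream_flag: int, mic_flag: int) -> str:
    """Mirrors the checks in steps 301 and 302."""
    if dual_stream_flag != DUAL_STREAM_ON:
        return "single-stream live broadcast"        # step 301 branch
    if mic_flag == MIC_ON:
        return "steps 303 and 304 (co-streaming)"    # step 302, mic-connected
    return "steps 305 and 306 (dual-stream only)"
```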
If it is determined that the first terminal is currently in the mic-connected state, the first terminal may perform steps 303 and 304; if it is not currently in the mic-connected state, it may perform steps 305 and 306.
Step 303: if the first terminal is in the mic-connected state, it acquires the first video frame and the first audio frame it is currently capturing, and acquires the second video frame and the second audio frame currently captured by the third terminal connected to it.
If the first terminal is currently in the mic-connected state, it may acquire the first video frame and the first audio frame it is currently capturing, and obtain from the mic-connection server the second video frame and the second audio frame that the third terminal captured and sent.
For example, the first terminal may send a data acquisition request to the mic-connection server; after receiving the request, the server may take the video frame and the audio frame most recently sent by the third terminal co-streaming with the first terminal and send them to the first terminal as the second video frame and the second audio frame.
Alternatively, in one possible implementation, whenever the mic-connection server receives a video frame and an audio frame from the third terminal, it may forward them to the first terminal, and the first terminal may take, from the frames forwarded by the server, the video frame and the audio frame whose timestamps indicate the time closest to its current system time as the second video frame and the second audio frame.
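The nearest-timestamp selection in this implementation reduces to a one-liner, sketched here with an assumed frame object exposing a timestamp attribute:

```python
import time

def nearest_frame(forwarded_frames):
    """Pick the forwarded frame whose timestamp is closest to the current
    system time; forwarded_frames is assumed non-empty."""
    now = time.time()
    return min(forwarded_frames, key=lambda f: abs(f.timestamp - now))
```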
Optionally, in the embodiment of the application, after acquiring the first video frame and the first audio frame, the first terminal may also encode them, encapsulate them with a streaming media protocol, and send them to the mic-connection server, so that on receiving the encapsulated first video frame and first audio frame the server forwards them to the third terminal co-streaming with the first terminal. The third terminal can then generate a frame for its own display from the received first video frame and the second video frame it captures itself, and play the received first audio frame.
Step 304: the first terminal composites the first video frame with the second video frame to obtain a composite video, mixes the first audio frame with the second audio frame to obtain mixed audio, and sends a first data packet containing the composite video and the mixed audio to the second terminal and the streaming media server.
After acquiring the first video frame and the second video frame, the first terminal may composite them to obtain a composite video that includes both the anchor's video and the co-host's video.
For example, the first terminal may determine a first boundary line and a second boundary line parallel to the height direction of the first video frame, the distance from the first boundary line to a first edge of the first video frame being equal to the distance from the second boundary line to a second edge of the first video frame, with both edges parallel to the height direction of the first video frame; crop, from the first video frame, the first video picture lying between the two boundary lines, whose width is smaller than and whose height equals that of the first video frame; and, if the height of the second video frame is the same as that of the first video frame, splice the second video frame onto the side of the first video picture where the first edge or the second edge lies.
The first video frame has a landscape resolution, that is, its pixel width is greater than its pixel height. Since the anchor is usually located in the central area of the frame during a live broadcast, the first terminal may crop the part of the frame in the central area from the first video frame and splice the cropped picture with the second video frame to obtain the composite video. For example, the first terminal may first determine a first boundary line parallel to the first edge, at a distance of one quarter of the first video frame's pixel width from it, and then a second boundary line parallel to the second edge, at the same distance from it. The video picture between the two boundary lines is thus half the pixel width of the first video frame, and this picture is taken as the cropped video picture.
While cropping the first video picture from the first video frame, the first terminal may also process the second video frame. If the pixel height of the second video frame is the same as that of the first video frame, the first terminal can directly splice the second video frame onto the side where the first edge, or the second edge, of the first video picture lies, thereby obtaining the composite video.
Optionally, when the pixel height of the second video frame equals that of the first video frame but its pixel width is greater than that of the first video frame, the first terminal may also crop the second video frame and splice the cropped picture with the first video picture to obtain the composite video.
Optionally, if the pixel height of the second video frame is smaller than that of the first video frame, the first terminal may enlarge the second video frame until the two heights match, and then splice the enlarged second video frame with the first video picture to obtain the composite video.
Optionally, if the pixel height of the second video frame is greater than that of the first video frame, the first terminal may shrink the second video frame until the two heights match, and then splice the shrunken second video frame with the first video picture to obtain the composite video.
Optionally, in one possible implementation, the first terminal may leave the first video frame uncropped and instead, after receiving the second video frame, shrink it and overlay it on a preset area of the first video frame to obtain the composite video.
While generating the composite video based on the first video frame and the second video frame, the first terminal can also mix the first audio frame and the second audio frame to obtain the mixed audio.
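The patent does not specify a mixing algorithm; as an illustration only, a simple mixer might average two equal-length 16-bit PCM sample arrays and clip the result:

```python
import numpy as np

def mix_audio(first_audio: np.ndarray, second_audio: np.ndarray) -> np.ndarray:
    # Sum in a wider type, halve, then clip back to the 16-bit range.
    mixed = first_audio.astype(np.int32) + second_audio.astype(np.int32)
    return np.clip(mixed // 2, -32768, 32767).astype(np.int16)
```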
After obtaining the composite video and the mixed audio, the first terminal can encode and encapsulate them to obtain a first data packet and send the first data packet to the second terminal, so that the second terminal can process the composite video into a processed video suitable for display on a portrait terminal and push the processed video and the mixed audio to the streaming media server. Meanwhile, the first terminal can also send the first data packet to the streaming media server, so that the server can push the composite video and the mixed audio to the landscape terminals of viewer users.
Optionally, in this embodiment of the application, the first data packet may further include the second video frame. That is, after obtaining the composite video and the mixed audio, the first terminal may encode and encapsulate the composite video, the mixed audio, and the second video frame into a first data packet and send it to the second terminal and the streaming media server.
Alternatively, in one possible scenario, the first terminal may send different data packets to the second terminal and the streaming media server. Specifically, the first terminal may encode and encapsulate the composite video and the mixed audio and send the result to the streaming media server, while for the second terminal it may encode and encapsulate the mixed audio and the second video frame and send that packet instead.
It should be noted that, in order to notify the second terminal that the first terminal is currently in the mic-connected state, the first terminal may further carry, in the header of the first data packet, a mic-connection identifier indicating that the first terminal is currently in the mic-connected state.
Step 305: if the first terminal is not in the mic-connected state, the first terminal acquires the first video frame and the first audio frame currently captured by the first terminal.
If the first terminal is not currently in the mic-connected state, that is, the first terminal and the second terminal are currently only in the dual-stream live broadcast mode, the first terminal can acquire the first video frame and the first audio frame it is currently capturing, so as to push them to viewer users holding landscape terminals.
The first video frame and the first audio frame carry the same timestamp, which indicates the time at which the first terminal captured them.
Step 306: the first terminal sends an audio data packet containing the first audio frame to the second terminal, and sends a third data packet containing the first video frame and the first audio frame to the streaming media server.
After acquiring the currently captured first video frame and first audio frame, the first terminal can encode them and encapsulate them through a streaming media protocol to obtain a third data packet containing the first video frame and the first audio frame, and send the third data packet to the streaming media server. On receiving the third data packet, the streaming media server pushes the first video frame and the first audio frame to viewer users holding landscape terminals.
Meanwhile, considering that the device performance of the first terminal is generally better than that of the second terminal, and that the network traffic cost of the second terminal is higher, in this embodiment of the application the first terminal collects audio while the second terminal does not. On this basis, after acquiring the first audio frame, the first terminal may encode and encapsulate it to obtain an audio data packet containing the first audio frame, and send the audio data packet to the second terminal, thereby sharing the first audio frame with the second terminal.
Optionally, since the first terminal is not in the mic-connected state at this time, in order to notify the second terminal of this, the first terminal may further carry, in the header of the audio data packet, a non-mic-connection identifier indicating that the first terminal is currently not in the mic-connected state.
Step 307: the second terminal determines whether the received data packet sent by the first terminal is a first data packet or an audio data packet.
When the first terminal and the second terminal are in the dual-stream live broadcast mode, the first terminal sends a data packet to the second terminal regardless of whether it is currently in the mic-connected state. The data packet may be a first data packet, sent in the mic-connected state and containing the composite video and the mixed audio, or an audio data packet, sent in the non-mic-connected state and containing only the first audio frame captured by the first terminal.
On this basis, when the second terminal receives a data packet, it may directly parse the packet: if parsing yields the composite video and the mixed audio, the received packet is determined to be the first data packet; if parsing yields only the first audio frame without the composite video and the mixed audio, the received packet is determined to be the audio data packet.
Optionally, as described above, the first terminal may carry a mic-connection identifier in the header of the first data packet to indicate that it is currently in the mic-connected state, and a non-mic-connection identifier in the header of the audio data packet to indicate that it is not. The second terminal can therefore simply inspect the header of a received packet: if the header carries the mic-connection identifier, the packet is determined to be the first data packet; if the header carries the non-mic-connection identifier, the packet is determined to be not the first data packet but an audio data packet containing the first audio frame.
Optionally, in this embodiment of the application, the second terminal may also store a mic-connection variable. In this way, when the second terminal detects that the header of a received data packet carries the mic-connection identifier, it may set the stored mic-connection variable to the second value; otherwise, it may set the stored mic-connection variable to the fifth value.
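A minimal sketch of this header-based classification; the flag field and its values are illustrative, since the patent does not fix a concrete encoding:

```python
MIC_CONNECTED = 1      # illustrative value for the mic-connection identifier
NOT_MIC_CONNECTED = 0  # illustrative value for the non-mic-connection identifier

def classify_packet(header: dict) -> str:
    # Inspect the header flag set by the first terminal instead of
    # fully parsing the payload.
    if header.get("mic_flag") == MIC_CONNECTED:
        return "first_data_packet"   # composite video + mixed audio
    return "audio_data_packet"       # first audio frame only
```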
If it is determined through this step that the second terminal has received the first data packet, steps 308 to 310 are performed; if it is determined that the second terminal has received the audio data packet, steps 311 and 312 are performed.
Step 308: if the second terminal receives the first data packet, the second terminal processes the first data packet to obtain the mixed audio and a processed video that conforms to the aspect ratio of the display screen of the second terminal.
If the second terminal receives the first data packet, it can be determined that the first terminal is currently in the mic-connected state. Because the video frames captured by the second terminal itself do not include an image of the co-streaming participant, the second terminal can obtain the composite video from the first data packet and process it to obtain a processed video that conforms to the aspect ratio of the display screen of the second terminal and includes the co-streaming participant.
Illustratively, the second terminal may scale down the composite video so that its width equals the width of the display screen of the second terminal; splice, according to the height of the display screen, a first blank picture on the side where a first edge of the scaled video is located and a second blank picture on the side where a second edge of the scaled video is located, where the first edge and the second edge are parallel to the width direction of the scaled video, the height of the first blank picture is the same as that of the second blank picture, and the sum of the heights of the first blank picture, the second blank picture, and the scaled video equals the height of the display screen; and fill the first blank picture and the second blank picture in the spliced video with a background color, taking the filled video as the processed video.
It should be noted that, since the pixel width of the composite video is greater than that of the display screen of the second terminal, after obtaining the composite video the second terminal may first shrink it proportionally until its pixel width equals that of the display screen. The second terminal may then compute the height difference between the pixel height of the scaled composite video and that of the display screen, splice on the upper edge a first blank picture whose pixel height is half of this difference and whose width equals that of the scaled video, and splice on the lower edge a second blank picture with the same pixel height and pixel width as the first. The sum of the pixel heights of the first blank picture, the scaled composite video, and the second blank picture then equals the pixel height of the display screen, and the pixel width of the spliced picture equals the pixel width of the display screen. Finally, the first and second blank pictures are filled with a background color, and the filled picture can be used as the processed video.
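A sketch of this letterboxing step, assuming OpenCV-style frames and that the scaled composite video is shorter than the portrait screen (names are illustrative):

```python
import cv2
import numpy as np

def fit_portrait_screen(composite: np.ndarray, screen_w: int, screen_h: int,
                        bg_color=(0, 0, 0)) -> np.ndarray:
    h, w = composite.shape[:2]
    # Shrink proportionally so the width equals the screen width.
    new_h = int(h * screen_w / w)
    scaled = cv2.resize(composite, (screen_w, new_h))
    # Split the remaining height into two blank pictures, top and
    # bottom, and fill them with the background colour.
    pad = screen_h - new_h
    top, bottom = pad // 2, pad - pad // 2
    return cv2.copyMakeBorder(scaled, top, bottom, 0, 0,
                              cv2.BORDER_CONSTANT, value=bg_color)
```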
In addition, as described above, the second terminal does not capture audio but shares the audio captured by the first terminal; therefore, while obtaining the composite video from the first data packet, the second terminal may also obtain the mixed audio from it.
Optionally, as described in step 304, the first data packet may further include the second video frame, or the first terminal may encapsulate the second video frame together with the mixed audio before sending it to the second terminal. In this case, the second terminal may obtain the second video frame from the first data packet, acquire a third video frame it is currently capturing, and composite the second video frame with the third video frame to obtain a processed video that conforms to the aspect ratio of the display screen of the second terminal; the mixed audio is likewise obtained from the first data packet. For the process of compositing the second video frame with the currently captured third video frame, reference may be made to the process of compositing the first video frame with the second video frame described above, which is not repeated here.
Step 309: the second terminal displays the processed video and sends a second data packet containing the processed video and the mixed audio to the streaming media server.
After obtaining the processed video and the mixed audio, the second terminal may display the processed video. The first video frame carries a timestamp, so the composite video generated from it carries the same timestamp, as does the processed video derived from the composite video; likewise, if the processed video is obtained by compositing the second video frame with the third video frame, it carries the timestamp of the second video frame. On this basis, the second terminal may extract the timestamp from the processed video, record the current system time of the second terminal, and determine the display time of the processed video based on the frame interval time of the video frames and the current system time. If the display time is later than the time indicated by the timestamp, the second terminal displays the processed video at the current moment; if the display time is earlier than the time indicated by the timestamp, the second terminal delays displaying the processed video.
It should be noted that the display time obtained by adding the frame interval time to the current system time is the theoretically calculated display time of the processed video, whereas the time indicated by the timestamp of the processed video is its real display time. The two are then compared. If the display time is later than the time indicated by the timestamp, the real display time has already passed, and the second terminal can no longer delay but must display the processed video immediately. If the display time equals the time indicated by the timestamp, the calculated display time matches the real one, and the second terminal may display the processed video when the display time arrives. If the display time is earlier than the time indicated by the timestamp, displaying at the calculated time would be too early, so the second terminal may sleep for a period and display the processed video afterwards.
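The timing rule can be sketched as follows; the frame rate is an assumed example, and the timestamp is taken to be in the same clock units as the system time:

```python
import time

FRAME_INTERVAL = 1 / 25  # assumed 25 fps stream

def display_moment(frame_timestamp: float) -> float:
    now = time.time()
    display_time = now + FRAME_INTERVAL  # theoretical display time
    if display_time < frame_timestamp:
        # Too early: sleep until the real display time indicated
        # by the timestamp.
        time.sleep(frame_timestamp - now)
        return frame_timestamp
    # The real display time has passed (or matches): display now.
    return now
```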
After displaying the processed video, the second terminal can also encode the processed video and the mixed audio, encapsulate them through a streaming media protocol to obtain a second data packet, and send the second data packet to the streaming media server, so that the server can push the processed video and the mixed audio to viewer users holding portrait terminals. The mixed audio is obtained by mixing the first audio frame captured by the first terminal with the second audio frame captured by the third terminal, so it carries the timestamp of the first audio frame; and since, as described above, the timestamp of the first audio frame is the same as that of the first video frame, the timestamp of the mixed audio is the same as that of the processed video. In other words, because the two timestamps coincide, the second terminal does not need to perform additional time synchronization between the mixed audio and the processed video, which reduces the processing complexity of the second terminal.
Step 310: the streaming media server receives the first data packet sent by the first terminal and the second data packet sent by the second terminal, sends the first data packet to landscape terminals, and sends the second data packet to portrait terminals.
Here, a landscape terminal refers to a terminal held by a viewer user and displaying in landscape orientation, and a portrait terminal refers to a terminal held by a viewer user and displaying in portrait orientation.
Since the first data packet contains the composite video, which conforms to the aspect ratio of a landscape terminal, and the second data packet contains the processed video, which conforms to the aspect ratio of a portrait terminal, the streaming media server can, after receiving the two packets, push different videos to different viewer users according to the terminals they hold.
Step 311: if the second terminal receives the audio data packet, the second terminal acquires the first audio frame in the audio data packet and the third video frame it is currently capturing, and sends a fourth data packet containing the first audio frame and the third video frame to the streaming media server.
If the data packet received by the second terminal is not the first data packet but an audio data packet, it can be determined that the first terminal is not currently in the mic-connected state. In this case, the second terminal may obtain the first audio frame from the audio data packet and acquire the third video frame it is currently capturing.
After obtaining the first audio frame and the third video frame, the second terminal can directly display the third video frame, encode and encapsulate the first audio frame and the third video frame to obtain a fourth data packet, and send the fourth data packet to the streaming media server, so that the server can push the third video frame and the first audio frame to viewer users holding portrait terminals.
In addition, since the first audio frame is captured by the first terminal while the third video frame is captured by the second terminal, the second terminal may synchronize the two. Illustratively, the second terminal may record the acquisition time at which the third video frame was captured, determine a second timestamp for the third video frame based on that acquisition time and the time offset between the system time of the first terminal and the system time of the second terminal, take the second timestamp as the timestamp of the third video frame, and display the third video frame based on the second timestamp. The second terminal may then send to the streaming media server a fourth data packet containing the first audio frame and the third video frame, where the third video frame carries the second timestamp.
As noted in step 301, after the first terminal and the second terminal are successfully paired, the second terminal determines the time offset between the system time of the first terminal and its own system time. In this step, the second terminal can therefore obtain that time offset and subtract it from the current system time to obtain the second timestamp, whose indicated time is aligned with the first terminal's clock once the offset between the two system times has been removed. Taking the second timestamp as the timestamp of the third video frame, the second terminal displays the third video frame with reference to the method described in step 309, then encodes and encapsulates the third video frame carrying the second timestamp together with the first audio frame carrying the first timestamp to obtain the fourth data packet, and sends the fourth data packet to the streaming media server, so that the server can push the third video frame and the first audio frame to viewer users holding portrait terminals.
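As a minimal sketch (names are illustrative), the timestamp correction above amounts to removing the measured clock offset from the second terminal's capture time:

```python
def third_frame_timestamp(capture_time: float, time_offset: float) -> float:
    # time_offset is assumed to be (second terminal clock - first
    # terminal clock), so subtracting it maps the capture time onto
    # the first terminal's timeline, aligning the third video frame
    # with the first audio frame.
    return capture_time - time_offset
```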
Step 312: the streaming media server receives the third data packet sent by the first terminal and the fourth data packet sent by the second terminal, pushes the third data packet to landscape terminals, and pushes the fourth data packet to portrait terminals.
As above, a landscape terminal refers to a terminal held by a viewer user and displaying in landscape orientation, and a portrait terminal refers to one displaying in portrait orientation.
Since the third data packet contains the first video frame, which conforms to the aspect ratio of a landscape terminal, and the fourth data packet contains the third video frame, which conforms to the aspect ratio of a portrait terminal, the streaming media server can, after receiving the two packets, push different videos to different viewer users according to the terminals they hold.
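A sketch of this routing rule, with a hypothetical viewer registry keyed by screen orientation (the packet fields shown are assumptions, not a format defined by the patent):

```python
def route_packet(packet: dict, viewers: dict) -> None:
    # Landscape-ratio packets (first/third) go to landscape viewer
    # terminals; portrait-ratio packets (second/fourth) go to
    # portrait viewer terminals.
    target = "landscape" if packet["aspect_ratio"] > 1 else "portrait"
    for connection in viewers[target]:
        connection.send(packet["payload"])
```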
In the embodiment of the application, when in the dual-stream live broadcast mode and in the mic-connected state, the first terminal may composite the first video frame it captures with the second video frame captured by the third terminal co-streaming with it, mix the first audio frame it captures with the second audio frame captured by the third terminal, and send the resulting composite video and mixed audio to the second terminal. The second terminal may then process the composite video to obtain a processed video that includes the third terminal's picture and conforms to the aspect ratio of the display screen of the second terminal, and push the processed video and the mixed audio to the streaming media server. The streaming media server pushes the processed video to viewer users holding portrait terminals, so that in the dual-stream live broadcast mode both viewer users holding landscape terminals and viewer users holding portrait terminals can watch the composite video of the anchor and the co-streaming participant.
In addition, in this embodiment of the application, the first terminal performs audio capture and shares the captured audio with the second terminal. In this case, the second terminal can determine the time offset between its own system time and that of the first terminal by sending timing request packets when the dual-stream live broadcast starts. Thus, when in the dual-stream live broadcast mode but not in the mic-connected state, after obtaining the audio captured by the first terminal, the second terminal can use this time offset to align the timestamps of the first terminal's audio and its own video, thereby achieving synchronization between the audio and the video.
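The timing handshake summarized above can be sketched NTP-style; the transport callbacks are hypothetical stand-ins for the request and response packets of step 301:

```python
import time

def measure_offset(send_request, recv_response) -> float:
    t_send = time.time()          # third system time (second terminal, on send)
    seq = send_request()          # timing request packet with a sequence number
    t_first = recv_response(seq)  # first system time carried in the response
    t_recv = time.time()          # second system time (second terminal, on receive)
    # Assuming symmetric network delay, the first terminal stamped its
    # clock halfway through the round trip; the difference estimates the
    # offset of the second terminal's clock relative to the first's.
    return (t_send + t_recv) / 2.0 - t_first
```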
Fig. 4 shows a block diagram of a terminal 400 for live broadcasting according to an exemplary embodiment of the present application. When the terminal is a landscape terminal, the terminal 400 may be a laptop computer, a desktop computer, or the like; when the terminal is a portrait terminal, it may be a smartphone, a tablet computer, or the like.
Generally, the terminal 400 includes: a processor 401 and a memory 402.
The processor 401 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 401 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 401 may also include a main processor and a coprocessor: the main processor, also known as a CPU (Central Processing Unit), is a processor for processing data in the wake-up state, and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 401 may be integrated with a GPU (Graphics Processing Unit) responsible for rendering the content to be displayed. In some embodiments, the processor 401 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 402 may include one or more computer-readable storage media, which may be non-transitory. The memory 402 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 402 stores at least one instruction. When the terminal is the first terminal in the above embodiments, the at least one instruction is executed by the processor 401 to implement the steps performed by the first terminal in the live broadcast method provided by the method embodiments of the present application; when the terminal is the second terminal, the at least one instruction is executed by the processor 401 to implement the steps performed by the second terminal.
In some embodiments, the terminal 400 may further optionally include: a peripheral interface 403 and at least one peripheral. The processor 401, memory 402 and peripheral interface 403 may be connected by bus or signal lines. Each peripheral may be connected to the peripheral interface 403 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 404, touch screen display 405, camera 406, audio circuitry 407, positioning components 408, and power supply 409.
The peripheral interface 403 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 401 and the memory 402. In some embodiments, processor 401, memory 402, and peripheral interface 403 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 401, the memory 402 and the peripheral interface 403 may be implemented on a separate chip or circuit board, which is not limited by this embodiment.
The Radio Frequency circuit 404 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 404 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 404 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 404 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 404 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, generations of mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 404 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 405 may be configured to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display screen 405 is a touch screen, it also has the ability to capture touch signals on or above its surface; these touch signals may be input to the processor 401 as control signals for processing, and the display screen 405 may then also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 405, disposed on the front panel of the terminal 400; in other embodiments, there may be at least two display screens 405, each disposed on a different surface of the terminal 400 or in a folded design; in still other embodiments, the display screen 405 may be a flexible display disposed on a curved or folded surface of the terminal 400, and it may even be configured in a non-rectangular, irregular shape. The display screen 405 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 406 is used to capture images or video. Optionally, camera assembly 406 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 406 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuit 407 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 401 for processing, or inputting the electric signals to the radio frequency circuit 404 for realizing voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 400. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 401 or the radio frequency circuit 404 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 407 may also include a headphone jack.
The positioning component 408 is used to locate the current geographic location of the terminal 400 to implement navigation or LBS (Location Based Service). The positioning component 408 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 409 is used to supply power to the various components in the terminal 400. The power source 409 may be alternating current, direct current, disposable or rechargeable. When the power source 409 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 400 also includes one or more sensors 410. The one or more sensors 410 include, but are not limited to: acceleration sensor 411, gyro sensor 412, pressure sensor 413, fingerprint sensor 414, optical sensor 415, and proximity sensor 416.
The acceleration sensor 411 may detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal 400. For example, the acceleration sensor 411 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 401 may control the touch display screen 405 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 411. The acceleration sensor 411 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 412 may detect a body direction and a rotation angle of the terminal 400, and the gyro sensor 412 may cooperate with the acceleration sensor 411 to acquire a 3D motion of the terminal 400 by the user. From the data collected by the gyro sensor 412, the processor 401 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 413 may be disposed on a side bezel of the terminal 400 and/or a lower layer of the touch display screen 405. When the pressure sensor 413 is disposed on the side frame of the terminal 400, a user's holding signal to the terminal 400 can be detected, and the processor 401 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 413. When the pressure sensor 413 is disposed at the lower layer of the touch display screen 405, the processor 401 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 405. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 414 is used to collect a user's fingerprint, and the processor 401 identifies the user's identity from the fingerprint collected by the fingerprint sensor 414, or the fingerprint sensor 414 itself identifies the user's identity from the collected fingerprint. When the user's identity is recognized as trusted, the processor 401 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 414 may be disposed on the front, back, or side of the terminal 400; when a physical button or a vendor logo is disposed on the terminal 400, the fingerprint sensor 414 may be integrated with the physical button or the vendor logo.
The optical sensor 415 is used to collect the ambient light intensity. In one embodiment, the processor 401 may control the display brightness of the touch display screen 405 based on the ambient light intensity collected by the optical sensor 415. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 405 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 405 is turned down. In another embodiment, the processor 401 may also dynamically adjust the shooting parameters of the camera assembly 406 according to the ambient light intensity collected by the optical sensor 415.
A proximity sensor 416, also known as a distance sensor, is typically disposed on the front panel of the terminal 400. The proximity sensor 416 is used to collect the distance between the user and the front surface of the terminal 400. In one embodiment, when the proximity sensor 416 detects that the distance between the user and the front surface of the terminal 400 gradually decreases, the processor 401 controls the touch display screen 405 to switch from the bright screen state to the dark screen state; when the proximity sensor 416 detects that the distance between the user and the front surface of the terminal 400 gradually becomes larger, the processor 401 controls the touch display screen 405 to switch from the breath screen state to the bright screen state.
That is, the embodiments of the present application not only provide a live broadcast terminal including a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the steps performed by the first terminal in the embodiments shown in Figs. 2 and 3 when the terminal 400 is the first terminal, and the steps performed by the second terminal when the terminal 400 is the second terminal, but also provide a computer-readable storage medium storing a computer program which, when executed by the processor, implements the live broadcast method in the embodiments shown in Figs. 2 to 3.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (12)

1. A live broadcast method, the method comprising:
the method comprises the steps that a first terminal detects whether the first terminal is currently in a dual-stream live broadcast mode and whether the first terminal is currently in a mic-connected state, wherein the dual-stream live broadcast mode means that the first terminal and a second terminal each push one path of video to a streaming media server, the second terminal is a terminal performing dual-stream live broadcast with the first terminal, and the first terminal and the second terminal are used for collecting video during the anchor's live broadcast; the aspect ratio of the path of video pushed by the first terminal is greater than 1 and the aspect ratio of the path of video pushed by the second terminal is less than 1, or the aspect ratio of the path of video pushed by the first terminal is less than 1 and the aspect ratio of the path of video pushed by the second terminal is greater than 1;
if the first terminal is currently in the dual-stream live broadcast mode and in the mic-connected state, the first terminal acquires a first video frame and a first audio frame currently captured by the first terminal, and acquires a second video frame and a second audio frame currently captured by a third terminal co-streaming with the first terminal;
the first terminal synthesizes the first video frame and the second video frame to obtain a synthesized video, mixes the first audio frame and the second audio frame to obtain mixed audio, and sends a first data packet containing the synthesized video and the mixed audio to the second terminal and a streaming media server;
when the second terminal receives the first data packet, processing the first data packet to obtain the mixed audio and a processed video that conforms to the aspect ratio of a display screen of the second terminal;
and the second terminal displays the processed video and sends a second data packet containing the processed video and the mixed audio to a streaming media server.
2. The method of claim 1, wherein the detecting, by the first terminal, whether the first terminal is currently in a dual-stream live broadcast mode and whether the first terminal is currently in a mic-connected state comprises:
the first terminal detects whether a current dual-stream live broadcast variable is a first numerical value and whether a current mic-connection variable is a second numerical value;
and if the current dual-stream live broadcast variable is the first numerical value and the current mic-connection variable is the second numerical value, determining that the first terminal is currently in the dual-stream live broadcast mode and currently in the mic-connected state.
3. The method of claim 1, wherein the first terminal is a landscape terminal, and wherein the first terminal combines the first video frame and the second video frame, comprising:
the first terminal determines a first boundary and a second boundary which are parallel to the height direction of the first video frame, wherein the distance from the first boundary to a first edge of the first video frame is equal to the distance from the second boundary to a second edge of the first video frame, and the first edge and the second edge are both parallel to the height direction of the first video frame;
intercepting a first video picture between the first boundary line and the second boundary line from the first video frame, wherein the width of the first video picture is smaller than the width of the first video frame, and the height of the first video picture is equal to the height of the first video frame;
and if the height of the second video frame is the same as that of the first video frame, splicing the second video frame at one side of the first edge or the second edge of the first video picture.
4. The method of claim 1, wherein the processing the first packet comprises:
acquiring the mixed audio and the synthesized video in the first data packet, and scaling down the synthesized video so that the width of the synthesized video is equal to the width of a display screen of the second terminal;
splicing, according to the height of the display screen of the second terminal, a first blank picture on the side where a first edge of the scaled video is located, and a second blank picture on the side where a second edge of the scaled video is located;
the first edge and the second edge are parallel to the width direction of the scaled video, the height of the first blank picture is the same as the height of the second blank picture, and the sum of the heights of the first blank picture, the second blank picture, and the scaled video is equal to the height of the display screen of the second terminal;
and filling background colors in the first blank picture and the second blank picture in the spliced video, and taking the filled video as the processed video.
5. The method of claim 1, wherein the composite video and the mixed audio carry the same time stamp, and wherein the processed video carries the time stamp of the composite video.
6. The method of claim 5, wherein the second terminal displays the processed video, comprising:
extracting a timestamp from the processed video;
recording the current system time, and determining the display time of the processed video based on the frame interval time of the video frame and the current system time;
if the display time is later than the time indicated by the timestamp, displaying the processed video at the current moment;
and if the display time is earlier than the time indicated by the timestamp, delaying the display of the processed video.
7. The method of claim 1, further comprising:
if it is detected that the first terminal is currently in the dual-stream live broadcast mode and is not currently in the mic-connected state, acquiring, by the first terminal, a first video frame and a first audio frame currently captured by the first terminal;
the first terminal sends an audio data packet containing the first audio frame to the second terminal, and sends a third data packet containing the first video frame and the first audio frame to the streaming media server, wherein the first audio frame carries a first time stamp;
when the second terminal receives the audio data packet, the second terminal acquires the first audio frame in the audio data packet, acquires a third video frame currently acquired by the second terminal, and records the acquisition time of the third video frame;
determining a second timestamp of the third video frame based on the acquisition time of the third video frame and a time offset between the system time of the first terminal and the system time of the second terminal;
displaying the third video frame based on the second timestamp;
and sending a fourth data packet containing the first audio frame and the third video frame to the streaming media server, wherein the third video frame carries the second timestamp.
8. The method of claim 7, further comprising:
the second terminal sends at least one timing request packet to the first terminal, and correspondingly stores a request packet sequence number and sending time of each timing request packet in the at least one timing request packet, wherein the sending time refers to system time of the second terminal when the corresponding timing request packet is sent, and the request packet sequence number is used for identifying the corresponding timing request packet;
when the first terminal receives a target timing request packet, sending a target timing response packet for the received target timing request packet to the second terminal, wherein the target timing request packet refers to any one of the at least one timing request packet, and the target timing response packet carries a request packet sequence number of the target timing request packet and a current first system time of the first terminal;
when the second terminal receives the target timing response packet, the second terminal records a second system time of the second terminal when receiving the target timing response packet;
the second terminal acquires, based on the request packet sequence number carried in the target timing response packet, the sending time corresponding to that sequence number from the stored correspondence between request packet sequence numbers and sending times, thereby obtaining a third system time;
determining a time offset between the system time of the first terminal and the system time of the second terminal based on the first system time, the second system time, and the third system time.
9. The method of claim 1, further comprising:
if it is detected that the first terminal is currently neither in the dual-stream live broadcast mode nor in the mic-connected state, acquiring a first video frame and a first audio frame currently captured by the first terminal;
and playing the first video frame, and sending a single-stream live broadcast data packet containing the first video frame and the first audio frame to the streaming media server.
10. The method according to any one of claims 1 to 9, wherein a header of the first data packet carries a mic-connection identifier, and the mic-connection identifier is used to notify the second terminal that the first terminal is currently in the mic-connected state.
11. A live broadcast system, wherein the live broadcast system comprises: a first terminal, a second terminal, a third terminal, a streaming media server, and a mic-connection server;
the first terminal is used for detecting whether the first terminal is currently in a dual-stream live broadcast mode and whether the first terminal is currently in a mic-connected state, wherein the dual-stream live broadcast mode means that the first terminal and the second terminal each push one path of video to the streaming media server, the second terminal is a terminal performing dual-stream live broadcast with the first terminal, and the first terminal and the second terminal are used for collecting video during the anchor's live broadcast; the aspect ratio of the path of video pushed by the first terminal is greater than 1 and the aspect ratio of the path of video pushed by the second terminal is less than 1, or the aspect ratio of the path of video pushed by the first terminal is less than 1 and the aspect ratio of the path of video pushed by the second terminal is greater than 1; if the first terminal is currently in the dual-stream live broadcast mode and in the mic-connected state, the first terminal acquires a first video frame and a first audio frame currently captured by the first terminal, and acquires a second video frame and a second audio frame currently captured by a third terminal co-streaming with the first terminal; synthesizes the first video frame and the second video frame, mixes the first audio frame and the second audio frame, and sends a first data packet containing the synthesized video and the mixed audio to the second terminal and the streaming media server;
the second terminal is used for receiving the first data packet, obtaining the synthesized video and the mixed audio contained in the first data packet, and processing the synthesized video to obtain a processed video that conforms to the aspect ratio of a display screen of the second terminal; the second terminal displays the processed video and sends a second data packet containing the processed video and the mixed audio to the streaming media server;
the third terminal is used for sending the currently captured second video frame and second audio frame to the mic-connection server;
the mic-connection server is used for receiving the second video frame and the second audio frame and sending the second video frame and the second audio frame to the first terminal;
the streaming media server is configured to receive the first data packet and the second data packet, and send the first data packet or the second data packet to other terminals except the first terminal and the second terminal.
12. The system of claim 11,
the first terminal is further used for acquiring a first video frame and a first audio frame currently captured by the first terminal if the first terminal detects that it is currently in the dual-stream live broadcast mode and not currently in the mic-connected state; and for sending an audio data packet containing the first audio frame to the second terminal, and sending a third data packet containing the first video frame and the first audio frame to the streaming media server, wherein the first audio frame carries a first timestamp;
the second terminal is further configured to receive the audio data packet, acquire the first audio frame in the audio data packet, acquire a third video frame currently acquired by the second terminal, and record acquisition time of the third video frame; determining a second timestamp of the third video frame based on the acquisition time of the third video frame and a time offset between the system time of the first terminal and the system time of the second terminal; displaying the third video frame based on the second timestamp; sending a fourth data packet containing the first audio frame and the third video frame to the streaming media server, wherein the third video frame carries the second timestamp;
the streaming media server is further configured to receive the third data packet and the fourth data packet, and send the third data packet or the fourth data packet to other terminals except the first terminal and the second terminal.
CN201810943503.0A 2018-08-17 2018-08-17 Live broadcasting method and system Active CN108900859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810943503.0A CN108900859B (en) 2018-08-17 2018-08-17 Live broadcasting method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810943503.0A CN108900859B (en) 2018-08-17 2018-08-17 Live broadcasting method and system

Publications (2)

Publication Number Publication Date
CN108900859A CN108900859A (en) 2018-11-27
CN108900859B true CN108900859B (en) 2020-07-10

Family

ID=64354523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810943503.0A Active CN108900859B (en) 2018-08-17 2018-08-17 Live broadcasting method and system

Country Status (1)

Country Link
CN (1) CN108900859B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110062252A (en) * 2019-04-30 2019-07-26 广州酷狗计算机科技有限公司 Live broadcasting method, device, terminal and storage medium
CN110267064B (en) * 2019-06-12 2021-11-12 百度在线网络技术(北京)有限公司 Audio playing state processing method, device, equipment and storage medium
CN110505489A (en) * 2019-08-08 2019-11-26 咪咕视讯科技有限公司 Method for processing video frequency, communication equipment and computer readable storage medium
CN110493610A (en) * 2019-08-14 2019-11-22 北京达佳互联信息技术有限公司 Method, apparatus, electronic equipment and the storage medium of chatroom unlatching video pictures
CN110602521B (en) * 2019-10-10 2020-10-30 广州华多网络科技有限公司 Method, system, computer readable medium and device for measuring mixed drawing time delay
CN110740346B (en) * 2019-10-23 2022-04-22 北京达佳互联信息技术有限公司 Video data processing method, device, server, terminal and storage medium
CN111083507B (en) * 2019-12-09 2021-11-23 广州酷狗计算机科技有限公司 Method and system for connecting to wheat, first main broadcasting terminal, audience terminal and computer storage medium
CN111654736B (en) * 2020-06-10 2022-05-31 北京百度网讯科技有限公司 Method and device for determining audio and video synchronization error, electronic equipment and storage medium
CN111726695B (en) * 2020-07-02 2022-07-05 聚好看科技股份有限公司 Display device and audio synthesis method
CN112291579A (en) * 2020-10-26 2021-01-29 北京字节跳动网络技术有限公司 Data processing method, device, equipment and storage medium
CN113573117A (en) * 2021-07-15 2021-10-29 广州方硅信息技术有限公司 Video live broadcast method and device and computer equipment
CN114095772B (en) * 2021-12-08 2024-03-12 广州方硅信息技术有限公司 Virtual object display method, system and computer equipment under continuous wheat direct sowing
CN117560538B (en) * 2024-01-12 2024-03-22 江西微博科技有限公司 Service method of interactive voice video based on cloud platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9628754B2 (en) * 2015-04-02 2017-04-18 Telepresence Technologies, Llc TelePresence architectural systems and methods therefore

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105812951A (en) * 2016-03-24 2016-07-27 广州华多网络科技有限公司 Stream media data interaction method, terminal, server and system
CN106161955A (en) * 2016-08-16 2016-11-23 天脉聚源(北京)传媒科技有限公司 A kind of live image pickup method and device
CN106454404A (en) * 2016-09-29 2017-02-22 广州华多网络科技有限公司 Live video playing method, device and system
CN107027048A (en) * 2017-05-17 2017-08-08 广州市千钧网络科技有限公司 A kind of live even wheat and the method and device of information displaying
CN108093268A (en) * 2017-12-29 2018-05-29 广州酷狗计算机科技有限公司 The method and apparatus being broadcast live
CN108401194A (en) * 2018-04-27 2018-08-14 广州酷狗计算机科技有限公司 Timestamp determines method, apparatus and computer readable storage medium

Also Published As

Publication number Publication date
CN108900859A (en) 2018-11-27

Similar Documents

Publication Publication Date Title
CN108900859B (en) Live broadcasting method and system
CN109348247B (en) Method and device for determining audio and video playing time stamp and storage medium
CN108093268B (en) Live broadcast method and device
CN108401124B (en) Video recording method and device
CN109191549B (en) Method and device for displaying animation
CN111083507B (en) Method and system for connecting to wheat, first main broadcasting terminal, audience terminal and computer storage medium
CN108966008B (en) Live video playback method and device
CN109874043B (en) Video stream sending method, video stream playing method and video stream playing device
CN111918090B (en) Live broadcast picture display method and device, terminal and storage medium
CN109413453B (en) Video playing method, device, terminal and storage medium
CN110324689B (en) Audio and video synchronous playing method, device, terminal and storage medium
CN108769738B (en) Video processing method, video processing device, computer equipment and storage medium
CN110418152B (en) Method and device for carrying out live broadcast prompt
CN111586431B (en) Method, device and equipment for live broadcast processing and storage medium
CN112929654B (en) Method, device and equipment for detecting sound and picture synchronization and storage medium
CN111464830A (en) Method, device, system, equipment and storage medium for image display
CN109451248B (en) Video data processing method and device, terminal and storage medium
CN113271470B (en) Live broadcast wheat connecting method, device, terminal, server and storage medium
CN112118477A (en) Virtual gift display method, device, equipment and storage medium
CN110958464A (en) Live broadcast data processing method and device, server, terminal and storage medium
CN110149491B (en) Video encoding method, video decoding method, terminal and storage medium
CN111010588B (en) Live broadcast processing method and device, storage medium and equipment
CN113076051A (en) Slave control terminal synchronization method, device, terminal and storage medium
CN111045945B (en) Method, device, terminal, storage medium and program product for simulating live broadcast
CN109819314B (en) Audio and video processing method and device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant