CN110740346B - Video data processing method, device, server, terminal and storage medium - Google Patents


Info

Publication number
CN110740346B
CN110740346B (application CN201911012307.2A)
Authority
CN
China
Prior art keywords
video
frame
playing
composite
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911012307.2A
Other languages
Chinese (zh)
Other versions
CN110740346A (en)
Inventor
耿振健
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201911012307.2A
Publication of CN110740346A
Application granted
Publication of CN110740346B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23424: Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H04N21/239: Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H04N21/2393: Interfacing the upstream path of the transmission network involving handling client requests
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/488: Data services, e.g. news ticker
    • H04N21/4884: Data services for displaying subtitles

Abstract

The disclosure provides a video data processing method, a video data processing device, a server, a terminal and a storage medium, and relates to the technical field of network video. After receiving a video confluence playing request sent by a client, the server replaces the picture data of the designated area of the corresponding video frame in a second video with the picture data of the video frame of a first video to form a composite video frame, generates a composite video stream from the composite video frames obtained by frame-by-frame synthesis, and sends the composite video stream to the client. The client can then play the first video and the second video in different areas of one video playing picture according to the received composite video stream. In this way the client plays the two videos simultaneously with a single player, without needing two players stacked on each other to play the first video and the second video separately, which saves the client's resource consumption.

Description

Video data processing method, device, server, terminal and storage medium
Technical Field
The present disclosure relates to the field of network video technologies, and in particular, to a method, an apparatus, a server, a terminal, and a storage medium for processing video data.
Background
With the development of internet technology, watching short videos and live videos online over the network has become popular. Through a video playing client, an anchor can upload short videos he or she records to a video playing platform and share them with users, and can also broadcast live through the platform. A user can watch the anchor's shared short videos through a video playing client on a terminal, or enter the anchor's live broadcast room to watch the anchor's live video.
Generally, after a user selects a short video to watch, the video playing client sends a video playing request to a server of the video playing platform, and the server sends the short video specified by the request to the client as a video stream, so that the client plays the short video. When a user enters an anchor's live broadcast room to watch the live video, the server likewise sends the anchor's live video to the client as a video stream, so that the client plays the live video.
At present, a video playing client cannot play a short video and a live video simultaneously.
Disclosure of Invention
The embodiments of the present disclosure provide a video data processing method and apparatus, a server, a terminal and a storage medium, to solve the prior-art problem that short videos and live videos cannot be played simultaneously.
In a first aspect, an embodiment of the present disclosure provides a video data processing method, which is applied to a server, and the method includes:
receiving a video confluence playing request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
carrying out frame-by-frame synthesis on the video frames of the first video and the video frames of the second video to obtain composite video frames; each composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video;
generating a composite video stream from the composite video frames;
and sending the composite video stream to the client.
According to the video data processing method provided by the embodiment of the disclosure, after a video confluence playing request sent by a client is received, the picture data of the video frame of the first video replaces the picture data of the designated area of the corresponding video frame in the second video to form a composite video frame, and a composite video stream is generated from the composite video frames obtained by frame-by-frame synthesis and sent to the client. The client then plays the first video and the second video in different areas of one video playing picture according to the received composite video stream, so that the client can play the two videos simultaneously with a single player instead of two players stacked on each other, which saves the client's resource consumption.
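The frame-by-frame synthesis described above can be sketched as follows. This is an illustrative model only, not the patent's implementation: frames are represented as plain 2D grids (lists of rows) of pixel values, whereas a real server would operate on decoded image buffers, and the function names are hypothetical.

```python
def compose_frame(first_frame, second_frame, region):
    """Replace the designated region of `second_frame` with `first_frame`.

    `region` is the (top, left) corner where the first video's picture
    data is written; `first_frame` must fit inside `second_frame`.
    """
    top, left = region
    h, w = len(first_frame), len(first_frame[0])
    assert top + h <= len(second_frame) and left + w <= len(second_frame[0])
    composite = [row[:] for row in second_frame]  # copy; keep source intact
    for r in range(h):
        composite[top + r][left:left + w] = first_frame[r]
    return composite

def compose_stream(first_frames, second_frames, region):
    """Frame-by-frame synthesis: compose each pair of corresponding frames."""
    return [compose_frame(f1, f2, region)
            for f1, f2 in zip(first_frames, second_frames)]
```

Generating the composite video stream then amounts to encoding the composed frames in order and sending them to the client as one stream.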
In one possible implementation manner, carrying out frame-by-frame synthesis on the video frames of the first video and the video frames of the second video to obtain the composite video frames includes:
carrying out frame-by-frame synthesis on the video frames of the first video and the video frames of the second video according to a preset position proportion relationship.
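A preset position proportion relationship can be pictured as a mapping from fractional coordinates to a concrete pixel region of the second video's picture. The fractions used in the test below (the first video occupying the full width of the top 30%) and the function name are assumptions for illustration only.

```python
def region_from_proportion(frame_w, frame_h, x_frac, y_frac, w_frac, h_frac):
    """Map a fractional position/size to integer pixel coordinates."""
    left = int(round(frame_w * x_frac))
    top = int(round(frame_h * y_frac))
    width = int(round(frame_w * w_frac))
    height = int(round(frame_h * h_frac))
    return top, left, width, height
```

Because the region is fixed in advance, both the server (when composing) and the client (when detecting taps inside the designated area) can compute the same pixel rectangle from the same proportions.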
In one possible implementation, after generating a composite video stream from the composite video frames, the method further includes:
adding audio frames of the first video and audio frames of the second video to the composite video stream, respectively.
In the method, the audio frame of the first video and the audio frame of the second video are respectively added to the composite video stream, so that the client can play the audio frame of the first video or the audio frame of the second video according to the received selection instruction.
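Carrying both videos' audio in the composite stream, so the client can play either track on demand, might be modeled at the data level as below; the dictionary layout and track names are hypothetical, and real container/mux details (e.g. multiple audio tracks in an MP4 or FLV file) are abstracted away.

```python
def add_audio_tracks(composite_stream, first_audio, second_audio):
    """Server side: attach both audio tracks to the composite stream."""
    composite_stream["audio_tracks"] = {
        "first": list(first_audio),
        "second": list(second_audio),
    }
    return composite_stream

def select_audio(composite_stream, choice):
    """Client side: play only the track the user selected."""
    return composite_stream["audio_tracks"][choice]
```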
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
The method provides a more efficient browsing experience for the user: when watching a short video through the client, the user can watch a live video at the same time, and when watching a live video, the user can watch a short video at the same time, which improves viewing efficiency.
In one possible implementation, the first video is a live video; the second video is a short video; before replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video, the method further comprises:
acquiring bullet screen (danmaku comment) data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
In this method, the bullet screen data is added to the picture data of the live video on the server side, so that it is displayed at the correct position during confluence playing; at the same time, the client does not need to obtain the bullet screen data separately and merge it with the live video's picture data, which saves the client's program resources and data transmission resources.
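Burning the bullet screen data into the live video's picture before composition might look like this hypothetical sketch, where a character grid stands in for pixel-level text rendering and the (row, column, text) comment format is an assumption.

```python
def overlay_danmaku(frame, comments):
    """Write each comment's characters into the frame at its position."""
    out = [row[:] for row in frame]  # copy; keep the source frame intact
    for row, col, text in comments:
        if not 0 <= row < len(out):
            continue  # comment has scrolled off screen vertically
        for i, ch in enumerate(text):
            if 0 <= col + i < len(out[row]):
                out[row][col + i] = ch
    return out
```

Because the overlay happens before the live frame is written into the designated area, the comments land at the correct position in the composite picture automatically.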
In one possible implementation, the first video is a live video; the second video is a short video; before the receiving of the streaming video playing request sent by the client, the method further includes:
and if the fact that the anchor of the second video is live is determined in the process that the client plays the second video, the client is informed to display video confluence prompt information.
In this method, when the client plays a short video of a certain anchor, the server automatically detects whether that anchor is broadcasting live; if so, it notifies the client to display video confluence prompt information, so that the user can choose, according to his or her own preference, whether to play the anchor's short video and live video in confluence.
In one possible implementation, after sending the composite video stream to the client, the method further includes:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
In this method, if the user chooses at the client to play only one of the two videos during confluence playing, the server sends the video frames starting from the play time carried in the single video playing request, so that the single video connects seamlessly with the confluence playback and continues without any gap.
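The gapless handoff can be sketched under a simplified model in which a video is a list of frames at a fixed frame rate: the server simply indexes into the frame sequence at the play time carried in the request. The frame-list model and fixed frame rate are simplifying assumptions.

```python
def frames_from_time(frames, fps, play_time_s):
    """Return the tail of the frame sequence starting at `play_time_s`."""
    start = int(round(play_time_s * fps))
    return frames[start:]
```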
In a second aspect, an embodiment of the present disclosure provides a video data processing method, which is applied to a client, and the method includes:
responding to the received video confluence operation, and sending a video confluence playing request to a server; the video confluence play request indicates that a first video and a second video are played simultaneously;
receiving a composite video stream returned by the server; the composite video stream is generated by the server from composite video frames, and each composite video frame is obtained by the server replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video;
and playing the composite video stream.
In the method, the client responds to the video confluence operation of the user and sends a video confluence playing request to the server so that the server sends the composite video stream to the client, and the client receives and plays the composite video stream.
In a possible implementation manner, the video frames of the first video and the video frames of the second video in the composite video frame are arranged according to a preset position proportion relationship.
In one possible implementation, the playing the composite video stream includes:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
In the method, the user can flexibly select to play the audio frame of the first video or the audio frame of the second video at the client.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
The method provides a more efficient browsing experience for the user: when watching a short video through the client, the user can watch a live video at the same time, and when watching a live video, the user can watch a short video at the same time, which improves viewing efficiency.
In one possible implementation, the second video is a short video;
before the responding to the received video confluence operation and sending a video confluence playing request to the server, the method further comprises:
and in the process of playing the second video, displaying the video confluence prompt information according to the notification of the server.
In the method, the client can display the video confluence prompt information according to the notification of the server, so that a user can select whether to play the short video and the live video of the same main broadcast in a confluence mode according to own preference.
In one possible implementation manner, the playing the composite video stream further includes:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
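The two client-side cases above can be summarized in a small dispatch sketch; the action names and request fields are hypothetical, not part of any real protocol.

```python
def make_single_play_request(action, current_time_s):
    """Build a single video playing request carrying the current play time."""
    if action == "tap_designated_area":   # user taps the first video's area
        return {"play": "first", "time": current_time_s}
    if action == "close_first_video":     # user clicks the video close key
        return {"play": "second", "time": current_time_s}
    raise ValueError("unknown action: %r" % (action,))
```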
In a third aspect, an embodiment of the present disclosure provides a video data processing apparatus, where the apparatus includes:
the request receiving unit is used for receiving a video confluence playing request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
the video converging unit is used for carrying out frame-by-frame synthesis on the video frames of the first video and the video frames of the second video to obtain composite video frames, and generating a composite video stream from the composite video frames; each composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video;
and the video sending unit is used for sending the composite video stream to the client.
In one possible implementation manner, the video merging unit is further configured to:
and respectively adding the audio frame of the first video and the audio frame of the second video into the composite video stream, so that the client plays the audio frame of the first video or the audio frame of the second video according to the selection of a user.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the first video is a live video; the second video is a short video; the video merging unit is further configured to:
acquiring bullet screen data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
In one possible implementation, the first video is a live video; the second video is a short video; the device further comprises:
and the notification sending unit is used for notifying the client to display the video confluence prompt information if the anchor of the second video is determined to be live in the process of playing the second video by the client.
In a possible implementation manner, the video sending unit is further configured to:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
In a fourth aspect, an embodiment of the present disclosure provides a video data processing apparatus, including:
the request sending unit is used for responding to the received video confluence operation and sending a video confluence playing request to the server; the video confluence play request indicates that a first video and a second video are played simultaneously;
the video receiving unit is used for receiving the composite video stream returned by the server; the composite video stream is generated by the server from composite video frames, and each composite video frame is obtained by the server replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video;
and the video playing unit is used for playing the composite video stream.
In a possible implementation manner, the video playing unit is configured to:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the second video is a short video; the device further comprises:
and the confluence prompt unit is used for displaying video confluence prompt information according to the notification of the server in the process of playing the second video.
In a possible implementation manner, the video playing unit is further configured to:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
In a fifth aspect, embodiments of the present disclosure provide a server, including one or more processors, and a memory for storing instructions executable by the processors;
wherein the processor is configured to execute the instructions to perform the steps of:
receiving a video confluence playing request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
carrying out frame-by-frame synthesis on the video frames of the first video and the video frames of the second video to obtain composite video frames; each composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video;
generating a composite video stream from the composite video frames;
and sending the composite video stream to the client.
In one possible implementation, the processor further performs:
and respectively adding the audio frame of the first video and the audio frame of the second video into the composite video stream, so that the client plays the audio frame of the first video or the audio frame of the second video according to the selection of a user.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the first video is a live video; the second video is a short video; the processor further performs:
acquiring bullet screen data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
In one possible implementation, the first video is a live video; the second video is a short video; the processor further performs:
and if the fact that the anchor of the second video is live is determined in the process that the client plays the second video, the client is informed to display video confluence prompt information.
In one possible implementation, the processor further performs:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
In a sixth aspect, embodiments of the present disclosure provide a terminal, including one or more processors, and a memory for storing instructions executable by the processors;
wherein the processor is configured to execute the instructions to perform the steps of:
responding to the received video confluence operation, and sending a video confluence playing request to a server; the video confluence play request indicates that a first video and a second video are played simultaneously; the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video;
receiving a composite video stream returned by the server; the composite video stream is generated by the server from composite video frames, and each composite video frame is obtained by the server replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video;
and playing the composite video stream.
In one possible implementation, the processor specifically performs:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the second video is a short video; the processor further performs:
and in the process of playing the second video, displaying the video confluence prompt information according to the notification of the server.
In one possible implementation, the processor specifically performs:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
In a seventh aspect, the present disclosure provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method for processing video data according to the first aspect or the second aspect is implemented.
For technical effects brought by any one implementation manner of the third aspect to the seventh aspect, reference may be made to technical effects brought by a corresponding implementation manner of the first aspect or the second aspect, and details are not described here again.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings used in the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present disclosure, and those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic view of an application scenario of a video data processing method according to an embodiment of the present disclosure;
fig. 2 is an interaction flowchart of a video data processing method according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of a process for generating a composite video frame according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an audio selection interface provided by an embodiment of the present disclosure;
fig. 5 is an interaction flow diagram of another video data processing method provided by the embodiment of the present disclosure;
fig. 6 is an interaction flow diagram of another video data processing method provided by the embodiment of the present disclosure;
fig. 7 is a schematic flowchart of a video data processing method according to an embodiment of the disclosure;
fig. 8 is a schematic flowchart of another video data processing method according to an embodiment of the disclosure;
fig. 9 is a block diagram of a video data processing apparatus according to an embodiment of the disclosure;
fig. 10 is a block diagram of another video data processing apparatus according to an embodiment of the present disclosure;
fig. 11 is a block diagram of another video data processing apparatus according to an embodiment of the present disclosure;
fig. 12 is a block diagram of another video data processing apparatus according to an embodiment of the present disclosure;
fig. 13 is a block diagram of a server according to an embodiment of the present disclosure;
fig. 14 is a block diagram of a terminal according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to the accompanying drawings. Evidently, the described embodiments are only some, rather than all, of the embodiments of the present disclosure. All other embodiments derived by those of ordinary skill in the art from the disclosed embodiments without creative effort fall within the scope of protection of the present disclosure.
Some terms in the embodiments of the present disclosure are explained below to facilitate understanding by those skilled in the art.
(1) Client: an application program that can run on a terminal electronic device such as a smartphone, tablet computer or PC. In the embodiments of the present disclosure, the client mainly refers to a video playing client. For example, the terminal may download the client's installation package over the network and use it to install the client; once installation is complete, the client can run on the terminal.
(2) Live video: video broadcast interactively over the network using internet and streaming-media technology, one of the mainstream forms of expression of current internet media. Live broadcasting is a new way of social networking: the anchor collects audio and video with independently controllable capture equipment, generates the live video and uploads it to a server over the network, and the server sends the live video to the client of each user watching the broadcast.
(3) Short video: one of the modes of internet content distribution, referring to frequently pushed video content that can be played through a client and is suitable for watching while mobile or during short breaks. Generally, an anchor who broadcasts live video on a new-media platform can also send short videos he or she records to the platform's server to share with users.
(4) The terms "first", "second" and the like in the embodiments of the present disclosure are used to distinguish similar objects and do not necessarily describe a particular sequence or chronological order. It should be understood that such objects may be interchanged where appropriate.
The present disclosure is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic view of an application scenario of a video data processing method according to an embodiment of the present disclosure. In this application scenario, the terminals of one or more users (such as terminals 101-103 shown in Fig. 1) are connected to a server 300 over a network 200. A terminal may be a mobile phone, a palmtop computer, a PC, an all-in-one machine, or another device with a communication function, or a virtual machine or emulator. The network 200 may be a wired network or a wireless network. The server 300 may be the server of a video playing platform. A terminal may belong to an anchor who is currently broadcasting live, or to a user who is currently watching the broadcast. In the following description, it is assumed that terminal 103 in Fig. 1 is the terminal of an anchor who is broadcasting live, and terminals 101 and 102 are the terminals of users who are watching the broadcast.
The terminal 103 of the anchor uploads the live video recorded by the anchor to the server 300 in real time through the network 200; the terminal 103 can also upload short videos recorded by the anchor to the server 300 through the network 200, so as to share them with users through the server. The users' terminals 101 and 102 have clients installed that can play short videos and live videos. A user can enter the anchor's live broadcast room through the client to watch the live video being played, and the server 300 sends the anchor's live video, as a live video stream, to each terminal of a user watching the live broadcast, such as terminal 101 and terminal 102. The user can also watch the short videos shared by the anchor through the client. After a user clicks a certain short video, the client sends a video playing request for that short video to the server 300, and the server 300 sends the short video specified by the request to the client as a video stream, so that the client plays the short video.
At present, if a user wants to watch a live video while watching a short video, or wants to watch a short video while watching a live video, the client cannot meet this requirement and play the short video and the live video at the same time.
In order to solve the above problem, embodiments of the present disclosure provide a video data processing method, an apparatus, a server, a terminal, and a storage medium. After receiving a video confluence play request sent by a client, the server replaces the picture data of a designated area of each corresponding video frame in a second video with the picture data of the video frame of a first video to form a composite video frame, generates a composite video stream from the composite video frames obtained by frame-by-frame synthesis, and sends the composite video stream to the client, so that the client plays the first video and the second video in different areas of the video play picture according to the received composite video stream. The client can thus play the two videos simultaneously using a single player, without stacking two players to play the first video and the second video separately, which saves resource consumption on the client.
The application scenario in fig. 1 is only an example of an application scenario for implementing the embodiment of the present application, and the embodiment of the present application is not limited to the application scenario in fig. 1.
Fig. 2 illustrates an interaction diagram of a video data processing method provided by an embodiment of the present disclosure, which may be performed by the server 300 shown in fig. 1 and a client installed on a terminal of a user.
In one embodiment, as shown in fig. 2, a video data processing method provided by an embodiment of the present disclosure includes the following steps:
in step S201, the client receives a video merging operation of the user.
The video merging operation is input operation when a user wants to watch a first video and a second video simultaneously, wherein the first video can be a live video, and the second video can be a short video; alternatively, the first video may be a short video and the second video may be a live video. For example, when a user watches short videos, if the user wants to watch live videos at the same time, the video confluence operation can be triggered.
In an alternative embodiment, the user may trigger the video merge operation while watching the short video. For example, assuming that the second video is a short video, the user sends a video playing request for the second video to the server through the client, and the server sends the video stream of the second video to the client. The client receives the video stream of the second video sent by the server and plays the second video. While the client is playing the second video, if the server determines that the anchor of the second video is live, it can notify the client to display video confluence prompt information. The client displays the video confluence prompt information according to the server's notification. For example, on receiving a notification that the anchor of the second video is live, the client may pop up a prompt box, or present the video confluence prompt information to the user in some other manner, and may display confirm and cancel buttons in the prompt box. If the user clicks the confirm button, the user is considered to have triggered the video confluence operation.
In another alternative embodiment, the user may trigger the video merge operation before watching the short video. For example, still assuming that the second video is a short video, the user may watch the second video by clicking its summary display area. When the user clicks the second video, the client can query the server as to whether the anchor of the second video is live; if so, the client can pop up a prompt box displaying video confluence prompt information to the user, with confirm and cancel buttons in the prompt box. If the user clicks the confirm button, the user is considered to have triggered the video confluence operation. Optionally, the client may also display, in the interface showing the summary of the second video, prompt information indicating whether the anchor of the second video is live; alternatively, if the anchor of the second video is live, the client may treat the user's click on the summary display area of the second video as the video merging operation.
Similarly, the user may also trigger the video merging operation in the process of watching the live video, which is not described herein again.
Step S202, the client sends a video confluence playing request to the server.
In response to the received video confluence operation, the client sends a video confluence play request to the server. The video confluence play request indicates that the first video and the second video are to be played simultaneously.
In step S203, the server performs frame-by-frame synthesis on the video frame of the first video and the video frame of the second video to obtain a synthesized video frame.
The server receives the video confluence play request sent by the client and synthesizes the video frames of the first video and the second video frame by frame to obtain composite video frames; that is, it merges the color-encoded (YUV) data of the two videos. Ideally, the frame rates of the first video and the second video are the same. When the frame rates differ, the frame rate of one video can be dynamically adjusted according to the decoding frame rate of the other, so that there is no obvious frame rate difference between the two videos.
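The frame-rate adjustment described above can be sketched as a simple nearest-frame resampling step. This is an illustrative sketch, not the disclosure's actual implementation; the function name and the list-of-frames representation are assumptions.

```python
def match_frame_rate(frames, src_fps, dst_fps):
    # Resample a decoded frame sequence from src_fps to dst_fps by
    # nearest-frame selection: frames are dropped when lowering the
    # rate and duplicated when raising it, so the two videos show no
    # obvious frame rate difference after merging.
    if src_fps == dst_fps:
        return list(frames)
    duration = len(frames) / src_fps            # clip length in seconds
    out_count = round(duration * dst_fps)       # frame count at target rate
    return [frames[min(int(i * src_fps / dst_fps), len(frames) - 1)]
            for i in range(out_count)]
```

In practice a streaming server would apply the same index mapping incrementally per incoming frame rather than over a buffered list.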
The composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video, and may be synthesized from the video frame of the first video and the video frame of the second video according to a preset position proportion relationship. For example, the position proportion relationship may be that the area ratio of the video frame of the first video to the video frame of the second video is 1:3; that is, when the display screen is divided into four equal parts, the video frame of the first video is located at the upper right corner of the display screen and occupies one quarter of its area, while the video frame of the second video occupies the remaining three quarters.
Illustratively, for any video frame in the first video and the corresponding video frame in the second video, as shown in fig. 3, the server cuts out the picture data of the designated area in the video frame of the second video and then tiles the picture data of the corresponding video frame of the first video into the designated area, resulting in a composite video frame. In fig. 3, the designated area is located in the upper right corner of the picture and has an aspect ratio of 16:9. In other embodiments, the designated area may also be located at other positions of the screen, such as the upper left corner or the lower right corner, and the aspect ratio of the designated area is not limited in the embodiments of the present disclosure.
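The per-frame replacement step can be sketched as follows, using the 1:3 top-right layout from the example above. This is a minimal sketch under stated assumptions: frames are modeled as 2-D lists standing in for one plane of a YUV frame, and the inset frame is assumed to be pre-scaled to the designated area's size.

```python
def composite_frame(main_frame, inset_frame):
    # Replace the designated area (here, the top-right quarter) of
    # main_frame with inset_frame. main_frame plays the role of the
    # second video's frame, inset_frame the first video's frame,
    # already scaled to half the width and half the height.
    h, w = len(main_frame), len(main_frame[0])
    ih, iw = h // 2, w // 2
    if len(inset_frame) != ih or len(inset_frame[0]) != iw:
        raise ValueError("inset must be pre-scaled to half size")
    out = [row[:] for row in main_frame]        # copy; source stays intact
    for y in range(ih):
        out[y][w - iw:] = inset_frame[y]        # overwrite designated area
    return out
```

A real implementation would do this per YUV plane (with the chroma planes at half resolution for 4:2:0 data) before re-encoding the composite frame.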
In step S204, the server generates a composite video stream according to the composite video frame.
In step S205, the server sends the composite video stream to the client.
The server sends the composite video stream to the client that sent the video confluence play request.
Step S206, the client plays the composite video stream.
The client receives the composite video stream sent by the server and starts the player to play the video picture of the composite video stream.
According to the video data processing method provided by the embodiment of the disclosure, after receiving a video confluence play request sent by a client, the server replaces the picture data of the designated area of each corresponding video frame in the second video with the picture data of the video frame of the first video to form a composite video frame, generates a composite video stream from the composite video frames obtained by frame-by-frame synthesis, and sends it to the client, so that the client plays the first video and the second video in different areas of the video play picture according to the received composite video stream. The client can thus play the two videos simultaneously using a single player, without stacking two players to play the first video and the second video separately, which saves resource consumption on the client.
In an alternative embodiment, in order to allow the user to freely choose which video to listen to, the server may add both the audio frames of the first video and the audio frames of the second video to the composite video stream, so that the client plays the audio frames of the first video or the audio frames of the second video according to the user's selection. During playback of the composite video stream, the client may play the audio frames (AAC data) of the first video or the audio frames of the second video in the composite video stream according to the audio selected by the user.
For example, the client may add options corresponding to the different audio tracks in a setting area or setting menu during playback of the composite video stream. As shown in fig. 4, after the setting menu is expanded, the user can see the options for the first video and the second video: if the user selects the first video, the client plays the audio frames of the first video; if the user selects the second video, the client plays the audio frames of the second video. The user may also select the first video and the second video simultaneously to listen to the sound of both.
In another alternative embodiment, the server may also add the audio frames of the first video or the audio frames of the second video to the composite video stream according to the selection of the user of the client, and send the composite video stream to the client. For example, the user selects a first video through a sound setting menu, and the client notifies the server of the user's selection. The server adds the audio frames of the first video to the composite video stream and sends the composite video stream to the client.
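The client-side track selection described above can be sketched as follows. This is an illustrative model only: the dict keys, the list-of-frames audio representation, and the per-sample average mix for the "both" case are all assumptions, not the disclosure's format.

```python
def select_audio(stream, choice):
    # Decide which audio frames to decode from a composite stream
    # that carries both tracks. choice is 'first', 'second', or
    # 'both'; 'both' does a naive per-sample average mix.
    first, second = stream["audio_first"], stream["audio_second"]
    if choice == "first":
        return first
    if choice == "second":
        return second
    # 'both': mix corresponding samples of the two tracks
    return [[(a + b) // 2 for a, b in zip(fa, fb)]
            for fa, fb in zip(first, second)]
```

In the server-side variant, the same selection would run on the server and only the chosen track would be muxed into the composite stream.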
Considering that bullet screen messages are generally present during live broadcasting, in some embodiments, assuming that the first video is a live video and the second video is a short video, the server may obtain the bullet screen data of the first video from a bullet screen server or from each terminal that sends bullet screen messages. After acquiring the bullet screen data of the first video, the server adds the bullet screen data to the picture data of the video frames of the first video and then replaces the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video to obtain a composite video frame. Adding the bullet screen data to the picture data of the live video on the server side ensures that the bullet screen data is displayed at the correct position during merged playback; at the same time, the client does not need to separately obtain the bullet screen data and merge it with the picture data of the live video, which saves program resources and data transmission resources on the client.
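The bullet-screen overlay step can be sketched as blitting a pre-rendered message onto one row of the live frame before compositing. A minimal sketch under stated assumptions: the message is modeled as a 1-D list of pixel values, and the right-to-left scroll speed and function name are illustrative.

```python
def overlay_bullet(frame, rendered_text, row, t, speed=2):
    # Blit a rendered bullet-screen message (a 1-D list of pixel
    # values) onto one row of the frame. The message scrolls from
    # the right edge toward the left as the timestamp t grows.
    w = len(frame[0])
    x = w - t * speed                           # left edge of the message
    out = [r[:] for r in frame]
    for i, px in enumerate(rendered_text):
        col = x + i
        if 0 <= col < w:                        # clip to the frame bounds
            out[row][col] = px
    return out
```

Since the overlay is applied to the first video's frame before the designated-area replacement, the bullet screen lands in the correct region of the composite picture with no extra client work.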
Considering that a user may prefer to watch only one of the first video and the second video while watching them simultaneously, the client may provide controls in the merged video picture for watching only the first video or for closing the first video. For example, a transparent overlay layer is arranged over the picture area of the first video to receive the user's click operations: if the user clicks the picture area of the first video, it is considered that an operation instructing playback of the first video has been received. A close button is set at the upper right corner of the first video; if the user clicks the close button, it is considered that an operation instructing closing of the first video and playback of the second video has been received.
Specifically, in the process of playing the composite video stream, if the user clicks a designated area, such as the picture area of the first video, the client responds to the click operation by sending a single video playing request to the server. The single video playing request indicates that the first video is to be played and carries the video playing time of the first video. The video playing time refers to the current playing position of the first video and is usually indicated by a timestamp. The server receives the single video playing request sent by the client and, according to the video playing time carried in the request, sends the client the video stream of the first video that the request indicates should be played. The video stream of the first video here includes the video frame at the video playing time and the subsequent video frames. The client then plays the first video according to the newly received video stream.
In the process of playing the composite video stream, if the user clicks the close-video button, the client responds by sending a single video playing request to the server. The single video playing request indicates that the second video is to be played and carries the video playing time of the second video. The video playing time refers to the current playing position of the second video and is usually indicated by a timestamp. The server receives the single video playing request sent by the client and, according to the video playing time carried in the request, sends the client the video stream of the second video that the request indicates should be played. The video stream of the second video here includes the video frame at the video playing time and the subsequent video frames. The client then plays the second video according to the newly received video stream.
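The single-video request flow above can be sketched in two small steps: the client builds a request carrying the current play position, and the server resumes the chosen video's stream from that position. The field names, fixed frame duration, and function names are illustrative assumptions, not the disclosure's wire format.

```python
def make_single_video_request(video_id, position_ms):
    # Request the client sends when the user clicks the inset picture
    # area (play the first video) or the close-video button (play the
    # second video). Field names are illustrative.
    return {"type": "single_video_play",
            "video": video_id,
            "play_time_ms": position_ms}

def resume_stream(frames, frame_duration_ms, play_time_ms):
    # Server side: return the video frames starting from the carried
    # play time, so single-video playback resumes where the merged
    # playback left off.
    start = play_time_ms // frame_duration_ms
    return frames[start:]
```

Carrying the timestamp in the request is what lets the switch from merged playback to single-video playback feel seamless to the user.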
For easier understanding, the following describes the process of video data processing by two specific embodiments. In one embodiment, as shown in fig. 5, a video data processing method provided by an embodiment of the present disclosure includes the following steps:
in step S501, the client receives a video playing operation of the user.
The video play operation indicates that a second video is played, the second video being a short video.
Step S502, the client sends a video playing request to the server.
In step S503, the server obtains the video data of the second video indicated to be played by the video playing request.
And the video data of the second video is stored in the server or a database of a video playing platform connected with the server. The video data of the second video comprises video frames of the second video, and a plurality of continuous video frames of the second video form a video stream of the second video.
Step S504, the server sends the video stream of the second video to the client.
In step S505, the client plays the video stream of the second video.
Step S506, in the process of playing the second video by the client, the server monitors that the anchor of the second video is live.
Wherein the anchor of the second video refers to an anchor that uploads the second video.
Step S507, the server sends a live broadcast notification to the client, and notifies the client that the anchor of the second video is live broadcast.
In step S508, the client displays the video confluence prompt information according to the notification from the server.
In step S509, the client receives the video merging operation of the user.
In step S510, the client sends a video merge play request to the server.
The video confluence play request indicates that a first video and a second video are played simultaneously, wherein the first video is a live video of the anchor broadcast.
In step S511, the server performs frame-by-frame synthesis on the video frame of the first video and the video frame of the second video to obtain a synthesized video frame.
And the video frames of the first video and the video frames of the second video in the synthesized video frames are arranged according to a preset position proportion relation.
In step S512, the server generates a composite video stream according to the composite video frame, the audio frame of the first video, and the audio frame of the second video.
In step S513, the server sends the composite video stream to the client.
In step S514, the client plays the composite video stream.
The method provides a brand-new live broadcast exposure mode and a more efficient browsing experience: the live video is inserted for display in the scene where the short video is playing. While watching the short video, the user can simultaneously watch the anchor's live video and, through both the short video and the live broadcast, better understand information related to the anchor.
In step S515, the client receives an operation of clicking the designated area by the user.
In the process of playing the composite video stream, the client monitors whether the user clicks a designated area, where the designated area may be a picture area of the first video.
In step S516, the client sends a single video playing request to the server.
The single video playing request indicates that a first video is played and carries the video playing time of the first video, and the video playing time is used for indicating the current playing position of the first video.
Step S517, the server obtains the video data of the first video indicated to be played by the single video playing request.
The server may obtain video data of a first video from a terminal of the anchor, where the video data of the first video includes video frames in the first video from a video playing time. Starting from the video play time, a plurality of consecutive video frames of the first video form a video stream of the first video.
In step S518, the server sends the video stream of the first video to the client.
In step S519, the client plays the video stream of the first video.
In another embodiment, as shown in fig. 6, a video data processing method provided by an embodiment of the present disclosure includes the following steps:
in step S601, the client receives a video playing operation of the user.
The video play operation indicates that a second video is played, the second video being a short video.
Step S602, the client sends a video playing request to the server.
In step S603, the server obtains the video data of the second video indicated to be played by the video playing request.
In step S604, the server sends the video stream of the second video to the client.
In step S605, the client plays the video stream of the second video.
Step S606, in the process of playing the second video by the client, the server monitors that the anchor of the second video is live.
Wherein the anchor of the second video refers to an anchor that uploads the second video.
Step S607, the server sends a live broadcast notification to the client, and notifies the client that the anchor of the second video is live broadcast.
In step S608, the client displays the video confluence prompt information according to the notification from the server.
In step S609, the client receives the video merging operation of the user.
In step S610, the client sends a video streaming playing request to the server.
The video confluence play request indicates that a first video and a second video are played simultaneously, wherein the first video is a live video of the anchor broadcast.
In step S611, the server performs frame-by-frame synthesis on the video frame of the first video and the video frame of the second video to obtain a synthesized video frame.
In step S612, the server generates a composite video stream according to the composite video frame, the audio frame of the first video, and the audio frame of the second video.
In step S613, the server sends the composite video stream to the client.
In step S614, the client plays the composite video stream.
In step S615, the client receives an operation of the user to click the close video button.
The close video button is used for closing the first video.
In step S616, the client sends a single video playing request to the server.
The single video play request indicates that the second video is played.
In step S617, the server obtains the video data of the second video indicated to be played by the single video playing request.
The video data of the second video comprises video frames starting from the video playing time in the second video. Starting from the video playing time, a plurality of continuous video frames of the second video form a video stream of the second video.
In step S618, the server sends the video stream of the second video to the client.
Step S619, the client plays the video stream of the second video.
It should be noted that the application scenarios described in the embodiment of the present disclosure are for more clearly illustrating the technical solutions of the embodiment of the present disclosure, and do not constitute a limitation on the technical solutions provided in the embodiment of the present disclosure, and as a new application scenario appears, a person skilled in the art may know that the technical solutions provided in the embodiment of the present disclosure are also applicable to similar technical problems.
In the embodiment of the present disclosure, a video data processing method executed by a server is shown in fig. 7, and includes the following steps:
step S701, receiving a video confluence play request sent by a client.
The video confluence play request indicates that the first video and the second video are played simultaneously. The first video is a live video, and the second video is a short video; or the first video is a short video and the second video is a live video.
In an optional implementation manner, while the client is playing the second video, the server monitors whether the anchor of the second video is live; if it determines that the anchor of the second video is live, it notifies the client to display video confluence prompt information. If the user chooses, according to the video confluence prompt information displayed by the client, to play the two videos merged, the client sends a video confluence playing request to the server.
Step S702, a video frame of the first video and a video frame of the second video are synthesized frame by frame to obtain a synthesized video frame.
The composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video. Before replacing the picture data of the designated area of the corresponding video frame in the second video with the picture data of the video frame of the first video, the server may first obtain the bullet screen data of the first video, and add the bullet screen data to the picture data of the video frame of the first video.
Step S703 generates a composite video stream from the composite video frame.
When generating the composite video stream, the audio frame of the first video and the audio frame of the second video may be added to the composite video stream, respectively, so that the client plays the audio frame of the first video or the audio frame of the second video according to the selection of the user. Alternatively, the audio frames of the first video or the audio frames of the second video may be added to the composite video stream at the client user's option.
Step S704, sending the composite video stream to the client.
Optionally, while the client is playing the merged video, a single video playing request sent by the client may be received. If the single video playing request indicates that the first video is to be played, the server sends the video stream of the first video to the client according to the video playing time of the first video carried in the request; if the single video playing request indicates that the second video is to be played, the server sends the video stream of the second video to the client according to the video playing time of the second video carried in the request.
In the embodiment of the present disclosure, as shown in fig. 8, a video data processing method executed by a client includes the following steps:
step S801, in response to the received video join operation, sends a video join play request to the server.
The video confluence play request indicates that the first video and the second video are played simultaneously. The first video is a live video, and the second video is a short video; or the first video is a short video and the second video is a live video.
Optionally, in the process of playing the second video, if the client receives a notification sent by the server that the anchor is live, it displays video confluence prompt information according to the server's notification. If the user chooses, according to the video confluence prompt information displayed by the client, to play the two videos merged, the client responds to the received video confluence operation by sending a video confluence playing request to the server.
Step S802, receiving the composite video stream returned by the server.
The composite video stream is generated by a server according to a composite video frame, and the composite video frame is obtained by replacing picture data of a designated area of a corresponding video frame in a second video by picture data of a video frame of a first video through the server.
Step S803, the composite video stream is played.
In the process of playing the composite video stream, the audio frame of the first video or the audio frame of the second video in the composite video stream can be played according to the audio selected by the user.
And in the process of playing the composite video stream, if the condition that the user clicks the designated area is monitored, responding to the operation that the user clicks the designated area, and sending a single video playing request to the server. The single video playing request indicates that a first video is played and carries the video playing time of the first video.
And in the process of playing the composite video stream, if the condition that the user clicks the video closing key is monitored, responding to the operation that the user clicks the video closing key, and sending a single video playing request to the server. And the single video playing request indicates that the second video is played and carries the video playing time of the second video.
Based on the same inventive concept as the video data processing method shown in fig. 7, the present disclosure also provides a video data processing apparatus. As shown in fig. 9, the video data processing apparatus includes:
a request receiving unit 91, configured to receive a video confluence play request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
a video merging unit 92, configured to synthesize a video frame of the first video and a video frame of the second video frame by frame to obtain a synthesized video frame, and generate a synthesized video stream according to the synthesized video frame; the composite video frame is obtained by replacing the picture data of the appointed area of the corresponding video frame in the second video by the picture data of the video frame of the first video;
a video sending unit 93, configured to send the composite video stream to the client.
In a possible implementation manner, the video merging unit 92 is further configured to:
and respectively adding the audio frame of the first video and the audio frame of the second video into the composite video stream, so that the client plays the audio frame of the first video or the audio frame of the second video according to the selection of a user.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the first video is a live video; the second video is a short video; the video merging unit 92 is further configured to:
acquiring bullet screen data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
In one possible implementation, the first video is a live video; the second video is a short video; as shown in fig. 10, the above apparatus further includes:
a notification sending unit 94, configured to notify the client to display video confluence prompting information if it is determined that the anchor of the second video is live while the client is playing the second video.
In a possible implementation manner, the video sending unit 93 is further configured to:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
Based on the same inventive concept as the video data processing method shown in fig. 8, the present disclosure also provides another video data processing apparatus. As shown in fig. 11, the video data processing apparatus includes:
a request transmission unit 111 for transmitting a video streaming play request to the server in response to the received video streaming operation; the video confluence play request indicates that a first video and a second video are played simultaneously;
a video receiving unit 112, configured to receive a composite video stream returned by the server; the composite video stream is generated by the server according to a composite video frame, and the composite video frame is obtained by replacing picture data of a designated area of a corresponding video frame in a second video by picture data of a video frame of a first video by the server;
a video playing unit 113, configured to play the composite video stream.
In a possible implementation manner, the video playing unit 113 may be further configured to:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
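As a minimal, hypothetical sketch of this step (not part of the claimed disclosure), a single player receiving a composite stream that carries both audio tracks could simply filter packets: all video packets pass through, and only the audio packets of the user-selected track are kept. The packet dictionary format and the track labels "first"/"second" are assumptions for illustration:

```python
def select_audio(packets, chosen):
    """Pass through all video packets, but only the audio packets that
    belong to the track the user selected ('first' or 'second')."""
    for pkt in packets:
        # Video packets always play; audio packets are filtered by track.
        if pkt["kind"] == "video" or pkt["track"] == chosen:
            yield pkt
```

Switching the selected audio mid-playback would then amount to changing `chosen` for subsequent packets, without touching the video path.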
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the second video is a short video; as shown in fig. 12, the apparatus may further include a merge presentation unit 114 configured to present video merge presentation information according to the notification from the server during the playing of the second video.
In a possible implementation manner, the video playing unit 113 may be further configured to:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
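The two cases above differ only in which video the single video playing request names and whose playback time it carries. As a hypothetical sketch (the field names and wire format are assumptions, not specified by the disclosure), the client-side request could look like:

```python
import json

def single_video_request(target, play_time_ms):
    """Build the single-video play request sent when the user either taps
    the designated area (continue the first video) or taps the close
    button (continue the second video). Field names are hypothetical."""
    return json.dumps({
        "type": "single_video_play",
        "video": target,               # "first" or "second"
        "play_time_ms": play_time_ms,  # position at which to resume
    })
```

Carrying the current playback time lets the server start the single video stream from the same position, so playback continues seamlessly after leaving the confluence view.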
With the video data processing apparatus provided by the embodiment of the disclosure, after a video confluence play request sent by a client is received, the picture data of the video frames of the first video replaces the picture data of the designated area of the corresponding video frames in the second video to form composite video frames, and a composite video stream generated from the frame-by-frame composite video frames is sent to the client. The client can then play the first video and the second video in different areas of one video playing picture according to the received composite video stream, so that a single player plays both videos simultaneously; there is no need to stack two players to play the first video and the second video separately, which saves resource consumption at the client.
Based on the same inventive concept as the video data processing method shown in fig. 7, the embodiment of the present disclosure further provides a server, for example, the server 300 shown in fig. 1. Fig. 13 shows a block diagram of one possible server, which may include a memory 1301 and a processor 1302.
The memory 1301 stores the computer program executed by the processor 1302. The memory 1301 may mainly include a program storage area and a data storage area: the program storage area may store an operating system, applications required for at least one function, and the like; the data storage area may store data created during use of the server, such as short videos and bullet screen data.
The memory 1301 may be a volatile memory, such as a random-access memory (RAM); it may also be a non-volatile memory, such as, but not limited to, a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 1301 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 1301 may also be a combination of the above.
The processor 1302 may include one or more Central Processing Units (CPUs), Graphics Processing Units (GPUs), Digital Signal Processors (DSPs), or the like.
The specific connection medium between the memory 1301 and the processor 1302 is not limited in the embodiments of the present disclosure. In fig. 13, the memory 1301 and the processor 1302 are connected through a bus 1303, shown as a thick line; the connection manner between the other components is merely illustrative and not limiting. The bus 1303 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 13, but this does not mean that there is only one bus or one type of bus.
Specifically, the processor 1302 is configured to implement the following steps when calling the computer program stored in the memory 1301:
receiving a video confluence playing request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
carrying out frame-by-frame synthesis on the video frame of the first video and the video frame of the second video to obtain a composite video frame; the composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video by the picture data of the video frame of the first video;
generating a composite video stream from the composite video frame;
and sending the composite video stream to the client.
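As a minimal illustrative sketch of the frame-by-frame synthesis step (not part of the claimed disclosure), the region replacement can be modeled on raw pixel arrays: the first-video frame is resized to the designated area of the corresponding second-video frame and written over it. NumPy and nearest-neighbour scaling are assumptions for illustration; a production server would composite decoded frames with a proper scaler:

```python
import numpy as np

def composite_frame(first_frame, second_frame, region):
    """Replace the designated (top, left, height, width) region of a
    second-video frame with the first-video frame, resized to fit."""
    top, left, height, width = region
    # Nearest-neighbour resize of the first frame to the region size;
    # a real server would use a proper scaler (e.g. libswscale).
    ys = np.arange(height) * first_frame.shape[0] // height
    xs = np.arange(width) * first_frame.shape[1] // width
    out = second_frame.copy()
    out[top:top + height, left:left + width] = first_frame[ys][:, xs]
    return out

def composite_stream(first_frames, second_frames, region):
    """Frame-by-frame synthesis: one composite frame per aligned pair."""
    return [composite_frame(a, b, region)
            for a, b in zip(first_frames, second_frames)]
```

Encoding the resulting composite frames into a stream and sending it to the client then proceeds as in the steps above.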
In one possible implementation, the processor 1302 further performs:
and respectively adding the audio frame of the first video and the audio frame of the second video into the composite video stream, so that the client plays the audio frame of the first video or the audio frame of the second video according to the selection of a user.
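Sketched hypothetically (the frame dictionaries, `pts_ms` field, and track labels are assumptions, not specified by the disclosure), adding both audio tracks to the composite stream amounts to tagging each audio frame with its source and interleaving the two tracks by presentation timestamp, so the client can later keep whichever track the user selects:

```python
def add_audio_tracks(first_audio, second_audio):
    """Tag each audio frame with its source track and interleave both
    tracks by presentation timestamp, so the client can pick one later."""
    tagged = ([dict(f, track="first") for f in first_audio] +
              [dict(f, track="second") for f in second_audio])
    # Timestamp order keeps both tracks aligned with the composite video.
    return sorted(tagged, key=lambda f: f["pts_ms"])
```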
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the first video is a live video; the second video is a short video; the processor 1302 further performs:
acquiring bullet screen data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
In one possible implementation, the first video is a live video; the second video is a short video; the processor 1302 further performs:
and if the fact that the anchor of the second video is live is determined in the process that the client plays the second video, the client is informed to display video confluence prompt information.
In one possible implementation, the processor 1302 further performs:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
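The server side of this step can be sketched as follows, again as a hypothetical illustration (the request fields and packet format mirror the assumptions above and are not specified by the disclosure): the server looks up the requested video and returns its stream starting from the carried playback time, so playback resumes where the composite stream left off:

```python
def handle_single_video_request(request, streams):
    """Return the packets of the requested video from the carried playback
    time onward; `streams` maps 'first'/'second' to packet lists."""
    target = request["video"]          # "first" or "second"
    start = request["play_time_ms"]    # resume position carried in request
    return [pkt for pkt in streams[target] if pkt["pts_ms"] >= start]
```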
Based on the same inventive concept as the video data processing method shown in fig. 8, the embodiment of the present disclosure further provides a terminal, for example, the terminal 101 or the terminal 102 shown in fig. 1, on which a client capable of playing short videos and live videos is installed. As shown in fig. 14, the terminal may include: a Radio Frequency (RF) circuit 1401, a power supply 1402, a processor 1403, a memory 1404, an input unit 1405, a display unit 1406, a camera 1407, a communication interface 1408, and a Wireless Fidelity (WiFi) module 1409. Those skilled in the art will appreciate that the structure shown in fig. 14 does not constitute a limitation of the terminal device; the terminal device provided in the embodiments of the present application may include more or fewer components than those shown, may combine some components, or may adopt a different arrangement of components.
The following describes the various components of the terminal in detail with reference to fig. 14:
the RF circuit 1401 may be used for receiving and transmitting data during communication or a call. In particular, the RF circuit 1401 delivers live broadcast data sent by the server to the processor 1403 for processing, and transmits uplink data to the base station. In general, the RF circuit 1401 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
In addition, the RF circuit 1401 can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
WiFi is a short-range wireless transmission technology; the terminal can connect to an Access Point (AP) through the WiFi module 1409 and thereby access a data network. The WiFi module 1409 may be used for receiving and sending data during communication, and in particular may receive the video stream sent by the server.
The terminal may be physically connected to other devices via the communication interface 1408. Optionally, the communication interface 1408 is connected to the communication interface of the other device through a cable, so as to implement data transmission between the terminal and the other device.
In the embodiment of the present application, the terminal implements communication services and sends information to other contacts, so it needs a data transmission function; that is, it must include a communication module. Although fig. 14 shows communication modules such as the RF circuit 1401, the WiFi module 1409, and the communication interface 1408, it is understood that at least one of these components, or another communication module (such as a Bluetooth module), is present in the terminal for data transmission.
For example, when the terminal is a mobile phone, the terminal may include the RF circuit 1401, and may further include the WiFi module 1409; when the terminal is a computer, the terminal may include the communication interface 1408 and may further include the WiFi module 1409; when the terminal is a tablet computer, the terminal may include the WiFi module.
The input unit 1405 may be used to receive numeric or character information input by a user and generate key signal inputs related to user settings and function control of the terminal.
Optionally, the input unit 1405 may include a touch panel 1453 and other input devices 1454.
The touch panel 1453, also referred to as a touch screen, may collect touch operations of a user (for example, operations performed on or near the touch panel 1453 with a finger, a stylus, or any other suitable object or accessory) and carry out the corresponding operations according to a preset program, such as the user's selection of a gift. Optionally, the touch panel 1453 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position touched by the user, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 1403, and receives and executes commands sent by the processor 1403. In addition, the touch panel 1453 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
Optionally, the other input devices 1454 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 1406 may be used to display information input by a user or information provided to the user and various menus of the terminal. The display unit 1406 is a display system of the terminal, and is used for presenting interfaces, such as live pictures, and implementing human-computer interaction.
The display unit 1406 may include a display panel 1461. Alternatively, the Display panel 1461 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
Further, the touch panel 1453 can cover the display panel 1461, and when the touch panel 1453 detects a touch operation on or near the touch panel, the touch operation is transmitted to the processor 1403 to determine the type of touch event, and then the processor 1403 provides a corresponding visual output on the display panel 1461 according to the type of touch event.
Although in fig. 14 the touch panel 1453 and the display panel 1461 are implemented as two separate components to provide the input and output functions of the terminal, in some embodiments the touch panel 1453 and the display panel 1461 may be integrated to provide both functions.
The camera 1407 implements the shooting function of the terminal, capturing pictures or videos. For example, the anchor may shoot live video through the camera 1407. The camera 1407 can also implement the scanning function of the terminal, scanning an object such as a two-dimensional code or barcode.
The terminal also includes a power supply 1402 (such as a battery) for powering the various components. Optionally, the power source 1402 may be logically connected to the processor 1403 through a power management system, so as to implement functions of managing charging, discharging, power consumption, and the like through the power management system.
The memory 1404 may be a volatile or non-volatile memory, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1404 stores the computer program executed by the processor 1403.
The processor 1403 may include one or more Central Processing Units (CPUs), Graphics Processing Units (GPUs), Digital Signal Processors (DSPs), or the like. The graphics processing unit may be configured to process graphics, for example to render video pictures.
In particular, the processor 1403 is configured to implement the following steps when invoking the computer program stored in the memory 1404:
responding to the received video confluence operation, and sending a video confluence playing request to a server; the video confluence play request indicates that a first video and a second video are played simultaneously; the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video;
receiving a composite video stream returned by the server; the composite video stream is generated by the server according to a composite video frame, and the composite video frame is obtained by replacing picture data of a designated area of a corresponding video frame in a second video by picture data of a video frame of a first video by the server;
and playing the composite video stream.
In one possible implementation, the processor 1403 specifically performs:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
In a possible implementation manner, the first video is a live video, and the second video is a short video; or the first video is a short video, and the second video is a live video.
In one possible implementation, the second video is a short video; the processor 1403 further performs:
and in the process of playing the second video, displaying the video confluence prompt information according to the notification of the server.
In one possible implementation, the processor 1403 specifically performs:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
Although not shown, the terminal may further include at least one sensor, an audio circuit, and the like, which will not be described herein.
The embodiment of the present disclosure further provides a computer storage medium, where computer-executable instructions are stored in the computer storage medium, and the computer-executable instructions are used to implement any one of the video data processing methods described in the embodiment of the present disclosure.
In some possible embodiments, various aspects of the methods provided by the present disclosure may also be implemented in the form of a program product including program code; when the program product is run on a computer device, the program code causes the computer device to perform the steps of the methods according to the various exemplary embodiments of the present disclosure described above in this specification. For example, the computer device may perform any one of the video data processing methods described in the embodiments of the present disclosure.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims (29)

1. A video data processing method applied to a server, the method comprising:
receiving a video confluence playing request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
carrying out frame-by-frame synthesis on the video frame of the first video and the video frame of the second video to obtain a composite video frame; the composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video by the picture data of the video frame of the first video;
generating a composite video stream from the composite video frame;
and sending the composite video stream to the client.
2. The method of claim 1, wherein the frame-by-frame synthesizing the video frame of the first video with the video frame of the second video to obtain a synthesized video frame comprises:
and carrying out frame-by-frame synthesis on the video frame of the first video and the video frame of the second video according to a preset position proportion relation.
3. The method of claim 1, wherein after generating a composite video stream from the composite video frames, the method further comprises:
adding audio frames of the first video and audio frames of the second video to the composite video stream, respectively.
4. The method of claim 1, wherein the first video is a live video and the second video is a short video; or the first video is a short video, and the second video is a live video.
5. The method of claim 4, wherein the first video is a live video; the second video is a short video; before synthesizing the video frame of the first video and the video frame of the second video frame by frame to obtain a synthesized video frame, the method further comprises:
acquiring bullet screen data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
6. The method of claim 4, wherein the first video is a live video; the second video is a short video; before the receiving of the streaming video playing request sent by the client, the method further includes:
and if the fact that the anchor of the second video is live is determined in the process that the client plays the second video, the client is informed to display video confluence prompt information.
7. The method according to any one of claims 1 to 6, wherein after sending the composite video stream to the client, the method further comprises:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
8. A video data processing method applied to a client, the method comprising:
responding to the received video confluence operation, and sending a video confluence playing request to a server; the video confluence play request indicates that a first video and a second video are played simultaneously;
receiving a composite video stream returned by the server; the composite video stream is generated by the server according to a composite video frame, and the composite video frame is obtained by replacing picture data of a designated area of a corresponding video frame in a second video by picture data of a video frame of a first video by the server;
and playing the composite video stream.
9. The method of claim 8, wherein the video frames of the first video and the video frames of the second video in the composite video frame are arranged according to a predetermined position ratio.
10. The method of claim 8, wherein said playing said composite video stream comprises:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
11. The method of claim 8, wherein the first video is a live video and the second video is a short video; or the first video is a short video, and the second video is a live video.
12. The method of claim 11, wherein the second video is a short video;
before the responding to the received video confluence operation and sending a video confluence playing request to the server, the method further comprises:
and in the process of playing the second video, displaying the video confluence prompt information according to the notification of the server.
13. The method of claim 12, wherein the playing the composite video stream further comprises:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
14. A video data processing apparatus, characterized in that the apparatus comprises:
the request receiving unit is used for receiving a video confluence playing request sent by a client; the video confluence play request indicates that a first video and a second video are played simultaneously;
the video converging unit is used for synthesizing the video frames of the first video and the second video frame by frame to obtain a composite video frame and generating a composite video stream according to the composite video frame; the composite video frame is obtained by replacing the picture data of the designated area of the corresponding video frame in the second video by the picture data of the video frame of the first video;
and the video sending unit is used for sending the composite video stream to the client.
15. The apparatus of claim 14, wherein the video merging unit is further configured to:
and carrying out frame-by-frame synthesis on the video frame of the first video and the video frame of the second video according to a preset position proportion relation.
16. The apparatus of claim 14, wherein the video merging unit is further configured to:
and respectively adding the audio frame of the first video and the audio frame of the second video into the composite video stream, so that the client plays the audio frame of the first video or the audio frame of the second video according to the selection of a user.
17. The apparatus of claim 14, wherein the first video is a live video and the second video is a short video; or the first video is a short video, and the second video is a live video.
18. The apparatus of claim 17, wherein the first video is a live video; the second video is a short video; the video merging unit is further configured to:
acquiring bullet screen data of the first video;
and adding the bullet screen data into the picture data of the video frame of the first video.
19. The apparatus of claim 17, wherein the first video is a live video; the second video is a short video; the device further comprises:
and the notification sending unit is used for notifying the client to display the video confluence prompt information if the anchor of the second video is determined to be live in the process of playing the second video by the client.
20. The apparatus according to any one of claims 14 to 17, wherein the video sending unit is further configured to:
and if a single video playing request sent by the client is received, sending the video stream of the first video or the second video indicated to be played by the single video playing request to the client according to the video playing time carried in the single video playing request.
21. A video data processing apparatus, characterized in that the apparatus comprises:
the request sending unit is used for responding to the received video confluence operation and sending a video confluence playing request to the server; the video confluence play request indicates that a first video and a second video are played simultaneously;
the video receiving unit is used for receiving the composite video stream returned by the server; the composite video stream is generated by the server according to a composite video frame, and the composite video frame is obtained by replacing picture data of a designated area of a corresponding video frame in a second video by picture data of a video frame of a first video by the server;
and the video playing unit is used for playing the composite video stream.
22. The apparatus of claim 21, wherein the video frames of the first video and the video frames of the second video in the composite video frame are arranged according to a predetermined position ratio.
23. The apparatus of claim 21, wherein the video playing unit is configured to:
and in the process of playing the composite video stream, playing the audio frame of the first video or the audio frame of the second video in the composite video stream according to the audio selected by the user.
24. The apparatus of claim 21, wherein the first video is a live video and the second video is a short video; or the first video is a short video, and the second video is a live video.
25. The apparatus of claim 24, wherein the second video is a short video; the device further comprises:
and the confluence prompt unit is used for displaying video confluence prompt information according to the notification of the server in the process of playing the second video.
26. The apparatus of claim 25, wherein the video playing unit is further configured to:
in the process of playing the composite video stream, responding to the operation of clicking a designated area by a user, and sending a single video playing request to the server; the single video playing request indicates that a first video is played and carries the video playing time of the first video; or
In the process of playing the composite video stream, responding to the operation that a user clicks a video closing key, and sending a single video playing request to the server; and the single video playing request indicates that the second video is played and carries the video playing time of the second video.
27. A server comprising one or more processors and memory for storing processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video data processing method of any of claims 1-7.
28. A terminal comprising one or more processors and memory for storing instructions executable by the processors;
wherein the processor is configured to execute the instructions to implement the video data processing method of any of claims 8-13.
29. A computer-readable storage medium, in which a computer program is stored which, when executed by a processor, implements a video data processing method according to any one of claims 1 to 7 or according to any one of claims 8 to 13.
CN201911012307.2A 2019-10-23 2019-10-23 Video data processing method, device, server, terminal and storage medium Active CN110740346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911012307.2A CN110740346B (en) 2019-10-23 2019-10-23 Video data processing method, device, server, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911012307.2A CN110740346B (en) 2019-10-23 2019-10-23 Video data processing method, device, server, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN110740346A CN110740346A (en) 2020-01-31
CN110740346B true CN110740346B (en) 2022-04-22

Family

ID=69270935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911012307.2A Active CN110740346B (en) 2019-10-23 2019-10-23 Video data processing method, device, server, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN110740346B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565693B (en) * 2020-11-30 2023-05-16 Guangdong Rongwen Technology Group Co., Ltd. Method, system and equipment for monitoring video on demand
CN112637624B (en) * 2020-12-14 2023-07-18 Guangzhou Fanxing Huyu Information Technology Co., Ltd. Live stream processing method, device, equipment and storage medium
CN115484469B (en) * 2021-06-15 2024-01-09 Beijing ByteDance Network Technology Co., Ltd. Mic-linking (co-streaming) system, method, device, equipment and storage medium
CN115209222B (en) * 2022-06-15 2024-02-09 Shenzhen Streamax Technology Co., Ltd. Video playing method and device, electronic equipment and readable storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
WO2015093113A1 (en) * 2013-12-16 2015-06-25 Mitsubishi Electric Corporation Video processing apparatus and video displaying apparatus
WO2018095174A1 (en) * 2016-11-22 2018-05-31 广州华多网络科技有限公司 Control method, device, and terminal apparatus for synthesizing video stream of live streaming room
CN108900859A (en) * 2018-08-17 2018-11-27 Guangzhou Kugou Computer Technology Co., Ltd. Live broadcasting method and system

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN105430424B (en) * 2015-11-26 2018-12-04 Guangzhou Huaduo Network Technology Co., Ltd. Live video streaming method, device and system
US10728194B2 (en) * 2015-12-28 2020-07-28 Facebook, Inc. Systems and methods to selectively combine video streams
US10165316B2 (en) * 2016-03-31 2018-12-25 Viacom International Inc. Device, system, and method for hybrid media content distribution
CN106658145B (en) * 2016-12-27 2020-07-03 北京奇虎科技有限公司 Live broadcast data processing method and device
CN107018448A (en) * 2017-03-23 2017-08-04 Guangzhou Huaduo Network Technology Co., Ltd. Data processing method and device
CN108055552B (en) * 2017-12-13 2019-11-05 Guangzhou Huya Information Technology Co., Ltd. Live streaming room bullet-screen comment display method, device and corresponding terminal
CN108462883B (en) * 2018-01-08 2019-10-18 Ping An Technology (Shenzhen) Co., Ltd. Live-broadcast interaction method, apparatus, terminal device and storage medium
CN109151594A (en) * 2018-09-27 2019-01-04 Guangzhou Huya Information Technology Co., Ltd. Live-broadcast and playback video playing method, device and electronic equipment



Similar Documents

Publication Publication Date Title
CN110740346B (en) Video data processing method, device, server, terminal and storage medium
JP6324625B2 (en) Live interactive system, information transmission method, information reception method and apparatus
CN107396137B (en) Online interaction method, device and system
CN111050203B (en) Video processing method and device, video processing equipment and storage medium
WO2017181796A1 (en) Program interaction system, method, client and back-end server
CN112929687B (en) Live video-based interaction method, device, equipment and storage medium
CN112235647B (en) Network data processing method and device, terminal and server
CN110166794B (en) Live broadcast audio processing method, device and system
CN110418207B (en) Information processing method, device and storage medium
WO2018157812A1 (en) Method and apparatus for implementing video branch selection and playback
CN111541930B (en) Live broadcast picture display method and device, terminal and storage medium
CN107333162B (en) Method and device for playing live video
JP2020517206A (en) Communication method, device and system in live broadcast channel
KR101996468B1 (en) Method, system, and non-transitory computer readable medium for audio feedback during live broadcast
CN102918835A (en) Controllable device companion data
CN112312144B (en) Live broadcast method, device, equipment and storage medium
CN111246225B (en) Information interaction method and device, electronic equipment and computer readable storage medium
CN104023272A (en) Video screen editing method and device
CN114025180A (en) Game operation synchronization system, method, device, equipment and storage medium
WO2020238840A1 (en) Standalone program run method, apparatus, device, and storage medium
CN111294607B (en) Live broadcast interaction method and device, server and terminal
CN110830813A (en) Video switching method and device, electronic equipment and storage medium
CN113204671A (en) Resource display method, device, terminal, server, medium and product
CN112616078A (en) Screen projection processing method and device, electronic equipment and storage medium
US9420440B2 (en) Calling methods and devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant