CN111601136A - Video data processing method and device, computer equipment and storage medium

Info

Publication number: CN111601136A (application CN202010392160.0A); granted as CN111601136B
Authority: CN (China)
Prior art keywords: video data, playing, video, interface, timestamp
Legal status: Granted; Active
Application number: CN202010392160.0A
Other languages: Chinese (zh)
Other versions: CN111601136B (en)
Inventor: 翁名为
Assignee (original and current): Tencent Technology (Shenzhen) Co., Ltd.

Classifications

    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/4307: Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/472: End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/8547: Content authoring involving timestamps for synchronizing content


Abstract

The embodiments of the present application disclose a video data processing method and apparatus, a computer device, and a storage medium. The method includes: in response to a trigger operation on an operable control on a first playing interface, intercepting a target area on the first playing interface; when second video data matching a target object in the target area is acquired, dividing the first playing interface into a second playing interface and a third playing interface; and when a playing timestamp that differs from a second timestamp exists among the playing timestamps of the first video data and the second video data, adjusting the video data corresponding to that playing timestamp so that the playing timestamp of the adjusted video data stays consistent with the second timestamp. The embodiments of the present application can improve the accuracy of synchronous playing of multiple channels of video data.

Description

Video data processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a video data processing method and apparatus, a computer device, and a storage medium.
Background
Currently, when multiple (e.g., two) channels of video data are played synchronously, player 1 and player 2 can be used to play video data A and video data B at the same time. To keep video data A and video data B synchronized, the current playing positions of the two players may be polled periodically by a timer, and when the difference between the two current playing positions reaches a preset threshold (e.g., 1 s), one of the following two video data synchronization strategies is adopted.
For example, when player 1 has played for 10 seconds but player 2 has only played for 9 seconds for some reason (a stall, dropped frames, etc.), the difference between player 1 and player 2 reaches the preset threshold. One synchronization strategy the user terminal may adopt is: pause player 1 and let player 2 continue playing; once player 2 has also played for 10 seconds, the user terminal can resume player 1, so that the two channels of video data play synchronously. Optionally, another synchronization strategy the user terminal may adopt is: keep player 1 playing normally and play player 2 at a higher speed within the difference range (e.g., at a speed factor of 1.1) until the playing positions of player 2 and player 1 coincide (e.g., until both players reach the 20th second), after which player 2 resumes normal playback, so that the two channels of video data play synchronously. It can be seen that, when multiple channels of video data are synchronized by timer polling in the prior art, differences of relatively coarse granularity remain, so pictures at different progress points may be displayed while the multiple channels of video data are playing, which in turn reduces the accuracy of video data synchronization.
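For illustration only, the following Kotlin sketch shows the timer-polling approach described above; the Player interface, its members, and the 500 ms polling period are hypothetical stand-ins, not part of the original disclosure.

```kotlin
import java.util.Timer
import kotlin.concurrent.fixedRateTimer

// Hypothetical minimal player interface; positions are in milliseconds.
interface Player {
    val positionMs: Long
    fun pause()
    fun resume()
    fun setSpeed(rate: Float)
}

// Preset threshold from the example above (1 s).
const val THRESHOLD_MS = 1_000L

// Prior-art approach: poll both players on a timer; when player 1 drifts
// past the threshold ahead of player 2, pause it (strategy 1); when it
// falls behind, let it catch up at 1.1x speed (strategy 2).
fun startPollingSync(p1: Player, p2: Player): Timer =
    fixedRateTimer(name = "sync-poll", period = 500L) {
        val diffMs = p1.positionMs - p2.positionMs
        when {
            diffMs > THRESHOLD_MS -> p1.pause()
            diffMs < -THRESHOLD_MS -> p1.setSpeed(1.1f)
            else -> { p1.resume(); p1.setSpeed(1.0f) }
        }
    }
```

Because the check only runs at each polling tick, drift up to the threshold goes uncorrected; this is exactly the coarse granularity the present application aims to eliminate.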
Summary
The embodiments of the present application provide a video data processing method and apparatus, a computer device, and a storage medium, which can improve the precision of synchronous playing of multiple channels of video data.
One aspect of the embodiments of the present application provides a video data processing method, the method including:
in response to a trigger operation on an operable control on a first playing interface, intercepting a target area on the first playing interface, where the target area contains a target object and the first playing interface is obtained when first video data is played based on a first timestamp of an external clock;
when second video data matching the target object is acquired, dividing the first playing interface into a second playing interface for playing the first video data and a third playing interface for playing the second video data;
when a playing timestamp that differs from a second timestamp exists among the playing timestamps of the first video data and the second video data, adjusting the video data corresponding to that playing timestamp so that the playing timestamp of the adjusted video data stays consistent with the second timestamp, where the second timestamp is a timestamp of the external clock and is later than the first timestamp.
One aspect of the embodiments of the present application provides a video data processing apparatus, the apparatus including:
an intercepting module, configured to intercept a target area on a first playing interface in response to a trigger operation on an operable control on the first playing interface, where the target area contains a target object and the first playing interface is obtained when first video data is played based on a first timestamp of an external clock;
a dividing module, configured to divide the first playing interface into a second playing interface for playing the first video data and a third playing interface for playing the second video data when second video data matching the target object is acquired;
an adjusting module, configured to adjust, when a playing timestamp that differs from a second timestamp exists among the playing timestamps of the first video data and the second video data, the video data corresponding to that playing timestamp so that the playing timestamp of the adjusted video data stays consistent with the second timestamp, where the second timestamp is a timestamp of the external clock and is later than the first timestamp.
The adjusting module includes:
an acquisition unit, configured to acquire the second timestamp of the external clock and use it as a reference timestamp for adjusting the first video data and the second video data;
a first determining unit, configured to, when a playing timestamp that differs from the reference timestamp exists among the playing timestamps of the first video data and the second video data, take that playing timestamp as a timestamp to be synchronized, and take the video data corresponding to the timestamp to be synchronized as video data to be processed, where the video data to be processed includes one or both of the first video data and the second video data;
an adjusting unit, configured to adjust the decoded video sequence and the decoded audio sequence in the video data to be processed based on the reference timestamp and the timestamp to be synchronized, to obtain the adjusted video data.
The timestamp to be synchronized includes a video playing timestamp and an audio playing timestamp;
the adjusting unit includes:
a first determining subunit, configured to determine a first difference between the video playing timestamp and the reference timestamp, and acquire, in the decoded video sequence of the video data to be processed, the video frame to be synchronized associated with the first difference;
a second determining subunit, configured to determine a second difference between the audio playing timestamp and the reference timestamp, and acquire, in the decoded audio sequence of the video data to be processed, the audio frame to be synchronized associated with the second difference;
an adjusting subunit, configured to adjust the decoded video sequence and the decoded audio sequence in the video data to be processed based on the first difference, the second difference, the video frame to be synchronized, and the audio frame to be synchronized, to obtain the adjusted video data.
The adjusting subunit is further configured to:
take the decoded video sequence and the decoded audio sequence in the video data to be processed as a multimedia sequence to be synchronized, take the video frame to be synchronized and the audio frame to be synchronized as multimedia data frames, and take the first difference and the second difference as the to-be-synchronized differences of the multimedia data frames;
if a to-be-synchronized difference is positive, render the corresponding multimedia data frame in the multimedia sequence to be synchronized once the frame's waiting time reaches that difference, to obtain the adjusted video data;
if a to-be-synchronized difference is negative, perform frame dropping on the corresponding multimedia data frame in the multimedia sequence to be synchronized, to obtain the adjusted video data.
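As a rough sketch of the wait-or-drop rule just described, assuming a millisecond clock and hypothetical frame/render types (the disclosure does not specify an API):

```kotlin
// A decoded frame (video or audio) with its playing timestamp in ms.
data class MediaFrame(val ptsMs: Long, val payload: ByteArray)

// Wait-or-drop rule of the adjusting subunit. diffMs is the frame's
// to-be-synchronized difference against the reference timestamp:
// positive -> the frame is early, wait out the difference, then render;
// negative -> the frame is late, drop it; zero -> render immediately.
fun adjustFrame(frame: MediaFrame, referenceMs: Long, render: (MediaFrame) -> Unit) {
    val diffMs = frame.ptsMs - referenceMs
    when {
        diffMs > 0 -> { Thread.sleep(diffMs); render(frame) }
        diffMs < 0 -> { /* behind the external clock: frame is dropped */ }
        else -> render(frame)
    }
}
```

In a real player the wait would be scheduled on the render loop rather than with Thread.sleep, but the sign convention is the point: an early frame waits out its difference, while a late frame is discarded instead of stalling the other channel.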
The dividing module includes:
a second determining unit, configured to, when the second video data matching the target object is acquired, take the terminal display interface to which the first playing interface belongs as an interface to be processed, and determine a boundary line for playing the first video data and the second video data as well as the boundary position information of the boundary line in the interface to be processed;
a dividing unit, configured to divide the interface to be processed into the second playing interface and the third playing interface based on the boundary line and the boundary position information.
The apparatus further includes:
a first determining module, configured to acquire the target object tag of the target object and determine, in the second playing interface, a switching sub-interface independent of the second playing interface, where the switching sub-interface is an interface on the second playing interface;
a first output module, configured to output the target object tag to the switching sub-interface when the first video data is played on the second playing interface;
a first synchronous playing module, configured to synchronously play the second video data on the third playing interface.
The apparatus further includes:
a second output module, configured to, in response to a trigger operation on the recommendation button in the switching sub-interface, close the third playing interface, determine K pieces of cover display data associated with the target object tag, and output the K pieces of cover display data to an object display interface independent of the second playing interface, where K is a positive integer;
a switching module, configured to, in response to a trigger operation on target cover display data in the object display interface, acquire the target video data corresponding to the target cover display data, switch the second playing interface to a fourth playing interface, and play the target video data on the fourth playing interface, where the target cover display data is one of the K pieces of cover display data.
The apparatus further includes:
a starting module, configured to, in response to a start operation on the first video data, invoke the scheduling layer of the user terminal, start, based on the scheduling layer, an external clock associated with the first video data and a first player associated with the first video data, and take the timestamp corresponding to the start operation as the first timestamp of the external clock;
a decoding module, configured to decode a first encoded video stream through the first player to obtain the first video data, where the first encoded video stream is video data obtained by the user terminal from a first video address of the server, and the first video data includes a decoded video sequence and a decoded audio sequence;
a second synchronous playing module, configured to synchronously play the decoded video sequence and the decoded audio sequence of the first video data on the first playing interface based on the first timestamp.
The decoding module includes:
a receiving unit, configured to receive the first encoded video stream stored by the server at the first video address, where the first encoded video stream is obtained after the server encodes the first video data;
a separation unit, configured to perform data separation on the first encoded video stream based on the separator in the first player, to obtain the video packets and audio packets associated with the first encoded video stream;
a decoding unit, configured to decode the video packets based on the video decoder in the first player to obtain the decoded video sequence corresponding to the video packets, and decode the audio packets based on the audio decoder in the first player to obtain the decoded audio sequence corresponding to the audio packets;
a third determining unit, configured to take the decoded video sequence and the decoded audio sequence as the first video data.
The intercepting module includes:
an output unit, configured to acquire the operable control with a display duration threshold pushed by the server, output the operable control to the first playing interface, and display the progress animation of the operable control within the display duration threshold;
an intercepting unit, configured to intercept the target area on the first playing interface in response to a trigger operation on the operable control within the display duration threshold.
The target area contains at least one object;
the apparatus further includes:
a second determining module, configured to take, in the decoded video sequence, the video frame to which the target area belongs as a video frame to be processed, and determine the coordinate information of the target area in the video frame to be processed based on the target playing timestamp corresponding to the video frame to be processed, where the target playing timestamp is later than or equal to the first timestamp;
a sending module, configured to send the target playing timestamp and the coordinate information to the server, so that when the server determines the target object among the at least one object, the object tag corresponding to the target object is taken as the target object tag, where the target object is determined by the server based on a mapping relation table associated with the video frame to be processed; the mapping relation table contains Z areas associated with the video frame to be processed, each area corresponds to one object, each object corresponds to one object tag, and Z is a positive integer (see the lookup sketch after this list);
a third determining module, configured to determine the second video data matching the target object, where the second video data is determined by the server based on the second video address obtained from the target object tag.
The third determining module includes:
a starting unit, configured to start a second player associated with the second video data based on the scheduling layer, and acquire the second video address returned by the server based on the target object tag;
a pulling unit, configured to pull a second encoded video stream from the server through the second video address, where the second encoded video stream is obtained after the server encodes the second video data, and the second video data contains the target object;
a decoding unit, configured to decode the second encoded video stream through the second player to obtain the second video data.
One aspect of the present application provides a computer device, including a processor, a memory, and a network interface;
the processor is connected to the memory and the network interface, where the network interface is configured to provide a data communication function, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method in the above aspect of the embodiments of the present application.
One aspect of the present application provides a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, perform the method in the above aspect of the embodiments of the present application.
In the embodiments of the present application, first video data may be played on a first playing interface according to a first timestamp of an external clock, and when an operable control for acquiring a target object exists on the first playing interface, a trigger operation on that control may be responded to so as to intercept, on the first playing interface, a target area containing the target object. Further, when second video data matching the target object is acquired, the first playing interface is divided into a second playing interface and a third playing interface, so that the first video data is played on the second playing interface while the second video data is played on the third playing interface. It should be understood that the same external clock may be used to adjust the playing timestamps of one or both of the currently played first video data and second video data; that is, when video data whose playing timestamp differs from the second timestamp of the external clock is detected among the currently played video data, that video data can be dynamically adjusted at once, so that the playing timestamp of the adjusted first video data or second video data stays consistent with the second timestamp of the external clock. Because the timestamps of the same external clock are used to dynamically calibrate the first video data or the second video data, the video data whose playing progress is faster does not need to wait for the video data whose playing progress is slower, and the granularity of video synchronization can be refined. Therefore, when the first video data and the second video data are played on the same terminal, the playing progress of the video data presented on the different playing interfaces can be kept the same as far as possible, which improves the precision of synchronous playing of multiple channels of video data.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be derived from them by those skilled in the art without creative effort.
Fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application;
fig. 2 is a schematic view of a scenario for performing data interaction according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a video data processing method according to an embodiment of the present application;
fig. 4 is a schematic diagram of a relationship between an external clock and a plurality of players according to an embodiment of the present application;
fig. 5 is a schematic diagram illustrating a basic principle of playing video data through a player according to an embodiment of the present application;
fig. 6 is a schematic view of a scene of a decoded video sequence obtained by decoding a video packet according to an embodiment of the present application;
fig. 7 is a schematic diagram of a progress animation of an operable control according to an embodiment of the present application;
fig. 8 is a schematic view of a scenario of a timestamp recorded by an external clock according to an embodiment of the present application;
fig. 9 is a schematic flowchart of a video data processing method according to an embodiment of the present application;
fig. 10 is a schematic diagram illustrating a scenario of recommending target video data associated with a target object according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present application;
fig. 13 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Please refer to fig. 1, which is a schematic structural diagram of a network architecture according to an embodiment of the present application. As shown in fig. 1, the network architecture may include a server 10 and a user terminal cluster, and the user terminal cluster may include one or more user terminals, where the number of user terminals is not limited. As shown in fig. 1, the cluster may specifically include a user terminal 100a, a user terminal 100b, a user terminal 100c, …, and a user terminal 100n. As shown in fig. 1, the user terminals 100a, 100b, 100c, …, 100n may each be connected to the server 10 via a network, so that each user terminal can interact with the server 10 via the network.
The server 10 shown in fig. 1 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform.
For ease of understanding, in the embodiments of the present application one user terminal may be selected from the plurality of user terminals shown in fig. 1 as a target user terminal, and the target user terminal may be an intelligent terminal with a video data processing function, such as a smartphone, a tablet computer, a notebook computer, a desktop computer, or a smart television. For example, the user terminal 100a shown in fig. 1 may be used as the target user terminal, and a target application having the video data processing function may be integrated in the target user terminal. It should be understood that each user terminal in the user terminal cluster shown in fig. 1 may be installed with the target application, and when the target application runs in a user terminal, it may exchange data with the server 10 shown in fig. 1. The target application may include a social client, a multimedia client (e.g., a video client), an entertainment client (e.g., a game client), an education client, a live-streaming client, or the like, each having a frame-sequence (e.g., frame-animation sequence) loading and playing function.
It is understood that the target user terminal may include a plurality of players, specifically player 1, player 2, …, and player n. Each of the plurality of players may synchronize the video data it plays (here, the decoded audio sequence and the decoded video sequence) using the timestamp recorded by the same external clock, so that the players running in the same terminal can play synchronously; for example, frame-level video synchronization may be achieved on the playing interface corresponding to each player according to the external clock.
For example, when a target user corresponding to a target user terminal (e.g., the user terminal 100a) accesses a target application (e.g., the video client), the target user may perform a trigger operation (i.e., a start operation) on first video data to start a first player (e.g., the player 1) of the target application, and play the first video data based on a first timestamp of an external clock on a play interface (e.g., the play interface 1000) of the first player. The trigger operation may include a contact operation such as a click or a long press, or may also include a non-contact operation such as a voice or a gesture, which is not limited herein.
In this embodiment of the present application, a timestamp for starting the first video data may be referred to as a first timestamp recorded by the external clock, and the first timestamp may be a video playing timestamp of a first video frame in the first video data. In the embodiment of the present application, a player for playing the first video data may be referred to as a first player, and a playing interface for playing the first video data may be referred to as a first playing interface.
It should be understood that the first video data may be video data selected by the target user corresponding to the user terminal 100a from a default display interface of the target application (e.g., video client 1). The default display interface may be the video display interface corresponding to a home channel (i.e., a first channel) that is started together with the target application. The target user may select, from the video display interface corresponding to the home channel, certain video data to be played (e.g., video data A, which may be a television program, a variety program, or an educational program). In this case, in the embodiments of the present application, the video data A selected on the video display interface is collectively referred to as the first video data, and the playing interface for playing the video data A is collectively referred to as the first playing interface.
Optionally, it can be understood that the target user may also input corresponding search text in the text search area of the video display interface corresponding to the home channel, so that the user terminal 100a can search for video data of interest according to the search text and display the found video data (for example, video data B, which may be a variety program) on the search interface of the target application (e.g., video client 1). Further, the target user may perform a trigger operation on the video data B on the search interface to start the first player of the target application and play the video data B on the playing interface of the first player based on the first timestamp of the external clock. In this case, in the embodiments of the present application, the video data B triggered on the search interface is collectively referred to as the first video data, and the timestamp at which the video data B is triggered is referred to as the first timestamp of the external clock, so that the decoded audio sequence and the decoded video sequence of the video data B can be played with audio-video synchronization based on the first timestamp of the external clock.
Alternatively, it should be understood that the target application (e.g., the video client 1) may further include a second channel different from the first channel, and the second channel may be used to obtain video data (e.g., live video data, etc.) pushed by the server 10. Based on this, in the embodiment of the application, the target application may be started, and simultaneously, the second channel in the target application is started, so as to play the video data pushed by the server 10, for example, the live video data C, on the video playing interface corresponding to the second channel. It should be understood that in the process of starting the target application, the embodiment of the application may directly start the first player of the target application, so as to quickly play the live video data C on the play interface of the first player. At this time, in the embodiment of the present application, the obtained live video data C may be collectively referred to as first video data, and the live video data C (the first video data) is played on a playing interface (i.e., a first playing interface) of the first player based on a first timestamp of an external clock.
Alternatively, it should be understood that the second channel in the video client 1 may also be another application client (e.g., video client 2) independent of the video client 1; in this case, the video client 2 is collectively referred to as the target application of the target user terminal in the embodiments of the present application. It can be understood that, in response to a trigger operation performed by the target user on the video client 2 (i.e., a new target application), the user terminal 100a may directly play video data pushed by the server 10, for example video data D, on the video display interface of the video client 2. The video data D may be video data pushed directly to the user terminal 100a by the server 10 based on service data such as the target user's historical browsing data and friends' viewing data; the video data D may also be video data with a higher browsing volume determined by the server 10 based on counted browsing volumes, which is not limited herein. It should be understood that, in the embodiments of the present application, when the video client 2 is started, the first player corresponding to the video client 2 may be started together, and the video data D may then be played directly on the playing interface of the first player. In this case, the obtained video data D is collectively referred to as the first video data, and the video data D is played on the first playing interface (i.e., the video display interface of the video client 2) based on the first timestamp of the external clock.
It can be understood that, during the playing of the first video data (e.g., video data A), an operable control with a display duration threshold (e.g., 5 seconds) may be displayed on the first playing interface (e.g., playing interface 1000) of the user terminal. The operable control can be used to intercept an area containing an object the target user is interested in. Further, the user terminal 100a may intercept a target area on the playing interface 1000 in response to a trigger operation on the operable control on the playing interface 1000, where the target area may contain the target object. Further, the user terminal 100a may acquire video data (e.g., video data 10A) matching the target object from the server 10. In the embodiments of the present application, the video data matching the target object is referred to as the second video data, and a player (e.g., player 2) playing the second video data is referred to as the second player. The shooting angle of the target object in the second video data (e.g., video data 10A) may differ from that of the target object in the first video data (e.g., video data A).
It should be understood that, when the user terminal 100a acquires the video data 10A, the terminal display interface where the playing interface 1000 is located may be divided into two playing interfaces, for example a playing interface 2000 for playing the video data A through player 1 and a playing interface 3000 for playing the video data 10A through player 2. In the embodiments of the present application, within the same terminal display interface, the playing interface for playing the first video data (e.g., playing interface 2000) is referred to as the second playing interface, and the playing interface for synchronously playing the second video data (e.g., playing interface 3000) is referred to as the third playing interface.
Further, when a playing timestamp that differs from the second timestamp exists among the playing timestamps of the video data A and the video data 10A, the user terminal 100a may adjust the video data corresponding to that playing timestamp, so that the playing timestamp of the adjusted video data stays consistent with the second timestamp. The second timestamp in the embodiments of the present application is a timestamp recorded by the external clock, and the second timestamp is later than the first timestamp. It can be understood that the adjusted video data may include the adjusted video data A and the adjusted video data 10A. The user terminal 100a can play the adjusted video data A on the playing interface 2000 through player 1 while playing the adjusted video data 10A on the playing interface 3000 through player 2.
Further, please refer to fig. 2, which is a schematic view of a scenario of data interaction according to an embodiment of the present application. The user terminal 200 in the embodiments of the present application may be any user terminal in the user terminal cluster shown in fig. 1, for example the user terminal 100a. The server 210 in the embodiments of the present application may be the server 10 shown in fig. 1.
The first video data in the embodiments of the present application may include a plurality of objects, specifically an object a, an object b, and an object c. The first video data may be the video data 10A shown in fig. 2 (the video data 10A may be video data captured from a normal camera stand). The server 210 shown in fig. 2 may store multiple channels of video data captured from stands at different shooting angles in the same shooting scene (e.g., scene X); these may specifically include the video data 10A captured from the normal stand shown in fig. 2, as well as video data captured from close-up stands, such as the video data 20A captured from a close-up stand on object a, the video data 20B captured from a close-up stand on object b, and the video data 20C captured from a close-up stand on object c. In addition, the server 210 may also store recommendation data associated with each of the three objects, and the specific amount of recommendation data associated with each object is not limited herein. The recommendation data may be reached by performing a trigger operation on the recommendation button "more works" in the switching sub-interface shown in fig. 2. The recommendation data may be historical playing data associated with an object stored in the server 210, or data that has been stored in the server 210 but not yet watched by the target user, such as video data of television programs, variety programs, movies, short videos, and behind-the-scenes footage, and text data such as news and profiles associated with the object.
As shown in fig. 2, at the first timestamp of the external clock, the user terminal 200 may play the first video data (e.g., the video data 10A) on the playing interface 1. Here, it is understood that the target user corresponding to the user terminal 200 may perform a start operation on the video data 10A in the target application (e.g., the video client 1 described above); the user terminal 200 may then start, in response to the start operation, the external clock associated with the video data 10A and a player (e.g., player 1) for playing the video data 10A, so that the video data 10A can be played synchronously based on the first timestamp recorded by the external clock.
Here, the start operation performed by the target user on the video data 10A may be the first start operation performed on the video data 10A; in other words, the target user may perform the start operation when watching the video data 10A for the first time, and the user terminal 200 may start the external clock in response to that operation. Optionally, the start operation performed by the target user on the video data 10A may also be a repeated start operation. For example, after the video data 10A has played for a certain period (e.g., 35 minutes), the target user exits the playing interface 1 of the video data 10A; at this point the external clock stops counting, and the timestamp at which the playing interface 1 was exited is saved. When the target user watches the video data 10A again, the start operation may be performed on the video data 10A once more, and the user terminal 200 may start the external clock in response, so that the external clock resumes from the timestamp (e.g., 35 minutes) recorded when the playing interface 1 was last exited.
The playing interface 1 of the user terminal 200 may obtain an operable control with a display duration threshold (e.g., 5 seconds) from the server 210 and display the progress animation of the operable control within the display duration threshold. It should be understood that if the user terminal 200 does not receive a trigger operation on the operable control from the target user within the display duration threshold, the operable control may be hidden on the first playing interface. If the user terminal 200 does receive such a trigger operation within the display duration threshold, the user terminal 200 may respond to it by intercepting, in the video frame currently played on the playing interface 1, a target area containing a target object (e.g., object a).
It can be understood that, when selecting the target area where the target object is located, the target user may adjust the range of the intercepted area so that the target object is delineated more accurately. This allows the server 210 to identify the target object tag of the target object more quickly, and thus return the acquired video data associated with the target object (e.g., the video data 20A) to the user terminal 200.
As shown in fig. 2, the user terminal 200 may send to the server 210 a service request for acquiring video data (i.e., second video data) matching the target object in the target area, so that the server 210 determines the second video data (e.g., video data 20A) based on the service request. It should be appreciated that the server 210 may determine the target object (e.g., object a) in the target area and identify the target object's tag, e.g., object a's name, artist name, birthday, age, and so on. Further, the server 210 may acquire video data matching object a (e.g., the video data 20A) based on object a's object tag. The video data 20A may be video data containing object a, and the shooting angle of object a in the video data 20A differs from that in the video data 10A.
Further, when the user terminal 200 acquires the video data 20A, the playing interface 1 may be divided, based on the boundary line shown in fig. 2, into a playing interface 2 for playing the video data 10A and a playing interface 3 for playing the video data 20A. It should be understood that the user terminal 200 may obtain the target object tag, determine in the playing interface 2 a switching sub-interface independent of the playing interface 2, and then output the target object tag to the switching sub-interface while the video data 10A is played on the playing interface 2; at the same time, the user terminal 200 may synchronously play the video data 20A on the playing interface 3. The switching sub-interface may further contain a recommendation button (e.g., "more works"), so that when the target user subsequently performs a trigger operation on the "more works" recommendation button in the switching sub-interface, the playing interface jumps to a playing interface for viewing the recommendation data associated with object a (e.g., the video data 30A).
A specific implementation in which the user terminal adjusts the video data corresponding to a playing timestamp that differs from the second timestamp, when such a playing timestamp exists among the playing timestamps of the first video data and the second video data, is described in the embodiments corresponding to fig. 3 to fig. 10 below.
Further, please refer to fig. 3, which is a schematic flowchart of a video data processing method according to an embodiment of the present application. As shown in fig. 3, the method may be performed by a user terminal (e.g., the user terminal 100a shown in fig. 1), by a server (e.g., the server 10 shown in fig. 1), or by both. For ease of understanding, the embodiments of the present application are described by taking the method being executed by a user terminal as an example; the method may include at least the following steps S101 to S103:
Step S101: in response to a trigger operation on the operable control on the first playing interface, intercept a target area on the first playing interface.
It should be understood that, before performing step S101, the user terminal integrated with the target application (e.g., a video client) may start the external clock associated with the first video data in response to a start operation on the first video data, and may take the timestamp at which the first video data is started as the first timestamp of the external clock, so that the first video data can be played synchronously on the first playing interface based on the first timestamp of the external clock (here, the decoded audio sequence and the decoded video sequence of the first video data are played synchronously). Further, it can be understood that, while the first video data is played on the first playing interface, the user terminal may obtain an operable control with a display duration threshold, output the operable control to the first playing interface, and display the progress animation of the operable control within the display duration threshold. Within the display duration threshold, the target user may perform a trigger operation on the operable control, so that the target area containing the target object can be intercepted on the first playing interface. The first playing interface is obtained when the first video data is played based on the first timestamp of the external clock.
It should be understood that the start operation performed by the target user on the first video data may specifically include a contact operation such as a click or a long press, or a non-contact operation such as voice or a gesture, which is not limited herein. It can be understood that, when the target user accesses the target application (e.g., the video client), the server may take comprehensively screened video data as home-page video data and push it to the target user, so that the target user selects, on the video display interface where the home-page video data is located, certain video data matching his or her interest as the first video data.
It should be appreciated that the user terminal may invoke a scheduling layer of the user terminal in response to the start operation, may start the external clock associated with the first video data and the first player associated with the first video data based on the scheduling layer, and may take a timestamp corresponding to the start operation as a first timestamp of the external clock. Among other things, the scheduling layer may be used to manage and schedule the external clock and the player (e.g., the first player).
For ease of understanding, please refer to fig. 4, which is a schematic diagram of the relationship between an external clock and a plurality of players according to an embodiment of the present application. As shown in fig. 4, the user terminal in the embodiments of the present application may be a user terminal integrated with the target application, and may be any user terminal in the user terminal cluster in the embodiment corresponding to fig. 1, for example the user terminal 100a.
It is to be understood that the user terminal may manage and schedule a plurality of players, specifically player 1, player 2, …, and player N, together with the external clock, through the scheduling layer shown in fig. 4. It should be understood that each player has its own playing clock, so different playing differences arise while multiple channels of video data are playing. In view of this, in the embodiments of the present application, all players may perform video synchronization using the timestamp of the same external clock, so that video data with the same playing progress can be played synchronously by multiple players on the same terminal display interface of the user terminal. The video data may include decoded video sequences and decoded audio sequences.
The external clock may be a clock that increases at the same pace as a physical clock. In practice, the external clock may be the system clock or the elapsed time since the system was started, and the clock unit may be accurate to the millisecond. It should be understood that the external clock may provide several interfaces so that the user terminal can manage and schedule the external clock and each of the plurality of players through the scheduling layer, for example: starting the clock (start), pausing the clock (pause), resuming the clock (resume), resetting the clock (reset), and reading the clock (getClock).
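A minimal sketch of such an external clock, assuming a monotonic millisecond time source; the method names follow the interfaces listed above, while the internals are illustrative:

```kotlin
// External clock backed by a monotonic millisecond source. While running,
// getClock() = saved position + time elapsed since the last start/resume.
class ExternalClock {
    private var baseMs = 0L       // saved playing position, in ms
    private var startedAtMs = 0L  // monotonic time of the last start/resume
    private var running = false

    private fun nowMs() = System.nanoTime() / 1_000_000

    fun start() { baseMs = 0L; startedAtMs = nowMs(); running = true }

    fun pause() { if (running) { baseMs = getClock(); running = false } }

    fun resume() { if (!running) { startedAtMs = nowMs(); running = true } }

    // Rewind or fast-forward to an arbitrary position.
    fun reset(positionMs: Long) { baseMs = positionMs; startedAtMs = nowMs() }

    fun getClock(): Long = if (running) baseMs + (nowMs() - startedAtMs) else baseMs
}
```

The players never write to this clock; they only read getClock() and adjust their own frames against it, which is what lets one clock drive any number of players.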
It should be understood that the user terminal may divide the terminal display interface in the target application into a plurality of playing interfaces, specifically playing interface 1, playing interface 2, …, and playing interface N. Playing interface 1 may be used to play video data (e.g., video data 1) through player 1, playing interface 2 may be used to play video data (e.g., video data 2) through player 2, …, and playing interface N may be used to play video data (e.g., video data N) through player N. The video data 1, video data 2, …, and video data N may be video data with different shooting angles in the same shooting scene (e.g., a concert of star group A). For example, video data 1 may be video data shot from the normal stand of star group A, video data 2 may be video data shot from the close-up stand on object A in star group A, …, and video data N may be video data shot from the close-up stand on object N in star group A.
In a terminal display interface containing multiple playing interfaces, there may be a centralized control button for centrally controlling the multiple players and the external clock. It can be understood that, when the target user performs a trigger operation on the centralized control button (i.e., a centralized control operation), the user terminal may respond to it so that the players corresponding to the multiple playing interfaces in the terminal display interface and the external clock can be controlled simultaneously. Optionally, the centralized control operation may also be a double-click operation performed by the target user on the terminal display interface, from which a centralized instruction for controlling the multiple players and the external clock can be obtained, and each of the multiple players as well as the external clock can be controlled simultaneously based on that instruction. The centralized control operation may also take other forms, which are not limited herein.
For example, the target user corresponding to the user terminal may perform, on the centralized control button in the terminal display interface containing the multiple playing interfaces, a trigger operation for playing the multiple channels of video data for the first time (i.e., a start operation), so that the user terminal may respond to the start operation, invoke its scheduling layer, and start each of the multiple players and the external clock based on the scheduling layer. At this point, the timestamp recorded by the external clock starts counting from zero.
Optionally, the target user corresponding to the user terminal may perform, on the centralized control button in the terminal display interface containing the multiple playing interfaces, a trigger operation for pausing the playback of the multiple channels of video data (i.e., a pause operation), so that the user terminal may respond to the pause operation, invoke its scheduling layer, and pause each of the multiple players and the external clock based on the scheduling layer. At this point, the external clock keeps the currently recorded timestamp and stops counting.
Optionally, the target user corresponding to the user terminal may perform, on the centralized control button in the terminal display interface containing the multiple playing interfaces, a trigger operation for continuing the playback of the multiple channels of video data (i.e., a continue operation), so that the user terminal may respond to the continue operation, invoke its scheduling layer, and resume each of the multiple players and the external clock based on the scheduling layer. At this point, the external clock resumes from the timestamp recorded when the video data was last paused and continues counting.
Optionally, the target user corresponding to the user terminal may perform, on the centralized control button in the terminal display interface containing the multiple playing interfaces, a trigger operation for resetting the playback of the multiple channels of video data (i.e., a reset operation); in other words, the target user replays video data that has already been played (i.e., rewinds), or is not interested in the currently played portion and wants to fast-forward the current playing progress (i.e., fast-forwards). In this case, the user terminal may invoke its scheduling layer in response to the reset operation and reset the timestamp recorded by the external clock based on the scheduling layer (i.e., decrease or increase it accordingly), so that the multiple players can fetch the timestamp reset by the external clock again and synchronize the video data corresponding to each player based on the reset timestamp.
In addition, when the simultaneous playing of each of the plurality of video data is completed, the user terminal may invoke a scheduling layer of the user terminal, and directly stop each of the plurality of players and the external clock based on the scheduling layer, at which time the external clock will not continue to count time, or the timestamp recorded by the external clock may be reset to 0.
Of course, while the multiple video data are being played simultaneously, the target user corresponding to the user terminal may perform a trigger operation (i.e., a closing operation) on the video data (e.g., video data 2) played by a certain player (e.g., player 2). The user terminal may then invoke its scheduling layer, pause player 2 based on the scheduling layer, and let all players other than player 2 continue playing. At this time, playing interface 2 used for playing video data 2 may be closed on the terminal display interface of the user terminal.
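By way of illustration only, the centralized management and control described above might be sketched in C++ as follows. All class, function, and member names here (SchedulingLayer, ExternalClock, Player, etc.) are hypothetical and are not part of this application; the sketch merely shows one shared clock being started, paused, resumed, reset, and stopped together with every player.

```cpp
#include <memory>
#include <vector>

// Hypothetical external clock: the single timestamp source shared by all players.
class ExternalClock {
public:
    void start()  { runningUs_ = 0; paused_ = false; }      // starting: count from zero
    void pause()  { paused_ = true; }                        // pausing: keep current timestamp
    void resume() { paused_ = false; }                       // continuing: resume counting
    void seek(long long deltaUs) { runningUs_ += deltaUs; }  // resetting: decrease or increase
    void stop()   { runningUs_ = 0; paused_ = true; }        // stopping: optionally back to 0
    long long getClockUs() const { return runningUs_; }
private:
    long long runningUs_ = 0;
    bool paused_ = true;
};

// Hypothetical player; decoding and rendering are omitted.
class Player {
public:
    void start()  {}
    void pause()  {}
    void resume() {}
    void stop()   {}
};

// Hypothetical scheduling layer: one centralized operation fans out to every
// player and to the shared external clock at the same time.
class SchedulingLayer {
public:
    void onStart()  { clock_.start();  for (auto& p : players_) p->start();  }
    void onPause()  { clock_.pause();  for (auto& p : players_) p->pause();  }
    void onResume() { clock_.resume(); for (auto& p : players_) p->resume(); }
    void onReset(long long deltaUs) { clock_.seek(deltaUs); } // players re-read the clock
    void onStop()   { clock_.stop();   for (auto& p : players_) p->stop();   }
private:
    ExternalClock clock_;
    std::vector<std::unique_ptr<Player>> players_;
};
```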
Further, after the user terminal starts the first player and the external clock, the first encoded video stream may be decoded by the first player to obtain the first video data. The user terminal obtains, from the first video address of the server corresponding to the target application, the video data produced by encoding the first video data; in this embodiment, the video data obtained by the server encoding the first video data is referred to as the first encoded video stream. The first video data may comprise a decoded video sequence and a decoded audio sequence. The user terminal may then synchronously play the decoded video sequence and the decoded audio sequence of the first video data on the first playing interface based on the first timestamp of the external clock.
It is to be understood that the user terminal may receive the first encoded video stream stored by the server at the first video address and perform data separation on it based on the separator in the first player, thereby obtaining the video packets and audio packets associated with the first encoded video stream. Further, the user terminal may decode the video packets based on a video decoder in the first player to obtain the corresponding decoded video sequence, and decode the audio packets based on an audio decoder in the first player to obtain the corresponding decoded audio sequence. The user terminal may then take the decoded video sequence and the decoded audio sequence as the first video data.
For ease of understanding, please refer to fig. 5, which is a schematic diagram illustrating the basic principle of playing video data by a player according to an embodiment of the present application. As shown in fig. 5, the encoded video stream 100 in the embodiment of the present application may be obtained by the server encoding the video data 5A, and is pulled by the user terminal from a video address returned by the server. The user terminal may be a user terminal integrated with a target application (e.g., the user terminal 100a shown in fig. 1), and the server may be the server corresponding to the target application (e.g., the server 10 shown in fig. 1).
It will be appreciated that when the video data 5A propagates over a network, various streaming media protocols are often required (yielding the encoded video stream 100), such as HTTP, RTMP, or MMS. Besides the video data 5A itself, these protocols also transmit signaling data, which may specifically include playback control (e.g., play, pause, stop), descriptions of the network status, and the like. The data separation that the player of the user terminal performs on the encoded video stream 100 may therefore include de-protocoling and decapsulation.
In the de-protocoling step, the signaling data is removed and only the audio and video data is retained; for example, data transmitted over the RTMP protocol yields FLV-format data after de-protocoling. In the decapsulation step, the player's splitter separates the input data in its encapsulation format into compression-encoded video stream data (i.e., the video packets 10) and compression-encoded audio stream data (i.e., the audio packets 20). There are many encapsulation formats, such as MP4, MKV, RMVB, TS, FLV, and AVI; an encapsulation format packs the compressed and encoded video data and audio data together according to a certain layout. For example, FLV-format data, after decapsulation, yields an H.264-encoded video stream and an AAC-encoded audio stream.
Further, the user terminal may decode the video packets 10 through a video decoder in the player to obtain the decoded video sequence; in other words, the video decoder outputs the compression-encoded video data as uncompressed color data, e.g., YUV420P or RGB. Likewise, the user terminal may decode the audio packets 20 through an audio decoder in the player to obtain the decoded audio sequence; in other words, the audio decoder outputs the compression-encoded audio data as uncompressed audio sample data, e.g., PCM data. The user terminal may then take the decoded video sequence and the decoded audio sequence as the first video data (the video data 5A shown in fig. 5).
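A minimal sketch of the pipeline of fig. 5 (de-protocoling and decapsulation followed by video and audio decoding) might look as follows. The demux and decode functions are stubs, and every name is hypothetical rather than taken from this application; a real player would strip the protocol (e.g., RTMP to FLV), decapsulate (FLV to H.264 plus AAC), and run actual codecs here.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

struct Packet { std::vector<uint8_t> data; bool isVideo; };
struct VideoFrame { /* uncompressed color data, e.g. YUV420P or RGB */ };
struct AudioFrame { /* uncompressed sample data, e.g. PCM */ };

// Stubs standing in for the player's separator and decoders.
std::vector<Packet> demux(const std::vector<uint8_t>&) { return {}; }
VideoFrame decodeVideo(const Packet&) { return {}; }   // e.g. H.264 -> YUV420P
AudioFrame decodeAudio(const Packet&) { return {}; }   // e.g. AAC  -> PCM

// The decoded video sequence plus the decoded audio sequence together form
// the first video data.
std::pair<std::vector<VideoFrame>, std::vector<AudioFrame>>
decodeEncodedVideoStream(const std::vector<uint8_t>& encodedStream) {
    std::vector<VideoFrame> decodedVideoSequence;
    std::vector<AudioFrame> decodedAudioSequence;
    for (const Packet& pkt : demux(encodedStream)) {    // data separation
        if (pkt.isVideo) decodedVideoSequence.push_back(decodeVideo(pkt));
        else             decodedAudioSequence.push_back(decodeAudio(pkt));
    }
    return {std::move(decodedVideoSequence), std::move(decodedAudioSequence)};
}
```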
For easy understanding, please refer to fig. 6, which is a schematic view of a scene in which a decoded video sequence is obtained by decoding video packets according to an embodiment of the present application. The video data in the embodiment of the present application (e.g., the video data 5A shown in fig. 5) may be split into a plurality of frame groups (e.g., 2) at capture time. As shown in fig. 6, the captured video sequence 100 may specifically include frame group 1 and frame group 2. Frame group 1 may include an I0 frame, B1 frame, B2 frame, P3 frame, B4 frame, B5 frame, P6 frame, B7 frame, and B8 frame. Frame group 2 may include an I[0] frame, B[1] frame, B[2] frame, and P[3] frame.
A frame group refers to an image sequence composed of 1 I frame and a plurality of B/P frames, and is the basic unit accessed by a video encoder and a video decoder. It is understood that an I frame (Intra-coded picture, often referred to as a key frame) contains complete picture information, belongs to the intra-coded pictures, contains no motion vectors, and does not need to reference other frames when decoded. Therefore, the channel can be switched at an I-frame picture without the picture being lost or becoming undecodable. A P frame (Predictive-coded picture) is an inter-coded frame that the video encoder codes by prediction from a previous I frame or P frame. A B frame (Bi-directionally predicted picture) is an inter-coded frame that is bi-directionally predictively coded using preceding and/or following I frames or P frames; B frames are generally not used as reference frames.
As shown in fig. 6, the order in which the video frames of the captured video sequence 100 are captured may be referred to as the capture order. When the video data 5A is stored or transmitted, the captured video sequence 100 may be encoded by a video encoder based on its frame groups (e.g., frame group 1 and frame group 2), yielding the encoded video sequence 200; the order of the video frames in the encoded video sequence 200 may be referred to as the encoding order. Further, when the video data 5A is played, the pulled encoded video sequence 200 is decoded by a video decoder to obtain the decoded video sequence 300, which can then be displayed on a playing interface of the user terminal. The order in which the video frames of the decoded video sequence 300 are displayed on the playing interface is referred to as the display order. The capture order may be the same as the display order.
It should be appreciated that a video decoder needs to buffer a certain number of video frames when decoding the encoded video sequence 200. When a new decoding action is initiated, the video decoder may need to acquire more than one frame group (e.g., frame group 1 and frame group 2) in order to decode a single frame group, because of the dependency relationships between video frames during decoding. For example, when the video decoder receives an I frame, a P frame, and a B frame, it needs to decode the I frame and the P frame first before it can decode the B frame.
As shown in FIG. 6, within a frame group (e.g., frame group 1), the B1 frame in the encoded video sequence 200 depends on the I0 frame and the P3 frame; when the video decoder decodes the B1 frame of frame group 1, the P3 frame needs to be decoded first, and the B1 frame can then be decoded based on the I0 frame and the P3 frame. Furthermore, since the B7 frame of frame group 1 in the encoded video sequence 200 depends on the I[0] frame in the next frame group (i.e., frame group 2), the video decoder needs to decode the I[0] frame of frame group 2 first when decoding the B7 frame of frame group 1, and can then decode the B7 frame based on the I[0] frame of frame group 2 and the P6 frame of frame group 1. As a result, the decoding order differs from the display order.
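The difference between decoding order and display order can be made concrete with a small sketch: assuming, for illustration, that each frame carries a presentation timestamp, the decoder's output order can be re-sorted by that timestamp to recover the display (capture) order.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

struct Frame { char type; int pts; }; // pts: presentation (display) timestamp

int main() {
    // Part of frame group 1 of fig. 6 as it might leave the decoder, i.e. in
    // decoding order: reference frames (I, P) precede the B frames that
    // depend on them.
    std::vector<Frame> decodeOrder = {
        {'I', 0}, {'P', 3}, {'B', 1}, {'B', 2},
        {'P', 6}, {'B', 4}, {'B', 5},
    };
    // Re-sorting by presentation timestamp restores the display order.
    std::vector<Frame> displayOrder = decodeOrder;
    std::sort(displayOrder.begin(), displayOrder.end(),
              [](const Frame& a, const Frame& b) { return a.pts < b.pts; });
    for (const Frame& f : displayOrder) std::printf("%c%d ", f.type, f.pts);
    // prints: I0 B1 B2 P3 B4 B5 P6
    return 0;
}
```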
Further, the user terminal may periodically acquire the operable control during the playing of the first video data; for example, the user terminal may acquire the operable control every ten minutes, i.e., when the playing timestamp of the first video data reaches the 10th minute, the 20th minute, the 30th minute, and so on. It is understood that the operable control can also be pushed to the user terminal periodically by the server.
Alternatively, the user terminal may acquire the operable control based on a key video frame of the first video data. For example, the user terminal may obtain the actionable control when the first video data is played to a key video frame (e.g., frame 1, frame 27, frame 38, frame 52, etc.). The method for determining the key video frame of the first video data is not described in detail in the embodiments of the present application. It will be appreciated that the actionable controls may also be pushed by the server to the user terminal based on key video frames of the first video data. It should be appreciated that the user terminal may display the progress animation of the operable control within a display duration threshold of the operable control.
For easy understanding, please refer to fig. 7, which is a schematic diagram of the progress animation of an operable control provided in an embodiment of the present application. As shown in fig. 7, the user terminal in this embodiment may be any user terminal in the user terminal cluster shown in fig. 1, for example, the user terminal 100a.
The video data (i.e., the first video data) can be played in a playing interface (i.e., the first playing interface) of the user terminal. During the playing of the first video data, the user terminal may obtain an operable control (e.g., the "circle object" shown in fig. 7) that has a display duration threshold. The user terminal may output the obtained operable control (with a display duration threshold of, e.g., 5 seconds) to the first playing interface of the first video data; the moment the user terminal outputs the operable control on the first playing interface is referred to here as time H1. The progress animation of the operable control displayed at time H1 may be as shown by operable control 700a in fig. 7.
It should be appreciated that the progress animation of the operable control can change dynamically with the length of time the control has been displayed on the first playing interface. For example, at time H2 the progress animation may be as shown by operable control 700b in fig. 7, where time H2 is some moment within the display duration threshold.
Time H3 may be the moment corresponding to the display duration threshold of the operable control. When the progress animation reaches the state shown by operable control 700c in fig. 7, the control disappears from the first playing interface and will not repeat the progress animation of fig. 7 until the next time the operable control is obtained. The duration from time H1 to time H3 may be referred to as the display duration (e.g., 5 seconds) of the operable control on the first playing interface. For example, once time H3 (i.e., the 5th second) is reached, the display duration of the control on the first playing interface has reached 5 seconds, and the control may be hidden, so that the target user sees it disappear from the first playing interface.
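Assuming a 5-second display duration threshold, the H1-to-H3 lifetime of the operable control might be sketched as follows; the console output merely stands in for the progress animation, and all names are illustrative.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

int main() {
    const double displayDurationSec = 5.0;             // threshold (H1 to H3)
    const auto h1 = std::chrono::steady_clock::now();  // time H1: control shown
    for (;;) {
        const double elapsed = std::chrono::duration<double>(
            std::chrono::steady_clock::now() - h1).count();
        if (elapsed >= displayDurationSec) break;      // time H3 reached
        // Any moment in between corresponds to a time H2 with partial progress.
        std::printf("\rprogress: %3.0f%%", 100.0 * elapsed / displayDurationSec);
        std::fflush(stdout);
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    std::printf("\rprogress: 100%% - control hidden\n"); // control disappears
    return 0;
}
```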
For convenience of explanation, the embodiment of the present application takes time H2 shown in fig. 7 as an example of the moment when the target user performs the trigger operation on the operable control within the display duration threshold. At this time, the user terminal may, in response to the trigger operation on the operable control, intercept the target area on the first playing interface. When the target user selects the target area where the target object is located, the range of the intercepted area can be adjusted, so that the server can subsequently determine the target object more accurately and thus identify the object tag of the target object, and the video data associated with that tag, more quickly. The target area may contain at least one object.
Step S102, when second video data matched with the target object is obtained, the first playing interface is divided into a second playing interface and a third playing interface.
Specifically, when the user terminal acquires the second video data matched with the target object, the terminal display interface to which the first playing interface belongs may be used as the interface to be processed, and a boundary (such as the boundary shown in fig. 2) for playing the first video data and the second video data and boundary position information of the boundary in the interface to be processed are determined in the interface to be processed. Further, the user terminal may divide the interface to be processed into a second playing interface and a third playing interface based on the boundary and the boundary position information. The second playing interface can be used for playing the first video data; the third playback interface may be used to play the second video data.
It should be understood that, after the user terminal intercepts the target area on the first play interface, the video data (i.e., the second video data) associated with the target object in the target area may be acquired. For example, the manner in which the user terminal acquires the second video data may be: the user terminal transmits a service request for determining the second video data to the server, so that the server can determine the second video data matching the target object based on the service request. The manner in which the server identifies the target object may be various, and is not limited herein. For example, the server may identify the target object in the target area by an image recognition method. It is understood that the user terminal may directly transmit the target area intercepted by the target user to the server, and at this time, the server may input the target area into the image recognition model with the image recognition function, so that the recognition result output by the image recognition model may be determined as the target object in the target area.
Optionally, the server may further identify the target object in the target area according to a target playing timestamp corresponding to a video frame (to-be-processed frame) to which the target area belongs and the coordinate information of the target area. It can be understood that, in the decoded video sequence of the first video data, the user terminal may use the video frame to which the target region belongs as the video frame to be processed. In this embodiment of the present application, a playing time stamp of a video frame to be processed may be referred to as a target playing time stamp. The target play timestamp is a timestamp later than or equal to the first timestamp. Further, the user terminal may determine the coordinate information of the target area in the video frame to be processed based on the target playing timestamp corresponding to the video frame to be processed. The user terminal may transmit the target play time stamp together with the coordinate information to the server so that the server may determine the target object among at least one object included in the target area. The target playing time stamp may be used for determining, by the server, a mapping relationship table corresponding to the to-be-processed video frame.
The first mapping relation table and the second mapping relation table in the embodiment of the present application may be collectively referred to as a mapping relation table. It is understood that the first mapping relation table may contain Z regions associated with the video frame to be processed, one region corresponding to one object, and one object corresponding to one object tag. Wherein Z is a positive integer. The second mapping relation table may include object tags, and each object tag may correspond to a video address of video data matching the object.
For easy understanding, please refer to table 1, which is a first mapping table provided in the embodiments of the present application. The server may divide video frame Mi of the encoded video sequence of the first video data into regions according to the objects (e.g., persons, animals, etc.) contained in video frame Mi, and may establish the mapping table shown in table 1 based on the divided regions, the objects contained in those regions, and the object tags corresponding to those objects. Here, i may be less than or equal to the total number of video frames in the encoded video sequence, and an object tag may contain basic information about the object, such as the object's name, age, occupation, birthday, number of fans, and so on.
TABLE 1

Divided region of video frame Mi    Object      Object tag
Region A                            Object a    Object tag 1
Region B                            Object b    Object tag 2
Region C                            Object c    Object tag 3
Region D                            Object d    Object tag 4
As shown in table 1, the regions associated with video frame Mi (e.g., video frame M36) in the embodiment of the present application may be 4 in number, specifically region A, region B, region C, and region D. Region A may contain object a, whose corresponding object tag may be object tag 1; region B may contain object b, whose corresponding object tag may be object tag 2; region C may contain object c, whose corresponding object tag may be object tag 3; and region D may contain object d, whose corresponding object tag may be object tag 4.
It is understood that the server may determine, based on the target playing timestamp, that the video frame to be processed is video frame M36, and may then acquire the first mapping table corresponding to video frame M36 (table 1 above). The target area may contain at least one object.
It should be understood that if the target area contains exactly 1 object (e.g., object a), the server may directly determine, based on the coordinate information sent by the user terminal, that the target area corresponds to region A. The server may then take object a in region A as the target object based on table 1 above.
Alternatively, if the target area includes a plurality of objects (e.g., 2 objects, object a and object b), the server may determine, based on the coordinate information sent by the user terminal, the area proportion that each divided region of video frame M36 occupies within the target area. For example, if the target area intercepted by the target user covers two of the regions shown in table 1 (say, region A and region B), the server may determine that region A occupies proportion 1 of the target area (e.g., 40%) and region B occupies proportion 2 (e.g., 60%); it can then be inferred that the target object the target user is interested in is the object contained in region B, i.e., object b. In this case, the server may take object b as the target object based on table 1 above.
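The area-proportion rule might be sketched as follows: intersect the intercepted target area with each divided region of the video frame and pick the region with the largest overlap. The structure and function names are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Rect { int x0, y0, x1, y1; };
struct Region { std::string objectTag; Rect rect; };

// Overlap area of two axis-aligned rectangles (0 if they are disjoint).
int overlapArea(const Rect& a, const Rect& b) {
    const int w = std::min(a.x1, b.x1) - std::max(a.x0, b.x0);
    const int h = std::min(a.y1, b.y1) - std::max(a.y0, b.y0);
    return (w > 0 && h > 0) ? w * h : 0;
}

// Pick the object tag of the divided region occupying the largest proportion
// of the intercepted target area (object b in the 40% / 60% example above).
std::string pickTargetObjectTag(const Rect& targetArea,
                                const std::vector<Region>& dividedRegions) {
    std::string bestTag;
    int bestArea = 0;
    for (const Region& r : dividedRegions) {
        const int area = overlapArea(targetArea, r.rect);
        if (area > bestArea) { bestArea = area; bestTag = r.objectTag; }
    }
    return bestTag; // empty if the target area overlaps no divided region
}
```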
Further, the server may take the object tag corresponding to the determined target object as the target object tag, determine in the second mapping relation table, based on the target object tag, the video address (i.e., the second video address) associated with the video data (i.e., the second video data) matching the target object, and return the second video address to the user terminal, so that the user terminal can determine the second video data matching the target object.
For easy understanding, please refer to table 2, which is a second mapping table provided in the embodiments of the present application. The second mapping relation table may be a mapping relation table established by the server based on the object tag and the video address in the first video data. Wherein the video address may be a video address associated with video data matching the object.
TABLE 2

Object tag      Video address matching the object
Object tag 1    Video address 1
Object tag 2    Video address 2
Object tag 3    Video address 3
It should be understood that there may be a plurality of video data captured in the same shooting scene as the first video data but from different shooting angles; specifically, these may include video data A, video data B, and video data C. As shown in table 2, the encoded video stream stored at video address 1 may be obtained after the server encodes video data A, the encoded video stream stored at video address 2 after the server encodes video data B, and the encoded video stream stored at video address 3 after the server encodes video data C. Video data A may be video data shot directly of object a, video data B may be video data shot directly of object b, and video data C may be video data shot directly of object c.
For example, if the server determines that the target object identified in the target area is object a and the target object tag is object tag 1 corresponding to object a according to table 1, the server may look up a video address (second video address, i.e., video address 1) matching object a in table 2 based on object tag 1, and send video address 1 to the user terminal.
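The two-table lookup of tables 1 and 2 might be sketched as follows, with plain maps standing in for the mapping relation tables (an assumption for illustration); a missing entry, such as object tag 4 in the case discussed below, simply yields no address.

```cpp
#include <map>
#include <optional>
#include <string>

// Table 1 (per frame): divided region -> object tag.
// Table 2: object tag -> video address of the matching video data.
std::optional<std::string> findSecondVideoAddress(
        const std::string& region,
        const std::map<std::string, std::string>& firstTable,
        const std::map<std::string, std::string>& secondTable) {
    const auto tagIt = firstTable.find(region);
    if (tagIt == firstTable.end()) return std::nullopt;
    const auto addrIt = secondTable.find(tagIt->second);
    if (addrIt == secondTable.end()) return std::nullopt; // e.g. object tag 4
    return addrIt->second;                                // e.g. "Video address 1"
}
```

For example, looking up region A would yield video address 1, whereas a region whose tag is object tag 4 would yield no address, matching the two cases described here.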
Optionally, if the server determines from table 1 that the target object in the target area is object d, with the target object tag being object tag 4 corresponding to object d, the server may query table 2 for video data matching object d based on object tag 4. As shown in table 2, no video address of video data corresponding to object tag 4 exists in the second mapping table. In other words, the server cannot acquire video data matching object d; in that case the user terminal cannot act on the trigger operation of the target user on the operable control, and the first playing interface will not be divided.
It should be appreciated that the user terminal may optionally launch one of the plurality of players other than the first player (i.e., the second player) based on the scheduling layer upon acquiring the second video address (e.g., video address 1 shown in table 2) associated with the second video data. Further, the second player of the user terminal may pull the second encoded video stream from the server via the second video address. The second encoded video stream may be obtained by encoding the second video data by the server. Further, the user terminal may decode the second encoded video stream via the second player, resulting in second video data (e.g., video data a).
A specific implementation process of the user terminal obtaining the video data a matched with the object a from the encoded video stream stored in the video address 1 through the second player may refer to a process of the user terminal obtaining the video data 5A from the video address corresponding to the encoded video stream 100 in fig. 5, which will not be described again here.
Further, when the video data a (i.e., the second video data) matching the object a is acquired, the user terminal may use a terminal display interface to which the first playing interface belongs as a to-be-processed interface, determine a boundary for playing the first video data and the second video data in the to-be-processed interface, and determine boundary position information of the boundary in the to-be-processed interface, and further divide the to-be-processed interface into a second playing interface and a third playing interface based on the boundary and the boundary position information.
It should be understood that the target user may also perform, on the boundary in the terminal display interface divided into the second playing interface and the third playing interface, a trigger operation for changing the boundary position information of the boundary, i.e., drag the boundary within the terminal display interface. Based on the dragged boundary and the new boundary position information, the display ranges of the second playing interface and the third playing interface on the terminal display interface are adjusted; the user terminal may then update the second playing interface and the third playing interface based on the boundary position information of the dragged boundary.
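Dividing the interface to be processed at a boundary, and re-dividing it when the boundary is dragged, might be sketched as follows, assuming for illustration a vertical boundary described by a single x coordinate:

```cpp
struct Rect { int x0, y0, x1, y1; };

// Split the interface to be processed at a vertical boundary (given by its x
// coordinate) into the second playing interface and the third playing
// interface. Dragging the boundary simply calls this again with the new
// boundary position information.
void divideInterface(const Rect& toProcess, int boundaryX,
                     Rect& secondPlayingInterface, Rect& thirdPlayingInterface) {
    secondPlayingInterface = {toProcess.x0, toProcess.y0, boundaryX, toProcess.y1};
    thirdPlayingInterface  = {boundaryX, toProcess.y0, toProcess.x1, toProcess.y1};
}
```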
Step S103, when there is a playing timestamp different from the second timestamp in the playing timestamp of the first video data and the playing timestamp of the second video data, adjusting the video data corresponding to the playing timestamp different from the second timestamp, so that the playing timestamp of the adjusted video data is consistent with the second timestamp.
Specifically, the user terminal may acquire a second time stamp of the external clock, and may use the acquired second time stamp as a reference time stamp for adjusting the first video data and the second video data. The second timestamp may be a timestamp recorded by an external clock currently acquired by the user terminal, and the second timestamp is later than the first timestamp. When there is a play timestamp different from the reference timestamp in the play timestamp of the first video data and the play timestamp of the second video data, the user terminal may use the play timestamp different from the reference timestamp as the to-be-synchronized timestamp, and may use the video data corresponding to the to-be-synchronized timestamp as the to-be-processed video data. The video data to be processed may include one or more of the first video data and the second video data. Further, the user terminal may adjust the decoded video sequence and the decoded audio sequence in the video data to be processed based on the reference timestamp and the timestamp to be synchronized, so as to obtain the adjusted video data.
It should be understood that the user terminal may obtain the target object tag of the target object, and may further determine, in the second playing interface, a switching sub-interface (such as the switching sub-interface shown in fig. 2) independent of the second playing interface. The switching sub-interface may be an interface on the second playing interface, and the switching sub-interface may include a recommendation button and a snapshot button. Further, the user terminal may output the target object tag to the switching sub-interface when playing the first video data on the second playing interface, and at the same time, the user terminal may play the second video data on the third playing interface synchronously.
It is understood that the players of the user terminal (the first player and the second player) may call a clock acquisition function of the external clock (e.g., a getClockUs function) in a callback manner to obtain the second timestamp recorded by the external clock, and may then use that second timestamp as the reference timestamp for adjusting the first video data and the second video data. The user terminal can then compare the playing timestamp of the first video data and the playing timestamp of the second video data with the reference timestamp respectively. Where a playing timestamp equals the reference timestamp, the user terminal can output the corresponding video data directly to the terminal display interface, i.e., the first video data directly to the second playing interface and the second video data directly to the third playing interface.
When there is a play timestamp different from the reference timestamp in the play timestamp of the first video data and the play timestamp of the second video data, the user terminal may use the play timestamp different from the reference timestamp as the to-be-synchronized timestamp, and may use the video data corresponding to the to-be-synchronized timestamp as the to-be-processed video data. The time stamp to be synchronized may include a video playing time stamp and an audio playing time stamp.
When the user terminal adjusts the decoded video sequence in the video data to be processed, the difference (i.e. the first difference) between the video playing timestamp and the reference timestamp can be determined, and the video frame to be synchronized associated with the first difference is obtained from the decoded video sequence in the video data to be processed.
Specifically, the calculation formula of the difference value (VideoDeltaUs) between the video playing time stamp and the reference time stamp may be as shown in the following formula (1):
VideoDeltaUs=VideoClockUs-BaseClockUs, (1)
wherein, the VideoClockUs may be a video playing time stamp, and the BaseClockUs may be a second time stamp (reference time stamp) of the acquired external clock.
When the user terminal adjusts the decoded audio sequence in the video data to be processed, the difference (i.e. the second difference) between the audio playing time stamp and the reference time stamp can be determined, and the audio frame to be synchronized associated with the second difference is obtained in the decoded audio sequence in the video data to be processed.
Specifically, the calculation formula of the difference value (AudioDeltaUs) between the audio playback time stamp and the reference time stamp may be as shown in the following formula (2):
AudioDeltaUs=AudioClockUs-BaseClockUs, (2)
among them, AudioClockUs may be an audio play time stamp, and BaseClockUs may be a second time stamp (reference time stamp) of the acquired external clock.
Further, the user terminal may adjust the decoded video sequence in the video data to be processed and the decoded audio sequence in the video data to be processed based on the first difference, the second difference, the video frame to be synchronized, and the audio frame to be synchronized, so that the adjusted video data may be obtained. The user terminal may use a decoded video sequence in the video data to be processed and a decoded audio sequence in the video data to be processed as a multimedia sequence to be synchronized, use a video frame to be synchronized and an audio frame to be synchronized as multimedia data frames, and use the first difference value and the second difference value as a difference value to be synchronized of the multimedia data frames.
If the difference value to be synchronized is a positive number, the user terminal can render the multimedia data frame in the multimedia sequence to be synchronized when the waiting time of the multimedia data frame reaches the difference value to be synchronized, so that the adjusted video data can be obtained. If the multimedia data frame is a video frame, the user terminal may output the video frame to a playing interface of the user terminal based on a video renderer of the player. If the multimedia data frame is an audio frame, the user terminal may output the audio frame to an audio output device of the user terminal, such as a power amplifier, an earphone, a sound box, and the like, based on the audio renderer of the player.
If the difference value to be synchronized is negative, the user terminal can perform frame loss processing on the multimedia data frame in the multimedia sequence to be synchronized to obtain the adjusted video data.
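Taken together, formulas (1) and (2) and the wait/drop rule amount to a per-frame decision that might be sketched as follows; getClockUs is a stub standing in for the external clock callback, and all names are hypothetical.

```cpp
#include <chrono>
#include <thread>

// Hypothetical multimedia data frame of the sequence to be synchronized;
// clockUs is its playing timestamp (VideoClockUs or AudioClockUs).
struct MediaFrame { long long clockUs; };

// Stub standing in for the external clock's callback (BaseClockUs).
long long getClockUs() { return 0; }

// Per formulas (1)/(2): DeltaUs = frame timestamp - reference timestamp.
// Positive -> wait out the difference, then render; negative -> drop the frame.
// Returns true if the frame should be rendered, false if it is dropped.
bool synchronizeFrame(const MediaFrame& frame) {
    const long long deltaUs = frame.clockUs - getClockUs();
    if (deltaUs > 0) {
        std::this_thread::sleep_for(std::chrono::microseconds(deltaUs));
        return true;   // render after waiting (video renderer / audio output)
    }
    return deltaUs == 0;  // in sync: render; behind the clock: drop
}
```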
For example, when the second timestamp (reference timestamp) of the external clock is the 16th second and the timestamp to be synchronized determined by the user terminal is a video playing timestamp of, say, the 19th second, the user terminal may determine that the first difference is 3 seconds (i.e., a positive number) and acquire the video frame to be synchronized in the decoded video sequence of the video data to be processed, i.e., the video frame whose playing timestamp corresponds to the 19th second (e.g., video frame M15). Further, when video frame M15 has waited for 3 seconds, the user terminal renders video frame M15 through the video renderer, thereby obtaining the adjusted video data.
For another example, when the second timestamp (reference timestamp) of the external clock is the 16th second and the timestamp to be synchronized determined by the user terminal is a video playing timestamp of, say, the 13th second, the user terminal may determine that the first difference is -3 seconds (i.e., a negative number) and acquire the video frames to be synchronized in the decoded video sequence of the video data to be processed, i.e., the video frames whose playing timestamps lie between the 13th and 16th seconds (e.g., video frames M7, M8, M9, and M10). Further, the user terminal may drop these 4 video frames from the decoded video sequence, thereby obtaining the adjusted video data.
Further, please refer to fig. 8, which is a schematic view of a scene of the timestamps recorded by the external clock according to an embodiment of the present application. The T1 timestamp shown in fig. 8 may be the timestamp recorded by the external clock when the first video data (e.g., video data A) starts playing on the first playing interface of the user terminal, i.e., the first timestamp described above. The T2 timestamp shown in fig. 8 may be the timestamp recorded by the external clock when the first playing interface displays the operable control (e.g., "circle the object"). The T3 timestamp shown in fig. 8 may be the timestamp recorded by the external clock for the video frame from which the target user intercepted the target area (i.e., the target playing timestamp). It should be appreciated that the time interval between the T2 and T3 timestamps is short, i.e., it does not exceed the display duration threshold of the operable control. The T4 timestamp shown in fig. 8 may be the timestamp recorded by the external clock when the first video data and the second video data (e.g., video data B) start playing synchronously in the same terminal display interface.
It should be understood that when the target user performs a start operation on the video data a, the time stamp recorded by the external clock may be counted from zero, that is, when the time stamp at T1 is 0 th second, the user terminal may play the video data a in the first play interface.
Further, after the video data a is played for a certain period of time (for example, 20 minutes), the user terminal may obtain the operable control pushed by the server. In other words, at the time of the time stamp of T2 being 20 th minute, the first playback interface of the user terminal may display an operable control having a display duration threshold (e.g., 5 seconds).
At this time, the target user may perform a trigger operation on the operable control within the control's display duration threshold (e.g., at time H2 shown in fig. 7, say at the 3rd second of the control's display), so that the target area containing the target object can be intercepted. In other words, at the moment the T3 timestamp reads 20 minutes 03 seconds, the target user may intercept the target area in the video frame currently played on the first playing interface.
Further, the server may identify a target object and a target object tag in the target area based on the T3 timestamp of the target area transmitted by the user terminal and the coordinate information of the target area, and may determine a video address of video data (e.g., video data B shown in fig. 8) matching the target object. At this time, the user terminal may obtain the target object tag and the video data B, divide a terminal display interface (i.e., a to-be-processed interface) to which the first play interface belongs into a second play interface and a third play interface, and further play the video data a on the second play interface and output the target object tag on the switch sub-interface while playing the video data B on the third play interface at a time stamp of T4 (e.g., 20 minutes and 05 seconds). In other words, at the time when the time stamp of T4 is 20 th minute 05 seconds, video data a and video data B can be played synchronously on the terminal display interface of the user terminal. Wherein, the time interval between the T4 timestamp and the T3 timestamp may depend on the efficiency of the server to acquire the video address of the second video data and the efficiency of the player of the user terminal to parse the encoded video stream in the video address.
Alternatively, it should be understood that, in the embodiment of the present application, multiple pieces of video data may also be played synchronously on the same terminal display interface shown in fig. 8. For example, as shown in fig. 8, in the process that the user terminal continues to play the video data a on the second play interface, the second play interface may be used as a new first play interface, so as to output a new operable control on the terminal display interface to which the new first play interface belongs, and further, an area containing another target object (which may also be referred to as a new target object, for example, an object b) may be intercepted on the new first play interface based on the new operable control. It should be understood that the embodiment of the present application may use the area containing the object b intercepted on the new first playing interface as a new target area, so that another second video data (e.g., video data C) matching with the object b may be subsequently acquired. At this time, the user terminal may continue to divide the new first playing interface, so that the video data a, the video data B, and the video data C may be played synchronously using the same external clock on the same terminal display interface. The specific implementation manner of the user terminal acquiring and outputting the video data C may refer to the description of the video data B, and will not be further described here.
In this embodiment of the present application, the first video data may be played on the first playing interface according to the first timestamp of the external clock; when an operable control for acquiring a target object exists on the first playing interface, a trigger operation on that control may be responded to so as to intercept, on the first playing interface, a target area containing the target object. Further, when the second video data matching the target object is acquired, the first playing interface is divided into a second playing interface and a third playing interface, so that the first video data is played on the second playing interface and the second video data on the third playing interface. It should be understood that the same external clock may be used to adjust the playing timestamps of one or more of the currently played first and second video data: whenever video data is detected whose playing timestamp differs from the second timestamp of the external clock, that video data can be rapidly and dynamically adjusted so that the playing timestamp of the adjusted first video data or the adjusted second video data stays consistent with the second timestamp of the external clock. Because the timestamp of one shared external clock is used to dynamically calibrate the first or second video data, video data whose playing progress is ahead does not need to wait for video data whose progress lags, and the granularity of video synchronization can be optimized. Therefore, when the first video data and the second video data are played in the same terminal, the playing progress of the video data presented in the different playing interfaces can be kept the same as far as possible, improving the precision of synchronous playing of multi-channel video data.
Further, please refer to fig. 9, which is a flowchart illustrating a video data processing method according to an embodiment of the present application. As shown in fig. 9, the method is executed by a user terminal (e.g., the user terminal 100a shown in fig. 1), a server (e.g., the server 10 shown in fig. 1), or both the user terminal and the server. For convenience of understanding, the embodiment of the present application is described by taking an example that the method is executed by a user terminal, and the method may include at least the following steps S201 to S205:
step S201 intercepts a target area on the first play interface in response to a trigger operation for an operable control on the first play interface.
Step S202, when the second video data matched with the target object is obtained, the first playing interface is divided into a second playing interface and a third playing interface.
Step S203, when there is a playing timestamp different from the second timestamp in the playing timestamp of the first video data and the playing timestamp of the second video data, adjusting the video data corresponding to the playing timestamp different from the second timestamp, so that the playing timestamp of the adjusted video data is consistent with the second timestamp.
For specific implementation of steps S201 to S203, reference may be made to the description of steps S101 to S103 in the embodiment corresponding to fig. 3, which will not be described herein again.
Step S204, responding to the trigger operation aiming at the recommendation button in the switching sub-interface, closing the third playing interface, determining K cover page display data associated with the target object label, and outputting the K cover page display data to an object display interface independent of the second playing interface.
Specifically, the target user may perform the trigger operation on the recommendation button (e.g., "more works" as shown in fig. 2) displayed in the switching sub-interface. The user terminal may respond to the trigger operation by acquiring the K pieces of recommendation data associated with the target object tag that the server has queried, and the server may send the cover display data corresponding to the K pieces of recommendation data to the user terminal together. The user terminal may then close the third playing interface used for playing the second video data, determine within the second playing interface an object display interface independent of the second playing interface, and output the K pieces of cover display data to that object display interface. K is a positive integer.
Step S205, in response to the trigger operation for the target cover display data in the object display interface, obtain target video data corresponding to the target cover display data, switch the second playing interface to a fourth playing interface, and play the target video data on the fourth playing interface.
Specifically, the target user may perform a trigger operation on one of the K pieces of cover display data in the object display interface. The user terminal may take the cover display data on which the target user performs the trigger operation as the target cover display data and, in response to the trigger operation, request from the server the video address (i.e., the third video address) associated with the target video data corresponding to the target cover display data; the server then sends the third video address to the user terminal. At this time, the user terminal may restart one player (e.g., the third player) and reset the external clock corresponding to the third player. The user terminal can pull, through the third video address, the target encoded video stream stored by the server, which encodes the target video data, and can then decode the target encoded video stream through the third player to obtain the target video data. The user terminal may then synchronize the target video data based on the reset external clock, switch the second playing interface to a fourth playing interface, and play the synchronized target video data on the fourth playing interface of the user terminal.
The specific implementation of steps S204 to S205 can refer to the description of the embodiment corresponding to fig. 10. Further, please refer to fig. 10, which is a schematic view illustrating a scene for recommending target video data associated with a target object according to an embodiment of the present application. As shown in fig. 10, the user terminal 900 in this embodiment may be any user terminal in the user terminal cluster shown in fig. 1, for example, the user terminal 100a. The server in the embodiment of the present application may be the server 10 shown in fig. 1.
In the terminal display interface of the user terminal shown in fig. 10, video data A (i.e., the first video data) can be played in playing interface 2 (i.e., the second playing interface), and video data B (i.e., the second video data) can be played in playing interface 3 (i.e., the third playing interface). The switching sub-interface shown in fig. 10, which is independent of playing interface 2, may also display the target object tag of the target object (for example, object tag 1 of object a), a snapshot button, and a recommendation button (for example, "more works"). The target object tag may contain basic information about object a, such as its name, as well as the intercepted target area containing object a.
It should be understood that the target user corresponding to the user terminal 900 may perform a trigger operation on the recommendation button "more works" in the switching sub-interface, causing the user terminal 900 to respond by acquiring the K pieces of recommendation data associated with object a that the server queried based on object tag 1. These K pieces of recommendation data may be video data associated with object a, such as television series, variety shows, movies, or short videos, or text data associated with object a, such as news or profiles, which is not limited here. At this time, the user terminal 900 may close playing interface 3 used for playing video data B, determine within playing interface 2 an object display interface independent of playing interface 2, and output the 4 pieces of cover display data described below to the object display interface. The object display interface may also display object tag 1 of object a, e.g., name, birthday, number of fans, age, and so on.
For convenience of illustration, the recommendation data associated with the object tag 1 queried by the server in the embodiment of the present application may take 4 as examples, and specifically may include text data 90A, video data 90B, video data 90C, and video data 90D. It should be understood that the server may send cover page display data corresponding to the 4 pieces of recommendation data to the user terminal, for example, cover page display data 90A corresponding to the text data 90A, cover page display data 90B corresponding to the video data 90B, cover page display data 90C corresponding to the video data 90C, and cover page display data 90D corresponding to the video data 90D.
Further, the target user may perform a trigger operation on one cover display data among the 4 cover display data, at this time, the user terminal 900 may use the cover display data on which the target user performs the trigger operation as the target cover display data (e.g., the cover display data 90d), and may further, in response to the trigger operation, obtain a video address (e.g., the video address 3) associated with the target cover display data from the server, and send the video address 3 to the user terminal 900.
It should be understood that the user terminal 900 may restart one player (e.g., player 3) and start the external clock corresponding to player 3 (i.e., an external clock different from the one shared by video data A and video data B). The user terminal may pull the target encoded video stream from the server through video address 3, and may then obtain the target video data (i.e., the video data 90D) based on the target encoded video stream. Further, the user terminal 900 may synchronize the video data 90D based on the reset external clock, switch playing interface 2 to playing interface 4, and play the synchronized video data 90D on playing interface 4 (i.e., the fourth playing interface) of the user terminal 900. In this way, recommendations can be made based on the object tag of the target object intercepted by the target user, helping the target user quickly locate video data or text data matching his or her interests.
The beneficial effects achieved by this embodiment are the same as those described above for the embodiment corresponding to fig. 3 and are not described again here.
Further, please refer to fig. 11, which is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present application. As shown in fig. 11, the video data processing apparatus 1 may be a computer program (including program code) running in a computer device, for example, the video data processing apparatus 1 is an application software; the video data processing apparatus 1 may be configured to perform corresponding steps in the methods provided by the embodiments of the present application. The video data processing apparatus 1 may include: an interception module 100, a division module 200 and an adjustment module 300.
The intercepting module 100 is configured to respond to a trigger operation for an operable control on a first play interface, and intercept a target area on the first play interface; the target area comprises a target object; the first playing interface is obtained when the first video data is played based on a first timestamp of the external clock;
the dividing module 200 is configured to divide the first playing interface into a second playing interface and a third playing interface when the second video data matched with the target object is obtained; the second playing interface is used for playing the first video data; the third playing interface is used for playing the second video data;
an adjusting module 300, configured to adjust, when a play timestamp different from the second timestamp exists in the play timestamp of the first video data and the play timestamp of the second video data, video data corresponding to the play timestamp different from the second timestamp, so that the play timestamp of the adjusted video data is consistent with the second timestamp; the second timestamp is a timestamp of the external clock, and the second timestamp is later than the first timestamp.
For specific implementation manners of the intercepting module 100, the dividing module 200, and the adjusting module 300, reference may be made to the description of step S101 to step S103 in the embodiment corresponding to fig. 3, and details will not be further described here. In addition, the beneficial effects of the same method are not described in detail.
Further, please refer to fig. 12, which is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present application. As shown in fig. 12, the video data processing apparatus 2 may be a computer program (including program code) running in a computer device, for example, the video data processing apparatus 2 is an application software; the video data processing apparatus 2 may be configured to perform corresponding steps in the method provided by the embodiment of the present application. The video data processing apparatus 2 may include: the device comprises a starting module 11, a decoding module 12, a second synchronous playing module 13, an intercepting module 14, a second determining module 15, a sending module 16, a third determining module 17, a dividing module 18, a first determining module 19, a first output module 20, a first synchronous playing module 21, an adjusting module 22, a second output module 23 and a switching module 24.
The starting module 11 is configured to, in response to a starting operation for the first video data, invoke a scheduling layer of the user terminal, start, based on the scheduling layer, an external clock associated with the first video data and a first player associated with the first video data, and use the timestamp corresponding to the starting operation as the first timestamp of the external clock.
The decoding module 12 is configured to decode a first encoded video stream through the first player to obtain the first video data; the first encoded video stream is video data obtained by the user terminal from a first video address of the server; the first video data includes a decoded video sequence and a decoded audio sequence.
The decoding module 12 includes: a receiving unit 121, a separating unit 122, a decoding unit 123, and a third determining unit 124.
The receiving unit 121 is configured to receive the first encoded video stream stored by the server at the first video address; the first encoded video stream is obtained after the server encodes the first video data;
the separating unit 122 is configured to perform data separation on the first encoded video stream based on a separator in the first player, so as to obtain a video packet and an audio packet associated with the first encoded video stream;
the decoding unit 123 is configured to decode the video packet based on a video decoder in the first player to obtain a decoded video sequence corresponding to the video packet, and decode the audio packet based on an audio decoder in the first player to obtain a decoded audio sequence corresponding to the audio packet;
the third determining unit 124 is configured to decode the video sequence and the audio sequence as the first video data.
For specific implementations of the receiving unit 121, the separating unit 122, the decoding unit 123, and the third determining unit 124, reference may be made to the description of the video data 5A in the embodiment corresponding to fig. 5; details are not repeated here.
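As an informal illustration of this separator-and-decoder pipeline, the self-contained Python sketch below splits an interleaved encoded stream into video and audio packets and then "decodes" each side. The packet layout and the decode step are stand-ins assumed for the example; the actual separator and codecs in the first player are not specified at this level of detail.

    from dataclasses import dataclass

    @dataclass
    class Packet:
        kind: str       # "video" or "audio"
        payload: bytes

    def separate(encoded_stream):
        # Separator: divide the interleaved stream into video packets
        # and audio packets.
        video = [p for p in encoded_stream if p.kind == "video"]
        audio = [p for p in encoded_stream if p.kind == "audio"]
        return video, audio

    def decode(packets):
        # Stand-in decoder: turn each packet into a "frame" (its payload).
        return [p.payload.decode() for p in packets]

    stream = [Packet("video", b"v0"), Packet("audio", b"a0"),
              Packet("video", b"v1"), Packet("audio", b"a1")]
    video_pkts, audio_pkts = separate(stream)
    decoded_video_sequence = decode(video_pkts)   # ["v0", "v1"]
    decoded_audio_sequence = decode(audio_pkts)   # ["a0", "a1"]
    first_video_data = (decoded_video_sequence, decoded_audio_sequence)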
The second synchronous playing module 13 is configured to synchronously play the decoded video sequence and the decoded audio sequence in the first video data on the first playing interface based on the first timestamp.
The intercepting module 14 is configured to intercept a target area on the first playing interface in response to a trigger operation on an operable control on the first playing interface; the target area contains the target object; the first playing interface is the interface presented when the first video data is played based on the first timestamp of the external clock.
The intercepting module 14 includes: an output unit 141 and an intercepting unit 142.
The output unit 141 is configured to obtain an operable control pushed by the server and having a display duration threshold, output the operable control to the first play interface, and display a progress animation of the operable control within the display duration threshold;
the intercepting unit 142 is configured to respond to a trigger operation for the operable control within the display duration threshold, and intercept the target area on the first play interface.
For specific implementation of the output unit 141 and the intercepting unit 142, reference may be made to the description of step S101 in the embodiment corresponding to fig. 3, and details will not be further described here.
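The time-limited nature of the operable control can be captured in a few lines. In the hedged sketch below, the 5-second threshold and the name trigger_is_valid are assumptions made only for illustration; in the embodiments the actual threshold is pushed by the server.

    import time

    DISPLAY_DURATION_THRESHOLD = 5.0     # assumed value; pushed by the server

    control_shown_at = time.monotonic()  # moment the operable control appears

    def trigger_is_valid(now):
        # A trigger operation only counts while the control is displayed,
        # i.e. within the display duration threshold.
        return (now - control_shown_at) <= DISPLAY_DURATION_THRESHOLD

    print(trigger_is_valid(time.monotonic()))  # True if tapped in time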
Wherein the target area contains at least one object;
the second determining module 15 is configured to, in the decoded video sequence, use the video frame to which the target area belongs as a video frame to be processed, and determine, based on a target playing timestamp corresponding to the video frame to be processed, coordinate information of the target area in the video frame to be processed; the target playing time stamp is a time stamp which is later than or equal to the first time stamp;
the sending module 16 is configured to send the target playing timestamp and the coordinate information to the server, so that when the server determines a target object in at least one object, an object tag corresponding to the target object is used as the target object tag; the determination of the target object is determined by the server based on a mapping relation table associated with the video frame to be processed; the mapping relation table comprises Z areas associated with the video frames to be processed, wherein one area corresponds to one object, and one object corresponds to one object label; z is a positive integer;
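The server-side lookup against the mapping relation table can be pictured as follows. This sketch simplifies the reported coordinate information to a single point, and the table contents (Z = 2, tags "cup" and "book") are invented purely for illustration.

    from dataclasses import dataclass

    @dataclass
    class Area:
        x: int
        y: int
        w: int
        h: int
        object_tag: str   # one area corresponds to one object and one tag

        def contains(self, px, py):
            return (self.x <= px < self.x + self.w
                    and self.y <= py < self.y + self.h)

    # Mapping relation table: for each target playing timestamp, the Z
    # areas of the corresponding video frame (here Z = 2).
    mapping_table = {
        12.0: [Area(0, 0, 100, 100, "cup"), Area(100, 0, 100, 100, "book")],
    }

    def resolve_target_tag(timestamp, coord):
        # Look up the frame's areas by timestamp, then return the object
        # tag of the area containing the reported coordinate, if any.
        for area in mapping_table.get(timestamp, []):
            if area.contains(*coord):
                return area.object_tag
        return None

    print(resolve_target_tag(12.0, (130, 40)))  # -> "book"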
The third determining module 17 is configured to determine the second video data matching the target object; the second video data is determined by the server based on the second video address obtained through the target object tag.
Wherein the third determining module 17 comprises: a start unit 171, a pull unit 172, and a decode unit 173.
The starting unit 171 is configured to start a second player associated with second video data based on the scheduling layer, and obtain a second video address returned by the server based on the target object tag;
the pulling unit 172 is configured to pull the second encoded video stream from the server through the second video address; the second encoded video stream is obtained after the server encodes the second video data; the second video data contains the target object;
the decoding unit 173 is configured to decode the second encoded video stream by the second player to obtain the second video data.
For specific implementation manners of the starting unit 171, the pulling unit 172 and the decoding unit 173, reference may be made to the description of step S102 in the embodiment corresponding to fig. 3, and details will not be further described here.
The dividing module 18 is configured to divide the first playing interface into a second playing interface and a third playing interface when the second video data matched with the target object is acquired; the second playing interface is used for playing the first video data; the third playing interface is used for playing the second video data.
Wherein the dividing module 18 includes: a second determination unit 181 and a dividing unit 182.
The second determining unit 181 is configured to, when the second video data matching the target object is acquired, use the terminal display interface to which the first playing interface belongs as the interface to be processed, and determine a boundary line for playing the first video data and the second video data, together with boundary position information of the boundary line in the interface to be processed;
the dividing unit 182 is configured to divide the interface to be processed into the second playing interface and the third playing interface based on the boundary line and the boundary position information.
For specific implementation of the second determining unit 181 and the dividing unit 182, reference may be made to the description of step S102 in the embodiment corresponding to fig. 3, and details will not be further described here.
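Concretely, the division step amounts to cutting one rectangle into two along the boundary line. The sketch below assumes a vertical boundary (a side-by-side split); the embodiments equally allow other boundary positions, and all names here are illustrative.

    from dataclasses import dataclass

    @dataclass
    class Rect:
        x: int
        y: int
        w: int
        h: int

    def divide_interface(display, boundary_x):
        # Split the interface to be processed along a vertical boundary
        # line: the left part becomes the second playing interface, the
        # right part the third playing interface.
        second = Rect(display.x, display.y, boundary_x - display.x, display.h)
        third = Rect(boundary_x, display.y,
                     display.x + display.w - boundary_x, display.h)
        return second, third

    screen = Rect(0, 0, 1920, 1080)
    second_if, third_if = divide_interface(screen, boundary_x=960)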
The first determining module 19 is configured to obtain the target object tag of the target object, and determine, on the second playing interface, a switching sub-interface independent of the second playing interface; the switching sub-interface is an interface on the second playing interface;
the first output module 20 is configured to output the target object tag to the switching sub-interface when the first video data is played on the second playing interface;
the first synchronous playing module 21 is configured to play the second video data synchronously on the third playing interface.
The adjusting module 22 is configured to, when the playing timestamp of the first video data or the playing timestamp of the second video data differs from the second timestamp, adjust the video data whose playing timestamp differs from the second timestamp, so that the playing timestamp of the adjusted video data is consistent with the second timestamp; the second timestamp is a timestamp of the external clock and is later than the first timestamp.
The adjusting module 22 includes: an acquiring unit 221, a first determining unit 222, and an adjusting unit 223.
The acquiring unit 221 is configured to acquire a second timestamp of the external clock, and use the second timestamp as a reference timestamp for adjusting the first video data and the second video data;
the first determining unit 222 is configured to, when the playing timestamp of the first video data or the playing timestamp of the second video data differs from the reference timestamp, take the playing timestamp that differs from the reference timestamp as the timestamp to be synchronized, and take the video data corresponding to the timestamp to be synchronized as the video data to be processed; the video data to be processed includes one or both of the first video data and the second video data;
the adjusting unit 223 is configured to adjust a decoded video sequence and a decoded audio sequence in the video data to be processed based on the reference timestamp and the timestamp to be synchronized, so as to obtain adjusted video data.
The timestamp to be synchronized includes a video playing timestamp and an audio playing timestamp;
the adjusting unit 223 includes: a first determination subunit 2231, a second determination subunit 2232, and an adjustment subunit 2233.
The first determining subunit 2231 is configured to determine a first difference value between the video playing timestamp and the reference timestamp, and acquire, in the decoded video sequence of the video data to be processed, the video frame to be synchronized that is associated with the first difference value;
the second determining subunit 2232 is configured to determine a second difference value between the audio playing timestamp and the reference timestamp, and acquire, in the decoded audio sequence of the video data to be processed, the audio frame to be synchronized that is associated with the second difference value;
the adjusting subunit 2233 is configured to adjust, based on the first difference value, the second difference value, the video frame to be synchronized, and the audio frame to be synchronized, the decoded video sequence and the decoded audio sequence in the video data to be processed, to obtain the adjusted video data.
Wherein the adjusting subunit 2233 is further configured to:
taking a decoded video sequence in video data to be processed and a decoded audio sequence in the video data to be processed as multimedia sequences to be synchronized, taking a video frame to be synchronized and an audio frame to be synchronized as multimedia data frames, and taking a first difference value and a second difference value as difference values to be synchronized of the multimedia data frames;
if the difference value to be synchronized is a positive number, rendering the multimedia data frame in the multimedia sequence to be synchronized once the waiting duration of the multimedia data frame reaches the difference value to be synchronized, to obtain the adjusted video data;
and if the difference value to be synchronized is negative, performing frame loss processing on the multimedia data frame in the multimedia sequence to be synchronized to obtain the adjusted video data.
For specific implementation manners of the first determining subunit 2231, the second determining subunit 2232, and the adjusting subunit 2233, reference may be made to the description of the video data to be processed in the embodiment corresponding to fig. 3, and details will not be further described here.
For specific implementation manners of the obtaining unit 221, the first determining unit 222, and the adjusting unit 223, reference may be made to the description of step S103 in the embodiment corresponding to fig. 3, and details will not be further described here.
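The wait-or-drop rule applied by the adjusting subunit 2233 is small enough to state directly in code. The sketch below is a minimal single-frame version: a real player would schedule against clock callbacks rather than sleep on the calling thread, and MediaFrame and sync_frame are illustrative names only.

    import time
    from dataclasses import dataclass

    @dataclass
    class MediaFrame:
        play_timestamp: float  # when this video or audio frame should play

    def sync_frame(frame, reference_timestamp):
        # diff > 0: frame is early -- wait out the difference, then render.
        # diff < 0: frame is late  -- drop it (frame-loss processing).
        # Returns True if the frame was rendered, False if dropped.
        diff = frame.play_timestamp - reference_timestamp
        if diff > 0:
            time.sleep(diff)   # wait until the frame's own timestamp is due
            return True
        if diff < 0:
            return False
        return True            # exactly on time: render immediately

    # At reference time 10.0 of the external clock, a frame stamped 10.2
    # waits 0.2 s and is rendered; a frame stamped 9.8 is dropped.
    assert sync_frame(MediaFrame(10.2), reference_timestamp=10.0) is True
    assert sync_frame(MediaFrame(9.8), reference_timestamp=10.0) is False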
The second output module 23 is configured to close the third playing interface in response to a trigger operation for a recommendation button in the switching sub-interface, determine K cover display data associated with the target object tag, and output the K cover display data to an object display interface independent of the second playing interface; k is a positive integer;
the switching module 24 is configured to respond to a trigger operation for target cover display data in the object display interface, acquire target video data corresponding to the target cover display data, switch the second playing interface to a fourth playing interface, and play the target video data on the fourth playing interface; the target cover display data is one of the K cover display data.
For specific implementations of the starting module 11, the decoding module 12, the second synchronous playing module 13, the intercepting module 14, the second determining module 15, the sending module 16, the third determining module 17, the dividing module 18, the first determining module 19, the first output module 20, the first synchronous playing module 21, the adjusting module 22, the second output module 23, and the switching module 24, reference may be made to the description of step S201 to step S205 in the embodiment corresponding to fig. 9; details are not repeated here. Likewise, beneficial effects identical to those of the method are not described again.
Further, please refer to fig. 13, which is a schematic diagram of a computer device according to an embodiment of the present application. As shown in fig. 13, the computer device 1000 may be a user terminal (e.g., the user terminal 200 in the embodiment corresponding to fig. 2) or a server (e.g., the server in the embodiment corresponding to fig. 2), which is not limited here. For ease of understanding, this embodiment takes the computer device 1000 being a user terminal as an example. In this case, the computer device 1000 may specifically include: at least one processor 1001 (e.g., a CPU), at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory, or a non-volatile memory such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 13, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 1000 shown in fig. 13, the network interface 1004 is mainly used for network communication with the server, and the user interface 1003 is an interface for receiving user input. Optionally, when the computer device 1000 is a user terminal, the user interface 1003 may include a display screen (Display) and a keyboard (Keyboard). The processor 1001 may be configured to call the device control application stored in the memory 1005 to implement:
responding to a trigger operation aiming at an operable control on a first playing interface, and intercepting a target area on the first playing interface; the target area comprises a target object; the first playing interface is obtained when the first video data is played based on a first timestamp of the external clock;
when second video data matched with the target object is acquired, dividing the first playing interface into a second playing interface and a third playing interface; the second playing interface is used for playing the first video data; the third playing interface is used for playing the second video data;
when a playing time stamp different from the second time stamp exists in the playing time stamp of the first video data and the playing time stamp of the second video data, adjusting the video data corresponding to the playing time stamp different from the second time stamp so as to keep the playing time stamp of the adjusted video data consistent with the second time stamp; the second timestamp is a timestamp of the external clock, and the second timestamp is later than the first timestamp.
It should be understood that the computer device 1000 described in this embodiment may perform the description of the video data processing method in the embodiment corresponding to fig. 3 and fig. 9, may also perform the description of the video data processing apparatus 1 in the embodiment corresponding to fig. 11, and may also perform the description of the video data processing apparatus 2 in the embodiment corresponding to fig. 12, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
It should further be noted that an embodiment of the present application also provides a computer-readable storage medium. The computer-readable storage medium stores the aforementioned computer program executed by the video data processing apparatus 1 or the video data processing apparatus 2, and the computer program includes program instructions. When a processor executes the program instructions, it can perform the video data processing method described in the embodiment corresponding to fig. 3 or fig. 9, so details are not repeated here. Likewise, beneficial effects identical to those of the method are not described again. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, reference is made to the description of the method embodiments of the present application. As an example, the program instructions may be deployed to be executed on one computing device, or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain system.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure is merely a preferred embodiment of the present application and is not intended to limit the scope of the claims of the present application; therefore, equivalent variations made in accordance with the claims of the present application still fall within the scope of the present application.

Claims (15)

1. A method of processing video data, comprising:
responding to a trigger operation aiming at an operable control on a first playing interface, and intercepting a target area on the first playing interface; the target area comprises a target object; the first playing interface is obtained when the first video data is played based on a first timestamp of an external clock;
when second video data matched with the target object is acquired, dividing the first playing interface into a second playing interface and a third playing interface; the second playing interface is used for playing the first video data; the third playing interface is used for playing the second video data;
when a playing time stamp different from a second time stamp exists in the playing time stamp of the first video data and the playing time stamp of the second video data, adjusting the video data corresponding to the playing time stamp different from the second time stamp so as to keep the playing time stamp of the adjusted video data consistent with the second time stamp; the second timestamp is a timestamp of the external clock, and the second timestamp is later than the first timestamp.
2. The method according to claim 1, wherein when there is a play timestamp different from the second timestamp in the play timestamps of the first video data and the second video data, adjusting the video data corresponding to the play timestamp different from the second timestamp comprises:
acquiring a second time stamp of the external clock, and using the second time stamp as a reference time stamp for adjusting the first video data and the second video data;
when a playing time stamp different from the reference time stamp exists in the playing time stamp of the first video data and the playing time stamp of the second video data, taking the playing time stamp different from the reference time stamp as a time stamp to be synchronized, and taking the video data corresponding to the time stamp to be synchronized as video data to be processed; the video data to be processed comprises one or more of the first video data and the second video data;
and adjusting the decoded video sequence and the decoded audio sequence in the video data to be processed based on the reference timestamp and the timestamp to be synchronized to obtain the adjusted video data.
3. The method of claim 2, wherein the time stamp to be synchronized comprises a video play time stamp and an audio play time stamp;
the adjusting the decoded video sequence and the decoded audio sequence in the video data to be processed based on the reference timestamp and the timestamp to be synchronized to obtain the adjusted video data includes:
determining a first difference value between the video playing time stamp and the reference time stamp, and acquiring a video frame to be synchronized associated with the first difference value in a decoded video sequence of the video data to be processed;
determining a second difference value between the audio playing time stamp and the reference time stamp, and acquiring an audio frame to be synchronized associated with the second difference value in a decoded audio sequence in the video data to be processed;
and adjusting the decoded video sequence in the video data to be processed and the decoded audio sequence in the video data to be processed based on the first difference, the second difference, the video frame to be synchronized and the audio frame to be synchronized to obtain adjusted video data.
4. The method according to claim 3, wherein the adjusting the decoded video sequence in the video data to be processed and the decoded audio sequence in the video data to be processed based on the first difference, the second difference, the video frame to be synchronized, and the audio frame to be synchronized to obtain adjusted video data comprises:
taking a decoded video sequence in the video data to be processed and a decoded audio sequence in the video data to be processed as multimedia sequences to be synchronized, taking the video frames to be synchronized and the audio frames to be synchronized as multimedia data frames, and taking the first difference value and the second difference value as difference values to be synchronized of the multimedia data frames;
if the difference value to be synchronized is a positive number, when the waiting time of the multimedia data frame reaches the difference value to be synchronized, rendering the multimedia data frame in the multimedia sequence to be synchronized to obtain adjusted video data;
and if the difference value to be synchronized is a negative number, performing frame loss processing on the multimedia data frame in the multimedia sequence to be synchronized to obtain adjusted video data.
5. The method according to claim 1, wherein the dividing the first playing interface into a second playing interface and a third playing interface when the second video data matched with the target object is acquired includes:
when second video data matched with the target object is acquired, taking a terminal display interface to which the first playing interface belongs as a to-be-processed interface, and determining a boundary for playing the first video data and the second video data and boundary position information of the boundary in the to-be-processed interface;
and dividing the interface to be processed into a second playing interface and a third playing interface based on the boundary and the boundary position information.
6. The method of claim 5, further comprising:
acquiring a target object label of the target object, and determining a switching sub-interface independent of the second playing interface in the second playing interface; the switching sub-interface is an interface on the second playing interface;
when the first video data is played on the second playing interface, outputting the target object label to the switching sub-interface;
and synchronously playing the second video data on the third playing interface.
7. The method of claim 6, further comprising:
responding to a trigger operation aiming at a recommendation button in the switching sub-interface, closing the third playing interface, determining K cover display data associated with the target object label, and outputting the K cover display data to an object display interface independent of the second playing interface; k is a positive integer;
responding to a trigger operation aiming at target cover display data in the object display interface, acquiring target video data corresponding to the target cover display data, switching the second playing interface into a fourth playing interface, and playing the target video data on the fourth playing interface; the target cover display data is one of the K cover display data.
8. The method of claim 1, further comprising:
responding to a starting operation aiming at first video data, calling a scheduling layer of a user terminal, starting an external clock associated with the first video data and a first player associated with the first video data based on the scheduling layer, and taking a playing time stamp corresponding to the starting operation as a first time stamp of the external clock;
decoding a first encoded video stream through the first player to obtain first video data; the first encoded video stream is video data obtained by the user terminal from a first video address of a server; the first video data comprises a decoded video sequence and a decoded audio sequence;
playing the decoded video sequence and the decoded audio sequence in the first video data on a first playing interface based on the first timestamp.
9. The method of claim 8, wherein decoding the first encoded video stream by the first player to obtain the first video data comprises:
receiving a first encoded video stream stored by the server at a first video address; the first encoded video stream is obtained after the server encodes the first video data;
performing data separation on the first coded video stream based on a separator in the first player to obtain video packets and audio packets associated with the first coded video stream;
decoding the video packet based on a video decoder in the first player to obtain a decoded video sequence corresponding to the video packet, and decoding the audio packet based on an audio decoder in the first player to obtain a decoded audio sequence corresponding to the audio packet;
and using the decoded video sequence and the decoded audio sequence as first video data.
10. The method of claim 8, wherein intercepting a target area on a first playback interface in response to a triggering operation for an operable control on the first playback interface comprises:
acquiring an operable control pushed by the server and having a display duration threshold, outputting the operable control to the first playing interface, and displaying a progress animation of the operable control within the display duration threshold;
and within the display duration threshold, responding to the trigger operation aiming at the operable control, and intercepting a target area on the first playing interface.
11. The method of claim 10, wherein the target region contains at least one object therein;
the method further comprises the following steps:
in the decoded video sequence, taking a video frame to which the target area belongs as a video frame to be processed, and determining coordinate information of the target area in the video frame to be processed based on a target playing time stamp corresponding to the video frame to be processed; the target playing timestamp is a timestamp later than or equal to the first timestamp;
sending the target playing timestamp and the coordinate information to the server, so that when the server determines a target object in the at least one object, an object tag corresponding to the target object is used as a target object tag; the determination of the target object is determined by the server based on a mapping relation table associated with the video frame to be processed; the mapping relation table comprises Z areas associated with the video frames to be processed, wherein one area corresponds to one object, and one object corresponds to one object label; z is a positive integer;
determining second video data matched with the target object; the second video data is determined by the server based on a second video address acquired by the target object tag.
12. The method of claim 11, wherein determining the second video data that matches the target object comprises:
starting a second player associated with the second video data based on the scheduling layer, and acquiring a second video address returned by the server based on the target object tag;
pulling a second encoded video stream from the server via the second video address; the second encoded video stream is obtained after the server encodes the second video data; the second video data comprises the target object;
and decoding the second encoded video stream through the second player to obtain the second video data.
13. A video data processing apparatus, comprising:
the intercepting module is used for responding to triggering operation aiming at an operable control on a first playing interface and intercepting a target area on the first playing interface; the target area comprises a target object; the first playing interface is obtained when the first video data is played based on a first timestamp of an external clock;
the dividing module is used for dividing the first playing interface into a second playing interface and a third playing interface when second video data matched with the target object is acquired; the second playing interface is used for playing the first video data; the third playing interface is used for playing the second video data;
an adjusting module, configured to adjust video data corresponding to a playing timestamp different from a second timestamp when the playing timestamp different from the second timestamp exists in the playing timestamp of the first video data and the playing timestamp of the second video data, so that the playing timestamp of the adjusted video data is consistent with the second timestamp; the second timestamp is a timestamp of the external clock, and the second timestamp is later than the first timestamp.
14. A computer device, comprising: a processor, a memory, a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is configured to provide data communication functions, the memory is configured to store a computer program, and the processor is configured to call the computer program to perform the method according to any one of claims 1 to 12.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method according to any one of claims 1-12.
CN202010392160.0A 2020-05-11 2020-05-11 Video data processing method and device, computer equipment and storage medium Active CN111601136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010392160.0A CN111601136B (en) 2020-05-11 2020-05-11 Video data processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111601136A true CN111601136A (en) 2020-08-28
CN111601136B CN111601136B (en) 2021-03-26

Family

ID=72191091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010392160.0A Active CN111601136B (en) 2020-05-11 2020-05-11 Video data processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111601136B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103476021A (en) * 2012-06-06 2013-12-25 孙绎成 Device for realizing video surveillance and simultaneously recording mobile equipment information
CN103686315A (en) * 2012-09-13 2014-03-26 深圳市快播科技有限公司 Synchronous audio and video playing method and device
US20140359470A1 (en) * 2013-05-27 2014-12-04 Tencent Technology (Shenzhen) Company Limited Method and device for playing media synchronously
CN104581346A (en) * 2015-01-14 2015-04-29 华东师范大学 Micro video course making system and method
CN107071509A (en) * 2017-05-18 2017-08-18 北京大生在线科技有限公司 The live video precise synchronization method of multichannel
CN107846623A (en) * 2017-10-31 2018-03-27 广东中星电子有限公司 A kind of video interlink method and system
CN108184158A (en) * 2017-12-29 2018-06-19 深圳华侨城卡乐技术有限公司 A kind of method and system that video is played simultaneously
CN109963200A (en) * 2017-12-25 2019-07-02 上海全土豆文化传播有限公司 Video broadcasting method and device
CN110166788A (en) * 2018-08-02 2019-08-23 腾讯科技(深圳)有限公司 Synchronizing information playback method, device and storage medium
CN110636324A (en) * 2019-10-24 2019-12-31 腾讯科技(深圳)有限公司 Interface display method and device, computer equipment and storage medium
CN110933449A (en) * 2019-12-20 2020-03-27 北京奇艺世纪科技有限公司 Method, system and device for synchronizing external data and video pictures

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张少松: "Design and Implementation of a Multi-channel Video Image Processing System", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112217960A (en) * 2020-10-14 2021-01-12 四川长虹电器股份有限公司 Method for synchronously displaying multi-screen playing pictures
CN112217960B (en) * 2020-10-14 2022-04-12 四川长虹电器股份有限公司 Method for synchronously displaying multi-screen playing pictures
CN112910592A (en) * 2021-02-22 2021-06-04 北京小米松果电子有限公司 Clock synchronization method and device, terminal and storage medium
CN112910592B (en) * 2021-02-22 2023-08-04 北京小米松果电子有限公司 Clock synchronization method and device, terminal and storage medium
CN115068911A (en) * 2021-03-16 2022-09-20 北京卡路里科技有限公司 Control method and device of fitness equipment, storage medium and processor
CN115068911B (en) * 2021-03-16 2024-03-15 杭州卡路里体育有限公司 Control method and device of fitness equipment, storage medium and processor
CN115119029A (en) * 2021-03-19 2022-09-27 海信视像科技股份有限公司 Display device and display control method
CN115119029B (en) * 2021-03-19 2024-04-02 海信视像科技股份有限公司 Display equipment and display control method
CN113873315A (en) * 2021-10-27 2021-12-31 北京达佳互联信息技术有限公司 Video data playing method, device and equipment
CN115243088A (en) * 2022-07-21 2022-10-25 苏州金螳螂文化发展股份有限公司 Multi-host video frame-level synchronous rendering method
WO2024104182A1 (en) * 2022-11-16 2024-05-23 北京字跳网络技术有限公司 Video-based interaction method, apparatus, and device, and storage medium

Also Published As

Publication number Publication date
CN111601136B (en) 2021-03-26

Similar Documents

Publication Publication Date Title
CN111601136B (en) Video data processing method and device, computer equipment and storage medium
CN109168078B (en) Video definition switching method and device
WO2023024834A1 (en) Game data processing method and apparatus, and storage medium
EP3533227B1 (en) Anchors for live streams
US10362366B2 (en) Techniques for seamless media content switching during fixed-duration breaks
WO2022052624A1 (en) Video data processing method and apparatus, computer device and storage medium
US12015770B2 (en) Method for encoding video data, device, and storage medium
WO2016074327A1 (en) Control method, apparatus and system of media stream
CN106791988B (en) Multimedia data carousel method and terminal
CN109714622B (en) Video data processing method and device and electronic equipment
JP7431329B2 (en) Video processing methods, apparatus, computer devices and computer programs
RU2603629C2 (en) Terminal device, server device, information processing method, program and system for providing related applications
CN110809168A (en) Video live broadcast processing method and device, terminal and storage medium
US10129592B2 (en) Audience measurement and feedback system
US20120090009A1 (en) Video Assets Having Associated Graphical Descriptor Data
CN113141514A (en) Media stream transmission method, system, device, equipment and storage medium
JP7290260B1 (en) Servers, terminals and computer programs
CN107690093B (en) Video playing method and device
CN112929713A (en) Data synchronization method, device, terminal and storage medium
CN114501052B (en) Live broadcast data processing method, cloud platform, computer equipment and storage medium
CN111436009A (en) Real-time video stream transmission and display method and transmission and play system
CN111010620B (en) Method and device for multimedia resource carousel, electronic equipment and storage medium
US20170048291A1 (en) Synchronising playing of streaming content on plural streaming clients
CN110225370B (en) Timeline control method for personalized presentation of multimedia content
WO2016090916A1 (en) Code stream transmission method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40027958

Country of ref document: HK

GR01 Patent grant