CN117793459A - Video processing method, device and storage medium

Video processing method, device and storage medium

Info

Publication number
CN117793459A
Authority
CN
China
Prior art keywords
video
audio
time stamp
data
played
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211186658.7A
Other languages
Chinese (zh)
Inventor
余水来
白茂生
胥晓明
刘诗萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Unicom Digital Technology Co Ltd
Unicom Cloud Data Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Unicom Digital Technology Co Ltd
Unicom Cloud Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd, Unicom Digital Technology Co Ltd, Unicom Cloud Data Co Ltd
Priority to CN202211186658.7A
Publication of CN117793459A
Legal status: Pending

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to the field of communications technologies, and in particular, to a video processing method, apparatus, and storage medium, which enable a player to play videos in multiple different formats. The method comprises the following steps: extracting audio data from media data to be played; playing the audio data and determining a first audio time stamp, where the first audio time stamp is the time stamp of the audio data currently being played; sending first indication information, which indicates the first audio time stamp, to a local video decoding service; receiving video data from the local video decoding service, where the video data is the video data in the media data to be played whose play time stamp is the same as the first audio time stamp; and performing audio-video synchronization on the video data and the currently played audio data. The method and apparatus are used in video processing.

Description

Video processing method, device and storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a video processing method, apparatus, and storage medium.
Background
Current players typically play video either through the HTML video tag or through Media Source Extensions (MSE). However, there is now a large variety of video formats, and different vendors use different ones (for example, different streaming media transmission protocols, different encoding and decoding algorithms, and different container packaging formats). A player usually supports only some of these formats, so it can play only videos in the formats it supports and cannot play the rest. How to enable a player to play videos in multiple different formats has therefore become a technical problem to be solved.
Disclosure of Invention
The application provides a video processing method, a video processing device and a storage medium, which can enable a player to play videos in a plurality of different formats.
In order to achieve the above purpose, the present application adopts the following technical solutions:
in a first aspect, the present application provides a video processing method, applied to a player, where the method includes: extracting audio data from media data to be played, the media data to be played comprising audio data and video data; playing the audio data and determining a first audio time stamp, the first audio time stamp being the audio time stamp of the audio data currently being played; sending first indication information to a local video decoding service, the first indication information being used for indicating the first audio time stamp, where the local video decoding service is used for decoding videos in multiple video formats and the videos in the multiple video formats include the media data to be played; receiving video data from the local video decoding service, the video data being the video data in the media data to be played whose time stamp is the same as the first audio time stamp; and performing audio-video synchronization on the video data and the currently played audio data.
With reference to the first aspect, in a possible implementation manner, before determining the first audio timestamp, the method further includes: determining a second audio timestamp; the second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; the audio of the first preset duration comprises the audio played currently; and sending second indication information to the local video decoding service, wherein the second indication information is used for indicating the local video decoding service to decode video data corresponding to the second audio time stamp.
With reference to the first aspect, in a possible implementation manner, before determining the second audio timestamp, the method further includes: acquiring play address list information of media data to be played; transmitting decoding instruction information to a local video decoding service, the decoding instruction information including: play address list information of media data to be played.
With reference to the first aspect, in one possible implementation manner, in a case where the media data to be played includes multiple streams, the first indication information further includes a stream identifier (ID) of each of the multiple streams; the second indication information further includes the stream ID of each of the multiple streams; and the play address list information further includes the stream ID of each of the multiple streams.
In a second aspect, the present application provides a video processing method, applied to a local video decoding service, where the method includes: decoding video data in the media data to be played; receiving first indication information from a player and determining a first audio time stamp, the first audio time stamp being the audio time stamp of the audio data currently played by the player; determining, according to the first audio time stamp, the video data having the same time as the first audio time stamp; and sending, to the player, the video data having the same time as the first audio time stamp.
With reference to the second aspect, in one possible implementation manner, decoding video data in media data to be played includes: receiving a second audio timestamp; the second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; determining video data corresponding to the second audio time stamp according to the second audio time stamp; and decoding video data corresponding to the second audio time stamp.
With reference to the second aspect, in one possible implementation manner, before receiving the second audio timestamp, the method further includes: receiving decoding instruction information sent by a player; the decoding instruction information includes: play address list information of media data to be played; video data corresponding to the second audio time stamp is determined from the decoding instruction information.
With reference to the second aspect, in one possible implementation manner, decoding the video data corresponding to the second audio time stamp includes: for a single stream, decoding the video and adding time stamp information; and for multiple streams, performing frame-loss processing during video decoding and adding time stamp information.
In a third aspect, the present application provides a video processing apparatus, for use in a player, the apparatus comprising: a processing unit and a communication unit; the processing unit is used for extracting audio data from the media data to be played, the media data to be played comprising audio data and video data; the processing unit is further used for playing the audio data and determining a first audio time stamp, the first audio time stamp being the audio time stamp of the audio data currently being played; the communication unit is used for sending first indication information to a local video decoding service, the first indication information being used for indicating the first audio time stamp, where the local video decoding service is used for decoding videos in multiple video formats and the videos in the multiple video formats include the media data to be played; the processing unit is used for receiving video data from the local video decoding service, the video data being the video data in the media data to be played whose time stamp is the same as the first audio time stamp; and the processing unit is further used for performing audio-video synchronization on the video data and the currently played audio data.
With reference to the third aspect, in a possible implementation manner, the processing unit is further configured to determine a second audio timestamp; the second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; the audio of the first preset duration comprises the audio played currently; and the communication unit is also used for sending second indication information to the local video decoding service, wherein the second indication information is used for indicating the local video decoding service to decode the video data corresponding to the second audio time stamp.
With reference to the third aspect, in one possible implementation manner, the processing unit is further configured to obtain play address list information of media data to be played; the communication unit is further configured to send decoding instruction information to the local video decoding service, where the decoding instruction information includes: play address list information of media data to be played.
With reference to the third aspect, in one possible implementation manner, in a case where the media data to be played includes multiple streams, the first indication information further includes a stream identifier (ID) of each of the multiple streams; the second indication information further includes the stream ID of each of the multiple streams; and the play address list information further includes the stream ID of each of the multiple streams.
In a fourth aspect, the present application provides a video processing apparatus, applied to a local video decoding service, where the apparatus includes: a processing unit and a communication unit; the processing unit is used for decoding video data in the media data to be played; the communication unit is used for receiving the first indication information from the player and determining a first audio time stamp, the first audio time stamp being the audio time stamp of the audio data currently played by the player; the processing unit is further used for determining, according to the first audio time stamp, the video data having the same time as the first audio time stamp; and the communication unit is further used for sending, to the player, the video data having the same time as the first audio time stamp.
With reference to the fourth aspect, in a possible implementation manner, the processing unit is further configured to receive a second audio timestamp; the second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; the processing unit is also used for determining video data corresponding to the second audio time stamp according to the second audio time stamp; and the processing unit is also used for decoding the video data corresponding to the second audio time stamp.
With reference to the fourth aspect, in a possible implementation manner, the processing unit is further configured to receive decoding instruction information sent by the player; the decoding instruction information includes: play address list information of media data to be played; and the processing unit is also used for determining video data corresponding to the second audio time stamp from the decoding instruction information.
With reference to the fourth aspect, in one possible implementation manner, decoding the video data corresponding to the second audio time stamp includes: for a single stream, decoding the video and adding time stamp information; and for multiple streams, performing frame-loss processing during video decoding and adding time stamp information.
In a fifth aspect, the present application provides a video processing apparatus, the apparatus comprising: a processor and a communication interface; the communication interface is coupled to a processor for running a computer program or instructions to implement the video processing method as described in any one of the possible implementations of the first aspect and the first aspect.
In a sixth aspect, the present application provides a video processing apparatus, the apparatus comprising: a processor and a communication interface; the communication interface is coupled to a processor for running a computer program or instructions to implement the video processing method as described in any one of the possible implementations of the second aspect and the second aspect.
In a seventh aspect, the present application provides a computer readable storage medium having instructions stored therein that, when run on a terminal, cause the terminal to perform a video processing method as described in any one of the possible implementations of the first aspect and the first aspect.
In an eighth aspect, the present application provides a computer readable storage medium having instructions stored therein which, when run on a terminal, cause the terminal to perform a video processing method as described in any one of the possible implementations of the second aspect and the second aspect.
These and other aspects of the present application will be more readily apparent from the following description.
Based on the technical solutions above, compared with the prior art, the video processing method provided by the application has at least the following beneficial effects. The player determines a first audio time stamp by extracting the audio data of the media to be played and playing that audio data, and at the same time sends first indication information to the local video decoding service. The local video decoding service decodes video according to the first indication information, and it can decode videos in multiple video formats. After the local video decoding service finishes decoding, it sends the video data whose time is the same as the first audio time stamp to the player, and the player performs audio-video synchronization on that video data and the currently played audio data. Compared with prior-art schemes, in which a video in an unsupported video format cannot be decoded during playing, videos in multiple video formats can be decoded by the local video decoding service and played by the player, which effectively solves the problem that a player cannot play videos in multiple different formats.
Drawings
Fig. 1 is a block diagram of a video processing apparatus provided in the present application;
Fig. 2 is a flowchart of a video processing method of a player provided in the present application;
Fig. 3 is a flowchart of another video processing method of a player provided in the present application;
Fig. 4 is a flowchart of a video processing method provided in the present application;
Fig. 5 is a flowchart of yet another video processing method provided in the present application;
Fig. 6 is a flowchart of yet another video processing method provided in the present application;
Fig. 7 is a block diagram of a video processing apparatus applied to a player provided in the present application;
Fig. 8 is a block diagram of a video processing apparatus applied to a local video decoding service provided in the present application;
Fig. 9 is a schematic diagram of another possible structure of a video processing apparatus provided in the present application;
Fig. 10 is a schematic diagram of still another possible structure of a video processing apparatus provided in the present application.
Detailed Description
The following describes in detail a video processing method and apparatus provided in an embodiment of the present application with reference to the accompanying drawings.
The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone.
The terms "first" and "second" and the like in the description and in the drawings are used for distinguishing between different objects or for distinguishing between different processes of the same object and not for describing a particular sequential order of objects.
Furthermore, references to the terms "comprising" and "having" and any variations thereof in the description of the present application are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed but may optionally include other steps or elements not listed or inherent to such process, method, article, or apparatus.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In the description of the present application, unless otherwise indicated, the meaning of "a plurality" means two or more.
Fig. 1 is a schematic structural diagram of a video playing device according to an embodiment of the present disclosure. As shown in fig. 1, the video playback device 100 includes at least one processor 101, a communication line 102, and at least one communication interface 104, and may also include a memory 103. The processor 101, the memory 103, and the communication interface 104 may be connected through a communication line 102.
The processor 101 may be a central processing unit (central processing unit, CPU), or may be an application specific integrated circuit (application specific integrated circuit, ASIC), or one or more integrated circuits configured to implement embodiments of the present disclosure, such as: one or more digital signal processors (digital signal processor, DSP), or one or more field programmable gate arrays (field programmable gate array, FPGA).
Communication line 102 may include a pathway for communicating information between the aforementioned components.
The communication interface 104, for communicating with other devices or communication networks, may use any transceiver-like device, such as ethernet, radio access network (radio access network, RAN), wireless local area network (wireless local area networks, WLAN), etc.
The memory 103 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
In a possible design, the memory 103 may exist independent of the processor 101, i.e. the memory 103 may be a memory external to the processor 101; in this case, the memory 103 may be connected to the processor 101 through the communication line 102 and used for storing execution instructions or application program code, and execution is controlled by the processor 101 to implement the video processing method provided by the embodiments described below in the present disclosure. In yet another possible design, the memory 103 may be integrated with the processor 101, i.e., the memory 103 may be an internal memory of the processor 101; e.g., the memory 103 may be a cache and may be used to temporarily store some data and instruction information, etc.
As one implementation, processor 101 may include one or more CPUs, such as CPU0 and CPU1 in fig. 1. As another implementation, the video playback device 100 may include multiple processors, such as the processor 101 and the processor 107 in fig. 1. As yet another implementation, the video playing apparatus 100 may further include an output device 105 and an input device 106.
With the gradual development of Internet technology and audio/video technology, players (e.g., the H5 player), as a core audio/video technology, are now widely applied in fields such as online education, online meetings, and security monitoring, and see heavy use.
Most conventional players directly use the video tag to play videos or use Media Source Extensions (MSE) to play media streams. However, there is now a large variety of video formats, and different vendors use different ones (for example, different streaming media transmission protocols, different encoding and decoding algorithms, and different container packaging formats). A player usually supports only some of these formats, so it can play only videos in the formats it supports and cannot play the rest. How to enable a player to play videos in multiple different formats has therefore become a technical problem to be solved.
Meanwhile, in various application scenarios, especially security-monitoring and conference scenarios, the demand for multi-channel video playing is gradually increasing, the video resolution in each application scenario keeps rising, and new video coding standards need to be supported. When multi-channel video is played, the player cannot handle it well, because browsers do not support hardware decoding of video in certain formats and WASM software decoding is limited in performance.
As shown in fig. 2, in the related art, when a player plays a video, a process of performing video processing includes:
step one: the player multi-source acquires the first streaming media data corresponding to the playing request and judges whether the first streaming media data is in the Fragment MP4 format.
Step two: when the first streaming media data is in Fragment MP4 format, the player sends the first streaming media data to the media source extension (Media Source Extensions, MSE) so that the media source extension (Media Source Extensions, MSE) plays the first streaming media data.
Step three: when the first streaming media data is not in the Fragment MP4 format, the player performs data buffering, decapsulates demux video data, and finally repacks the video data after decapsulation demux into second streaming media data in the Fragment MP4 format.
Step four: the player transmits the second streaming media data to the media source extension (Media Source Extensions, MSE) so that the media source extension (Media Source Extensions, MSE) plays the second streaming media data.
As shown in fig. 3, a further process of performing video processing when a player provided in the related art plays a video includes:
step 1: the player receives the play request, obtains first streaming media data corresponding to the play request in a multi-source mode by utilizing a content delivery network service (Content Delivery Network, CDN), and stores the first streaming media data in a cache.
Step 2: and the player extracts the cache stream media data in unit time and judges whether the cache stream media is in the Fragment MP4 format or not.
Step 3: and when the cached streaming media data is in the Fragment MP4 format, the player adds the cached streaming media data into the waiting queue.
Step 4: when the cached streaming media data is not in the Fragment MP4 format, the player uses a demux to unpack the cached streaming media data, then uses the demux to pack the unpacked data into target streaming media data in the Fragment MP4 format, and adds the target streaming media data into a waiting queue.
Step 5: the player sends the streaming media data to be played in the waiting queue to the media source expansion (Media Source Extensions, MSE) in sequence so that the media source expansion (Media Source Extensions, MSE) decodes and plays the streaming media data to be played.
Step 6: and when the streaming media data corresponding to the on-demand request is completely played, the player releases the cached streaming media data.
The above two player processing methods have the following three problems: 1. the video tag only supports the MP4, Ogg, and WebM video container formats; 2. the supported video coding formats are limited by the browser (for example, browsers such as Chrome, the 360 browser, and Firefox do not support hardware HEVC decoding, so on those browsers HEVC can only be decoded by WASM software); 3. general-purpose WASM software decoding cannot meet the requirement of multi-channel video playing.
In order to avoid these limitations during playing and to enable the player to support multi-channel video playing, the application provides the video processing method shown in fig. 4. As shown in fig. 4, the player extracts audio data from the media data to be played, plays the audio data, and determines a first audio time stamp. The player then sends first indication information to the local video decoding service. The local video decoding service decodes the video data in the media data to be played; note that the local video decoding service can decode videos in multiple video formats, which overcomes the limitation that the player itself supports only some video formats. Meanwhile, the local video decoding service device determines the first audio time stamp according to the first indication information, determines, according to the first audio time stamp, the video data having the same time as the first audio time stamp, and sends that video data to the player. Finally, the player receives the video data from the local video decoding service and performs audio-video synchronization on the video data and the currently played audio data. This method prevents the player from failing to play during use; compared with the prior-art scheme in which the player cannot decode video in an unsupported video format, videos in multiple video formats can be decoded by the local video decoding service and played by the player, which effectively solves the problem that a player cannot play videos in multiple different formats.
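To make this interaction concrete, the following is a minimal TypeScript sketch of the message shapes exchanged between the player and the local video decoding service. All interface, field, and value names are assumptions introduced for illustration; the application does not prescribe a wire format beyond the use of the websocket/http protocol.

```typescript
// Illustrative message shapes for the player / local-decoding-service
// interaction. All names are assumptions made for this sketch.

// Decoding instruction information: carries the play address list.
interface DecodingInstruction {
  kind: "decode";
  playAddressList: { streamId: string; url: string }[];
}

// Second indication information: asks the service to pre-decode the
// video corresponding to a buffered audio window.
interface SecondIndication {
  kind: "preDecode";
  streamId?: string;        // present when multiple streams are played
  windowStartMs: number;    // start of the buffered audio window
  windowEndMs: number;      // end of the buffered audio window
}

// First indication information: reports the time stamp of the audio
// currently being played and requests the matching video data.
interface FirstIndication {
  kind: "pullVideo";
  streamId?: string;
  audioTimestampMs: number; // the first audio time stamp
}

// Service response: a decoded picture (e.g. one JPEG frame) whose time
// stamp equals the first audio time stamp.
interface VideoFrameMessage {
  kind: "videoFrame";
  streamId?: string;
  timestampMs: number;
  jpegBase64: string;
}
```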
As shown in fig. 4, a flowchart of a video processing method provided in an embodiment of the present application is shown, where the video processing method provided in the embodiment of the present application may be applied to the video playing device shown in fig. 1, and the video processing method provided in the embodiment of the present application may be implemented by the following steps.
S401, the video playing device extracts audio data from the media data to be played.
Specifically, the video playing device only extracts audio data from media data to be played, where the media data to be played may be media data in a live broadcast scene or media data in a recorded broadcast scene.
Illustratively, the video player extracts a 100 second audio source from a movie fragment.
S402, the video playing device plays the audio data and determines a first audio time stamp.
Wherein the first audio time stamp is an audio time stamp of the audio data currently played.
Specifically, the video playing device plays the extracted audio data, and determines the time stamp of the audio data of each node.
Illustratively, the video player plays the 100-second audio extracted from a movie clip A. During playing, the video playing device takes the time stamp of the 2 ms of audio currently being played as the first audio time stamp: for example, when the audio at the 2 ms position of the 100 seconds is currently being played, the audio time stamp of that 2 ms is used as the first audio time stamp. The application is not limited to this; audio time stamps of other lengths may also be used.
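As a minimal sketch of how a player could determine the first audio time stamp, assuming the extracted audio is played through an HTMLAudioElement (an assumption; the application does not mandate a particular playback mechanism):

```typescript
// Sketch: read the first audio time stamp from the element playing the
// extracted audio. currentTime is in seconds; the examples above use ms.
function getFirstAudioTimestampMs(audio: HTMLAudioElement): number {
  return Math.round(audio.currentTime * 1000);
}
```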
S403, the video playing device sends first indication information to the local video decoding service. Accordingly, the local video decoding service device receives the first indication information from the video playing device.
The first indication information is used for indicating a first audio time stamp; the local video decoding service is used for decoding videos in a plurality of video formats; video data in a plurality of video formats includes the media data to be played.
Specifically, the video playing device sends a timestamp of the currently played audio data to the local video decoding service through a websocket/http protocol, and makes a request for pulling the video data to the local video decoding service.
Illustratively, the video player sends the timestamp of the currently playing 2ms audio source to the local video decoder via the websocket/http protocol, and requests the local video decoder to pull 2ms video data corresponding to the timestamp of the currently playing 2ms audio source via the websocket/http protocol.
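A hedged sketch of how the first indication information could be sent over websocket, following the FirstIndication shape sketched earlier; the service address, port, and message fields are illustrative assumptions:

```typescript
// Sketch: send the first indication information to the local video
// decoding service over a WebSocket. Address and port are assumed.
const ws = new WebSocket("ws://127.0.0.1:9000/decode");

function sendFirstIndication(audio: HTMLAudioElement, streamId?: string): void {
  const msg: FirstIndication = {
    kind: "pullVideo",
    streamId,
    audioTimestampMs: Math.round(audio.currentTime * 1000),
  };
  ws.send(JSON.stringify(msg)); // request the matching video data
}
```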
S404, the local video decoding service device decodes the video data in the media data to be played.
In one possible implementation, the local video decoding service device decodes video data in the media data to be played.
S405, the local video decoding service device determines a first audio time stamp according to the first indication information.
The first audio time stamp is a time stamp of audio data currently played by the player.
In one possible implementation, the local video decoding service device receives the first indication information and determines from it the time stamp of the audio data currently played by the player.
S406, the local video decoding service device determines video data with the same time as the first audio time stamp according to the first audio time stamp.
In one possible implementation manner, the local video decoding service device determines, according to the determined timestamp of the audio data currently played by the player, video data having the same time as the timestamp of the audio data currently played.
Illustratively, the local video decoder decodes the video data having the same time as the first audio time stamp into a JPEG picture stream. For example, the video decoded by the local video decoder corresponds to the 2 ms of video within the 100-second video of step S402, which is decoded into the JPEG picture-stream format.
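The following is a sketch of how the service side could look up the decoded frame matching the first audio time stamp. The in-memory map and the nearest-earlier-frame fallback are assumptions for illustration; TypeScript is used for consistency with the other sketches, though the local service could be implemented in any language:

```typescript
// Sketch of the service-side lookup: decoded JPEG frames are kept in a
// map keyed by time stamp so that the frame matching the first audio
// time stamp can be returned to the player.
interface DecodedFrame {
  timestampMs: number;
  jpeg: Uint8Array; // one picture of the decoded JPEG picture stream
}

const decodedFrames = new Map<number, DecodedFrame>();

function findFrameForAudioTimestamp(audioTimestampMs: number): DecodedFrame | undefined {
  // Exact match first; otherwise fall back to the nearest earlier frame,
  // since video frames are usually sparser than audio time stamps.
  const exact = decodedFrames.get(audioTimestampMs);
  if (exact !== undefined) return exact;
  let best: DecodedFrame | undefined;
  for (const frame of decodedFrames.values()) {
    if (frame.timestampMs <= audioTimestampMs &&
        (best === undefined || frame.timestampMs > best.timestampMs)) {
      best = frame;
    }
  }
  return best;
}
```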
S407, the local video decoding service device transmits video data having the same time as the first audio time stamp to the player. Accordingly, the video playback device receives video data from the local video decoding service.
In one possible implementation, the local video decoding service device sends the video data in S406 that is the same as the time stamp of the currently played audio data to the player, and it is noted that: the video data in S406 and S407 are video data decoded by the local video decoding service device in S404 and the same time as the time stamp of the currently played audio data.
The video data is the video data with the same time as the first audio time stamp in the media data to be played.
Illustratively, the video player receives 2ms of video data corresponding to the timestamp of the currently playing 2ms audio source pulled by the local video decoder.
S408, the video playing device performs audio-video synchronization on the video data and the currently played audio data.
Specifically, the video playing device plays the video data according to the time stamp of the video data, and draws the video data through the Canvas; and finally, the video playing device performs audio and video synchronization according to the time stamp of the currently played audio data and the time stamp of the video data.
Illustratively, the video player plays according to the time of 2ms video data, and then performs audio-video synchronization through the time stamp of 2ms audio data and the time stamp of 2ms video data.
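A minimal sketch of the drawing-and-synchronization step, assuming JPEG frames are received as base64 strings and drawn through the Canvas 2D API; the 40 ms tolerance is an illustrative assumption, not a value given by the application:

```typescript
// Sketch: draw a received JPEG frame on a Canvas and keep it in step
// with the audio clock before drawing.
async function renderFrame(
  canvas: HTMLCanvasElement,
  audio: HTMLAudioElement,
  frame: { timestampMs: number; jpegBase64: string },
): Promise<void> {
  const audioMs = Math.round(audio.currentTime * 1000);
  // Only draw frames whose time stamp is close enough to the audio
  // currently being played (assumed 40 ms tolerance).
  if (Math.abs(frame.timestampMs - audioMs) > 40) return;

  const img = new Image();
  img.src = `data:image/jpeg;base64,${frame.jpegBase64}`;
  await img.decode(); // wait until the JPEG picture is decoded
  canvas.getContext("2d")?.drawImage(img, 0, 0, canvas.width, canvas.height);
}
```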
The technical solution provided by this embodiment brings at least the following beneficial effects. The video processing method provided by the application effectively avoids the problem that the player is limited during playing and ensures that the player supports playing several different kinds of video. The player of the application determines a first audio time stamp by extracting and playing audio data, and at the same time sends first indication information to the local video decoding service. The local video decoding service decodes the video data in the media data to be played; note that the local video decoding service device can decode videos in multiple video formats, which overcomes the limitation that the player itself supports only some video formats. Meanwhile, the local video decoding service device determines the first audio time stamp according to the first indication information, then determines, according to the first audio time stamp, the video data having the same time as the first audio time stamp and sends it to the player. Finally, the player receives the video data from the local video decoding service and performs audio-video synchronization on the video data and the currently played audio data. This method prevents the player from failing to play during use, resolves the previous restriction to only some video formats, and ensures that the player plays videos in multiple different formats.
The video processing method provided by the embodiment of the application will be described in detail below with reference to specific embodiments, and the method is applied to a video processing device.
In a possible implementation manner, as shown in fig. 5 in conjunction with fig. 4, before the first audio time stamp is determined in step S402, the following steps, implemented through S501-S505, are further included and are described in detail below:
S501, the video playing device determines a second audio time stamp.
The second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; the audio of the first preset duration comprises the audio played currently.
In one possible implementation, the video playback device determines a time stamp of the audio for the first preset duration and sends the time stamp of the audio for the first preset duration to the local video decoding service.
Illustratively, the video player determines a time stamp for audio for a first preset duration of 100 seconds, which may also be referred to as an audiovisual stream play buffer time stamp for 100 seconds, and sends the time stamp for the audio for 100 seconds to the local video decoder.
S502, the video playing device sends second indication information to the local video decoding service.
The second indication information is used for indicating the local video decoding service to decode the video data corresponding to the second audio time stamp.
In a possible implementation manner, the video playing device sends decoding indication information to the local video decoding service through websocket/http protocol.
Illustratively, the video player transmits decoding instruction information to the local video decoder through a websocket/http protocol, requesting the local video decoder to decode video data corresponding to a time stamp of 100 seconds of audio.
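A hedged sketch of sending the second indication information for a buffered audio window; the window fields and the function signature are assumptions for illustration:

```typescript
// Sketch: send the second indication information so the service can
// pre-decode the video for the buffered audio window.
function sendSecondIndication(
  ws: WebSocket,
  windowStartMs: number,
  windowDurationMs: number, // e.g. 100000 ms for the 100-second example
  streamId?: string,
): void {
  ws.send(JSON.stringify({
    kind: "preDecode",
    streamId,
    windowStartMs,
    windowEndMs: windowStartMs + windowDurationMs,
  }));
}
```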
S503, the local video decoding service device receives the second audio time stamp.
Wherein the second audio time stamp comprises a time stamp of audio of a first preset duration in the audio data.
In one possible implementation, the local video decoding service device receives a timestamp of audio for a first preset duration.
Illustratively, the local video decoding service device receives an audio play buffer time stamp of 100 seconds.
S504, the local video decoding service device determines video data corresponding to the second audio time stamp according to the second audio time stamp.
In one possible implementation, the local video decoding service device determines video data corresponding to the time stamp of the audio of the first preset duration from the playlist information according to the received time stamp of the audio of the first preset duration.
Illustratively, the local video decoding service device determines video data corresponding to the audio-visual stream play buffer time stamp of 100 seconds from the play address list information according to the audio-visual stream play buffer time stamp of 100 seconds.
S505, the local video decoding service device decodes the video data corresponding to the second audio time stamp.
In one possible implementation, the local video decoding service device receives the second indication information and starts decoding the video data: for a single stream, it decodes the video data in the media data to be played into a JPEG picture stream and appends the time stamp information.
For multiple streams, after the video data in each stream of the media data to be played is decoded, frame-loss processing is performed according to the image time of the decoded frame data of the different video streams and the time stamp of the currently played audio, the video data is compressed into a picture stream, and time stamp information is added at the same time.
Illustratively, the local video decoder receives the second indication information sent by the video player and then starts decoding the video data corresponding to the 100-second audio-visual stream buffer time stamp; for a single stream, it decodes the video data in the media data to be played into a JPEG picture stream and attaches the time stamp information.
For multiple streams, after the video data in each stream of the media data to be played is decoded, frame-loss processing is performed according to the image time of the decoded frame data of the different video streams and the time stamp of the currently played audio, the video data is compressed into a picture stream, and time stamp information is added at the same time.
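The following sketch illustrates one plausible form of this frame-loss decision for multiple streams: frames whose image time has fallen behind the currently played audio are dropped rather than compressed into the picture stream. The lag threshold and function names are assumptions, not values given by the application:

```typescript
// Sketch of the frame-loss decision for multiple streams: frames whose
// image time has already fallen behind the currently played audio are
// dropped instead of being compressed into the picture stream.
const MAX_LAG_MS = 100; // assumed tolerance before a frame is discarded

function shouldDropFrame(frameTimeMs: number, currentAudioMs: number): boolean {
  return frameTimeMs < currentAudioMs - MAX_LAG_MS;
}

function processDecodedFrame(
  streamId: string,
  frameTimeMs: number,
  currentAudioMs: number,
  compressToPictureStream: (streamId: string, frameTimeMs: number) => void,
): void {
  if (shouldDropFrame(frameTimeMs, currentAudioMs)) {
    return; // frame-loss processing: this frame can no longer be synced
  }
  // Otherwise compress into the picture stream with its time stamp.
  compressToPictureStream(streamId, frameTimeMs);
}
```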
The video playing device determines a second audio time stamp before determining the first audio time stamp. The local video decoding service then determines the video data corresponding to the second audio time stamp; the video playing device sends second indication information to the local video decoding service, and the local video decoding service receives the second indication information and decodes the video data corresponding to the second audio time stamp. The local video decoding service adds time stamp information to the decoded video data; the video playing device requests, according to the first audio time stamp, that the local video decoding service extract and send the video data having the same time as the first audio time stamp; and finally the video playing device achieves audio-video synchronization according to the time stamp of the decoded video data and the first audio time stamp. Because the local video decoding service decodes in advance the video data corresponding to the audio-video stream play buffer time stamp, i.e., the video data corresponding to the second audio time stamp, the subsequent audio-video synchronization can avoid frame loss during playing.
The implementation of the steps performed before the first audio time stamp is determined has been described in detail above.
A detailed description of a specific implementation of the steps before determining the second audio timestamp follows.
As shown in fig. 6, in one possible implementation, the steps performed before the second audio time stamp is determined are implemented through S601-S604, which are described in detail below:
S601, the video playing device acquires the play address list information of the media data to be played.
In a possible implementation, the video playing device initializes and starts a local video decoding service, and obtains play address list information of media data to be played from the streaming media service.
The video playing device performs initialization preparation work, starts a local video decoder, obtains a video playing address corresponding to the target video identification information from the streaming media server, and determines audio and video data to be played according to the video playing address.
S602, the video playing device sends decoding instruction information to the local video decoding service.
Wherein decoding instruction information includes: play address list information of media data to be played.
In a possible implementation manner, the video playing device sends the playlist information obtained from the streaming media service to the local video decoding service through websocket/http protocol.
Illustratively, the video playing device sends the play address list information of a movie clip A, obtained from the streaming media service, to the local video decoding service through the websocket/http protocol.
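A minimal sketch of this initialization flow, assuming the play address list is fetched over HTTP and forwarded as the decoding instruction information; the URLs and message shape are illustrative assumptions:

```typescript
// Sketch of the initialization flow: obtain the play address list from
// the streaming media service, then forward it to the local video
// decoding service as the decoding instruction information.
async function initDecoding(ws: WebSocket, videoId: string): Promise<void> {
  const resp = await fetch(`https://stream.example.com/playlist?video=${videoId}`);
  const playAddressList: { streamId: string; url: string }[] = await resp.json();
  ws.send(JSON.stringify({ kind: "decode", playAddressList }));
}
```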
S603, the local video decoding service device receives decoding instruction information sent by the player.
Wherein decoding instruction information includes: play address list information of media data to be played.
In one possible implementation, the local video decoding service device is started by the video playing device and begins listening for control commands from the video playing device; the local video decoding service then receives the play address list information sent by the video playing device, where the play address list includes stream IDs.
Illustratively, the local video decoder starts and listens to the video player, and the local video decoder receives the play address list of movie clip A that the video player obtained from the streaming server.
S604, the local video decoding service device determines video data corresponding to the second audio time stamp from the decoding instruction information.
In one possible implementation, the local video decoding service device determines video data corresponding to the audio play buffer time stamp from the play address list information.
Illustratively, the local video decoding server determines video data corresponding to an audio play buffer time stamp of 100 seconds from the play address list information.
In the present application, the video playing device sends the play address list information to the local video decoding service, which ensures that the local video decoding service can determine the video data corresponding to the audio play buffer time stamp from the play address list information and finish decoding that video data in advance. The local video decoding service can decode videos in multiple video formats, ensuring that the player plays videos in multiple different formats.
The implementation of the embodiment of the steps before determining the second audio timestamp is described in detail above.
The foregoing describes in detail how the video processing method supports video playing.
The embodiment of the present application may divide functional modules or functional units of the video processing apparatus according to the above method example, for example, each functional module or functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in hardware, or in software functional modules or functional units. The division of the modules or units in the embodiments of the present application is merely a logic function division, and other division manners may be implemented in practice.
Fig. 7 is a schematic structural diagram of a video processing device applied to a player according to an embodiment of the present application, where the device includes: a processing unit 701 and a communication unit 702; the processing unit 701 is configured to extract audio data from media data to be played; the processing unit 701 is further configured to play the audio data, and determine a first audio timestamp; the first audio time stamp is the time stamp of the audio data which is currently played; the communication unit 702 is configured to send first indication information to a local video decoding service; the first indication information is used for indicating the first audio time stamp; the processing unit 701 is further configured to receive video data from the local video decoding service; the video data are video data with the same playing time stamp as the first audio time stamp in the media data to be played; the processing unit 701 is further configured to perform audio-video synchronization on the video data and the currently played audio data.
Optionally, the processing unit 701 is further configured to determine a second audio timestamp; the second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; the audio of the first preset duration comprises the audio played currently; the communication unit 702 is further configured to send second indication information to a local video decoding service, where the second indication information is used to instruct the local video decoding service to decode video data corresponding to the second audio timestamp.
Optionally, the processing unit 701 is further configured to obtain playlist information; the communication unit 702 is further configured to send decoding instruction information to the local video decoding service, where the decoding instruction information includes: play address list information.
As shown in fig. 8, a schematic structural diagram of a video processing apparatus applied to a local video decoding service according to an embodiment of the present application is provided, where the apparatus includes: a processing unit 801 and a communication unit 802; the processing unit 801 is configured to decode video data in media data to be played; the communication unit 802 is configured to receive first indication information from a player, and determine a first audio timestamp; the first audio time stamp is a time stamp of audio data currently played by the player; the processing unit 801 is further configured to determine, according to the first audio timestamp, video data having a same time as the first audio timestamp; the communication unit 802 is further configured to send video data to the player at the same time as the first audio timestamp.
Optionally, the processing unit 801 is further configured to receive a second audio timestamp; the second audio time stamp comprises a time stamp of audio with a first preset duration in the audio data; the processing unit 801 is further configured to determine video data corresponding to the second audio timestamp according to the second audio timestamp; the processing unit 801 is further configured to decode video data corresponding to the second audio timestamp.
Optionally, the processing unit 801 is further configured to receive decoding instruction information sent by the player; the decoding instruction information includes: play address list information; the processing unit 801 is further configured to determine video data corresponding to the second audio timestamp from the decoding instruction information.
When implemented in hardware, the communication unit 702 in the embodiments of the present application may be integrated on a communication interface, and the processing unit 701 may be integrated on a processor. A specific implementation is shown in fig. 9.
Fig. 9 shows still another possible structural diagram of the video processing apparatus involved in the above-described embodiment. The video processing apparatus includes: a processor 902 and a communication interface 903. The processor 902 is configured to control and manage the actions of the video processing apparatus, e.g., perform the steps performed by the processing unit 701 described above, and/or perform other processes of the techniques described herein. The communication interface 903 is used to support communication between the video processing apparatus and other network entities, for example, to perform the steps performed by the communication unit 702 described above. The video processing device may also include a memory 901 and a bus 904, the memory 901 for storing program codes and data for the video processing device.
Wherein the memory 901 may be a memory or the like in the video processing apparatus, which may include a volatile memory such as a random access memory; the memory may also include non-volatile memory, such as read-only memory, flash memory, hard disk or solid state disk; the memory may also comprise a combination of the above types of memories.
The processor 902 may be implemented or realized with the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination that performs the function of a computation, e.g., a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, etc.
Bus 904 may be an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus or the like. The bus 904 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 9, but this does not mean that there is only one bus or only one type of bus.
When implemented in hardware, the communication unit 802 in the embodiments of the present application may be integrated on a communication interface, and the processing unit 801 may be integrated on a processor. A specific implementation is shown in fig. 10.
Fig. 10 shows still another possible structural diagram of the video processing apparatus involved in the above-described embodiment. The video processing apparatus includes: a processor 1002 and a communication interface 1003. The processor 1002 is configured to control and manage the actions of the video processing apparatus, for example, to perform the steps performed by the processing unit 801 described above, and/or to perform other processes of the techniques described herein. The communication interface 1003 is used to support communication of the video processing apparatus with other network entities, for example, to perform the steps performed by the communication unit 802 described above. The video processing device may also include a memory 1001 and a bus 1004, the memory 1001 for storing program codes and data of the video processing device.
Wherein the memory 1001 may be a memory or the like in a video processing apparatus, which may include a volatile memory such as a random access memory; the memory may also include non-volatile memory, such as read-only memory, flash memory, hard disk or solid state disk; the memory may also comprise a combination of the above types of memories.
The processor 1002 may be implemented or realized with the various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein. The processor may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination that performs the function of a computation, e.g., a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, etc.
Bus 1004 may be an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus or the like. The bus 1004 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 10, but this does not mean that there is only one bus or only one type of bus.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above. The specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which are not described herein.
The present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the video processing method of the method embodiments described above.
The embodiment of the application also provides a computer readable storage medium, in which instructions are stored, which when executed on a computer, cause the computer to execute the video processing method in the method flow shown in the method embodiment.
The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), a register, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing, or any other form of computer readable storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (Application Specific Integrated Circuit, ASIC). In the context of the present application, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Since the video processing apparatus, the computer-readable storage medium, and the computer program product in the embodiments of the present invention can all be applied to the above method, for the technical effects they can achieve, reference may likewise be made to the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto; any change or substitution within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A video processing method, for use in a player, the method comprising:
extracting audio data from media data to be played; the media data to be played comprises audio data and video data;
playing the audio data and determining a first audio time stamp; the first audio time stamp is an audio time stamp of the audio data which is currently played;
transmitting first indication information to a local video decoding service; the first indication information is used for indicating the first audio time stamp; the local video decoding service is used for decoding videos in a plurality of video formats; the videos in the plurality of video formats include the media data to be played;
receiving video data from the local video decoding service; the video data is the video data in the media data to be played whose time stamp is the same as the first audio time stamp;
and performing audio-video synchronization on the video data and the currently played audio data.
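By way of illustration only, the following is a minimal TypeScript sketch of the player-side flow of claim 1. The class and field names, the JSON message format, and the WebSocket endpoint are all assumptions, not part of the claimed method; the sketch only shows the order of operations: extract and play the audio, report the first audio time stamp, receive the matching video data, and render it in sync.

```typescript
// Minimal sketch of the player-side flow (claim 1); all names are illustrative.
interface VideoFrame {
  timestamp: number; // presentation time stamp of the decoded frame, in seconds
  pixels: string;    // decoded picture, assumed here to arrive base64-encoded
}

class PlayerSketch {
  private audio = new Audio();                       // plays the extracted audio data
  private ws = new WebSocket("ws://127.0.0.1:9000"); // assumed channel to the local video decoding service

  async start(extractedAudioUrl: string): Promise<void> {
    this.audio.src = extractedAudioUrl; // audio data extracted from the media data to be played
    await this.audio.play();
    this.ws.onmessage = (e) => this.render(JSON.parse(e.data) as VideoFrame);

    // Report the first audio time stamp (the currently played audio position)
    // so the service can return the video data with the same time stamp.
    setInterval(() => {
      this.ws.send(JSON.stringify({
        type: "first_indication",
        timestamp: this.audio.currentTime, // first audio time stamp
      }));
    }, 40); // roughly one report per frame at 25 fps
  }

  private render(frame: VideoFrame): void {
    // Audio-video synchronization: audio is the master clock, so a frame is shown
    // only while its time stamp matches the audio position (tolerance is assumed).
    const drift = frame.timestamp - this.audio.currentTime;
    if (Math.abs(drift) < 0.04) {
      // draw frame.pixels to a canvas here; rendering details are not claimed
    }
  }
}
```

A real player would also buffer frames and resynchronize after seeks; the claim covers only the exchange of the time stamp and the returned video data.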
2. The video processing method of claim 1, further comprising, prior to determining the first audio timestamp:
determining a second audio time stamp; the second audio time stamp comprises the time stamp of audio of a first preset duration in the audio data; the audio of the first preset duration comprises the currently played audio;
and sending second indication information to the local video decoding service, wherein the second indication information is used for instructing the local video decoding service to decode the video data corresponding to the second audio time stamp.
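Read this way, the second indication is a look-ahead: the player announces a window of audio time stamps so the service can decode the corresponding video in advance. A minimal sketch under the same hypothetical JSON channel as above; the 2-second window is only an assumed value for the first preset duration.

```typescript
// Hypothetical look-ahead request (claim 2): announce a window of audio time stamps,
// starting at the currently played audio, so the service decodes the matching video early.
function sendSecondIndication(ws: WebSocket, audio: HTMLAudioElement, presetDuration = 2): void {
  ws.send(JSON.stringify({
    type: "second_indication",
    window: {
      from: audio.currentTime,                // includes the currently played audio
      to: audio.currentTime + presetDuration, // first preset duration (assumed: 2 s)
    },
  }));
}
```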
3. The video processing method of claim 2, further comprising, prior to determining the second audio timestamp:
acquiring play address list information of the media data to be played;
transmitting decoding instruction information to the local video decoding service, the decoding instruction information comprising: the play address list information of the media data to be played.
4. The video processing method according to claim 3, wherein, in a case where the media data to be played comprises multiple streams:
the first indication information further comprises: a stream identity (ID) of each stream in the multiple streams;
the second indication information further comprises: a stream identity ID of each stream in the multiple streams;
and the play address list information further comprises: a stream identity ID of each stream in the multiple streams.
5. A video processing method, applied to a local video decoding service, the method comprising:
decoding video data in the media data to be played;
receiving first indication information from a player, and determining a first audio time stamp; the first audio time stamp is an audio time stamp of audio data currently played by the player;
determining, according to the first audio time stamp, the video data whose time stamp is the same as the first audio time stamp;
and sending, to the player, the video data whose time stamp is the same as the first audio time stamp.
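On the service side, claim 5 reduces to a decode-then-lookup loop: decode ahead, buffer frames by time stamp, and on each first indication return the frame matching the reported audio position. A minimal sketch; the millisecond quantization and the 40 ms tolerance are assumptions, not claim limitations.

```typescript
// Hypothetical service-side lookup (claim 5): return the decoded video frame whose
// time stamp is the same as the first audio time stamp reported by the player.
const decodedFrames = new Map<number, Uint8Array>(); // time stamp in ms -> decoded frame

function onFirstIndication(tsSeconds: number, send: (frame: Uint8Array) => void): void {
  const target = Math.round(tsSeconds * 1000); // quantize the audio time stamp to ms
  for (const [ts, frame] of decodedFrames) {
    if (Math.abs(ts - target) <= 40) { // within one frame interval at 25 fps (assumed tolerance)
      send(frame);                     // video data matching the first audio time stamp
      decodedFrames.delete(ts);        // delivered frames can be evicted
      return;
    }
  }
}
```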
6. The video processing method according to claim 5, wherein the decoding video data in the media data to be played comprises:
receiving a second audio time stamp; the second audio time stamp comprises the time stamp of audio of a first preset duration in the audio data;
determining video data corresponding to the second audio time stamp according to the second audio time stamp;
and decoding video data corresponding to the second audio time stamp.
7. The video processing method of claim 6, further comprising, prior to receiving the second audio timestamp:
receiving decoding instruction information sent by the player; the decoding instruction information includes: play address list information of media data to be played;
and determining, according to the decoding instruction information, the video data corresponding to the second audio time stamp.
8. The video processing method according to claim 6, wherein the decoding video data corresponding to the second audio time stamp comprises:
for a single stream, decoding the video and adding time stamp information;
and for multiple streams, performing frame-dropping processing during video decoding, and adding time stamp information.
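Claim 8 separates the single-stream case (decode every frame and stamp it) from the multi-stream case (drop frames so several decoders stay within budget, then stamp what is kept). A sketch of one possible policy follows; the drop ratio and the 25 fps figure are assumptions, not claim limitations.

```typescript
// Hypothetical frame-dropping policy for claim 8.
function shouldDecode(streamCount: number, frameIndex: number): boolean {
  if (streamCount <= 1) return true;          // single stream: decode every frame
  const keepEvery = Math.min(streamCount, 4); // multi-stream: assumed budget-driven ratio
  return frameIndex % keepEvery === 0;        // drop all but every keepEvery-th frame
}

// Add time stamp information to a kept frame ("adding time stamp information").
function stampFrame<T>(frame: T, frameIndex: number, fps = 25): { frame: T; timestamp: number } {
  return { frame, timestamp: frameIndex / fps };
}
```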
9. A video processing apparatus for use in a player, the apparatus comprising: a processing unit and a communication unit;
The processing unit is used for extracting audio data from the media data to be played; the media data to be played comprises audio data and video data;
the processing unit is further used for playing the audio data and determining a first audio time stamp; the first audio time stamp is an audio time stamp of the audio data which is currently played;
the communication unit is used for sending first indication information to the local video decoding service; the first indication information is used for indicating the first audio time stamp; the local video decoding service is used for decoding videos in a plurality of video formats; the videos in the plurality of video formats include the media data to be played;
the processing unit is further configured to receive video data from the local video decoding service; the video data is the video data in the media data to be played whose time stamp is the same as the first audio time stamp;
and the processing unit is further configured to perform audio-video synchronization on the video data and the currently played audio data.
10. A video processing apparatus for use in a local video decoding service, the apparatus comprising: a processing unit and a communication unit;
The processing unit is used for decoding video data in the media data to be played;
the communication unit is used for receiving first indication information from the player and determining a first audio time stamp; the first audio time stamp is an audio time stamp of audio data currently played by the player;
the processing unit is further used for determining, according to the first audio time stamp, the video data whose time stamp is the same as the first audio time stamp;
and the communication unit is further configured to send, to the player, the video data whose time stamp is the same as the first audio time stamp.
11. A video processing apparatus, comprising: a processor and a communication interface; wherein the communication interface is coupled to the processor, and the processor is configured to run a computer program or instructions to implement the video processing method according to any one of claims 1-8.
12. A computer-readable storage medium having instructions stored therein, wherein, when the instructions are executed by a computer, the computer performs the video processing method according to any one of claims 1-8.
CN202211186658.7A 2022-09-27 2022-09-27 Video processing method, device and storage medium Pending CN117793459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211186658.7A CN117793459A (en) 2022-09-27 2022-09-27 Video processing method, device and storage medium

Publications (1)

Publication Number Publication Date
CN117793459A true CN117793459A (en) 2024-03-29

Family

ID=90394936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211186658.7A Pending CN117793459A (en) 2022-09-27 2022-09-27 Video processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN117793459A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination