CN112866716A

CN112866716A - Method and system for synchronously decapsulating video file

Info

Publication number: CN112866716A
Application number: CN202110055010.5A
Authority: CN
Inventors: 严龙; 罗鑫; 王达
Original assignee: Beijing Ruixin High Throughput Technology Co., Ltd.
Current assignee: Beijing Ruixin High Throughput Technology Co., Ltd.
Priority date: 2021-01-15
Filing date: 2021-01-15
Publication date: 2021-05-28

Abstract

The invention relates to a method and a system for synchronously decapsulating video files. The method comprises the following steps: step S1: extracting file description information and caching it; step S2: extracting a Cluster data group; step S3: splicing the file description information and the Cluster data group; step S4: performing de-encapsulation and de-multiplexing on the spliced decodable file, and extracting video frames for decoding. With the invention, the context information required by FFmpeg can be cleared after each processing pass, which reduces memory usage.

Description

Method and system for synchronously decapsulating video file

Technical Field

The present invention relates to the field of video technologies, and in particular to a method and a system for synchronously decapsulating a video file in an online playback scenario (or streaming scenario).

Background

Video decoding technology is widely used in video playback, video transcoding and video content auditing. Before the compressed video data can be decoded, the video media file must be decapsulated, that is, the video data stream and the audio data stream must be extracted from the media file so that they can be handled by different decoders. Correctly extracting the data streams corresponding to the audio and video is a necessary condition for the decoder to decode correctly. Media files come in many container formats, such as MP4, AVI, FLV, MPEG and WEBM, and each container format has its own decapsulation method.

FFmpeg is a set of open source computer programs that can be used to record and convert digital audio and video and to turn them into streams. It is released under the LGPL or GPL license. It provides a complete solution for recording, converting and streaming audio and video. It contains the very advanced audio/video codec library libavcodec, much of whose code was written from scratch in order to ensure high portability and codec quality.

The flow of demultiplexing and decoding a WEBM video file using FFmpeg is shown in Fig. 1, in which the av_seek_frame() function locates a desired time point in the video for processing.

However, this technique requires the time information of the WEBM video file to be known in advance, and it must also be ensured that code stream data exists at the time point to be located. In the online playback (streaming) scenario, data is processed in segments, and the complete information of the rest of the video file cannot be obtained before processing. Moreover, when the starting time point of a segment cannot be obtained, the av_seek_frame() operation fails and the subsequent data cannot be demultiplexed normally. In addition, the method must keep the file description context information alive at all times, so a large amount of memory is occupied when the number of video streams being processed increases.

Disclosure of Invention

Problems to be solved by the invention

The main purpose of the invention is to provide a method and a system for synchronously decapsulating a video file, which solve the problems of relocating WEBM video file data segments and of high memory usage when multiple video streams are processed simultaneously in an online playback (streaming) scenario.

Means for solving the problems

In order to achieve the above object, an embodiment of the present invention is a method for synchronously decapsulating a video file, including the following steps:

step S1: extracting file description information for caching;

step S2: extracting a Cluster data group;

step S3: splicing the file description information and the Cluster data group;

step S4: performing de-encapsulation and de-multiplexing on the spliced decodable file, and extracting video frames for decoding.

Preferably, the step S1 includes the following sub-steps:

step S11: using the KMP algorithm, comparing the media data segment byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75";

step S12: finding the offset address A at which the byte sequence first appears, wherein all data from the file head up to the offset address A is the description information of the file;

step S13: extracting and caching the description information.
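As an illustration of sub-steps S11 to S13, the following is a minimal Python sketch, not the patented implementation. The function names (kmp_failure, kmp_find_first, extract_description) are hypothetical and chosen only for this example; the byte sequence is the one given in step S11.

```python
from typing import List, Tuple

CLUSTER_ID = bytes([0x1F, 0x43, 0xB6, 0x75])  # byte sequence named in step S11


def kmp_failure(pattern: bytes) -> List[int]:
    """Build the KMP failure table: fail[i] is the length of the longest
    proper prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_first(data: bytes, pattern: bytes = CLUSTER_ID, start: int = 0) -> int:
    """Return the offset of the first match at or after `start`, or -1."""
    fail = kmp_failure(pattern)
    k = 0  # number of pattern bytes matched so far
    for i in range(start, len(data)):
        while k > 0 and data[i] != pattern[k]:
            k = fail[k - 1]
        if data[i] == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - len(pattern) + 1
    return -1


def extract_description(first_segment: bytes) -> Tuple[bytes, int]:
    """Steps S11 to S13: locate offset A and return (description bytes, A)."""
    offset_a = kmp_find_first(first_segment)
    if offset_a < 0:
        raise ValueError("no Cluster byte sequence found in this segment")
    return first_segment[:offset_a], offset_a
```

The returned description bytes are what step S13 caches; the offset A is passed on to step S2.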

Preferably, the step S2 includes the following sub-steps:

step S21: using the KMP algorithm, starting from the offset address A located in step S1, comparing byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75";

step S22: finding the offset address B at which the byte sequence last appears;

step S23: extracting the binary data from the offset address A to the address B as the Cluster data group;

step S24: caching the data from the address B to the end of the media data segment.
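Sub-steps S21 to S24 can be sketched in Python as follows. This is an assumption-laden illustration: the function name split_cluster_group is hypothetical, and Python's built-in bytes.rfind is used as a stand-in for the KMP scan described in the text, since only the position of the last match matters here.

```python
CLUSTER_ID = bytes([0x1F, 0x43, 0xB6, 0x75])  # byte sequence named in step S21


def split_cluster_group(segment: bytes, offset_a: int):
    """Steps S21 to S24: return (cluster_group, tail).

    cluster_group is segment[A:B], the complete Clusters (step S23).
    tail is segment[B:], the last Cluster marker and what follows (step S24).
    """
    # bytes.rfind stands in for the KMP scan; it searches segment[offset_a:]
    # and returns an index relative to the whole segment.
    offset_b = segment.rfind(CLUSTER_ID, offset_a)
    if offset_b < 0:
        raise ValueError("no Cluster byte sequence at or after offset A")
    return segment[offset_a:offset_b], segment[offset_b:]
```

If the segment contains only one occurrence of the byte sequence, A and B coincide, so the Cluster data group is empty and everything from A onward is cached as the tail.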

Preferably, in step S3, a new block of memory is allocated with malloc, and its length is the sum of the byte lengths of the file description information cached in step S1 and the Cluster data group extracted in step S2.

Preferably, the step S3 includes the following sub-steps:

step S31: copying the file description information cached in step S1 to the start address of the newly allocated memory;

step S32: copying the Cluster data group extracted in step S2 immediately after it, to form a complete decodable file.
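The splicing of step S3 can be sketched in Python as follows. The text describes a malloc call followed by two copies; in this sketch a preallocated bytearray stands in for the malloc'd buffer, and the function name splice is hypothetical.

```python
def splice(description: bytes, cluster_group: bytes) -> bytearray:
    """Step S3: build one contiguous buffer holding description + Clusters."""
    total = len(description) + len(cluster_group)
    buf = bytearray(total)                  # stands in for malloc(total)
    buf[:len(description)] = description    # step S31: copy to the start
    buf[len(description):] = cluster_group  # step S32: copy right after it
    return buf
```

Because both slice assignments replace exactly as many bytes as they cover, the buffer keeps the length computed up front, mirroring the fixed-size allocation in the text.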

Another embodiment of the present invention is a system for synchronously decapsulating a video file, including:

an extraction cache module, which extracts the file description information and caches it;

a Cluster data group extraction module, which extracts a Cluster data group;

a splicing module, which splices the file description information and the Cluster data group; and

a decoding processing module, which performs de-encapsulation and de-multiplexing on the spliced decodable file and extracts video frames for decoding.

Preferably, in the extraction cache module, the KMP algorithm is used to compare the media data segment byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75" and to find the offset address A at which the byte sequence first appears, wherein all data from the file head up to the offset address A is the description information of the file.

Preferably, in the Cluster data group extraction module, a byte-by-byte comparison is performed against the byte sequence "0x1F, 0x43, 0xB6, 0x75", the offset address B at which the byte sequence last appears is found, the binary data from the offset address A to the address B is extracted as the Cluster data group, and the data from the address B to the end of the media data segment is cached.

Preferably, in the splicing module, a new block of memory is allocated with malloc, and its length is the sum of the byte lengths of the file description information cached in the extraction cache module and the Cluster data group extracted in the Cluster data group extraction module.

Preferably, in the splicing module, the file description information cached in the extraction cache module is copied to the start address of the newly allocated memory, and the Cluster data group extracted in the Cluster data group extraction module is copied immediately after it, to form a complete decodable file.

Advantageous effects of the invention

The context information required by FFmpeg can be cleared after each processing pass, which reduces memory usage.

With this scheme, each data segment can be treated as an independent WEBM video, so a newly received video segment does not need to be relocated with av_seek_frame().

Drawings

Fig. 1 is a flow diagram of demultiplexing and decoding a WEBM video file using FFmpeg.

Fig. 2 is a schematic diagram of the file structure of the WEBM video file of the present invention.

Fig. 3 is a flow chart of the method for synchronously decapsulating video files according to the present invention.

Fig. 4 is a schematic diagram of the system for synchronously decapsulating video files according to the present invention.

Description of reference numerals: 1: synchronous decapsulation system; 11: extraction cache module; 12: Cluster data group extraction module; 13: splicing module; 14: decoding processing module.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention. It should be further emphasized here that the following embodiments provide preferred embodiments, and that the various aspects (embodiments) may be used in combination or cooperation with each other.

Fig. 2 is a schematic diagram of the file structure of the WEBM video file of the present invention. The Cluster part stores the code stream data of the audio and video, and the part before the Cluster part is the description information of the file.

Each Cluster unit header has a data field that identifies the beginning of the Cluster and the length of the data that follows. When FFmpeg processes a WEBM file, it can initialize the file description context and the decoder from the description information of the file, and it then parses the Clusters in sequence according to the length information of each Cluster to extract the code stream data. Therefore, splicing the description information of the WEBM file with several complete Cluster data segments forms an independent WEBM video file that FFmpeg can process automatically as input.
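To make the Cluster header concrete: in the EBML encoding used by WEBM, an element starts with its ID (for a Cluster, the four bytes 0x1F, 0x43, 0xB6, 0x75), followed by a variable-length size field in which the number of leading zero bits of the first byte gives the field length, and an all-ones value means the size is unknown. The Python sketch below decodes such a header under those assumptions; the function names read_vint_size and cluster_extent are hypothetical.

```python
CLUSTER_ID = bytes([0x1F, 0x43, 0xB6, 0x75])


def read_vint_size(data: bytes, pos: int):
    """Decode an EBML variable-length size starting at data[pos].

    Returns (size, bytes_used); size is None when the field is all ones,
    which EBML reserves for "unknown size".
    """
    first = data[pos]
    if first == 0:
        raise ValueError("size fields longer than 8 bytes are not handled")
    length, mask = 1, 0x80
    while not (first & mask):  # count leading zero bits to get the length
        mask >>= 1
        length += 1
    if pos + length > len(data):
        raise ValueError("truncated size field")
    value = first & (mask - 1)  # drop the length marker bit
    for b in data[pos + 1:pos + length]:
        value = (value << 8) | b
    if value == (1 << (7 * length)) - 1:
        return None, length
    return value, length


def cluster_extent(data: bytes, pos: int):
    """Return (payload_start, payload_size) for a Cluster whose ID is at pos."""
    if data[pos:pos + 4] != CLUSTER_ID:
        raise ValueError("no Cluster ID at this offset")
    size, used = read_vint_size(data, pos + 4)
    return pos + 4 + used, size
```

For example, a header of CLUSTER_ID followed by the byte 0x83 describes a Cluster whose 3-byte payload starts 5 bytes after the ID begins.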

According to this principle, the overall idea of the invention is as follows: when a WEBM video file is played online, the data segment holding the file description is extracted from the first segment of the file and cached; each subsequently received video segment is then spliced in units of Clusters. The last Cluster identifier in a segment and the data following it are always cached, and they are joined with the next received data segment to form complete Cluster units for processing. Meanwhile, the context information required by FFmpeg can be cleared after each processing pass, which reduces memory usage.
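The overall idea can be sketched as a small stateful helper in Python. This is only an illustration under stated assumptions: the class name SegmentSplicer and its feed method are hypothetical, and bytes.find / bytes.rfind stand in for the KMP scan. The helper caches the description once, carries the last Cluster marker and its trailing data over to the next segment, and returns a standalone decodable buffer whenever at least one complete Cluster is available.

```python
CLUSTER_ID = bytes([0x1F, 0x43, 0xB6, 0x75])


class SegmentSplicer:
    """Turns a stream of arbitrary-sized segments into standalone buffers."""

    def __init__(self):
        self.description = None  # cached file description information
        self.tail = b""          # last Cluster marker and the data after it

    def feed(self, segment: bytes):
        """Return description + complete Clusters, or None if none is ready."""
        data = self.tail + segment  # join the cached tail with the new data
        if self.description is None:
            a = data.find(CLUSTER_ID)  # offset A in the first segment(s)
            if a < 0:
                self.tail = data  # description not complete yet
                return None
            self.description = data[:a]
            data = data[a:]
        b = data.rfind(CLUSTER_ID)  # offset B: last marker seen so far
        if b <= 0:
            self.tail = data  # only one (incomplete) Cluster so far
            return None
        group, self.tail = data[:b], data[b:]
        return self.description + group
```

Because the cached tail is prepended before searching, a marker that happens to be split across two segments is still found once the second segment arrives.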

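Fig. 3 is a flow chart of the method for synchronously decapsulating video files according to the present invention. The method includes the following steps:

Step S1: extracting file description information and caching it, which comprises the following sub-steps:

Step S11: using the KMP algorithm, comparing the media data segment byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75";

Step S12: finding the offset address A at which the byte sequence first appears, wherein all data from the file head up to the offset address A is the description information of the file;

Step S13: extracting and caching the description information.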

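Step S2: extracting a Cluster data group, which comprises the following sub-steps:

Step S21: using the KMP algorithm, starting from the offset address A located in step S1, comparing byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75";

Step S22: finding the offset address B at which the byte sequence last appears;

Step S23: extracting the binary data from the offset address A to the address B as the Cluster data group;

Step S24: caching the data from the address B to the end of the media data segment.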

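Step S3: splicing the file description information and the Cluster data group.

A new block of memory is allocated with malloc, and its length is the sum of the byte lengths of the file description information cached in step S1 and the Cluster data group extracted in step S2. This step specifically comprises the following sub-steps:

Step S31: copying the file description information cached in step S1 to the start address of the newly allocated memory;

Step S32: copying the Cluster data group extracted in step S2 immediately after it, to form a complete decodable file.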

Step S4: performing decoding processing, specifically performing decapsulation and demultiplexing on the decodable file spliced in step S3, and extracting and decoding the video frames.

With this method, each data segment can be treated as an independent WEBM video, so a newly received video segment does not need to be relocated with av_seek_frame().
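As a rough illustration of step S4, the sketch below hands one spliced buffer to the ffmpeg command-line tool through standard input and lets it demultiplex and decode the first video stream, discarding the output. This is an assumption made for the example: the text refers to FFmpeg's library interfaces, whereas the sketch swaps in the CLI, and the function names build_decode_command and decode_spliced are hypothetical.

```python
import shutil
import subprocess


def build_decode_command(ffmpeg: str = "ffmpeg") -> list:
    """Command that reads a container from stdin and decodes its video."""
    return [
        ffmpeg, "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",   # read the spliced buffer from standard input
        "-map", "0:v:0",  # select the first video stream only
        "-f", "null", "-",  # decode the frames and discard them
    ]


def decode_spliced(buf: bytes, ffmpeg: str = "ffmpeg") -> bool:
    """Step S4 (sketch): return True if ffmpeg decoded the buffer cleanly."""
    if shutil.which(ffmpeg) is None:
        raise FileNotFoundError("ffmpeg executable not found on PATH")
    proc = subprocess.run(
        build_decode_command(ffmpeg),
        input=buf,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
    )
    return proc.returncode == 0
```

Each call starts a fresh process, so no demuxer or decoder context survives between segments, which loosely mirrors the point that the context can be cleared after each processing pass.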

Fig. 4 is a schematic diagram of the system for synchronously decapsulating video files according to the present invention. The synchronous decapsulation system 1 includes an extraction cache module 11, a Cluster data group extraction module 12, a splicing module 13 and a decoding processing module 14, wherein:

The extraction cache module 11 is configured to extract the file description information and cache it. Specifically, it uses the KMP algorithm to compare the media data segment byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75" and finds the offset address A at which the byte sequence first appears, wherein all data from the file head up to the offset address A is the file description information.

The Cluster data group extraction module 12 is configured to extract a Cluster data group. Specifically, it uses the KMP algorithm, starting from the located offset address A, to compare byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75", finds the offset address B at which the byte sequence last appears, extracts the binary data between the offset address A and the address B as the Cluster data group, and caches the data from the address B to the end of the media data segment.

The splicing module 13 is configured to splice the file description information and the Cluster data group. A new block of memory is allocated with malloc, and its length is the sum of the byte lengths of the file description information cached in the extraction cache module 11 and the Cluster data group extracted in the Cluster data group extraction module 12. Specifically, the file description information cached in the extraction cache module 11 is copied to the start address of the newly allocated memory, and the Cluster data group extracted in the Cluster data group extraction module 12 is copied immediately after it, to form a complete decodable file.

The decoding processing module 14 is configured to perform de-encapsulation and de-multiplexing on the spliced decodable file and to extract video frames for decoding.

Using the same videos, and following the segmented processing that characterizes the online playback scenario, the video frame data obtained by demultiplexing WEBM video files with the prior art and with the technical scheme of the invention were each compared against the frames of the original video. Table 1 below gives the test data, which shows that in the streaming scenario the number of frames demultiplexed from a WEBM file by the scheme of the invention is substantially consistent with the number of frames of the original video.

Table 1

Benchmark test file	Standard frame count	Prior art output (frames)	Output of the present scheme (frames)
VN_00013_vp8_640x360.webm	2511	507	2488
VN_00014_vp8_640x360.webm	19998	9927	19422
VN_00021_vp8_640x360.webm	6953	3146	6795
VN_00041_vp8_640x360.webm	9489	6373	9211
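To read Table 1 as proportions, the short Python sketch below computes, for each row, the share of the standard frame count that each approach produced. The figures are copied from the table; the helper name recovery is hypothetical.

```python
# (file, standard frame count, prior art output, output of the present scheme)
ROWS = [
    ("VN_00013_vp8_640x360.webm", 2511, 507, 2488),
    ("VN_00014_vp8_640x360.webm", 19998, 9927, 19422),
    ("VN_00021_vp8_640x360.webm", 6953, 3146, 6795),
    ("VN_00041_vp8_640x360.webm", 9489, 6373, 9211),
]


def recovery(output_frames: int, standard_frames: int) -> float:
    """Percentage of the standard frame count that was output."""
    return 100.0 * output_frames / standard_frames


for name, standard, prior, present in ROWS:
    print(f"{name}: prior art {recovery(prior, standard):.1f}%, "
          f"present scheme {recovery(present, standard):.1f}%")
```

On these four files the present scheme outputs roughly 97% to 99% of the standard frame count, against roughly 20% to 67% for the prior art.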

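Compared with the prior art, the invention has the following beneficial effects: the context information required by FFmpeg can be cleared after each processing pass, which reduces memory usage; and each data segment can be treated as an independent WEBM video, so a newly received video segment does not need to be relocated with av_seek_frame().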

Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A method for synchronously decapsulating a video file is characterized by comprising the following steps:

step S1: extracting file description information for caching;

step S2: extracting a Cluster data group;

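step S3: splicing the file description information and the Cluster data group;

step S4: performing de-encapsulation and de-multiplexing on the spliced decodable file, and extracting video frames for decoding.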

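2. The method for synchronously decapsulating video files according to claim 1, wherein said step S1 comprises the following sub-steps:

step S11: using the KMP algorithm, comparing the media data segment byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75";

step S12: finding the offset address A at which the byte sequence first appears, wherein all data from the file head up to the offset address A is the description information of the file;

step S13: extracting and caching the description information.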

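3. The method for synchronously decapsulating video files according to claim 2, wherein said step S2 comprises the following sub-steps:

step S21: using the KMP algorithm, starting from the offset address A located in step S1, comparing byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75";

step S22: finding the offset address B at which the byte sequence last appears;

step S23: extracting the binary data from the offset address A to the address B as the Cluster data group;

step S24: caching the data from the address B to the end of the media data segment.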

4. The method according to claim 3, wherein in step S3 a new block of memory is allocated with malloc, and its length is the sum of the byte lengths of the file description information cached in step S1 and the Cluster data group extracted in step S2.

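5. The method for synchronously decapsulating video files according to claim 4, wherein said step S3 comprises the following sub-steps:

step S31: copying the file description information cached in step S1 to the start address of the newly allocated memory;

step S32: copying the Cluster data group extracted in step S2 immediately after it, to form a complete decodable file.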

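6. A system for synchronously decapsulating video files, comprising:

an extraction cache module, which extracts the file description information and caches it;

a Cluster data group extraction module, which extracts a Cluster data group;

a splicing module, which splices the file description information and the Cluster data group; and

a decoding processing module, which performs de-encapsulation and de-multiplexing on the spliced decodable file and extracts video frames for decoding.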

7. The system of claim 6, wherein in the extraction cache module, the KMP algorithm is used to compare the media data segment byte by byte against the byte sequence "0x1F, 0x43, 0xB6, 0x75" to find the offset address A at which the byte sequence first appears, and all data from the file head up to the offset address A is the description information of the file.

8. The system for synchronously decapsulating video files according to claim 6, wherein in the Cluster data group extraction module, a byte-by-byte comparison is performed against the byte sequence "0x1F, 0x43, 0xB6, 0x75", the offset address B at which the byte sequence last appears is found, the binary data from the offset address A to the address B is extracted as the Cluster data group, and the data from the address B to the end of the media data segment is cached.

9. The system of claim 6, wherein in the splicing module, a new block of memory is allocated with malloc, and its length is the sum of the byte lengths of the file description information cached in the extraction cache module and the Cluster data group extracted in the Cluster data group extraction module.

10. The system according to claim 6, wherein in the splicing module, the file description information cached in the extraction cache module is copied to the start address of the newly allocated memory, and the Cluster data group extracted in the Cluster data group extraction module is copied immediately after it, to form a complete decodable file.