CN112637634B - High-concurrency video processing method and system for multi-process shared data - Google Patents
- Publication number
- CN112637634B CN112637634B CN202011554999.6A CN202011554999A CN112637634B CN 112637634 B CN112637634 B CN 112637634B CN 202011554999 A CN202011554999 A CN 202011554999A CN 112637634 B CN112637634 B CN 112637634B
- Authority
- CN
- China
- Prior art keywords
- video
- data
- gop
- decoding
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234309—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23406—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving management of server-side video buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234345—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44004—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving video buffer management, e.g. video decoder buffer or video display buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440209—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display for formatting on an optical medium, e.g. DVD
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440245—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/443—OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Abstract
The invention discloses a high-concurrency video processing method and system for multi-process shared data. The processing method comprises the following steps: S1: inputting video segments of different sizes into a session manager that manages different video programs; S2: using a fragment-based load-balancing method for multi-process decoding task scheduling, distributing the input video segment, in units of minimum-weight video fragments, to queues corresponding to different decoding processes, and then sending the corresponding video fragment data in the queues to the corresponding decoding processes via sockets; S3: each decoding process decodes the received video fragments and places the decoded data into its corresponding shared memory queue; S4: the main process acquires YUV data from the different shared memory queues, processes each acquired frame of YUV data accordingly, and returns the YUV buffer to its corresponding queue after processing; S5: returning the processing result to the session manager.
Description
Technical Field
The invention relates to the field of data processing, in particular to a method and system for high-concurrency video processing with multi-process shared data, and more particularly to a method and system that balance high-concurrency video decoding load and realize shared-memory communication of data among multiple processes.
Background
In recent years, global internet traffic has kept growing rapidly. According to statistics from China Telecom, video traffic will dominate internet traffic in the future, and was expected to account for more than 82% of total internet traffic by 2021. Online video material is usually compressed with a codec. On the device side, a user can simply download and install a codec package to decompress and play videos; a cloud video processing center, by contrast, must decode and analyze a large number of different videos every day, yet relatively few video decoding schemes and technologies exist for the data center.
The video decoding scheme of an existing video processing data center usually adopts a single-process, multi-channel method: video session management, decoding of different videos, and processing and analysis of decoded data are all implemented in a single process. The memory accessed by each module during video processing can be shared within the process, which is convenient to use and easy to implement. In addition, many data centers use hardware acceleration devices such as GPUs to decode and analyze multiple videos. However, in an existing single-process video processing scheme, when the process errors out or crashes, or the hardware device fails, the entire video processing service stops; and if a single-process video pipeline is simply divided into multiple processes by module or by acceleration device, the data-copy overhead of inter-process communication increases. A method is therefore needed that can balance the processing load across multiple decoding devices or decoding processes and divide the work into processes by module, so that different processes cooperate and YUV data is transferred between them. (YUV is a color coding format; decoded video data is usually in YUV format.)
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a method and system for high-concurrency video processing with multi-process shared data. By distributing and scheduling multiple video channels across different processes, it achieves load balancing of multi-process decoding tasks, reduces the impact on the main process when a decoding process exits abnormally or its hardware acceleration device fails, and enhances the fault tolerance of the system. At the same time, by sharing data between the multiple decoding processes and the process that handles the decoded YUV data, the coupling between different processes of the system is reduced and zero-copy transfer of YUV data between processes is realized, lowering the data-communication overhead of multi-process cooperation.
In order to achieve the above object, the present invention provides a method for processing highly concurrent video with shared data by multiple processes, which comprises the following steps:
s1: inputting video segments with different sizes into a session manager for managing different video programs;
s2: using a fragment-based load-balancing method for multi-process decoding task scheduling, distributing the input video segment, in units of the minimum-weight video fragment, to queues corresponding to different decoding processes, and then sending the corresponding video fragment data in the queues to the corresponding decoding processes via sockets;
s3: the decoding process decodes the received video clip and places the decoded data into the corresponding shared memory queue;
s4: the main process acquires YUV data from different shared memory queues, correspondingly processes each acquired frame of YUV data, and returns the YUV data to a corresponding buffer area after the YUV data is processed;
s5: and returning the processing result to the session manager.
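The overall flow of steps S1-S5 can be sketched as follows. This is a minimal single-process simulation in Python, using plain queues to stand in for sockets and shared memory; the names (`segments`, `decode_queues`, `shared_queues`) and the two-decoder setup are illustrative assumptions, not part of the patent.

```python
from collections import deque

# S1: video segments of different sizes arrive at the session manager
segments = [{"id": i, "frames": n} for i, n in enumerate([30, 120, 60])]

NUM_DECODERS = 2
decode_queues = [deque() for _ in range(NUM_DECODERS)]  # per-decoding-process input queues
shared_queues = [deque() for _ in range(NUM_DECODERS)]  # stand-ins for shared memory queues

# S2: distribute each segment to the currently least-loaded decode queue
for seg in segments:
    target = min(decode_queues, key=lambda q: sum(s["frames"] for s in q))
    target.append(seg)

# S3: each "decoding process" decodes its segments into its shared queue
for dq, sq in zip(decode_queues, shared_queues):
    while dq:
        seg = dq.popleft()
        sq.append({"id": seg["id"], "yuv_frames": seg["frames"]})  # fake decode

# S4/S5: the main process drains the shared queues and returns results
results = sorted(y["id"] for sq in shared_queues for y in sq)
```

In the real system, S2 uses the weight-based scheduling detailed below rather than a raw frame count, and S3/S4 run in separate processes over true shared memory.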
In an embodiment of the present invention, the specific steps of performing data distribution in S2 are as follows:
s21, acquiring any video segment and obtaining the current video segment information, wherein the video segment information comprises: the occupied memory size, the video session information, the video width and the video height;
s22, calculating the frame count (frame_num) and the number of key frames (I frames) of the current video segment through multimedia processing software, and calculating the weight W of the current video segment from the frame count, the video width and the video height:
W=frame_num×width×height
wherein frame_num is the frame count of the current video segment, width is the video width, and height is the video height;
s23, splitting the video segment into a plurality of video segments with the minimum weight unit according to the weight W of the current video segment;
s24, distributing the split video clips, and the specific process is as follows: comparing the total weight of the queues corresponding to all the processes to obtain a minimum total weight queue, adding the video segment of the 1 st minimum weight unit split in the S23 into the minimum total weight queue, and completing the distribution of the video segment of the 1 st minimum weight unit to the decoding process corresponding to the minimum total weight queue;
and S25, repeating the step S24 until all the video clips with the minimum weight unit are distributed to the decoding process corresponding to the queue with the minimum total weight.
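Steps S22, S24 and S25 can be sketched in a few lines of Python. The helper name `segment_weight`, the 1920×1080 resolution, the frame counts, and the three-queue setup are hypothetical values chosen for illustration only.

```python
def segment_weight(frame_num, width, height):
    # S22: W = frame_num x width x height
    return frame_num * width * height

# hypothetical minimum-weight-unit clips, already produced by the S23 split
clips = [segment_weight(f, 1920, 1080) for f in (25, 50, 25, 100, 25)]

queues = [[] for _ in range(3)]  # one queue per decoding process

# S24/S25: each clip goes to the queue with the smallest total weight
for w in clips:
    target = min(queues, key=sum)
    target.append(w)

totals = [sum(q) for q in queues]
```

Because every clip is assigned to the currently lightest queue, the final totals stay close to each other, which is the load-balancing property the method aims for.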
In an embodiment of the present invention, the specific process of S23 is as follows:
s2301: checking the number of I frames (key frames) of the current video segment: if the number of I frames is 1, the current video segment is already a minimum-weight-unit video fragment; if the number of I frames is greater than 1, comparing the weight W of the current video segment calculated in S22 with the preset minimum weight unit;
s2302: if the weight W of the current video segment is larger than the minimum weight unit, calculating the weight of a GOP (Group of Pictures) from its size, wherein a GOP comprises all video frames from the current I frame (inclusive) up to the next I frame, and the size of a GOP is the number of video frames it contains. The weight W_GOP of the current GOP is calculated from its size as:
W_GOP = GOP size × width × height
wherein width is the video width and height is the video height;
s2303: before calculating the weight W_GOP1 of the 1st GOP of the current video segment, the running total weight sumW_0 is 0;
s2304: before calculating the weight W_GOP2 of the 2nd GOP, the running total is sumW_1 = W_GOP1 + sumW_0; likewise, before calculating the weight W_GOPx of the x-th GOP, sumW_(x-1) = W_GOP(x-1) + sumW_(x-2);
S2305: when W_GOPx + sumW_(x-1) is greater than the minimum weight unit, splitting off the video frames from the 1st GOP through the x-th GOP as a single video fragment, resetting sumW to 0, and repeating S2303 and S2304 with the next GOP treated as the 1st GOP; otherwise, proceeding to the next step;
s2306: if the n-th GOP has been processed and n < x, i.e. no further GOP remains for calculation, the video frames from the 1st GOP through the n-th GOP are split off directly as a single video fragment.
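The accumulation loop of S2301-S2306 can be sketched as follows. This is a simplified Python rendering under the assumption that the segment is given as a list of GOP sizes in frames; the function name `split_by_gop` and the example values are hypothetical.

```python
def split_by_gop(gop_sizes, width, height, min_weight_unit):
    """Split a segment (list of GOP sizes in frames) into clips: GOP weights
    are accumulated until the total exceeds min_weight_unit (S2303-S2305);
    leftover GOPs form the final clip (S2306)."""
    clips, current, sum_w = [], [], 0
    for size in gop_sizes:
        w_gop = size * width * height   # S2302: W_GOP = GOP size x width x height
        current.append(size)
        sum_w += w_gop
        if sum_w > min_weight_unit:     # S2305: cut here and reset the running total
            clips.append(current)
            current, sum_w = [], 0
    if current:                         # S2306: remaining GOPs become the last clip
        clips.append(current)
    return clips

clips = split_by_gop([10, 10, 10, 10, 10], width=100, height=100,
                     min_weight_unit=250_000)
```

With these example numbers each GOP weighs 100,000, so the first cut falls after the third GOP and the remaining two GOPs form the final, smaller clip.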
In an embodiment of the present invention, the specific process of S3 is as follows:
s31: the main process allocates a shared memory according to the name of each decoding process, and divides the shared memory into two queues, namely a busy queue and an empty queue, wherein each queue comprises a plurality of shared memory blocks and is used for storing the decoded YUV data, the empty queue is a queue for managing an idle YUV buffer area, and the busy queue is a queue for managing a buffer area for caching the YUV data after decoding;
s32: when the decoding process decodes new YUV data, the decoding process acquires an idle YUV buffer area from the empty queue;
s33: and the decoding process fills the YUV data information into a YUV buffer area according to the video segment information corresponding to the current YUV data, wherein the YUV data information comprises: program information, image width, image height, image size, image format;
s34: and the decoding process puts the YUV buffer area filled with the data into a busy queue to finish the storage of the YUV data in the shared memory.
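The producer side of S31-S34 can be sketched with two queues per decoding process. Here Python deques and plain dicts stand in for the shared memory block queues and YUV buffers; the names (`empty_q`, `busy_q`, `on_frame_decoded`) and the buffer count are illustrative assumptions.

```python
from collections import deque

NUM_BUFFERS = 4
# S31: the per-decoding-process shared memory is split into an empty queue
# (free buffers) and a busy queue (buffers holding decoded YUV data)
empty_q = deque({"data": None, "meta": None} for _ in range(NUM_BUFFERS))
busy_q = deque()

def on_frame_decoded(yuv_bytes, meta):
    """S32-S34: take a free buffer, fill it, hand it to the main process."""
    if not empty_q:
        return False                 # back-pressure: no free buffer available
    buf = empty_q.popleft()          # S32: grab an idle YUV buffer
    buf["data"] = yuv_bytes          # S33: fill the YUV data ...
    buf["meta"] = meta               # ... and program/width/height/size/format info
    busy_q.append(buf)               # S34: publish the buffer on the busy queue
    return True

ok = on_frame_decoded(b"\x00" * 8, {"width": 4, "height": 2, "format": "I420"})
```

Because the buffer itself never leaves the shared memory region, only queue pointers move between processes, which is what gives the scheme its zero-copy property.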
In an embodiment of the present invention, the specific process of S4 is as follows:
s41: the main process uses a number of image-processing threads to query, for each decoding process, the state of the busy queue corresponding to that decoding process; when the query finds that the busy queue contains data, the corresponding YUV data node is taken directly from the head of the busy queue;
s42: the main process directly performs corresponding analysis processing on the YUV data in the YUV data node taken out in the S41 according to the specific service requirement of the obtained YUV data node;
s43: and after the analysis processing of the YUV data in the S42 is completed by the main process, the YUV buffer area is placed in the empty queue corresponding to the decoding process of the YUV data again, and the YUV buffer area is reused by the decoding process.
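The consumer side of S41-S43 can be sketched as the mirror image of the producer loop. Again, deques stand in for the shared memory queues; `drain_busy_queue` and the pre-filled sample buffer are hypothetical illustrations.

```python
from collections import deque

# a busy queue already holding one decoded frame, plus its empty queue
busy_q = deque([{"data": b"\x10" * 8, "meta": {"width": 4, "height": 2}}])
empty_q = deque()

def drain_busy_queue(analyze):
    """S41-S43: take nodes from the busy-queue head, analyze, recycle buffers."""
    results = []
    while busy_q:
        buf = busy_q.popleft()                 # S41: node from the head of the busy queue
        results.append(analyze(buf["data"], buf["meta"]))  # S42: service-specific analysis
        buf["data"] = None                     # S43: clear and return the buffer
        empty_q.append(buf)                    #      so the decoding process can reuse it
    return results

out = drain_busy_queue(lambda data, meta: len(data))
```

The `analyze` callback is where the "specific service requirement" of S42 would plug in; the buffer recycling in S43 closes the empty/busy cycle.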
In an embodiment of the present invention, the main process implements state query on each decoding process through the monitoring thread, and when any decoding process exits abnormally, the main process directly destroys the shared memory corresponding to the decoding process.
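The monitoring behavior above can be sketched with Python's `multiprocessing`. This simplified sketch joins each child and inspects its exit code instead of running a dedicated monitoring thread; the process names, the `shm_registry` dict standing in for real shared memory, and the deliberately crashing worker are all illustrative assumptions.

```python
import multiprocessing as mp

def well_behaved():
    pass                              # decodes and exits cleanly (exit code 0)

def crashing():
    raise SystemExit(1)               # simulate an abnormal decoder exit

# per-decoding-process "shared memory", keyed by process name (S31 naming)
shm_registry = {"dec_a": bytearray(16), "dec_b": bytearray(16)}
procs = {"dec_a": mp.Process(target=well_behaved),
         "dec_b": mp.Process(target=crashing)}

for p in procs.values():
    p.start()

# monitoring loop (a dedicated thread in the real system): destroy the
# shared memory of any decoding process that exited abnormally
for name, p in procs.items():
    p.join()
    if p.exitcode != 0:
        shm_registry.pop(name, None)
```

In a production system the registry entries would be `multiprocessing.shared_memory.SharedMemory` objects whose `unlink()` performs the actual destruction.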
In order to achieve the above object, the present invention further provides a system for highly concurrent video processing with shared data by multiple processes, configured to perform the foregoing method, including:
the session management module comprises a session manager and a video processing module, wherein the session manager is used for managing video segment data input into the system and decoded YUV data;
the data distribution module is in data connection with the session management module and is used for distributing different video segments to queues corresponding to different decoding processes by executing a multi-process decoding task scheduling load balancing method;
the decoding module comprises a plurality of decoding processes, is in data connection with the data distribution module and is used for decoding the distributed corresponding video clips through different decoding processes;
the shared memory module comprises a plurality of shared memories, each shared memory is distributed by the main process according to the name of each decoding process, is in data connection with the decoding module and is used for managing the decoded YUV data in a queue mode;
and the image processing module comprises an image processor, is in data connection with the shared memory module and the session management module, and is used for correspondingly processing the YUV data acquired from the shared memory module according to service requirements and returning a processing result to the session management module.
In an embodiment of the present invention, each shared memory includes two queues, which are a busy queue and an empty queue, respectively, each queue includes a plurality of shared memory blocks, each shared memory block is used as a buffer for storing decoded YUV data, the empty queue is used to manage a buffer for idle YUV data, and the busy queue is used to manage a buffer for buffering decoded YUV data.
Compared with the prior art, the invention has at least the following advantages:
(1) the fragment-based load-balancing method for multi-process decoding task scheduling provides an effective balanced scheduling method for multi-channel video decoding under high-concurrency conditions, achieving load balancing across all decoding processes;
(2) the multi-queue shared-memory management mode reduces the coupling between different processes of the system, realizes zero copy of decoded data, and reduces the data-communication overhead of multi-process cooperation;
(3) through cooperative decoding by multiple processes, the abnormal exit of one decoding process, or a fault in the hardware acceleration device it uses, does not affect the normal operation of the main process, thereby enhancing the fault tolerance of the system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a block diagram of a video processing system architecture according to the present invention;
FIG. 3 is a flowchart of a multi-process decode task scheduling process of the present invention;
description of reference numerals: 10-a session management module; 20-a data distribution module; 30-a decoding module; 40-shared memory module; 50-image processing module.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Fig. 1 is a flowchart of a method of the present invention, and as shown in fig. 1, the present invention provides a method for processing a high-concurrency video with shared data by multiple processes, which includes the following steps:
s1: inputting video segments of various sizes into a session manager, which manages the different video programs;
s2: using the fragment-based load-balancing method for multi-process decoding task scheduling, distributing the input video segment, in units of the minimum-weight video fragment, to queues corresponding to different decoding processes, and then sending the corresponding video fragment data in the queues to the corresponding decoding processes via sockets, wherein a socket is a programming abstraction that encapsulates transport-layer (TCP) communication, and a socket connection is used here to maintain a long-lived connection between the server and the client;
for decoding and scheduling of multi-channel videos among multiple processes, the embodiment of the invention adopts a method for dividing according to video segments, each video segment is divided into video segments with different sizes according to the preset minimum weight unit of the video segment, different queues are adopted for managing different video segments, and then the different queues are bound to different processes, so that data distribution of different processes is realized. Meanwhile, in order to ensure the load balance of the system, each video segment has a corresponding weight, and the queue of each video segment has a corresponding total weight, so that the total weight of the tasks distributed by each video decoding process can be quickly inquired and compared during the load balance, and the data distribution is performed on the new decoding tasks according to the total weight, thereby realizing the load balance of the task data distribution.
The specific steps of data distribution in S2 are as follows:
s21, acquiring any video segment and obtaining the current video segment information, wherein the video segment information comprises: the occupied memory size, the video session information, the video width, the video height, and the like;
s22, calculating the frame count (frame_num) and the number of key frames (I frames) of the current video segment through multimedia processing software (ffmpeg or other video decoding software), and calculating the weight W of the current video segment from the frame count (frame_num), the video width (width) and the video height (height):
W=frame_num×width×height;
s23, splitting the video segment into a plurality of video segments with the minimum weight unit according to the weight W of the current video segment;
the specific process of splitting the video segment into a plurality of video segments with the minimum weight unit is as follows:
s2301: checking the number of I frames (key frames) of the current video segment: if the number of I frames is 1, the current video segment is already a minimum-weight-unit video fragment; if the number of I frames is greater than 1, comparing the weight W of the current video segment calculated in S22 with the preset minimum weight unit;
s2302: if the weight W of the current video segment is larger than the minimum weight unit, calculating the weight of a GOP (Group of Pictures) from its size, wherein a GOP comprises all video frames from the current I frame (inclusive) up to the next I frame, and the size of a GOP is the number of video frames it contains. The weight of the current GOP is calculated from its size as:
W_GOP = GOP size × width × height
wherein width is the video width and height is the video height;
s2303: calculating the current GOP weight W of the 1 st GOP of the current video segment GOP1 Previously, the total weight sumW of the current video segment 0 Is 0;
s2304: calculating the current GOP weight W of the 2 nd GOP of the current video segment GOP2 Previously, the total weight sumW of the current video segment 1 =W GOP1 +sumW 0 The same applies to the calculation of the current GOP weight W of the xth GOP GOPx Previously, the total weight sumW of the current video segment x-1 =W GOPx-1 +sumW x-2 ;
S2305: when W_GOPx + sumW_(x-1) is greater than the minimum weight unit, the video frames from the 1st GOP to the xth GOP are split off as a single video segment, sumW is reset to 0, and S2303 and S2304 are repeated with the next ((x+1)th) GOP treated as the 1st GOP; otherwise, proceed to the next step;
S2306: if the nth GOP has been calculated and n is less than x, i.e. no subsequent GOP remains for calculation, the video frames from the 1st GOP to the nth GOP are split off directly as a single video segment;
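The splitting loop in S2301–S2306 can be sketched as follows. This is an illustrative reconstruction, not code from the patent; the names (`Gop`, `split_by_weight`, `min_weight`) are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Gop:
    size: int  # number of frames from this I frame up to (not including) the next

def split_by_weight(gops, width, height, min_weight):
    """Split a GOP sequence into segments whose accumulated weight
    (frames x width x height) just exceeds min_weight (S2303-S2306)."""
    segments, current, sum_w = [], [], 0
    for gop in gops:
        w_gop = gop.size * width * height  # current GOP weight (S2302)
        if sum_w + w_gop > min_weight:
            # S2305: GOPs 1..x (including the xth) become one segment; reset sumW
            current.append(gop)
            segments.append(current)
            current, sum_w = [], 0
        else:
            current.append(gop)
            sum_w += w_gop
    if current:  # S2306: trailing GOPs form the last segment directly
        segments.append(current)
    return segments
```

With a minimum weight of 50 and 2×2 frames, GOP sizes [10, 10, 10, 5] split into two segments of two GOPs each.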
S24: distributing the split video segments. The specific process is as follows: the total weights of the queues corresponding to all decoding processes are compared to find the queue with the minimum total weight; the 1st minimum-weight-unit video segment split in step S23 is added to this queue, completing the distribution of that segment to the decoding process corresponding to the minimum-total-weight queue;
S25: repeating S24 until all video segments of the minimum weight unit have been distributed to the decoding processes corresponding to the minimum-total-weight queues.
Through the above steps, the video segments are split and distributed so that the total weight of the decoding tasks loaded by each decoding process is essentially equal, balancing the load across all decoding processes.
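The greedy distribution in S24/S25 — always appending the next segment to the queue with the smallest accumulated weight — can be sketched as below; the function and parameter names are illustrative, not from the patent.

```python
def dispatch(segment_weights, num_processes):
    """Greedy load balancing (S24/S25): each minimum-weight-unit segment
    goes to the decoding-process queue with the smallest total weight."""
    totals = [0] * num_processes           # accumulated weight per queue
    queues = [[] for _ in range(num_processes)]
    for i, w in enumerate(segment_weights):
        q = totals.index(min(totals))      # queue with minimum total weight
        queues[q].append(i)                # assign segment i to that queue
        totals[q] += w
    return totals, queues
```

For example, weights [5, 4, 3, 2, 1] over two processes end up with totals 8 and 7, i.e. the loads are essentially balanced, as the text above claims.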
S3: the decoding process decodes the received video clips and places the decoded data into corresponding shared memory queues;
wherein, the specific process of S3 is as follows:
S31: the main process allocates a shared memory region for each decoding process according to its name and divides it into two queues, a busy queue and an empty queue; each queue contains a plurality of shared memory blocks used to store decoded YUV data, wherein the empty queue manages the free YUV buffers and the busy queue manages the buffers holding decoded YUV data;
s32: when the decoding process decodes new YUV data, the decoding process acquires an idle YUV buffer area from an empty queue (empty queue);
s33: and the decoding process fills the YUV data information into a YUV buffer area according to the video segment information corresponding to the current YUV data, wherein the YUV data information comprises: program information, image width, image height, image size, image format, and the like;
s34: and the decoding process puts the YUV buffer area filled with the data into a busy queue (busy queue) to finish the storage of the YUV data into the shared memory.
In this embodiment, the decoding process implements management of the shared memory in a queue manner.
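A minimal single-process sketch of the two-queue recycling scheme in S31–S34: `queue.Queue` objects stand in for the real shared memory blocks (a production version would use OS shared memory between processes), and all names here are assumptions, not the patent's API.

```python
import queue

class BufferPool:
    def __init__(self, num_buffers, buf_size):
        self.empty = queue.Queue()  # manages free YUV buffers (S31)
        self.busy = queue.Queue()   # manages filled buffers awaiting the main process
        for _ in range(num_buffers):
            self.empty.put(bytearray(buf_size))

    def produce(self, data):
        """Decoder side (S32-S34): take a free buffer, fill it, mark it busy."""
        buf = self.empty.get()
        buf[:len(data)] = data
        self.busy.put(buf)

    def consume(self):
        """Main-process side (S41/S43): take a filled buffer, return it after use."""
        buf = self.busy.get()
        data = bytes(buf)        # processing happens here; the buffer itself is not copied between queues
        self.empty.put(buf)      # recycle the buffer for the decoder to reuse
        return data
```

After one produce/consume round trip, both buffers are back in the empty queue, matching the reuse described in S4.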
S4: the main process acquires YUV data from different shared memory queues, correspondingly processes each frame of acquired YUV data, and returns a corresponding buffer area to an empty queue (empty queue) after the processing is finished, so that the shared memory buffer area is reused;
wherein, the specific process of S4 is as follows:
s41: the main process sets a plurality of image processing threads for each decoding process to continuously inquire the state of a busy queue (busy queue) corresponding to the corresponding decoding process, and when the busy queue (busy queue) is inquired to have data, a corresponding YUV data node is directly taken out from the head of the busy queue (busy queue);
s42: the main process directly performs corresponding analysis processing on the YUV data in the YUV data node taken out in the S41 according to the specific service requirement of the obtained YUV data node;
s43: after the YUV data analysis processing in S42 is completed by the host process, the YUV buffer is placed again in an empty queue (empty queue) corresponding to the YUV data decoding process, and the YUV buffer is reused by the decoding process.
In an embodiment of the present invention, the main process queries the state of the decoding process through the monitoring thread, and when a certain decoding process exits abnormally, the main process directly destroys the shared memory corresponding to the decoding process, so as to enhance the fault tolerance of the system.
In this embodiment, the main process and the decoding processes continuously exchange nodes through the empty and busy queues, so that YUV data is passed between two different processes. For a given decoding process, the YUV data always remains in the same buffer, and no copy of the YUV data is made between the main process and the decoding process, reducing the data-communication overhead of multi-process cooperative work. Meanwhile, by providing two queues for each decoding process, both buffering of the YUV data and identification of its state are achieved.
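The zero-copy property described above can be illustrated in a few lines: the same buffer object circulates between the empty and busy queues, so the YUV payload is never copied between producer and consumer. This is a hypothetical single-process illustration, not the patent's implementation.

```python
import queue

empty_q, busy_q = queue.Queue(), queue.Queue()
buf = bytearray(8)
empty_q.put(buf)

# decoder side: fetch a free buffer, fill it in place, mark it busy
b = empty_q.get()
b[:3] = b"yuv"
busy_q.put(b)

# main-process side: take the filled buffer, then return it for reuse
out = busy_q.get()
same_object = out is buf   # True: only the reference moved, not the data
empty_q.put(out)
```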
S5: the processing result is returned to the Session manager.
Fig. 2 is a schematic diagram of a video processing system according to the present invention. As shown in fig. 2, the present invention further provides a high-concurrency video processing system with multi-process shared data, which includes:
a Session management module (10) including a Session manager (Session manager) for managing video segment data input into the system and decoded YUV data;
the data distribution module (20) (Dispatch) is in data connection with the session management module and is used for distributing different video segments to queues corresponding to different decoding processes by executing a multi-process decoding task scheduling load balancing method;
a decoding module (30) which comprises a plurality of decoding processes, is in data connection with the data distribution module (Dispatch), and is used for decoding the distributed corresponding video clips through different decoding processes;
the shared memory module (40) comprises a plurality of shared memories, each shared memory is distributed by the main process according to the name of each decoding process, is in data connection with the decoding module, and is used for managing the decoded YUV data in a queue mode;
each shared memory comprises two queues, namely a busy queue (busy queue) and an empty queue (empty queue), wherein each queue comprises a plurality of shared memory blocks and is used as a buffer area for storing decoded YUV data, the empty queue (empty queue) is used for managing the buffer area of the idle YUV data, and the busy queue (busy queue) is used for managing the buffer area for caching the YUV data after decoding;
and the Image processing module (50) comprises an Image processor (Image processor), is in data connection with the shared memory module and the session management module, and is used for correspondingly processing the YUV data acquired from different shared memory queues of the shared memory module according to business requirements and returning a processing result to the session management module.
According to the invention, by distributing and scheduling the tasks of multiple video streams across different processes, load balancing of the multi-process decoding tasks is achieved, the impact on the main process of a decoding process exiting abnormally or a hardware acceleration device failing is reduced, and the fault tolerance of the system is enhanced. Meanwhile, by sharing data between the multiple decoding processes and the process that handles the decoded YUV data, the coupling between different processes of the system is reduced and zero-copy transfer of YUV data between processes is achieved, thereby reducing the data-communication overhead of multi-process cooperative work.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (5)
1. A high-concurrency video processing method for multi-process shared data is characterized by comprising the following steps:
s1: inputting video segments with different sizes into a session manager for managing different video programs;
s2: the method comprises the steps of using a fragment-based multi-process decoding task scheduling load balancing method, carrying out data distribution on an input video segment to queues corresponding to different decoding processes according to a video fragment with a minimum weight unit, and then sending corresponding video segment data in the queues to the corresponding decoding processes in a socket mode, wherein the specific steps of carrying out data distribution are as follows:
S21: acquiring any video segment and acquiring current video segment information, wherein the video segment information comprises: the occupied memory size, the video session information, the video width and the video height;
S22: calculating the frame number and the number of key I frames of the current video segment through multimedia processing software, and calculating the weight W of the current video segment from the frame number, the video width and the video height:
W=frame_num×width×height
wherein, frame _ num is the frame number of the current video segment, width is the video width, and height is the video height;
s23: according to the weight W of the current video segment, splitting the video segment into a plurality of video segments with the minimum weight unit, which comprises the following specific processes:
s2301: checking the number of key frames I of the current video segment, wherein if the number of the I frames is 1, the current video segment is the video segment with the minimum weight unit; if the number of I frames is greater than 1, comparing the weight W of the current video segment calculated in S22 with a preset minimum weight unit;
S2302: if the weight W of the current video segment is greater than the minimum weight unit, the weight of a GOP (Group of Pictures) is calculated according to the size of the GOP in which any I frame of the video segment is located, wherein a GOP comprises all video frames from the current I frame (inclusive) up to the next I frame, the size of a GOP is the number of video frames it contains, and the weight W_GOP of the current GOP is calculated from its size as:
W_GOP = GOP_size × width × height
wherein width is the video width and height is the video height;
S2303: before calculating the weight W_GOP1 of the 1st GOP of the current video segment, the total weight sumW_0 of the current video segment is 0;
S2304: before calculating the weight W_GOP2 of the 2nd GOP of the current video segment, the total weight of the current video segment is sumW_1 = W_GOP1 + sumW_0; likewise, before calculating the weight W_GOPx of the xth GOP, the total weight of the current video segment is sumW_(x-1) = W_GOP(x-1) + sumW_(x-2);
S2305: when W_GOPx + sumW_(x-1) is greater than the minimum weight unit, the video frames from the 1st GOP to the xth GOP are split off as a single video segment, sumW is reset to 0, and S2303 and S2304 are repeated with the next GOP treated as the 1st GOP; otherwise, proceed to the next step;
S2306: if the nth GOP has been calculated and n is less than x, i.e. no subsequent GOP remains for calculation, the video frames from the 1st GOP to the nth GOP are split off directly as a single video segment;
s24: distributing the split video clips;
s3: the decoding process decodes the received video segments and places the decoded data into the corresponding shared memory queue, and the specific process is as follows:
s31: the main process allocates a shared memory according to the name of each decoding process, and divides the shared memory into two queues, namely a busy queue and an empty queue, wherein each queue comprises a plurality of shared memory blocks and is used for storing the decoded YUV data, the empty queue is a queue for managing an idle YUV buffer area, and the busy queue is a queue for managing a buffer area for caching the YUV data after decoding;
s32: when the decoding process decodes new YUV data, the decoding process acquires an idle YUV buffer area from the empty queue;
s33: and the decoding process fills the YUV data information into a YUV buffer area according to the video segment information corresponding to the current YUV data, wherein the YUV data information comprises: program information, image width, image height, image size, image format;
s34: the decoding process puts the YUV buffer area filled with the data into a busy queue to finish the storage of the YUV data in the shared memory;
s4: the main process acquires YUV data from different shared memory queues, correspondingly processes each acquired frame of YUV data, and returns the YUV data to a corresponding buffer area after the YUV data is processed, and the specific process comprises the following steps:
S41: the main process sets a plurality of image processing threads for each decoding process to continuously query the state of the busy queue corresponding to that decoding process; when data is found in the busy queue, the corresponding YUV data node is taken directly from the head of the busy queue;
s42: the main process directly performs corresponding analysis processing on the YUV data in the YUV data node taken out in the S41 according to the specific service requirement of the obtained YUV data node;
s43: after the analysis processing of the YUV data in the S42 is completed by the main process, the YUV buffer area is placed in the empty queue corresponding to the decoding process of the YUV data again, and the YUV buffer area is reused by the decoding process;
s5: and returning the processing result to the session manager.
2. The video processing method according to claim 1, wherein the specific process of S24 is: comparing the total weight of the queues corresponding to all the processes to obtain a minimum total weight queue, adding the video segment of the 1 st minimum weight unit split in the S23 into the minimum total weight queue, and completing the distribution of the video segment of the 1 st minimum weight unit to the decoding process corresponding to the minimum total weight queue;
then, execution of S25: and repeating the step S24 until all the video clips with the minimum weight unit are distributed to the decoding process corresponding to the minimum total weight queue.
3. The video processing method according to claim 1, wherein the main process queries the state of each decoding process through the monitoring thread, and when any decoding process exits abnormally, the main process directly destroys the shared memory corresponding to the decoding process.
4. A multi-process data sharing high-concurrency video processing system for performing the method of any one of claims 1 to 3, comprising:
the session management module comprises a session manager and a video processing module, wherein the session manager is used for managing video segment data input into the system and decoded YUV data;
the data distribution module is in data connection with the session management module and is used for distributing different video segments to queues corresponding to different decoding processes by executing a multi-process decoding task scheduling load balancing method;
the decoding module comprises a plurality of decoding processes, is in data connection with the data distribution module and is used for decoding the distributed corresponding video clips through different decoding processes;
the shared memory module comprises a plurality of shared memories, each shared memory is distributed by the main process according to the name of each decoding process, is in data connection with the decoding module and is used for managing the decoded YUV data in a queue mode;
and the image processing module comprises an image processor, is in data connection with the shared memory module and the session management module, and is used for correspondingly processing the YUV data acquired from the shared memory module according to service requirements and returning a processing result to the session management module.
5. The video processing system of claim 4, wherein each shared memory comprises two queues, namely a busy queue and an empty queue, each queue comprises a plurality of shared memory blocks, each shared memory block is used as a buffer for storing the decoded YUV data, wherein the empty queue is used for managing a free YUV data buffer, and the busy queue is used for managing a decoded YUV data buffer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011554999.6A CN112637634B (en) | 2020-12-24 | 2020-12-24 | High-concurrency video processing method and system for multi-process shared data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112637634A CN112637634A (en) | 2021-04-09 |
CN112637634B true CN112637634B (en) | 2022-08-05 |
Family
ID=75324681
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113453010B (en) * | 2021-08-31 | 2021-12-10 | 知见科技(江苏)有限公司 | Processing method based on high-performance concurrent video real-time processing framework |
CN115499665A (en) * | 2022-09-14 | 2022-12-20 | 北京睿芯高通量科技有限公司 | High-concurrency coding and decoding system for multi-channel videos |
CN116055664B (en) * | 2023-03-28 | 2023-06-02 | 北京睿芯通量科技发展有限公司 | Method, device and storage medium for sharing memory for video processing process |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7227589B1 (en) * | 1999-12-22 | 2007-06-05 | Intel Corporation | Method and apparatus for video decoding on a multiprocessor system |
JP4476261B2 (en) * | 2006-09-13 | 2010-06-09 | 株式会社ソニー・コンピュータエンタテインメント | Decoding device and decoding method |
US20110274178A1 (en) * | 2010-05-06 | 2011-11-10 | Canon Kabushiki Kaisha | Method and device for parallel decoding of video data units |
CN101916219A (en) * | 2010-07-05 | 2010-12-15 | 南京大学 | Streaming media display platform of on-chip multi-core network processor |
US8948269B1 (en) * | 2011-03-23 | 2015-02-03 | Marvell International Ltd. | Processor implemented systems and methods for optimized video decoding using adaptive thread priority adjustment |
US20130077690A1 (en) * | 2011-09-23 | 2013-03-28 | Qualcomm Incorporated | Firmware-Based Multi-Threaded Video Decoding |
CN103442204A (en) * | 2013-08-08 | 2013-12-11 | 浙江工业大学 | Network video transmission system and method based on DM365 |
CN103974333B (en) * | 2014-05-16 | 2017-07-28 | 西安电子科技大学 | For SVC video traffics and the load-balancing method of translational speed |
CN104394353B (en) * | 2014-10-14 | 2018-03-09 | 浙江宇视科技有限公司 | Video concentration method and device |
CN105992005A (en) * | 2015-03-04 | 2016-10-05 | 广州市动景计算机科技有限公司 | Video decoding method and device and terminal device |
CN104850456A (en) * | 2015-05-27 | 2015-08-19 | 苏州科达科技股份有限公司 | Multi-process decoding method and multi-process decoding system |
CN106488257A (en) * | 2015-08-27 | 2017-03-08 | 阿里巴巴集团控股有限公司 | A kind of generation method of video file index information and equipment |
CN107566843B (en) * | 2017-10-09 | 2019-07-09 | 武汉斗鱼网络科技有限公司 | A kind of video decoding process guard method and device |
CN108848384A (en) * | 2018-06-19 | 2018-11-20 | 复旦大学 | A kind of efficient parallel code-transferring method towards multi-core platform |
CN109413432B (en) * | 2018-07-03 | 2023-01-13 | 北京中科睿芯智能计算产业研究院有限公司 | Multi-process coding method, system and device based on event and shared memory mechanism |
CN110381322B (en) * | 2019-07-15 | 2023-03-14 | 腾讯科技(深圳)有限公司 | Video stream decoding method and device, terminal equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | ||
Address after: Room 711c, 7 / F, block a, building 1, yard 19, Ronghua Middle Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing 102600 Patentee after: Beijing Zhongke Flux Technology Co.,Ltd. Address before: Room 711c, 7 / F, block a, building 1, yard 19, Ronghua Middle Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing 102600 Patentee before: Beijing Ruixin high throughput technology Co.,Ltd. |