CN112637634B - High-concurrency video processing method and system for multi-process shared data


Info

Publication number: CN112637634B (granted publication of application CN202011554999.6A; earlier publication CN112637634A)
Authority: CN (China)
Prior art keywords: video, data, GOP, decoding, queue
Legal status: Active
Original language: Chinese (zh)
Inventors: 葛长恩, 罗鑫
Current assignee: Beijing Zhongke Flux Technology Co ltd
Original assignee: Beijing Ruixin High Throughput Technology Co ltd (application filed by Beijing Ruixin High Throughput Technology Co ltd)

Classifications

    • H04N21/234309: transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • G06F9/544: buffers; shared memory; pipes
    • G06F9/546: message passing systems or structures, e.g. queues
    • H04N21/23406: management of server-side video buffer
    • H04N21/234345: reformatting performed only on part of the stream, e.g. a region of the image or a time segment
    • H04N21/44004: video buffer management, e.g. video decoder buffer or video display buffer
    • H04N21/440209: reformatting for formatting on an optical medium, e.g. DVD
    • H04N21/440245: reformatting performed only on part of the stream, e.g. a region of the image or a time segment
    • H04N21/443: OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • G06F2209/548: queue (indexing scheme relating to G06F9/54)

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a high-concurrency video processing method and system for multi-process shared data. The method comprises the following steps. S1: video segments of different sizes are input into a session manager that manages the different video programs. S2: using a fragment-based load-balancing method for scheduling multi-process decoding tasks, the input video segments are distributed, as minimum-weight-unit video clips, to the queues of the different decoding processes, and the clip data in each queue are then sent to the corresponding decoding process over a socket. S3: each decoding process decodes the video clips it receives and places the decoded data into its shared memory queue. S4: the main process fetches YUV data from the different shared memory queues, processes each fetched frame of YUV data, and returns each YUV buffer after its data have been processed. S5: the processing result is returned to the session manager.

Description

High-concurrency video processing method and system for multi-process shared data
Technical Field
The invention relates to the field of data processing, in particular to a method and system for high-concurrency video processing with multi-process shared data, and more particularly to a method and system that balance a high-concurrency video load and share data among multiple processes through shared memory communication.
Background
In recent years, global internet traffic has kept growing rapidly. According to China Telecom statistics, video traffic will come to dominate internet traffic and was expected to exceed 82% of total internet traffic by 2021. Online video material is usually compressed with a codec. On an end device, a user can simply download and install a codec package to decompress and play a video; a cloud video processing center, by contrast, must decode and analyze a large number of different videos every day, yet relatively few current video decoding schemes and technologies target the data center.
Existing video processing data centers usually adopt single-process, multi-channel video decoding. Video session management, the decoding of different videos, and the processing and analysis of the decoded data are all implemented in one process; every module can share the memory touched during video processing, which is convenient and easy to implement. Many data centers also use hardware acceleration devices such as GPUs to decode and analyze multiple videos. However, in a single-process scheme, an error or crash in the process, or a fault in the hardware device, halts the whole video processing service; and if a single-process pipeline is split directly into multiple processes by module or by acceleration device, the data-copy overhead of inter-process communication grows. A method is therefore needed that balances the processing load across multiple decoding devices or decoding processes and divides the pipeline into processes by module, so that the different processes cooperate and pass YUV data between them. (YUV is a color encoding format; decoded video data are usually in YUV format.)
Disclosure of Invention
To remedy the defects of the prior art, the invention provides a method and system for high-concurrency video processing with multi-process shared data. By distributing and scheduling multiple video channels across several different processes, it balances the load of the multi-process decoding tasks, limits the impact on the main process when a decoding process exits abnormally or its hardware acceleration device fails, and thereby strengthens the fault tolerance of the system. At the same time, by sharing data between the multiple decoding processes and the process that handles the decoded YUV data, it reduces the coupling between the different processes of the system and achieves zero-copy transfer of YUV data between processes, cutting the data-communication overhead that multi-process cooperation would otherwise incur.
To achieve the above object, the present invention provides a method for high-concurrency video processing with multi-process shared data, comprising the following steps:
S1: inputting video segments of different sizes into a session manager that manages the different video programs;
S2: using a fragment-based load-balancing method for scheduling multi-process decoding tasks, distributing the input video segments, as minimum-weight-unit video clips, to the queues of the different decoding processes, and then sending the clip data in each queue to the corresponding decoding process over a socket;
S3: each decoding process decoding the video clips it receives and placing the decoded data into its shared memory queue;
S4: the main process fetching YUV data from the different shared memory queues, processing each fetched frame of YUV data, and returning each YUV buffer after its data have been processed;
S5: returning the processing result to the session manager.
In an embodiment of the present invention, the data distribution in S2 proceeds as follows:
S21: acquire a video segment and its information, which includes the occupied memory size, the video session information, the video width, and the video height;
S22: compute the frame count and the number of key frames (I frames) of the current video segment with multimedia processing software, and compute the weight W of the segment from the frame count, video width, and video height:
W = frame_num × width × height
where frame_num is the frame count of the current video segment, width is the video width, and height is the video height;
S23: split the video segment into several minimum-weight-unit video clips according to its weight W;
S24: distribute the split clips as follows: compare the total weights of the queues of all decoding processes to find the queue with the smallest total weight, and add the 1st minimum-weight-unit clip split in S23 to that queue; the clip is thereby assigned to the decoding process bound to the smallest-total-weight queue;
S25: repeat S24 until every minimum-weight-unit clip has been assigned to the decoding process whose queue had the smallest total weight at that step.
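The weight formula of S22 and the greedy assignment of S24 and S25 can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation, and the function names are my own.

```python
def segment_weight(frame_num: int, width: int, height: int) -> int:
    """Weight W of a video segment (S22): W = frame_num * width * height."""
    return frame_num * width * height

def distribute(clip_weights, num_processes):
    """Assign each minimum-weight-unit clip (given by its weight) to the
    decoding-process queue with the smallest running total (S24, S25)."""
    queues = [[] for _ in range(num_processes)]
    totals = [0] * num_processes
    for w in clip_weights:
        i = min(range(num_processes), key=lambda k: totals[k])  # lightest queue
        queues[i].append(w)
        totals[i] += w
    return queues, totals
```

Because each clip goes to the currently lightest queue, the per-process totals stay close even when clip weights vary.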
In an embodiment of the present invention, S23 proceeds as follows:
S2301: check the number of key frames (I frames) in the current video segment; if there is exactly one I frame, the segment is already a minimum-weight-unit clip; if there is more than one, compare the weight W computed in S22 with the preset minimum weight unit;
S2302: if W is larger than the minimum weight unit, compute a weight for the GOP (group of pictures) containing each I frame; a GOP consists of all video frames from the current I frame (inclusive) up to the next I frame, its size is the number of frames it contains, and its weight W_GOP is computed from its size as
W_GOP = GOP size × width × height
where width is the video width and height is the video height;
S2303: before the weight W_GOP1 of the 1st GOP of the segment is counted, the running total weight sumW_0 of the segment is 0;
S2304: before counting the weight W_GOP2 of the 2nd GOP, the running total is sumW_1 = W_GOP1 + sumW_0; likewise, before counting the weight W_GOPx of the x-th GOP, sumW_(x-1) = W_GOP(x-1) + sumW_(x-2);
S2305: when W_GOPx + sumW_(x-1) exceeds the minimum weight unit, split GOPs 1 through x off as a single clip, reset sumW to 0, and repeat S2303 and S2304 with the next GOP treated as the 1st GOP; otherwise, proceed to the next step;
S2306: if the n-th GOP has been counted and n < x, i.e. no further GOP remains for the calculation, split GOPs 1 through n off directly as a single clip.
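The accumulation loop of S2301 through S2306 can be sketched as follows, assuming the GOP sizes of the segment are already known (e.g. from the I-frame positions found in S22). This is an illustrative sketch, not the patent's implementation.

```python
def split_by_gop(gop_sizes, width, height, min_weight_unit):
    """Split a segment's GOPs into minimum-weight-unit clips.

    A clip is cut as soon as adding the next GOP's weight would push the
    running total past min_weight_unit (S2305, the cut includes that GOP);
    any trailing GOPs that never reach the threshold form the last clip
    (S2306).
    """
    clips, current, sum_w = [], [], 0
    for size in gop_sizes:
        w_gop = size * width * height        # W_GOP = GOP size x width x height
        current.append(size)
        if w_gop + sum_w > min_weight_unit:  # S2305: cut, including this GOP
            clips.append(current)
            current, sum_w = [], 0
        else:
            sum_w += w_gop                   # S2303/S2304: accumulate
    if current:                              # S2306: leftover GOPs form a clip
        clips.append(current)
    return clips
```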
In an embodiment of the present invention, S3 proceeds as follows:
S31: the main process allocates a shared memory for each decoding process, keyed by the process name, and divides it into two queues, a busy queue and an empty queue; each queue holds several shared memory blocks used as buffers for decoded YUV data, the empty queue manages the free YUV buffers, and the busy queue manages the buffers holding decoded YUV data;
S32: when a decoding process produces new YUV data, it takes a free YUV buffer from the empty queue;
S33: the decoding process fills the buffer with the YUV data information derived from the video clip being decoded, including the program information, image width, image height, image size, and image format;
S34: the decoding process appends the filled YUV buffer to the busy queue, completing the storage of the YUV data in shared memory.
In an embodiment of the present invention, S4 proceeds as follows:
S41: the main process runs one image processing thread per decoding process to check the state of that process's busy queue; whenever the busy queue is found to contain data, a YUV data node is taken directly from the head of the queue;
S42: the main process analyzes the YUV data in the node taken in S41 according to the specific service requirement;
S43: after the analysis in S42 is complete, the main process puts the YUV buffer back into the empty queue of the decoding process it came from, so the buffer can be reused by that process.
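The producer/consumer cycle of S31 to S34 and S41 to S43 can be sketched in a single process, with ordinary deques standing in for the shared-memory empty and busy queues. This is an illustrative sketch (the class and field names are my own), not the patent's shared-memory implementation.

```python
from collections import deque

class YUVBufferPool:
    """Two-queue buffer recycling: empty holds free buffers, busy holds
    decoded frames waiting for the main process."""

    def __init__(self, n_buffers: int):
        self.empty = deque({} for _ in range(n_buffers))  # free YUV buffers
        self.busy = deque()                               # filled, unprocessed

    def store_decoded(self, info: dict) -> bool:
        """Decoder side (S32-S34): take a free buffer, fill it, mark it busy."""
        if not self.empty:
            return False          # no free buffer; decoder must wait
        buf = self.empty.popleft()
        buf.clear()
        buf.update(info)          # program info, image width/height/size/format
        self.busy.append(buf)
        return True

    def process_one(self, handler) -> bool:
        """Main-process side (S41-S43): take from the busy head, process,
        then return the buffer to the empty queue for reuse."""
        if not self.busy:
            return False
        buf = self.busy.popleft()
        handler(buf)              # service-specific analysis of the YUV frame
        self.empty.append(buf)
        return True
```

Because buffers circulate between the two queues and only queue nodes move, the YUV payload itself is never copied, which is the zero-copy property the patent claims.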
In an embodiment of the present invention, the main process queries the state of each decoding process through a monitoring thread; when any decoding process exits abnormally, the main process directly destroys the shared memory corresponding to that process.
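One pass of such a monitoring thread can be sketched as follows. This is a hedged sketch, not the patent's code: the function is duck-typed, so any object exposing is_alive() and exitcode (such as multiprocessing.Process) can be monitored, and the destroy callback stands in for whatever shared-memory teardown the system uses.

```python
def reap_dead_decoders(procs: dict, shared_mems: dict, destroy) -> list:
    """Destroy the shared memory of decoding processes that exited
    abnormally (non-zero exit code) and return the reaped process names."""
    reaped = []
    for name, p in list(procs.items()):
        if not p.is_alive() and p.exitcode not in (0, None):
            destroy(shared_mems.pop(name))   # free the per-process shared memory
            del procs[name]
            reaped.append(name)
    return reaped
```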
To achieve the above object, the present invention further provides a system for high-concurrency video processing with multi-process shared data, configured to perform the foregoing method and comprising:
a session management module, comprising a session manager and a video processing module, the session manager managing the video segment data entering the system and the decoded YUV data;
a data distribution module, connected to the session management module, which distributes the different video clips to the queues of the different decoding processes by executing the load-balancing method for scheduling multi-process decoding tasks;
a decoding module, comprising several decoding processes and connected to the data distribution module, in which the different decoding processes decode the video clips assigned to them;
a shared memory module, comprising several shared memories, each allocated by the main process and keyed by the name of a decoding process, connected to the decoding module, which manages the decoded YUV data through queues;
an image processing module, comprising an image processor and connected to the shared memory module and the session management module, which processes the YUV data obtained from the shared memory module according to the service requirements and returns the processing result to the session management module.
In an embodiment of the present invention, each shared memory contains two queues, a busy queue and an empty queue; each queue holds several shared memory blocks, each used as a buffer for decoded YUV data; the empty queue manages the free YUV buffers, and the busy queue manages the buffers holding decoded YUV data.
Compared with the prior art, the invention has at least the following advantages:
(1) the fragment-based load-balancing method for scheduling multi-process decoding tasks provides an effective way to balance the scheduling of multi-channel video decoding under high concurrency, so that the load of all decoding processes is balanced;
(2) managing the shared memory through multiple queues reduces the coupling between the different processes of the system, achieves zero copy of the decoded data, and cuts the data-communication overhead caused by multi-process cooperation;
(3) with several processes decoding cooperatively, the abnormal exit of one decoding process, or the failure of the hardware acceleration device it uses, does not affect the normal operation of the main process, which strengthens the fault tolerance of the system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a block diagram of a video processing system architecture according to the present invention;
FIG. 3 is a flowchart of a multi-process decode task scheduling process of the present invention;
description of reference numerals: 10-a session management module; 20-a data distribution module; 30-a decoding module; 40-shared memory module; 50-image processing module.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Fig. 1 is a flowchart of a method of the present invention, and as shown in fig. 1, the present invention provides a method for processing a high-concurrency video with shared data by multiple processes, which includes the following steps:
S1: inputting video segments of various sizes into a session manager that manages the different video programs;
S2: using a fragment-based load-balancing method for scheduling multi-process decoding tasks, distributing the input video segments, as minimum-weight-unit video clips, to the queues of the different decoding processes, and then sending the clip data in each queue to the corresponding decoding process over a socket, where a socket is a programming abstraction that encapsulates TCP-layer network communication and a socket connection provides a long-lived connection between server and client;
For decoding and scheduling multi-channel video across multiple processes, the embodiment of the invention divides the work by video clip: each video segment is split into clips of different sizes according to the preset minimum weight unit, different queues manage the different clips, and the queues are bound to different processes, which realizes the data distribution across processes. Meanwhile, to keep the system load balanced, each clip carries a weight and each queue carries a total weight, so that during load balancing the total weight of the tasks already assigned to each decoding process can be queried and compared quickly, and new decoding tasks are distributed according to those totals, achieving balanced distribution of the task data.
The data distribution in S2 proceeds as follows:
S21: acquire a video segment and its information, which includes the occupied memory size, the video session information, the video width, the video height, and the like;
S22: compute the frame count (frame_num) and the number of key frames (I frames) of the current video segment with multimedia processing software (ffmpeg or other video decoding software), and compute the weight W of the segment from the frame count (frame_num), video width (width), and video height (height):
W = frame_num × width × height;
S23: split the video segment into several minimum-weight-unit video clips according to its weight W;
the splitting into minimum-weight-unit clips proceeds as follows:
S2301: check the number of key frames (I frames) in the current video segment; if there is exactly one I frame, the segment is already a minimum-weight-unit clip; if there is more than one, compare the weight W computed in S22 with the preset minimum weight unit;
S2302: if the weight W of the current video segment is larger than the minimum weight unit, compute a weight for the GOP (Group of Pictures) containing each I frame; a GOP consists of all video frames from the current I frame (inclusive) up to the next I frame, its size is the number of frames it contains, and its weight is computed from its size as
W_GOP = GOP size × width × height
where width is the video width and height is the video height;
S2303: before the weight W_GOP1 of the 1st GOP of the segment is counted, the running total weight sumW_0 of the segment is 0;
S2304: before counting the weight W_GOP2 of the 2nd GOP, the running total is sumW_1 = W_GOP1 + sumW_0; likewise, before counting the weight W_GOPx of the x-th GOP, sumW_(x-1) = W_GOP(x-1) + sumW_(x-2);
S2305: when W_GOPx + sumW_(x-1) exceeds the minimum weight unit, split GOPs 1 through x off as a single clip, reset sumW to 0, and repeat S2303 and S2304 with the next GOP (the (x+1)-th) treated as the 1st GOP; otherwise, proceed to the next step;
S2306: if the n-th GOP has been counted and n < x, i.e. no subsequent GOP remains for the calculation, split GOPs 1 through n off directly as a single clip;
S24, distributing the split video segments, the specific process being as follows: comparing the total weights of the queues corresponding to all processes to find the queue with the minimum total weight, adding the 1st minimum-weight-unit video segment split in step S23 to that queue, thereby distributing it to the decoding process corresponding to the minimum-total-weight queue;
and S25, repeating S24 until every minimum-weight-unit video segment has been distributed to the decoding process corresponding to the then-minimum-total-weight queue.
Through the above steps, the video segments are split and distributed so that the total weight of the decoding tasks loaded onto each decoding process is essentially equal, balancing the load across all decoding processes.
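The distribution rule of S24-S25 (always append to the queue with the smallest total weight) is the classic greedy load-balancing heuristic; a min-heap makes each lookup O(log n). A minimal sketch under assumed names (`Dispatcher`, `dispatch` are illustrative):

```python
import heapq
from typing import Any, Dict, List, Tuple

class Dispatcher:
    """Assign each minimum-weight-unit piece to the decoding process whose
    queue currently carries the least total weight (S24-S25)."""

    def __init__(self, num_processes: int) -> None:
        # heap entries are (total_weight, process_id) pairs
        self._heap: List[Tuple[int, int]] = [(0, p) for p in range(num_processes)]
        heapq.heapify(self._heap)
        self.queues: Dict[int, List[Any]] = {p: [] for p in range(num_processes)}

    def dispatch(self, piece: Any, weight: int) -> int:
        total, pid = heapq.heappop(self._heap)  # minimum-total-weight queue
        self.queues[pid].append(piece)          # enqueue for that decoder
        heapq.heappush(self._heap, (total + weight, pid))
        return pid
```

With three processes and pieces of weight 5, 4, 3, 2, 1, every queue ends up with total weight 5, matching the "basically consistent total weight" goal stated above.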
S3: the decoding process decodes the received video segments and places the decoded data into the corresponding shared memory queue;
The specific process of S3 is as follows:
S31: the main process allocates a shared memory for each decoding process according to its name and divides it into two queues, a busy queue and an empty queue, each queue containing a number of shared memory blocks used to store decoded YUV data, wherein the empty queue manages the free YUV buffers and the busy queue manages the buffers holding decoded YUV data;
S32: when the decoding process decodes new YUV data, it obtains a free YUV buffer from the empty queue;
S33: the decoding process fills the YUV data information into the YUV buffer according to the video segment information corresponding to the current YUV data, wherein the YUV data information includes: program information, image width, image height, image size, image format, and the like;
S34: the decoding process places the YUV buffer filled with data into the busy queue, completing the storage of the YUV data in the shared memory.
In this embodiment, the decoding process implements management of the shared memory in a queue manner.
S4: the main process acquires YUV data from the different shared memory queues, processes each acquired YUV frame accordingly, and returns the corresponding buffer to the empty queue after processing, so that the shared memory buffers are reused;
wherein, the specific process of S4 is as follows:
S41: the main process creates several image processing threads for each decoding process to continuously poll the state of the busy queue of that decoding process; when the busy queue is found to contain data, the corresponding YUV data node is taken directly from the head of the busy queue;
S42: the main process directly performs the corresponding analysis processing on the YUV data in the node taken out in S41, according to the specific business requirements for that node;
S43: after the main process completes the YUV data analysis of S42, it places the YUV buffer back into the empty queue of the corresponding decoding process, so that the buffer can be reused by that decoding process.
In an embodiment of the present invention, the main process queries the state of the decoding process through the monitoring thread, and when a certain decoding process exits abnormally, the main process directly destroys the shared memory corresponding to the decoding process, so as to enhance the fault tolerance of the system.
In this embodiment, the main process and the decoding processes continuously exchange nodes through the empty and busy queues, so that YUV data is passed between two different processes. For the same decoding process, the YUV data always stays in the same buffer, and no copy of the YUV data is ever made between the main process and the decoding process, which reduces the data communication overhead of multi-process cooperation. Meanwhile, by giving each decoding process two queues, both the buffering of the YUV data and the identification of its state are realized.
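The empty/busy protocol of S31-S34 and S41-S43 is a buffer pool handing ownership back and forth without copying frame data. The sketch below models the two queues with thread-safe `queue.Queue` objects purely for illustration; the patent places the blocks in inter-process shared memory, which this sketch does not implement, and all names are assumptions:

```python
import queue

class BufferPool:
    """One decoder's queue pair: 'empty' holds free YUV buffers,
    'busy' holds decoded frames waiting for the main process."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.empty: "queue.Queue[bytearray]" = queue.Queue()
        self.busy: "queue.Queue[bytearray]" = queue.Queue()
        for _ in range(num_blocks):
            self.empty.put(bytearray(block_size))   # S31: pre-allocated blocks

def decoder_put(pool: BufferPool, yuv: bytes) -> None:
    buf = pool.empty.get()        # S32: obtain a free YUV buffer
    buf[:len(yuv)] = yuv          # S33: fill in the decoded data (and metadata)
    pool.busy.put(buf)            # S34: publish it on the busy queue

def main_process_take(pool: BufferPool) -> bytes:
    buf = pool.busy.get()         # S41: take the node at the busy-queue head
    result = bytes(buf)           # S42: business-specific analysis would go here
    pool.empty.put(buf)           # S43: return the buffer for reuse, no copy
    return result
```

Because the same `bytearray` object circulates between the two queues, the frame is never duplicated, which is the zero-copy property the paragraph above describes.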
S5: the processing result is returned to the Session manager.
Fig. 2 is a schematic diagram of a video processing system according to the present invention. As shown in fig. 2, the present invention further provides a high-concurrency video processing system with multi-process shared data, which includes:
a Session management module (10) including a Session manager (Session manager) for managing video segment data input into the system and decoded YUV data;
the data distribution module (20) (Dispatch) is in data connection with the session management module and is used for distributing different video segments to queues corresponding to different decoding processes by executing a multi-process decoding task scheduling load balancing method;
a decoding module (30) which comprises a plurality of decoding processes, is in data connection with the data distribution module (Dispatch), and is used for decoding the distributed corresponding video clips through different decoding processes;
the shared memory module (40) comprises a plurality of shared memories, each shared memory is distributed by the main process according to the name of each decoding process, is in data connection with the decoding module, and is used for managing the decoded YUV data in a queue mode;
each shared memory comprises two queues, namely a busy queue (busy queue) and an empty queue (empty queue), wherein each queue comprises a plurality of shared memory blocks and is used as a buffer area for storing decoded YUV data, the empty queue (empty queue) is used for managing the buffer area of the idle YUV data, and the busy queue (busy queue) is used for managing the buffer area for caching the YUV data after decoding;
and the Image processing module (50) comprises an Image processor (Image processor), is in data connection with the shared memory module and the session management module, and is used for correspondingly processing the YUV data acquired from different shared memory queues of the shared memory module according to business requirements and returning a processing result to the session management module.
According to the invention, by distributing and scheduling the tasks of multiple video streams across different processes, load balancing of the multi-process decoding tasks is achieved, the impact on the main process when a decoding process exits abnormally or a hardware acceleration device fails is reduced, and the fault tolerance of the system is enhanced. Meanwhile, by sharing data between the decoding processes and the process that handles the decoded YUV data, the coupling between different processes of the system is reduced and zero copy of the YUV data between processes is realized, thereby reducing the data communication overhead of multi-process cooperation.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (5)

1. A high-concurrency video processing method for multi-process shared data is characterized by comprising the following steps:
s1: inputting video segments with different sizes into a session manager for managing different video programs;
S2: using a fragment-based multi-process decoding task scheduling load balancing method, distributing the input video segment data, as minimum-weight-unit video fragments, to the queues corresponding to different decoding processes, and then sending the corresponding video segment data in the queues to the corresponding decoding processes through sockets, wherein the specific steps of data distribution are as follows:
S21: acquiring any video segment and acquiring the current video segment information, wherein the video segment information comprises: the occupied memory space size, the video session information, the video width, and the video height;
S22: calculating the frame count and the number of key frames (I frames) of the current video segment using multimedia processing software, and calculating the weight W of the current video segment from the frame count, video width, and video height:
W = frame_num × width × height
wherein frame_num is the frame count of the current video segment, width is the video width, and height is the video height;
s23: according to the weight W of the current video segment, splitting the video segment into a plurality of video segments with the minimum weight unit, which comprises the following specific processes:
S2301: checking the number of key frames (I frames) in the current video segment, wherein if the number of I frames is 1, the current video segment is already a minimum-weight-unit video segment; if the number of I frames is greater than 1, comparing the weight W of the current video segment calculated in S22 with a preset minimum weight unit;
S2302: if the weight W of the current video segment is greater than the minimum weight unit, calculating the weight of the GOP (group of pictures) beginning at each I frame in the video segment, wherein a GOP comprises all video frames from the current I frame (inclusive) up to the next I frame, the size of a GOP is the number of video frames it contains, and the weight W_GOP of the current GOP is calculated from its size as follows:
W_GOP = GOP size × width × height
wherein width is the video width and height is the video height;
S2303: before the weight W_GOP1 of the 1st GOP of the current video segment is calculated, the total weight sumW_0 of the current video segment is 0;
S2304: before the weight W_GOP2 of the 2nd GOP is calculated, the total weight of the current video segment is sumW_1 = W_GOP1 + sumW_0; likewise, before the weight W_GOPx of the x-th GOP is calculated, the total weight is sumW_(x-1) = W_GOP(x-1) + sumW_(x-2);
S2305: when W_GOPx + sumW_(x-1) is greater than the minimum weight unit, splitting the 1st through x-th GOPs off as a single video segment, resetting sumW to 0, and repeating S2303 and S2304 with the next GOP treated as the 1st GOP; otherwise, proceeding to the next step;
S2306: if the n-th GOP has been calculated and n is less than x, i.e. no subsequent GOP remains for calculation, splitting the 1st through n-th GOPs off directly as a single video segment;
s24: distributing the split video clips;
s3: the decoding process decodes the received video segments and places the decoded data into the corresponding shared memory queue, and the specific process is as follows:
s31: the main process allocates a shared memory according to the name of each decoding process, and divides the shared memory into two queues, namely a busy queue and an empty queue, wherein each queue comprises a plurality of shared memory blocks and is used for storing the decoded YUV data, the empty queue is a queue for managing an idle YUV buffer area, and the busy queue is a queue for managing a buffer area for caching the YUV data after decoding;
s32: when the decoding process decodes new YUV data, the decoding process acquires an idle YUV buffer area from the empty queue;
s33: and the decoding process fills the YUV data information into a YUV buffer area according to the video segment information corresponding to the current YUV data, wherein the YUV data information comprises: program information, image width, image height, image size, image format;
s34: the decoding process puts the YUV buffer area filled with the data into a busy queue to finish the storage of the YUV data in the shared memory;
s4: the main process acquires YUV data from different shared memory queues, correspondingly processes each acquired frame of YUV data, and returns the YUV data to a corresponding buffer area after the YUV data is processed, and the specific process comprises the following steps:
S41: the main process creates several image processing threads for each decoding process to continuously query the state of the busy queue corresponding to that decoding process; when the busy queue is found to contain data, the corresponding YUV data node is taken directly from the head of the busy queue;
S42: the main process directly performs the corresponding analysis processing on the YUV data in the node taken out in S41, according to the specific business requirements for that node;
S43: after the main process completes the YUV data analysis of S42, it places the YUV buffer back into the empty queue of the corresponding decoding process, so that the buffer can be reused by that decoding process;
s5: and returning the processing result to the session manager.
2. The video processing method according to claim 1, wherein the specific process of S24 is: comparing the total weight of the queues corresponding to all the processes to obtain a minimum total weight queue, adding the video segment of the 1 st minimum weight unit split in the S23 into the minimum total weight queue, and completing the distribution of the video segment of the 1 st minimum weight unit to the decoding process corresponding to the minimum total weight queue;
and then executing S25: repeating S24 until all the minimum-weight-unit video segments have been distributed to the decoding process corresponding to the minimum-total-weight queue.
3. The video processing method according to claim 1, wherein the main process queries the state of each decoding process through the monitoring thread, and when any decoding process exits abnormally, the main process directly destroys the shared memory corresponding to the decoding process.
4. A multi-process data sharing high-concurrency video processing system for performing the method of any one of claims 1 to 3, comprising:
the session management module comprises a session manager and a video processing module, wherein the session manager is used for managing video segment data input into the system and decoded YUV data;
the data distribution module is in data connection with the session management module and is used for distributing different video segments to queues corresponding to different decoding processes by executing a multi-process decoding task scheduling load balancing method;
the decoding module comprises a plurality of decoding processes, is in data connection with the data distribution module and is used for decoding the distributed corresponding video clips through different decoding processes;
the shared memory module comprises a plurality of shared memories, each shared memory is distributed by the main process according to the name of each decoding process, is in data connection with the decoding module and is used for managing the decoded YUV data in a queue mode;
and the image processing module comprises an image processor, is in data connection with the shared memory module and the session management module, and is used for correspondingly processing the YUV data acquired from the shared memory module according to service requirements and returning a processing result to the session management module.
5. The video processing system of claim 4, wherein each shared memory comprises two queues, namely a busy queue and an empty queue, each queue comprises a plurality of shared memory blocks, each shared memory block is used as a buffer for storing the decoded YUV data, wherein the empty queue is used for managing a free YUV data buffer, and the busy queue is used for managing a decoded YUV data buffer.
CN202011554999.6A 2020-12-24 2020-12-24 High-concurrency video processing method and system for multi-process shared data Active CN112637634B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011554999.6A CN112637634B (en) 2020-12-24 2020-12-24 High-concurrency video processing method and system for multi-process shared data


Publications (2)

Publication Number Publication Date
CN112637634A CN112637634A (en) 2021-04-09
CN112637634B true CN112637634B (en) 2022-08-05

Family

ID=75324681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011554999.6A Active CN112637634B (en) 2020-12-24 2020-12-24 High-concurrency video processing method and system for multi-process shared data

Country Status (1)

Country Link
CN (1) CN112637634B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: Room 711c, 7 / F, block a, building 1, yard 19, Ronghua Middle Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing 102600

Patentee after: Beijing Zhongke Flux Technology Co.,Ltd.

Address before: Room 711c, 7 / F, block a, building 1, yard 19, Ronghua Middle Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing 102600

Patentee before: Beijing Ruixin high throughput technology Co.,Ltd.