CN116016969A

CN116016969A - Method and system for sharing filters across multiple outputs in online transcoding

Info

Publication number: CN116016969A
Application number: CN202211620354.7A
Authority: CN
Inventors: 唐杰; 王遥远; 杨天使; 李庆瑜; 戴立言
Original assignee: SHANGHAI WONDERTEK SOFTWARE CO Ltd
Current assignee: SHANGHAI WONDERTEK SOFTWARE CO Ltd
Priority date: 2022-12-15
Filing date: 2022-12-15
Publication date: 2023-04-25

Abstract

The invention relates to the technical field of online audio and video transcoding and provides a method for sharing filters across multiple outputs in online transcoding, comprising the following steps. S1: decapsulate the original audio-video stream to obtain video frames and audio frames. S2: define as many filters as there are audio-video paths to be output, use the first filter to scale the video frames to the resolution of the first output path, and apply adaptive processing to the video frames. S3: each filter other than the first copies the video frames already processed by the first path and scales them to the resolution of the output path corresponding to that filter. S4: create as many output channels as there are audio-video paths to be output, and output the corresponding processed video frames and audio frames to different devices through the respective output channels. To address the performance overhead of existing online transcoding systems, the invention uses thread synchronization to greatly reduce CPU utilization and thereby reduce transcoding cost.

Description

Method and system for sharing filters across multiple outputs in online transcoding

Technical Field

The invention relates to the technical field of online audio and video transcoding, and in particular to a method and a system for sharing filters across multiple outputs in online transcoding.

Background

With the progress of the times, the live-streaming industry has developed rapidly, covering game streaming, talent shows, outdoor streaming, live-stream e-commerce, live film and television drama, and the like. In actual live streaming, the anchor's original stream is not delivered directly to users through the CDN; instead, the phone model, network conditions, and viewing experience of the actual users are taken into account. One live stream must be output at several resolutions, such as high definition, standard definition, and smooth, to suit the network environments and viewing requirements of different users. At the same time, processing such as image quality enhancement and beauty effects is applied, and logos, mosaics, text, and the like are superimposed on the picture.

In the prior art, the common approach is to produce multiple output paths and, on each path separately, perform scaling, image quality enhancement, beauty effects, and overlay of logos, mosaics, text, and the like. Each path therefore repeats part of the same processing, which causes redundant performance consumption and excessive CPU utilization.

Disclosure of Invention

In view of the above problems, an object of the present invention is to provide a method and a system for sharing filters across multiple outputs in online transcoding. To address the performance overhead of existing online transcoding systems, the invention uses thread synchronization to greatly reduce CPU utilization and thereby reduce transcoding cost.

The above object of the present invention is achieved by the following technical solutions:

A method for sharing filters across multiple outputs in online transcoding comprises the following steps:

S1: the source processing module pulls the original audio-video stream and decapsulates it to obtain video frames and audio frames;

S2: the filter processing module defines as many filters as there are audio-video paths to be output, uses the first filter to scale the video frames to the resolution of the first output path, and applies adaptive processing to the video frames;

S3: each filter other than the first copies the video frames already processed by the first path and scales them to the resolution of the output path corresponding to that filter;

S4: the output processing module creates as many output channels as there are audio-video paths to be output, and outputs the corresponding processed video frames and audio frames to different devices through the respective output channels.
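Steps S2 and S3 can be illustrated with a minimal single-threaded sketch. The `VideoFrame` type, the `scale` and `apply_effects` helpers, and the effect names are hypothetical stand-ins invented for illustration; a frame here only records its resolution and the effects applied to it, not pixel data.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VideoFrame:
    """Hypothetical frame record: resolution plus names of applied effects."""
    width: int
    height: int
    effects: tuple = ()


def scale(frame, width, height):
    """Return a copy of the frame at the requested resolution."""
    return replace(frame, width=width, height=height)


def apply_effects(frame, effects):
    """Return a copy of the frame with the given effects recorded as applied."""
    return replace(frame, effects=frame.effects + tuple(effects))


def transcode_shared(decoded, resolutions, effects):
    """S2: the first filter scales and applies the effects once.
    S3: every other filter copies that result and only rescales it."""
    first_w, first_h = resolutions[0]
    first = apply_effects(scale(decoded, first_w, first_h), effects)
    outputs = [first]
    for w, h in resolutions[1:]:
        outputs.append(scale(first, w, h))
    return outputs  # S4 would encode and send each entry on its own channel


decoded = VideoFrame(3840, 2160)
outs = transcode_shared(
    decoded,
    [(1920, 1080), (1280, 720), (854, 480)],
    ["enhance", "beauty", "logo"],
)
```

In this sketch the effects are computed only for the first output; the remaining outputs inherit them through the copy and pay only for one scaling operation each.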

Further, in step S2, the adaptive processing of the video frames specifically includes:

one or more of image quality enhancement, beauty effects, logo overlay, mosaic, and text addition.

Further, step S2 also includes:

using the first filter to perform noise reduction on the audio frames.

Further, step S3 also includes:

the filters other than the first filter pass through the audio frames processed by the first path unchanged.

Further, step S4 also includes:

encapsulating the processed video frames and audio frames before output.

Further, the method also includes:

the decapsulated video frames and audio frames are packed into structured messages and sent to a first filter message queue corresponding to the first filter, and the first filter reads the first filter message queue and processes the video frames;

the first filter packs the processed video frames and audio frames into messages and sends them to a first output message queue and to the filter message queues corresponding to each of the other filters;

each filter other than the first reads its corresponding filter message queue, scales the video frames to the resolution of the output path corresponding to that filter, packs the processed video frames and the passed-through audio frames into structured messages, and sends them to the output message queue corresponding to that filter;

each output channel reads its corresponding output message queue and encodes and outputs the video frames and audio frames.

Further, the source processing module, the filter processing module, and the output processing module each run in an independent thread and communicate through an inter-thread message mechanism.
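The message-queue arrangement above can be sketched with Python's standard `queue` and `threading` modules. This is an illustrative sketch under assumptions, not the patented implementation: the `Message` type, the `STOP` sentinel, and the function names are hypothetical, video is represented only by a `(size, effects)` pair, and the calling thread plays the role of the source module.

```python
import queue
import threading
from dataclasses import dataclass

STOP = None  # hypothetical sentinel that tells a thread to shut down


@dataclass(frozen=True)
class Message:
    """Hypothetical structured message carrying one video and one audio frame."""
    video: tuple  # (size, effects)
    audio: str


def first_filter(inbox, outboxes, size, effects):
    """Scale, apply effects, denoise audio, then fan out to every outbox."""
    while (msg := inbox.get()) is not STOP:
        done = Message(video=(size, tuple(effects)), audio=msg.audio + "+denoised")
        for q in outboxes:
            q.put(done)  # messages are immutable, so sharing stands in for a copy
    for q in outboxes:
        q.put(STOP)


def branch_filter(inbox, outbox, size):
    """Rescale the already processed video; pass the audio through untouched."""
    while (msg := inbox.get()) is not STOP:
        _, effects = msg.video
        outbox.put(Message(video=(size, effects), audio=msg.audio))
    outbox.put(STOP)


def sink(inbox, collected):
    """Stand-in for an output channel: collect instead of encoding."""
    while (msg := inbox.get()) is not STOP:
        collected.append(msg)


def run_pipeline(source, sizes, effects):
    filter_q = [queue.Queue() for _ in sizes]
    output_q = [queue.Queue() for _ in sizes]
    results = [[] for _ in sizes]
    threads = [threading.Thread(
        target=first_filter,
        args=(filter_q[0], [output_q[0]] + filter_q[1:], sizes[0], effects))]
    for i in range(1, len(sizes)):
        threads.append(threading.Thread(
            target=branch_filter, args=(filter_q[i], output_q[i], sizes[i])))
    for i in range(len(sizes)):
        threads.append(threading.Thread(target=sink, args=(output_q[i], results[i])))
    for t in threads:
        t.start()
    for msg in source:  # the caller acts as the source module
        filter_q[0].put(msg)
    filter_q[0].put(STOP)
    for t in threads:
        t.join()
    return results


source = [Message(video=((3840, 2160), ()), audio=f"a{i}") for i in range(2)]
results = run_pipeline(source, [(1920, 1080), (1280, 720), (854, 480)], ("enhance", "logo"))
```

Each queue has a single consumer, so frames reach every output in the order the source produced them, and the sentinel propagates the shutdown from the first filter through the branches to the sinks.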

A system for performing the method for sharing filters across multiple outputs in online transcoding comprises:

a source processing module, configured to pull the original audio-video stream and decapsulate it to obtain video frames and audio frames;

a filter processing module, configured to define as many filters as there are audio-video paths to be output, use the first filter to scale the video frames to the resolution of the first output path, and apply adaptive processing to the video frames; each filter other than the first copies the video frames already processed by the first path and scales them to the resolution of the output path corresponding to that filter;

an output processing module, configured to create as many output channels as there are audio-video paths to be output, and to output the corresponding processed video frames and audio frames to different devices through the respective output channels.

A computer device comprising a memory and one or more processors, the memory having stored therein computer code which, when executed by the one or more processors, causes the one or more processors to perform a method as described above.

A computer readable storage medium storing computer code which, when executed, performs a method as described above.

Compared with the prior art, the invention has the following beneficial effects:

The method for sharing filters across multiple outputs in online transcoding comprises the following steps. S1: the source processing module pulls the original audio-video stream and decapsulates it to obtain video frames and audio frames. S2: the filter processing module defines as many filters as there are audio-video paths to be output, uses the first filter to scale the video frames to the resolution of the first output path, and applies adaptive processing to the video frames. S3: each filter other than the first copies the video frames already processed by the first path and scales them to the resolution of the output path corresponding to that filter. S4: the output processing module creates as many output channels as there are audio-video paths to be output, and outputs the corresponding processed video frames and audio frames to different devices through the respective output channels. With this technical solution, when one live stream requires processing by several filters and is output as several streams, the filter processing common to all outputs is consolidated into a single filter, which reduces the CPU consumption caused by each output running its filters independently.
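The saving can be made concrete by counting per-frame video operations. The sketch below is a rough illustration only: it assumes every operation (one scaling or one effect) counts as one unit, whereas real operations differ in cost, so it indicates the shape of the saving rather than measured CPU figures. The function names are hypothetical.

```python
def per_path_ops(num_outputs, num_effects):
    """Prior approach: every output scales and applies every effect itself."""
    return num_outputs * (1 + num_effects)


def shared_ops(num_outputs, num_effects):
    """Shared approach: effects run once; every output still scales once."""
    return num_effects + num_outputs


# Three outputs, five video effects (enhancement, beauty, logo, mosaic, text).
before = per_path_ops(3, 5)
after = shared_ops(3, 5)
print(before, after)  # prints "18 8"
```

Under this simplified count, the number of operations grows with the product of outputs and effects in the per-path approach, but only with their sum in the shared approach.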

Drawings

FIG. 1 is a general flowchart of the method for sharing filters across multiple outputs in online transcoding according to the first embodiment of the present invention;

FIG. 2 is a flowchart of filter sharing across multiple outputs according to the second embodiment of the present invention;

FIG. 3 is a flowchart of source pulling and decoding according to the second embodiment of the present invention;

FIG. 4 is a flowchart of the processing of filter one according to the second embodiment of the present invention;

FIG. 5 is a flowchart of the processing of filter two according to the second embodiment of the present invention;

FIG. 6 is a flowchart of the processing of filter three according to the second embodiment of the present invention;

FIG. 7 is an overall structural diagram of the system for sharing filters across multiple outputs in online transcoding according to the third embodiment of the present invention.

Detailed Description

To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art from the present disclosure without creative effort fall within the scope of the present disclosure.

As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

First embodiment

As shown in FIG. 1, this embodiment provides a method for sharing filters across multiple outputs in online transcoding, which includes the following steps:

S1: the source processing module pulls the original audio-video stream and decapsulates it to obtain video frames and audio frames.

Specifically, the original audio-video stream of the anchor in the live broadcast is obtained and decapsulated before any filter processing, yielding the original video frames and audio frames for subsequent processing.

S2: the filter processing module defines as many filters as there are audio-video paths to be output, uses the first filter to scale the video frames to the resolution of the first output path, and applies adaptive processing to the video frames.

Specifically, as many filters are defined as there are audio-video paths to be output. The first filter processes the video frames first, scaling them to the resolution of the first output path, and then applies the adaptive processing.

The adaptive processing comprises one or more of image quality enhancement, beauty effects, logo overlay, mosaic, and text addition. These processing items are merely examples; other processing may be used in practice.

Further, the first filter performs noise reduction on the audio frames.

S3: each filter other than the first copies the video frames already processed by the first path and scales them to the resolution of the output path corresponding to that filter.

Specifically, the other paths do not need to repeat the processing done on the first path. They only copy the video frames processed by the first path and scale them to the resolution of their own output path. Likewise, when the audio frames need no special processing, the audio frames processed by the first path are passed through directly.

In addition, in this embodiment the paths differ in resolution while the other processing is the same. In actual use, the resolution may be the same while the other processing differs. For example, suppose two outputs are required, both at 1080p and both with image quality enhancement and beauty effects, but only one of them needs text added. The first path can then be processed at 1080p with image quality enhancement and beauty effects, and the second path copies the result of the first path and adds the text.
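The variant in the example above, where both outputs share the resolution but only one adds text, can be sketched as follows. The `VideoFrame` type, the function name, and the effect names are hypothetical and track only metadata, not pixels.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VideoFrame:
    """Hypothetical frame record: height plus names of applied effects."""
    height: int
    effects: tuple = ()


def shared_then_branch(decoded, height, shared_effects, branch_effects):
    """The first path applies the shared processing once; the second path
    copies that result and adds only its own extra processing."""
    first = replace(decoded, height=height,
                    effects=decoded.effects + tuple(shared_effects))
    second = replace(first, effects=first.effects + tuple(branch_effects))
    return first, second


first, second = shared_then_branch(
    VideoFrame(2160), 1080, ["enhance", "beauty"], ["text"]
)
```

Here `first` carries the enhancement and beauty effects without text, while `second` reuses that result and records the added text, so the shared effects are not computed twice.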

S4: the output processing module creates as many output channels as there are audio-video paths to be output, and outputs the corresponding processed video frames and audio frames to different devices through the respective output channels.

Specifically, each output channel encodes and encapsulates the processed video frames and audio frames and outputs them to a different device.

Further, the source processing module, the filter processing module, and the output processing module each run in an independent thread and communicate through an inter-thread message mechanism. The message-based communication proceeds as follows:

The decapsulated video frames and audio frames are packed into structured messages and sent to a first filter message queue corresponding to the first filter, and the first filter reads the first filter message queue and processes the video frames. The first filter packs the processed video frames and audio frames into messages and sends them to a first output message queue and to the filter message queues corresponding to each of the other filters. Each filter other than the first reads its corresponding filter message queue, scales the video frames to the resolution of the output path corresponding to that filter, packs the processed video frames and the passed-through audio frames into structured messages, and sends them to the output message queue corresponding to that filter. Each output channel reads its corresponding output message queue and encodes and outputs the video frames and audio frames.

Second embodiment

As shown in FIG. 2, this embodiment illustrates the method with a specific example.

This example uses three outputs:

output one: 1080p, with image quality enhancement, beauty effects, logo overlay, mosaic, text addition, and audio noise reduction;

output two: 720p, with image quality enhancement, beauty effects, logo overlay, mosaic, text addition, and audio noise reduction;

output three: 480p, with image quality enhancement, beauty effects, logo overlay, mosaic, text addition, and audio noise reduction.

The solution is as follows:

(1) The whole transcoding process is split into a source processing module CSourceSession, a filter processing module CGraphProSession, and an output processing module CSinkSession. In this embodiment, filter one, CGraphProSession Master, corresponds to output one, CSinkSession1; filter two, CGraphProSession Branch1, corresponds to output two, CSinkSession2; and filter three, CGraphProSession Branch2, corresponds to output three, CSinkSession3.

(2) The source processing module CSourceSession pulls the source stream, decapsulates and decodes it to obtain video frames and audio frames (as shown in FIG. 3), packs the decoded video frames and audio frames into structured messages, and sends them to the message queue of the CGraphProSession Master module;

(3) Filter one, the CGraphProSession Master module, reads its message queue, scales the decoded video frames to 1080p, then performs image quality enhancement, beauty effects, logo overlay, mosaic, and text addition on the video frames and noise reduction on the audio frames (as shown in FIG. 4);

filter one CGraphProSession Master packs the processed audio and video frames into a message and sends it to the message queue of the CSinkSession1 module;

filter one CGraphProSession Master packs the processed audio and video frames into a message and sends it to the message queue of the CGraphProSession Branch1 module;

filter one CGraphProSession Master packs the processed audio and video frames into a message and sends it to the message queue of the CGraphProSession Branch2 module.

(4) Filter two CGraphProSession Branch1 reads its message queue and scales the video frames processed by filter one CGraphProSession Master to 720p; the audio frames are only passed through (as shown in FIG. 5);

filter two CGraphProSession Branch1 packs the processed video frames and the passed-through audio frames into structured messages and sends them to the message queue of the CSinkSession2 module.

(5) Filter three CGraphProSession Branch2 reads its message queue and scales the video frames processed by filter one CGraphProSession Master to 480p; the audio frames are only passed through (as shown in FIG. 6);

filter three CGraphProSession Branch2 packs the processed video frames and the passed-through audio frames into structured messages and sends them to the message queue of the CSinkSession3 module.

The image quality enhancement, beauty effects, logo overlay, mosaic, text addition, and audio noise reduction are all performed by CGraphProSession Master; CGraphProSession Branch1 and CGraphProSession Branch2 do not repeat them, which simplifies the processing steps and reduces CPU consumption.

(6) CSinkSession1 reads its message queue and encodes and outputs the audio and video frames;

CSinkSession2 reads its message queue and encodes and outputs the audio and video frames; and CSinkSession3 reads its message queue and encodes and outputs the audio and video frames.
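The wiring described in steps (1) to (6) can be summarized as a routing table that lists, for each module, the message queues it sends to. The sketch below is illustrative only: the module names are written as single identifiers, and the `TOPOLOGY` layout and the `fan_out` helper are hypothetical.

```python
# One entry per output path: the filter, the output it feeds, and the target height.
TOPOLOGY = [
    {"filter": "CGraphProSession Master", "output": "CSinkSession1", "height": 1080},
    {"filter": "CGraphProSession Branch1", "output": "CSinkSession2", "height": 720},
    {"filter": "CGraphProSession Branch2", "output": "CSinkSession3", "height": 480},
]


def fan_out(topology):
    """Return, for each sending module, the queues it writes to."""
    master, *branches = topology
    routes = {"CSourceSession": [master["filter"]]}
    # The master feeds its own output plus every branch filter.
    routes[master["filter"]] = [master["output"]] + [b["filter"] for b in branches]
    # Each branch feeds only its own output.
    for b in branches:
        routes[b["filter"]] = [b["output"]]
    return routes


routes = fan_out(TOPOLOGY)
for sender, targets in routes.items():
    print(sender, "->", ", ".join(targets))
```

Only the master filter has more than one destination; adding a fourth output under this layout would add one branch entry and one more queue to the master's fan-out.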

Third embodiment

As shown in FIG. 7, this embodiment provides a system for performing the method for sharing filters across multiple outputs in online transcoding of the first embodiment, including:

a source processing module 1, configured to pull the original audio-video stream and decapsulate it to obtain video frames and audio frames;

a filter processing module 2, configured to define as many filters as there are audio-video paths to be output, use the first filter to scale the video frames to the resolution of the first output path, and apply adaptive processing to the video frames; each filter other than the first copies the video frames already processed by the first path and scales them to the resolution of the output path corresponding to that filter;

an output processing module 3, configured to create as many output channels as there are audio-video paths to be output, and to output the corresponding processed video frames and audio frames to different devices through the respective output channels.

A computer readable storage medium storing computer code which, when executed, performs a method as described above. Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the related hardware. The program may be stored in a computer readable storage medium, which may include read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks, optical disks, and the like.

The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.

The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.

It should be noted that the above embodiments can be freely combined as needed. The foregoing is merely a preferred embodiment of the present invention. Modifications and adaptations may be made by those skilled in the art without departing from the principles of the present invention, and these are intended to fall within the scope of the present invention.

Claims

1. A method for transcoding multiple outputs on line with a shared filter, comprising the steps of:

2. The method of on-line transcoding multiple output common filter according to claim 1, wherein in step S2, the adaptive processing is performed on the video frame, specifically comprising:

one or more of image quality enhancement, beauty effects, logo overlay, mosaic, and text addition.

3. The method of on-line transcoding multiple output common filter of claim 1, further comprising, in step S2:

4. The method of on-line transcoding multiple-output common filter of claim 3, further comprising, in step S3:

5. The method of on-line transcoding multiple output common filter of claim 1, further comprising, in step S4:

6. The method of on-line transcoding multiple output common filter of claim 3, further comprising:

7. The method of on-line transcoding multiple output common filter of claim 6, further comprising:

8. A system of on-line transcoding multiple output common filters for performing the method of on-line transcoding multiple output common filters of claims 1-7, comprising:

9. A computer device comprising a memory and one or more processors, the memory having stored therein computer code that, when executed by the one or more processors, causes the one or more processors to perform the method of any of claims 1-7.

10. A computer readable storage medium storing computer code which, when executed, performs the method of any one of claims 1 to 7.