CN112543328B - Auxiliary encoding method, device, computer equipment and storage medium - Google Patents

Auxiliary encoding method, device, computer equipment and storage medium

Info

Publication number
CN112543328B
Authority
CN
China
Prior art keywords
proportion
macro block
data
encoded data
auxiliary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910893770.6A
Other languages
Chinese (zh)
Other versions
CN112543328A (en)
Inventor
洪旭东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Technology Co Ltd
Original Assignee
Guangzhou Huya Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Technology Co Ltd
Priority to CN201910893770.6A
Publication of CN112543328A
Application granted
Publication of CN112543328B
Legal status: Active


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142Detection of scene cut or scene change
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/147Data rate or code amount at the encoder output according to rate distortion criteria
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • H04N21/4431OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB characterized by the use of Application Program Interface [API] libraries
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4781Games

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses an auxiliary encoding method and device, computer equipment and a storage medium. The method comprises: obtaining first encoded data of multimedia data; determining scene information from the first encoded data; adjusting the first encoded data according to the scene information to obtain second encoded data; and sending the second encoded data to a main encoder, so that the main encoder encodes according to the second encoded data to obtain the multimedia data. The embodiments of the application can identify the current scene based on the content of the encoded data and adjust the encoded data based on the current scene, so that the adjusted encoded data better matches the current scene, improving the effectiveness of the video code rate and reducing cost.

Description

Auxiliary encoding method, device, computer equipment and storage medium
Technical Field
The embodiment of the application relates to a digital signal processing technology, in particular to an auxiliary coding method, an auxiliary coding device, computer equipment and a storage medium.
Background
With the rapid development of Internet technology and the growing demand for high-definition video, the amount of video data keeps increasing. Without compression, such video is difficult to store and transmit in practice. Video compression coding and decoding technology effectively removes redundant information from video data, enabling fast transmission over the Internet and offline storage. Video compression coding and decoding is therefore a key technology in video applications.
Currently, video compression coding techniques are widely used, for example encoding with the x264 encoder. The x264 encoder is a video compression encoder based on the H.264/MPEG-4 AVC video compression standard and provides a variety of parameters that can be set to control its coding efficiency.
During a live broadcast, for example by a game anchor, there are periods when the stream shows the game screen and periods when it does not. At present, video coding is uniform, that is, game pictures and non-game pictures are encoded with the same strategy, so code rate and cost can be wasted.
Disclosure of Invention
The application provides an auxiliary coding method, an auxiliary coding device, computer equipment and a storage medium, so as to improve the effectiveness of video code rate and reduce the cost.
In a first aspect, an embodiment of the present application provides an auxiliary encoding method, applied to an auxiliary encoder, including:
Acquiring first coded data of multimedia data;
Determining scene information from the first encoded data;
Adjusting the first coded data according to the scene information to obtain second coded data;
and sending the second encoded data to the main encoder so that the main encoder encodes according to the second encoded data to obtain the multimedia data.
In a second aspect, an embodiment of the present application further provides an auxiliary encoding apparatus, applied to an auxiliary encoder, including:
the coded data acquisition module is used for acquiring first coded data of the multimedia data;
The scene information determining module is used for determining scene information according to the first coded data;
The adjusting module is used for adjusting the first coded data according to the scene information to obtain second coded data;
And the data feedback module is used for sending the second encoded data to the main encoder so that the main encoder encodes according to the second encoded data to obtain the multimedia data.
In a third aspect, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the auxiliary encoding method as shown in the first aspect when executing the program.
In a fourth aspect, embodiments of the present application also provide a storage medium containing computer executable instructions which, when executed by a computer processor, are adapted to carry out the auxiliary encoding method as described in the first aspect.
In the auxiliary encoding method provided by the embodiments of the application, the auxiliary encoder obtains first encoded data of the multimedia data; scene information is then determined from the first encoded data; the first encoded data is adjusted according to the scene information to obtain second encoded data; and finally the second encoded data is sent to the main encoder, so that the main encoder encodes according to the second encoded data to obtain the multimedia data. Compared with the prior art, in which video encoding is performed from the encoded data as obtained and different scenes cannot be distinguished, the embodiments of the application can identify the current scene from the content of the encoded data and adjust the encoded data based on the current scene, so that the adjusted encoded data better matches the current scene, improving the effectiveness of the video code rate and reducing cost.
Drawings
FIG. 1 is a schematic diagram of a system architecture according to a first embodiment of the present application;
FIG. 2 is a flow chart of an auxiliary encoding method according to a first embodiment of the present application;
FIG. 3 is a flow chart of an auxiliary encoding method in a second embodiment of the present application;
FIG. 4 is a schematic structural diagram of an auxiliary encoding device in a third embodiment of the present application;
FIG. 5 is a schematic structural diagram of a computer device in a fourth embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present application are shown in the drawings.
Currently, the x264 encoder offers several classes of features, such as adaptive quantization, macroblock-tree analysis and rate-distortion optimization, and in typical use the same coding parameters and strategy are kept throughout the whole encoding session. However, the content of the video frames often changes dynamically; for example, a picture in which a game anchor is playing differs from a picture in which the anchor is waiting for the game to start. If the same coding strategy is used in both cases, resources are wasted. How to detect different scenes in real time and dynamically configure the coding strategy accordingly has long been a difficult problem in the industry. Based on this, the embodiments of the application provide a scheme that dynamically configures different coding strategies for different scenes over the whole encoding process during a live broadcast, thereby better reducing the video code rate and saving cost.
Example 1
Fig. 1 is a schematic diagram of the system architecture used in an embodiment of the present application, which includes a main encoder 010 and an auxiliary encoder 020. The auxiliary encoder 020 communicates with the main encoder 010 through an API interface. In the present application, the auxiliary encoder 020 may generate first encoded data based on the delay characteristic, obtain scene information from the first encoded data, adjust the first encoded data based on the scene information to obtain second encoded data, and transmit the second encoded data back to the main encoder 010 through the API interface. Both the main encoder 010 and the auxiliary encoder 020 may use the coding strategy of x264.
Fig. 2 is a flowchart of an auxiliary encoding method provided in an embodiment of the present application. The method is applicable to video encoding and may be performed by an auxiliary encoder, which may be located in a server or a user terminal. The method specifically includes the following steps:
Step 110, first encoded data of the multimedia data is obtained.
When a certain frame is encoded, the auxiliary encoder may generate the first encoded data based on the delay characteristic. The first encoded data is encoding result data and includes at least one or more of the following: 1) The number of each type of macroblock in the encoding, where the macroblock types include I macroblocks, B macroblocks, P macroblocks and SKIP macroblocks. An I (intra picture) macroblock is an intra-coded macroblock. An I frame may be the first frame of each group of pictures (Group of Pictures, GOP); it is moderately compressed, serves as a reference point for random access, and can be regarded as an image in its own right, so an I macroblock corresponds to compressed data obtained by compressing one image. A P (predictive) macroblock is a forward-predicted macroblock: it exploits temporal redundancy with respect to previously encoded frames in the picture sequence to reduce the amount of transmitted data, and the corresponding picture is also called a predicted frame. A B (bi-directional interpolated prediction) macroblock belongs to a bi-directionally predicted, interpolated coded frame: it exploits temporal redundancy with respect to both the encoded frames preceding and the encoded frames following it in the source picture sequence to reduce the amount of transmitted data, and the corresponding picture is also called a bi-predicted frame. A SKIP macroblock identifies a macroblock without a motion-vector residual. The numbers of these macroblock types are counted. 2) The actual constant quantization parameter (CRF) value used for the encoding; if the CRF rate-control mode is adopted, this is the CRF value that would need to be set to reach the same code rate. 3) The peak signal-to-noise ratio (PSNR) of the encoding, which reflects the encoding quality. 4) The encoding duration, obtained by timing the encoding.
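As an illustration only, the first encoded data enumerated above can be represented by a small C structure. The type and field names below are assumptions made for this sketch; they are not defined by the patent or by x264.

```c
#include <stdint.h>

/* Hypothetical per-frame container for the "first encoded data" produced
 * by the auxiliary encoder: per-type macroblock counts, the CRF actually
 * used, the resulting PSNR and the time already spent encoding. */
typedef struct {
    uint32_t num_i_mb;    /* intra-coded (I) macroblocks                  */
    uint32_t num_p_mb;    /* forward-predicted (P) macroblocks            */
    uint32_t num_b_mb;    /* bi-directionally predicted (B) macroblocks   */
    uint32_t num_skip_mb; /* SKIP macroblocks (no motion-vector residual) */
    double   crf;         /* constant quantization parameter (CRF) used   */
    double   psnr_db;     /* peak signal-to-noise ratio, in dB            */
    double   encode_ms;   /* encoding time already used, in milliseconds  */
} first_encoded_data_t;
```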
Step 120, determining scene information according to the first encoded data.
The first encoded data is analyzed to determine how the image frames in the video data are changing, and scene information is determined based on that change. The scene information may be expressed by some parameter, such as scene complexity. Scene complexity is higher when the image in the scene changes strongly, and lower when the image area is still or changes little.
Step 130, adjusting the first encoded data according to the scene information to obtain second encoded data.
After the scene information is determined, the first encoded data may be adjusted according to it, for example by adjusting the CRF. Different coding strategies may be configured for different scene complexities, each coding strategy containing different encoded data. Based on the scene information obtained in step 120, the encoded data contained in the corresponding coding strategy is determined from a preset mapping relation, and the first encoded data is modified based on it to obtain the second encoded data.
Step 140, sending the second encoded data to the main encoder so that the main encoder encodes according to the second encoded data to obtain the multimedia data.
After the auxiliary encoder has finished adjusting the first encoded data, the second encoded data can be transmitted back to the main encoder through the API interface. In this way, encoded data can be provided to the main encoder quickly within the delay mechanism, reducing the load on the main encoder.
Further, the first encoded data of the multimedia data is obtained by encoding with the auxiliary encoder, using the delay buffer of the main encoder.
Generally, x264 allows a certain delay, which is used by its lookahead module. The lookahead module mainly performs adaptive quantization, macroblock-tree analysis and rate control on the buffered frames; it provides better bit allocation and rate-distortion optimization at the same code rate and improves the encoded image quality. Based on the delay characteristic provided by the lookahead module, the embodiment of the application starts a fast, delay-free auxiliary encoder. For simplicity, the auxiliary encoder employs an average bitrate (ABR) rate-control algorithm.
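For illustration, a fast, delay-free auxiliary encoder of this kind could be opened through the x264 C API roughly as follows. This is only a sketch under assumptions: the preset, tune, bitrate and frame size are placeholder values, and real use would also require setting the colorspace, frame rate and input pictures.

```c
#include <stddef.h>
#include <x264.h>

/* Sketch: open a fast auxiliary x264 encoder with ABR rate control and no
 * internal delay, for the pre-analysis pass described above. All concrete
 * values are illustrative assumptions. */
static x264_t *open_auxiliary_encoder(int width, int height, int bitrate_kbps)
{
    x264_param_t param;

    /* Fastest preset and zero-latency tuning for the helper pass. */
    if (x264_param_default_preset(&param, "ultrafast", "zerolatency") < 0)
        return NULL;

    param.i_width  = width;
    param.i_height = height;

    /* Average-bitrate (ABR) rate control keeps the helper pass simple. */
    param.rc.i_rc_method = X264_RC_ABR;
    param.rc.i_bitrate   = bitrate_kbps;   /* target bitrate in kbit/s */
    param.rc.i_lookahead = 0;              /* no extra delay here      */

    return x264_encoder_open(&param);
}
```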
In the auxiliary encoding method provided by the embodiments of the application, the auxiliary encoder obtains first encoded data of the multimedia data; scene information is then determined from the first encoded data; the first encoded data is adjusted according to the scene information to obtain second encoded data; and finally the second encoded data is sent to the main encoder, so that the main encoder encodes according to the second encoded data to obtain the multimedia data. Compared with the prior art, in which video encoding is performed from the encoded data as obtained and different scenes cannot be distinguished, the embodiments of the application can identify the current scene from the content of the encoded data and adjust the encoded data based on the current scene, so that the adjusted encoded data better matches the current scene, improving the effectiveness of the video code rate and reducing cost.
Example two
Fig. 3 is a flow chart of an auxiliary encoding method according to a second embodiment of the present application, as a further explanation of the above embodiment, including:
Step 210, obtain first encoded data of the multimedia data.
Step 220, counting the total number of macro blocks contained in the multimedia data, and the number of intra-coded I macro blocks and inter SKIP macro blocks therein.
From the first encoded data obtained in step 210, the total number of macroblocks is counted. The total number of macroblocks may be the sum of the number of I macroblocks, the number of B macroblocks, the number of P macroblocks, and the number of inter SKIP macroblocks.
Step 230, calculating a first proportion according to the total macro block number and the I macro block number, wherein the first proportion is the proportion of the I macro block in the total macro block; and calculating a second proportion according to the total macro block number and the SKIP macro block number, wherein the second proportion is the proportion of the SKIP macro block in the total macro block.
After the total number of macroblocks and the numbers of intra-coded I macroblocks and inter-frame SKIP macroblocks have been obtained in step 220, a first proportion ratio_I and a second proportion ratio_skip are calculated. Both proportions lie in the range [0, 1], that is, they are greater than or equal to zero and less than or equal to 1. In general, a larger first proportion ratio_I indicates more intra-frame coding in the video sequence, meaning the motion is complex and the picture changes dramatically. A larger second proportion ratio_skip indicates more SKIP-type macroblocks between frames, meaning the video frames are relatively still.
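A minimal sketch of steps 220 to 230 follows; the function and variable names are assumptions made for illustration.

```c
#include <stdint.h>

/* Sketch of steps 220-230: from per-type macroblock counts, compute the
 * first proportion (share of I macroblocks) and the second proportion
 * (share of SKIP macroblocks), both in the range [0, 1]. */
static void macroblock_proportions(uint32_t n_i, uint32_t n_p, uint32_t n_b,
                                   uint32_t n_skip,
                                   double *ratio_i, double *ratio_skip)
{
    uint32_t total = n_i + n_p + n_b + n_skip;  /* total macroblock count */

    if (total == 0) {                 /* guard against an empty statistic */
        *ratio_i = *ratio_skip = 0.0;
        return;
    }
    *ratio_i    = (double)n_i    / total;   /* first proportion  */
    *ratio_skip = (double)n_skip / total;   /* second proportion */
}
```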
Step 240, determining the scene complexity according to the first proportion and the second proportion, and taking the scene complexity as scene information.
The first proportion represents how strongly the video tends towards motion, and the second proportion how strongly it tends towards stillness. The scene complexity could be determined from the first proportion or the second proportion alone, for example by taking the proportion whose value exceeds 1/2 as the scene complexity. This approach, although quick, does not accurately capture the real scene complexity.
Based on this, the scene complexity may instead be calculated as follows: first, a first preprocessing proportion is determined from a preset parameter and the second proportion; then, the scene complexity is determined from the sum of the first preprocessing proportion and the first proportion.
The scene complexity may be calculated by the following formula:

complexity = ratio_I + λ × (1 − ratio_skip)

where complexity is the scene complexity. Empirically, the preset parameter is chosen as λ = 0.5; it is then easy to see that 0 ≤ complexity ≤ 1.5. Substituting the first proportion ratio_I and the second proportion ratio_skip into the above formula yields the scene complexity. Different preset parameters can be set for different live-broadcast environments, so the scene complexity is calculated quickly and accurately.
Further, since the complexity value lies between 0 and 1.5, for convenience of quantization the value interval of the scene complexity may be divided equally into K subintervals, and the subinterval in which the complexity obtained from the formula falls is used as the representation of the scene complexity, where K is a positive integer, optionally 10.
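Putting the formula and the K-way quantization together, a sketch could look as follows; λ = 0.5 and K = 10 are the values suggested above, and the function name is an assumption for illustration.

```c
/* Sketch of step 240: complexity = ratio_I + lambda * (1 - ratio_skip),
 * quantized onto one of K equal subintervals of [0, 1 + lambda]. */
static int scene_complexity_level(double ratio_i, double ratio_skip,
                                  double lambda, int k)
{
    double complexity = ratio_i + lambda * (1.0 - ratio_skip);
    double max_value  = 1.0 + lambda;       /* upper bound of complexity */
    int    level      = (int)(complexity / max_value * k);

    if (level >= k)                         /* clamp the boundary case   */
        level = k - 1;
    if (level < 0)
        level = 0;
    return level;   /* 0 = lowest complexity, k - 1 = highest */
}
```

With λ = 0.5 and K = 10, scene_complexity_level(ratio_i, ratio_skip, 0.5, 10) returns a level from 0 (most static) to 9 (most complex) that can then index the preset mapping described below.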
Step 250, adjusting the first encoded data according to the scene information to obtain second encoded data.
Illustratively, the first encoded data further comprises a constant quantization parameter (CRF). In this case, step 250 includes: determining a coding parameter value mapped to the scene complexity according to a preset correspondence, the coding parameter value including the CRF.
A mapping between different scene complexities and coding reference values can be preset. The coding reference values describe part or all of a coding strategy and include, but are not limited to: the constant quantization parameter CRF, the number of reference frames, the motion search range (which can be reduced for low-complexity scenes), and so on.
The coding parameter value corresponding to each scene complexity can be preset. For example, when encoding in CRF mode, a lower scene complexity can use a smaller CRF to guarantee quality, while a higher scene complexity uses a larger CRF to control the code rate and avoid overflow. Likewise, for lower scene complexity a lighter deblocking-filter strength can be used to preserve more detail and stronger psychovisual optimization parameters can be used to provide better subjective quality, while for higher scene complexity a stronger deblocking-filter strength is used to suppress blocking artifacts.
By presetting the mapping between scene complexity and encoding parameter values, the CRF value can be determined rapidly. In particular, when the method is used within a delay mechanism, looking up the mapping allows fast processing, improves efficiency, and enables the auxiliary coding calculation to be carried out without adding delay while the system is already delayed.
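One simple way to realize the preset correspondence is a lookup table indexed by the complexity level. The concrete CRF, reference-frame and deblocking values below are invented for illustration only and are not the patent's actual mapping.

```c
/* Sketch of a preset mapping from scene-complexity level (0-9) to coding
 * parameters: lower complexity gets a lower CRF and lighter deblocking,
 * higher complexity gets a higher CRF and stronger deblocking. */
typedef struct {
    double crf;        /* constant quantization parameter (CRF) */
    int    ref_frames; /* number of reference frames            */
    int    deblock;    /* deblocking-filter strength offset     */
} coding_params_t;

static const coding_params_t k_param_table[10] = {
    {20.0, 5, -2}, {20.5, 5, -2}, {21.0, 4, -1}, {22.0, 4, -1}, {23.0, 3,  0},
    {24.0, 3,  0}, {25.0, 2,  1}, {26.0, 2,  1}, {27.0, 1,  2}, {28.0, 1,  2},
};

static coding_params_t params_for_level(int level)
{
    if (level < 0) level = 0;
    if (level > 9) level = 9;
    return k_param_table[level];
}
```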
Further, the first encoded data also includes a peak signal-to-noise ratio (PSNR), and after the above step the method further includes: judging whether the PSNR is lower than a threshold; if so, the CRF is reduced, and if it is above the threshold, the CRF is increased.
After the CRF is obtained, it may be further optimized according to the PSNR in the first encoded data. For example, the PSNR range may be divided into segments: if a PSNR value that is low for the current complexity is detected, the CRF is further reduced to ensure quality, and vice versa.
The above embodiment can further correct the CRF using the PSNR, and improve the coding efficiency.
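The PSNR correction can be sketched as a small feedback step; the threshold and step size below are assumptions for illustration.

```c
/* Sketch of the PSNR feedback: if the measured quality is below the
 * threshold, lower the CRF to spend more bits; otherwise raise it.
 * Threshold and step size are illustrative assumptions. */
static double adjust_crf_by_psnr(double crf, double psnr_db,
                                 double psnr_threshold_db, double step)
{
    if (psnr_db < psnr_threshold_db)
        return crf - step;   /* quality too low: decrease CRF to improve it   */
    return crf + step;       /* quality sufficient: increase CRF to save bits */
}
```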
Optionally, the first encoded data further includes an encoding duration, where the encoding duration is the time that has already been spent encoding.
Step 250 further comprises:
if the encoding time length exceeds the preset time length, the rate distortion optimization strength is improved;
and if the encoding time length does not exceed the preset time length, reducing the rate distortion optimization strength.
A longer encoding duration means that less time remains within the delay mechanism; the rate-distortion optimization strength then needs to be adjusted so as to limit the time spent on rate-distortion optimization. The parameters obtained by optimizing the rate-distortion cost guarantee that better subjective and objective quality can be provided at the same code rate, at the sole cost of a longer encoding time and lower encoding efficiency. In this step, because the encoding duration is monitored, the rate-distortion optimization strength can be traded off against the remaining time: when more time is available, a setting that yields better encoding quality is chosen, and vice versa.
This embodiment can adjust the rate-distortion optimization strength based on the encoding duration, so that the preset task is completed within a controllable time range, improving both efficiency and the completeness of the process.
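A direct transcription of the duration rule as stated above is sketched below; the time budget and the representation of the rate-distortion optimization strength as an integer level are assumptions for illustration.

```c
/* Sketch of the encoding-duration check as stated in this embodiment:
 * when the time already spent exceeds the preset length the
 * rate-distortion optimization strength is raised, otherwise lowered.
 * The budget and the step of one level are illustrative assumptions. */
static int adjust_rdo_strength(int rdo_strength, double encode_ms,
                               double budget_ms)
{
    if (encode_ms > budget_ms)
        return rdo_strength + 1;   /* duration exceeds the preset length */
    return rdo_strength - 1;       /* duration within the preset length  */
}
```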
Step 260, the second encoded data is sent to the main encoder, so that the main encoder encodes according to the second encoded data to obtain the multimedia data.
In the auxiliary encoding method provided by the embodiments of the application, the auxiliary encoder obtains first encoded data of the multimedia data; scene information is then determined from the first encoded data; the first encoded data is adjusted according to the scene information to obtain second encoded data; and finally the second encoded data is sent to the main encoder, so that the main encoder encodes according to the second encoded data to obtain the multimedia data. Compared with the prior art, in which video encoding is performed from the encoded data as obtained and different scenes cannot be distinguished, the embodiments of the application can identify the current scene from the content of the encoded data and adjust the encoded data based on the current scene, so that the adjusted encoded data better matches the current scene, improving the effectiveness of the video code rate and reducing cost. In addition, based on the open-source encoder x264, the embodiments of the application can monitor the scene information of the video sequence to be encoded in real time, dynamically configure the corresponding coding parameters according to that scene information, and save cost without affecting image quality. Meanwhile, the coding strategy is updated in real time according to coding feedback, providing a more efficient coding scheme.
Example III
Fig. 4 is a schematic diagram of an auxiliary encoding device according to a third embodiment of the present application. The device is applied to an auxiliary encoder and comprises an encoded data acquisition module 301, a scene information determination module 302, an adjustment module 303 and a data feedback module 304.
Wherein, the encoded data obtaining module 301 is configured to obtain first encoded data of the multimedia data;
a scene information determining module 302, configured to determine scene information according to the first encoded data;
The adjustment module 303 is configured to adjust the first encoded data according to the scene information to obtain second encoded data;
The data feedback module 304 is configured to send the second encoded data to the primary encoder, so that the primary encoder encodes the second encoded data to obtain multimedia data.
Further, the scene information determining module 302 is configured to:
Counting the total macro block number contained in the multimedia data, and the number of intra-frame coding I macro blocks and the number of inter-frame SKIP macro blocks in the total macro block number;
Calculating a first proportion according to the total macro block number and the I macro block number, wherein the first proportion is the proportion of the I macro block in the total macro block;
Calculating a second proportion according to the total macro block number and the SKIP macro block number, wherein the second proportion is the proportion of the SKIP macro block in the total macro block;
And determining the scene complexity according to the first proportion and the second proportion, and taking the scene complexity as scene information.
Further, the scene information determining module 302 is configured to:
determining a first pretreatment proportion according to preset parameters and a second proportion;
The scene complexity is determined from the sum of the first pre-processing scale and the first scale.
Further, the first encoded data further includes a constant quantization parameter CRF, and the adjusting module 303 is configured to:
and determining a coding parameter value mapped with the scene complexity according to a preset corresponding relation, wherein the coding parameter value comprises CRF.
Further, the first encoded data further comprises a peak signal to noise ratio PSNR, and the adjusting module 303 is further configured to:
judging whether the PSNR is lower than a threshold value, and if so, reducing the CRF;
If above the threshold, CRF is increased.
Further, the first encoded data further includes an encoding duration, where the encoding duration is a time used for encoding; correspondingly, the adjusting module 303 is further configured to:
if the encoding time length exceeds the preset time length, the rate distortion optimization strength is improved;
and if the encoding time length does not exceed the preset time length, reducing the rate distortion optimization strength.
Further, the encoded data acquisition module 301 is configured to:
the first encoded data of the multimedia data is obtained by encoding using an auxiliary encoder using a delay buffer of the main encoder.
In the auxiliary encoding device provided by the embodiment of the application, the encoded data acquisition module 301 obtains first encoded data of the multimedia data; the scene information determination module 302 then determines scene information from the first encoded data; the adjustment module 303 adjusts the first encoded data according to the scene information to obtain second encoded data; and finally the data feedback module 304 sends the second encoded data to the main encoder, so that the main encoder encodes according to the second encoded data to obtain the multimedia data. Compared with the prior art, in which video encoding is performed from the encoded data as obtained and different scenes cannot be distinguished, the embodiment of the application can identify the current scene from the content of the encoded data and adjust the encoded data based on the current scene, so that the adjusted encoded data better matches the current scene, improving the effectiveness of the video code rate and reducing cost.
The auxiliary coding device provided by the embodiment of the application can execute the auxiliary coding method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present application, and as shown in fig. 5, the computer device includes a processor 40, a memory 41, an input device 42 and an output device 43; the number of processors 40 in the computer device may be one or more, one processor 40 being taken as an example in fig. 5; the processor 40, the memory 41, the input means 42 and the output means 43 in the computer device may be connected by a bus or by other means, in fig. 5 by way of example.
The memory 41 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and modules, such as program instructions/modules (e.g., an encoded data acquisition module 301, a scene information determination module 302, an adjustment module 303, and a data feedback module 304) corresponding to the auxiliary encoding method in the embodiment of the present application. The processor 40 performs various functional applications of the computer device and data processing, i.e., implements the above-described auxiliary encoding method, by running software programs, instructions and modules stored in the memory 41.
The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal, etc. In addition, memory 41 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 41 may further comprise memory located remotely from processor 40, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 42 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the computer device. The output means 43 may comprise a display device such as a display screen.
Example five
A fifth embodiment of the present application also provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a method of auxiliary encoding, the method being applied to an auxiliary encoder, the method comprising:
Acquiring first coded data of multimedia data;
Determining scene information from the first encoded data;
Adjusting the first coded data according to the scene information to obtain second coded data;
and sending the second encoded data to the main encoder so that the main encoder encodes according to the second encoded data to obtain the multimedia data.
Further, determining scene information from the first encoded data includes:
Counting the total macro block number contained in the multimedia data, and the number of intra-frame coding I macro blocks and the number of inter-frame SKIP macro blocks in the total macro block number;
Calculating a first proportion according to the total macro block number and the I macro block number, wherein the first proportion is the proportion of the I macro block in the total macro block;
Calculating a second proportion according to the total macro block number and the SKIP macro block number, wherein the second proportion is the proportion of the SKIP macro block in the total macro block;
And determining the scene complexity according to the first proportion and the second proportion, and taking the scene complexity as scene information.
Further, determining the scene complexity according to the first scale and the second scale includes:
determining a first pretreatment proportion according to preset parameters and a second proportion;
The scene complexity is determined from the sum of the first pre-processing scale and the first scale.
Further, the first encoded data further includes a constant quantization parameter CRF, and correspondingly, the first encoded data is adjusted according to the scene information to obtain second encoded data, including:
and determining a coding parameter value mapped with the scene complexity according to a preset corresponding relation, wherein the coding parameter value comprises CRF.
Further, the first encoded data further includes a peak signal-to-noise ratio PSNR, and after determining an encoding parameter value mapped with the scene complexity according to a preset correspondence, the encoding parameter value includes a CRF, the method further includes:
judging whether the PSNR is lower than a threshold value, and if so, reducing the CRF;
If above the threshold, CRF is increased.
Further, the first encoded data further includes an encoding duration, where the encoding duration is a time used for encoding; correspondingly, the first coded data is adjusted according to the scene information to obtain second coded data, which comprises the following steps:
if the encoding time length exceeds the preset time length, the rate distortion optimization strength is improved;
and if the encoding time length does not exceed the preset time length, reducing the rate distortion optimization strength.
Further, obtaining the first encoded data includes:
the first encoded data of the multimedia data is obtained by encoding using an auxiliary encoder using a delay buffer of the main encoder.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present application is not limited to the above method operations, and may also perform the related operations in the auxiliary encoding method provided in any embodiment of the present application.
From the above description of embodiments, it will be clear to a person skilled in the art that the present application may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk, or an optical disk of a computer, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method of the embodiments of the present application.
It should be noted that, in the above-mentioned embodiments of the auxiliary encoding apparatus, the units and modules included are only divided according to functional logic, but the division is not limited to the above, as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only used to distinguish them from each other and are not used to limit the protection scope of the present application.
Note that the above is only a preferred embodiment of the present application and the technical principle applied. It will be understood by those skilled in the art that the present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the application. Therefore, while the application has been described in connection with the above embodiments, the application is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the application, which is set forth in the following claims.

Claims (9)

1. An auxiliary encoding method, applied to an auxiliary encoder, comprising:
Acquiring first coded data of multimedia data;
Counting the total macro block number contained in the multimedia data, and the number of intra-frame coding I macro blocks and the number of inter-frame SKIP macro blocks in the total macro block number;
Calculating a first proportion according to the total macro block number and the I macro block number, wherein the first proportion is the proportion of the I macro block in the total macro block;
Calculating a second proportion according to the total macro block number and the SKIP macro block number, wherein the second proportion is the proportion of the SKIP macro block in the total macro block;
Determining scene complexity according to the first proportion and the second proportion, and taking the scene complexity as scene information;
Adjusting the first coded data according to the scene information to obtain second coded data;
and sending the second encoded data to a main encoder so that the main encoder encodes according to the second encoded data to obtain multimedia data.
2. The auxiliary encoding method according to claim 1, wherein the determining scene complexity according to the first scale and the second scale comprises:
determining a first pretreatment proportion according to preset parameters and a second proportion;
and determining scene complexity according to the sum of the first preprocessing proportion and the first proportion.
3. The auxiliary encoding method according to claim 1, wherein the first encoded data further includes a constant quantization parameter CRF, and the adjusting the first encoded data according to the scene information, accordingly, includes:
and determining a coding parameter value mapped with the scene complexity according to a preset corresponding relation, wherein the coding parameter value comprises CRF.
4. The auxiliary encoding method according to claim 3, wherein the first encoded data further includes a peak signal-to-noise ratio PSNR, and after determining the encoding parameter value mapped with the scene complexity according to a preset correspondence, the encoding parameter value includes a CRF, further includes:
Judging whether PSNR is lower than a threshold value, and if so, reducing the CRF;
And if the CRF is higher than the threshold value, increasing the CRF.
5. The auxiliary encoding method according to claim 1, wherein the first encoded data further includes an encoding time period, the encoding time period being a time that the encoding has been used; correspondingly, the adjusting the first encoded data according to the scene information to obtain second encoded data includes:
if the encoding time length exceeds the preset time length, the rate distortion optimization strength is improved;
And if the coding time length does not exceed the preset time length, reducing the rate distortion optimization strength.
6. The auxiliary encoding method according to any one of claims 1 to 5, wherein the acquiring the first encoded data includes:
the first encoded data of the multimedia data is obtained by encoding using an auxiliary encoder using a delay buffer of the main encoder.
7. An auxiliary encoding device, applied to an auxiliary encoder, comprising:
the coded data acquisition module is used for acquiring first coded data of the multimedia data;
The scene information determining module is used for counting the total macro block number contained in the multimedia data, and the intra-frame coding I macro block number and the inter-frame SKIP macro block number in the total macro block number; calculating a first proportion according to the total macro block number and the I macro block number, wherein the first proportion is the proportion of the I macro block in the total macro block; calculating a second proportion according to the total macro block number and the SKIP macro block number, wherein the second proportion is the proportion of the SKIP macro block in the total macro block; determining scene complexity according to the first proportion and the second proportion, and taking the scene complexity as scene information;
The adjusting module is used for adjusting the first coded data according to the scene information to obtain second coded data;
And the data feedback module is used for sending the second encoded data to a main encoder so that the main encoder encodes according to the second encoded data to obtain multimedia data.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the auxiliary encoding method according to any of claims 1-6 when executing the program.
9. A storage medium containing computer executable instructions for performing the auxiliary encoding method of any of claims 1-6 when executed by a computer processor.
CN201910893770.6A 2019-09-20 2019-09-20 Auxiliary encoding method, device, computer equipment and storage medium Active CN112543328B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910893770.6A CN112543328B (en) 2019-09-20 2019-09-20 Auxiliary encoding method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910893770.6A CN112543328B (en) 2019-09-20 2019-09-20 Auxiliary encoding method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112543328A CN112543328A (en) 2021-03-23
CN112543328B (en) 2024-06-21

Family

ID=75012546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910893770.6A Active CN112543328B (en) 2019-09-20 2019-09-20 Auxiliary encoding method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112543328B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573070A (en) * 2021-06-03 2021-10-29 浙江大华技术股份有限公司 Image coding method and device and computer storage medium
CN116095359A (en) * 2021-11-02 2023-05-09 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium
CN114466221B (en) * 2022-01-14 2024-02-02 杭州华橙软件技术有限公司 Image processing method and device, storage medium and electronic equipment
CN114925226B (en) * 2022-06-22 2023-03-10 上海威固信息技术股份有限公司 Image storage method, system, image storage device and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106034237A (en) * 2015-03-10 2016-10-19 杭州海康威视数字技术股份有限公司 Mixed coding method and system based on coding switching

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2459671A (en) * 2008-04-29 2009-11-04 Imagination Tech Ltd Scene Change Detection For Use With Bit-Rate Control Of A Video Compression System
US8787447B2 (en) * 2008-10-30 2014-07-22 Vixs Systems, Inc Video transcoding system with drastic scene change detection and method for use therewith
CN102497556B (en) * 2011-12-26 2017-12-08 深圳市云宙多媒体技术有限公司 A kind of scene change detection method, apparatus, equipment based on time-variation-degree

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106034237A (en) * 2015-03-10 2016-10-19 杭州海康威视数字技术股份有限公司 Mixed coding method and system based on coding switching

Also Published As

Publication number Publication date
CN112543328A (en) 2021-03-23

Similar Documents

Publication Publication Date Title
CN112543328B (en) Auxiliary encoding method, device, computer equipment and storage medium
KR101644208B1 (en) Video encoding using previously calculated motion information
US10321138B2 (en) Adaptive video processing of an interactive environment
JP5676705B2 (en) Improved video rate control for video coding standards
US20120294355A1 (en) Video transcoding with dynamically modifiable spatial resolution
JP3748717B2 (en) Video encoding device
CN102986211B (en) Speed control in Video coding
US8406297B2 (en) System and method for bit-allocation in video coding
CN108574841B (en) Coding method and device based on self-adaptive quantization parameter
KR20050031674A (en) Prediction method and apparatus in video encoder
WO2022088631A1 (en) Image encoding method, image decoding method, and related apparatuses
CN110933430B (en) Secondary coding optimization method
US20090016443A1 (en) Inter mode determination method for video encoding
KR100594056B1 (en) H.263/MPEG Video Encoder for Effective Bits Rate Control and Its Control Method
US20240040127A1 (en) Video encoding method and apparatus and electronic device
CN100452878C (en) Motion image handling method in video coding
CN107343202B (en) Feedback-free distributed video coding and decoding method based on additional code rate
CN110971900A (en) Code rate control method suitable for 4K and 8K ultrahigh-definition motion flat video
CN111416978A (en) Video encoding and decoding method and system, and computer readable storage medium
RU2587412C2 (en) Video rate control based on transform-coefficients histogram
CN111541898B (en) Method, device, server and storage medium for determining coding mode
KR20010104058A (en) Adaptive quantizer according to DCT mode in MPEG2 encoder
WO2024082971A1 (en) Video processing method and related device
US20240236378A1 (en) Encoding method, decoding method, and decoder
KR100778473B1 (en) Bit rate control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant