CN115633176A

CN115633176A - Method, device and storage medium for dynamically adjusting length of picture group

Info

Publication number: CN115633176A
Application number: CN202211569825.6A
Authority: CN
Inventors: 李瑞亮; 郭建君
Original assignee: Beijing Weiling Times Technology Co Ltd
Current assignee: Beijing Weiling Times Technology Co Ltd
Priority date: 2022-12-08
Filing date: 2022-12-08
Publication date: 2023-01-20
Anticipated expiration: 2042-12-08
Also published as: CN115633176B

Abstract

The invention relates to a method, a device and a storage medium for dynamically adjusting the length of a group of pictures (GOP). The method comprises the following steps. An encoding step: a first encoder continuously outputs encoded video with the first frame as a full-frame compression coded frame (I frame) and subsequent frames as forward predictive coded frames (P frames); a second encoder continuously outputs encoded video in the form of full-frame compression coded frames (I frames); the output frame of the first encoder is output as the video frame. A judging step: whether to enter a setting step is judged according to calculation parameters of the first encoder and the second encoder. A setting step: the current second encoder is set as the first encoder, and the encoding step continues. The method reduces the amount of calculation and makes use of a hardware encoder and a hardware decoder, so that CPU (central processing unit) and GPU (graphics processing unit) usage is reduced, changes in image quality can be monitored in time, video quality is accurately controlled, the quality of low-delay video output is improved, and hardware cost is reduced.

Description

Method, device and storage medium for dynamically adjusting length of picture group

Technical Field

The present invention relates to a method, an apparatus and a storage medium for dynamically adjusting the length of a group of pictures, and more particularly, to a method, an apparatus and a storage medium for dynamically adjusting the length of a group of pictures for a low-latency video stream.

Background

When a video file is stored, played online and the like, pictures need to be encoded and decoded, and currently popular encoding algorithms such as H264, H265 and the like have common video compression frame types including I-frame, P-frame and B-frame:

I frame (intra picture): a key frame that uses intra-frame compression and can be decoded on its own into a complete picture. P frame (predictive picture): a predictively coded frame that uses inter-frame compression, referring to the preceding I frame or P frame and using motion prediction. B frame (bidirectionally predicted picture): a bidirectionally predictively coded frame that must refer to frames both before and after it, so it is better suited to scenarios where delay does not matter, such as recorded video. Therefore, in low-latency video stream scenarios such as cloud gaming and live streaming, only I frames and P frames are used in order to reduce stream delay, and B frames are not used.

In video coding standards, a group of pictures (GOP) is a set of consecutive coded pictures of a video encoded under a given coding standard. The GOP length is the number of frames between one I frame and the next I frame.
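The definition of GOP length can be illustrated with a short sketch. The function name and the convention that the opening I frame is counted as part of its GOP are assumptions made for this illustration, not statements from the original text.

```python
from typing import List


def gop_lengths(frame_types: str) -> List[int]:
    """Return the length of each GOP in a string of frame types such as 'IPPPIPP'.

    Assumption: a GOP starts at an I frame and runs up to, but not including,
    the next I frame, so the opening I frame is counted in the length.
    """
    lengths: List[int] = []
    for frame_type in frame_types:
        if frame_type == "I":
            lengths.append(1)      # an I frame opens a new GOP
        elif lengths:
            lengths[-1] += 1       # a P frame extends the current GOP
    return lengths


print(gop_lengths("IPPPPIPPIPPPPPP"))  # [5, 3, 7]
```

A fixed-GOP encoder would produce equal lengths here; the unequal lengths in the example correspond to a dynamically adjusted GOP.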

Since the data amount of the I frame is generally much larger than that of the P frame, the I frame consumes more transmission time than the P frame. Therefore, in the scenario of low-latency video stream, there are generally two simple methods of setting the GOP length:

The first method is to set a fixed GOP length and encode with an IPPPP… sequence, forcing the current frame to be an I frame once the fixed length is exceeded. It has the following disadvantages: 1. When the picture motion changes little, a video frame better suited to being coded as a P frame is coded as an I frame, which increases transmission delay; under a fixed bit rate, the I frame also takes up part of the data budget of the other P frames in the same GOP, which lowers the video quality of the whole GOP. 2. When the picture is in motion, an overly long P-frame sequence gradually accumulates picture distortion, and video quality drops. 3. At a scene change, the frames before and after the change have no real reference value for each other, so the later frame is better coded as an I frame; if it is coded as a P frame because of the fixed GOP length, its data volume grows to approach or even exceed that of an I frame, while its image quality is lower than that of an I frame of the same data volume.

The second method is to adjust the GOP length dynamically. In theory, a sequence can be made longer when there is little motion and shorter when there is much motion, and the output frame is set to an I frame at a scene change. In this way, both low latency and good picture quality can be maintained.

Patent document CN101322413B discloses a method of encoding video frames, including: determining a metric indicative of a distance between a selected video frame and a previous video frame to the selected video frame, wherein the determination is based on motion information associated with the previous video frame; and adaptively assigning an encoding method to the selected video frame based on the determined metric, comprising: encoding the selected frame as an intra-coded I-frame type if the determined metric is greater than a first threshold indicative of a scene change; encoding the selected frame as a predictive coded P frame type if the determined metric is not greater than the first threshold and a distance between the selected frame and a last frame assigned as a predictive coded P frame is greater than a second threshold; if the selected frame is not determined to be an intra-coded I frame type or a predictive-coded P frame type and the determined metric is greater than a third threshold, then the selected frame is encoded as a bi-directionally-coded B frame type.

Patent document CN104780367A discloses a method for dynamically adjusting GOP length, applied to an image coding apparatus, the method including the steps of: step 1: carrying out picture complexity judgment on the nth frame picture of the current coded GOP picture group; step 2: if the picture complexity of the nth frame picture is less than a preset threshold value or n is greater than a preset maximum length threshold value of a GOP picture group, carrying out intra-frame coding on the nth frame picture as a first frame of a next group of GOP picture group; otherwise, after the nth frame picture is subjected to interframe coding, adding 1 to the value of n, and returning to the step 1; the initial value of n is statically specified or obtained based on business model training.

In the related art, whether scene change or image complexity is evaluated, the calculation is performed on the output frames and the original image sequence before encoding, and the output frame is reset to an I frame when a certain threshold is exceeded. This approach involves a large amount of calculation, needs more CPU and GPU resources, and demands high computing power, which increases hardware cost. Moreover, because the visual quality of the encoded video stream is also affected by other factors such as bit rate, resolution and the rate control algorithm, calculating the dynamic GOP length in this way alone cannot guarantee the visual quality of the encoded stream. If an I frame is forced whenever the calculated value exceeds the threshold, the resulting picture may be worse than a P frame of the same data size. Conversely, since P-frame distortion accumulates gradually, if the calculated value does not exceed the threshold the method will not reset to an I frame, and the actual picture may be poor because of the accumulated distortion.

Disclosure of Invention

The invention aims to solve the problems in the prior art that dynamically adjusting the GOP length requires a large amount of calculation, occupies considerable CPU and GPU resources, demands costly computing power, and produces unsatisfactory output.

In view of the above limitations, the present invention provides a method for dynamically adjusting the length of a group of pictures, which includes:

an encoding step: the first encoder 201 continuously outputs encoded video with the first frame as a full-frame compression coded frame I and subsequent frames as forward predictive coded frames P; the second encoder 202 continuously outputs encoded video in the form of full-frame compression coded frames I; the output frame of the first encoder 201 is output as the video frame;

a judging step: judging whether to enter a setting step according to the calculation parameters of the first encoder 201 and the second encoder 202;

the setting step: the current second encoder 202 is set to the first encoder 201, the current first encoder 201 is set to the second encoder 202, and the encoding steps are continued.

Further: the judging step comprises the following steps:

a first judging step: when the number of frames between the current forward predictive coded frame P of the first encoder 201 and the last full-frame compression coded frame I of the first encoder 201 is ≥ N, a second judging step is entered; N is a first preset threshold;

a second judging step: when the size of the current frame of the first encoder 201 is ≥ K1 times the size of the current frame of the second encoder 202, the setting step is entered; K1 is a second preset threshold.
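The two judging steps above can be sketched as a single predicate. This is a minimal illustration only: the function name, the argument names and the example values for N and K1 are assumptions and do not come from the original text.

```python
def should_enter_setting_step(frames_since_i: int, p_frame_size: int,
                              i_frame_size: int, n: int, k1: float) -> bool:
    """Two-step judgment: frame-interval check first, then frame-size check."""
    # First judging step: at least N frames since the last I frame
    # output by the first encoder.
    if frames_since_i < n:
        return False
    # Second judging step: the first encoder's current (P) frame is at least
    # K1 times as large as the second encoder's current (I) frame.
    return p_frame_size >= k1 * i_frame_size


# Hypothetical values: N = 30 frames, K1 = 0.8, sizes in bytes.
print(should_enter_setting_step(30, 9000, 10000, n=30, k1=0.8))  # True
print(should_enter_setting_step(10, 9000, 10000, n=30, k1=0.8))  # False
```

Both checks use only values the encoders already produce (a frame counter and two frame sizes), which is why this path needs no decoding or image comparison.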

Further: the judging step comprises the following steps:

a first judging step: when the number of frames between the current forward predictive coded frame P of the first encoder 201 and the last full-frame compression coded frame I of the first encoder 201 is ≥ N, a second judging step is entered; N is a first preset threshold;

a second judging step: when the size of the current frame of the first encoder 201 is ≥ K1 times the size of the current frame of the second encoder 202, the setting step is entered; K1 is a second preset threshold; when the size of the current frame of the first encoder 201 is < K1 times the size of the current frame of the second encoder 202, the image quality parameters of the current frame of the first encoder 201 and of the current frame of the second encoder 202 are calculated, and a third judging step is entered;

a third judging step: when the image quality parameter of the current frame of the first encoder 201 is ≤ K2 times the image quality parameter of the current frame of the second encoder 202, the setting step is entered; K2 is a third preset threshold.

Further: when the image quality parameters are calculated, the decoder A 2010 decodes the current frame of the first encoder 201 into a first image 2011, the decoder B 2020 decodes the current frame of the second encoder 202 into a second image 2022, and the image quality parameters are calculated from the first image 2011, the second image 2022 and the unencoded original image using the VMAF or SSIM method.

Further: the first encoder 201 and the second encoder 202 use the same encoding algorithm and rate control algorithm; the encoding algorithm is H265 or H264, and the rate control algorithm is one of VBR, CBR and ABR; the decoder A 2010 and the decoder B 2020 use the decoding algorithm corresponding to the encoding algorithm used by the first encoder 201 and the second encoder 202.

An encoding apparatus for dynamically adjusting a group length of a picture, comprising:

the encoding module 200: comprises a first encoder 201 and a second encoder 202; the first encoder 201 is arranged to continuously output encoded video with the first frame as a full-frame compression coded frame I and subsequent frames as forward predictive coded frames P; the second encoder 202 is arranged to continuously output encoded video in the form of full-frame compression coded frames I; the output frame of the first encoder 201 is output as the video frame;

the decision module 300: judging whether the setting module 500 needs to be called or not;

image quality parameter calculation module 400: calculating image quality parameters of output frames of the first encoder 201 and the second encoder 202;

the setting module 500: the encoding module 200 sets the current second encoder 202 as the first encoder 201, and sets the current first encoder 201 as the second encoder 202.

Further, the decision module 300 includes:

a first determination module: when the number of frames between the current forward predictive coded frame P of the first encoder 201 and the last full-frame compression coded frame I of the first encoder 201 is ≥ N, the second determination module is called; N is a first preset threshold;

a second determination module: when the size of the current frame of the first encoder 201 is ≥ K1 times the size of the current frame of the second encoder 202, the setting module 500 is called; when the size of the current frame of the first encoder 201 is < K1 times the size of the current frame of the second encoder 202, the image quality parameter calculation module 400 is called to calculate the image quality parameter of the current frame of the first encoder 201 and the image quality parameter of the current frame of the second encoder 202, and the third determination module is called; K1 is a second preset threshold;

a third determination module: when the image quality parameter of the current frame of the first encoder 201 is ≤ K2 times the image quality parameter of the current frame of the second encoder 202, the setting module 500 is called; K2 is a third preset threshold.

Further, when the image quality parameter calculation module 400 calculates the image quality parameters, the decoder A 2010 decodes the current frame of the first encoder 201 into the first image 2011, the decoder B 2020 decodes the current frame of the second encoder 202 into the second image 2022, and the image quality parameters are calculated from the first image 2011, the second image 2022 and the unencoded original image using the VMAF or SSIM method.

Further, the first encoder 201 and the second encoder 202 use the same encoding algorithm and rate control algorithm; the encoding algorithm is H265 or H264, and the rate control algorithm is one of VBR, CBR and ABR. The decoder A 2010 and the decoder B 2020 use the decoding algorithm corresponding to the encoding algorithm used by the first encoder 201 and the second encoder 202.

A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-5.

Compared with the related art, the invention has the following advantages. When the frame type is set, the size of the current frame of the first encoder 201 is compared with the size of the current frame of the second encoder 202. When the motion change is large, this simple size comparison is enough to conclude that the output frame should be reset to an I frame, so video output quality is improved with a simple operation, and part of the computation is saved compared with approaches that perform a complex calculation, such as a similarity calculation, at every comparison.

Encoding is performed by the encoder on the server hardware; when the image quality parameters are calculated, the decoder on the server hardware decodes the encoded images to be compared; and when the output needs to be switched, the two encoders are simply swapped. CPU (central processing unit) and GPU (graphics processing unit) resources are thereby greatly saved, and the computing power requirement and the hardware investment are reduced.

By calculating the image quality parameters of the encoded video frames, the image quality change can be monitored in time, the video quality is accurately controlled, and the quality of the low-delay output video is ensured.

Drawings

Fig. 1 is a flowchart of a method for dynamically adjusting GOP length according to an embodiment of the present invention.

Fig. 2 is a flowchart of a method for dynamically adjusting GOP length according to an embodiment of the invention.

Fig. 3 is a flowchart of a method for dynamically adjusting GOP length according to an embodiment of the present invention.

Fig. 4 is a schematic structural diagram of an apparatus for dynamically adjusting GOP length according to an embodiment of the present invention.

Fig. 5 is a schematic input/output diagram of an apparatus for dynamically adjusting GOP length according to an embodiment of the present invention.

Fig. 6 is a diagram illustrating internal states of an apparatus for dynamically adjusting GOP length according to an embodiment of the present invention.

Fig. 7 is a diagram illustrating a method for calculating an image quality parameter of an apparatus for dynamically adjusting GOP length according to an embodiment of the present invention.

Fig. 8 is a diagram of an encoding apparatus for dynamically adjusting GOP length according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below. It is to be understood that the description herein is only illustrative of the present invention and is not intended to limit the scope of the present invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, and the terms used herein in the specification of the present invention are for the purpose of describing particular embodiments only and are not intended to limit the present invention. All the characterization means referred to herein can be referred to the related description in the prior art, and are not repeated herein.

For a further understanding of the present invention, reference will now be made in detail to the preferred embodiments of the present invention.

Example 1

Referring to the flowchart of fig. 1, a method for dynamically adjusting the group length of pictures according to an embodiment of the present invention includes:

an encoding step: the first encoder 201 continuously outputs encoded video with the first frame as a full-frame compression coded frame (I frame) and subsequent frames as forward predictive coded frames (P frames); the second encoder 202 continuously outputs encoded video in the form of full-frame compression coded frames (I frames); the output frame of the first encoder 201 is output as the video frame. That is, the first encoder 201 outputs the sequence IPPPP…, the second encoder 202 outputs the sequence IIIII…, and the video output sequence at this time is IPPPP….

A judging step: judging whether to enter a setting step or not according to the calculation parameters of the first encoder 201 and the second encoder 202;

the setting step: the current second encoder 202 is set as the first encoder 201, the current first encoder 201 is set as the second encoder 202, and the encoding step continues. That is, the roles of the first encoder 201 and the second encoder 202 are swapped, so that the current I frame of the original second encoder 202 is used as the video output frame, and from this point the video output starts a new GOP sequence: IPPPP…. After the swap, the new first encoder 201 and the new second encoder 202 still follow the corresponding encoding modes of the encoding step: the first encoder 201 outputs IPPPP… and the second encoder 202 outputs IIIII…. The I frame may also be set as an IDR frame. An IDR frame is a type of I frame; once the decoder recognizes an IDR frame, it clears the preceding parameter sets it previously depended on. A schematic diagram of swapping the two encoders is shown in Fig. 6.
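The role swap of the setting step can be illustrated with a toy simulation. This is a sketch under stated assumptions: `ToyEncoder`, `simulate_output` and the `swap_at` argument are hypothetical names, the encoders only track which frame type they would emit rather than encoding real pictures, and the frames at which a swap occurs are supplied by hand instead of being produced by the judging step.

```python
from typing import List, Set


class ToyEncoder:
    """Stand-in for a hardware encoder; it only tracks the frame type it would emit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.has_reference = False  # becomes True once any frame has been encoded

    def encode(self, force_intra: bool) -> str:
        # Without a reference frame the encoder must emit an I frame.
        frame_type = "I" if force_intra or not self.has_reference else "P"
        self.has_reference = True
        return frame_type


def simulate_output(num_frames: int, swap_at: Set[int]) -> List[str]:
    """Return 'encoder:frame_type' for every frame placed on the video output."""
    first, second = ToyEncoder("A"), ToyEncoder("B")
    output: List[str] = []
    for index in range(num_frames):
        main_frame = first.encode(force_intra=False)       # IPPPP... role
        candidate_frame = second.encode(force_intra=True)  # IIIII... role
        if index in swap_at:
            # Setting step: swap roles; the candidate I frame becomes the output.
            first, second = second, first
            output.append(f"{first.name}:{candidate_frame}")
        else:
            output.append(f"{first.name}:{main_frame}")
    return output


print(simulate_output(8, swap_at={4}))
# ['A:I', 'A:P', 'A:P', 'A:P', 'B:I', 'B:P', 'B:P', 'B:P']
```

Because the encoder that takes over the first role has just produced an I frame, its following P frames refer to that I frame, so the output stream remains a valid IPPPP… sequence across the swap.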

The invention has the following advantages: since the sizes of the current frame of the first encoder 201 and the current frame of the second encoder 202 are compared when the frame type is set, when the motion change is large, the conclusion that the output frame should be set as the I frame again can be obtained by simply comparing the sizes of the two frames, so that the output quality of the video is improved through simple operation, and the operation amount is saved.

Encoding is performed by the encoder on the server hardware; when the image quality parameters are calculated, the decoder on the server hardware decodes the encoded images to be compared; and when the output needs to be switched, the two encoders are simply swapped. CPU (central processing unit) and GPU (graphics processing unit) resources are thereby greatly saved, and the computing power requirement and the hardware investment are reduced.

Example 2

Referring to fig. 2, a flowchart of a method for dynamically adjusting GOP length according to an embodiment of the present invention is shown, where the method for dynamically adjusting GOP length includes:

an encoding step: the first encoder 201 continuously outputs encoded video with the first frame as a full-frame compression coded frame (I frame) and subsequent frames as forward predictive coded frames (P frames); the second encoder 202 continuously outputs encoded video in the form of full-frame compression coded frames (I frames); the output frame of the first encoder 201 is output as the video frame. That is, the first encoder 201 outputs the sequence IPPPP…, the second encoder 202 outputs the sequence IIIII…, and the video output sequence at this time is IPPPP….

the setting step: the current second encoder 202 is set as the first encoder 201, the current first encoder 201 is set as the second encoder 202, and the encoding step continues. That is, the roles of the first encoder 201 and the second encoder 202 are swapped and the original second encoder 202 becomes the video output, so that the current I frame of the original second encoder 202 is used as the video output frame, and from this point the video output starts a new GOP sequence: IPPPP…. After the swap, the new first encoder 201 and the new second encoder 202 still follow the corresponding encoding modes of the encoding step. The I frame may also be set as an IDR frame. An IDR frame is a type of I frame; once the decoder recognizes an IDR frame, it clears the preceding parameter sets it previously depended on.

Further, the judging step includes:

a first judging step: when the number of frames between the current forward predictive coded frame (P) of the first encoder 201 and the last full-frame compression coded frame (I) of the first encoder 201 is ≥ N, the second judging step is entered; N is a first preset threshold. N preferably satisfies 1 ≤ N ≤ 6000, but is not limited to this range and can be set according to actual requirements.

A second judging step: when the size of the current P frame of the first encoder 201 is ≥ K1 times the size of the current I frame of the second encoder 202, the setting step is entered; K1 is a second preset threshold. K1 satisfies 0 < K1 < 2, preferably 0.6 < K1 < 1, but is not limited to these ranges and can be set according to actual requirements. Thus, when the data volume of the P frame approaches or exceeds that of the I frame, the output frame is promptly set to an I frame and a new GOP sequence begins, so that the quality of the output frames is adjusted in time using only a simple calculation.

Further, as shown in Fig. 7, when the image quality parameters are calculated, the decoder A 2010 decodes the current P frame of the first encoder 201 into the first image 2011, the decoder B 2020 decodes the current I frame of the second encoder 202 into the second image 2022, and the image quality parameters are calculated from the first image 2011, the second image 2022 and the unencoded original image using the VMAF or SSIM method.

Further, the first encoder 201 and the second encoder 202 use the same encoding algorithm and rate control algorithm; the encoding algorithm is H265 or H264, and the rate control algorithm is one of VBR, CBR and ABR. The decoder A 2010 and the decoder B 2020 use the decoding algorithm corresponding to the encoding algorithm of the first encoder 201 and the second encoder 202.

It can be seen that the image quality calculation is needed only when the size of the current P frame of the first encoder 201 is < K1 times the size of the current I frame of the second encoder 202, which saves computation compared with the related art.

Example 3

On the basis of embodiment 1, as shown in fig. 3, further, the judging step includes:

a first judging step: when the number of frames between the current forward predictive coded frame (P) of the first encoder 201 and the last full-frame compression coded frame (I) of the first encoder 201 is ≥ N, the second judging step is entered; N is a first preset threshold. N preferably satisfies 1 ≤ N ≤ 6000, but is not limited to this range and can be set according to actual requirements.

A second judging step: when the size of the current P frame of the first encoder 201 is ≥ K1 times the size of the current I frame of the second encoder 202, the setting step is entered; when the size of the current P frame of the first encoder 201 is < K1 times the size of the current I frame of the second encoder 202, the image quality parameters of the current P frame of the first encoder 201 and of the current I frame of the second encoder 202 are calculated, and the third judging step is entered; K1 is a second preset threshold. K1 satisfies 0 < K1 < 2, preferably 0.6 < K1 < 1, but is not limited to these ranges and can be set according to actual requirements.

A third judging step: when the image quality parameter of the current P frame of the first encoder 201 is ≤ K2 times the image quality parameter of the current I frame of the second encoder 202, the setting step is entered; K2 is a third preset threshold. K2 preferably satisfies 0 ≤ K2 ≤ 1, but is not limited to this range and can be set according to actual requirements.

Further, as shown in Fig. 7, when the image quality parameters are calculated, the decoder A 2010 decodes the current P frame of the first encoder 201 into the first image 2011, the decoder B 2020 decodes the current I frame of the second encoder 202 into the second image 2022, and the image quality parameters are calculated from the first image 2011, the second image 2022 and the unencoded original image using the VMAF or SSIM method. Using the corresponding formula, the image quality parameter of the current frame of the first encoder 201 is computed from the first image 2011 and the original image, and the image quality parameter of the current frame of the second encoder 202 is computed from the second image 2022 and the original image.
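The three judging steps can be sketched as one function in which the image quality measurement is passed in as a callback, so that it runs only when the first two steps do not already decide the outcome. All names, the default thresholds (chosen inside the preferred ranges above) and the quality scores in the usage example are assumptions for illustration.

```python
from typing import Callable, List, Tuple


def judge(frames_since_i: int, p_frame_size: int, i_frame_size: int,
          measure_quality: Callable[[], Tuple[float, float]],
          n: int = 60, k1: float = 0.8, k2: float = 0.95) -> bool:
    """Return True when the setting step (encoder swap) should be entered."""
    # First judging step: too few frames since the last I frame, keep going.
    if frames_since_i < n:
        return False
    # Second judging step: the P frame is nearly as large as the I frame.
    if p_frame_size >= k1 * i_frame_size:
        return True
    # Third judging step: only now decode and measure both frames.
    p_quality, i_quality = measure_quality()
    return p_quality <= k2 * i_quality


calls: List[int] = []


def fake_quality() -> Tuple[float, float]:
    """Hypothetical scores for (current P frame, current I frame)."""
    calls.append(1)
    return 80.0, 90.0


print(judge(10, 5000, 10000, fake_quality))  # False: interval below N
print(judge(60, 9000, 10000, fake_quality))  # True: decided by frame size
print(judge(60, 3000, 10000, fake_quality))  # True: decided by quality
print(len(calls))                            # 1: quality measured only once
```

Passing the measurement as a callback mirrors the point made in the text that the decoding and quality calculation are needed only on the last branch.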

The VMAF method may be computed using an open source tool such as libvmaf or the like.

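The SSIM method uses the following formula:

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma}$$

$$l(x, y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \quad c(x, y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \quad s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$$

where the first term $l(x, y)$ compares the luminance of x and y, the second term $c(x, y)$ compares their contrast, and the third term $s(x, y)$ compares their structure; $\alpha > 0$, $\beta > 0$, $\gamma > 0$; $\sigma_{xy}$ is the covariance of x and y; $C_1$, $C_2$, $C_3$ are constants; $\mu_x$ and $\mu_y$ are the means of x and y; and $\sigma_x$ and $\sigma_y$ are the standard deviations of x and y.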
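The luminance, contrast and structure terms described above can be computed as in the following sketch. It is a simplified illustration, not the method of the original text: the statistics are taken over the whole image rather than over local windows as practical SSIM implementations do, the constants follow the commonly used choice C1 = (0.01·L)², C2 = (0.03·L)², C3 = C2/2 for dynamic range L, the exponents default to 1, and the pixel values and function name are hypothetical.

```python
from math import sqrt
from typing import Sequence


def ssim_global(x: Sequence[float], y: Sequence[float], data_range: float = 255.0,
                alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """Single-window SSIM between two equally sized lists of pixel values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length of at least 2")
    c1 = (0.01 * data_range) ** 2   # assumed constants, see lead-in
    c2 = (0.03 * data_range) ** 2
    c3 = c2 / 2.0
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sigma_x, sigma_y = sqrt(var_x), sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sigma_x * sigma_y + c3)
    # With the default exponents of 1 this is a plain product of the three terms.
    return luminance ** alpha * contrast ** beta * structure ** gamma


# Hypothetical pixel rows: the original, a decoded P frame and a decoded I frame.
original = [10, 50, 90, 130, 170, 210]
decoded_p = [12, 48, 95, 128, 160, 215]
decoded_i = [10, 51, 90, 129, 170, 211]

q_p = ssim_global(original, decoded_p)
q_i = ssim_global(original, decoded_i)
print(q_p, q_i, q_p < q_i)
```

The two scores play the role of the image quality parameters of the first and second encoders; comparing `q_p` with K2 times `q_i` is then the third judging step.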

Further, the first encoder 201 and the second encoder 202 use the same encoding algorithm and rate control algorithm; the encoding algorithm is H265 or H264, and the rate control algorithm is one of VBR, CBR and ABR. The decoder A 2010 and the decoder B 2020 use the decoding algorithm corresponding to the encoding algorithm of the first encoder 201 and the second encoder 202.

It can be seen that the image quality calculation is needed only when the size of the current P frame of the first encoder 201 is < K1 times the size of the current I frame of the second encoder 202, which saves computation compared with the related art.

Through the above steps, encoding continues, and whenever the condition is detected, the first encoder 201 and the second encoder 202 are swapped again. The following effects are thereby achieved: the GOP length is adjusted dynamically, and the visual quality of the video stream is preserved as far as possible under given constraints such as delay requirements and bandwidth usage. By calculating the image quality parameters of the encoded video frames, changes in image quality can be detected in real time and video quality can be accurately controlled. When the image quality parameter calculation component calculates the image quality parameters of the encoded video frames, the hardware video decoder on the server can be used to decode the video frames, which consumes less CPU and GPU computing power and saves hardware resources.

Example 4

An encoding apparatus for dynamically adjusting a group length of pictures, comprising:

the encoding module 200: comprises a first encoder 201 and a second encoder 202; the first encoder 201 is arranged to continuously output encoded video with the first frame as a full-frame compression coded frame I and subsequent frames as forward predictive coded frames P; the second encoder 202 is arranged to continuously output encoded video in the form of full-frame compression coded frames I; the output frame of the first encoder 201 is output as the video frame. That is, the first encoder 201 outputs IPPPP…, the second encoder 202 outputs IIIII…, and the video output sequence is IPPPP….

The decision module 300: judging whether the first encoder 201 and the second encoder 202 in the encoding module 200 need to be reset;

the setting module 500: in the encoding module 200, the current second encoder 202 is set as the first encoder 201, and the current first encoder 201 is set as the second encoder 202. That is, the roles of the first encoder 201 and the second encoder 202 are swapped, so that the current I frame of the original second encoder 202 is used as the video output frame, and from this point the video output starts a new GOP sequence: IPPPP…. After the swap, the new first encoder 201 and the new second encoder 202 still follow the corresponding encoding modes of the encoding module: the first encoder 201 outputs IPPPP… and the second encoder 202 outputs IIIII…. The I frame may also be set as an IDR frame. An IDR frame is a type of I frame; once the decoder recognizes an IDR frame, it clears the preceding parameter sets it previously depended on.

Example 5

On the basis of embodiment 4, further, the decision module 300 includes:

a first determination module: when it is determined that the number of frames between the current forward predictive coded frame of the first encoder 201 and the last full-frame compression coded frame of the first encoder 201 is ≥ N, the second determination module is called; N is a first preset threshold. The value of N is preferably 1 ≤ N ≤ 6000, but is not limited to this range and can be set according to actual requirements.

A second determination module: when the frame size of the current frame of the first encoder 201 is greater than or equal to K1 times the frame size of the current frame of the second encoder 202, the setting module 500 is called; when the frame size of the current frame of the first encoder 201 is smaller than K1 times the frame size of the current frame of the second encoder 202, the image quality parameter calculation module 400 is called to calculate the image quality parameter of the current frame of the first encoder 201 and the image quality parameter of the current frame of the second encoder 202, and the third determination module is called; K1 is a second preset threshold. The value of K1 is 0 < K1 < 2, preferably 0.6 < K1 < 1, but is not limited to this range and can be set according to actual requirements.

A third determination module: the setting module 500 is called when the image quality parameter of the current frame of the first encoder 201 is less than or equal to K2 times the image quality parameter of the current frame of the second encoder 202; K2 is a third preset threshold. The value of K2 is preferably 0 ≤ K2 ≤ 1, but is not limited to this range and can be set according to actual requirements.
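The three determination modules above can be read as a cascade in which the cheap checks (frame interval, frame size) run first and the image quality parameters are computed only when the size check does not already trigger the swap. The following Python sketch illustrates that reading. The function name `should_swap`, the `measure_quality` callback, and the default values of `n`, `k1` and `k2` are hypothetical choices made for illustration; they are picked from within the preferred ranges but are not values given in this document.

```python
from typing import Callable, Tuple


def should_swap(
    frames_since_i: int,
    p_size: int,
    i_size: int,
    measure_quality: Callable[[], Tuple[float, float]],
    n: int = 60,       # assumed example for the first preset threshold N
    k1: float = 0.8,   # assumed example for the second preset threshold K1
    k2: float = 0.9,   # assumed example for the third preset threshold K2
) -> bool:
    """Return True when the setting module should be called.

    p_size / i_size are the sizes of the current frames of the first and
    second encoder. measure_quality is a hypothetical callback returning
    (quality of first-encoder frame, quality of second-encoder frame).
    """
    # First determination: minimum I-frame interval not reached yet.
    if frames_since_i < n:
        return False
    # Second determination: the P frame is almost as large as an I frame.
    if p_size >= k1 * i_size:
        return True
    # Third determination: quality is measured only when it is needed.
    q_p, q_i = measure_quality()
    return q_p <= k2 * q_i
```

Passing the quality measurement as a callback keeps the decode-and-compare work off the path where the interval or size test already decides the outcome.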

Further, when the image quality parameter calculation module 400 calculates the image quality parameters, the decoder A 2010 decodes the current frame of the first encoder 201 into the first image 2011, the decoder B 2020 decodes the current frame of the second encoder 202 into the second image 2022, and the image quality parameters are calculated by the VMAF or SSIM method from the first image 2011, the second image 2022 and the unencoded original image. The specific calculation method is the same as in Example 3.
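As a rough illustration of how two image quality parameters could be obtained from the first image, the second image and the original image, the sketch below scores each decoded image against the original with a simplified SSIM. It is an assumption-laden stand-in, not the calculation of Example 3: it evaluates SSIM once over the whole picture (a single global window) instead of the usual sliding local windows, it treats an image as a flat list of luma samples, and the names `global_ssim` and `quality_pair` are hypothetical.

```python
from statistics import fmean
from typing import Sequence, Tuple


def global_ssim(x: Sequence[float], y: Sequence[float],
                max_value: float = 255.0) -> float:
    """Single-window SSIM over two equally sized lists of luma samples."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Standard SSIM stabilising constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def quality_pair(original: Sequence[float],
                 first_image: Sequence[float],
                 second_image: Sequence[float]) -> Tuple[float, float]:
    """Score both decoded images against the unencoded original."""
    return global_ssim(first_image, original), global_ssim(second_image, original)
```

A decoded image identical to the original scores 1 (up to floating-point rounding), and a distorted one scores lower, so the two returned values can be compared against the K2 threshold.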

Through the above modules, encoding continues, and the first encoder 201 and the second encoder 202 are exchanged again whenever the conditions are met. This achieves the following effects: the GOP length is dynamically adjusted, and the visual quality of the video stream is preserved as far as possible under constraints such as the delay requirement and bandwidth usage. By calculating the image quality parameters of the encoded video frames, changes in image quality can be sensed in real time and the video quality can be accurately controlled. When the image quality parameter calculation component in the product calculates the image quality parameters of the encoded video frames, the hardware video decoder on the server can be used to decode the video frames, so that little CPU and GPU computing power is consumed and hardware resources are saved.

Example 6

A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of Examples 1, 2, and 3.

Example 7

Fig. 8 shows an encoding apparatus for dynamically adjusting GOP length according to an embodiment of the present invention, which includes a first encoder 201, a second encoder 202, and an encoding control component 103. A schematic diagram of the encoding apparatus in operation is shown in fig. 5. The workflow of the encoding apparatus is as follows:

1. Two encoders with identical parameters, the first encoder 201 and the second encoder 202, are created. The identical parameters include, but are not limited to, the encoding algorithm, the rate control algorithm, and the bit rate. The encoding algorithm is, for example, H265 or H264, but is not limited to these. The rate control algorithm is, for example, VBR or CBR, but is not limited to these.

2. The two encoders encode the original pictures simultaneously. Initially, the first encoder 201 acts as the primary encoder and outputs a normal video stream, and the second encoder 202 acts as the secondary encoder and is forced to encode every frame as an I frame. The primary encoder output is IPPPPP…, and the secondary encoder output is IIIII…. The output frame of the primary encoder is output as the encoded video frame.

3. A minimum I-frame interval N is set as required. While the interval between the P frame currently encoded by the primary encoder and its last I frame is less than N, the current primary encoder is kept unchanged.

4.1 A coefficient K1 (0 < K1 < 2) is set as required. When the interval between the P frame currently encoded by the primary encoder and its last I frame is greater than or equal to N, the size of the current P frame encoded by the primary encoder is compared with the size of the I frame encoded by the secondary encoder. When the size of the P frame is greater than or equal to K1 times the size of the I frame, the I frame encoded by the second encoder 202 is output as the current video frame and the roles of the primary and secondary encoders are exchanged: the second encoder 202 becomes the primary encoder and outputs a normal video stream, while the first encoder 201 becomes the secondary encoder and is forced to encode every frame as an I frame. That is, the output of the second encoder 202 is IPPPPP…. A schematic diagram of the inputs and outputs when the primary and secondary encoders are switched is shown in fig. 6. The value of N is 1 ≤ N ≤ 6000.

4.2 When the size of the P frame is smaller than K1 times the size of the I frame, the image quality parameter of the P frame and the image quality parameter of the I frame are calculated separately, using an image quality metric selected as required, such as VMAF or SSIM. A coefficient K2 is set, where 0 ≤ K2 ≤ 1.

When the P-frame image quality parameter is less than or equal to K2 times the I-frame image quality parameter, the roles of the primary and secondary encoders are exchanged.

Step 4.1 runs up to the point where it is judged whether the size of the P frame is greater than or equal to K1 times the size of the I frame. If so, the remainder of step 4.1 is executed; if the size of the P frame is less than K1 times the size of the I frame, step 4.2 is entered.

A schematic diagram of the calculation of the image quality parameters is shown in fig. 7.

5. Encoding and detection continue, and the roles of the primary and secondary encoders are exchanged again whenever the conditions are met.
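The workflow of steps 1 to 5 can be illustrated with a small simulation of the output stream. The sketch below is only a model under stated assumptions: no real encoder is involved, each source picture is represented by the two frame sizes the primary and secondary encoders would have produced, and the swap rule is passed in as a `decide` callback. The names `FramePair`, `run_stream` and `decide` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class FramePair:
    """Frame sizes (bytes) produced for one source picture."""
    primary_size: int    # frame from the current primary encoder
    secondary_size: int  # forced I frame from the current secondary encoder


def run_stream(
    pairs: Iterable[FramePair],
    decide: Callable[[int, int, int], bool],
) -> List[Tuple[int, str]]:
    """Return (encoder id, frame type) for every output video frame."""
    primary, secondary = 201, 202  # step 2: 201 starts as primary encoder
    frames_since_i = 0
    output: List[Tuple[int, str]] = []
    for index, pair in enumerate(pairs):
        if index == 0:
            # The first frame of the primary encoder is an I frame.
            output.append((primary, "I"))
            continue
        frames_since_i += 1
        if decide(frames_since_i, pair.primary_size, pair.secondary_size):
            # Steps 4.1/4.2: emit the secondary encoder's I frame, swap roles.
            primary, secondary = secondary, primary
            output.append((primary, "I"))
            frames_since_i = 0
        else:
            # Step 3: keep the current primary encoder and emit its P frame.
            output.append((primary, "P"))
    return output
```

For example, with a size-only rule `decide = lambda interval, p, i: interval >= 3 and p >= 0.8 * i` (illustrative values N = 3 and K1 = 0.8), a large P frame at interval 2 does not trigger a swap, while a large P frame at interval 4 does, after which the stream continues from encoder 202.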

Generally, when a large game is run on a GPU rendering server for cloud gaming, the number of concurrent game streams is often below 5, while the hardware codec can handle more than 10 streams. The product makes use of these idle hardware codec resources and therefore has a lower cost.

In the embodiments of the present invention, the encoding method and apparatus may be used in application scenarios such as online video playback, online games, live video, cloud games, and the metaverse; it is to be understood that the present invention is not limited to these application scenarios.

The above description is intended to be illustrative of the preferred embodiment of the present invention and should not be taken as limiting the invention, but rather, the invention is intended to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Claims

1. A method for dynamically adjusting the length of a group of pictures, comprising:

an encoding step: the first encoder (201) continuously outputs encoded video in a form in which the first frame is a full-frame compression coded frame (I) and the second frame is a forward predictive coded frame (P); the second encoder (202) continuously outputs encoded video in the form of full-frame compression coded frames (I); the output frame of the first encoder (201) is output as a video frame;

a judging step: judging whether to enter a setting step or not according to the calculation parameters of the first encoder (201) and the second encoder (202);

the setting step: setting the current second encoder (202) as the first encoder (201), setting the current first encoder (201) as the second encoder (202), and continuing the encoding step.

2. The method of claim 1, wherein: the judging step comprises the following steps:

a first judgment step: when it is judged that the number of frames between the current forward predictive coded frame (P) of the first encoder (201) and the last full-frame compression coded frame (I) of the first encoder (201) is ≥ N, entering a second judgment step; N is a first preset threshold;

a second judgment step: when the size of the current frame of the first encoder (201) is greater than or equal to K1 times the size of the current frame of the second encoder (202), entering the setting step; K1 is a second preset threshold.

3. The method of claim 1, wherein: the judging step comprises the following steps:

a first judgment step: when it is judged that the number of frames between the current forward predictive coded frame (P) of the first encoder (201) and the last full-frame compression coded frame (I) of the first encoder (201) is ≥ N, entering a second judgment step; N is a first preset threshold;

a second judgment step: when the frame size of the current frame of the first encoder (201) is greater than or equal to K1 times the frame size of the current frame of the second encoder (202), entering the setting step; K1 is a second preset threshold; when the frame size of the current frame of the first encoder (201) is smaller than K1 times the frame size of the current frame of the second encoder (202), calculating image quality parameters of the current frame of the first encoder (201) and the current frame of the second encoder (202), and entering a third judgment step;

a third judgment step: when the image quality parameter of the current frame of the first encoder (201) is less than or equal to K2 times the image quality parameter of the current frame of the second encoder (202), entering the setting step; K2 is a third preset threshold.

4. The method of claim 3, wherein: when the image quality parameters are calculated, a decoder A (2010) decodes the current frame of the first encoder (201) into a first image (2011), a decoder B (2020) decodes the current frame of the second encoder (202) into a second image (2022), and the image quality parameters are calculated by a VMAF or SSIM method based on the first image (2011), the second image (2022) and an uncoded original image.

5. The method of claim 4, wherein: the first encoder (201) and the second encoder (202) use the same encoding algorithm and rate control algorithm, the encoding algorithm being H265 or H264; the rate control algorithm is one of VBR, CBR and ABR; the decoder A (2010) and the decoder B (2020) use a decoding algorithm corresponding to the encoding algorithm used by the first encoder (201) and the second encoder (202).

6. An encoding apparatus for dynamically adjusting a group length of a picture, comprising:

an encoding module (200): comprises a first encoder (201) and a second encoder (202); the first encoder (201) is arranged to continuously output encoded video in a form in which the first frame is a full-frame compression coded frame (I) and the second frame is a forward predictive coded frame (P); the second encoder (202) is arranged to continuously output encoded video in the form of full-frame compression coded frames (I); the output frame of the first encoder (201) is output as a video frame;

decision module (300): judging whether a setting module (500) needs to be called or not;

image quality parameter calculation module (400): calculating image quality parameters of the output frames of the first encoder (201) and the second encoder (202);

a setup module (500): the current second encoder (202) of the encoding module (200) is set as the first encoder (201), and the current first encoder (201) is set as the second encoder (202).

7. The apparatus of claim 6, wherein the decision module (300) comprises:

a first determination module: when it is judged that the number of frames between the current forward predictive coded frame (P) of the first encoder (201) and the last full-frame compression coded frame (I) of the first encoder (201) is ≥ N, calling a second determination module; N is a first preset threshold;

a second determination module: when the frame size of the current frame of the first encoder (201) is greater than or equal to K1 times the frame size of the current frame of the second encoder (202), calling the setting module (500); when the frame size of the current frame of the first encoder (201) is smaller than K1 times the frame size of the current frame of the second encoder (202), calling the image quality parameter calculation module (400) to calculate the image quality parameters of the current frame of the first encoder (201) and the current frame of the second encoder (202), and calling a third determination module; K1 is a second preset threshold;

a third determination module: when the image quality parameter of the current frame of the first encoder (201) is less than or equal to K2 times the image quality parameter of the current frame of the second encoder (202), calling the setting module (500); K2 is a third preset threshold.

8. The apparatus according to claim 6 or 7, wherein the image quality parameter calculation module (400) is adapted to decode the current frame of the first encoder (201) into a first image (2011) by the decoder A (2010), to decode the current frame of the second encoder (202) into a second image (2022) by the decoder B (2020), and to calculate the image quality parameters by the VMAF or SSIM method from the first image (2011), the second image (2022) and the unencoded original image.

9. The apparatus of claim 8, wherein the first encoder (201) and the second encoder (202) use the same encoding algorithm and rate control algorithm, the encoding algorithm being H265 or H264; the rate control algorithm is one of VBR, CBR and ABR; the decoder A (2010) and the decoder B (2020) use a decoding algorithm corresponding to the encoding algorithm used by the first encoder (201) and the second encoder (202).

10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1-5.