CN117221547B

CN117221547B - CTU-level downsampling-based 8K video coding method and device

Info

Publication number: CN117221547B
Application number: CN202311467428.2A
Authority: CN
Inventors: 李子清; 刘征; 王友全; 宋小民; 李怡; 周廷开; 殷卓君; 吴成志; 郑慧明; 赵周丽
Original assignee: Sichuan Xinshi Chuangwei Ultra High Definition Technology Co ltd
Current assignee: Sichuan Guochuang Innovation Vision Ultra HD Video Technology Co.,Ltd.
Priority date: 2023-11-07
Filing date: 2023-11-07
Publication date: 2024-01-23
Anticipated expiration: 2043-11-07
Also published as: CN117221547A

Abstract

The invention discloses an 8K video coding method and device based on CTU-level downsampling, and belongs to the field of video coding. The 8K video encoding method comprises: encoding the original 8K video sequence CTU by CTU to obtain first reconstructed CTU blocks, and placing the first reconstructed CTU blocks into an encoded image buffer; downsampling the original 8K video sequence to a first resolution to obtain a first downsampled video sequence; encoding the first downsampled video sequence CTU by CTU to obtain second reconstructed CTU blocks; performing super-resolution reconstruction on the second reconstructed CTU blocks to obtain third reconstructed CTU blocks with 8K resolution; replacing each first reconstructed CTU block in the encoded image buffer that meets a preset rule with the corresponding third reconstructed CTU block; and combining the code streams of the CTU blocks in the encoded image buffer into a final code stream. The invention can reduce the size of the code stream while maintaining a certain coding quality.

Description

CTU-level downsampling-based 8K video coding method and device

Technical Field

The invention belongs to the field of video coding, and particularly relates to an 8K video coding method and device based on CTU-level downsampling.

Background

With the development of the video industry, the demand for high-resolution video of large scenes, especially 8K video, is increasing. However, current 8K video encoding and decoding still faces challenges that limit its range of application and its practical effect.

An 8K video contains far more data than a traditional high-definition video, so the encoded code stream is much larger and requires a higher transmission bandwidth. Transmission bandwidth, however, is often quite limited, which makes existing 8K video coding methods difficult to apply directly. How to encode 8K large-scene video efficiently at a low code rate is therefore a difficulty the industry currently needs to solve.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides an 8K video coding method and device based on CTU-level downsampling.

The aim of the invention is realized by the following technical scheme:

a first aspect of the present invention provides an 8K video coding method based on CTU-level downsampling, comprising:

encoding the original 8K video sequence by CTU to obtain a first reconstructed CTU block;

placing the first reconstructed CTU block into an encoded image buffer;

downsampling an original 8K video sequence to a first resolution to obtain a first downsampled video sequence;

encoding the first downsampled video sequence CTU by CTU to obtain a second reconstructed CTU block;

performing super-resolution reconstruction on the second reconstructed CTU block to obtain a third reconstructed CTU block with 8K resolution;

replacing a first reconstructed CTU block conforming to a first preset rule in the coded image buffer area with a corresponding third reconstructed CTU block;

and combining the code streams of the CTU blocks in the coded image buffer into a final code stream.

Further, replacing the first reconstructed CTU block in the encoded image buffer meeting the first preset rule with a corresponding third reconstructed CTU block, including:

comparing the rate distortion performance of each first reconstructed CTU block and the corresponding third reconstructed CTU block;

and if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding third reconstructed CTU block, replacing the first reconstructed CTU block in the encoded image buffer with the corresponding third reconstructed CTU block.

Further, the 8K video encoding method further includes:

downsampling the original 8K video sequence to a second resolution to obtain a second downsampled video sequence;

encoding the second downsampled video sequence CTU by CTU to obtain a fourth reconstructed CTU block;

performing super-resolution reconstruction on the fourth reconstructed CTU block to obtain a fifth reconstructed CTU block with 8K resolution;

replacing the first reconstructed CTU block conforming to the second preset rule in the coded image buffer area with a corresponding fifth reconstructed CTU block;

wherein the second resolution is greater than the first resolution.

Further, replacing the first reconstructed CTU block in the encoded image buffer meeting the second preset rule with a corresponding fifth reconstructed CTU block, comprising:

for first reconstructed CTU blocks in the encoded image buffer that are not replaced by third reconstructed CTU blocks, respectively comparing rate-distortion performance of each first reconstructed CTU block with that of a corresponding fifth reconstructed CTU block;

and if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding fifth reconstructed CTU block, replacing the first reconstructed CTU block in the encoded image buffer with the corresponding fifth reconstructed CTU block.

Further, the first resolution is 1080P and the second resolution is 4K.

A second aspect of the present invention provides an 8K video encoding apparatus based on CTU-level downsampling, comprising:

a first CTU encoding module, configured to encode the original 8K video sequence CTU by CTU to obtain a first reconstructed CTU block, and to place the first reconstructed CTU block into an encoded image buffer;

a first downsampling module, configured to downsample the original 8K video sequence to a first resolution to obtain a first downsampled video sequence;

a second CTU encoding module, configured to encode the first downsampled video sequence CTU by CTU to obtain a second reconstructed CTU block;

a first super-resolution module, configured to perform super-resolution reconstruction on the second reconstructed CTU block to obtain a third reconstructed CTU block with 8K resolution;

a first updating module, configured to replace a first reconstructed CTU block in the encoded image buffer that meets a first preset rule with the corresponding third reconstructed CTU block;

and a reorganization module, configured to combine the code streams of the CTU blocks in the encoded image buffer into a final code stream.

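Further, the first updating module is specifically configured to:

compare the rate-distortion performance of each first reconstructed CTU block with that of the corresponding third reconstructed CTU block;

and, if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding third reconstructed CTU block, replace the first reconstructed CTU block in the encoded image buffer with the corresponding third reconstructed CTU block.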

Further, the 8K video encoding apparatus further includes:

a second downsampling module, configured to downsample the original 8K video sequence to a second resolution to obtain a second downsampled video sequence, wherein the second resolution is greater than the first resolution;

a third CTU encoding module, configured to encode the second downsampled video sequence CTU by CTU to obtain a fourth reconstructed CTU block;

a second super-resolution module, configured to perform super-resolution reconstruction on the fourth reconstructed CTU block to obtain a fifth reconstructed CTU block with 8K resolution;

and a second updating module, configured to replace a first reconstructed CTU block in the encoded image buffer that meets a second preset rule with the corresponding fifth reconstructed CTU block.

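Further, the second updating module is specifically configured to:

for the first reconstructed CTU blocks in the encoded image buffer that have not been replaced by third reconstructed CTU blocks, compare the rate-distortion performance of each first reconstructed CTU block with that of the corresponding fifth reconstructed CTU block;

and, if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding fifth reconstructed CTU block, replace the first reconstructed CTU block in the encoded image buffer with the corresponding fifth reconstructed CTU block.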

Further, the first resolution is 1080P and the second resolution is 4K.

The beneficial effects of the invention are as follows: the invention downsamples the 8K video to a low resolution for encoding, super-resolves the result back to 8K resolution, and then makes a rate-distortion-based selection between that result and the original encoding result. Whenever a super-resolved coding block is selected, the code stream is necessarily reduced and the block also shows better rate-distortion performance. The size of the code stream is thereby reduced while a certain coding quality is maintained, so that 8K video can be transmitted under limited bandwidth.

Drawings

FIG. 1 is a flowchart of an embodiment of an 8K video encoding method according to the present invention;

FIG. 2 is a block diagram of an embodiment of an 8K video encoding device according to the present invention.

Detailed Description

The technical solutions of the present invention will be clearly and completely described below with reference to the embodiments, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by a person skilled in the art without any inventive effort, are intended to be within the scope of the present invention, based on the embodiments of the present invention.

Referring to FIG. 1 and FIG. 2, the present invention provides an 8K video coding method and apparatus based on CTU-level downsampling.

A first aspect of the present embodiment discloses an 8K video coding method based on CTU-level downsampling. As shown in FIG. 1, the 8K video encoding method includes S100 to S700.

S100, encoding the original 8K video sequence CTU by CTU to obtain a first reconstructed CTU block.

A CTU (Coding Tree Unit) is the basic unit of video coding. During coding, a whole frame is not coded directly; instead, the frame is divided into small blocks and one block is coded at a time. Each such block is a CTU. In some embodiments, the CTU size is 64×64; in other embodiments, a CTU may also be recursively partitioned into smaller blocks according to a quadtree.
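As an illustration of the CTU partitioning described above, the following minimal sketch computes the CTU grid for one frame. It assumes an 8K frame of 7680×4320 luma samples (the text does not state the exact frame size) and 64×64 CTUs; the function names are hypothetical and not part of any encoder API.

```python
import math

def ctu_grid(width, height, ctu_size=64):
    """Return (columns, rows) of CTUs needed to cover a frame."""
    return math.ceil(width / ctu_size), math.ceil(height / ctu_size)

def ctu_rects(width, height, ctu_size=64):
    """Yield (x, y, w, h) for every CTU in raster order; edge CTUs are clipped."""
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            yield x, y, min(ctu_size, width - x), min(ctu_size, height - y)

# Assumed 8K frame size: 7680x4320.
cols, rows = ctu_grid(7680, 4320)
print(cols, rows, cols * rows)  # 120 68 8160
```

Because 4320 is not a multiple of 64, the bottom row of CTUs in this sketch covers only 32 sample rows.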

Specifically, an original 8K video sequence to be encoded is received first, and CTUs in the original 8K video sequence are encoded one by one, so as to obtain a plurality of first reconstructed CTU blocks.

S200, placing the first reconstructed CTU block into an encoded image buffer area.

All first reconstructed CTU blocks obtained by encoding the CTUs are stored in a Coded Picture Buffer (CPB).

S300, downsampling an original 8K video sequence to a first resolution to obtain a first downsampled video sequence.

The original 8K video sequence is downsampled to a lower-resolution video sequence; the first resolution is less than the original 8K resolution. For example, the first resolution is 1920×1080. The original 8K video sequence may also be downsampled to other resolutions as needed: the larger the downsampling factor, the smaller the encoded code stream, but the worse the corresponding quality may be.
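The downsampling filter is not specified in the text. As a minimal sketch, assuming a simple block-average (box) filter and an integer downsampling factor, one luma plane could be reduced as follows. Going from 7680×4320 to 1920×1080 corresponds to a factor of 4 per axis. The function name is hypothetical.

```python
def box_downsample(frame, factor):
    """Downsample a 2-D list of samples by an integer factor using block averaging."""
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be multiples of the factor")
    area = factor * factor
    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            total = sum(frame[y + dy][x + dx]
                        for dy in range(factor) for dx in range(factor))
            row.append((total + area // 2) // area)  # integer mean, rounded to nearest
        out.append(row)
    return out

# Tiny 8x8 stand-in for a frame, reduced by the same factor (4) as 8K -> 1080P.
tiny = [[(x + y) % 256 for x in range(8)] for y in range(8)]
small = box_downsample(tiny, 4)
print(len(small), len(small[0]))  # 2 2
```

A production encoder would more likely use a longer anti-aliasing filter; the box filter is chosen here only because it is short and easy to verify.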

S400, encoding the first downsampled video sequence CTU by CTU to obtain a second reconstructed CTU block.

Specifically, CTUs in the first downsampled video sequence are encoded one by one, so as to obtain a plurality of second reconstructed CTU blocks.

S500, performing super-resolution reconstruction on the second reconstructed CTU block to obtain a third reconstructed CTU block with 8K resolution.

Super-resolution reconstruction is performed on each second reconstructed CTU block separately to obtain a third reconstructed CTU block at the original resolution, that is, at 8K resolution.
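The text does not name the super-resolution algorithm, nor how low-resolution CTUs are matched to 8K CTUs. The sketch below is therefore only a stand-in: it uses nearest-neighbour upsampling in place of a real super-resolution model, and it assumes that a low-resolution CTU at grid position (col, row) covers a factor×factor group of full-resolution CTUs. Both function names are hypothetical.

```python
def upsample_nearest(block, factor):
    """Stand-in for super-resolution: repeat each sample factor x factor times."""
    out = []
    for row in block:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def covered_full_res_ctus(col, row, factor):
    """Assumed mapping: full-resolution CTU grid positions covered by one low-resolution CTU."""
    return [(col * factor + dx, row * factor + dy)
            for dy in range(factor) for dx in range(factor)]

print(upsample_nearest([[1, 2], [3, 4]], 2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
print(len(covered_full_res_ctus(1, 0, 4)))  # 16
```

Under this assumption, one 64×64 CTU of a 1080P sequence, once restored to 8K, spans a 256×256 area, that is, 16 CTUs of the 8K frame.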

S600, replacing the first reconstructed CTU block which accords with the first preset rule in the coded image buffer area with a corresponding third reconstructed CTU block.

In some embodiments, replacing a first reconstructed CTU block in the encoded image buffer that meets a first preset rule with a corresponding third reconstructed CTU block comprises: calculating the rate-distortion performance of the first reconstructed CTU blocks and of the third reconstructed CTU blocks; comparing the rate-distortion performance of each first reconstructed CTU block with that of the corresponding third reconstructed CTU block; and, if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding third reconstructed CTU block, replacing the first reconstructed CTU block in the encoded image buffer with the corresponding third reconstructed CTU block.
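The comparison can be sketched as follows, assuming that "rate-distortion performance" is measured by the usual Lagrangian cost J = D + λ·R (lower is better). The text does not give the cost function, so this formula, the `Candidate` type, and the function names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    distortion: float  # e.g. squared error against the original 8K CTU (assumed metric)
    bits: int          # size of the CTU's code stream
    source: str        # "first" (direct 8K) or "third" (downsampled, then super-resolved)

def rd_cost(candidate, lam):
    """Assumed Lagrangian rate-distortion cost: J = D + lambda * R."""
    return candidate.distortion + lam * candidate.bits

def update_buffer(buffer, third_blocks, lam):
    """Replace buffer entries whose RD cost is worse (higher) than the third block's."""
    replaced = 0
    for i, third in enumerate(third_blocks):
        if rd_cost(buffer[i], lam) > rd_cost(third, lam):
            buffer[i] = third
            replaced += 1
    return replaced

buffer = [Candidate(100, 1000, "first"), Candidate(50, 200, "first")]
third = [Candidate(300, 100, "third"), Candidate(400, 150, "third")]
print(update_buffer(buffer, third, lam=1.0))  # 1
print([c.source for c in buffer])             # ['third', 'first']
```

In this toy data the first CTU is replaced (cost 1100 versus 400) while the second is kept (cost 250 versus 550).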

The method of this embodiment encodes at low resolution, which yields a smaller code stream; the low-resolution reconstruction is super-resolved back to 8K resolution and compared with the original reconstructed block on rate-distortion performance, thereby achieving the transmission of ultra-high-definition video at a low code rate. The method does not rely on video content characteristics, and because a block is replaced only when the replacement has better rate-distortion performance, the final rate-distortion performance improves. In addition, to further reduce the encoded code rate, the rate-distortion function model can be modified: by changing the weights of code rate and distortion, the encoder can be made to select more super-resolved reconstructed blocks, and the code stream is reduced accordingly.
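The effect of re-weighting rate against distortion can be illustrated with a small sketch. It again assumes a Lagrangian cost J = D + λ·R, which the text does not specify; the numbers are made-up toy values, not measurements, and the function name is hypothetical.

```python
def count_sr_selected(pairs, lam):
    """pairs: list of ((D_full, R_full), (D_sr, R_sr)); count CTUs where the
    super-resolved candidate has the lower assumed cost D + lam * R."""
    return sum(1 for (d_full, r_full), (d_sr, r_sr) in pairs
               if d_full + lam * r_full > d_sr + lam * r_sr)

# Toy values: super-resolved candidates use fewer bits but have higher distortion.
pairs = [((100, 1000), (300, 100)),
         ((50, 200), (400, 150)),
         ((80, 900), (500, 300))]

for lam in (0.1, 1.0, 10.0):
    print(lam, count_sr_selected(pairs, lam))
# 0.1 0
# 1.0 2
# 10.0 3
```

Increasing the weight on rate makes the bit savings of the super-resolved candidates count for more, so more of them are selected and the total code stream shrinks, at the price of higher distortion.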

S700, combining the code streams of the CTU blocks in the coded image buffer into a final code stream.

In some embodiments, combining the code streams of the CTU blocks in the encoded image buffer into a final code stream includes: combining the code streams of the first reconstructed CTU blocks and of the third reconstructed CTU blocks in the encoded image buffer into the final code stream.

In some embodiments, the 8K video encoding method further comprises: downsampling the original 8K video sequence to a second resolution to obtain a second downsampled video sequence; encoding the second downsampled video sequence CTU by CTU to obtain a fourth reconstructed CTU block; performing super-resolution reconstruction on the fourth reconstructed CTU block to obtain a fifth reconstructed CTU block with 8K resolution; and replacing a first reconstructed CTU block in the encoded image buffer that meets a second preset rule with the corresponding fifth reconstructed CTU block; wherein the second resolution is greater than the first resolution.

In some embodiments, combining the code streams of the CTU blocks in the encoded image buffer into a final code stream includes: combining the code streams of the first reconstructed CTU blocks, of the third reconstructed CTU blocks, and of the fifth reconstructed CTU blocks in the encoded image buffer into the final code stream.
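The text does not describe the syntax of the final code stream. Purely as an illustration, the sketch below concatenates per-CTU code streams in raster order, prefixing each with a hypothetical 1-byte source tag and a 4-byte length so that a decoder could tell which CTUs need super-resolution. The tag values and the container layout are assumptions, not part of the described method or of the HEVC bitstream syntax.

```python
import struct

# Hypothetical tags identifying where each CTU's code stream came from.
SOURCE_TAGS = {"first": 0, "third": 1, "fifth": 2}

def combine_streams(entries):
    """entries: list of (source, payload_bytes) in raster order -> one byte string."""
    out = bytearray()
    for source, payload in entries:
        out += struct.pack(">BI", SOURCE_TAGS[source], len(payload))  # 5-byte header
        out += payload
    return bytes(out)

def split_streams(data):
    """Inverse of combine_streams, used here only to check the round trip."""
    names = {tag: name for name, tag in SOURCE_TAGS.items()}
    entries, pos = [], 0
    while pos < len(data):
        tag, size = struct.unpack_from(">BI", data, pos)
        pos += 5
        entries.append((names[tag], data[pos:pos + size]))
        pos += size
    return entries

entries = [("first", b"\x01\x02\x03"), ("third", b"\x04"), ("fifth", b"\x05\x06")]
stream = combine_streams(entries)
print(len(stream))  # 21
```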

In some embodiments, replacing a first reconstructed CTU block in the encoded image buffer that meets a second preset rule with a corresponding fifth reconstructed CTU block comprises: for the first reconstructed CTU blocks in the encoded image buffer that have not been replaced by third reconstructed CTU blocks, comparing the rate-distortion performance of each first reconstructed CTU block with that of the corresponding fifth reconstructed CTU block; and, if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding fifth reconstructed CTU block, replacing the first reconstructed CTU block in the encoded image buffer with the corresponding fifth reconstructed CTU block.

In some embodiments, downsampling produces two downsampled video sequences of different resolutions; for example, a first downsampled video sequence at 1080P and a second downsampled video sequence at 4K are obtained from the original 8K video sequence. When rate-distortion performance is compared, the third reconstructed CTU blocks obtained from the first downsampled video sequence are first compared with the first reconstructed CTU blocks in the encoded image buffer. The first reconstructed CTU blocks in the encoded image buffer that were not replaced by third reconstructed CTU blocks are then compared with the fifth reconstructed CTU blocks obtained from the second downsampled video sequence, which yields the final encoding result.
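The two-stage comparison order can be sketched as follows, once more assuming a Lagrangian cost D + λ·R as the measure of rate-distortion performance. Each candidate is a (distortion, bits) pair with made-up values, and the function name is hypothetical. Note that a CTU already replaced by a third block is not compared against the fifth block, as the text describes.

```python
def cascade_select(first, third, fifth, lam):
    """Return, per CTU position, which reconstructed block ends up in the buffer."""
    def cost(candidate):
        distortion, bits = candidate
        return distortion + lam * bits  # assumed RD cost

    chosen = []
    for f, t, v in zip(first, third, fifth):
        if cost(f) > cost(t):
            chosen.append("third")   # stage 1: 1080P-derived block wins
        elif cost(f) > cost(v):
            chosen.append("fifth")   # stage 2: only reached if stage 1 kept the first block
        else:
            chosen.append("first")   # direct 8K block is kept
    return chosen

first = [(100, 1000), (50, 200), (80, 900)]
third = [(300, 100), (400, 150), (2000, 300)]
fifth = [(150, 400), (60, 180), (500, 600)]
print(cascade_select(first, third, fifth, lam=1.0))
# ['third', 'fifth', 'first']
```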

In some embodiments, the method of the present embodiment may be run on the HEVC reference platform HM-16.20.

A second aspect of the present embodiment discloses an 8K video encoding apparatus based on CTU-level downsampling. As shown in FIG. 2, the 8K video encoding apparatus includes a first CTU encoding module, a first downsampling module, a second CTU encoding module, a first super-resolution module, a first updating module, and a reorganization module.

The first CTU encoding module is configured to encode the original 8K video sequence CTU by CTU to obtain a first reconstructed CTU block, and to place the first reconstructed CTU block into the encoded image buffer.

The first downsampling module is configured to downsample the original 8K video sequence to a first resolution to obtain a first downsampled video sequence.

The second CTU encoding module is configured to encode the first downsampled video sequence CTU by CTU to obtain a second reconstructed CTU block.

The first super-resolution module is configured to perform super-resolution reconstruction on the second reconstructed CTU block to obtain a third reconstructed CTU block with 8K resolution.

The first updating module is configured to replace a first reconstructed CTU block in the encoded image buffer that meets the first preset rule with the corresponding third reconstructed CTU block.

In some embodiments, the first updating module is specifically configured to: compare the rate-distortion performance of each first reconstructed CTU block with that of the corresponding third reconstructed CTU block; and, if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding third reconstructed CTU block, replace the first reconstructed CTU block in the encoded image buffer with the corresponding third reconstructed CTU block.

In some embodiments, the 8K video encoding apparatus further comprises a second downsampling module, a third CTU encoding module, a second super-resolution module, and a second updating module.

The second downsampling module is configured to downsample the original 8K video sequence to a second resolution to obtain a second downsampled video sequence, wherein the second resolution is greater than the first resolution.

The third CTU encoding module is configured to encode the second downsampled video sequence CTU by CTU to obtain a fourth reconstructed CTU block.

The second super-resolution module is configured to perform super-resolution reconstruction on the fourth reconstructed CTU block to obtain a fifth reconstructed CTU block with 8K resolution.

In some embodiments, the second updating module is specifically configured to: for the first reconstructed CTU blocks in the encoded image buffer that have not been replaced by third reconstructed CTU blocks, compare the rate-distortion performance of each first reconstructed CTU block with that of the corresponding fifth reconstructed CTU block; and, if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding fifth reconstructed CTU block, replace the first reconstructed CTU block in the encoded image buffer with the corresponding fifth reconstructed CTU block.

It should be noted that, for convenience and brevity, a person skilled in the art will clearly understand that, for the specific working process of the system or apparatus described above, reference may be made to the corresponding process in the foregoing method embodiment, and the description will not be repeated here.

The foregoing is merely a preferred embodiment of the invention. It is to be understood that the invention is not limited to the form disclosed herein, and that this disclosure is not to be construed as excluding other embodiments; the invention may be used in various other combinations, modifications, and environments, and may be modified within the scope of the inventive concept described herein, through the above teachings or through the skill or knowledge of the relevant art. Modifications and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall fall within the protection scope of the appended claims.

Claims

1. An 8K video coding method based on CTU-level downsampling, comprising:

placing the first reconstructed CTU block into an encoded image buffer;

combining the code streams of the CTU blocks in the coded image buffer into a final code stream;

replacing a first reconstructed CTU block in the encoded image buffer meeting a first preset rule with a corresponding third reconstructed CTU block, comprising:

if the rate-distortion performance of a first reconstructed CTU block is worse than that of the corresponding third reconstructed CTU block, replacing the first reconstructed CTU block in the encoded image buffer with the corresponding third reconstructed CTU block;

the 8K video encoding method further includes:

wherein the second resolution is greater than the first resolution;

replacing a first reconstructed CTU block in the encoded image buffer meeting a second preset rule with a corresponding fifth reconstructed CTU block, comprising:

2. The CTU-level downsampling-based 8K video encoding method of claim 1, wherein the first resolution is 1080P and the second resolution is 4K.

3. An 8K video encoding device based on CTU-level downsampling, comprising:

the reorganization module is used for combining the code streams of the CTU blocks in the coded image buffer area into a final code stream;

the first updating module is specifically configured to:

the 8K video encoding device further includes:

the second updating module is used for replacing the first reconstructed CTU block which accords with the preset rule in the coded image buffer zone with a corresponding fifth reconstructed CTU block;

the second updating module is specifically configured to:

4. A CTU-level downsampling based 8K video encoding device according to claim 3, wherein the first resolution is 1080P and the second resolution is 4K.