CN111669583A - Image prediction method, device, equipment, system and storage medium - Google Patents


Info

Publication number
CN111669583A
Authority
CN
China
Prior art keywords
current node
block
prediction
coding
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910696741.0A
Other languages
Chinese (zh)
Inventor
赵寅
杨海涛
陈建乐
张恋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to BR112021003269-0A priority Critical patent/BR112021003269A2/en
Priority to JP2021510741A priority patent/JP7204891B2/en
Priority to AU2019333452A priority patent/AU2019333452B2/en
Priority to PCT/CN2019/103094 priority patent/WO2020043136A1/en
Priority to ES19855934T priority patent/ES2966509T3/en
Priority to KR1020217008065A priority patent/KR102631517B1/en
Priority to CA3110477A priority patent/CA3110477C/en
Priority to HUE19855934A priority patent/HUE064218T2/en
Priority to KR1020247003066A priority patent/KR20240017109A/en
Priority to NZ773632A priority patent/NZ773632A/en
Priority to PT198559346T priority patent/PT3836542T/en
Priority to MX2021002396A priority patent/MX2021002396A/en
Priority to EP23200770.8A priority patent/EP4387224A1/en
Priority to EP19855934.6A priority patent/EP3836542B1/en
Priority to MX2021008340A priority patent/MX2021008340A/en
Priority to CN202080001551.3A priority patent/CN112075077B/en
Priority to CN202111475069.6A priority patent/CN114173114B/en
Priority to KR1020237043658A priority patent/KR20240005108A/en
Priority to KR1020217025090A priority patent/KR102616713B1/en
Priority to PCT/CN2020/070976 priority patent/WO2020143684A1/en
Priority to CN202111468095.6A priority patent/CN114245113B/en
Priority to EP20738949.5A priority patent/EP3907988A4/en
Priority to AU2020205376A priority patent/AU2020205376B2/en
Priority to CN202111467815.7A priority patent/CN114157864B/en
Priority to CA3125904A priority patent/CA3125904A1/en
Priority to JP2021539883A priority patent/JP7317973B2/en
Priority to BR112021013444-1A priority patent/BR112021013444A2/en
Publication of CN111669583A publication Critical patent/CN111669583A/en
Priority to PH12021550378A priority patent/PH12021550378A1/en
Priority to CL2021000494A priority patent/CL2021000494A1/en
Priority to US17/187,184 priority patent/US11323708B2/en
Priority to ZA2021/01354A priority patent/ZA202101354B/en
Priority to IL281144A priority patent/IL281144A/en
Priority to US17/369,350 priority patent/US11388399B2/en
Priority to US17/734,829 priority patent/US11758134B2/en
Priority to US17/843,798 priority patent/US11849109B2/en
Priority to JP2022212121A priority patent/JP2023038229A/en
Priority to JP2023117695A priority patent/JP2023134742A/en
Priority to US18/360,639 priority patent/US20230370597A1/en
Priority to AU2023229600A priority patent/AU2023229600A1/en
Priority to US18/503,304 priority patent/US20240146909A1/en
Pending legal-status Critical Current

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 — … using transform coding
    • H04N19/61 — … using transform coding in combination with predictive coding
    • H04N19/10 — … using adaptive coding
    • H04N19/102 — … characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 — Selection of coding mode or of prediction mode
    • H04N19/107 — Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/12 — Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 — Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
    • H04N19/13 — Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/169 — … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 — … the unit being an image region, e.g. an object
    • H04N19/176 — … the region being a block, e.g. a macroblock
    • H04N19/186 — … the unit being a colour or a chrominance component

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application provides an image prediction method, device, equipment, system, and storage medium. The method includes the following steps: acquiring the division mode of a current node, and determining whether dividing the current node based on the division mode would produce an image block of a preset size, where the image block includes a luminance block or a chrominance block. When dividing the current node based on the division mode would produce an image block of the preset size, intra prediction is used for all coding blocks covered by the current node, or inter prediction is used for all coding blocks covered by the current node. Because the method uses intra prediction or inter prediction for all the coding blocks of the current node, those coding blocks can be processed in parallel, which improves the processing performance of image prediction and thereby the processing speed of encoding and decoding.

Description

Image prediction method, device, equipment, system and storage medium
Technical Field
The embodiments of this application relate to the technical field of video encoding and decoding, and in particular to an image prediction method, device, equipment, system, and storage medium.
Background
Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, tablet computers, electronic book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video gaming consoles, cellular or satellite radio telephones (so-called "smart phones"), video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 part 10 Advanced Video Coding (AVC), the video coding standard H.265/High Efficiency Video Coding (HEVC), and extensions of such standards. Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
With the development of information technology, video services such as high-definition television, web conferencing, IPTV, and 3D television are developing rapidly, and the video signal, with advantages such as intuitiveness and efficiency, has become the most important way for people to obtain information in daily life. Because a video signal contains a large amount of data, it occupies a large amount of transmission bandwidth and storage space. To transmit and store video signals effectively, compression coding needs to be performed on them, and video compression technology has become an indispensable key technology in the field of video applications.
The encoding process mainly includes intra prediction, inter prediction, transform, quantization, entropy encoding, and in-loop filtering (mainly deblocking filtering). The image is divided into coding blocks; intra or inter prediction is then performed; the residual obtained is transformed and quantized; and finally entropy coding is performed and the bitstream is output. Here, a coding block is an M × N array of pixels (M may or may not equal N) in which the pixel value at each pixel position is known. Video decoding is the inverse of video encoding. For example, residual information is obtained through entropy decoding, inverse quantization, and inverse transform, and whether the current block uses intra prediction or inter prediction is determined from the decoded bitstream. If the block is intra-coded, a prediction block is constructed, according to the intra prediction method used, from the pixel values of the pixels in the reconstructed region around the current block. If the block is inter-coded, the motion information is parsed, a reference block is determined in a reconstructed picture using the parsed motion information, and the pixel values of the pixels in the reference block serve as the prediction block (this process is called motion compensation (MC)). The reconstruction information is then obtained by a filtering operation using the prediction block and the residual information.
Currently, a node of size 8xM (or Mx8) divided by vertical (or horizontal) binary division yields two child nodes of size 4xM (or Mx4). Similarly, a node of size 16xM (or Mx16) divided by vertical (or horizontal) ternary division yields two child nodes of size 4xM (or Mx4) and one child node of size 8xM (or Mx8). For the YUV 4:2:0 data format, the resolution of each chrominance component is half that of the luminance component in each dimension; that is, a 4xM node contains one 4xM luminance block and two 2x(M/2) chrominance blocks. Dividing the current node by the preset division modes can therefore produce small chrominance blocks of size 2x2, 2x4, 4x2, and so on. For a hardware decoder, the processing complexity of these small chrominance blocks is high, specifically in the following three respects.
1) Intra prediction problem: to increase processing speed, hardware designs generally process 16 pixels at a time in intra prediction, but small chrominance blocks such as 2x2, 2x4, and 4x2 contain fewer than 16 pixels, which reduces the processing performance of intra prediction.
2) Coefficient coding problem: transform coefficient coding in HEVC is based on a coefficient group (CG) containing 16 coefficients, whereas small chrominance blocks of 2x2, 2x4, and 4x2 contain only 4 or 8 transform coefficients; supporting coefficient coding for these small blocks requires adding coefficient groups containing 4 and 8 coefficients, which increases implementation complexity.
3) Inter prediction problem: inter prediction of small chrominance blocks places high demands on data bandwidth and also slows down decoding.
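For orientation (this is our own illustration, not part of the patent text), the chrominance dimensions implied by a luminance block under YUV 4:2:0 can be computed by halving each luminance dimension, which is how the 2x2, 2x4, and 4x2 chrominance blocks above arise:

```python
def chroma_size_420(luma_w: int, luma_h: int) -> tuple:
    """Chrominance block size (w, h) for a luminance block under YUV 4:2:0,
    where each chrominance component has half the luminance resolution
    per dimension."""
    return (luma_w // 2, luma_h // 2)

# A 4xM child node (e.g. from vertically bisecting an 8xM node) yields
# 2x(M/2) chrominance blocks, which fall below 16 pixels for small M:
assert chroma_size_420(4, 8) == (2, 4)
assert chroma_size_420(4, 4) == (2, 2)
assert chroma_size_420(8, 4) == (4, 2)
```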
Disclosure of Invention
The application provides an image prediction method, device, equipment, system and storage medium, which improve the processing performance of image prediction and improve the processing speed of encoding and decoding.
A first aspect of the present application provides an image prediction method, including:
acquiring the division mode of a current node; determining whether dividing the current node based on the division mode would produce an image block of a preset size, where the image block includes a luminance block or a chrominance block; and when it is determined that dividing the current node based on the division mode would produce an image block of the preset size, using intra prediction for all coding blocks covered by the current node, or using inter prediction for all coding blocks covered by the current node.
Alternatively, the image block of the preset size may be a luminance block with a size smaller than a threshold, where the threshold may be 128, 64, or 32 luminance sample points, or 32, 16, or 8 chrominance sample points. The size of the current node may be greater than or equal to the threshold.
Alternatively, the intra prediction may be performed using a normal intra mode, or may be performed using the IBC (intra block copy) mode.
Optionally, all the coding blocks covered by the current node refer to all the coding blocks located in the area of the current node. A coding block may also be a coding unit (CU).
Optionally, in a case where the type (slice type) of the slice in which the current node is located is an Intra (Intra) type, Intra prediction is used for all coding blocks covered by the current node, and inter prediction is not used.
The beneficial effects of the embodiments of this application are as follows: the method accounts for the case in which dividing the image block corresponding to the current node would produce a luminance block or chrominance block of the preset size. In that case, the encoder or decoder uses intra prediction, or uses inter prediction, for all coding blocks obtained (with or without further division) from the current node as the root node. This enables parallel processing of the preset-size luminance or chrominance blocks, improves the processing performance of image prediction, and thereby improves encoding and decoding performance.
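The decision of the first aspect can be sketched as follows (a hedged illustration; all names and the data representation are ours, not the patent's): when dividing the current node would produce a preset-size block, one prediction type is forced on every coding block covered by the node.

```python
def predict_current_node(split_yields_preset_size: bool,
                         covered_blocks: list,
                         node_mode: str) -> dict:
    """Assign a prediction mode to each coding block covered by the node.

    If the division would produce a preset-size image block, the same mode
    ("intra" or "inter") is used for every covered block, enabling
    parallel processing of those blocks."""
    if split_yields_preset_size:
        return {block: node_mode for block in covered_blocks}
    # Otherwise each block may signal its own mode (not modelled here).
    return {}

modes = predict_current_node(True, ["cb0", "cb1", "cb2"], "intra")
assert all(m == "intra" for m in modes.values())
```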
Optionally, both of the following cases fall under the image block of the preset size. The using intra prediction for all coding blocks covered by the current node, or using inter prediction for all coding blocks covered by the current node, includes: determining whether dividing the current node based on the division mode would produce a luminance block of a first preset size; when it is determined that dividing the current node based on the division mode would produce a luminance block of the first preset size, using intra prediction for all coding blocks covered by the current node; and when it is determined that dividing the current node based on the division mode would not produce a luminance block of the first preset size, using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node.
Optionally, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node, when it is determined that dividing the current node based on the division mode would not produce a luminance block of the first preset size, may include: parsing the prediction mode state identifier of the current node; when the value of the prediction mode state identifier is a first value, using inter prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, using intra prediction for all coding blocks covered by the current node.
With reference to the first aspect, in a first possible implementation of the first aspect, the determining whether dividing the current node based on the division mode would produce an image block of a preset size includes: determining, according to the size of the current node and the division mode, whether dividing the current node based on the division mode would produce a luminance block of a first preset size.
Alternatively, the first preset-sized luminance block may be a luminance block having a pixel size of 4 × 4, or 8 × 8, or a luminance block having an area of 16 or 32.
Optionally, when the luminance block of the first preset size has a pixel size of 4 × 4 or an area of 16, determining whether the luminance block of the first preset size is obtained by dividing the current node based on the dividing method according to the size of the current node and the dividing method may include:
1) the number of sample points of the luminance block of the current node is 64 and the division mode is quadtree division; or
2) the number of sample points of the luminance block of the current node is 64 and the division mode is ternary tree division; or
3) the number of sample points of the luminance block of the current node is 32 and the division mode is binary tree division.
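The three conditions just listed can be sketched as a single check (an illustrative sketch; the division-mode names are our own, not from any codec API):

```python
def split_yields_4x4_luma(luma_samples: int, division: str) -> bool:
    """True if dividing a node with `luma_samples` luminance sample points
    in the given division mode would produce a luminance block of the
    first preset size (4x4 pixels / area 16), per the three conditions
    above."""
    return ((luma_samples == 64 and division == "quadtree") or
            (luma_samples == 64 and division == "ternary") or
            (luma_samples == 32 and division == "binary"))

assert split_yields_4x4_luma(64, "quadtree")
assert split_yields_4x4_luma(64, "ternary")
assert split_yields_4x4_luma(32, "binary")
assert not split_yields_4x4_luma(64, "binary")
```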
With reference to the first possible implementation manner of the first aspect, optionally, when it is determined that an image block with the preset size is obtained by dividing the current node based on the dividing manner, using intra prediction for all coding blocks covered by the current node, or using inter prediction for all coding blocks covered by the current node, includes: and under the condition that the brightness block with the first preset size is obtained by determining that the current node is divided based on the dividing mode, using intra-frame prediction for all coding blocks covered by the current node.
With reference to the first possible implementation manner of the first aspect, optionally, in a case that it is determined that partitioning the current node based on the partitioning manner does not result in a luminance block with a first preset size, the method further includes: judging whether the current node is divided based on the dividing mode to obtain a chrominance block with a second preset size; and under the condition that the chroma block with the second preset size is obtained by determining that the current node is divided based on the dividing mode, using intra-frame prediction for all coding blocks covered by the current node, or using inter-frame prediction for all coding blocks covered by the current node.
In summary, by determining that intra-frame prediction is used for all coding blocks which are divided or not divided by taking the current node as the root node or inter-frame prediction is used for all coding blocks, parallel processing of a luminance block or a chrominance block with a preset size can be realized, and the processing performance of image prediction is improved, so that the coding and decoding performance is improved.
Alternatively, the luminance block of the first preset size may be a 4 × 4 luminance block or a luminance block with an area of 16. When the luminance block of the first preset size is a 4 × 4 luminance block, the chrominance block of the second preset size may be a chrominance block with a pixel size of 2 × 4 or 4 × 2 or an area of 8, excluding a chrominance block with a pixel size of 2 × 2 or an area of 4.
Alternatively, the luminance block of the first preset size may be a 4 × 4 luminance block or a luminance block with an area of 16. When the luminance block of the first preset size is a 4 × 4 luminance block, the chrominance block of the second preset size may be a chrominance block whose size, measured in luminance samples, is 4 × 8 or 8 × 4 pixels or has an area of 32, excluding one whose size measured in luminance samples is 4 × 4 pixels or has an area of 16.
Optionally, when the chrominance block of the second preset size is a chrominance block with a pixel size of 2 × 4 or 4 × 2 or an area of 8, or one whose size measured in luminance samples is 4 × 8 or 8 × 4 pixels or has an area of 32, the determining whether dividing the current node based on the division mode would produce the chrominance block of the second preset size may include:
1) the number of sample points of the luminance block of the current node is 64 and the division mode is binary tree division; or
2) the number of sample points of the luminance block of the current node is 128 and the division mode is ternary tree division.
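The two conditions above can be sketched the same way (illustrative names only, not from any codec API):

```python
def split_yields_small_chroma(luma_samples: int, division: str) -> bool:
    """True if dividing the node would produce a chrominance block of the
    second preset size, per the two conditions above: 64 luminance
    samples with binary tree division, or 128 luminance samples with
    ternary tree division."""
    return ((luma_samples == 64 and division == "binary") or
            (luma_samples == 128 and division == "ternary"))

assert split_yields_small_chroma(64, "binary")
assert split_yields_small_chroma(128, "ternary")
assert not split_yields_small_chroma(64, "quadtree")
```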
With reference to the first aspect, in a second possible implementation manner of the first aspect, the determining whether the current node is divided based on the dividing manner to obtain an image block with a preset size includes: and determining whether the current node is divided based on the dividing mode to obtain a chrominance block with a second preset size or not according to the size and the dividing mode of the current node.
Alternatively, the second preset-sized chroma block may be a chroma block having a pixel size of 2 × 2, 2 × 4, or 4 × 2, or an area of 4 or 8.
Optionally, the determining, according to the size of the current node and the dividing manner, whether the chroma block with the second preset size is obtained by dividing the current node based on the dividing manner may include: and determining whether the current node is divided based on the dividing mode to obtain a brightness block with a third preset size or not according to the size and the dividing mode of the current node.
Alternatively, the luminance block having the third preset size may be a 4 × 4, 4 × 8, or 8 × 4 luminance block or a luminance block having an area of 32 or 16.
Optionally, the determining whether the current node is divided based on the dividing manner to obtain the chrominance block of the second preset size may include:
1) the number of sample points of the luminance block of the current node is 64 and the division mode is quadtree division; or
2) the number of sample points of the luminance block of the current node is 64 and the division mode is ternary tree division; or
3) the number of sample points of the luminance block of the current node is 32 and the division mode is binary tree division; or
4) the number of sample points of the luminance block of the current node is 64 and the division mode is binary tree division; or
5) the number of sample points of the luminance block of the current node is 128 and the division mode is ternary tree division.
Alternatively, the second preset-sized chroma block may be a chroma block having a pixel size of 2 × 4, or 4 × 2, or an area of 8, and does not include a chroma block having a pixel size of 2 × 2, or an area of 4. Similarly, the luminance block having the third preset size may be a luminance block having a pixel size of 4 × 8, or 8 × 4, or an area of 32, and does not include a luminance block having a pixel size of 4 × 4, or an area of 16. Correspondingly, the determining whether the current node is divided based on the dividing manner to obtain the chrominance block with the second preset size may include:
1) the number of sample points of the luminance block of the current node is 64 and the division mode is binary tree division; or
2) the number of sample points of the luminance block of the current node is 128 and the division mode is ternary tree division.
With reference to the first implementation manner or the second implementation manner, in a case that it is determined that a chroma block with a second preset size is obtained by dividing the current node based on the dividing manner, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node includes: analyzing the prediction mode state identification of the current node; when the value of the prediction mode state identifier is a first value, using inter-frame prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, intra-frame prediction is used for all coding blocks covered by the current node. The implementation mode is applied to a video decoder, and the prediction modes of all coding blocks which are obtained by dividing or not dividing by taking the current node as a root node are determined by analyzing the prediction mode state identification from the code stream.
Optionally, the type (slice type) of the slice in which the current node is located is not an Intra (Intra) type.
Based on the first implementation manner or the second implementation manner, in a case that it is determined that partitioning the current node based on the partitioning manner would result in a chroma block with a second preset size, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node includes: when the prediction mode of any coding block covered by the current node is inter-frame prediction, using inter-frame prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra-frame prediction, using intra-frame prediction for all coding blocks covered by the current node. Optionally, any coding block is a first coding block in a decoding order of all coding blocks covered by the current node. The implementation mode is applied to a video decoder, and all coding blocks obtained by dividing or not dividing the current node as a root node are predicted according to the analyzed prediction mode by analyzing the prediction mode of any one coding block of the current node from a code stream.
With reference to the second implementation manner, optionally, under the condition that it is determined that a chroma block with a second preset size is obtained by dividing the current node based on the dividing manner, the using intra prediction on all coding blocks covered by the current node or using inter prediction on all coding blocks covered by the current node includes: judging whether a brightness block with a first preset size is obtained by dividing the current node based on the dividing mode; and under the condition that the luminance block with the first preset size is obtained by determining that the current node is divided based on the dividing mode of the current node, using intra-frame prediction for all coding blocks covered by the current node. By the implementation mode, intra-frame prediction is determined to be used for all coding blocks which are divided or not divided by taking the current node as the root node, parallel processing of the brightness block with the first preset size and the chroma block with the second preset size can be achieved, processing performance of image prediction is improved, and coding and decoding performance is improved.
Optionally, under the condition that it is determined that the current node is divided based on the dividing manner and a luminance block of a first preset size is not obtained, using intra-frame prediction for all coding blocks covered by the current node or using inter-frame prediction for all coding blocks covered by the current node includes: analyzing the prediction mode state identification of the current node; when the value of the prediction mode state identifier is a first value, using inter-frame prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, intra-frame prediction is used for all coding blocks covered by the current node. The implementation mode is applied to a video decoder, and the prediction modes of all coding blocks which are obtained by dividing or not dividing by taking the current node as a root node are determined by analyzing the prediction mode state identification from the code stream.
Optionally, in a case where it is determined that no luminance block of the first preset size would be obtained by dividing the current node in the division manner, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node includes: when the prediction mode of any coding block covered by the current node is inter prediction, using inter prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra prediction, using intra prediction for all coding blocks covered by the current node. This implementation is applied to a video decoder: the prediction mode of one coding block of the current node is parsed from the bitstream, and all coding blocks obtained, with or without division, from the current node as a root node are predicted in the parsed prediction mode.
Optionally, the any coding block is the first coding block in decoding order among all coding blocks covered by the current node.
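The mode decision described in the foregoing implementations can be sketched as follows. This is an illustrative Python sketch, not part of the claims: the function name, the mapping of the first value to inter prediction (here 0) and the second value to intra prediction (here 1), and the argument names are all assumptions for illustration.

```python
def prediction_mode_for_node(produces_small_chroma, produces_small_luma,
                             mode_flag=None, first_block_mode=None):
    """Return 'intra' or 'inter' to apply to ALL coding blocks of the node,
    or None when no node-level restriction applies."""
    if not produces_small_chroma:
        return None  # no small chroma block: each block chooses its own mode
    if produces_small_luma:
        return "intra"  # a small luma block forces intra for the whole node
    if mode_flag is not None:
        # prediction mode status flag parsed from the bitstream
        # (assumed mapping: first value 0 -> inter, second value 1 -> intra)
        return "inter" if mode_flag == 0 else "intra"
    # otherwise inherit from the first coding block in decoding order
    return first_block_mode
```

A decoder would call this once per node and then apply the returned mode uniformly, which is what enables the parallel processing described above.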
With reference to the first aspect or any one of the possible implementations of the first aspect, in a third possible implementation of the first aspect, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node includes: dividing the luminance block included in the current node in the division manner to obtain divided luminance blocks, using intra prediction for the divided luminance blocks, using the chrominance block included in the current node as one chrominance coding block, and using intra prediction for the chrominance coding block; or, dividing the luminance block included in the current node in the division manner to obtain divided luminance blocks, using inter prediction for the divided luminance blocks, dividing the chrominance block included in the current node in the division manner to obtain divided chrominance blocks, and using inter prediction for the divided chrominance blocks. In this implementation, regardless of whether all the coding blocks covered by the current node use intra prediction or inter prediction, the luminance block of the current node is always divided; the chrominance block of the current node may be divided in the inter prediction mode, but is not divided in the intra prediction mode. Because this implementation does not generate a chrominance block of the second preset size that uses intra prediction, it resolves the problem of intra prediction for small chrominance blocks and thereby improves the processing speed of video coding.
With reference to the first aspect or any one of the possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node includes: dividing the luminance block included in the current node in the division manner to obtain divided luminance blocks, using intra prediction for the divided luminance blocks, using the chrominance block included in the current node as one chrominance coding block, and using intra prediction for the chrominance coding block; or, dividing the luminance block included in the current node in the division manner to obtain divided luminance blocks, using inter prediction for the divided luminance blocks, using the chrominance block included in the current node as one chrominance coding block, and using inter prediction for the chrominance coding block. In this implementation, regardless of whether all the coding blocks covered by the current node use intra prediction or inter prediction, the chrominance block of the current node is never divided, while the luminance block is divided in the luminance division manner. Because this implementation does not generate a chrominance block of the second preset size that uses intra prediction, it resolves the problem of intra prediction for small chrominance blocks and thereby improves the processing speed of video coding.
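The third and fourth implementations differ only in whether the chrominance block may follow the luminance split under inter prediction. A minimal sketch of that distinction, with illustrative names and an illustrative `variant` selector (3 or 4) that is not part of the claims:

```python
def split_node(split_fn, node, mode, variant):
    """split_fn maps a node to its child blocks under the node's division manner.
    Returns (luma_blocks, chroma_blocks) for the given prediction mode."""
    luma_blocks = split_fn(node)            # luma always follows the split
    if variant == 3 and mode == "inter":
        chroma_blocks = split_fn(node)      # 3rd impl.: inter may split chroma
    else:
        chroma_blocks = [node]              # chroma kept as one coding block
    return luma_blocks, chroma_blocks
```

In both variants, an intra-predicted node keeps its chrominance as a single coding block, so no intra chrominance block of the second preset size is ever produced.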
With reference to the first aspect or any one of the possible implementations of the first aspect, in a fifth possible implementation of the first aspect, in a case that inter prediction is used for all coding blocks covered by the current node, the using inter prediction for all coding blocks covered by the current node includes:
dividing the current node in the division manner of the current node to obtain child nodes of the current node; determining disallowed division manners of a child node of the current node according to the size of the child node; determining a block partitioning strategy of the child node according to the disallowed division manners of the child node; and obtaining, according to the block partitioning strategy of the child node, a coding block corresponding to the child node, and using inter prediction for the corresponding coding block. In this implementation, the generation of a luminance block of the first preset size under inter prediction can be avoided.
The child node may be obtained by dividing the current node once, or by dividing it N times, where N is an integer greater than 1.
The partitioning strategy may be no partitioning, a single partitioning, or N partitionings, where N is an integer greater than 1.
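The restriction on a child node's division manners can be sketched as a filter over candidate splits. Here the first preset size is assumed to be 4x4 luminance samples purely for illustration; the actual threshold and the candidate split set are defined by the codec, and all names are hypothetical.

```python
MIN_INTER_LUMA = (4, 4)  # assumed first preset size (width, height)

def allowed_splits(candidate_splits):
    """candidate_splits: mapping of split name -> list of (child_w, child_h).
    Keeps only splits that produce no luma block of the first preset size."""
    return {name: children
            for name, children in candidate_splits.items()
            if all((w, h) != MIN_INTER_LUMA for w, h in children)}
```

A partitioning strategy for the child node is then chosen only from the surviving entries, so inter prediction never encounters a luminance block of the first preset size.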
A second aspect of the present application provides an image prediction apparatus comprising:
an acquisition module, configured to acquire a division manner of a current node;
a judging module, configured to judge whether an image block of a preset size would be obtained by dividing the current node in the division manner, where the image block includes a luminance block or a chrominance block; and
an execution module, configured to use intra prediction for all coding blocks covered by the current node or use inter prediction for all coding blocks covered by the current node, in a case where it is determined that an image block of the preset size would be obtained by dividing the current node in the division manner.
A third aspect of the present application provides a video encoding apparatus comprising a processor and a memory for storing executable instructions of the processor; wherein the processor performs the method as described in the first aspect of the application.
A fourth aspect of the present application provides a video decoding device comprising a processor and a memory for storing executable instructions of the processor; wherein the processor performs a method as described in the first aspect of the application.
A fifth aspect of the present application provides an image prediction system, including a video capture device, the video encoding device according to the third aspect, the video decoding device according to the fourth aspect, and a display device, where the video encoding device is connected to the video capture device and to the video decoding device, and the video decoding device is connected to the display device.
A sixth aspect of the present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor to perform the method according to any of the first aspects of the present application.
The present application provides an image prediction method, apparatus, device, system, and storage medium. The method includes: acquiring a division manner of a current node, and judging whether an image block of a preset size would be obtained by dividing the current node in the division manner, where the image block includes a luminance block or a chrominance block. In a case where an image block of the preset size would be obtained by dividing the current node in the division manner, intra prediction is used for all coding blocks covered by the current node, or inter prediction is used for all coding blocks covered by the current node. Because the method uses intra prediction or inter prediction for all the coding blocks of the current node, all the coding blocks of the current node can be processed in parallel, which improves the processing performance of image prediction and thereby the processing speed of encoding and decoding.
Drawings
FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 for implementing embodiments of the present application;
FIG. 1B is a block diagram of an example of a video coding system 40 for implementing embodiments of the present application;
FIG. 2 is a block diagram of an example structure of an encoder 20 for implementing embodiments of the present application;
FIG. 3 is a block diagram of an example structure of a decoder 30 for implementing embodiments of the present application;
FIG. 4 is a block diagram of an example of a video coding apparatus 400 for implementing an embodiment of the present application;
FIG. 5 is a block diagram of another example of an encoding device or a decoding device for implementing embodiments of the present application;
FIG. 6 is a schematic block diagram of one manner of block partitioning for implementing embodiments of the present application;
FIG. 7 is a schematic block diagram of an intra prediction for implementing embodiments of the present application;
FIG. 8 is a schematic block diagram of a video communication system for implementing embodiments of the present application;
FIG. 9 is a schematic flowchart of a first image prediction method according to an embodiment of the present application;
FIG. 10 is a schematic flowchart of a second image prediction method according to an embodiment of the present application;
FIG. 11 is a schematic flowchart of a third image prediction method according to an embodiment of the present application;
FIG. 12 is a schematic flowchart of a fourth image prediction method according to an embodiment of the present application;
FIG. 13 is a schematic flowchart of a fifth image prediction method according to an embodiment of the present application;
FIG. 14 is a schematic flowchart of a sixth image prediction method according to an embodiment of the present application;
FIG. 15 is a functional structure diagram of an image prediction apparatus according to an embodiment of the present application;
FIG. 16 is a schematic hardware structure diagram of a video encoding apparatus according to an embodiment of the present application;
FIG. 17 is a schematic hardware structure diagram of a video decoding apparatus according to an embodiment of the present application;
FIG. 18 is a schematic structural diagram of an image prediction system according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described below with reference to the drawings. In the following description, reference is made to the accompanying drawings which form a part hereof and in which is shown by way of illustration specific aspects of embodiments of the application or in which specific aspects of embodiments of the application may be employed. It should be understood that embodiments of the present application may be used in other ways and may include structural or logical changes not depicted in the drawings. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present application is defined by the appended claims. For example, it should be understood that the disclosure in connection with the described methods may equally apply to the corresponding apparatus or system for performing the methods, and vice versa. For example, if one or more particular method steps are described, the corresponding apparatus may comprise one or more units, such as functional units, to perform the described one or more method steps (e.g., a unit performs one or more steps, or multiple units, each of which performs one or more of the multiple steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a particular apparatus is described based on one or more units, such as functional units, the corresponding method may comprise one step to perform the functionality of the one or more units (e.g., one step performs the functionality of the one or more units, or multiple steps, each of which performs the functionality of one or more of the plurality of units), even if such one or more steps are not explicitly described or illustrated in the figures. Further, it is to be understood that features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless explicitly stated otherwise.
Video coding generally refers to processing a sequence of pictures that form a video or video sequence. In the field of video coding, the terms "picture", "frame", and "image" may be used as synonyms. Video coding as used herein means video encoding or video decoding. Video encoding is performed on the source side and typically includes processing (e.g., compressing) the original video pictures to reduce the amount of data required to represent them, for more efficient storage and/or transmission. Video decoding is performed on the destination side and typically involves inverse processing relative to the encoder to reconstruct the video pictures. "Coding" of video pictures in the embodiments should be understood as referring to "encoding" or "decoding" of a video sequence. The combination of the encoding part and the decoding part is also called codec (encoding and decoding).
A video sequence comprises a series of pictures, a picture is further divided into slices, and a slice is further divided into blocks. Video coding is performed block by block, and in some new video coding standards the concept of a block is further extended. For example, a macroblock may be further partitioned into a plurality of prediction blocks (partitions) that can be used for predictive coding. Alternatively, basic concepts such as the coding unit (CU), prediction unit (PU), and transform unit (TU) are used, the various block units are divided by function, and a completely new tree-based structure is used for their description. For example, a CU may be partitioned into smaller CUs according to a quadtree, and a smaller CU may be further partitioned, forming a quadtree structure; the CU is the basic unit for partitioning and encoding a coded picture. A similar tree structure exists for the PU and the TU. A PU may correspond to a prediction block and is the basic unit of predictive coding; the CU is further partitioned into PUs according to a partitioning pattern. A TU may correspond to a transform block and is the basic unit for transforming a prediction residual. In essence, however, CU, PU, and TU are all concepts of blocks (or image blocks).
The CTU is split into CUs by using a quadtree structure denoted as a coding tree. A decision is made at the CU level whether to encode a picture region using inter-picture (temporal) or intra-picture (spatial) prediction. Each CU may be further split into one, two, or four PUs according to the PU split type. The same prediction process is applied within one PU, and the relevant information is transmitted to the decoder on a PU basis. After the residual block is obtained by applying the prediction process based on the PU split type, the CU may be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree used for the CU. In recent developments of video compression techniques, coding blocks are partitioned using the quadtree plus binary tree (QTBT) partition structure. In the QTBT block structure, a CU may be square or rectangular.
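The recursive CTU-to-CU split described above can be illustrated with a toy quadtree. This is a sketch only: the leaf decision is delegated to a hypothetical `is_leaf` function (in a real encoder it comes from rate-distortion decisions), and 4x4 is taken here as the minimum CU size for illustration.

```python
def quadtree_cus(x, y, size, is_leaf):
    """Recursively split a square block at (x, y) into leaf CUs.
    Returns a list of (x, y, size) tuples, one per CU."""
    if size == 4 or is_leaf(x, y, size):
        return [(x, y, size)]
    half = size // 2
    cus = []
    for dy in (0, half):          # visit the four quadrants
        for dx in (0, half):
            cus += quadtree_cus(x + dx, y + dy, half, is_leaf)
    return cus
```

For example, splitting a 16x16 block with a rule that stops at 8x8 yields exactly four 8x8 CUs covering the block.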
Herein, for convenience of description and understanding, an image block to be encoded in a currently encoded image may be referred to as a current block, e.g., in encoding, referring to a block currently being encoded; in decoding, refers to the block currently being decoded. A decoded image block in a reference picture used for predicting the current block is referred to as a reference block, i.e. a reference block is a block that provides a reference signal for the current block, wherein the reference signal represents pixel values within the image block. A block in the reference picture that provides a prediction signal for the current block may be a prediction block, wherein the prediction signal represents pixel values or sample values or a sampled signal within the prediction block. For example, after traversing multiple reference blocks, a best reference block is found that will provide prediction for the current block, which is called a prediction block.
In the case of lossless video coding, the original video picture can be reconstructed, i.e., the reconstructed video picture has the same quality as the original video picture (assuming no transmission loss or other data loss during storage or transmission). In the case of lossy video coding, the amount of data needed to represent the video picture is reduced by performing further compression, e.g., by quantization, while the decoder side cannot fully reconstruct the video picture, i.e., the quality of the reconstructed video picture is lower or worse than the quality of the original video picture.
Several video coding standards since H.261 belong to the class of "lossy hybrid video codecs" (that is, spatial and temporal prediction in the sample domain is combined with 2D transform coding in the transform domain, in which quantization is applied). Each picture of a video sequence is typically partitioned into a set of non-overlapping blocks and is typically coded at the block level. In other words, the encoder side typically processes, i.e., encodes, video at the block (video block) level; for example, it generates a prediction block by spatial (intra-picture) prediction and temporal (inter-picture) prediction, subtracts the prediction block from the current block (the block currently being processed or to be processed) to obtain a residual block, and transforms and quantizes the residual block in the transform domain to reduce the amount of data to be transmitted (compressed). The decoder side applies the inverse processing relative to the encoder to the encoded or compressed block to reconstruct the current block for representation. In addition, the encoder replicates the decoder processing loop, so that the encoder and the decoder generate the same predictions (e.g., intra prediction and inter prediction) and/or reconstructions for processing, i.e., coding, subsequent blocks.
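The predict-subtract-quantize-reconstruct loop described above can be sketched in a few lines. This is a deliberately simplified illustration: plain lists stand in for sample arrays, a scalar quantizer stands in for the transform-plus-quantization stage, and `QSTEP` is an arbitrary illustrative quantization step.

```python
QSTEP = 4  # illustrative quantization step

def encode_block(block, prediction):
    """Subtract the prediction and quantize the residual to levels."""
    residual = [s - p for s, p in zip(block, prediction)]
    return [round(r / QSTEP) for r in residual]

def reconstruct_block(levels, prediction):
    """Decoder path: dequantize the levels and add the prediction back."""
    return [p + l * QSTEP for p, l in zip(prediction, levels)]
```

Because quantization is lossy, the reconstruction matches the original block only to within the quantization step, which is exactly why the encoder must run the same reconstruction loop as the decoder when predicting subsequent blocks.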
The system architecture to which the embodiments of the present application apply is described below. Referring to fig. 1A, fig. 1A schematically shows a block diagram of a video encoding and decoding system 10 to which an embodiment of the present application is applied. As shown in fig. 1A, video encoding and decoding system 10 may include a source device 12 and a destination device 14, source device 12 generating encoded video data, and thus source device 12 may be referred to as a video encoding apparatus. Destination device 14 may decode the encoded video data generated by source device 12, and thus destination device 14 may be referred to as a video decoding apparatus. Various implementations of source apparatus 12, destination apparatus 14, or both may include one or more processors and memory coupled to the one or more processors. The memory can include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein. Source apparatus 12 and destination apparatus 14 may comprise a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, wireless communication devices, or the like.
Although fig. 1A depicts source apparatus 12 and destination apparatus 14 as separate apparatuses, an apparatus embodiment may also include the functionality of both source apparatus 12 and destination apparatus 14 or both, i.e., source apparatus 12 or corresponding functionality and destination apparatus 14 or corresponding functionality. In such embodiments, source device 12 or corresponding functionality and destination device 14 or corresponding functionality may be implemented using the same hardware and/or software, or using separate hardware and/or software, or any combination thereof.
A communication connection may be made between source device 12 and destination device 14 over link 13, and destination device 14 may receive encoded video data from source device 12 via link 13. Link 13 may comprise one or more media or devices capable of moving encoded video data from source apparatus 12 to destination apparatus 14. In one example, link 13 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real-time. In this example, source apparatus 12 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination apparatus 14. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include routers, switches, base stations, or other apparatuses that facilitate communication from source apparatus 12 to destination apparatus 14.
Source device 12 includes an encoder 20, and in the alternative, source device 12 may also include a picture source 16, a picture preprocessor 18, and a communication interface 22. In one implementation, the encoder 20, the picture source 16, the picture preprocessor 18, and the communication interface 22 may be hardware components of the source device 12 or may be software programs of the source device 12. Described below, respectively:
The picture source 16 may include or be any kind of picture capturing device for capturing, for example, a real-world picture; and/or any kind of picture or comment generation device (for screen content encoding, some text on the screen is also considered part of the picture or image to be encoded), such as a computer graphics processor for generating a computer-animated picture; or any kind of device for obtaining and/or providing a real-world picture or a computer-animated picture (e.g., screen content or a virtual reality (VR) picture), and/or any combination thereof (e.g., an augmented reality (AR) picture). The picture source 16 may be a camera for capturing pictures or a memory for storing pictures, and may also include any kind of (internal or external) interface for storing previously captured or generated pictures and/or for obtaining or receiving pictures. When the picture source 16 is a camera, it may, for example, be a local camera or a camera integrated in the source device; when the picture source 16 is a memory, it may, for example, be a local memory or a memory integrated in the source device. When the picture source 16 includes an interface, the interface may, for example, be an external interface that receives pictures from an external video source, such as an external picture capturing device (e.g., a camera), an external memory, or an external picture generating device (e.g., an external computer graphics processor, computer, or server). The interface may be any kind of interface according to any proprietary or standardized interface protocol, e.g., a wired or wireless interface or an optical interface.
A picture can be regarded as a two-dimensional array or matrix of pixels (picture elements). A pixel in the array may also be referred to as a sampling point. The numbers of sampling points of the array or picture in the horizontal and vertical directions (or axes) define the size and/or resolution of the picture. To represent color, three color components are typically employed, i.e., the picture may be represented as or contain three sample arrays. For example, in the RGB format or color space, a picture includes corresponding arrays of red, green, and blue samples. In video coding, however, each pixel is typically represented in a luminance/chrominance format or color space; for example, a picture in the YUV format includes a luminance component indicated by Y (sometimes also indicated by L) and two chrominance components indicated by U and V. The luminance (luma) component Y represents the brightness or gray-level intensity (e.g., the two are the same in a gray-scale picture), while the two chrominance (chroma) components U and V represent the chrominance or color information components. Accordingly, a picture in the YUV format includes a luminance sample array of luminance sample values (Y) and two chrominance sample arrays of chrominance values (U and V). A picture in the RGB format may be converted or transformed into the YUV format and vice versa; this process is also known as color transformation or conversion. If a picture is black and white, it may include only a luminance sample array. In the embodiments of the present application, a picture transmitted from the picture source 16 to the picture processor may also be referred to as raw picture data 17.
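The RGB-to-YUV conversion mentioned above can be illustrated with one common choice of coefficients. The BT.601 full-range coefficients below are an assumed example; actual systems may use other matrices, ranges, and bit depths.

```python
def rgb_to_yuv(r, g, b):
    """Convert one full-range RGB pixel (0..255) to YUV (YCbCr),
    using BT.601-style coefficients with chroma offset to 128."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128  # Cb
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128   # Cr
    return y, u, v
```

Note that for any gray pixel (r = g = b), the chrominance components come out at the neutral value 128, which is consistent with a black-and-white picture needing only the luminance array.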
Picture pre-processor 18 is configured to receive original picture data 17 and perform pre-processing on original picture data 17 to obtain pre-processed picture 19 or pre-processed picture data 19. For example, the pre-processing performed by picture pre-processor 18 may include trimming, color format conversion (e.g., from RGB format to YUV format), toning, or de-noising.
An encoder 20 (or video encoder 20) for receiving the pre-processed picture data 19, processing the pre-processed picture data 19 with a relevant prediction mode (such as the prediction mode in various embodiments herein), thereby providing encoded picture data 21 (structural details of the encoder 20 will be described further below based on fig. 2 or fig. 4 or fig. 5). In some embodiments, the encoder 20 may be configured to perform various embodiments described hereinafter to implement the application of the chroma block prediction method described herein on the encoding side.
A communication interface 22, which may be used to receive encoded picture data 21 and may transmit encoded picture data 21 over link 13 to destination device 14 or any other device (e.g., memory) for storage or direct reconstruction, which may be any device for decoding or storage. Communication interface 22 may, for example, be used to encapsulate encoded picture data 21 into a suitable format, such as a data packet, for transmission over link 13.
Destination device 14 includes a decoder 30, and optionally destination device 14 may also include a communication interface 28, a picture post-processor 32, and a display device 34. Described below, respectively:
communication interface 28 may be used to receive encoded picture data 21 from source device 12 or any other source, such as a storage device, such as an encoded picture data storage device. The communication interface 28 may be used to transmit or receive the encoded picture data 21 by way of a link 13 between the source device 12 and the destination device 14, or by way of any type of network, such as a direct wired or wireless connection, any type of network, such as a wired or wireless network or any combination thereof, or any type of private and public networks, or any combination thereof. Communication interface 28 may, for example, be used to decapsulate data packets transmitted by communication interface 22 to obtain encoded picture data 21.
Both communication interface 28 and communication interface 22 may be configured as a one-way communication interface or a two-way communication interface, and may be used, for example, to send and receive messages to establish a connection, acknowledge and exchange any other information related to a communication link and/or data transfer, such as an encoded picture data transfer.
A decoder 30 (or video decoder 30) for receiving the encoded picture data 21 and providing decoded picture data 31 or a decoded picture 31 (structural details of the decoder 30 will be described further below based on fig. 3 or fig. 4 or fig. 5). In some embodiments, the decoder 30 may be configured to perform various embodiments described hereinafter to implement the application of the chroma block prediction method described herein on the decoding side.
A picture post-processor 32 for performing post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) to obtain post-processed picture data 33. The post-processing performed by the picture post-processor 32 may include color format conversion (e.g., from the YUV format to the RGB format), toning, trimming, re-sampling, or any other processing. The picture post-processor 32 may also be used to transmit the post-processed picture data 33 to the display device 34.
A display device 34 for receiving the post-processed picture data 33 for displaying pictures to, for example, a user or viewer. Display device 34 may be or may include any type of display for presenting the reconstructed picture, such as an integrated or external display or monitor. For example, the display may include a Liquid Crystal Display (LCD), an Organic Light Emitting Diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS), a Digital Light Processor (DLP), or any other display of any kind.
It will be apparent to those skilled in the art from this description that the existence and (exact) division of the functionality of the different elements, or source device 12 and/or destination device 14 as shown in fig. 1A, may vary depending on the actual device and application. Source device 12 and destination device 14 may comprise any of a variety of devices, including any type of handheld or stationary device, such as a notebook or laptop computer, a mobile phone, a smartphone, a tablet or tablet computer, a camcorder, a desktop computer, a set-top box, a television, a camera, an in-vehicle device, a display device, a digital media player, a video game console, a video streaming device (e.g., a content service server or a content distribution server), a broadcast receiver device, a broadcast transmitter device, etc., and may not use or use any type of operating system.
Both encoder 20 and decoder 30 may be implemented as any of a variety of suitable circuits, such as one or more microprocessors, Digital Signal Processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented in part in software, an apparatus may store instructions of the software in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing, including hardware, software, a combination of hardware and software, etc., may be considered one or more processors.
In some cases, the video encoding and decoding system 10 shown in fig. 1A is merely an example, and the techniques of this application may be applicable to video coding settings (e.g., video encoding or video decoding) that do not necessarily involve any data communication between the encoding and decoding devices. In other examples, the data may be retrieved from local storage, streamed over a network, and so on. A video encoding device may encode data and store it to memory, and/or a video decoding device may retrieve data from memory and decode it. In some examples, the encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to memory and/or retrieve data from memory and decode it.
Referring to fig. 1B, fig. 1B is an illustrative diagram of an example of a video coding system 40 including the encoder 20 of fig. 2 and/or the decoder 30 of fig. 3, according to an example embodiment. Video coding system 40 may implement a combination of the various techniques of the embodiments of the present application. In the illustrated embodiment, video coding system 40 may include an imaging device 41, an encoder 20, a decoder 30 (and/or a video codec implemented by logic 47 of a processing unit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or a display device 45.
As shown in fig. 1B, the imaging device 41, the antenna 42, the processing unit 46, the logic circuit 47, the encoder 20, the decoder 30, the processor 43, the memory 44, and/or the display device 45 can communicate with each other. As discussed, although video coding system 40 is depicted with encoder 20 and decoder 30, in different examples video coding system 40 may include only encoder 20 or only decoder 30.
In some instances, antenna 42 may be used to transmit or receive an encoded bitstream of video data. Additionally, in some instances, display device 45 may be used to present video data. In some examples, logic circuitry 47 may be implemented by processing unit 46. The processing unit 46 may comprise application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, or the like. Video coding system 40 may also include an optional processor 43, which similarly may comprise application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, or the like. In some examples, the logic circuitry 47 may be implemented in hardware, such as video-encoding-specific hardware, and the processor 43 may be implemented in general-purpose software, an operating system, and so on. In addition, the memory 44 may be any type of memory, such as a volatile memory (e.g., Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), etc.) or a non-volatile memory (e.g., flash memory, etc.). In a non-limiting example, memory 44 may be implemented by cache memory. In some instances, logic circuitry 47 may access memory 44 (e.g., to implement an image buffer). In other examples, logic circuitry 47 and/or processing unit 46 may include memory (e.g., a cache, etc.) for implementing an image buffer or the like.
In some examples, encoder 20, implemented by logic circuitry, may include an image buffer (e.g., implemented by processing unit 46 or memory 44) and a graphics processing unit (e.g., implemented by processing unit 46). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include an encoder 20 implemented by logic circuitry 47 to implement the various modules discussed with reference to fig. 2 and/or any other encoder system or subsystem described herein. Logic circuitry may be used to perform various operations discussed herein.
In some examples, decoder 30 may be implemented by logic circuitry 47 in a similar manner to implement the various modules discussed with reference to decoder 30 of fig. 3 and/or any other decoder system or subsystem described herein. In some examples, decoder 30 implemented by logic circuitry may include an image buffer (e.g., implemented by processing unit 46 or memory 44) and a graphics processing unit (e.g., implemented by processing unit 46). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include a decoder 30 implemented by logic circuitry 47 to implement the various modules discussed with reference to fig. 3 and/or any other decoder system or subsystem described herein.
In some instances, antenna 42 may be used to receive an encoded bitstream of video data. As discussed, the encoded bitstream may include data related to the encoded video frame, indicators, index values, mode selection data, etc., discussed herein, such as data related to the encoding partition (e.g., transform coefficients or quantized transform coefficients, (as discussed) optional indicators, and/or data defining the encoding partition). Video coding system 40 may also include a decoder 30 coupled to antenna 42 and used to decode the encoded bitstream. The display device 45 is used to present video frames.
It should be understood that for the example described with reference to encoder 20 in the embodiments of the present application, decoder 30 may be used to perform the reverse process. With respect to signaling syntax elements, decoder 30 may be configured to receive and parse such syntax elements and decode the associated video data accordingly. In some examples, encoder 20 may entropy encode the syntax elements into an encoded video bitstream. In such instances, decoder 30 may parse such syntax elements and decode the relevant video data accordingly.
It should be noted that the decoding method described in the embodiments of the present application applies mainly to the decoding process, which exists in both the encoder 20 and the decoder 30 (the encoder contains a reconstruction loop that mirrors the decoder).
Referring to fig. 2, fig. 2 shows a schematic/conceptual block diagram of an example of an encoder 20 for implementing embodiments of the present application. In the example of fig. 2, encoder 20 includes a residual calculation unit 204, a transform processing unit 206, a quantization unit 208, an inverse quantization unit 210, an inverse transform processing unit 212, a reconstruction unit 214, a buffer 216, a loop filter unit 220, a Decoded Picture Buffer (DPB) 230, a prediction processing unit 260, and an entropy encoding unit 270. Prediction processing unit 260 may include inter prediction unit 244, intra prediction unit 254, and mode selection unit 262. Inter prediction unit 244 may include a motion estimation unit and a motion compensation unit (not shown). The encoder 20 shown in fig. 2 may also be referred to as a hybrid video encoder or a video encoder according to a hybrid video codec.
For example, the residual calculation unit 204, the transform processing unit 206, the quantization unit 208, the prediction processing unit 260, and the entropy encoding unit 270 form a forward signal path of the encoder 20, and, for example, the inverse quantization unit 210, the inverse transform processing unit 212, the reconstruction unit 214, the buffer 216, the loop filter 220, the Decoded Picture Buffer (DPB) 230, the prediction processing unit 260 form a backward signal path of the encoder, wherein the backward signal path of the encoder corresponds to a signal path of a decoder (see the decoder 30 in fig. 3).
The encoder 20 receives, e.g., via an input 202, a picture 201 or an image block 203 of a picture 201, e.g., a picture in a sequence of pictures forming a video or a video sequence. Image block 203 may also be referred to as a current picture block or a picture block to be encoded, and picture 201 may be referred to as a current picture or a picture to be encoded (especially when the current picture is distinguished from other pictures in video encoding, such as previously encoded and/or decoded pictures in the same video sequence, i.e., a video sequence that also includes the current picture).
An embodiment of the encoder 20 may comprise a partitioning unit (not shown in fig. 2) for partitioning the picture 201 into a plurality of blocks, e.g. image blocks 203, typically into a plurality of non-overlapping blocks. The partitioning unit may be used to use the same block size for all pictures in a video sequence and a corresponding grid defining the block size, or to alter the block size between pictures or subsets or groups of pictures and partition each picture into corresponding blocks.
In one example, prediction processing unit 260 of encoder 20 may be used to perform any combination of the above-described segmentation techniques.
Like picture 201, image block 203 is also, or can be considered as, a two-dimensional array or matrix of sample points having sample values, although its size is smaller than that of picture 201. In other words, the image block 203 may comprise, for example, one sample array (e.g., a luma array in the case of a black-and-white picture 201) or three sample arrays (e.g., a luma array and two chroma arrays in the case of a color picture) or any other number and/or type of arrays depending on the color format applied. The numbers of sample points in the horizontal and vertical directions (or axes) of the image block 203 define the size of the image block 203.
The encoder 20 as shown in fig. 2 is used to encode a picture 201 block by block, e.g. performing encoding and prediction for each image block 203.
The residual calculation unit 204 is configured to calculate a residual block 205 based on the picture image block 203 and the prediction block 265 (further details of the prediction block 265 are provided below), e.g. by subtracting sample values of the prediction block 265 from sample values of the picture image block 203 sample by sample (pixel by pixel) to obtain the residual block 205 in the sample domain.
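The sample-by-sample subtraction described above can be sketched as follows (a minimal illustration of the residual computation, with a hypothetical 2x2 block; real block sizes and data layouts differ):

```python
def residual_block(current, prediction):
    """Compute the residual block by subtracting the prediction block
    from the current image block, sample by sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]

# A 2x2 example: residual = current - prediction, per sample.
cur = [[120, 130], [125, 128]]
pred = [[118, 131], [126, 124]]
res = residual_block(cur, pred)
# res == [[2, -1], [-1, 4]]
```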
The transform processing unit 206 is configured to apply a transform, such as a Discrete Cosine Transform (DCT) or a Discrete Sine Transform (DST), on the sample values of the residual block 205 to obtain transform coefficients 207 in a transform domain. The transform coefficients 207 may also be referred to as transform residual coefficients and represent the residual block 205 in the transform domain.
The transform processing unit 206 may be configured to apply integer approximations of DCT/DST, such as transforms specified by AVS, AVS2, AVS3. Such integer approximations are typically scaled by some factor compared to the orthogonal DCT transform. To maintain the norm of the residual block processed by the forward transform and the inverse transform, an additional scaling factor is applied as part of the transform process. The scaling factor is typically selected based on certain constraints, e.g., the scaling factor being a power of 2 for a shift operation, a trade-off between the bit depth of the transform coefficients, accuracy, and implementation cost, and so on. For example, a specific scaling factor may be specified for the inverse transform on the decoder 30 side (and for the corresponding inverse transform on the encoder 20 side, for example by inverse transform processing unit 212), and correspondingly, a corresponding scaling factor may be specified for the forward transform on the encoder 20 side by transform processing unit 206.
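As a floating-point reference for the transform discussed above, the orthonormal DCT-II can be sketched as below; the AVS/AVS2/AVS3 transforms mentioned in the text are scaled integer approximations of such a transform, which this sketch does not reproduce:

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D signal (floating-point reference;
    real codecs use scaled integer approximations with shift-friendly
    power-of-2 scaling factors, as described in the text)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

# A constant residual row concentrates all energy in the DC coefficient.
coeffs = dct_1d([10, 10, 10, 10])
# coeffs[0] == 20.0 (DC); the remaining coefficients are ~0
```

A 2-D transform is obtained separably, by applying `dct_1d` first to the rows and then to the columns of the residual block.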
Quantization unit 208 is used to quantize transform coefficients 207, e.g., by applying scalar quantization or vector quantization, to obtain quantized transform coefficients 209. Quantized transform coefficients 209 may also be referred to as quantized residual coefficients 209. The quantization process may reduce the bit depth associated with some or all of transform coefficients 207. For example, an n-bit transform coefficient may be rounded down to an m-bit transform coefficient during quantization, where n is greater than m. The degree of quantization may be modified by adjusting a Quantization Parameter (QP). For example, for scalar quantization, different scales may be applied to achieve finer or coarser quantization. Smaller quantization steps correspond to finer quantization and larger quantization steps correspond to coarser quantization. An appropriate quantization step size may be indicated by a Quantization Parameter (QP). For example, the quantization parameter may be an index of a predefined set of suitable quantization step sizes. For example, a smaller quantization parameter may correspond to a fine quantization (smaller quantization step size) and a larger quantization parameter may correspond to a coarse quantization (larger quantization step size), or vice versa. The quantization may comprise a division by a quantization step size and a corresponding quantization or inverse quantization, e.g. performed by inverse quantization unit 210, or may comprise a multiplication by a quantization step size. Embodiments according to some standards, such as AVS, AVS2, AVS3, may use quantization parameters to determine the quantization step size. In general, the quantization step size may be calculated based on the quantization parameter using a fixed point approximation of an equation that includes division.
Additional scaling factors may be introduced for quantization and dequantization to recover the norm of the residual block that may be modified due to the scale used in the fixed point approximation of the equation for the quantization step size and quantization parameter. In one example implementation, the inverse transform and inverse quantization scales may be combined. Alternatively, a custom quantization table may be used and signaled from the encoder to the decoder, e.g., in a bitstream. Quantization is a lossy operation, where the larger the quantization step size, the greater the loss.
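The scalar quantization and inverse quantization described above can be sketched as follows. The QP-to-step mapping shown (step size doubling every 6 QP values) is an illustrative convention used by some codecs, not necessarily the mapping of the encoder described here:

```python
def quantize(coeff, qstep):
    """Scalar quantization: divide by the quantization step and round
    (Python's round() uses banker's rounding; real codecs use
    shift-based fixed-point rounding instead)."""
    return round(coeff / qstep)

def dequantize(level, qstep):
    """Inverse quantization: multiply the level by the quantization step."""
    return level * qstep

def qp_to_qstep(qp):
    """Illustrative QP mapping: the step size doubles every 6 QP values.
    The exact mapping is codec-specific."""
    return 2 ** ((qp - 4) / 6)

qstep = qp_to_qstep(22)            # -> 8.0
level = quantize(100.0, qstep)     # lossy: 100/8 = 12.5, rounds to 12
recon = dequantize(level, qstep)   # 96.0, not the original 100.0
```

The gap between `recon` and the original coefficient illustrates why quantization is the lossy step: the larger the quantization step size, the greater the loss.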
The inverse quantization unit 210 is configured to apply the inverse quantization of quantization unit 208 on the quantized coefficients to obtain dequantized coefficients 211, e.g., to apply, based on or using the same quantization step as quantization unit 208, the inverse of the quantization scheme applied by quantization unit 208. The dequantized coefficients 211 may also be referred to as dequantized residual coefficients 211 and correspond to transform coefficients 207, although, due to the loss introduced by quantization, they are typically not identical to the transform coefficients.
The inverse transform processing unit 212 is configured to apply an inverse transform of the transform applied by the transform processing unit 206, for example, an inverse Discrete Cosine Transform (DCT) or an inverse Discrete Sine Transform (DST), to obtain an inverse transform block 213 in the sample domain. The inverse transform block 213 may also be referred to as an inverse transform dequantized block 213 or an inverse transform residual block 213.
The reconstruction unit 214 (e.g., summer 214) is used to add the inverse transform block 213 (i.e., the reconstructed residual block 213) to the prediction block 265 to obtain the reconstructed block 215 in the sample domain, e.g., to add sample values of the reconstructed residual block 213 to sample values of the prediction block 265.
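The reconstruction step above can be sketched as follows. The clipping to the sample range is an assumption added for completeness (standard practice, though not stated in the paragraph above):

```python
def reconstruct(residual, prediction, bit_depth=8):
    """Add the reconstructed residual back to the prediction block,
    sample by sample, clipping each sample to the valid range for the
    given bit depth (clipping is an assumed detail, not from the text)."""
    hi = (1 << bit_depth) - 1
    return [[max(0, min(hi, r + p)) for r, p in zip(res_row, pred_row)]
            for res_row, pred_row in zip(residual, prediction)]

rec = reconstruct([[2, -1], [-1, 4]], [[118, 131], [126, 124]])
# rec == [[120, 130], [125, 128]]
```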
Optionally, a buffer unit 216 (or simply "buffer" 216), such as a line buffer 216, is used to buffer or store the reconstructed block 215 and corresponding sample values, for example, for intra prediction. In other embodiments, the encoder may be configured to use the unfiltered reconstructed block and/or the corresponding sample values stored in buffer unit 216 for any kind of estimation and/or prediction, such as intra prediction.
For example, an embodiment of encoder 20 may be configured such that buffer unit 216 is used not only to store reconstructed blocks 215 for intra prediction 254, but also for loop filter unit 220 (not shown in fig. 2), and/or such that buffer unit 216 and decoded picture buffer unit 230 form one buffer, for example. Other embodiments may be used to use filtered block 221 and/or blocks or samples from decoded picture buffer 230 (neither shown in fig. 2) as input or basis for intra prediction 254.
The loop filter unit 220 (or simply "loop filter" 220) is used to filter the reconstructed block 215 to obtain a filtered block 221, so as to smooth pixel transitions or otherwise improve video quality. Loop filter unit 220 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an Adaptive Loop Filter (ALF), a sharpening or smoothing filter, or a collaborative filter. Although loop filter unit 220 is shown in fig. 2 as an in-loop filter, in other configurations, loop filter unit 220 may be implemented as a post-loop filter. The filtered block 221 may also be referred to as a filtered reconstructed block 221. The decoded picture buffer 230 may store the reconstructed encoded block after the loop filter unit 220 performs a filtering operation on the reconstructed encoded block.
Embodiments of encoder 20 (correspondingly, loop filter unit 220) may be configured to output loop filter parameters (e.g., sample adaptive offset information), e.g., directly or after entropy encoding by entropy encoding unit 270 or any other entropy encoding unit, e.g., such that decoder 30 may receive and apply the same loop filter parameters for decoding.
Decoded Picture Buffer (DPB) 230 may be a reference picture memory that stores reference picture data for use by encoder 20 in encoding video data. DPB 230 may be formed from any of a variety of memory devices, such as Dynamic Random Access Memory (DRAM) including Synchronous DRAM (SDRAM), Magnetoresistive RAM (MRAM), Resistive RAM (RRAM), or other types of memory devices. The DPB 230 and the buffer 216 may be provided by the same memory device or separate memory devices. In a certain example, a Decoded Picture Buffer (DPB) 230 is used to store filtered blocks 221. Decoded picture buffer 230 may further be used to store other previous filtered blocks, such as previous reconstructed and filtered blocks 221, of the same current picture or of a different picture, such as a previous reconstructed picture, and may provide the complete previous reconstructed, i.e., decoded picture (and corresponding reference blocks and samples) and/or the partially reconstructed current picture (and corresponding reference blocks and samples), e.g., for inter prediction. In a certain example, if reconstructed block 215 is reconstructed without in-loop filtering, Decoded Picture Buffer (DPB) 230 is used to store reconstructed block 215.
Prediction processing unit 260, also referred to as block prediction processing unit 260, is used to receive or obtain image block 203 (current image block 203 of current picture 201) and reconstructed picture data, e.g., reference samples of the same (current) picture from buffer 216 and/or reference picture data 231 of one or more previously decoded pictures from decoded picture buffer 230, and to process such data for prediction, i.e., to provide prediction block 265, which may be inter-predicted block 245 or intra-predicted block 255.
The mode selection unit 262 may be used to select a prediction mode (e.g., intra or inter prediction mode) and/or a corresponding prediction block 245 or 255 used as the prediction block 265 to calculate the residual block 205 and reconstruct the reconstructed block 215.
Embodiments of mode selection unit 262 may be used to select prediction modes (e.g., from those supported by prediction processing unit 260) that provide the best match or the smallest residual (smallest residual means better compression for transmission or storage), or that provide the smallest signaling overhead (smallest signaling overhead means better compression for transmission or storage), or both. The mode selection unit 262 may be configured to determine a prediction mode based on Rate Distortion Optimization (RDO), i.e., select the prediction mode that provides the minimum rate-distortion cost, or select a prediction mode whose associated rate distortion at least meets the prediction-mode selection criteria.
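The rate-distortion trade-off described above is conventionally expressed as minimizing a cost J = D + lambda * R over the candidate modes. A minimal sketch (the mode names, distortion, and rate figures are hypothetical):

```python
def select_mode(candidates, lam):
    """Pick the candidate minimizing the rate-distortion cost
    J = D + lambda * R, where D is distortion and R is the bit rate."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])

# Hypothetical candidates with made-up distortion/rate figures.
modes = [
    {"name": "intra_dc", "distortion": 1200, "rate": 30},
    {"name": "inter_mv", "distortion": 400,  "rate": 55},
    {"name": "skip",     "distortion": 900,  "rate": 2},
]
best = select_mode(modes, lam=10.0)
# Costs: 1500, 950, 920 -> "skip" wins at this lambda
```

A larger lambda penalizes rate more heavily, pushing the selection toward cheaply signaled modes; a smaller lambda favors low distortion.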
The prediction processing performed by the example of the encoder 20 (e.g., by the prediction processing unit 260) and the mode selection performed (e.g., by the mode selection unit 262) will be explained in detail below.
As described above, the encoder 20 is configured to determine or select the best or optimal prediction mode from a set of (predetermined) prediction modes. The prediction mode set may include, for example, intra prediction modes and/or inter prediction modes.
The intra prediction mode set may include 35 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.265, or may include 67 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.266 under development.
In possible implementations, the set of inter Prediction modes may include, for example, an Advanced Motion Vector Prediction (AMVP) mode and a merge (merge) mode depending on the available reference pictures (i.e., at least partially decoded pictures stored in the DPB 230, for example, as described above) and other inter prediction parameters, e.g., depending on whether the entire reference picture or only a portion of the reference picture, such as a search window region around the region of the current block, is used to search for a best-matching reference block, and/or depending on whether pixel interpolation, such as half-pixel and/or quarter-pixel interpolation, is applied. In a specific implementation, the inter prediction mode set may include an improved control-point-based AMVP mode and an improved control-point-based merge mode according to an embodiment of the present application. In one example, inter prediction unit 244 may be used to perform any combination of the inter prediction techniques described below.
In addition to the above prediction mode, embodiments of the present application may also apply a skip mode and/or a direct mode.
The prediction processing unit 260 may further be configured to partition the image block 203 into smaller block partitions or sub-blocks, for example, by iteratively using quad-tree (QT) partitioning, binary-tree (BT) partitioning, ternary-tree (TT) partitioning, or extended quad-tree (EQT) partitioning, or any combination thereof, and to perform prediction, for example, for each of the block partitions or sub-blocks, wherein mode selection includes selecting a tree structure of the partitioned image block 203 and selecting a prediction mode to apply to each of the block partitions or sub-blocks.
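The iterative partitioning above can be sketched with the quad-tree case alone. This is a simplification: real encoders mix QT, BT, TT, and EQT splits and choose them by rate-distortion cost, whereas the fixed minimum-size stopping rule here is a stand-in:

```python
def quadtree_partition(x, y, w, h, min_size):
    """Recursively split a block into four quadrants until the
    minimum size is reached. Returns (x, y, w, h) leaf blocks.
    Quad-tree split only; BT/TT/EQT splits are not modeled."""
    if w <= min_size or h <= min_size:
        return [(x, y, w, h)]
    blocks = []
    hw, hh = w // 2, h // 2
    for dx, dy in [(0, 0), (hw, 0), (0, hh), (hw, hh)]:
        blocks.extend(quadtree_partition(x + dx, y + dy, hw, hh, min_size))
    return blocks

leaves = quadtree_partition(0, 0, 64, 64, min_size=32)
# A 64x64 block splits once into four 32x32 leaves
```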
The inter prediction unit 244 may include a Motion Estimation (ME) unit (not shown in fig. 2) and a Motion Compensation (MC) unit (not shown in fig. 2). The motion estimation unit is used to receive or obtain a picture image block 203 (current picture image block 203 of current picture 201) and a decoded picture 231, or at least one or more previously reconstructed blocks, e.g., reconstructed blocks of one or more other/different previously decoded pictures 231, for motion estimation. For example, the video sequence may comprise the current picture and a previously decoded picture 231, or in other words, the current picture and the previously decoded picture 231 may be part of, or form, the sequence of pictures forming the video sequence.
For example, the encoder 20 may be configured to select a reference block from a plurality of reference blocks of the same or different one of a plurality of other pictures and provide the reference picture and/or an offset (spatial offset) between a position (X, Y coordinates) of the reference block and a position of the current block to a motion estimation unit (not shown in fig. 2) as an inter prediction parameter. This offset is also called a Motion Vector (MV).
The motion compensation unit is configured to obtain inter-prediction parameters and perform inter-prediction based on or using the inter-prediction parameters to obtain an inter-prediction block 245. The motion compensation performed by the motion compensation unit (not shown in fig. 2) may involve taking or generating a prediction block based on a motion/block vector determined by motion estimation (possibly performing interpolation to sub-pixel precision). Interpolation filtering may generate additional pixel samples from known pixel samples, potentially increasing the number of candidate prediction blocks that may be used to encode a picture block. Upon receiving the motion vector for the PU of the current picture block, motion compensation unit 246 may locate the prediction block in one reference picture list to which the motion vector points. Motion compensation unit 246 may also generate syntax elements associated with the blocks and video slices for use by decoder 30 in decoding picture blocks of the video slices.
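Motion estimation as described in the two paragraphs above can be sketched as an integer-pel full search minimizing the sum of absolute differences (SAD). Sub-pixel interpolation, which the text notes real codecs also use, is omitted; the frame contents are a toy example:

```python
def sad(block, ref, rx, ry):
    """Sum of absolute differences between block and the region of
    ref whose top-left corner is (rx, ry)."""
    return sum(abs(block[i][j] - ref[ry + i][rx + j])
               for i in range(len(block)) for j in range(len(block[0])))

def motion_search(block, ref, cx, cy, radius):
    """Full search in a square window around (cx, cy); returns the
    motion vector (dx, dy) minimizing SAD (integer-pel only)."""
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = cx + dx, cy + dy
            if (0 <= rx and 0 <= ry
                    and rx + len(block[0]) <= len(ref[0])
                    and ry + len(block) <= len(ref)):
                cost = sad(block, ref, rx, ry)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best[1]

# Toy reference frame: the current block's pattern appears one pixel
# to the right of the current block's position (2, 1).
ref = [[0] * 6 for _ in range(4)]
ref[1][3] = ref[1][4] = 100
blk = [[100, 100], [0, 0]]
mv = motion_search(blk, ref, cx=2, cy=1, radius=1)
# mv == (1, 0): the best match lies one pixel to the right
```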
Specifically, the inter prediction unit 244 may transmit a syntax element including an inter prediction parameter (e.g., indication information for selecting an inter prediction mode for current block prediction after traversing a plurality of inter prediction modes) to the entropy encoding unit 270. In a possible application scenario, if there is only one inter prediction mode, the inter prediction parameters may not be carried in the syntax element, and the decoding end 30 can directly use the default prediction mode for decoding. It will be appreciated that the inter prediction unit 244 may be used to perform any combination of inter prediction techniques.
The intra prediction unit 254 is used to obtain, for example, a picture block 203 (current picture block) of the same picture and one or more previously reconstructed blocks, e.g., reconstructed neighboring blocks, to be received for intra estimation. For example, the encoder 20 may be configured to select an intra-prediction mode from a plurality of (predetermined) intra-prediction modes.
Embodiments of encoder 20 may be used to select an intra prediction mode based on an optimization criterion, such as a minimum residual (e.g., the intra prediction mode that provides the prediction block 255 most similar to current picture block 203) or a minimum rate-distortion cost.
The intra-prediction unit 254 is further configured to determine the intra-prediction block 255 based on the intra-prediction parameters of the selected intra-prediction mode. In any case, after selecting the intra-prediction mode for the block, intra-prediction unit 254 is also used to provide intra-prediction parameters, i.e., information indicating the selected intra-prediction mode for the block, to entropy encoding unit 270. In one example, intra-prediction unit 254 may be used to perform any combination of intra-prediction techniques.
Specifically, the above-described intra prediction unit 254 may transmit a syntax element including an intra prediction parameter (such as indication information of selecting an intra prediction mode for current block prediction after traversing a plurality of intra prediction modes) to the entropy encoding unit 270. In a possible application scenario, if there is only one intra-prediction mode, the intra-prediction parameters may not be carried in the syntax element, and the decoding end 30 may directly use the default prediction mode for decoding.
Entropy encoding unit 270 is configured to apply an entropy encoding algorithm or scheme (e.g., a Variable Length Coding (VLC) scheme, a Context Adaptive VLC (CAVLC) scheme, an arithmetic coding scheme, Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), Probability Interval Partition Entropy (PIPE) coding, or another entropy encoding method or technique) to one or all of quantized residual coefficients 209, inter-prediction parameters, intra-prediction parameters, and/or loop filter parameters (or to none of them) to obtain encoded picture data 21 that may be output by output 272 in the form of, for example, encoded bitstream 21. The encoded bitstream may be transmitted to video decoder 30, or archived for later transmission or retrieval by video decoder 30. Entropy encoding unit 270 may also be used to entropy encode other syntax elements of the current video slice being encoded.
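As a concrete instance of a variable-length code of the kind listed above, the unsigned order-0 Exp-Golomb code, widely used by video coding standards for syntax elements, can be sketched as follows (shown as a generic VLC example, not as the specific entropy coder of this encoder):

```python
def exp_golomb_encode(v):
    """Unsigned order-0 Exp-Golomb code: write v+1 in binary, then
    prefix it with as many zeros as there are bits after the leading 1.
    Small values get short codewords."""
    bits = bin(v + 1)[2:]              # binary representation of v + 1
    return "0" * (len(bits) - 1) + bits

codes = [exp_golomb_encode(v) for v in range(5)]
# ["1", "010", "011", "00100", "00101"]
```

The code is prefix-free, so a decoder can parse it bit by bit: count leading zeros, then read that many more bits to recover v + 1.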
Other structural variations of video encoder 20 may be used to encode the video stream. For example, the non-transform based encoder 20 may quantize the residual signal directly without the transform processing unit 206 for certain blocks or frames. In another embodiment, encoder 20 may have quantization unit 208 and inverse quantization unit 210 combined into a single unit.
Specifically, in the embodiment of the present application, the encoder 20 may be used to implement the encoding method described in the following embodiments.
It should be understood that other structural variations of the video encoder 20 may be used to encode the video stream. For example, for some image blocks or image frames, video encoder 20 may quantize the residual signal directly without processing by transform processing unit 206 and, correspondingly, without processing by inverse transform processing unit 212; alternatively, for some image blocks or image frames, the video encoder 20 does not generate residual data and accordingly does not need to be processed by the transform processing unit 206, the quantization unit 208, the inverse quantization unit 210, and the inverse transform processing unit 212; alternatively, video encoder 20 may store the reconstructed image block directly as a reference block without processing by filter 220; alternatively, the quantization unit 208 and the inverse quantization unit 210 in the video encoder 20 may be merged together. The loop filter 220 is optional, and in the case of lossless compression coding, the transform processing unit 206, the quantization unit 208, the inverse quantization unit 210, and the inverse transform processing unit 212 are optional. It should be appreciated that the inter prediction unit 244 and the intra prediction unit 254 may be selectively enabled according to different application scenarios.
Referring to fig. 3, fig. 3 shows a schematic/conceptual block diagram of an example of a decoder 30 for implementing embodiments of the present application. Video decoder 30 is operative to receive encoded picture data (e.g., an encoded bitstream) 21, e.g., encoded by encoder 20, to obtain a decoded picture 231. During the decoding process, video decoder 30 receives video data, such as an encoded video bitstream representing picture blocks of an encoded video slice and associated syntax elements, from video encoder 20.
In the example of fig. 3, decoder 30 includes entropy decoding unit 304, inverse quantization unit 310, inverse transform processing unit 312, reconstruction unit 314 (e.g., summer 314), buffer 316, loop filter 320, decoded picture buffer 330, and prediction processing unit 360. The prediction processing unit 360 may include an inter prediction unit 344, an intra prediction unit 354, and a mode selection unit 362. In some examples, video decoder 30 may perform a decoding pass that is substantially reciprocal to the encoding pass described with reference to video encoder 20 of fig. 2.
Entropy decoding unit 304 is used to perform entropy decoding on encoded picture data 21 to obtain, for example, quantized coefficients 309 and/or decoded coding parameters (not shown in fig. 3), such as any or all of inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other syntax elements. The entropy decoding unit 304 is further used to forward the inter prediction parameters, the intra prediction parameters, and/or other syntax elements to the prediction processing unit 360. Video decoder 30 may receive syntax elements at the video slice level and/or the video block level.
Inverse quantization unit 310 may be functionally identical to inverse quantization unit 210, inverse transform processing unit 312 may be functionally identical to inverse transform processing unit 212, reconstruction unit 314 may be functionally identical to reconstruction unit 214, buffer 316 may be functionally identical to buffer 216, loop filter 320 may be functionally identical to loop filter 220, and decoded picture buffer 330 may be functionally identical to decoded picture buffer 230.
Prediction processing unit 360 may include inter prediction unit 344 and intra prediction unit 354, where inter prediction unit 344 may be functionally similar to inter prediction unit 244 and intra prediction unit 354 may be functionally similar to intra prediction unit 254. The prediction processing unit 360 is typically used to perform block prediction and/or to obtain a prediction block 365 from the encoded data 21, as well as to receive or obtain (explicitly or implicitly) prediction related parameters and/or information about the selected prediction mode from, for example, the entropy decoding unit 304.
When the video slice is encoded as an intra-coded (I) slice, intra prediction unit 354 of prediction processing unit 360 is used to generate a prediction block 365 for the picture block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When a video frame is encoded as an inter-coded (i.e., B or P) slice, inter prediction unit 344 (e.g., a motion compensation unit) of prediction processing unit 360 is used to generate a prediction block 365 for the video block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding unit 304. For inter prediction, the prediction block may be generated from one of the reference pictures within one of the reference picture lists. Video decoder 30 may construct the reference frame lists, List 0 and List 1, using default construction techniques based on the reference pictures stored in DPB 330.
Prediction processing unit 360 is used to determine a prediction block for a video block of the current video slice by parsing the motion vectors and other syntax elements, and to generate a prediction block for the current video block being decoded using the prediction block. In an example of the present application, prediction processing unit 360 uses some of the syntax elements received to determine a prediction mode (e.g., intra or inter prediction) for encoding video blocks of a video slice, an inter prediction slice type (e.g., B-slice, P-slice, or GPB-slice), construction information for one or more of a reference picture list of the slice, a motion vector for each inter-coded video block of the slice, an inter prediction state for each inter-coded video block of the slice, and other information to decode video blocks of a current video slice. In another example of the present disclosure, the syntax elements received by video decoder 30 from the bitstream include syntax elements received in one or more of an Adaptive Parameter Set (APS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), or a slice header.
Inverse quantization unit 310 may be used to inverse quantize (i.e., inverse quantize) the quantized transform coefficients provided in the bitstream and decoded by entropy decoding unit 304. The inverse quantization process may include using quantization parameters calculated by video encoder 20 for each video block in the video slice to determine the degree of quantization that should be applied and likewise the degree of inverse quantization that should be applied.
Inverse transform processing unit 312 is used to apply an inverse transform (e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients in order to produce a block of residuals in the pixel domain.
The reconstruction unit 314 (e.g., summer 314) is used to add the inverse transform block 313 (i.e., reconstructed residual block 313) to the prediction block 365 to obtain the reconstructed block 315 in the sample domain, e.g., by adding sample values of the reconstructed residual block 313 to sample values of the prediction block 365.
Loop filter unit 320 (either in the decoding loop or after it) is used to filter reconstructed block 315 to obtain filtered block 321, to smooth pixel transitions or otherwise improve video quality. In one example, loop filter unit 320 may be used to perform any combination of the filtering techniques described below. Loop filter unit 320 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an adaptive loop filter (ALF), a sharpening or smoothing filter, or a collaborative filter. Although loop filter unit 320 is shown in fig. 3 as an in-loop filter, in other configurations it may be implemented as a post-loop filter.
Decoded video block 321 in a given frame or picture is then stored in decoded picture buffer 330, which stores reference pictures for subsequent motion compensation.
Decoder 30 is used to output decoded picture 31, e.g., via output 332, for presentation to or viewing by a user.
Other variations of video decoder 30 may be used to decode the compressed bitstream. For example, decoder 30 may generate an output video stream without loop filter unit 320. For example, the non-transform based decoder 30 may directly inverse quantize the residual signal without the inverse transform processing unit 312 for certain blocks or frames. In another embodiment, video decoder 30 may have inverse quantization unit 310 and inverse transform processing unit 312 combined into a single unit.
Specifically, in the embodiment of the present application, the decoder 30 is used to implement the decoding method described in the following embodiments.
It is to be understood that the block division operation may be performed by the prediction processing unit 360, or by a separate unit (not shown in the drawings). The prediction processing unit 360 may be configured to partition the image block 203 into smaller block partitions or sub-blocks, for example by iteratively using quad-tree (QT) partitioning, binary-tree (BT) partitioning, triple-tree (TT) partitioning or extended quad-tree (EQT) partitioning, or any combination thereof; the partitioning manner may be determined based on a preset rule or based on parsed syntax elements indicating the partitioning manner. The prediction processing unit 360 may further be configured to perform prediction for each of the block partitions or sub-blocks, where mode selection includes selecting the tree structure of the partitioned image block 203 and selecting the prediction mode applied to each of the block partitions or sub-blocks.
It should be understood that other structural variations of the video decoder 30 may be used to decode the encoded video bitstream. For example, video decoder 30 may generate an output video stream without processing by filter 320; alternatively, for some image blocks or image frames, the quantized coefficients are not decoded by entropy decoding unit 304 of video decoder 30 and, accordingly, do not need to be processed by inverse quantization unit 310 and inverse transform processing unit 312. Loop filter 320 is optional; and the inverse quantization unit 310 and the inverse transform processing unit 312 are optional for the case of lossless compression. It should be understood that the inter prediction unit and the intra prediction unit may be selectively enabled according to different application scenarios.
It should be understood that, in the encoder 20 and the decoder 30 of the present application, the processing result of a given stage may be further processed before being output to the next stage; for example, after stages such as interpolation filtering, motion vector derivation, or loop filtering, the result of the corresponding stage is further subjected to operations such as clipping (Clip) or shifting.
For example, the motion vector of a control point of the current image block, derived from the motion vector of an adjacent affine coding block, or the derived motion vector of a sub-block of the current image block, may be further processed; this is not limited in the present application. For example, the value range of the motion vector is constrained to a certain bit width. Assuming the allowed bit width of the motion vector is bitDepth, the motion vector ranges from -2^(bitDepth-1) to 2^(bitDepth-1)-1, where the "^" symbol represents exponentiation. If bitDepth is 16, the value range is -32768 to 32767. If bitDepth is 18, the value range is -131072 to 131071. As another example, the values of the motion vectors (e.g., the motion vectors MV of the four 4x4 sub-blocks within an 8x8 image block) are constrained such that the maximum difference between the integer parts of the MVs of the four 4x4 sub-blocks does not exceed N pixels, e.g., does not exceed one pixel.
The motion vector can be constrained to a certain bit width in either of two ways:
Mode 1: remove the overflowed high-order bits of the motion vector:
ux = (vx + 2^bitDepth) % 2^bitDepth
vx = (ux >= 2^(bitDepth-1)) ? (ux - 2^bitDepth) : ux
uy = (vy + 2^bitDepth) % 2^bitDepth
vy = (uy >= 2^(bitDepth-1)) ? (uy - 2^bitDepth) : uy
where vx is the horizontal component of the motion vector of the image block or of a sub-block of the image block, vy is the vertical component of that motion vector, ux and uy are intermediate values, and bitDepth represents the bit width.
For example, if vx has the value -32769, applying the above formulas yields 32767. Because values are stored in a computer in two's complement, the two's complement representation of -32769 is 1,0111,1111,1111,1111 (17 bits); the computer handles the overflow by discarding the high-order bit, so the value of vx becomes 0111,1111,1111,1111, i.e., 32767, consistent with the result obtained from the formulas.
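As an informal sketch (not part of the claimed method; the function name wrap_mv is ours), Mode 1's overflow handling can be written as:

```python
def wrap_mv(v, bit_depth):
    """Constrain a motion-vector component to bit_depth bits by
    discarding overflowed high-order bits (Mode 1 above)."""
    u = (v + 2**bit_depth) % 2**bit_depth          # ux = (vx + 2^bitDepth) % 2^bitDepth
    return u - 2**bit_depth if u >= 2**(bit_depth - 1) else u

# The example from the text: vx = -32769 with bitDepth = 16 wraps to 32767.
print(wrap_mv(-32769, 16))  # -> 32767
print(wrap_mv(32768, 16))   # -> -32768
```

Note that this is a wraparound, not a clamp: values just past one end of the range reappear at the other end.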
Mode 2: clip the motion vector, as shown in the following formulas:
vx = Clip3(-2^(bitDepth-1), 2^(bitDepth-1)-1, vx)
vy = Clip3(-2^(bitDepth-1), 2^(bitDepth-1)-1, vy)
where vx is the horizontal component of the motion vector of the image block or of a sub-block of the image block, and vy is the vertical component; x, y and z correspond to the three inputs of the MV clamping function Clip3, which clamps the value of z to the interval [x, y]:
Clip3(x, y, z) = x if z < x; y if z > y; z otherwise.
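For comparison with Mode 1, Mode 2's clipping can be sketched as follows (function names are illustrative only):

```python
def clip3(x, y, z):
    """Clamp z to the interval [x, y] (the Clip3 function above)."""
    return x if z < x else (y if z > y else z)

def clip_mv(v, bit_depth):
    """Mode 2: clip a motion-vector component to the signed bit_depth range."""
    return clip3(-(2**(bit_depth - 1)), 2**(bit_depth - 1) - 1, v)

print(clip_mv(-32769, 16))  # -> -32768 (clamped, unlike Mode 1's wraparound to 32767)
```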
Referring to fig. 4, fig. 4 is a schematic structural diagram of a video coding apparatus 400 (e.g., a video encoding apparatus 400 or a video decoding apparatus 400) provided by an embodiment of the present application. Video coding apparatus 400 is suitable for implementing the embodiments described herein. In one embodiment, video coding device 400 may be a video decoder (e.g., decoder 30 of fig. 1A) or a video encoder (e.g., encoder 20 of fig. 1A). In another embodiment, video coding device 400 may be one or more components of decoder 30 of fig. 1A or encoder 20 of fig. 1A described above.
Video coding device 400 includes: an ingress port 410 and a receiver unit (Rx) 420 for receiving data; a processor, logic unit, or central processing unit (CPU) 430 for processing data; a transmitter unit (Tx) 440 and an egress port 450 for transmitting data; and a memory 460 for storing data. Video coding device 400 may also include electrical-to-optical (EO) components and optical-to-electrical (OE) components coupled to the ingress port 410, receiver unit 420, transmitter unit 440, and egress port 450 for the egress or ingress of optical or electrical signals.
The processor 430 is implemented by hardware and software. Processor 430 may be implemented as one or more CPU chips, cores (e.g., a multi-core processor), FPGAs, ASICs, and DSPs. Processor 430 is in communication with the ingress port 410, receiver unit 420, transmitter unit 440, egress port 450, and memory 460. Processor 430 includes a coding module 470 (e.g., an encoding module 470 or a decoding module 470). The encoding/decoding module 470 implements the embodiments disclosed herein, to implement the chroma-block prediction methods provided by embodiments of the present application. For example, the encoding/decoding module 470 implements, processes, or provides the various coding operations. The encoding/decoding module 470 therefore provides a substantial improvement to the functionality of the video coding device 400 and effects the transformation of the video coding device 400 into a different state. Alternatively, the encoding/decoding module 470 is implemented as instructions stored in the memory 460 and executed by the processor 430.
The memory 460, which may include one or more disks, tape drives, and solid-state drives, may be used as an overflow data storage device to store programs when such programs are selectively executed, and to store instructions and data read during program execution. The memory 460 may be volatile and/or non-volatile and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random access memory (SRAM).
Referring to fig. 5, fig. 5 is a simplified block diagram of an apparatus 500 that may be used as either or both of source device 12 and destination device 14 in fig. 1A according to an example embodiment. Apparatus 500 may implement the techniques of this application. In other words, fig. 5 is a schematic block diagram of an implementation of an encoding apparatus or a decoding apparatus (referred to simply as a coding device 500) of the embodiment of the present application. The coding device 500 may include a processor 510, a memory 530, and a bus system 550. The processor is connected to the memory through the bus system; the memory is used to store instructions, and the processor is used to execute the instructions stored in the memory. The memory of the coding device stores program code, and the processor may invoke the program code stored in the memory to perform the various video encoding or decoding methods described herein, and in particular the various new decoding methods. To avoid repetition, they are not described in detail here.
In the embodiment of the present application, the processor 510 may be a Central Processing Unit (CPU), and the processor 510 may also be other general-purpose processors, Digital Signal Processors (DSP), Application Specific Integrated Circuits (ASIC), Field Programmable Gate Arrays (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and so on. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 530 may include a Read Only Memory (ROM) device or a Random Access Memory (RAM) device. Any other suitable type of memory device may also be used for memory 530. Memory 530 may include code and data 531 to be accessed by processor 510 using bus 550. Memory 530 may further include an operating system 533 and application programs 535, the application programs 535 including at least one program that allows processor 510 to perform the video encoding or decoding methods described herein, and in particular the decoding methods described herein. For example, the application programs 535 may include applications 1 through N, which further include a video encoding or decoding application (simply a video coding application) that performs the video encoding or decoding methods described herein.
The bus system 550 may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For clarity of illustration, however, the various buses are designated in the figure as bus system 550.
Optionally, the coding device 500 may also include one or more output devices, such as a display 570. In one example, the display 570 may be a touch-sensitive display that combines a display with a touch-sensing unit operable to sense touch input. The display 570 may be connected to the processor 510 via the bus 550.
The scheme of the embodiment of the application is explained in detail as follows:
video coding standards partition a frame of pictures into non-overlapping Coding Tree Units (CTUs), which may be sized to 64 × 64 (the CTUs may be sized to other values, such as increasing the CTU size to 128 × 128 or 256 × 256). A 64 x 64 CTU comprises a rectangular pixel lattice of 64 columns of 64 pixels each comprising a luminance component or/and a chrominance component. And (3) CTU: a coding tree unit (coding tree unit), one image is composed of a plurality of CTUs, one CTU generally corresponds to a square image region, and includes luminance pixels and chrominance pixels (or may include only luminance pixels, or may include only chrominance pixels) in the image region; syntax elements are also included in the CTU that indicate how the CTU is divided into at least one Coding Unit (CU), and the method of decoding each coding unit resulting in a reconstructed picture.
CU: coding unit, which generally corresponds to an AxB rectangular region containing AxB luminance pixels and the corresponding chrominance pixels, where A is the width of the rectangle and B its height; A and B may be equal or different, and generally take power-of-two values, e.g., 256, 128, 64, 32, 16, 8, 4. Decoding a coding unit yields the reconstructed image of an AxB rectangular region; the decoding processing generally includes prediction, inverse quantization, inverse transform, and the like, producing a predicted image and a residual, which are superposed to obtain the reconstructed image.
Quadtree (QT): a tree structure in which one node can be divided into four sub-nodes. The video coding standard adopts a quadtree-based CTU partitioning mode: with the CTU as the root node, each node corresponds to a square region; a node's square region is divided into four equally sized square regions (each with half the width and half the height of the region before division), each region corresponding to one node, as shown in fig. 6(a). A node may not be divided further (its corresponding region then is a CU), or it may be further divided into nodes of the next level in the QT, BT, TT or EQT manner.
Binary tree (BT): a tree structure in which one node is divided into two sub-nodes. There are two ways to divide into two nodes: 1) horizontal bisection: the region corresponding to the node is divided into upper and lower regions of the same size, each corresponding to one node, as shown in fig. 6(b); or 2) vertical bisection: the region corresponding to the node is divided into left and right regions of the same size, each corresponding to one node, as shown in fig. 6(c). In encoding and decoding methods using the binary tree, a node on a binary tree structure may not be divided (its corresponding region then is a CU), or may be further divided into nodes of the next level in the BT, TT or EQT manner.
Ternary tree (TT): a tree structure in which one node is divided into three sub-nodes. In existing coding methods using the ternary tree, a node on a ternary tree structure may not be divided, or may be divided into three nodes of the next level. There are two ways to divide into three nodes: 1) horizontal trisection: the region corresponding to the node is divided into upper, middle and lower regions, each corresponding to one node; the heights of the upper, middle and lower regions are 1/4, 1/2 and 1/4 of the node height respectively, as shown in fig. 6(d); or 2) vertical trisection: the region corresponding to the node is divided into left, middle and right regions, each corresponding to one node; the widths of the left, middle and right regions are 1/4, 1/2 and 1/4 of the node width respectively, as shown in fig. 6(e). In encoding and decoding methods using the ternary tree, a node on the ternary tree structure may not be divided (its corresponding region then is a CU), or may be further divided into nodes of the next level in the BT, TT or EQT manner.
Extended quad tree (EQT): an I-shaped partitioning structure in which one node can be divided into four sub-nodes. There are two ways to divide into four nodes: 1) horizontal quartering: the region corresponding to the node is divided into upper, middle-left, middle-right and lower regions, each corresponding to one node; the heights of the upper, middle-left, middle-right and lower regions are 1/4, 1/2, 1/2 and 1/4 of the node height respectively, and the widths of the middle-left and middle-right regions are each 1/2 of the node width, as shown in fig. 6(f); or 2) vertical quartering: the region corresponding to the node is divided into left, upper-middle, lower-middle and right regions, each corresponding to one node; the widths of the left, upper-middle, lower-middle and right regions are 1/4, 1/2, 1/2 and 1/4 of the node width respectively, and the heights of the upper-middle and lower-middle regions are each 1/2 of the node height, as shown in fig. 6(g). In an encoding method using the extended quadtree, a node on an extended quadtree structure may not be divided, or may be further divided into nodes of the next level in the BT, TT or EQT manner.
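The child-region geometries of the QT, BT, TT and EQT splits described above can be summarized in a small sketch (the mode names are ours, not standard syntax); note that every split partitions the parent area exactly:

```python
def split_children(w, h, mode):
    """Child-region sizes (width, height) for the split modes described
    in the text, in top-to-bottom / left-to-right order."""
    if mode == "QT":       # four equal squares, fig. 6(a)
        return [(w // 2, h // 2)] * 4
    if mode == "BT_H":     # horizontal binary split, fig. 6(b)
        return [(w, h // 2)] * 2
    if mode == "BT_V":     # vertical binary split, fig. 6(c)
        return [(w // 2, h)] * 2
    if mode == "TT_H":     # horizontal ternary split 1/4 : 1/2 : 1/4, fig. 6(d)
        return [(w, h // 4), (w, h // 2), (w, h // 4)]
    if mode == "TT_V":     # vertical ternary split, fig. 6(e)
        return [(w // 4, h), (w // 2, h), (w // 4, h)]
    if mode == "EQT_H":    # horizontal extended quad split, fig. 6(f)
        return [(w, h // 4), (w // 2, h // 2), (w // 2, h // 2), (w, h // 4)]
    if mode == "EQT_V":    # vertical extended quad split, fig. 6(g)
        return [(w // 4, h), (w // 2, h // 2), (w // 2, h // 2), (w // 4, h)]
    raise ValueError(mode)

# Every split partitions the parent area exactly:
for m in ("QT", "BT_H", "BT_V", "TT_H", "TT_V", "EQT_H", "EQT_V"):
    assert sum(cw * ch for cw, ch in split_children(64, 64, m)) == 64 * 64
```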
Video decoding (video decoding): restoring a video bitstream into reconstructed images according to specific syntax rules and processing methods.
Video encoding (video encoding): compressing an image sequence into a bitstream.
Video coding (video coding): a general term covering video encoding and video decoding; in Chinese, the translated term for video coding is the same as that for video encoding.
VTM: new codec reference software developed by the JVET organization.
A quad-tree (QT) based CTU partitioning method is used: the CTU serves as the root node of the quadtree and is recursively partitioned into several leaf nodes following the quadtree splitting pattern. One node corresponds to one image region; if a node is not divided, it is called a leaf node, and its image region forms a CU. If a node is further divided, its image region is split into four equally sized regions (each with half the width and half the height of the divided region), each region corresponding to one node, and whether each of these nodes is divided must be determined in turn. Whether a node is divided is indicated by the split flag bit split_cu_flag corresponding to that node in the bitstream. The quadtree level (qtDepth) of the root node is 0, and the quadtree level of a child node is the parent node's quadtree level plus 1. For brevity, the size and shape of a node hereinafter refer to the size and shape of the image region corresponding to the node.
More specifically, a 64x64 CTU node (quadtree level 0) may, according to its corresponding split_cu_flag, either not be divided and become one 64x64 CU, or be divided into four 32x32 nodes (quadtree level 1). Each of the four 32x32 nodes may again be divided or not according to its own split_cu_flag; if a 32x32 node is further divided, four 16x16 nodes result (quadtree level 2). This continues until no node is divided any further, so that a CTU is partitioned into a set of CUs. The minimum CU size is signaled in the SPS, e.g., 8x8 as the minimum CU. In the recursive partitioning process above, if a node's size equals the minimum CU size, the node defaults to not being divided further, and its split flag bit need not be included in the bitstream.
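A minimal sketch (ours, with illustrative names) of the recursive quadtree partitioning just described; split_flag stands in for reading split_cu_flag from the bitstream, and at the minimum CU size a node is not divided and no flag is read:

```python
def qt_partition(x, y, size, min_cu, split_flag):
    """Recursively partition a square node by quadtree, returning the
    list of leaf CUs as (x, y, size) triples."""
    if size == min_cu or not split_flag(x, y, size):
        return [(x, y, size)]                  # leaf node -> one CU
    half = size // 2
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus += qt_partition(x + dx, y + dy, half, min_cu, split_flag)
    return cus

# Hypothetical flags: split only the top-left node at each level above 16x16.
cus = qt_partition(0, 0, 64, 8, lambda x, y, s: x == 0 and y == 0 and s > 16)
print(len(cus))  # -> 7 (four 16x16 CUs and three 32x32 CUs)
```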
When parsing reaches a leaf node, the leaf node is a CU; the coding information corresponding to the CU (including information such as the prediction mode and transform coefficients of the CU, e.g., the coding_unit() syntax structure) is then parsed, and the CU undergoes decoding processing such as prediction, inverse quantization, inverse transform and loop filtering according to this coding information, producing the reconstructed image corresponding to the CU. The quadtree structure enables the CTU to be partitioned into a set of CUs of suitable size according to local image characteristics, e.g., smooth regions are partitioned into larger CUs and texture-rich regions into smaller CUs.
One way of partitioning a CTU into a set of CUs corresponds to one coding tree. Which coding tree a CTU should adopt is usually determined by the encoder's rate-distortion optimization (RDO) technique. The encoder tries multiple CTU partitioning modes, each corresponding to a rate-distortion cost (RD cost); it compares the RD costs of the tried partitioning modes, finds the one with the minimum RD cost, and uses it as the CTU's partitioning for the actual coding of the CTU. The partitioning modes tried by the encoder must all comply with the partitioning rules specified for the decoder, so that the decoder can correctly identify them.
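The encoder-side selection described above amounts to picking the minimum-RD-cost mode among the tried partitionings; a sketch (the costs below are made-up numbers, and in practice each mode must obey the decoder's partitioning rules):

```python
def best_partition(rd_costs):
    """Pick the partitioning mode with minimum RD cost, as an encoder's
    rate-distortion optimization would. rd_costs maps mode -> RD cost."""
    return min(rd_costs, key=rd_costs.get)

# Hypothetical RD costs for one CTU:
costs = {"no_split": 1200.0, "QT": 950.5, "QT+BT": 910.2}
print(best_partition(costs))  # -> QT+BT
```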
In screen content video, the same picture often contains repeated content, for example a picture containing numbers or graphics; the current block can find identical numbers or graphics nearby, as shown in fig. 7. Therefore, when encoding the current block, if a copy block that can serve as a reference can be found nearby and its reconstructed pixels referenced directly, the compression rate can be greatly improved. Intra block copy (IBC) is an intra prediction technique that searches for an identical block within the current screen content. For example, whether the current coding unit uses the IBC prediction mode may be indicated by the syntax element pred_mode_ibc_flag, as described in Table 2.
On the basis of the quadtree division, a Binary Tree (BT) division mode and an Extended Quad Tree (EQT) division mode can be added.
Binary tree division divides a node into 2 sub-nodes; there are two specific binary tree division modes:
1) horizontally dividing into two parts: dividing the region corresponding to the node into an upper region and a lower region with the same size (namely, the width is unchanged, and the height is changed into half of the region before division), wherein each region corresponds to one node; as shown in fig. 6 (b).
2) Dividing vertically into two parts: dividing the region corresponding to the node into a left region and a right region with the same size (namely, the height is unchanged, and the width is half of the region before division); as shown in fig. 6 (c).
Extended quad tree division divides a node into 4 sub-nodes; there are two specific extended quad tree division modes:
1) horizontal quartering: the region corresponding to the node is divided into upper, middle-left, middle-right and lower regions, each corresponding to one node; the heights of the upper, middle-left, middle-right and lower regions are 1/4, 1/2, 1/2 and 1/4 of the node height respectively, and the widths of the middle-left and middle-right regions are each 1/2 of the node width, as shown in fig. 6(f);
2) vertical quartering: the region corresponding to the node is divided into left, upper-middle, lower-middle and right regions, each corresponding to one node; the widths of the left, upper-middle, lower-middle and right regions are 1/4, 1/2, 1/2 and 1/4 of the node width respectively, and the heights of the upper-middle and lower-middle regions are each 1/2 of the node height, as shown in fig. 6(g).
In the QT-cascaded-BT/EQT partitioning mode, nodes on the first-level coding tree may be divided into child nodes only by QT, and the leaf nodes of the first-level coding tree are the root nodes of the second-level coding tree; nodes on the second-level coding tree may be divided into sub-nodes using either the BT or the EQT partitioning mode, and the leaf nodes of the second-level coding tree are coding units. Note that once a node is partitioned in the BT or EQT manner, its child nodes may only use BT or EQT partitioning and may not use QT partitioning.
On the basis of the quad tree division, a Binary Tree (BT) division mode and a Ternary Tree (TT) division mode can be added.
Binary tree division divides a node into 2 sub-nodes; there are two specific binary tree division modes:
1) horizontally dividing into two parts: dividing the region corresponding to the node into an upper region and a lower region with the same size (namely, the width is unchanged, and the height is changed into half of the region before division), wherein each region corresponds to one node; as shown in fig. 6 (b).
2) Dividing vertically into two parts: dividing the region corresponding to the node into a left region and a right region with the same size (namely, the height is unchanged, and the width is half of the region before division); as shown in fig. 6 (c).
Ternary tree division divides a node into 3 sub-nodes; there are two specific ternary tree division modes:
horizontal trisection: dividing the region corresponding to the node into an upper region, a middle region and a lower region, wherein each region corresponds to a node, and the heights of the upper region, the middle region and the lower region are 1/4, 1/2 and 1/4 of the node height respectively, as shown in fig. 6 (d);
vertical trisection: the region corresponding to the node is divided into left, middle and right regions, each corresponding to one node; the widths of the left, middle and right regions are 1/4, 1/2 and 1/4 of the node width respectively, as shown in fig. 6(e).
The dividing mode of QT cascade BT/TT, named QT-BTT for short, namely nodes on a first-level coding tree can be divided into child nodes only by using the QT, and leaf nodes of the first-level coding tree are root nodes of a second-level coding tree; the nodes on the second-level coding tree can be divided into sub-nodes by using one of four dividing modes of horizontal dichotomy, vertical dichotomy, horizontal trisection and vertical trisection; leaf nodes of the second level coding tree are coding units.
Part of the CU-level syntax structure may be as shown in Table 1. If the current node is no longer divided into child nodes, the current node is a coding unit, and the prediction information of the coding unit is parsed according to the following syntax structure.
skip_flag is the skip-mode flag: a value of 1 indicates that the current CU uses the skip mode, and a value of 0 indicates that the current CU does not use the skip mode.
merge_flag is the merge (direct) mode flag: a value of 1 indicates that the current CU uses the merge mode, and a value of 0 indicates that the merge mode is not used.
cu_pred_mode is the coding unit prediction mode flag: a value of 1 indicates that the current prediction unit uses an intra prediction mode, and a value of 0 indicates that the current prediction unit uses an ordinary inter prediction mode.
TABLE 1
Figure BDA0002149582270000241
Figure BDA0002149582270000251
Part of the CU-level syntax parsing can also be as shown in Table 2 (Table 2 is only an example). cu_skip_flag has the same meaning as skip_flag in Table 1, and pred_mode_flag has the same meaning as cu_pred_mode in Table 1.
Here cu_skip_flag is the skip-mode flag: a value of 1 indicates that the current CU uses the skip mode, and a value of 0 indicates that the current CU does not use the skip mode.
general_merge_flag is the merge mode flag: a value of 1 indicates that the current CU uses the merge mode, and a value of 0 indicates that the merge mode is not used.
pred_mode_flag is the coding unit prediction mode flag: a value of 1 indicates that the current coding unit uses an intra prediction mode, and a value of 0 indicates that the current coding unit uses an ordinary inter prediction mode. If pred_mode_flag is 1, CuPredMode[x0][y0] is set to MODE_INTRA; if pred_mode_flag is 0, CuPredMode[x0][y0] is set to MODE_INTER.
pred_mode_ibc_flag equal to 1 indicates that the current coding unit uses the IBC prediction mode, and a value of 0 indicates that it does not. If pred_mode_ibc_flag is 1, CuPredMode[x0][y0] is set to MODE_IBC.
Here CuPredMode[x0][y0] denotes the prediction mode of the current coding unit, and (x0, y0) denotes the position of the current coding unit in the current picture.
TABLE 2
Figure BDA0002149582270000252
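The flag logic described for Table 2 can be sketched as the following decision routine. This is a hedged illustration only: `read_flag` stands in for entropy decoding of one flag, and the conditions under which each flag is actually present in a real bitstream are simplified (skip and merge are treated here simply as inter modes).

```python
MODE_INTRA, MODE_INTER, MODE_IBC = "INTRA", "INTER", "IBC"

def bit_reader(bits):
    """Toy stand-in for an entropy decoder: returns flags from a list in order."""
    it = iter(bits)
    return lambda: next(it)

def parse_cu_pred_mode(read_flag):
    """Derive CuPredMode from cu_skip_flag / general_merge_flag /
    pred_mode_flag / pred_mode_ibc_flag, per the semantics above."""
    if read_flag():          # cu_skip_flag == 1: skip mode (an inter mode)
        return MODE_INTER
    if read_flag():          # general_merge_flag == 1: merge mode (an inter mode)
        return MODE_INTER
    # pred_mode_flag: 1 -> MODE_INTRA, 0 -> MODE_INTER
    mode = MODE_INTRA if read_flag() else MODE_INTER
    if read_flag():          # pred_mode_ibc_flag == 1 overrides to MODE_IBC
        mode = MODE_IBC
    return mode
```

For example, `parse_cu_pred_mode(bit_reader([0, 0, 1, 0]))` returns `"INTRA"`, reflecting pred_mode_flag equal to 1 with the skip, merge and IBC flags all 0.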
An 8xM (or Mx8) node partitioned by vertical bisection (or horizontal bisection) yields two 4xM (or Mx4) child nodes. Similarly, a 16xM (or Mx16) node partitioned by vertical extended quartering (or horizontal extended quartering) yields two 4xM (or Mx4) child nodes and two 8x(M/2) (or (M/2)x8) child nodes, and a 16xM (or Mx16) node partitioned by vertical trisection (or horizontal trisection) yields two 4xM (or Mx4) child nodes and one 8xM (or Mx8) child node. For the YUV 4:2:0 data format, the resolution of the chrominance components is 1/2 that of the luminance component in each direction; that is, a 4xM node contains one 4xM luminance block and two 2x(M/2) chrominance blocks. These partitioning modes therefore generate small blocks such as 2x2 and 2x4, which is unfavorable for hardware decoder implementation: for a hardware decoder, the processing cost of small blocks (especially 2x2, 2x4 and 2x8) is high, specifically in the following 3 aspects.
1) Intra prediction problem: to increase processing speed, hardware designs generally process 16 pixels at a time in intra prediction, but small blocks such as 2x2, 2x4 and 4x2 contain fewer than 16 pixels, which reduces intra prediction performance.
2) Coefficient coding problem: transform coefficient coding in HEVC is based on coefficient groups (CGs) containing 16 coefficients, whereas small blocks of 2x2, 2x4 and 4x2 contain only 4 or 8 transform coefficients; supporting coefficient coding for these small blocks would require adding coefficient groups of 4 and 8 coefficients, which increases implementation complexity.
3) Inter prediction problem: inter prediction of small blocks places high demands on data bandwidth and also affects decoding speed.
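The mapping from luma node size to chroma block size that underlies these three problems can be sketched as follows (a minimal illustration assuming the YUV 4:2:0 subsampling described above; the `min_side` threshold of 4 is chosen here to flag the costly 2xN/Nx2 cases).

```python
def chroma_size_420(luma_w, luma_h):
    """In 4:2:0, each chroma component is subsampled by 2 in both directions."""
    return luma_w // 2, luma_h // 2

def is_small_chroma(luma_w, luma_h, min_side=4):
    """True if the co-located chroma block has a side length below min_side."""
    cw, ch = chroma_size_420(luma_w, luma_h)
    return cw < min_side or ch < min_side
```

For example, a 4x8 luma child node corresponds to a 2x4 chroma block, one of the costly small-block cases listed above.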
When one of the child nodes generated by dividing a node with a given partitioning mode contains a chrominance block with a side length of 2, the luminance block contained in that child node continues to be divided with the partitioning mode, while the chrominance block contained in the child node is not divided further. In this way, chrominance blocks with a side length of 2 are avoided, which reduces the maximum-throughput requirement on the decoder and facilitates decoder implementation. In addition, a method for determining the chrominance block prediction mode according to the luminance block prediction mode is provided, which effectively improves coding efficiency.
The image prediction method provided by the present application can be applied to the video encoder 18 or the video decoder 24 shown in fig. 8. It should be noted that some steps in the following embodiments are performed only in the video decoder 24, which is pointed out at the corresponding places below.
The following embodiments describe the image prediction method in detail. The embodiments may be combined with each other, and descriptions of the same or similar content are not repeated across embodiments.
Fig. 9 is a flowchart illustrating a first image prediction method according to an embodiment of the present application. Referring to fig. 9, the image prediction method provided in this embodiment includes the following steps:
Step 101: obtain the partition mode of the current node.
In this embodiment, the partition information of the current node is first parsed; the partition information indicates whether the current node is partitioned or not. If the partition information indicates that the current node is partitioned, the partition mode of the current node is obtained. The current node may be partitioned by at least one of quadtree partitioning, vertical bisection, horizontal bisection, vertical trisection and horizontal trisection, or by other modes, which is not limited in this embodiment.
The partition information of the current node can be transmitted in the bitstream; it can be obtained by parsing the corresponding syntax elements in the bitstream, from which the specific partition mode is determined. The partition mode of the current node may also be determined based on other preset rules, which is not specifically limited in this embodiment.
In this embodiment, if the parsed partition information indicates that the current node is partitioned, the partition information specifically includes the partitioning mode of the luminance block contained in the current node and/or the partitioning mode of the chrominance block contained in the current node. The two partitioning modes may be the same or different, which is not specifically limited in this embodiment. Illustratively, the partition information indicates that quadtree partitioning is used for both the luminance block and the chrominance block of the current node, or that the luminance block of the current node is partitioned by a quadtree while the chrominance block is partitioned by vertical bisection.
Step 102: determine whether partitioning the current node based on the partition mode would yield an image block of a preset size.
The image block of the preset size may be a luminance block whose size is smaller than a threshold, where the threshold may be 128, 64 or 32 luminance samples, or 32, 16 or 8 chrominance samples. The size of the current node may be greater than or equal to the threshold.
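The luminance and chrominance thresholds above correspond under 4:2:0 subsampling, where one chrominance block carries a quarter as many samples as its luminance block. A one-line sketch of that correspondence (an illustration introduced here, not from the patent text):

```python
def luma_to_chroma_threshold_420(luma_threshold):
    """In 4:2:0 each chroma plane has 1/4 the samples of the luma plane,
    so luma thresholds 128/64/32 correspond to chroma thresholds 32/16/8."""
    return luma_threshold // 4
```
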
If it is determined that partitioning the current node based on the partition mode yields an image block of the preset size, step 103 is performed; if it is determined that it does not, step 104 is performed.
Step 103: use intra prediction for all coding blocks covered by the current node, or use inter prediction for all coding blocks covered by the current node.
It should be noted that the current node in this embodiment may be understood as the image area or image block corresponding to the node to be processed or divided. All coding blocks covered by the current node may be understood as all coding blocks located within the area of the current node. All coding blocks in this embodiment include the luminance coding blocks and chrominance coding blocks obtained with or without dividing the current node. A coding block may also be a coding unit (CU).
Alternatively, the intra prediction may be performed using a normal intra mode, or using the IBC (intra block copy) mode.
Optionally, when the slice type of the slice in which the current node is located is Intra, intra prediction is used for all coding blocks covered by the current node and inter prediction is not used.
In one implementation, using intra prediction for all coding blocks covered by the current node may include:
dividing the luminance block contained in the current node according to the partition mode to obtain divided luminance blocks, and using intra prediction for the divided luminance blocks; and taking the chrominance block contained in the current node as a chrominance coding block, and using intra prediction for the chrominance coding block.
That is, if it is determined that all coding blocks of the current node use intra prediction, the luminance block of the current node is divided according to the partitioning mode of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is not divided, yielding a chrominance coding block (chroma CB for short).
The N luminance coding tree nodes may be restricted from being further divided, or not restricted. If a luminance coding tree node is further divided, its partition mode is parsed for recursive division; when a luminance coding tree node is no longer divided, it corresponds to a luminance coding block (luma CB for short). The luma CB obtains its corresponding luminance prediction block using intra prediction.
The chroma CB obtains its corresponding chrominance prediction block using intra prediction; the chrominance prediction block and the chroma CB have the same size.
In one implementation, using inter prediction for all coding blocks covered by the current node may include:
dividing the luminance block contained in the current node according to the partition mode to obtain divided luminance blocks, and using inter prediction for the divided luminance blocks; and taking the chrominance block contained in the current node as a chrominance coding block, and using inter prediction for the chrominance coding block.
That is, if it is determined that all coding blocks of the current node use inter prediction, the luminance block of the current node is divided according to the partitioning mode of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is not divided, yielding a chrominance coding block (chroma CB for short).
In this embodiment, when it is determined that all coding blocks of the current node use intra prediction, or that all coding blocks of the current node use inter prediction, the luminance block contained in the current node is divided according to the partition mode of the current node while the chrominance block of the current node is not divided further. This avoids generating small chrominance blocks that use intra prediction, and thus solves the intra prediction problem for small chrominance blocks.
In one implementation, using inter prediction for all coding blocks covered by the current node may include:
dividing the luminance block contained in the current node according to the partition mode to obtain divided luminance blocks, and using inter prediction for the divided luminance blocks; and dividing the chrominance block contained in the current node according to the partition mode to obtain divided chrominance blocks, and using inter prediction for the divided chrominance blocks.
That is, if it is determined that all coding blocks of the current node use inter prediction, the luminance block of the current node is divided according to the partitioning mode of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is divided according to the partitioning mode of the chrominance block to obtain M chrominance coding tree nodes, where N and M are positive integers that may be equal or different. The N luminance and M chrominance coding tree nodes may be restricted from further division, or not restricted. When they are not divided further, the N luminance coding tree nodes correspond to N luma CBs of the current node and the M chrominance coding tree nodes correspond to M chroma CBs of the current node. The N luma CBs use inter prediction to obtain their corresponding luminance prediction blocks, and the M chroma CBs use inter prediction to obtain their corresponding chrominance prediction blocks.
Optionally, in the case that inter prediction is used for all coding blocks covered by the current node, using inter prediction for all coding blocks covered by the current node may include:
obtaining the sub-partition mode of a child node of the current node, where the child node contains a luminance block and a chrominance block; determining whether dividing the child node of the current node based on the sub-partition mode would yield a luminance block of a first preset size; and, if dividing the child node based on the sub-partition mode would yield a luminance block of the first preset size, dividing the child node using a partition mode other than the sub-partition mode to obtain corresponding coding blocks and using inter prediction for those coding blocks, or using the child node itself as a coding block with inter prediction.
That is, if dividing a child node of the current node with its sub-partition mode would generate a luminance block of the first preset size (4x4), the sub-partition mode is not allowed, or the child node cannot be divided further, or the child node is divided with a partition mode other than the sub-partition mode. For example, if the current node is 8x8 and horizontal binary tree (or vertical binary tree) partitioning generates two 8x4 (or two 4x8) nodes, further partitioning of the 8x4 (or 4x8) nodes would generate 4x4 blocks; therefore, at this point the 8x4 (or 4x8) nodes cannot be partitioned further.
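The check described above can be sketched as a predicate over a child node's size and candidate split mode. This is an illustration, not the patent's normative procedure; mode names are the shorthand used here, and only the distinct child sizes of each split are listed.

```python
def produces_4x4_luma(w, h, mode):
    """True if splitting a w x h luma node with `mode` yields a 4x4 child."""
    distinct_children = {
        "QT":  [(w // 2, h // 2)],
        "HBT": [(w, h // 2)],
        "VBT": [(w // 2, h)],
        "HTT": [(w, h // 4), (w, h // 2)],
        "VTT": [(w // 4, h), (w // 2, h)],
    }[mode]
    return any(cw == 4 and ch == 4 for cw, ch in distinct_children)
```

Matching the example above: an 8x4 child split by vertical bisection, or a 4x8 child split by horizontal bisection, would produce 4x4 blocks, so those splits would be disallowed.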
Step 104: divide the current node using the partition mode of the current node, without restricting the prediction modes of the coding blocks covered by the current node.
Specifically, the luminance block of the current node is divided by adopting a dividing mode of the luminance block of the current node, and the chrominance block of the current node is divided by adopting a dividing mode of the chrominance block of the current node.
It should be noted that not restricting the prediction modes of the coding blocks covered by the current node in step 104 means that the coding blocks covered by the current node can each be predicted with its own prediction mode: the prediction mode of each coding block is parsed, and each coding block is predicted according to its parsed prediction mode.
Optionally, after step 103 or step 104, the method further includes:
Step 105: parse the prediction information and residual information of all coding blocks covered by the current node.
Step 106: decode each coding block to obtain the reconstructed signal of the image block corresponding to the current node.
It should be noted that the above two steps can be applied to the video decoder 24 shown in fig. 8.
The prediction information includes: the prediction mode (indicating an intra or non-intra prediction mode), the intra prediction mode, the inter prediction mode, motion information, and so on. The motion information may include the prediction direction (forward, backward or bidirectional), the reference frame index (reference index), the motion vector (motion vector), and so on.
The residual information includes: the coded block flag (cbf), transform coefficients, the transform type (e.g., DCT-2, DST-7, DCT-8), and so on. The transform type may default to the DCT-2 transform.
If all coding blocks covered by the current node are restricted to intra prediction only, parsing the prediction information of a luma CB obtained by dividing the current node proceeds as follows: skip_flag, merge_flag and cu_pred_mode default to 0, 0 and 1 respectively (that is, none of them appears in the bitstream), or skip_flag and cu_pred_mode default to 0 and 1 respectively (that is, neither appears in the bitstream), and the intra prediction mode information of the luma CB is parsed. Parsing the prediction information of a chroma CB obtained by dividing the current node includes parsing the intra prediction mode of the chroma CB. The intra prediction mode of the chroma CB may be obtained by: 1) parsing a syntax element from the bitstream; or 2) directly setting it to one of a set of chroma intra prediction modes, such as the linear model mode, the DM mode (derived mode) or the IBC mode.
If all coding blocks covered by the current node can only use inter prediction, parsing the prediction mode of a CU obtained by dividing the current node includes parsing skip_flag and/or merge_flag, defaulting cu_pred_mode to 0, and parsing the inter prediction information, such as the merge index, the inter prediction direction (inter dir), the reference frame index (reference index), the motion vector predictor index and the motion vector difference. Here skip_flag is the skip-mode flag: a value of 1 indicates that the current CU uses the skip mode, and 0 indicates that it does not. merge_flag is the merge mode flag: a value of 1 indicates that the current CU uses the merge mode, and 0 indicates that it does not. cu_pred_mode is the coding unit prediction mode flag: a value of 1 indicates that the current prediction unit uses intra prediction; a value of 0 indicates that it uses ordinary inter prediction (information such as the inter prediction direction, reference frame index, motion vector predictor index and motion vector difference is identified in the bitstream).
Optionally, if all coding blocks covered by the current node are restricted to inter prediction only, parsing the prediction information of a luma CB obtained by dividing the current node includes parsing skip_flag and/or merge_flag, defaulting cu_pred_mode to 0, and parsing the inter prediction information, such as the merge index, the inter prediction direction (inter dir), the reference frame index (reference index), the motion vector predictor index and the motion vector difference. The motion information of each 4x4 sub-block in the luma CB is derived from the parsed inter prediction information. If all coding blocks covered by the current node can only use inter prediction, the prediction information of a chroma CB obtained by dividing the current node does not need to be parsed; the chroma CB is divided into 2x2 chrominance sub-blocks (the dividing mode may be a dividing mode S), and the motion information of each 2x2 chrominance sub-block is the motion information of the 4x4 luminance area corresponding to that sub-block. With this division, neither small chrominance blocks using intra prediction nor transform blocks smaller than 16 pixels are generated, so the intra prediction problem and the coefficient coding problem above are solved.
Optionally, if all coding blocks covered by the current node are restricted to inter prediction only, the prediction information of a chroma CB obtained by dividing the current node does not need to be parsed; the chrominance prediction block and the chrominance coding block have the same size, and the motion information of the chroma CB is the motion information of a preset position in the luminance region corresponding to the chroma CB (for example, the center, the lower-right corner or the upper-left corner of the luminance region). With this division, neither small chrominance blocks using intra prediction, nor small transform blocks, nor small chrominance blocks using inter prediction are generated.
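The motion-reuse rule above (each 2x2 chrominance sub-block taking the motion information of its co-located 4x4 luminance area, under 4:2:0) can be sketched as follows. `luma_mv`, a lookup keyed by 4x4-aligned luminance coordinates, is a hypothetical structure introduced here for illustration.

```python
def chroma_subblock_mv(cx, cy, luma_mv):
    """Motion info for the 2x2 chroma sub-block whose top-left chroma
    coordinate is (cx, cy), reused from the co-located 4x4 luma area."""
    # In 4:2:0, chroma (cx, cy) co-locates with luma (2*cx, 2*cy);
    # align that position down to the 4x4 luma motion grid.
    lx = (2 * cx) // 4 * 4
    ly = (2 * cy) // 4 * 4
    return luma_mv[(lx, ly)]
```

For example, the chroma sub-block at (2, 0) co-locates with luma position (4, 0) and therefore reuses the motion vector stored for the 4x4 luma block at (4, 0).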
It should be noted that an intra prediction mode in this embodiment is a prediction mode that generates the prediction value of a coding block using spatial reference pixels of the picture in which the coding block is located, such as the direct current mode (DC mode), the planar mode, an angular mode, and possibly the template matching mode and the IBC mode. An inter prediction mode is a prediction mode that generates the prediction value of a coding block using temporal reference pixels in a reference picture of the coding block, such as the skip mode, the merge mode, the AMVP (advanced motion vector prediction) mode or the ordinary inter mode.
Inter prediction or intra prediction is performed on each coding block according to its prediction information, obtaining an inter or intra predicted image of each coding block. Then, according to the residual information of each coding block, the transform coefficients are inverse quantized and inverse transformed to obtain a residual image, which is superimposed on the predicted image of the corresponding area to generate the reconstructed image.
Optionally, in a possible implementation manner, the image block with the preset size includes a luminance block with a first preset size, and correspondingly, step 102 includes:
determine whether dividing the current node based on the partitioning mode of its luminance block would yield a luminance block of the first preset size. The luminance block of the first preset size is a luminance block with a pixel size of 4x4.
If dividing the luminance block of the current node based on the partitioning mode of the luminance block yields a luminance block of the first preset size, step 103 correspondingly includes: using intra prediction for all coding blocks covered by the current node.
If dividing the luminance block of the current node based on the partitioning mode of the luminance block does not yield a luminance block of the first preset size, step 104 correspondingly includes: dividing the luminance block of the current node using the partitioning mode of the luminance block, and dividing the chrominance block of the current node using the partitioning mode of the chrominance block, without restricting the prediction modes of the coding blocks covered by the current node.
Optionally, in another possible implementation manner, the image block with a preset size includes a chrominance block with a second preset size, and correspondingly, step 102 includes:
determine whether dividing the current node based on the partitioning mode of its chrominance block would yield a chrominance block of the second preset size. The chrominance block of the second preset size is a chrominance block with a pixel size of 2x2, 2x4 or 4x2.
If dividing the chrominance block of the current node based on the partitioning mode of the chrominance block yields a chrominance block of the second preset size, step 103 correspondingly includes: using intra prediction for all coding blocks covered by the current node, or using inter prediction for all coding blocks covered by the current node.
If dividing the chrominance block of the current node based on the partitioning mode of the chrominance block does not yield a chrominance block of the second preset size, step 104 correspondingly includes: dividing the chrominance block of the current node using the partitioning mode of the chrominance block, and dividing the luminance block of the current node using the partitioning mode of the luminance block, without restricting the prediction modes of the coding blocks covered by the current node.

In the image prediction method provided in this embodiment, the partition mode of the current node is obtained, and it is determined whether dividing the current node based on that partition mode would yield an image block of a preset size, where the image block includes a luminance block or a chrominance block. If an image block of the preset size would be obtained, intra prediction is used for all coding blocks covered by the current node, or inter prediction is used for all coding blocks covered by the current node. Using intra prediction or inter prediction for all coding blocks of the current node enables parallel processing of those coding blocks and improves the processing performance of image prediction, thereby increasing encoding and decoding speed.
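The overall decision of steps 101-104 can be summarized in a short sketch. This is an illustration only: the boolean `produces_preset_size_block` abstracts the step-102 check, and the returned strings stand in for the constraint applied to the CUs covered by the node.

```python
def prediction_constraint(produces_preset_size_block, slice_is_intra):
    """Return the prediction-mode constraint for all coding blocks
    covered by the current node, following steps 102-104."""
    if not produces_preset_size_block:
        return "unconstrained"            # step 104: each CU parses its own mode
    if slice_is_intra:
        return "all-intra"                # step 103 in an intra slice
    return "all-intra-or-all-inter"       # step 103: one family for all CUs
```
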
Building on the embodiment shown in fig. 9, the following embodiment provides an image prediction method that describes in detail the process of determining whether dividing the current node based on the partitioning mode of its luminance block yields a luminance block of the first preset size, and specifically discloses the set of conditions for identifying a luminance block of the first preset size.
Fig. 10 is a flowchart illustrating a second image prediction method according to an embodiment of the present application. As shown in fig. 10, the image prediction method provided in this embodiment includes:
step 201, obtaining the partition mode of the current node.
Specifically, the partition information of the current node is parsed; if the partition information indicates that the luminance block of the current node is divided, the partitioning mode of the luminance block of the current node is further determined. The partitioning mode of the luminance block includes at least one of quadtree partitioning, vertical bisection, horizontal bisection, vertical trisection and horizontal trisection; other partitioning modes may also be used, which is not specifically limited in this embodiment.
Step 202: determine, according to the size and partition mode of the current node, whether dividing the current node based on the partition mode would yield a luminance block of the first preset size.
The first preset-sized luminance block may be a luminance block having a pixel size of 4 × 4 or 8 × 8.
If it is determined that dividing the current node based on the partition mode yields a luminance block of the first preset size, step 203 is performed; if it is determined that it does not, step 204 is performed.
Specifically, whether dividing the current node based on the partitioning mode of its luminance block yields a luminance block of the first preset size is determined according to the size of the current node and the partitioning mode of the luminance block of the current node.
In an embodiment, the size of the current node may be understood as the pixel size of the image block corresponding to the current node. The size of the current node may be determined from the width and height of the corresponding image block, from its area, or from its number of luminance pixels. For example, a current node containing 128 luminance pixels may be described as having an area of 128, or as having a width-height product of 128.
The cases in which it is determined, according to the size of the current node and the partitioning mode of its luminance block, that dividing the current node based on the partitioning mode yields a luminance block of the first preset size include one or more of the following (the first set):
1) the current node contains M1 pixels and the partition mode of the current node is quadtree partitioning, where for example M1 is 64;
2) the current node contains M2 pixels and the partition mode of the current node is ternary tree partitioning, where for example M2 is 64;
3) the current node contains M3 pixels and the partition mode of the current node is binary tree partitioning, where for example M3 is 32;
4) the current node contains 64 luminance pixels and uses ternary tree partitioning (vertical or horizontal trisection) or quadtree partitioning, or the current node contains 32 luminance pixels and uses binary tree partitioning (vertical or horizontal bisection);
5) the width of the current node is equal to 4 times the second threshold, the height of the current node is equal to the second threshold, and the partition mode of the current node is vertical ternary tree partitioning;
6) the width of the current node is equal to the second threshold, the height of the current node is equal to 4 times the second threshold, and the partition mode of the current node is horizontal ternary tree partitioning;
7) the width of the current node is equal to 2 times the second threshold, the height of the current node is equal to the second threshold, and the partition mode of the current node is vertical bisection;
8) the height of the current node is equal to 2 times the second threshold, the width of the current node is equal to the second threshold, and the partition mode of the current node is horizontal bisection;
9) the width or/and height of the current node is 2 times the second threshold, and the partition mode of the current node is quadtree partitioning.
In the first set, the width of the current node is the width of the luminance block corresponding to the current node, and the height of the current node is the height of that luminance block. In a specific implementation, the second threshold may be 4, for example.
The first set described above applies when the video data format is YUV 4:2:0 or YUV 4:2:2.
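The conditions in the first set above can be sketched as a single predicate. This is an illustrative sketch, not from any codec reference software: the split-mode names, the function name, and the default `second_threshold = 4` (the example value given above) are all assumptions, and the listed conditions are taken as alternatives exactly as stated.

```python
def split_would_produce_4x4_luma(width, height, split, second_threshold=4):
    """Return True if dividing a node with the given luma width/height using
    `split` satisfies one of the first-set conditions above."""
    pixels = width * height
    if split == "QT" and pixels == 64:                      # condition 1)
        return True
    if split in ("VERT_TT", "HOR_TT") and pixels == 64:     # condition 2)
        return True
    if split in ("VERT_BT", "HOR_BT") and pixels == 32:     # condition 3)
        return True
    t = second_threshold
    if split == "VERT_TT" and width == 4 * t and height == t:   # condition 5)
        return True
    if split == "HOR_TT" and width == t and height == 4 * t:    # condition 6)
        return True
    if split == "VERT_BT" and width == 2 * t and height == t:   # condition 7)
        return True
    if split == "HOR_BT" and height == 2 * t and width == t:    # condition 8)
        return True
    if split == "QT" and (width == 2 * t or height == 2 * t):   # condition 9)
        return True
    return False
```

For instance, an 8x8 node split by a quadtree (condition 1) or an 8x4 node split by a vertical binary tree (condition 3) both yield 4x4 luma blocks.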
Optionally, when the luma block of the first preset size is a luma block with a pixel size of 4 × 4, it may be determined, according to the size of the current node and the division mode, that dividing the current node based on the division mode would obtain a luma block of the first preset size when any of the following holds:
1) the number of sampling points of the luma block of the current node is 64, and the division mode is quadtree division; or
2) the number of sampling points of the luma block of the current node is 64, and the division mode is ternary tree division; or
3) the number of sampling points of the luma block of the current node is 32, and the division mode is binary tree division.
The number of sampling points of the luminance block of the current node is the number of luminance pixels (pixel size) of the image block corresponding to the current node.
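Conditions 1) to 3) share a simple structure: in each case the smallest child produced by the split contains 16 luma samples, i.e. a 4 × 4 luma block. A minimal illustrative sketch, under assumed names and ignoring the aspect-ratio constraints a real codec imposes:

```python
def smallest_child_samples(luma_samples, split):
    # quadtree: four equal parts; ternary tree: the smallest part is one
    # quarter of the node; binary tree: two equal parts
    if split == "QT":
        return luma_samples // 4
    if split in ("HOR_TT", "VERT_TT"):
        return luma_samples // 4
    return luma_samples // 2  # HOR_BT / VERT_BT

def yields_4x4_luma(luma_samples, split):
    # matches conditions 1)-3): 64 samples with QT or TT, 32 samples with BT
    return smallest_child_samples(luma_samples, split) == 16
```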
Step 203, intra prediction is used for all coding blocks covered by the current node.
Using intra prediction for all encoded blocks covered by the current node may include:
dividing the luma block included in the current node according to the division mode to obtain divided luma blocks, and performing intra prediction on the divided luma blocks; and using the chroma block included in the current node as a chroma coding block, and using intra prediction for the chroma coding block.
That is, if it is determined that all coding blocks of the current node use intra prediction, the luma block of the current node is divided according to the division mode of the luma block to obtain N luma coding tree nodes, while the chroma block of the current node is not divided and becomes a chroma coding block (chroma CB for short).
The N luma coding tree nodes may be restricted from being further divided, or may be left unrestricted. If a luma coding tree node is further divided, its division mode is parsed and recursive division is performed; when a luma coding tree node is no longer divided, it corresponds to a luma coding block (luma CB for short). The luma CB obtains its corresponding luma prediction block using intra prediction.
The chroma CB uses intra prediction to obtain its corresponding chroma prediction block, and the chroma prediction block has the same size as the chroma CB.
Optionally, using intra prediction for all coding blocks covered by the current node may include:
using the luma block included in the current node as a luma coding block, and using intra prediction for the luma coding block; and using the chroma block included in the current node as a chroma coding block, and using intra prediction for the chroma coding block. That is, neither the luma block nor the chroma block of the current node is further divided.
Step 204, the current node is divided using the division mode of the current node, without limiting the prediction modes of all coding blocks covered by the current node.
Step 204 of this embodiment is the same as step 104 of the embodiment shown in fig. 9, and reference is specifically made to the above embodiments, which are not repeated herein.
Optionally, after step 203 or step 204, the method further includes:
Step 205, parsing prediction information and residual information of all coding blocks covered by the current node.
Step 206, decoding each coding block to obtain a reconstructed signal of the image block corresponding to the current node.
Step 205 and step 206 in this embodiment are the same as step 105 and step 106 in the embodiment shown in fig. 9, and refer to the above embodiments specifically, which are not described herein again.
In the image prediction method provided in this embodiment, the division mode of the current node is obtained, and whether a luma block of the first preset size would be obtained by dividing the current node based on the division mode of the luma block is determined according to the size of the current node and that division mode. When it is determined that a luma block of the first preset size would be obtained, intra prediction is used for all coding blocks covered by the current node. Using intra prediction for all coding blocks of the current node enables the coding blocks to be processed in parallel and improves the processing performance of image prediction, thereby increasing the coding and decoding speed.
Fig. 11 is a flowchart illustrating a third image prediction method according to an embodiment of the present application. On the basis of the embodiment shown in fig. 10, as shown in fig. 11, it should be noted that the scheme shown in fig. 11 may be a scheme in the case that the video data format is YUV4:2:0 or YUV4:2:2, or may be a scheme only in the case that the video data format is YUV4:2: 0. In a case that it is determined that the dividing of the current node based on the dividing manner does not result in a luminance block having a first preset size, step 204 may include:
Step 2041, it is determined whether the current node is divided based on the division manner to obtain a chroma block with a second preset size.
Executing step 2042 when it is determined that a chroma block with a second preset size is obtained by dividing the current node based on the dividing manner; if it is determined that the chroma block with the second preset size cannot be obtained by dividing the current node based on the dividing manner, step 2043 is performed.
Specifically, step 2041 includes: and determining whether the current node is divided based on the dividing mode of the chrominance blocks to obtain the chrominance blocks with the second preset size or not according to the size of the current node and the dividing mode of the chrominance blocks. The second preset-sized chroma block may be a chroma block having a pixel size of 2 × 2, 2 × 4, or 4 × 2.
According to the size of the current node and the division mode of the chroma block, it is determined that a chroma block of the second preset size would be obtained by dividing the current node based on the division mode of the chroma block when one or more of the conditions in the following second set is satisfied.
When the video data format is YUV 4:2:2, the second set comprises:
1) a size of a chroma block of at least one child node of the current node is 2x2, 2x4, or 4x 2;
2) The width or height of the chroma block of at least one child node of the current node is 2;
3) the current node comprises 64 brightness pixels and the division mode of the current node is ternary tree division or quaternary tree division;
4) the current node comprises 32 luminance pixels and the division mode of the current node is binary tree division or ternary tree division;
5) the area (the product of width and height) of the current node is S, S/2 < th1, and the division mode of the current node is vertical halving or horizontal halving; or the area of the current node is S, S/4 < th1, and the division mode of the current node is vertical ternary, horizontal ternary, or quadtree division. The threshold th1 is 32.
When the video data format is YUV 4:2:0, the second set comprises:
1) the size of a chroma block of at least one child node of the current node is 2x2, 2x4, or 4x2;
2) the width or height of a chroma block of at least one child node of the current node is 2;
3) the current node contains 128 luma pixels and the division mode of the current node is ternary tree division, or the current node contains 64 luma pixels and the division mode of the current node is binary tree, quadtree, or ternary tree division;
4) the current node contains 256 luma pixels and the division mode of the current node is ternary tree or quadtree division, or the current node contains 128 luma pixels and the division mode of the current node is binary tree division;
5) the current node contains N1 luma pixels and the division mode of the current node is ternary tree division, where N1 is 64, 128, or 256;
6) the current node contains N2 luma pixels and the division mode of the current node is quadtree division, where N2 is 64 or 256;
7) the current node contains N3 luma pixels and the division mode of the current node is binary tree division, where N3 is 64, 128, or 256;
8) the area (the product of width and height) of the current node is S, S/2 < th1, and the division mode of the current node is vertical halving or horizontal halving; or the area of the current node is S, S/4 < th1, and the division mode of the current node is vertical ternary, horizontal ternary, or quadtree division. Here the threshold th1 is 64.
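The different th1 values follow from chroma subsampling: YUV 4:2:0 halves the chroma plane in both dimensions, while YUV 4:2:2 halves it only horizontally, so the same chroma block size corresponds to twice the luma area under 4:2:0. A hedged sketch, with illustrative function names and format strings:

```python
def chroma_size(luma_w, luma_h, fmt):
    """Chroma dimensions of a node's chroma block under the given subsampling."""
    if fmt == "4:2:0":
        return luma_w // 2, luma_h // 2   # halved in both dimensions
    if fmt == "4:2:2":
        return luma_w // 2, luma_h        # halved horizontally only
    return luma_w, luma_h                 # 4:4:4: no subsampling

def is_second_preset_chroma(w, h):
    # second preset size: chroma blocks of 2x2, 2x4 or 4x2 pixels
    return (w, h) in {(2, 2), (2, 4), (4, 2)}
```

For example, a 4x8 luma child becomes a 2x4 chroma block under 4:2:0 but a 2x8 chroma block under 4:2:2, which is consistent with th1 being 64 for 4:2:0 and 32 for 4:2:2.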
Alternatively, the luminance block having the first preset size may be a 4 × 4 luminance block, and in the case where the luminance block having the first preset size is a 4 × 4 luminance block, the chrominance block having the second preset size may be a 2 × 4 or 4 × 2 pixel size chrominance block, excluding a 2 × 2 pixel size chrominance block.
Alternatively, the luma block of the first preset size may be a 4 × 4 luma block, and in that case the chroma block of the second preset size may be defined as a block whose corresponding luma block has a pixel size of 4 × 8 or 8 × 4, excluding the case where the corresponding luma block has a pixel size of 4 × 4.
Optionally, when the chroma block of the second preset size is a chroma block with a pixel size of 2 × 4 or 4 × 2 (equivalently, a luma block with a pixel size of 4 × 8 or 8 × 4), it may be determined that dividing the current node based on the division mode would obtain a chroma block of the second preset size when:
1) the number of sampling points of the luma block of the current node is 64, and the division mode is binary tree division; or
2) the number of sampling points of the luma block of the current node is 128, and the division mode is ternary tree division.
Step 2042, intra prediction is used for all coding blocks covered by the current node, or inter prediction is used for all coding blocks covered by the current node.
Whether intra prediction or inter prediction is used for all coding blocks covered by the current node can be determined by the following methods.
Method one: parsing the prediction mode status flag of the current node; when the value of the prediction mode status flag is a first value, using inter prediction for all coding blocks covered by the current node; or, when the value of the prediction mode status flag is a second value, using intra prediction for all coding blocks covered by the current node.
This method determines the prediction mode of all coding blocks covered by the current node according to a flag bit in the syntax table. Specifically, the prediction mode status flag cons_pred_mode_flag is parsed from the bitstream. The first value of cons_pred_mode_flag may be set to 0, indicating that all coding blocks obtained by dividing (or not dividing) the current node use inter prediction, and the second value may be set to 1, indicating that all such coding blocks use intra prediction. Optionally, the mapping may be reversed: the first value of cons_pred_mode_flag is set to 1 to indicate inter prediction, and the second value is set to 0 to indicate intra prediction. The meaning expressed by cons_pred_mode_flag may also be expressed by another identifier (for example, mode_cons_flag), which is not limited in this embodiment.
cons_pred_mode_flag may be a syntax element that needs to be parsed during block division. When this syntax element is parsed, the coding unit prediction mode flag cu_pred_mode of the coding units in the area covered by the current node may no longer be parsed; its value is a default value corresponding to the value of cons_pred_mode_flag.
The semantics of the syntax element cons_pred_mode_flag are as follows: cons_pred_mode_flag equal to 0 indicates that only inter prediction is used for all coding blocks covered by the current node, and cons_pred_mode_flag equal to 1 indicates that only intra prediction is used for all coding blocks covered by the current node.
If the current node is in an intra picture area (that is, the picture type or slice type of the current node is Intra or I) and the IBC mode is allowed, the value of cu_pred_mode is inferred to be 1 and does not need to be parsed from the bitstream; if the current node is in an intra picture area and the IBC mode is not allowed, cu_pred_mode is inferred to be 1 and cu_skip_flag is inferred to be 0, and neither needs to be parsed from the bitstream.
If the current node is in an inter picture area (that is, the picture type or slice type of the current node is Inter or B), the value of cu_pred_mode is inferred to be 0 and does not need to be parsed from the bitstream.
IBC prediction may be classified as an intra prediction mode, since the reference pixels of IBC prediction come from reconstructed pixels in the current picture. Therefore, in the embodiments of this application, intra prediction may include the IBC mode; that is, intra prediction may be performed using the IBC mode, the normal intra prediction mode, or both. Accordingly, intra prediction in the embodiments of this application may also be understood as non-inter prediction.
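The flag semantics of method one can be summarized as follows. This sketch uses the first convention described above (0 for inter, 1 for intra); the function name, the slice-type strings, and the return values are illustrative assumptions, not names from any standard.

```python
def constrained_mode(cons_pred_mode_flag=None, slice_type="B"):
    """Prediction-mode constraint for all coding blocks under a node."""
    # In an intra picture area the mode is inferred rather than parsed
    if slice_type == "I":
        return "INTRA"
    if cons_pred_mode_flag == 0:    # first value: all blocks use inter
        return "INTER"
    if cons_pred_mode_flag == 1:    # second value: all blocks use intra
        return "INTRA"
    return "UNCONSTRAINED"          # flag absent: prediction modes not limited
```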
Optionally, the type (slice type) of the slice in which the current node is located is not an intra (I) type.
Method two: when the prediction mode of any coding block covered by the current node is inter prediction, using inter prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra prediction, using intra prediction for all coding blocks covered by the current node.
The method actually determines the prediction modes of all the coding blocks covered by the current node according to the prediction mode of any coding block in the current node.
Optionally, the coding block may be the first coding block in decoding order among all coding blocks covered by the current node. Specifically, the prediction mode of the first coding block B0 in the current node area is parsed; the prediction mode of B0 itself is not limited in this embodiment. When the parsed prediction mode of B0 is intra prediction, all coding blocks covered by the current node use intra prediction; when the parsed prediction mode of B0 is inter prediction, all coding blocks covered by the current node use inter prediction.
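Method two can be sketched as follows: only the first coding block's prediction mode is parsed, and the remaining blocks inherit it, so no per-block mode flag is needed. The names are illustrative, and `parse_mode` stands in for actual bitstream parsing.

```python
def modes_for_blocks(blocks, parse_mode):
    """blocks: coding blocks covered by the node, in decoding order.
    parse_mode: callable returning 'INTRA' or 'INTER' for a given block."""
    shared = parse_mode(blocks[0])        # mode of B0 decides for all blocks
    return {block: shared for block in blocks}
```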
It should be noted that the steps performed by the above-mentioned method one and method two can be applied to the video decoder 24 shown in fig. 8.
Step 2043, the current node is divided using the division mode of the current node, without limiting the prediction modes of all coding blocks covered by the current node.
Optionally, after step 203, step 2042, or step 2043, the method further includes:
Step 205, parsing prediction information and residual information of all coding blocks covered by the current node.
Step 206, decoding each coding block to obtain a reconstructed signal of the image block corresponding to the current node.
Step 205 and step 206 in this embodiment are the same as step 105 and step 106 in the embodiment shown in fig. 9, and refer to the above embodiments specifically, which are not described herein again.
In the image prediction method provided in this embodiment, the division mode of the current node is obtained, and whether a luma block of the first preset size would be obtained by dividing the current node based on the division mode of the luma block is determined according to the size of the current node and that division mode. When no luma block of the first preset size would be obtained, it is further determined whether a chroma block of the second preset size would be obtained by dividing the current node based on the division mode of the chroma block. When a chroma block of the second preset size would be obtained, intra prediction is used for all coding blocks covered by the current node, or inter prediction is used for all coding blocks covered by the current node. Using a single prediction mode for all coding blocks of the current node enables the coding blocks to be processed in parallel and improves the processing performance of image prediction, thereby increasing the coding and decoding speed.
The following describes the image prediction method provided in the embodiment shown in fig. 11 with reference to two specific examples.
The first example applies when the video data format is YUV 4:2:0 or YUV 4:2:2, or only when the video data format is YUV 4:2:0.
The image prediction method of this example includes:
step 1, obtaining the partition mode of the current node.
Step 2, judging whether the area and the dividing mode of the current node meet at least one of the following conditions A:
(1) the area of the current node is equal to 32, and the division mode of the current node is vertical halving or horizontal halving;
(2) the area of the current node is equal to 64, and the division mode of the current node is vertical ternary, horizontal ternary, or quadtree division.
If the area and division mode of the current node satisfy at least one item of condition A, step 3 is performed.
Step 3, all coding blocks covered by the current node are restricted to use intra prediction.
Optionally, the value of cons_pred_mode_flag is set to 1.
If the area and division mode of the current node do not satisfy condition A, step 4 is performed.
Step 4, judging whether the area and the dividing mode of the current node meet at least one of the following conditions B:
(1) the area S of the current node satisfies S/2 < th1, and the division mode of the current node is vertical halving or horizontal halving;
(2) the area S of the current node satisfies S/4 < th1, and the division mode of the current node is vertical ternary, horizontal ternary, or quadtree division.
Among them, the threshold th1 is related to the video data format, for example, th1 is 64 when the video data format is YUV 4:2:0, and th1 is 32 when the video data format is YUV 4:2: 2.
If the area and division mode of the current node satisfy at least one item of condition B, step 5 is performed.
Step 5, a flag bit cons_pred_mode_flag is parsed from the bitstream, and whether the coding units in the area covered by the current node all use inter prediction or all use intra prediction is determined according to the value of cons_pred_mode_flag.
If the area and division mode of the current node do not satisfy condition B, step 6 is performed.
Step 6, the current node is divided using the division mode of the current node, without limiting the prediction modes of all coding blocks covered by the current node.
Optionally, after step 6, the method further includes:
Step 7, parsing prediction information and residual information of all coding blocks covered by the current node.
Step 8, decoding each coding block to obtain a reconstructed signal of the image block corresponding to the current node.
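The decision flow of the first example (conditions A and B above) can be sketched as a single function. The names and return strings are illustrative assumptions; th1 follows the text: 64 for YUV 4:2:0 and 32 for YUV 4:2:2.

```python
def decide_mode_constraint_example1(area, split, fmt="4:2:0"):
    """area: luma area of the current node; split: its division mode."""
    th1 = 64 if fmt == "4:2:0" else 32
    bt = split in ("HOR_BT", "VERT_BT")
    tt_or_qt = split in ("HOR_TT", "VERT_TT", "QT")
    # condition A: all covered coding blocks are restricted to intra prediction
    if (area == 32 and bt) or (area == 64 and tt_or_qt):
        return "FORCE_INTRA"        # corresponds to cons_pred_mode_flag = 1
    # condition B: cons_pred_mode_flag is parsed from the bitstream
    if (bt and area / 2 < th1) or (tt_or_qt and area / 4 < th1):
        return "PARSE_FLAG"
    return "UNCONSTRAINED"          # prediction modes are not limited
```

Under 4:2:0 this reproduces the second example below: a 64-sample node with binary division or a 128-sample node with ternary division triggers parsing of the flag.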
The second example applies when the video data format is YUV 4:2:0.
The image prediction method of the present example includes:
step 1, obtaining the partition mode of the current node.
Step 2, judging whether the area and the dividing mode of the current node meet the condition C:
The area of the current node is equal to 64, and the division mode of the current node is horizontal ternary, vertical ternary, or quadtree division.
If the area and division mode of the current node satisfy condition C, step 3 is performed.
Step 3, intra prediction is used for the coding units in the area covered by the current node.
Optionally, cons_pred_mode_flag is set to 1.
If the area and division mode of the current node do not satisfy condition C, step 4 is performed.
Step 4, judging whether the area and the dividing mode of the current node meet at least one of the conditions D:
(1) the area of the current node is equal to 64 and the current node uses horizontal bisection or vertical bisection;
(2) the area of the current node is equal to 128 and the current node uses either horizontal thirds or vertical thirds.
If the area and division mode of the current node satisfy at least one item of condition D, step 5 is performed.
Step 5, a flag bit cons_pred_mode_flag is parsed from the bitstream, and whether the coding units in the area covered by the current node all use inter prediction or all use intra prediction is determined according to the value of cons_pred_mode_flag.
If the area and division mode of the current node do not satisfy condition D, step 6 is performed.
Step 6, the current node is divided using the division mode of the current node, without limiting the prediction modes of all coding blocks covered by the current node.
Optionally, after step 6, the method further includes:
Step 7, parsing prediction information and residual information of all coding blocks covered by the current node.
Step 8, decoding each coding block to obtain a reconstructed signal of the image block corresponding to the current node.
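The second example's flow (conditions C and D, YUV 4:2:0 only) reduces to the same kind of predicate; the names and return strings are illustrative assumptions:

```python
def decide_mode_constraint_example2(area, split):
    """area: luma area of the current node; split: its division mode."""
    tt = split in ("HOR_TT", "VERT_TT")
    bt = split in ("HOR_BT", "VERT_BT")
    # condition C: force intra prediction for all covered coding units
    if area == 64 and (tt or split == "QT"):
        return "FORCE_INTRA"
    # condition D: parse cons_pred_mode_flag from the bitstream
    if (area == 64 and bt) or (area == 128 and tt):
        return "PARSE_FLAG"
    return "UNCONSTRAINED"
```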
Fig. 12 is a flowchart illustrating a fourth image prediction method according to an embodiment of the present application. As shown in fig. 12, the image prediction method provided in this embodiment includes:
Step 301, acquiring the division mode of the current node.
Specifically, the partition information of the current node is analyzed, and if the partition information indicates that the chroma block of the current node is partitioned, the partition mode of the chroma block of the current node is further determined. The dividing manner of the chrominance block includes at least one of a quad-tree division, a vertical dichotomy, a horizontal dichotomy, a vertical trisection, and a horizontal trisection, and of course, other dividing manners may also be used, which is not specifically limited in this embodiment.
Step 302, determining whether the current node is divided based on the dividing mode to obtain a chrominance block with a second preset size according to the size and the dividing mode of the current node.
Executing step 303 when it is determined that the chroma block with the second preset size is obtained by dividing the current node based on the dividing manner; in a case that it is determined that the chroma block having the second preset size cannot be obtained by dividing the current node based on the dividing manner, step 304 is performed.
Step 302 of this embodiment is the same as step 2041 of the embodiment shown in fig. 11, and reference may be made to the above embodiments for details, which are not repeated herein.
Optionally, step 302 may include: and determining whether the current node is divided based on the dividing mode to obtain a brightness block with a third preset size or not according to the size and the dividing mode of the current node.
Alternatively, the luminance block having the third preset size may be a 4 × 4, 4 × 8, or 8 × 4 luminance block.
Optionally, the determining whether the current node is divided based on the dividing manner to obtain the chrominance block of the second preset size may include:
1) the number of sampling points of the luma block of the current node is 64, and the division mode is quadtree division; or
2) the number of sampling points of the luma block of the current node is 64, and the division mode is ternary tree division; or
3) the number of sampling points of the luma block of the current node is 32, and the division mode is binary tree division; or
4) the number of sampling points of the luma block of the current node is 64, and the division mode is binary tree division; or
5) the number of sampling points of the luma block of the current node is 128, and the division mode is ternary tree division.
Alternatively, the second preset-sized chrominance block may be a chrominance block having a pixel size of 2 × 4 or 4 × 2, excluding a chrominance block having a pixel size of 2 × 2. Similarly, the luminance block having the third preset size may be a luminance block having a pixel size of 4 × 8 or 8 × 4, excluding a luminance block having a pixel size of 4 × 4. Correspondingly, the determining whether the current node is divided based on the dividing manner to obtain the chrominance block with the second preset size may include:
1) the number of sampling points of the luma block of the current node is 64, and the division mode is binary tree division; or
2) the number of sampling points of the luma block of the current node is 128, and the division mode is ternary tree division.
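Conditions 1) to 5) above can be folded into one predicate on the node's luma sample count and split type. An illustrative sketch under assumed names:

```python
def split_would_produce_small_chroma(luma_samples, split):
    """True if the split satisfies any of conditions 1)-5) above."""
    bt = split in ("HOR_BT", "VERT_BT")
    tt = split in ("HOR_TT", "VERT_TT")
    qt = split == "QT"
    # 64 samples with QT/TT/BT, 32 samples with BT, 128 samples with TT
    return ((luma_samples == 64 and (qt or tt or bt)) or
            (luma_samples == 32 and bt) or
            (luma_samples == 128 and tt))
```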
Step 303, using intra-frame prediction for all coding blocks covered by the current node, or using inter-frame prediction for all coding blocks covered by the current node.
Whether intra prediction is used for all coding blocks covered by the current node or inter prediction is used for all coding blocks covered by the current node can be determined through step 2042 in the embodiment shown in fig. 11, which may specifically refer to the above embodiments, and details are not described here.
Step 304, the current node is divided using the division mode of the current node, without limiting the prediction modes of all coding blocks covered by the current node.
Step 304 of this embodiment is the same as step 104 of the embodiment shown in fig. 9, and reference is specifically made to the above embodiments, which are not repeated herein.
Optionally, after step 303 or step 304, the method further includes:
Step 305, parsing prediction information and residual information of all coding blocks covered by the current node.
Step 306, decoding each coding block to obtain a reconstructed signal of the image block corresponding to the current node.
Step 305 and step 306 in this embodiment are the same as step 105 and step 106 in the embodiment shown in fig. 9, and refer to the above embodiments specifically, which are not described herein again.
In the image prediction method provided in this embodiment, the division mode of the current node is obtained, and whether a chroma block of the second preset size would be obtained by dividing the current node based on the division mode is determined according to the size of the current node and the division mode. When it is determined that a chroma block of the second preset size would be obtained, intra prediction is used for all coding blocks covered by the current node, or inter prediction is used for all coding blocks covered by the current node. Using a single prediction mode for all coding blocks of the current node enables the coding blocks to be processed in parallel and improves the processing performance of image prediction, thereby increasing the coding and decoding speed.
Fig. 13 is a flowchart illustrating a fifth image prediction method according to an embodiment of the present application. Based on the embodiment shown in fig. 12, as shown in fig. 13, in a case that it is determined that a chroma block of the second preset size would not be obtained by dividing the current node based on the division mode, step 304 may include:
step 3041, it is determined whether the current node is divided based on the dividing manner to obtain a luminance block with a first preset size.
Step 3042 is performed when it is determined that a luma block of the first preset size would be obtained by dividing the current node based on the division mode; if it is determined that a luma block of the first preset size would not be obtained by dividing the current node based on the division mode, step 3043 is performed.
Specifically, whether the current node is divided based on the dividing mode of the brightness block to obtain the brightness block with the first preset size is determined according to the size of the current node and the dividing mode of the brightness block. The first preset-sized luminance block is a luminance block with a pixel size of 4 × 4. The specific determination process is the same as step 202 in the embodiment shown in fig. 10, which can be referred to the above embodiment and is not described herein again.
Step 3042, intra prediction is used for all coding blocks covered by the current node.
Step 3043, the current node is divided by the current node dividing method without limiting the prediction modes of all the coding blocks covered by the current node.
Optionally, after step 303, step 3042, or step 3043, the method further includes:
and 305, analyzing prediction block and residual error information of all coding blocks covered by the current node.
And step 306, decoding each coding block to obtain a reconstruction signal of the image block corresponding to the current node.
Step 305 and step 306 in this embodiment are the same as step 105 and step 106 in the embodiment shown in fig. 9, and refer to the above embodiments specifically, which are not described herein again.
In the image prediction method provided by this embodiment, the division mode of the current node is obtained, and whether a chroma block of the second preset size would be obtained by dividing the current node based on the division mode is determined according to the size of the current node and the division mode. When no chroma block of the second preset size would be obtained, it is further determined whether a luma block of the first preset size would be obtained by dividing the current node based on the division mode. When a luma block of the first preset size would be obtained, intra prediction is used for all coding blocks covered by the current node. Using a single prediction mode for all coding blocks of the current node enables the coding blocks to be processed in parallel and improves the processing performance of image prediction, thereby increasing the coding and decoding speed.
Fig. 14 is a flowchart illustrating a sixth image prediction method according to an embodiment of the present application. The image prediction method provided by the present embodiment is applied to the video encoder 18 shown in fig. 8. As shown in fig. 14, the method provided by this embodiment includes:
step 401, obtaining a partition mode of a current node.
At the encoding end, generally, the partition mode allowed by the current node is determined, and then an optimal partition mode is determined as the partition mode of the current node by using a Rate-distortion optimization (RDO) method. This step is prior art and will not be described in detail here.
Step 402, judging whether the size and division manner of the current node satisfy one of the first preset conditions.
Wherein, the first preset condition comprises:
1) the number of sampling points of the luminance block of the current node is 64, and the division manner of the current node is quadtree division; or,
2) the number of sampling points of the luminance block of the current node is 64, and the division manner of the current node is ternary tree division; or,
3) the number of sampling points of the luminance block of the current node is 32, and the division manner of the current node is binary tree division.
The number of sampling points of the luminance block of the current node, that is, the number of luminance pixels (pixel size) of the image block corresponding to the current node, may be obtained according to a product of the width and the height of the current node.
In another embodiment, the first preset condition further includes the following condition 4):
4) dividing the current node according to the division manner obtains a luminance block of a preset size, where the preset size is 4x4 or 8x8.
It should be noted that satisfying one of the first preset conditions means that the division results in a luminance block of the first preset size (4x4 or 8x8) and may also result in a chrominance block of the second preset size (2x4 or 4x2).
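As a non-normative sketch of the check in step 402, the first preset conditions can be expressed as follows (the function name and the split-mode labels are illustrative assumptions, not terms from the text above):

```python
# Illustrative sketch of the first preset condition; the function name and
# split-mode labels ("QT", "HOR_BT", ...) are assumptions for this example.
def meets_first_preset_condition(width, height, split_mode):
    """True if dividing a node of the given luma size with split_mode
    falls under conditions 1)-3) of the first preset condition."""
    luma_samples = width * height  # number of luminance sample points
    if luma_samples == 64 and split_mode == "QT":
        return True  # condition 1): quadtree division of a 64-sample node
    if luma_samples == 64 and split_mode in ("HOR_TT", "VER_TT"):
        return True  # condition 2): ternary tree division of a 64-sample node
    if luma_samples == 32 and split_mode in ("HOR_BT", "VER_BT"):
        return True  # condition 3): binary tree division of a 32-sample node
    return False
```

For instance, an 8x8 node (64 luminance samples) with quadtree division meets condition 1) and would produce four 4x4 luminance blocks.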
In this step, if the size and the dividing manner of the current node satisfy one of the first preset conditions, step 403 is executed; otherwise, if the size and the partition mode of the current node do not satisfy all the conditions in the first preset condition, step 404 is executed.
Step 403, intra prediction is used for all coding blocks covered by the current node.
Optionally, when it is determined that all coding blocks covered by the current node use intra prediction, the value of mode_constraint_flag is set to 1 and does not need to be written into the code stream; correspondingly, the decoding end may derive the value of mode_constraint_flag as 1 according to the same method.
Step 404, judging whether the size and division manner of the current node satisfy one of the second preset conditions.
Wherein the second preset condition comprises:
1) the number of sampling points of the luminance block of the current node is 64, and the division manner of the current node is vertical binary tree or horizontal binary tree division; or,
2) the number of sampling points of the luminance block of the current node is 128, and the division manner of the current node is vertical ternary tree or horizontal ternary tree division.
In another embodiment, the second preset condition further includes the following condition 3):
3) dividing the current node according to the division manner obtains a chrominance block of a preset size, where the preset size is 2x4 or 4x2.
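A matching non-normative sketch of the second preset conditions (the function name and split-mode labels are assumptions for illustration only):

```python
# Illustrative sketch of the second preset condition; labels are assumptions.
def meets_second_preset_condition(width, height, split_mode):
    """True if the node size and division manner fall under conditions
    1)-2) of the second preset condition (which, in 4:2:0 sampling,
    would yield a 2x4 or 4x2 chrominance block)."""
    luma_samples = width * height
    if luma_samples == 64 and split_mode in ("HOR_BT", "VER_BT"):
        return True  # condition 1): binary tree division of a 64-sample node
    if luma_samples == 128 and split_mode in ("HOR_TT", "VER_TT"):
        return True  # condition 2): ternary tree division of a 128-sample node
    return False
```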
In this step, if the size and the division manner of the current node satisfy one of the second preset conditions, step 405 is executed; if the size and the division manner of the current node do not satisfy all the conditions in the second preset conditions, step 406 is executed.
Step 405, using intra-frame prediction for all coding blocks covered by the current node, or using inter-frame prediction for all coding blocks covered by the current node.
The prediction modes used by all coding blocks of the current node can be determined in several ways in this embodiment as follows:
In one implementation, if the type of the picture or slice in which the current node is located is type I, it is determined that all coding blocks in the current node use only intra prediction (non-inter prediction). Optionally, the value of mode_constraint_flag is set to 1, and in this case mode_constraint_flag does not need to be written into the code stream.
If the picture type or slice type in which the current node is located is not type I, a rate-distortion optimization (RDO) method or another method is used to determine the value of mode_constraint_flag.
In the RDO method, the encoder calculates the rate-distortion costs (RD cost) of using inter prediction and of using intra prediction for all coding blocks covered by the current node, compares the two values, and determines the prediction mode with the smaller rate-distortion cost as the final prediction mode. If the prediction mode with the smaller rate-distortion cost is intra prediction, mode_constraint_flag is set to 1; if it is inter prediction, mode_constraint_flag is set to 0, and the value of mode_constraint_flag is written into the code stream.
For example, the encoder first calculates the RD cost when all the coding blocks covered by the current node use inter prediction, and then the RD cost when they use intra prediction. If all the coding blocks covered by the current node use inter prediction without residual (for example, skip mode), it is determined that all the coding blocks covered by the current node use inter prediction, the value of mode_constraint_flag is set to 0, and the RD cost of intra prediction does not need to be calculated. The encoder may instead calculate the RD cost of intra prediction first and then that of inter prediction, and determine the prediction mode with the smaller RD cost as the final prediction mode.
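The encoder-side decision just described can be sketched as below; this is a simplified illustration (the function name and the early-out parameter are assumptions), not the actual RDO implementation:

```python
# Simplified sketch of the mode_constraint_flag decision in step 405;
# the function name and parameters are illustrative assumptions.
def decide_mode_constraint_flag(rd_cost_intra, rd_cost_inter,
                                all_inter_without_residual=False):
    """Return 1 for intra-only, 0 for inter-only."""
    if all_inter_without_residual:
        # e.g. every block chose skip mode: select inter prediction
        # without computing the intra RD cost at all
        return 0
    return 1 if rd_cost_intra < rd_cost_inter else 0
```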
It should be noted that if the current node is in an intra picture area (i.e., the picture type or slice type (slice_type) in which the current node is located is intra or I type) and the IBC mode is allowed to be used, the value of pred_mode_flag defaults to 1. If the current node is in an intra picture area and the IBC mode is not allowed to be used, pred_mode_flag defaults to 1 and cu_skip_flag defaults to 0 (indicating that the current block does not use skip mode). If the current node is in an intra picture area, mode_constraint_flag defaults to 1.
Step 406, the current node is divided in the division manner of the current node without limiting the prediction modes of all coding blocks covered by the current node.
Optionally, after steps 403, 405, and 406, the method further includes:
step 407, determining whether the luminance block and the chrominance block of the current node are continuously divided according to the prediction mode of the current node.
If it is determined that all coding blocks in the current node use only intra prediction, the luminance block included in the current node is divided according to the division manner to obtain divided luminance blocks, and intra prediction is used for the divided luminance blocks; the chrominance block included in the current node is used as a chrominance coding block, and intra prediction is used for the chrominance coding block. That is, if it is determined that all the coding blocks in the current node use intra prediction, the luminance block of the current node is divided according to the division manner of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is not divided, yielding one chrominance coding block (chroma CB for short). The N luminance coding tree nodes may be restricted from being further divided, or left unrestricted. If a luminance coding tree node is further divided, its division manner is parsed and recursive division is performed; when a luminance coding tree node is no longer divided, it corresponds to one luminance coding block (luma CB for short). The luma CB obtains its corresponding luminance prediction block using intra prediction, and the chroma CB obtains its corresponding chrominance prediction block using intra prediction, where the chrominance prediction block and the chroma CB have the same size.
In another implementation, the luminance block and chrominance block included in the current node are divided according to the division manner to obtain divided nodes; when a coding tree node is no longer divided, it corresponds to a coding unit comprising a luminance coding unit and a chrominance coding unit, and the divided luminance coding unit and chrominance coding unit use intra prediction.
In one implementation, the luminance block included in the current node is divided according to the division manner to obtain divided luminance blocks, and inter prediction is used for the divided luminance blocks; the chrominance block included in the current node is divided according to the division manner to obtain divided chrominance blocks, and inter prediction is used for the divided chrominance blocks. That is, if it is determined that all the coding blocks of the current node use inter prediction, the luminance block of the current node is divided according to the division manner of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is divided according to the division manner of the chrominance block to obtain M chrominance coding tree nodes, where N and M are positive integers that may be equal or different. The N luminance coding tree nodes and the M chrominance coding tree nodes may be restricted from being further divided, or left unrestricted. When they are not further divided, the N luminance coding tree nodes correspond to N luma CBs of the current node, and the M chrominance coding tree nodes correspond to M chroma CBs of the current node. The N luma CBs obtain their corresponding luminance prediction blocks using inter prediction, and the M chroma CBs obtain their corresponding chrominance prediction blocks using inter prediction.
Specifically, if all the coding blocks in the current node use only inter prediction, and dividing the current node according to the division manner obtains a child node that needs to be further divided, where dividing the child node according to its division manner would obtain a luminance block of a preset size, for example 4x4 (i.e., both width and height are 4), then that division manner of the child node is not allowed, or the child node cannot be further divided. Specifically, if a node is restricted to use only inter prediction and the number of luminance sample points of the node is 32 (or the product of the width and height of the node is 32), the node is not allowed to use binary tree division (including horizontal binary tree and vertical binary tree division). If a node is restricted to use only inter prediction and the number of luminance sample points of the node is 64 (or the product of the width and height of the node is 64), the node is not allowed to use ternary tree division (including horizontal ternary tree and vertical ternary tree division). This judgment method is applicable to the video data formats YUV4:2:0 and YUV4:2:2.
For example, if the current node is 8x8 in size and horizontal binary tree (or vertical binary tree) division generates two 8x4 (or two 4x8) nodes, continued division of an 8x4 (or 4x8) node would generate a 4x4 block; therefore, an 8x4 (or 4x8) node cannot use vertical binary tree division (or horizontal binary tree division), or cannot be further divided. For another example, if the number of sampling points of the luminance block of the current node is 128 and the division manner is horizontal ternary tree or vertical ternary tree division, a child node whose luminance block has 64 sampling points may be obtained, and dividing such a 64-sample-point node by horizontal ternary tree or vertical ternary tree division would obtain a 4x4 luminance block; therefore, when only inter prediction is restricted to be used, a node with 64 sampling points cannot use horizontal ternary tree or vertical ternary tree division, or cannot be further divided.
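The split restriction for inter-only nodes can be sketched as follows; the helper name and split-mode labels are illustrative assumptions:

```python
# Illustrative sketch of the split restriction for nodes restricted to
# inter prediction; names and labels are assumptions for this example.
ALL_SPLITS = {"QT", "HOR_BT", "VER_BT", "HOR_TT", "VER_TT"}

def allowed_splits_when_inter_only(width, height):
    """Division manners still permitted for a node restricted to inter
    prediction, so that no 4x4 luminance block can arise (YUV 4:2:0 / 4:2:2)."""
    luma_samples = width * height
    disallowed = set()
    if luma_samples == 32:
        disallowed |= {"HOR_BT", "VER_BT"}  # BT would yield a 4x4 luma block
    if luma_samples == 64:
        disallowed |= {"HOR_TT", "VER_TT"}  # TT would yield a 4x4 luma block
    return ALL_SPLITS - disallowed
```

For example, an 8x4 node (32 samples) may not use binary tree division, and an 8x8 node (64 samples) may not use ternary tree division.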
Step 408, predicting the coding blocks of the CUs obtained by dividing the current node to obtain the prediction values of the coding blocks.
If all the coding blocks in the current node use only intra prediction, the encoding end determines the optimal intra prediction mode of the current coding block by using a rate-distortion optimization (RDO) method or another method, and the current coding block is predicted using the corresponding intra prediction mode to obtain the prediction value of the current block.
If all the coding blocks in the current node use only inter prediction, the encoding end determines the optimal inter prediction mode of the current coding block by using a rate-distortion optimization method or another method, and the current coding block is predicted using the corresponding inter prediction mode to obtain the prediction value of the current block.
Meanwhile, the encoding end assigns values to the CU-level syntax elements and writes the values of the syntax elements into the code stream according to the CU-level syntax definition. For example, if all the coding blocks within the current node use only intra prediction, the value of pred_mode_flag is set to 1 and is not written into the code stream and does not appear in the code stream. If all the coding blocks in the current node use only intra prediction and it is determined that the IBC mode is not used, the value of cu_skip_flag (or skip_flag) is 0 and is not written into the code stream; otherwise, the value of cu_skip_flag needs to be determined, written into the code stream, and transmitted to the decoding end.
If all the coding blocks in the current node use only inter prediction, the value of pred_mode_flag is set to 0 and is not written into the code stream and does not appear in the code stream. pred_mode_ibc_flag is set to 0 and is likewise not written into the code stream and does not appear in the code stream.
Step 409, acquiring a reconstruction signal of the image block in the current node.
After prediction information is obtained using intra prediction or inter prediction, residual information is obtained by subtracting the corresponding prediction information (or predicted value) from the pixel values of the pixels in the current coding block; the residual information is then transformed using methods such as the Discrete Cosine Transform (DCT), and the code stream is obtained through quantization and entropy coding. The encoding end transmits the residual information to the decoding end. After the prediction signal is added to the reconstructed residual signal, a further filtering operation is performed to obtain the reconstructed signal, which is used as a reference signal for subsequent coding. In particular, if the coding block uses skip mode, no residual information and no transform are needed, and the predicted value is the final reconstructed value.
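The reconstruction step, including the skip-mode special case, can be sketched as below (a minimal illustration assuming 8-bit samples; the function name is an assumption):

```python
# Minimal sketch of block reconstruction; assumes 8-bit samples and
# omits the in-loop filtering mentioned in the text.
def reconstruct_block(pred, residual, use_skip):
    """pred/residual: 2-D lists of sample values; returns the
    reconstruction before filtering, clipped to the 8-bit range."""
    if use_skip:
        return pred  # skip mode: no residual, prediction is the reconstruction
    return [[min(255, max(0, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```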
In this embodiment, the image prediction method is described from the perspective of the video encoding end: the video encoder determines, according to the size and division manner of the current node, whether all coding blocks of the current node use intra prediction or inter prediction, so that parallel processing of all the coding blocks of the current node can be realized, the processing performance of image prediction is improved, and the processing speed of encoding is increased.
The image prediction method provided by the present embodiment is applied to the video encoder 18 and/or the video decoder 24 shown in fig. 8. The embodiment comprises the following steps:
Step 501, acquiring the division manner of the current node.
Step 501 of this embodiment is the same as step 101 of the embodiment shown in fig. 9, and is not described here again.
Step 502, the value of the variable modeTypeCondition is derived according to the following method:
The value of modeTypeCondition is a first value, such as 0, if one or more of the following first preset conditions holds:
1) The picture type or slice type in which the current node is located is I type (slice_type == I), and the value of qtbtt_dual_tree_intra_flag is 1.
2) The prediction mode type of the current node is to use only intra prediction or inter prediction, i.e., has been restricted to use only inter prediction or intra prediction (non-inter prediction).
3) The chroma sampling structure is a Monochrome sampling structure (Monochrome) or a 4:4:4 structure. For example, the value of chroma _ format _ idc is 0 or 3.
In another embodiment, the first preset condition further includes the following condition 4):
4) the chroma sampling structure is a monochromatic sampling structure (Monochrome) or a 4:4:4 or 4:2:2 structure. For example, the value of chroma _ format _ idc is 0 or 3 or 2.
Otherwise, if one or more of the following preset conditions two holds, the value of modeTypeCondition is a second value, e.g., 1.
1) the product of the width and the height of the luminance block of the current node is 64, and the division manner of the current node is quadtree division;
2) the product of the width and the height of the luminance block of the current node is 64, and the division manner of the current node is horizontal ternary tree or vertical ternary tree division;
3) the product of the width and the height of the luminance block of the current node is 32, and the division manner of the current node is horizontal binary tree or vertical binary tree division.
Otherwise, if one or more of the following third preset conditions holds and the chroma sampling structure is 4:2:0 (the value of chroma_format_idc is 1), the value of modeTypeCondition is derived according to the following formula: modeTypeCondition = 1 + (slice_type != I ? 1 : 0).
1) the product of the width and the height of the luminance block of the current node is 64, and the division manner of the current node is horizontal binary tree or vertical binary tree division;
2) the product of the width and the height of the luminance block of the current node is 128, and the division manner of the current node is horizontal ternary tree or vertical ternary tree division.
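The full derivation of modeTypeCondition in step 502 can be sketched as follows; this is a non-normative illustration (the function name, the boolean `mode_type_restricted` flag summarizing the second preset condition 2), and the split-mode labels are assumptions):

```python
# Non-normative sketch of the modeTypeCondition derivation in step 502;
# names, labels and the mode_type_restricted flag are assumptions.
def derive_mode_type_condition(slice_type, qtbtt_dual_tree_intra_flag,
                               mode_type_restricted, chroma_format_idc,
                               width, height, split_mode):
    samples = width * height
    # first preset conditions -> first value 0
    if ((slice_type == "I" and qtbtt_dual_tree_intra_flag == 1)
            or mode_type_restricted
            or chroma_format_idc in (0, 3)):   # Monochrome or 4:4:4
        return 0
    # second preset conditions -> second value 1
    if ((samples == 64 and split_mode == "QT")
            or (samples == 64 and split_mode in ("HOR_TT", "VER_TT"))
            or (samples == 32 and split_mode in ("HOR_BT", "VER_BT"))):
        return 1
    # third preset conditions, 4:2:0 only -> 1 + (slice_type != I ? 1 : 0)
    if chroma_format_idc == 1 and (
            (samples == 64 and split_mode in ("HOR_BT", "VER_BT"))
            or (samples == 128 and split_mode in ("HOR_TT", "VER_TT"))):
        return 1 + (1 if slice_type != "I" else 0)
    return 0  # none of the preset conditions holds
```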
Note that Table 3 shows the correspondence between the chroma sampling structure and chroma_format_idc.
TABLE 3
chroma_format_idc | separate_colour_plane_flag | Chroma format | SubWidthC | SubHeightC
0 | 0 | Monochrome | 1 | 1
1 | 0 | 4:2:0 | 2 | 2
2 | 0 | 4:2:2 | 2 | 1
3 | 0 | 4:4:4 | 1 | 1
3 | 1 | 4:4:4 | 1 | 1
In monochrome sampling (Monochrome), there are no chroma components; only an array of luminance components exists.
In 4:2:0 sampling, the width of each of the two chrominance components is half the width of the corresponding luminance component, and the height of each chrominance component is half the height of the luminance component.
In 4:2:2 sampling, the height of each of the two chrominance components is the same as that of the corresponding luminance component, and the width of each chrominance component is half the width of the corresponding luminance component.
In 4:4:4 sampling, it depends on the value of separate_colour_plane_flag: if separate_colour_plane_flag is equal to 0, the width and height of each of the two chrominance components are the same as the luminance width and height, respectively; otherwise (separate_colour_plane_flag is equal to 1), the three components are coded separately as monochrome sampled images.
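The subsampling relationships above can be sketched with the SubWidthC/SubHeightC values from Table 3 (the dictionary and function names are illustrative assumptions):

```python
# chroma_format_idc -> (SubWidthC, SubHeightC), following Table 3;
# the names here are assumptions for this illustration.
SUBSAMPLING = {
    0: (1, 1),   # Monochrome (no chroma planes at all)
    1: (2, 2),   # 4:2:0
    2: (2, 1),   # 4:2:2
    3: (1, 1),   # 4:4:4
}

def chroma_block_size(luma_width, luma_height, chroma_format_idc):
    """Chrominance block dimensions for a given luminance block."""
    sub_w, sub_h = SUBSAMPLING[chroma_format_idc]
    return luma_width // sub_w, luma_height // sub_h
```

For example, a 4x4 luminance block corresponds to a 2x2 chrominance block in 4:2:0 sampling and a 2x4 chrominance block in 4:2:2 sampling.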
separate_colour_plane_flag equal to 1 specifies that the three colour components of the 4:4:4 chroma format are coded separately. separate_colour_plane_flag equal to 0 specifies that the colour components are not coded separately.
qtbtt_dual_tree_intra_flag equal to 1 specifies that for I slices, each CTU is split into coding units with 64x64 luma samples using an implicit quadtree split and that these coding units are the root of two separate coding_tree syntax structures for luma and chroma. qtbtt_dual_tree_intra_flag equal to 0 specifies that separate coding_tree syntax structure is not used for I slices. When qtbtt_dual_tree_intra_flag is not present, it is inferred to be equal to 0.
Step 503, determining the prediction mode types of all coding units in the current node according to the value of modeTypeCondition.
Specifically, if the value of modeTypeCondition is 1, all coding units in the current node are restricted to use intra prediction (MODE_INTRA). Otherwise, if the value of modeTypeCondition is 2, the value of the syntax element mode_constraint_flag is parsed from the bitstream; if the value of mode_constraint_flag is 0, all coding units in the current node use inter prediction (MODE_INTER), and if the value is 1, all coding units in the current node use intra prediction (non-inter prediction / MODE_INTRA).
Otherwise, the prediction mode types of all the coding units in the current node are not limited and are the same as the prediction mode type of the current node.
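The three branches of step 503 can be sketched as below; this is a non-normative illustration (the function name, the mode-type strings, and the callable for bitstream parsing are assumptions):

```python
# Non-normative sketch of step 503; names and strings are assumptions.
def coding_unit_mode_type(mode_type_condition, parent_mode_type,
                          read_mode_constraint_flag):
    """read_mode_constraint_flag: callable that parses the flag from
    the bitstream; it is only invoked when modeTypeCondition is 2."""
    if mode_type_condition == 1:
        return "MODE_TYPE_INTRA"   # all CUs restricted to intra prediction
    if mode_type_condition == 2:
        flag = read_mode_constraint_flag()
        return "MODE_TYPE_INTRA" if flag == 1 else "MODE_TYPE_INTER"
    return parent_mode_type        # unrestricted: inherit from the current node
```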
Step 504, determining whether the chroma block and the luminance block corresponding to the current node are continuously divided, so as to obtain a chroma coding unit and a luminance coding unit.
(same as step 407.)
If it is determined that all coding blocks in the current node use only intra prediction, the luminance block included in the current node is divided according to the division manner to obtain divided luminance blocks, and intra prediction is used for the divided luminance blocks; the chrominance block included in the current node is used as a chrominance coding block, and intra prediction is used for the chrominance coding block. That is, if it is determined that all the coding blocks in the current node use intra prediction, the luminance block of the current node is divided according to the division manner of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is not divided, yielding one chrominance coding block (chroma CB for short). The N luminance coding tree nodes may be restricted from being further divided, or left unrestricted. If a luminance coding tree node is further divided, its division manner is parsed and recursive division is performed; when a luminance coding tree node is no longer divided, it corresponds to one luminance coding block (luma CB for short). The luma CB obtains its corresponding luminance prediction block using intra prediction, and the chroma CB obtains its corresponding chrominance prediction block using intra prediction, where the chrominance prediction block and the chroma CB have the same size.
In another implementation, the luminance block and chrominance block included in the current node are divided according to the division manner to obtain divided nodes; when a coding tree node is no longer divided, it corresponds to a coding unit comprising a luminance coding unit and a chrominance coding unit, and the divided luminance coding unit and chrominance coding unit use intra prediction.
In one implementation, the luminance block included in the current node is divided according to the division manner to obtain divided luminance blocks, and inter prediction is used for the divided luminance blocks; the chrominance block included in the current node is divided according to the division manner to obtain divided chrominance blocks, and inter prediction is used for the divided chrominance blocks. That is, if it is determined that all the coding blocks of the current node use inter prediction, the luminance block of the current node is divided according to the division manner of the luminance block to obtain N luminance coding tree nodes, and the chrominance block of the current node is divided according to the division manner of the chrominance block to obtain M chrominance coding tree nodes, where N and M are positive integers that may be equal or different. The N luminance coding tree nodes and the M chrominance coding tree nodes may be restricted from being further divided, or left unrestricted. When they are not further divided, the N luminance coding tree nodes correspond to N luma CBs of the current node, and the M chrominance coding tree nodes correspond to M chroma CBs of the current node. The N luma CBs obtain their corresponding luminance prediction blocks using inter prediction, and the M chroma CBs obtain their corresponding chrominance prediction blocks using inter prediction.
Specifically, if all the coding blocks in the current node use only inter prediction, and dividing the current node according to the division manner obtains a child node that needs to be further divided, where dividing the child node according to its division manner would obtain a luminance block of a preset size, for example 4x4 (i.e., both width and height are 4), then that division manner of the child node is not allowed, or the child node cannot be further divided. Specifically, if a node is restricted to use only inter prediction and the number of luminance sample points of the node is 32 (or the product of the width and height of the node is 32), the node is not allowed to use binary tree division (including horizontal binary tree and vertical binary tree division). If a node is restricted to use only inter prediction and the number of luminance sample points of the node is 64 (or the product of the width and height of the node is 64), the node is not allowed to use ternary tree division (including horizontal ternary tree and vertical ternary tree division). This judgment method is applicable to the video data formats YUV4:2:0 and YUV4:2:2.
For example, if the current node is 8x8 in size and horizontal binary tree (or vertical binary tree) division generates two 8x4 (or two 4x8) nodes, continued division of an 8x4 (or 4x8) node would generate a 4x4 block; therefore, an 8x4 (or 4x8) node cannot use vertical binary tree division (or horizontal binary tree division), or cannot be further divided. For another example, if the number of sampling points of the luminance block of the current node is 128 and the division manner is horizontal ternary tree or vertical ternary tree division, a child node whose luminance block has 64 sampling points may be obtained, and dividing such a 64-sample-point node by horizontal ternary tree or vertical ternary tree division would obtain a 4x4 luminance block; therefore, when only inter prediction is restricted to be used, a node with 64 sampling points cannot use horizontal ternary tree or vertical ternary tree division, or cannot be further divided.
Step 505, parsing the coding unit to obtain prediction mode information.
Syntax elements related to intra or inter prediction are parsed according to the prediction mode type of the coding unit to obtain the final prediction mode of the coding unit, and prediction is performed using the corresponding prediction mode to obtain the predicted value.
If the current node is in an intra picture area (i.e., the picture type or slice type in which the current node is located is intra or I type) and the IBC mode is allowed to be used, the value of cu_pred_mode is derived to be 1 and does not need to be parsed from the code stream; if the current node is in an intra picture area and the IBC mode is not allowed to be used, cu_pred_mode is derived to be 1 and cu_skip_flag is 0, and neither needs to be parsed from the code stream.
If the current node is in an inter picture area (i.e., the picture type or slice type in which the current node is located is inter or B type), the value of cu_pred_mode is derived to be 0 and does not need to be parsed from the code stream.
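The decoder-side inference of these flags can be sketched as follows; this is a hypothetical illustration of the rules just stated (the function name, the `None` convention for "still to be parsed", and the boolean parameters are assumptions):

```python
# Hypothetical sketch of the flag inference in step 505; the function
# name and the None-means-parse convention are assumptions.
def infer_flags_without_parsing(in_intra_area, ibc_allowed):
    """Return (cu_pred_mode, cu_skip_flag); None means the flag is not
    inferred here and must still be obtained elsewhere."""
    if in_intra_area:
        if ibc_allowed:
            return 1, None   # cu_pred_mode inferred 1; skip flag handled separately
        return 1, 0          # IBC disallowed: cu_skip_flag inferred 0
    return 0, None           # inter picture area: cu_pred_mode inferred 0
```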
Step 506, decoding each coding block to obtain a reconstruction signal of the image block corresponding to the current node.
For example, the prediction block of each CU performs inter prediction processing or intra prediction processing on each CU to obtain an inter prediction image or an intra prediction image of each CU. And according to the residual information of each CU, carrying out inverse quantization and inverse transformation on the transformation coefficient to obtain a residual image, and overlapping the residual image on the predicted image of the corresponding area to generate a reconstructed image.
The corresponding encoding-side method is as follows:
The image prediction method provided by this embodiment is applied to the video encoder 18 shown in fig. 8. The embodiment includes the following steps:
step 601, obtaining the dividing mode of the current node.
Step 601 of this embodiment is the same as step 501, and is not described herein again.
Step 602: derive the value of the variable modeTypeCondition according to the following method.
The value of modeTypeCondition is a first value, for example 0, if one or more of the following conditions in preset condition one holds:
1) The picture type or slice type of the current node is I (slice_type == I), and the value of qtbtt_dual_tree_intra_flag is 1.
2) The prediction mode type of the current node is already restricted, that is, the current node has been restricted to use only inter prediction or only intra prediction (non-inter prediction).
3) The chroma sampling structure is a monochrome sampling structure (Monochrome) or a 4:4:4 structure; for example, the value of chroma_format_idc is 0 or 3.
In another embodiment, the first preset condition further includes the following condition 4):
4) The chroma sampling structure is a monochrome sampling structure (Monochrome), a 4:4:4 structure, or a 4:2:2 structure; for example, the value of chroma_format_idc is 0, 3 or 2.
Otherwise, if one or more of the following conditions in preset condition two holds, the value of modeTypeCondition is a second value, for example 1:
1) The product of the width and height of the luminance block of the current node is 64, and the partition mode of the current node is quadtree partitioning;
2) the product of the width and height of the luminance block of the current node is 64, and the partition mode of the current node is horizontal or vertical ternary tree partitioning;
3) the product of the width and height of the luminance block of the current node is 32, and the partition mode of the current node is horizontal or vertical binary tree partitioning.
Otherwise, if one or more of the following conditions in preset condition three holds and the chroma sampling structure is 4:2:0 (the value of chroma_format_idc is 1), the value of modeTypeCondition is derived according to the formula modeTypeCondition = 1 + (slice_type != I ? 1 : 0):
1) The product of the width and height of the luminance block of the current node is 64, and the partition mode of the current node is horizontal or vertical binary tree partitioning;
2) the product of the width and height of the luminance block of the current node is 128, and the partition mode of the current node is horizontal or vertical ternary tree partitioning.
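The three condition groups above can be summarized in a sketch such as the following. It is a simplification under the assumptions noted in the comments; parameter names such as `mode_type` and the split labels are illustrative, not syntax elements of the embodiment:

```python
def derive_mode_type_condition(luma_w, luma_h, split, slice_type,
                               dual_tree_intra, mode_type, chroma_format_idc):
    """Sketch of the modeTypeCondition derivation described in step 602.
    `split` is one of QT / HOR_BT / VER_BT / HOR_TT / VER_TT;
    `mode_type` is 'ALL', 'INTER_ONLY' or 'INTRA_ONLY'."""
    area = luma_w * luma_h
    # Preset condition one -> first value (0)
    if ((slice_type == "I" and dual_tree_intra)
            or mode_type != "ALL"                # already restricted
            or chroma_format_idc in (0, 3)):     # monochrome or 4:4:4
        return 0
    # Preset condition two -> second value (1)
    if ((area == 64 and split == "QT")
            or (area == 64 and split in ("HOR_TT", "VER_TT"))
            or (area == 32 and split in ("HOR_BT", "VER_BT"))):
        return 1
    # Preset condition three (4:2:0 only) -> 1 + (slice_type != I ? 1 : 0)
    if chroma_format_idc == 1 and (
            (area == 64 and split in ("HOR_BT", "VER_BT"))
            or (area == 128 and split in ("HOR_TT", "VER_TT"))):
        return 1 + (1 if slice_type != "I" else 0)
    return 0
```

For instance, an 8x8 node (64 luma samples) split by a horizontal binary tree in a B slice with 4:2:0 sampling falls under condition group three and yields the value 2.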
Step 603, determining the prediction mode types of all coding units in the current node according to the value of modeTypeCondition.
Specifically, if the value of modeTypeCondition is 1, all coding units in the current node are restricted to using intra prediction (MODE_INTRA). Optionally, mode_constraint_flag is set to 1.
Otherwise, if the value of modeTypeCondition is 2, the value of the syntax element mode_constraint_flag is determined using rate-distortion optimization (RDO). For example, the RD cost when all coding units in the current node use inter prediction is calculated first, and then the RD cost when they use intra prediction is calculated. If all coding units in the current node produce no residual when inter prediction is used (for example, skip mode), it is determined that all coding units in the current node use inter prediction, the value of mode_constraint_flag is set to 0, and the RD cost of intra prediction does not need to be calculated. Alternatively, the RD cost when all coding units in the current node use intra prediction may be calculated first, then the RD cost when inter prediction is used, and the prediction mode with the minimum RD cost is selected as the final prediction mode.
Otherwise, the prediction mode types of all the coding units in the current node are not limited and are the same as the prediction mode type of the current node.
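A hypothetical, much-simplified encoder-side summary of step 603 might look as follows; the return values and parameter names are illustrative only and do not appear in the embodiment:

```python
def decide_mode_type(mode_type_condition, rd_cost_inter, rd_cost_intra, all_cus_skip):
    """Illustrative simplification of step 603: map modeTypeCondition to the
    prediction mode type of all coding units in the current node."""
    if mode_type_condition == 1:
        # All CUs restricted to intra prediction; mode_constraint_flag would be 1.
        return "INTRA_ONLY"
    if mode_type_condition == 2:
        if all_cus_skip:
            # No residual under inter prediction (e.g. skip mode):
            # choose inter without evaluating the intra RD cost.
            return "INTER_ONLY"
        return "INTER_ONLY" if rd_cost_inter <= rd_cost_intra else "INTRA_ONLY"
    # Otherwise unconstrained: inherit the prediction mode type of the current node.
    return "UNCONSTRAINED"
```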
Specifically, if the current node is in an intra picture region (that is, the picture type or slice type (slice_type) of the current node is Intra or I) and the IBC mode is allowed to be used, the value of pred_mode_flag defaults to 1; if the current node is in an intra picture region and the IBC mode is not allowed, pred_mode_flag defaults to 1 and cu_skip_flag is 0.
Step 604: determine the partition modes of the chrominance block and the luminance block corresponding to the current node to obtain chrominance coding units and luminance coding units.
Specifically, the same as step 504.
Step 605: predict the coding blocks of the CUs obtained by partitioning the current node to obtain the prediction values of the coding blocks.
If all the coding blocks in the current node use only intra prediction, the encoder determines the optimal intra prediction mode for the current coding block using rate-distortion optimization (RDO) or another method, and the current coding block is predicted with the corresponding intra prediction mode to obtain the prediction value of the current block.
If all the coding blocks in the current node use only inter prediction, the encoder determines the optimal inter prediction mode for the current coding block using rate-distortion optimization or another method, and the current coding block is predicted with the corresponding inter prediction mode to obtain the prediction value of the current block.
Meanwhile, the encoder assigns values to the CU-level syntax elements and writes them into the bitstream according to the CU-level syntax definitions. For example, if all the coding blocks within the current node use only intra prediction, the value of pred_mode_flag is set to 1 but is not written into the bitstream and does not appear in the bitstream. If all the coding blocks in the current node use only intra prediction and it is determined that the IBC mode is not used, the value of cu_skip_flag (or skip_flag) is 0 and is not written into the bitstream; otherwise, the value of cu_skip_flag is determined, written into the bitstream, and transmitted to the decoder.
If all the coding blocks in the current node use only inter prediction, the value of pred_mode_flag is set to 0 but is not written into the bitstream and does not appear in the bitstream; pred_mode_ibc_flag is set to 0 and likewise is not written into the bitstream.
Step 606: obtaining a reconstructed signal of an image block within a current node
After the prediction information is obtained by intra or inter prediction, the residual information is obtained by subtracting the corresponding prediction value from the pixel values of the current coding block; the residual information is then transformed by a method such as the discrete cosine transform (DCT), quantized, and entropy-coded to obtain the bitstream. The encoder transmits the residual information to the decoder. After the prediction signal is added to the reconstructed residual signal, a further filtering operation is applied to obtain the reconstructed signal, which serves as the reference signal for subsequent coding. In particular, if the coding block uses the skip mode, no residual information and no transform are needed, and the prediction value is the final reconstructed value.
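The reconstruction rule described above, including the skip-mode special case, can be sketched as follows; transform, quantization, entropy coding and filtering are omitted, and the function name is illustrative:

```python
def reconstruct_block(pred, residual=None, bit_depth=8):
    """Add the decoded residual to the prediction and clip each sample to the
    valid range. In skip mode there is no residual (and no transform), so the
    prediction itself is the final reconstructed value."""
    if residual is None:                       # skip mode: prediction is final
        return [row[:] for row in pred]
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```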
Fig. 15 is a functional structure diagram of an image prediction apparatus according to an embodiment of the present application. As shown in fig. 15, the image prediction apparatus 40 according to the present embodiment includes:
An obtaining module 41, configured to obtain a partition mode of a current node;
a determining module 42, configured to determine whether an image block with a preset size will be obtained by dividing the current node based on the dividing manner; the image block comprises a luminance block or a chrominance block;
an executing module 43, configured to use intra prediction for all coding blocks covered by the current node, or use inter prediction for all coding blocks covered by the current node, when it is determined that dividing the current node based on the dividing manner will obtain the image block with the preset size.
Optionally, the image block with the preset size includes a luminance block with a first preset size, and the determining module 42 is specifically configured to: determine, according to the size of the current node and the dividing manner, whether dividing the current node based on the dividing manner will obtain a luminance block with the first preset size.
Optionally, in a case that it is determined that the luminance block having the first preset size is obtained by dividing the current node based on the dividing manner, the executing module 43 is specifically configured to:
intra prediction is used for all encoded blocks covered by the current node.
Optionally, under the condition that it is determined that the luminance block with the first preset size cannot be obtained by dividing the current node based on the dividing manner, the determining module 42 is further configured to determine whether the chrominance block with the second preset size can be obtained by dividing the current node based on the dividing manner;
in a case that it is determined that the chroma block with the second preset size is obtained by dividing the current node based on the dividing manner, the performing module 43 is specifically configured to use intra prediction for all coding blocks covered by the current node, or use inter prediction for all coding blocks covered by the current node.
Optionally, the image block with the preset size includes a chrominance block with a second preset size, and the determining module 42 is specifically configured to: determine, according to the size of the current node and the dividing manner, whether dividing the current node based on the dividing manner will obtain a chrominance block with the second preset size.
Optionally, under the condition that it is determined that the chroma block with the second preset size is obtained by dividing the current node based on the dividing manner, the executing module 43 is specifically configured to:
analyzing the prediction mode state identifier of the current node;
when the value of the prediction mode state identifier is a first value, using inter-frame prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, intra-frame prediction is used for all coding blocks covered by the current node.
Optionally, under the condition that it is determined that the chroma block with the second preset size is obtained by dividing the current node based on the dividing manner, the executing module 43 is specifically configured to: when the prediction mode of any coding block covered by the current node is inter-frame prediction, using inter-frame prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra-frame prediction, using intra-frame prediction for all coding blocks covered by the current node.
Optionally, the above-mentioned coding block is the first coding block in decoding order among all the coding blocks covered by the current node.
Optionally, under the condition that it is determined that the chroma block with the second preset size is obtained by dividing the current node based on the dividing manner, the executing module 43 is specifically configured to:
determining whether a luminance block with the first preset size will be obtained by dividing the current node based on the dividing manner;
and, when it is determined that dividing the current node based on the dividing manner will obtain the luminance block with the first preset size, using intra prediction for all coding blocks covered by the current node.
Optionally, under the condition that it is determined that the luminance block with the first preset size cannot be obtained by dividing the current node based on the dividing manner, the executing module 43 is specifically configured to:
analyzing the prediction mode state identifier of the current node;
when the value of the prediction mode state identifier is a first value, using inter-frame prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, intra-frame prediction is used for all coding blocks covered by the current node.
Optionally, under the condition that it is determined that the luminance block with the first preset size cannot be obtained by dividing the current node based on the dividing manner, the executing module 43 is specifically configured to: when the prediction mode of any coding block covered by the current node is inter-frame prediction, using inter-frame prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra-frame prediction, using intra-frame prediction for all coding blocks covered by the current node.
Optionally, the executing module 43 is specifically configured to:
dividing the luminance block included in the current node according to the dividing mode to obtain divided luminance blocks, using intra prediction for the divided luminance blocks, using the chrominance block included in the current node as a chrominance coding block, and using intra prediction for the chrominance coding block; or,
dividing the luminance block included in the current node according to the dividing mode to obtain a divided luminance block, using inter-frame prediction on the divided luminance block, dividing the chrominance block included in the current node according to the dividing mode to obtain a divided chrominance block, and using inter-frame prediction on the divided chrominance block.
Optionally, the executing module 43 is specifically configured to:
dividing the luminance block included in the current node according to the dividing mode to obtain divided luminance blocks, using intra prediction for the divided luminance blocks, using the chrominance block included in the current node as a chrominance coding block, and using intra prediction for the chrominance coding block; or,
dividing the luminance block included in the current node according to the dividing mode to obtain divided luminance blocks, using inter prediction for the divided luminance blocks, using the chrominance block included in the current node as a chrominance coding block, and using inter prediction for the chrominance coding block.
Optionally, in a case that inter-frame prediction is used for all coding blocks covered by the current node, the obtaining module 41 is further configured to obtain a sub-division manner of a child node of the current node, where the child node includes a luminance block and a chrominance block;
the determining module 42 is further configured to determine whether the division of the child node of the current node based on the sub-division manner will result in a luminance block with a first preset size;
the executing module 43 is specifically configured to, when it is determined that dividing the child node of the current node based on the sub-division manner will obtain a luminance block with the first preset size, divide the child node using a dividing manner other than the sub-division manner to obtain corresponding coding blocks and use inter prediction for the corresponding coding blocks, or use the child node itself as a coding block and apply inter prediction to it.
The image prediction apparatus provided in the embodiment of the present application may implement the technical solution of the method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 16 is a schematic hardware structure diagram of a video encoding apparatus according to an embodiment of the present application. As shown in fig. 16, the present embodiment provides a video encoding apparatus 50, which includes a processor 51 and a memory 52 for storing executable instructions of the processor 51; the processor 51 may execute the image prediction method corresponding to the video coding device in the above method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
Alternatively, the memory 52 may be separate or integrated with the processor 51.
When the memory 52 is a device separate from the processor 51, the video encoding apparatus 50 further includes: a bus 53 for connecting the memory 52 and the processor 51.
Fig. 17 is a schematic hardware structure diagram of a video decoding apparatus according to an embodiment of the present application. As shown in fig. 17, the present embodiment provides a video decoding apparatus 60 including a processor 61 and a memory 62 for storing executable instructions of the processor 61; the processor 61 may execute the image prediction method corresponding to the video decoding device in the above method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
Alternatively, the memory 62 may be separate or integrated with the processor 61.
When the memory 62 is a device separate from the processor 61, the video decoding apparatus 60 further includes: a bus 63 for connecting the memory 62 and the processor 61.
Fig. 18 is a schematic structural diagram of an image prediction system according to an embodiment of the present application. As shown in fig. 18, the image prediction system provided by the present embodiment includes a video capture device 70, a video encoding device 50 according to the embodiment shown in fig. 16, a video decoding device 60 according to the embodiment shown in fig. 17, and a display device 80.
Wherein, the video encoding device 50 is respectively connected with the video collecting device 70 and the video decoding device 60, and the video decoding device 60 is connected with the display device 80.
Specifically, the video encoding device 50 receives video or image information sent by the video capture device 70, the video encoding device 50 may perform an image prediction method corresponding to the video encoding device 50 in the above method embodiment, the video encoding device 50 sends the encoded video or image information to the video decoding device 60, the video decoding device 60 may perform an image prediction method corresponding to the video decoding device 60 in the above method embodiment, and the video decoding device 60 sends the decoded video or image information to the display device 80 for display.
The image prediction system provided in the embodiment of the present application includes a video encoding device capable of executing the above method embodiment and a video decoding device capable of executing the above method embodiment, and the implementation principle and the technical effect are similar, and are not described herein again.
Embodiments of the present application also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the above-mentioned method embodiments.
The embodiment of the present application further provides a video decoding method, where the method includes:
acquiring a division mode of a current node, where the current node includes a luminance block and a chrominance block;
judging whether a small chroma block will be obtained by continuing to divide the current node based on the division mode of the current node, where a small chroma block is a chroma block whose side length is smaller than or equal to a first preset value, or a chroma block in which the number of pixels is smaller than or equal to a second preset value;
and, if a small chroma block will be obtained by continuing to divide the current node based on the division mode of the current node, performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as the root node, thereby obtaining prediction information of the coding blocks obtained by dividing.
Optionally, the performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing inter-frame prediction on all coding blocks (coding blocks) obtained by dividing with the current node as a root node; or,
The performing intra-frame prediction on a coding block (coding block) obtained by dividing the current node as a root node includes: and executing intra-frame prediction on all coding blocks (coding blocks) obtained by dividing by taking the current node as a root node.
Optionally, the performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing inter-frame prediction on all the small chroma blocks obtained by dividing with the current node as a root node; or,
the performing intra-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing intra-frame prediction on all the small chroma blocks obtained by dividing with the current node as a root node.
Optionally, the performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing inter-frame prediction on coding units (coding units) obtained by dividing with the current node as a root node; or,
the performing intra-frame prediction on a coding block (coding block) obtained by dividing the current node as a root node includes: and executing intra-frame prediction on coding units (coding units) obtained by dividing the current node as a root node.
Optionally, the performing inter-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node includes:
parsing a node prediction mode flag (cons _ pred _ mode _ flag) of the current node;
when the value of the node prediction mode identifier is a first value, performing inter-frame prediction on a coding block (coding block) obtained by dividing the current node serving as a root node;
and when the value of the node prediction mode identifier is a second value, performing intra-frame prediction on a coding block (coding block) obtained by dividing by taking the current node as a root node.
Optionally, the performing inter-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node includes:
when the prediction mode of any coding block obtained by dividing with the current node as a root node is inter-frame prediction, performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node;
And when the prediction mode of any coding block obtained by dividing with the current node as a root node is intra-frame prediction, performing intra-frame prediction on a coding block (coding block) obtained by dividing with the current node as the root node.
Optionally, the first preset value is 2 or 4, or the second preset value is 16, 8 or 32.
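Assuming the first preset value bounds the chroma block's side length and the second bounds its pixel count (one plausible reading of the text above, using the example thresholds 4 and 16; the other listed values 2, 8 or 32 could be substituted), the small-chroma-block test can be sketched as:

```python
def is_small_chroma_block(chroma_w, chroma_h, first_preset=4, second_preset=16):
    """Illustrative test for a 'small chroma block': a chroma block qualifies
    if a side length is at or below the first preset value, or its pixel
    count is at or below the second preset value. Threshold semantics are an
    assumption, not a definitive reading of the embodiment."""
    return (min(chroma_w, chroma_h) <= first_preset
            or chroma_w * chroma_h <= second_preset)
```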
Optionally, the performing intra-frame prediction on a coding block (coding block) obtained by dividing by using the current node as a root node includes:
dividing the luminance block included in the current node according to the node division mode to obtain luminance coding blocks;
performing intra-frame prediction on the luminance coding blocks;
and performing intra-frame prediction by taking the chroma block included by the current node as a chroma coding block.
Optionally, the performing inter-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node includes:
dividing the luminance block included in the current node according to the node division mode to obtain luminance coding blocks;
performing inter-frame prediction or intra-frame prediction on the luminance coding blocks;
and performing inter-frame prediction or intra-frame prediction by taking the chroma block included by the current node as a chroma coding block.
Optionally, the performing inter prediction or intra prediction on the chroma block included in the current node as a chroma coding block includes:
taking the chroma coding block as a chroma prediction block to perform intra-frame prediction; or,
and dividing the chroma coding block to obtain a chroma prediction block, and performing inter-frame prediction on the divided chroma prediction block.
The embodiment of the present application further provides a video decoding method, where the method includes:
acquiring a division mode of a current node, where the current node includes a luminance block and a chrominance block;
judging whether a luminance block with a preset size will be obtained by continuing to divide the current node based on the division mode of the current node;
and, if a luminance block with the preset size will be obtained by continuing to divide the current node based on the division mode of the current node, performing intra-frame prediction on all coding blocks (coding blocks) obtained by dividing with the current node as a root node, thereby obtaining prediction blocks of the coding blocks obtained by dividing.
Optionally, the method further includes:
if the luminance block with the preset size cannot be obtained by continuing to divide the current node based on the division mode of the current node, judging whether a small chroma block will be obtained by continuing to divide the current node based on the division mode of the current node, where a small chroma block is a chroma block whose side length is smaller than or equal to a first preset value, or a chroma block in which the number of pixels is smaller than or equal to a second preset value;
and, if a small chroma block will be obtained by continuing to divide the current node based on the division mode of the current node, performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as the root node, thereby obtaining prediction blocks of the coding blocks obtained by dividing.
Optionally, the performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing inter-frame prediction on all coding blocks (coding blocks) obtained by dividing with the current node as a root node; or,
The performing intra-frame prediction on a coding block (coding block) obtained by dividing the current node as a root node includes: and executing intra-frame prediction on all coding blocks (coding blocks) obtained by dividing by taking the current node as a root node.
Optionally, the performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing inter-frame prediction on all the small chroma blocks obtained by dividing with the current node as a root node; or,
the performing intra-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing intra-frame prediction on all the small chroma blocks obtained by dividing with the current node as a root node.
Optionally, the performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node includes: performing inter-frame prediction on coding units (coding units) obtained by dividing with the current node as a root node; or,
the performing intra-frame prediction on a coding block (coding block) obtained by dividing the current node as a root node includes: and executing intra-frame prediction on coding units (coding units) obtained by dividing the current node as a root node.
Optionally, the performing inter-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node includes:
parsing a node prediction mode flag (cons _ pred _ mode _ flag) of the current node;
when the value of the node prediction mode identifier is a first value, performing inter-frame prediction on a coding block (coding block) obtained by dividing the current node serving as a root node;
and when the value of the node prediction mode identifier is a second value, performing intra-frame prediction on a coding block (coding block) obtained by dividing by taking the current node as a root node.
Optionally, the performing inter-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node, or performing intra-frame prediction on the coding block (coding block) obtained by dividing with the current node as a root node includes:
when the prediction mode of any coding block obtained by dividing with the current node as a root node is inter-frame prediction, performing inter-frame prediction on a coding block (coding block) obtained by dividing with the current node as a root node;
And when the prediction mode of any coding block obtained by dividing with the current node as a root node is intra-frame prediction, performing intra-frame prediction on a coding block (coding block) obtained by dividing with the current node as the root node.
Optionally, the first preset value is 2 or 4, or the second preset value is 16, 8 or 32.
Optionally, the performing intra prediction on a coding block obtained by division with the current node as a root node includes:
dividing the luma block included in the current node according to the division mode of the current node to obtain luma coding blocks;
performing intra prediction on the luma coding blocks; and
performing intra prediction with the chroma block included in the current node as one chroma coding block.
Optionally, the performing inter prediction on the coding block obtained by division with the current node as a root node, or performing intra prediction on the coding block obtained by division with the current node as a root node, includes:
dividing the luma block included in the current node according to the division mode of the current node to obtain luma coding blocks;
performing inter prediction or intra prediction on the luma coding blocks; and
performing inter prediction or intra prediction with the chroma block included in the current node as one chroma coding block.
Optionally, the performing inter prediction or intra prediction with the chroma block included in the current node as a chroma coding block includes:
performing intra prediction with the chroma coding block as a chroma prediction block; or
dividing the chroma coding block to obtain chroma prediction blocks, and performing inter prediction on the divided chroma prediction blocks.
Optionally, the performing inter prediction on a coding block obtained by division with the current node as a root node includes:
dividing the current node according to the division mode of the current node to obtain child nodes of the current node;
acquiring a sub-division mode of a child node of the current node, where the child node includes a luma block and a chroma block;
determining whether dividing the child node of the current node based on the sub-division mode would produce a luma block of a preset size; and
if dividing the child node of the current node based on the sub-division mode would produce a luma block of the preset size, dividing the child node of the current node using a division mode other than the sub-division mode to obtain corresponding coding units and performing inter prediction on the corresponding coding units, or performing inter prediction with the child node of the current node as a coding unit.
Optionally, the preset size includes 4x4, 4x8, 8x4, 2x4 or 4x2.
The embodiment of the present application further provides a video decoding method, where the method includes:
acquiring a division mode of a current node, where the current node includes a luma block and a chroma block;
when the prediction modes of all coding blocks obtained by division with the current node as a root node are inter prediction, dividing the current node according to the division mode of the current node to obtain child nodes of the current node;
acquiring a sub-division mode of a child node of the current node, where the child node includes a luma block and a chroma block;
determining whether dividing the child node of the current node based on the sub-division mode would produce a luma block of a preset size; and
if dividing the child node of the current node based on the sub-division mode would produce a luma block of the preset size, dividing the child node of the current node using a division mode other than the sub-division mode to obtain corresponding coding units and performing inter prediction on the corresponding coding units, or performing inter prediction with the child node of the current node as a coding unit.
The first video decoding method provided by the embodiment of the present application relates to the block division mode in video decoding. The video data format in this embodiment is the YUV 4:2:0 format; a similar approach may be used for YUV 4:2:2 data.
Step 1: Parse the division mode S of node A. If node A continues to be divided, perform Step 2; if the current node is no longer divided into child nodes, the current node corresponds to one coding unit, and the coding unit information is parsed.
The division mode of node A may be at least one of quadtree division, vertical binary division, horizontal binary division, vertical ternary division, and horizontal ternary division, and may also be another division mode, which is not limited in this application. Information about the division mode of the current node may be transmitted in the bitstream, and the division mode of the current node may be obtained by parsing the corresponding syntax element in the bitstream; the division mode of the current node may also be determined based on a preset rule, which is likewise not limited in this application.
Step 2: Determine whether, among the child nodes obtained by dividing node A according to division mode S, there is at least one child node B whose chroma block is a small block (that is, determine whether the width and height of node A, and/or the division mode, and/or the width and height of node B satisfy at least one of the conditions below). If the chroma block of at least one child node B obtained by dividing node A is a small block, perform Steps 3 to 6.
Specifically, one of the following methods may be used to determine that the chroma block of at least one child node B of node A is a small block:
1) If the chroma block of at least one child node B of node A is 2x2, 2x4, or 4x2 in size, the chroma block of at least one child node B of node A is a small block.
2) If the width or height of the chroma block of at least one child node B of node A is 2, the chroma block of at least one child node B of node A is a small block.
3) If node A contains 128 luma pixels and node A uses ternary tree division, or if node A contains 64 luma pixels and node A uses binary tree, quadtree, or ternary tree division, the chroma block of at least one child node B of node A is a small block.
4) If node A contains 256 luma pixels and node A uses ternary tree or quadtree division, or if node A contains 128 luma pixels and node A uses binary tree division, the chroma block of at least one child node B of node A is a small block.
5) If node A contains N1 luma pixels and node A uses ternary tree division, where N1 is 64, 128, or 256, the chroma block of at least one child node B of node A is a small block.
6) If node A contains N2 luma pixels and node A uses quadtree division, where N2 is 64 or 256, the chroma block of at least one child node B of node A is a small block.
7) If node A contains N3 luma pixels and node A uses binary tree division, where N3 is 64, 128, or 256, the chroma block of at least one child node B of node A is a small block.
It should be noted that node A containing 128 luma pixels may also be described as the area of the current node being 128, or as the product of the width and height of node A being 128; details are not described herein again.
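Conditions 5) to 7) above generalize conditions 3) and 4); as an illustration only, they can be collected into one predicate (split-type names and YUV 4:2:0 are assumptions of this sketch, not part of the text):

```python
def chroma_child_is_small(luma_samples: int, split: str) -> bool:
    """Sketch of conditions 5)-7): whether dividing node A produces at least
    one child node B whose chroma block is a small block (YUV 4:2:0).
    'qt'/'bt'/'tt' are illustrative names for quadtree/binary/ternary division."""
    if split == "tt":   # ternary tree division: N1 in {64, 128, 256}
        return luma_samples in (64, 128, 256)
    if split == "qt":   # quadtree division: N2 in {64, 256}
        return luma_samples in (64, 256)
    if split == "bt":   # binary tree division: N3 in {64, 128, 256}
        return luma_samples in (64, 128, 256)
    return False
```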
Step 3: Restrict all coding units within the coverage area of node A to use intra prediction, or to use inter prediction. Using only intra prediction or only inter prediction allows the small blocks to be processed in parallel by hardware, improving coding and decoding performance.
Whether all coding units in the coverage area of node A use intra prediction or inter prediction can be determined by one of the following methods.
Method one: determine according to a flag bit in the syntax table.
If dividing node A according to division mode S yields at least one child node B whose chroma block is a small block (and the chroma block of node A is not a small block), a flag bit cons_pred_mode_flag is parsed from the bitstream, where cons_pred_mode_flag equal to 0 indicates that the coding units in the coverage area of node A all use inter prediction, and cons_pred_mode_flag equal to 1 indicates that the coding units in the coverage area of node A all use intra prediction. cons_pred_mode_flag may be a syntax element that needs to be parsed in the block division process; when this syntax element is parsed, cu_pred_mode of the coding units in the coverage area of node A need not be parsed, and its value is the default value corresponding to the value of cons_pred_mode_flag.
It should be noted that if the child nodes of node A can only use the intra prediction mode, for example, node A is in an intra picture (i.e. the picture type of node A is Intra or I), or node A is in an intra picture and the sequence does not use the IBC technique, then cons_pred_mode_flag defaults to 1 and does not appear in the bitstream. The IBC technique may belong to inter prediction or to intra prediction.
Method two: determine according to the prediction mode of the first node in the node A area.
Parse the prediction mode of the first coding unit B0 in the node A area (the prediction mode of the first coding unit B0 is not restricted). If the prediction mode of B0 is intra prediction, all coding units within the coverage area of node A use intra prediction; if the prediction mode of B0 is inter prediction, all coding units within the coverage area of node A use inter prediction.
Step 4: Determine the division modes of the chroma block and the luma block of node A according to the prediction mode used by the coding units of the node A coverage area.
If the coding units of the node A coverage area all use the intra prediction mode, the luma block of node A is divided according to division mode S to obtain N luma coding tree nodes, and the chroma block of node A is not divided and corresponds to one chroma coding block (chroma CB for short). The N luma coding tree nodes may be restricted from being further divided, or not restricted. If a luma child node continues to be divided, its division mode is parsed for recursive division; when a luma coding tree node is no longer divided, it corresponds to one luma coding block (luma CB for short). The chroma transform block corresponding to the chroma CB has the same size as the chroma coding block, and the chroma prediction block has the same size as the chroma coding block.
If the coding units of the node A coverage area all use the inter prediction mode, the luma block and the chroma block of node A are further divided according to division mode S into N coding tree nodes each containing a luma block and a chroma block; these N coding tree nodes may or may not be further divided, and when not further divided they correspond to coding units containing a luma block and a chroma block.
Step 5: Parse the prediction information and residual information of the CUs obtained by dividing node A.
The prediction information includes: the prediction mode (indicating an intra or non-intra prediction mode), an intra prediction mode, an inter prediction mode, motion information, and the like. The motion information may include a prediction direction (forward, backward, or bidirectional), a reference frame index (reference index), a motion vector (motion vector), and the like.
The residual information includes: a coded block flag (cbf), transform coefficients, a transform type (e.g. DCT-2, DST-7, DCT-8), and the like. The transform type may default to the DCT-2 transform.
If each CU obtained by dividing node A can only use intra prediction, parsing the prediction information of a luma CB obtained by dividing node A includes: skip_flag, merge_flag, and cu_pred_mode default to 0, 0, and 1 respectively (i.e. skip_flag, merge_flag, and cu_pred_mode are not present in the bitstream), or skip_flag and cu_pred_mode default to 0 and 1 respectively (i.e. skip_flag and cu_pred_mode are not present in the bitstream), and the intra prediction mode information of the luma CB is parsed. Parsing the prediction information of a chroma CB obtained by dividing node A includes parsing the intra prediction mode of the chroma CB, which may be done by: 1) parsing the syntax element from the bitstream; or 2) directly setting it to one of a set of chroma intra prediction modes, such as the linear model mode, the DM mode (DM), or the IBC mode.
If each CU obtained by dividing node A can only use inter prediction, parsing the prediction mode of a CU obtained by dividing node A includes: parsing skip_flag and/or merge_flag, defaulting cu_pred_mode to 0, and parsing inter prediction information such as a merge index (merge index), an inter prediction direction (inter dir), a reference frame index (reference index), a motion vector predictor index (motion vector predictor index), and a motion vector difference (motion vector difference).
skip_flag is the flag of the skip mode: a value of 1 indicates that the current CU uses the skip mode, and a value of 0 indicates that it does not. merge_flag is the flag of the merge mode: a value of 1 indicates that the current CU uses the merge mode, and a value of 0 indicates that it does not. cu_pred_mode is the coding unit prediction mode flag: a value of 1 indicates that the current prediction unit uses intra prediction, and a value of 0 indicates that the current prediction unit uses normal inter prediction (with information such as the inter prediction direction, reference frame index, motion vector predictor index, and motion vector difference identified in the bitstream).
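The flag defaults described above can be sketched as follows (a non-authoritative illustration; the dictionary layout and function name are invented for this sketch):

```python
def default_cu_flags(area_mode: str) -> dict:
    """Illustrative defaults for flags that need not be parsed when all CUs
    in the node A coverage area are constrained to one prediction mode."""
    if area_mode == "intra":
        # skip_flag, merge_flag, cu_pred_mode are absent from the bitstream
        # and default to 0, 0, 1 respectively.
        return {"skip_flag": 0, "merge_flag": 0, "cu_pred_mode": 1}
    # Inter-constrained: cu_pred_mode defaults to 0, while skip_flag and/or
    # merge_flag are still parsed from the bitstream.
    return {"cu_pred_mode": 0}
```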
It should be noted that the intra prediction mode in this embodiment is a prediction mode that generates a prediction value of a coding block using spatial reference pixels of the picture in which the coding block is located, such as the direct current mode (DC mode), the planar mode (Planar mode), the angular mode (angular mode), and possibly the template matching mode (template matching mode) and the IBC mode.
The inter prediction mode is a prediction mode that generates a prediction value of the coding block using temporal reference pixels in a reference picture of the coding block, such as the skip mode (Skip mode), the merge mode (Merge mode), the AMVP (advanced motion vector prediction) mode or normal inter mode, the IBC mode, and the like.
Step 6: Decode each CU to obtain a reconstructed signal of the image block corresponding to node A.
For example, inter prediction processing or intra prediction processing is performed on each CU based on the prediction information of each CU, obtaining an inter-predicted or intra-predicted image of each CU. Then, according to the residual information of each CU, inverse quantization and inverse transform are applied to the transform coefficients to obtain a residual image, which is superimposed on the predicted image of the corresponding area to generate a reconstructed image.
With the division mode of this embodiment, no chroma small block using intra prediction is generated, so the problem of small-block intra prediction is solved.
The second video decoding method provided in the embodiment of the present application includes Steps 1, 2, 3, and 6, which are the same as those of the first decoding method. The differences are as follows:
Step 4: Determine the division modes of the chroma block and the luma block of node A.
The luma block of node A continues to be divided according to division mode S, generating N luma coding tree nodes. The chroma block of node A is no longer divided and corresponds to one chroma coding block (chroma CB), and the chroma transform block corresponding to the chroma CB has the same size as the chroma coding block. [Note: in contrast to the first embodiment, in this embodiment the chroma block is never divided and the luma block is always divided by division mode S, regardless of whether the inter or intra prediction mode is restricted for the node A coverage area.]
Step 5: Parse the prediction information and residual information of the CUs obtained by dividing node A.
If each CU obtained by dividing node A is restricted to intra prediction, the processing is the same as in the first embodiment.
If each CU obtained by dividing node A can only use inter prediction, parsing the prediction information of a luma CB obtained by dividing node A includes: parsing skip_flag and/or merge_flag, defaulting cu_pred_mode to 0, and parsing inter prediction information such as a merge index (merge index), an inter prediction direction (inter dir), a reference frame index (reference index), a motion vector predictor index (motion vector predictor index), and a motion vector difference (motion vector difference). The motion information of each 4x4 sub-block in the luma CB is derived from the parsed inter prediction information.
If each CU obtained by dividing node A can only use inter prediction, the prediction information of the chroma CB obtained by dividing node A does not need to be parsed; the chroma CB is divided into 2x2 chroma sub-blocks (the division mode may be division mode S), and the motion information of each 2x2 chroma sub-block is the motion information of the 4x4 luma region corresponding to that chroma sub-block.
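The motion-inheritance rule above can be sketched as follows (an illustration only; the grid accessor and its [row][col] layout in 4x4 luma units are assumptions of this sketch):

```python
def chroma_subblock_mv(luma_mv_grid, chroma_x: int, chroma_y: int):
    """For YUV 4:2:0, the 2x2 chroma sub-block at chroma pixel position
    (chroma_x, chroma_y) reuses the motion information of its co-located
    4x4 luma region. luma_mv_grid is indexed as [row][col] in 4x4 luma units."""
    luma_x, luma_y = 2 * chroma_x, 2 * chroma_y  # 4:2:0 chroma-to-luma scaling
    return luma_mv_grid[luma_y // 4][luma_x // 4]
```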
With the division method of this embodiment, neither a chroma small block using intra prediction nor a transform block smaller than 16 pixels is generated, which solves the intra prediction problem and the coefficient coding problem.
The third video decoding method provided in the embodiment of the present application includes Steps 1, 2, 3, 4, and 6, which are the same as those of the second decoding method. The differences are as follows:
Step 5: Parse the prediction information and residual information of the CUs obtained by dividing node A.
If each CU obtained by dividing node A is restricted to intra prediction, the processing is the same as in the first embodiment.
If each CU obtained by dividing node A can only use inter prediction, the parsing of the prediction information of the luma CB obtained by dividing node A is the same as in the second embodiment.
If each CU obtained by dividing node A can only use inter prediction, the prediction information of the chroma CB obtained by dividing node A does not need to be parsed; the chroma prediction block has the same size as the chroma coding block, and the motion information of the chroma CB is the motion information of a preset position in the luma region corresponding to the chroma CB (such as the center, the lower-right corner, or the upper-left corner of the luma region).
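The preset-position rule above can be illustrated as follows (a sketch under stated assumptions: luma_mv_at is an invented accessor returning motion information at a luma pixel position, and (x0, y0, w, h) describes the co-located luma region):

```python
def chroma_cb_mv(luma_mv_at, x0, y0, w, h, position="center"):
    """The chroma CB takes the motion information of a preset position in its
    co-located luma region (center, top-left, or bottom-right)."""
    if position == "center":
        return luma_mv_at(x0 + w // 2, y0 + h // 2)
    if position == "top_left":
        return luma_mv_at(x0, y0)
    return luma_mv_at(x0 + w - 1, y0 + h - 1)  # bottom-right corner
```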
With the division method of this embodiment, no chroma small block using intra prediction, no small transform block, and no chroma small block using inter prediction are generated.
A fourth video decoding method provided in the embodiment of the present application includes:
Step 1: the same as Step 1 of the first video decoding method described above.
Step 2: Determine whether, among the child nodes obtained by dividing node A according to division mode S, there is at least one child node B whose luma block is 4x4 (that is, determine whether the width and height of node A, and/or the division mode, and/or the width and height of node B satisfy at least one of the conditions in case one).
If the size (width and height) of node A and/or the division mode S satisfies at least one of the conditions in case one, restrict all coding units in the coverage area of node A to use intra prediction. Otherwise, determine whether, among the child nodes obtained by dividing node A according to division mode S, there is at least one child node B whose chroma block is a small block (that is, determine whether the size of node A, and/or the division mode S, and/or the width and height of node B satisfy at least one of the conditions in case two), and then perform Steps 3 to 6.
Specifically, the determination that the chroma block of at least one child node B of node A is a small block is divided into the following two cases.
Case one:
If one or more of the following preset conditions is met, dividing node A according to division mode S will produce a 4x4 luma block:
1) node A contains M1 pixels and the division mode of node A is quadtree division, e.g. M1 is 64;
2) node A contains M2 pixels and the division mode of node A is ternary tree division, e.g. M2 is 64;
3) node A contains M3 pixels and the division mode of node A is binary tree division, e.g. M3 is 32;
4) the width of node A is equal to 4 times the second threshold, the height of node A is equal to the second threshold, and the division mode of node A is vertical ternary division;
5) the width of node A is equal to the second threshold, the height of node A is equal to 4 times the second threshold, and the division mode of node A is horizontal ternary division;
6) the width of node A is equal to 2 times the second threshold, the height of node A is equal to the second threshold, and the division mode of the current node is vertical binary division;
7) the height of node A is equal to 2 times the second threshold, the width of node A is equal to the second threshold, and the division mode of the current node is horizontal binary division;
8) the width or/and height of node A is 2 times the second threshold, and the division mode of node A is quadtree division.
The size may be the width and height of the image region corresponding to node A, the number of luma pixels contained in the image region corresponding to node A, or the area of the image region corresponding to node A.
In general, the width of the current node is the width of the luma block corresponding to the current node, and the height of the current node is the height of the luma block corresponding to the current node. In a specific implementation, the second threshold may be 4, for example.
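As an illustration, the case-one conditions above can be collected into a single predicate. This is a sketch only: the second threshold is assumed to be 4, and the split-type names ('qt', 'bt_v', 'bt_h', 'tt_v', 'tt_h') are invented for readability.

```python
def split_yields_4x4_luma(width: int, height: int, split: str) -> bool:
    """Whether dividing a node of the given luma width/height with `split`
    would produce a 4x4 luma block (case-one conditions 1)-8) above)."""
    samples, thr = width * height, 4  # second threshold assumed to be 4
    if split == "qt":
        # conditions 1) and 8)
        return samples == 64 or width == 2 * thr or height == 2 * thr
    if split == "tt_v":
        # conditions 2) and 4)
        return samples == 64 or (width == 4 * thr and height == thr)
    if split == "tt_h":
        # conditions 2) and 5)
        return samples == 64 or (width == thr and height == 4 * thr)
    if split == "bt_v":
        # conditions 3) and 6)
        return samples == 32 or (width == 2 * thr and height == thr)
    if split == "bt_h":
        # conditions 3) and 7)
        return samples == 32 or (height == 2 * thr and width == thr)
    return False
```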
Case two:
1) the chroma block of at least one child node B of node A is 2x4 or 4x2 in size;
2) the width or height of the chroma block of at least one child node B of node A is 2;
3) node A contains 128 luma pixels and node A uses ternary tree division, or node A contains 64 luma pixels and node A uses binary tree, quadtree, or ternary tree division;
4) node A contains 256 luma pixels and node A uses ternary tree or quadtree division, or node A contains 128 luma pixels and node A uses binary tree division;
5) node A contains N1 luma pixels and node A uses ternary tree division, where N1 is 64, 128, or 256;
6) node A contains N2 luma pixels and node A uses quadtree division, where N2 is 64 or 256;
7) node A contains N3 luma pixels and node A uses binary tree division, where N3 is 64, 128, or 256.
It should be noted that node A containing 128 luma pixels may also be described as the area of the current node being 128, or as the product of the width and height of node A being 128; details are not described herein again.
Step 3: the same as Step 3 of the first video decoding method described above.
Step 4: Determine the division modes of the chroma block and the luma block of node A according to the prediction mode used by the coding units of the node A coverage area.
If the coding units of the area covered by node A all use the inter prediction mode, the luma block and the chroma block of node A are divided according to division mode S to obtain the child nodes in the area covered by node A. If the division mode of node A and/or of a child node in the area covered by node A would generate a 4x4 luma block, that division mode of the child node is not allowed, or the child node cannot continue to be divided. For example, if node A is 8x8 in size and two 8x4 (or two 4x8) nodes are generated using horizontal (or vertical) binary tree division, continued division of an 8x4 (or 4x8) node would generate a 4x4 block; therefore, an 8x4 (or 4x8) node cannot continue to be divided at this point.
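The restriction above amounts to disallowing any split whose children include a 4x4 luma block. A minimal sketch, assuming the same invented split names as before and a simplified child-size model:

```python
def split_children(width: int, height: int, split: str):
    """Luma child sizes produced by each split type (illustrative subset)."""
    if split == "bt_v":
        return [(width // 2, height)] * 2
    if split == "bt_h":
        return [(width, height // 2)] * 2
    if split == "qt":
        return [(width // 2, height // 2)] * 4
    if split == "tt_v":
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if split == "tt_h":
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(split)

def split_allowed_when_inter(width: int, height: int, split: str) -> bool:
    # Under the inter-prediction constraint, a split producing a 4x4 luma
    # block is not allowed (e.g. vertical binary division of an 8x4 node).
    return all(child != (4, 4) for child in split_children(width, height, split))
```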
If the coding units in the coverage area of node A all use the intra prediction mode, the implementation may follow the first, second, or third video decoding method, which is not described herein again. For example, the luma block of node A is divided, and the chroma block is not divided.
Step 5: Parse the prediction information and residual information of the CUs obtained by dividing node A.
This is similar to Step 5 of the first video decoding method and is not described herein again.
Step 6: Decode each CU to obtain a reconstructed signal of the image block corresponding to node A.
This can be implemented as Step 6 of the first video decoding method and is not described herein again.
A fifth video decoding method provided in an embodiment of the present application includes:
Step 1: the same as Step 1 of the first video decoding method described above.
Step 2: Determine whether, among the child nodes obtained by dividing node A according to division mode S, there is at least one child node B whose luma block is 4x4 (that is, determine whether the width and height of node A, and/or the division mode, and/or the width and height of node B satisfy at least one of the conditions in case one). If the size (width and height) of node A and/or the division mode S satisfies at least one of the conditions in case one, restrict all coding units in the coverage area of node A to use intra prediction.
Or, determine whether, among the child nodes obtained by dividing node A according to division mode S, there is at least one child node B whose chroma block is a small block (that is, determine whether the size of node A, and/or the division mode S, and/or the width and height of node B satisfy at least one of the conditions in case two), and then perform Steps 3 to 6.
Specifically, the determination that the chroma block of at least one child node B of node A is a small block is divided into the following two cases.
Case one:
If one or more of the following preset conditions is met, dividing node A according to division mode S will produce a 4x4 luma block:
1) node A contains M1 pixels and the division mode of node A is quadtree division, e.g. M1 is 64;
2) node A contains M2 pixels and the division mode of node A is ternary tree division, e.g. M2 is 64;
3) node A contains M3 pixels and the division mode of node A is binary tree division, e.g. M3 is 32;
4) the width of node A is equal to 4 times the second threshold, the height of node A is equal to the second threshold, and the division mode of node A is vertical ternary division;
5) the width of node A is equal to the second threshold, the height of node A is equal to 4 times the second threshold, and the division mode of node A is horizontal ternary division;
6) the width of node A is equal to 2 times the second threshold, the height of node A is equal to the second threshold, and the division mode of the current node is vertical binary division;
7) the height of node A is equal to 2 times the second threshold, the width of node A is equal to the second threshold, and the division mode of the current node is horizontal binary division;
8) the width or/and height of node A is 2 times the second threshold, and the division mode of node A is quadtree division.
The size may be the width and height of the image region corresponding to node A, the number of luma pixels contained in the image region corresponding to node A, or the area of the image region corresponding to node A.
In general, the width of the current node is the width of the luma block corresponding to the current node, and the height of the current node is the height of the luma block corresponding to the current node. In a specific implementation, the second threshold may be 4, for example.
Case two:
1) the chroma block of at least one child node B of node A is 2x4 or 4x2 in size;
2) the width or height of the chroma block of at least one child node B of node A is 2;
3) node A contains 128 luma pixels and node A uses ternary tree division, or node A contains 64 luma pixels and node A uses binary tree, quadtree, or ternary tree division;
4) node A contains 256 luma pixels and node A uses ternary tree or quadtree division, or node A contains 128 luma pixels and node A uses binary tree division;
5) node A contains N1 luma pixels and node A uses ternary tree division, where N1 is 64, 128, or 256;
6) node A contains N2 luma pixels and node A uses quadtree division, where N2 is 64 or 256;
7) node A contains N3 luma pixels and node A uses binary tree division, where N3 is 64, 128, or 256.
It should be noted that node A containing 128 luma pixels may also be described as the area of the current node being 128, or as the product of the width and height of node A being 128; details are not described herein again.
Step 3: the same as Step 3 of the first video decoding method described above.
Step 4: Determine the division modes of the chroma block and the luma block of node A according to the prediction mode used by the coding units of the node A coverage area.
If the coding units of the area covered by node A all use the inter prediction mode, the luma block and the chroma block of node A are divided according to division mode S to obtain the child nodes in the area covered by node A. If the division mode of node A and/or of a child node in the area covered by node A would generate a 4x4 luma block, that division mode of the child node is not allowed, or the child node cannot continue to be divided. For example, if node A is 8x8 in size and two 8x4 (or two 4x8) nodes are generated using horizontal (or vertical) binary tree division, continued division of an 8x4 (or 4x8) node would generate a 4x4 block; therefore, an 8x4 (or 4x8) node cannot continue to be divided at this point.
If the coding units in the coverage area of node A all use the intra prediction mode, the implementation may follow the first, second, or third video decoding method, which is not described herein again. For example, the luma block of node A is divided, and the chroma block is not divided.
Step 5: Parse the prediction information and residual information of the CUs obtained by dividing node A.
This is similar to Step 5 of the first video decoding method and is not described herein again.
Step 6: Decode each CU to obtain a reconstructed signal of the image block corresponding to node A.
This can be implemented as Step 6 of the first video decoding method and is not described herein again.
In some embodiments, if dividing the current region once would produce a 4x4 luma block (for example, dividing 64 luma samples using QT, or dividing 128 luma samples using TT), the current region is restricted by default to using only the intra mode.
Otherwise, a flag is transmitted to indicate whether the current region may use only the inter mode or only the intra mode.
If the current region is restricted to using only the inter mode, luma and chroma are divided together, and a division is not allowed if a node division in the current region would produce a 4x4 luma block. For example, if the current node is 8x8 and an HBT (or VBT) split generates two 8x4 nodes, continuing to divide these nodes would generate a 4x4 CU, so these 8x4 nodes cannot be divided further.
If the current region is restricted to using only the intra mode, the handling is the same as in the first embodiment (that is, the luma block is divided and the chroma block is not divided).
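The region-level mode restriction above can be sketched as follows. The helper names and the flag-reading callback are illustrative assumptions; only the decision structure (default intra-only when a single split would create a 4x4 luma block, otherwise a transmitted flag selects the single allowed mode) follows the text.

```python
# Hypothetical sketch of the region mode restriction; names are
# illustrative, not normative decoder behavior.

def split_yields_4x4_luma(luma_samples, split):
    # Examples given in the text: a region of 64 luma samples divided by
    # QT, or 128 luma samples divided by TT, produces a 4x4 luma block.
    return (luma_samples == 64 and split == "QT") or \
           (luma_samples == 128 and split == "TT")

def region_mode(luma_samples, split, read_flag=None):
    if split_yields_4x4_luma(luma_samples, split):
        return "INTRA_ONLY"  # restricted by default; no flag is transmitted
    # Otherwise a flag parsed from the bitstream selects the allowed mode.
    return "INTER_ONLY" if read_flag() else "INTRA_ONLY"

print(region_mode(64, "QT"))                        # INTRA_ONLY
print(region_mode(256, "QT", read_flag=lambda: 1))  # INTER_ONLY
```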
Beneficial effects of the technical solutions of this application
The embodiments of this application provide a block division method that avoids the case in which a chroma block with a small area uses the intra prediction mode, facilitates hardware pipelining and decoder implementation, and allows the parsing process of some prediction-mode syntax elements in inter prediction to be skipped, thereby reducing coding complexity.
The problem of coefficient coding is also addressed, and coding complexity is reduced.
The block division method may be as follows:
Parse the division mode of node A.
Determine whether dividing node A according to division mode S would produce at least one child node B containing a small chroma block (that is, determine whether the width, height, and/or division mode of node A, and/or the width and height of node B, satisfy at least one of the foregoing conditions).
If the determination result is true, all coding units in the area covered by node A are restricted to the intra prediction mode or to the inter prediction mode.
Then decide whether the chroma block and the luma block of node A continue to be divided.
If all coding units in the area covered by node A use intra prediction, the luma block of node A continues to be divided according to division mode S, and the chroma block of node A is no longer divided. If all coding units in the area covered by node A use inter prediction, the luma block and the chroma block of node A continue to be divided according to division mode S into N coding tree nodes each containing a luma block and a chroma block.
Alternatively, the luma block of node A continues to be divided according to division mode S, and the chroma block of node A is no longer divided; the chroma transform block and the chroma coding block have the same size.
When all coding units in the area covered by node A use intra prediction, the chroma prediction block and the chroma coding block have the same size; when all coding units in the area covered by node A use inter prediction, the chroma prediction block is divided into sub-blocks (each sub-block is smaller than the chroma coding block), and the motion vector of each sub-block is the motion vector in the luma region corresponding to that sub-block.
As a further alternative, the luma block of node A continues to be divided according to division mode S, and the chroma block of node A is not divided; the chroma transform block corresponding to the chroma coding block has the same size as the chroma coding block, the chroma prediction block has the same size as the chroma coding block, and the motion information of the chroma CB is the motion information at a preset position in the luma region corresponding to the chroma CB.
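The sub-block motion-vector inheritance described in the inter cases above can be sketched as follows. This assumes 4:2:0 chroma subsampling, and all names (`luma_mv_at`, `chroma_subblock_mvs`, the 4x4-unit luma motion field) are illustrative assumptions, not part of the normative description.

```python
# Illustrative sketch: each chroma sub-block reuses the motion vector
# stored at the co-located position of the luma motion field (4:2:0
# assumed, so chroma coordinates are scaled by 2 to reach luma).

def luma_mv_at(mv_field, x_l, y_l):
    # mv_field stores one MV per 4x4 luma unit, indexed [y][x].
    return mv_field[y_l // 4][x_l // 4]

def chroma_subblock_mvs(mv_field, cb_x, cb_y, cb_w, cb_h, sub=2):
    """MVs for sub x sub chroma sub-blocks of a chroma CB at (cb_x, cb_y),
    taken from the corresponding luma region."""
    mvs = {}
    for cy in range(cb_y, cb_y + cb_h, sub):
        for cx in range(cb_x, cb_x + cb_w, sub):
            mvs[(cx, cy)] = luma_mv_at(mv_field, cx * 2, cy * 2)
    return mvs
```

For example, with a luma motion field storing distinct vectors per 4x4 unit, a 4x4 chroma CB at the origin yields four 2x2 sub-blocks, each inheriting the vector of its co-located luma unit.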
Those of skill in the art will appreciate that the functions described in connection with the various illustrative logical blocks, modules, and algorithm steps described in the disclosure herein may be implemented as hardware, software, firmware, or any combination thereof. If implemented in software, the functions described in the various illustrative logical blocks, modules, and steps may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. The computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium, such as a data storage medium, or any communication medium including a medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol). In this manner, a computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium, such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described herein. The computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Thus, the term "processor" as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. Additionally, in some aspects, the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
The techniques of this application may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an Integrated Circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this application to emphasize functional aspects of means for performing the disclosed techniques, but do not necessarily require realization by different hardware units. Indeed, as described above, the various units may be combined in a codec hardware unit, in conjunction with suitable software and/or firmware, or provided by an interoperating hardware unit (including one or more processors as described above).
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The above description is only an exemplary embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (28)

1. A method of image prediction, the method comprising:
acquiring a division mode of a current node;
judging whether the current node is divided based on the dividing mode to obtain an image block with a preset size; the image block comprises a luminance block or a chrominance block;
and under the condition that the image blocks with the preset size are obtained by determining that the current node is divided based on the dividing mode, using intra-frame prediction for all coding blocks covered by the current node, or using inter-frame prediction for all coding blocks covered by the current node.
2. The method according to claim 1, wherein the image block with the preset size includes the following two cases, and the using intra prediction for all coding blocks covered by the current node or inter prediction for all coding blocks covered by the current node comprises:
judging whether the current node is divided based on the dividing mode to obtain the brightness block with the first preset size;
under the condition that a brightness block with a first preset size is obtained by determining that the current node is divided based on the dividing mode, intra-frame prediction is used for all coding blocks covered by the current node;
and under the condition that the brightness block with the first preset size cannot be obtained by dividing the current node based on the dividing mode, using intra-frame prediction for all coding blocks covered by the current node or using inter-frame prediction for all coding blocks covered by the current node.
3. The method of claim 1, wherein the image blocks with the preset size comprise luminance blocks with a first preset size, and the determining whether the current node is divided based on the dividing manner to obtain the image blocks with the preset size comprises:
And determining whether the current node is divided based on the dividing mode to obtain a brightness block with a first preset size or not according to the size of the current node and the dividing mode.
4. The method according to claim 3, wherein the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node in case that it is determined that the dividing of the current node based on the dividing manner would result in an image block with the preset size comprises:
and under the condition that the brightness block with the first preset size is obtained by determining that the current node is divided based on the dividing mode, using intra-frame prediction for all coding blocks covered by the current node.
5. The method of claim 3, wherein in the case that it is determined that partitioning the current node based on the partitioning manner does not result in a luminance block having a first preset size, the method further comprises: judging whether the current node is divided based on the dividing mode to obtain a chrominance block with a second preset size;
and under the condition that the chroma block with the second preset size is obtained by determining that the current node is divided based on the dividing mode, using intra-frame prediction for all coding blocks covered by the current node, or using inter-frame prediction for all coding blocks covered by the current node.
6. The method of claim 1, wherein the image blocks with the preset size comprise chrominance blocks with a second preset size, and the determining whether the current node is divided based on the dividing manner to obtain the image blocks with the preset size comprises:
and determining whether the current node is divided based on the dividing mode to obtain a chrominance block with a second preset size or not according to the size and the dividing mode of the current node.
7. The method according to claim 5 or 6, wherein in a case where it is determined that partitioning the current node based on the partitioning manner results in a chroma block with a second preset size, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node comprises:
analyzing the prediction mode state identification of the current node;
when the value of the prediction mode state identifier is a first value, using inter-frame prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, intra-frame prediction is used for all coding blocks covered by the current node.
8. The method according to claim 5 or 6, wherein in a case where it is determined that partitioning the current node based on the partitioning manner results in a chroma block with a second preset size, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node comprises:
when the prediction mode of any coding block covered by the current node is inter-frame prediction, using inter-frame prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra-frame prediction, using intra-frame prediction for all coding blocks covered by the current node.
9. The method of claim 8, wherein any coding block is the first coding block in decoding order of all coding blocks covered by the current node.
10. The method of claim 6, wherein in a case that it is determined that partitioning the current node based on the partitioning manner results in a chroma block with a second preset size, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node comprises:
Judging whether a brightness block with a first preset size is obtained by dividing the current node based on the dividing mode;
and under the condition that the luminance block with the first preset size is obtained by determining that the current node is divided based on the dividing mode of the current node, using intra-frame prediction for all coding blocks covered by the current node.
11. The method of claim 10, wherein in a case that it is determined that partitioning the current node based on the partitioning manner does not result in a luma block of a first preset size, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node comprises:
analyzing the prediction mode state identification of the current node;
when the value of the prediction mode state identifier is a first value, using inter-frame prediction for all coding blocks covered by the current node; or, when the value of the prediction mode state identifier is a second value, intra-frame prediction is used for all coding blocks covered by the current node.
12. The method of claim 10, wherein in a case that it is determined that partitioning the current node based on the partitioning manner does not result in a luma block of a first preset size, the using intra prediction for all coding blocks covered by the current node or using inter prediction for all coding blocks covered by the current node comprises:
When the prediction mode of any coding block covered by the current node is inter-frame prediction, using inter-frame prediction for all coding blocks covered by the current node; or, when the prediction mode of any coding block covered by the current node is intra-frame prediction, using intra-frame prediction for all coding blocks covered by the current node.
13. The method according to any of claims 1-12, wherein said using intra prediction for all coding blocks covered by said current node or inter prediction for all coding blocks covered by said current node comprises:
dividing the brightness blocks included in the current node according to the dividing mode to obtain divided brightness blocks, using intra-frame prediction on the divided brightness blocks, using the chrominance blocks included in the current node as chrominance coding blocks, and using the intra-frame prediction on the chrominance coding blocks; or,
dividing the luminance block included in the current node according to the dividing mode to obtain a divided luminance block, using inter-frame prediction on the divided luminance block, dividing the chrominance block included in the current node according to the dividing mode to obtain a divided chrominance block, and using inter-frame prediction on the divided chrominance block.
14. The method according to any of claims 1-12, wherein said using intra prediction for all coding blocks covered by said current node or inter prediction for all coding blocks covered by said current node comprises:
dividing the brightness blocks included in the current node according to the dividing mode to obtain divided brightness blocks, using intra-frame prediction on the divided brightness blocks, using the chrominance blocks included in the current node as chrominance coding blocks, and using the intra-frame prediction on the chrominance coding blocks; or,
and dividing the brightness block included by the current node according to the dividing mode to obtain a divided brightness block, using inter-frame prediction on the divided brightness block, using the chrominance block included by the current node as a chrominance coding block, and using inter-frame prediction on the chrominance coding block.
15. The method according to any of claims 1-12, wherein in case inter prediction is used for all coding blocks covered by the current node, the using inter prediction for all coding blocks covered by the current node comprises:
dividing the current node according to the division mode of the current node to obtain child nodes of the current node; determining an unallowable partition mode of the child node of the current node according to the size of the child node of the current node;
determining a block partitioning strategy of the child nodes of the current node according to the unallowed partitioning modes of the child nodes of the current node;
and obtaining a coding block corresponding to the child node of the current node according to the block division strategy of the child node of the current node, and using inter-frame prediction for the corresponding coding block.
16. An image prediction apparatus comprising:
the acquisition module is used for acquiring the division mode of the current node;
the judging module is used for judging whether the current node is divided based on the dividing mode to obtain an image block with a preset size; the image block comprises a luminance block or a chrominance block;
and the execution module is used for using intra-frame prediction for all coding blocks covered by the current node or using inter-frame prediction for all coding blocks covered by the current node under the condition that the image block with the preset size is obtained by determining that the current node is divided based on the dividing mode.
17. A video encoding device comprising a processor and a memory for storing executable instructions for the processor; wherein the processor performs the method of any one of claims 1-15.
18. A video decoding device comprising a processor and a memory for storing executable instructions of the processor; wherein the processor performs the method of any one of claims 1-15.
19. An image prediction system, comprising: video capture device, video encoding device according to claim 17, video decoding device according to claim 18 and display device, the video encoding device being connected to the video capture device and the video decoding device respectively, the video decoding device being connected to the display device.
20. A computer-readable storage medium, having stored thereon a computer program for execution by a processor to perform the method of any one of claims 1-15.
21. A method of image prediction, the method comprising:
acquiring a division mode of a current node, wherein the current node is an image block in a coding tree unit (CTU) in a current image;
judging whether the current node meets a first condition or not according to the division mode of the current node and the size of the current node;
And under the condition that the current node is determined to meet the first condition, performing intra-frame prediction on all coding blocks belonging to the current node, so as to obtain the predicted values of all the coding blocks belonging to the current node.
22. The method of claim 21, wherein if it is determined that the current node does not satisfy the first condition, the method further comprises:
judging whether the current node meets a second condition or not according to the division mode of the current node and the size of the current node;
and under the condition that the current node is determined to meet the second condition, predicting all the coding blocks belonging to the current node by using the same prediction mode so as to obtain the predicted values of all the coding blocks belonging to the current node, wherein the prediction mode is intra-frame prediction or inter-frame prediction.
23. The method of claim 22, wherein predicting using the same prediction for all coding blocks belonging to the current node comprises:
analyzing the prediction mode state identification of the current node;
under the condition that the value of the prediction mode state identifier is a first value, performing inter-frame prediction on all coding blocks belonging to the current node; or, performing intra-frame prediction on all coding blocks belonging to the current node under the condition that the value of the prediction mode state identifier is a second value.
24. The method according to claim 22 or 23, wherein said inter-predicting all coding blocks belonging to the current node comprises:
dividing the current node according to the division mode of the current node to obtain child nodes of the current node;
determining an unallowable partition mode of the child node of the current node according to the size of the child node of the current node;
determining a block partitioning strategy of the child nodes of the current node according to the unallowed partitioning modes of the child nodes of the current node;
and obtaining a coding block corresponding to the child node of the current node according to the block division strategy of the child node of the current node, and using inter-frame prediction for the corresponding coding block.
25. The method according to any of claims 21 to 23, wherein said intra-predicting all coding blocks belonging to the current node comprises:
dividing the brightness block included in the current node according to the dividing mode to obtain the divided brightness block, using intra-frame prediction on the divided brightness block, using the chroma block included in the current node as a chroma coding block, and using the intra-frame prediction on the chroma coding block.
26. A method of image prediction, the method comprising:
acquiring a division mode of a current node, wherein the current node is an image block in a coding tree unit (CTU) in a current image;
judging whether the current node meets a preset condition or not according to the division mode of the current node and the size of the current node;
and under the condition that the current node is determined to meet the preset condition, predicting all the coding blocks belonging to the current node by using the same prediction mode so as to obtain the predicted values of all the coding blocks belonging to the current node, wherein the prediction mode is intra-frame prediction or inter-frame prediction.
27. The method of claim 26, wherein predicting using the same prediction for all coding blocks belonging to the current node comprises:
analyzing the prediction mode state identification of the current node;
under the condition that the value of the prediction mode state identifier is a first value, performing inter-frame prediction on all coding blocks belonging to the current node; or, performing intra-frame prediction on all coding blocks belonging to the current node under the condition that the value of the prediction mode state identifier is a second value.
28. The method according to claim 26 or 27, wherein said inter-predicting all coding blocks belonging to the current node comprises:
dividing the current node according to the division mode of the current node to obtain child nodes of the current node;
determining an unallowable partition mode of the child node of the current node according to the size of the child node of the current node;
determining a block partitioning strategy of the child nodes of the current node according to the unallowed partitioning modes of the child nodes of the current node;
and obtaining a coding block corresponding to the child node of the current node according to the block division strategy of the child node of the current node, and using inter-frame prediction for the corresponding coding block.
CN201910696741.0A 2018-08-28 2019-07-30 Image prediction method, device, equipment, system and storage medium Pending CN111669583A (en)

Priority Applications (40)

Application Number Priority Date Filing Date Title
BR112021003269-0A BR112021003269A2 (en) 2018-08-28 2019-08-28 image and device partitioning method
JP2021510741A JP7204891B2 (en) 2018-08-28 2019-08-28 Picture partitioning method and apparatus
AU2019333452A AU2019333452B2 (en) 2018-08-28 2019-08-28 Picture partitioning method and apparatus
PCT/CN2019/103094 WO2020043136A1 (en) 2018-08-28 2019-08-28 Picture partition method and device
ES19855934T ES2966509T3 (en) 2018-08-28 2019-08-28 Image partition method and device
KR1020217008065A KR102631517B1 (en) 2018-08-28 2019-08-28 Picture segmentation method and device
CA3110477A CA3110477C (en) 2018-08-28 2019-08-28 Picture partitioning method and apparatus
HUE19855934A HUE064218T2 (en) 2018-08-28 2019-08-28 Picture partition method and device
KR1020247003066A KR20240017109A (en) 2018-08-28 2019-08-28 Picture partitioning method and apparatus
NZ773632A NZ773632A (en) 2018-08-28 2019-08-28 Picture partitioning method and apparatus
PT198559346T PT3836542T (en) 2018-08-28 2019-08-28 Picture partition method and device
MX2021002396A MX2021002396A (en) 2018-08-28 2019-08-28 Picture partition method and device.
EP23200770.8A EP4387224A1 (en) 2018-08-28 2019-08-28 Picture partitioning method and apparatus
EP19855934.6A EP3836542B1 (en) 2018-08-28 2019-08-28 Picture partition method and device
PCT/CN2020/070976 WO2020143684A1 (en) 2019-01-08 2020-01-08 Image prediction method, device, apparatus and system and storage medium
EP20738949.5A EP3907988A4 (en) 2019-01-08 2020-01-08 Image prediction method, device, apparatus and system and storage medium
CN202111475069.6A CN114173114B (en) 2019-01-08 2020-01-08 Image prediction method, device, equipment, system and storage medium
KR1020237043658A KR20240005108A (en) 2019-01-08 2020-01-08 Image prediction method, apparatus, and system, device, and storage medium
KR1020217025090A KR102616713B1 (en) 2019-01-08 2020-01-08 Image prediction methods, devices and systems, devices and storage media
MX2021008340A MX2021008340A (en) 2019-01-08 2020-01-08 Image prediction method, device, apparatus and system and storage medium.
CN202111468095.6A CN114245113B (en) 2019-01-08 2020-01-08 Image prediction method, device, equipment, system and storage medium
CN202080001551.3A CN112075077B (en) 2019-01-08 2020-01-08 Image prediction method, device, equipment, system and storage medium
AU2020205376A AU2020205376B2 (en) 2019-01-08 2020-01-08 Image prediction method, device, apparatus and system and storage medium
CN202111467815.7A CN114157864B (en) 2019-01-08 2020-01-08 Image prediction method, device, equipment, system and storage medium
CA3125904A CA3125904A1 (en) 2019-01-08 2020-01-08 Image prediction method, apparatus, system, device, and storage medium for processing performance and speed
JP2021539883A JP7317973B2 (en) 2019-01-08 2020-01-08 IMAGE PREDICTION METHOD, DEVICE AND SYSTEM, APPARATUS AND STORAGE MEDIUM
BR112021013444-1A BR112021013444A2 (en) 2019-01-08 2020-01-08 IMAGE, DEVICE, AND STORAGE MEDIA METHOD, DEVICE AND PREDICTION SYSTEM
PH12021550378A PH12021550378A1 (en) 2018-08-28 2021-02-22 Picture partitioning method and apparatus
CL2021000494A CL2021000494A1 (en) 2018-08-28 2021-02-26 Image partitioning method and apparatus
ZA2021/01354A ZA202101354B (en) 2018-08-28 2021-02-26 Picture partitioning method and apparatus
US17/187,184 US11323708B2 (en) 2018-08-28 2021-02-26 Picture partitioning method and apparatus
IL281144A IL281144A (en) 2018-08-28 2021-02-28 Picture partition method and device
US17/369,350 US11388399B2 (en) 2019-01-08 2021-07-07 Image prediction method, apparatus, and system, device, and storage medium
US17/734,829 US11758134B2 (en) 2018-08-28 2022-05-02 Picture partitioning method and apparatus
US17/843,798 US11849109B2 (en) 2019-01-08 2022-06-17 Image prediction method, apparatus, and system, device, and storage medium
JP2022212121A JP2023038229A (en) 2018-08-28 2022-12-28 Picture partitioning method and apparatus
JP2023117695A JP2023134742A (en) 2019-01-08 2023-07-19 Image prediction method, device, and system, apparatus, and storage medium
US18/360,639 US20230370597A1 (en) 2018-08-28 2023-07-27 Picture partitioning method and apparatus
AU2023229600A AU2023229600A1 (en) 2018-08-28 2023-09-15 Picture partitioning method and apparatus
US18/503,304 US20240146909A1 (en) 2019-01-08 2023-11-07 Image prediction method, apparatus, and system, device, and storage medium

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910173454 2019-03-07
CN2019101734541 2019-03-07
CN2019102194409 2019-03-21
CN201910219440 2019-03-21

Publications (1)

Publication Number Publication Date
CN111669583A true CN111669583A (en) 2020-09-15

Family

ID=72382419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910696741.0A Pending CN111669583A (en) 2018-08-28 2019-07-30 Image prediction method, device, equipment, system and storage medium

Country Status (1)

Country Link
CN (1) CN111669583A (en)

Similar Documents

Publication Publication Date Title
CN112075077B (en) Image prediction method, device, equipment, system and storage medium
CN111327904B (en) Image reconstruction method and device
CN113491132B (en) Video image decoding method, video image encoding method, video image decoding device, video image encoding device, and readable storage medium
CN111355951A (en) Video decoding method, device and decoding equipment
CN112055211B (en) Video encoder and QP setting method
CN111327894B (en) Block division method, video coding and decoding method and video coder and decoder
CN111294603A (en) Video coding and decoding method and device
CN113316939A (en) Context modeling method and device for zone bit
CN111901593A (en) Image dividing method, device and equipment
CN112135148B (en) Non-separable transformation method and device
WO2020143684A1 (en) Image prediction method, device, apparatus and system and storage medium
CN111669583A (en) Image prediction method, device, equipment, system and storage medium
WO2020119742A1 (en) Block division method, video encoding and decoding method, and video codec
CN112135129A (en) Inter-frame prediction method and device
CN111726630A (en) Processing method and device based on triangular prediction unit mode

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination